pairwise

Functions to infer phylogenetic history between two extant hstrat columns.

Functions

ballpark_patristic_distance_between(first, ...)

Calculate a fast, rough estimate of the patristic distance between first and second.

ballpark_rank_of_mrca_between(first, second)

Calculate a fast, rough estimate of the rank of the MRCA between first and second.

ballpark_ranks_since_mrca_with(focal, other)

Calculate a fast, rough estimate of generations elapsed since MRCA with other.

calc_patristic_distance_bounds_between(...)

What is the total phylogenetic distance along the branch path connecting first and second?

calc_rank_of_earliest_detectable_mrca_between(...)

After what generation is common ancestry robustly detectable?

calc_rank_of_mrca_bounds_between(first, ...)

Within what generation range did MRCA fall?

calc_rank_of_mrca_bounds_provided_confidence_level(...)

Calculate provided confidence for a MRCA generation estimate.

calc_rank_of_mrca_uncertainty_between(first, ...)

How wide is the estimate window for generation of MRCA?

calc_ranks_since_earliest_detectable_mrca_with(...)

How many generations have elapsed since the first where common ancestry with other could be detected?

calc_ranks_since_mrca_bounds_provided_confidence_level(...)

Calculate provided confidence for a MRCA generation estimate.

calc_ranks_since_mrca_bounds_with(focal, ...)

How many generations have elapsed since MRCA?

calc_ranks_since_mrca_uncertainty_with(...)

How wide is the estimation window for generations elapsed since MRCA?

does_definitively_have_no_common_ancestor(...)

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

does_have_any_common_ancestor(first, second)

Determine if common ancestry between first and second is evidenced.

estimate_patristic_distance_between(first, ...)

Estimate the total phylogenetic distance along the branch path connecting the columns.

estimate_rank_of_mrca_between(first, second, ...)

At what generation did the most recent common ancestor of first and second occur?

estimate_ranks_since_mrca_with(focal, other, ...)

How many generations have elapsed since focal's most recent common ancestor with other?

ballpark_patristic_distance_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]) float | None

Calculate a fast, rough estimate of the patristic distance between first and second.

See estimate_patristic_distance_between for details.

ballpark_rank_of_mrca_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]) float | None

Calculate a fast, rough estimate of the rank of the MRCA between first and second.

See estimate_rank_of_mrca_between for details.

ballpark_ranks_since_mrca_with(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]) float | None

Calculate a fast, rough estimate of generations elapsed since MRCA with other.

See estimate_ranks_since_mrca_with for details.

calc_patristic_distance_bounds_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], confidence_level: float = 0.95) Tuple[int, int] | None

What is the total phylogenetic distance along the branch path connecting first and second?

Calculate confidence interval for patristic distance estimate. Branch length here is in terms of number of generations elapsed. So the calculated distance estimates the sum of the number of generations elapsed from each of first and second to their most recent common ancestor.

Parameters

prior : {“arbitrary”}

Prior probability density distribution over possible generations of the MRCA.

Currently only “arbitrary” supported.

confidence_level : float, optional

Bounds must capture what probability of containing the true patristic distance? Default 0.95.

Returns

(int, int), optional

Inclusive lower and then exclusive upper bound on patristic distance estimate or None if no common ancestor between first and second can be resolved with sufficient confidence. (Sufficient confidence depends on confidence_level.)

See Also

calc_rank_of_earliest_detectable_mrca_between :

Could any MRCA be detected between first and second? What is the rank of the earliest MRCA that could be reliably detected?

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

Notes

The true patristic distance is guaranteed to never fall below the returned bounds but may fall above.
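The relationship between bounds on MRCA rank and bounds on patristic distance can be sketched in plain Python. This is an illustration only, not hstrat's implementation: the function name and the first_tip_rank and second_tip_rank arguments (the rank of each column's most recent stratum) are hypothetical, and the sketch assumes bounds follow the inclusive-lower, exclusive-upper convention described above.

```python
from typing import Optional, Tuple


def patristic_distance_bounds(
    first_tip_rank: int,
    second_tip_rank: int,
    mrca_rank_bounds: Optional[Tuple[int, int]],
) -> Optional[Tuple[int, int]]:
    """Map [lower, upper) bounds on MRCA rank to [lower, upper) bounds on
    patristic distance. All names here are hypothetical."""
    if mrca_rank_bounds is None:
        return None  # no common ancestor could be resolved
    mrca_lower, mrca_upper = mrca_rank_bounds  # upper bound is exclusive
    total = first_tip_rank + second_tip_rank
    # The latest admissible MRCA (mrca_upper - 1) gives the shortest path;
    # the earliest admissible MRCA (mrca_lower) gives the longest path.
    shortest = total - 2 * (mrca_upper - 1)
    longest = total - 2 * mrca_lower
    return shortest, longest + 1  # keep the upper bound exclusive


print(patristic_distance_bounds(10, 12, (4, 8)))
```

Because a later MRCA shortens both branches at once, the distance interval is twice as wide as the MRCA rank interval.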

calc_rank_of_earliest_detectable_mrca_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], confidence_level: float = 0.95) int | None

After what generation is common ancestry robustly detectable?

Calculates the earliest rank at which an MRCA of first and second could be reliably detected.

Even if a true MRCA of first and second exists, if it occurred earlier than the rank calculated here it could not be reliably detected with sufficient confidence after accounting for the possibility of spurious differentia collisions. (Although subsequent spurious differentia collisions after the true MRCA of first and second could lead to MRCA detection at such a rank.)

Returns None if insufficient common ranks exist between first and second to ever conclude at the given confidence level the existence of any common ancestry between first and second (even if all strata at common ranks had equivalent differentiae).
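The reasoning above can be made concrete with a small, self-contained sketch. It does not call hstrat; the function names, the common_ranks argument (ranks retained by both columns), and the assumption that differentia are uniformly random bit_width-bit values are all illustrative.

```python
from typing import Iterable, Optional


def min_implausible_collisions(bit_width: int, confidence_level: float) -> int:
    """Smallest run of matching strata too unlikely to be chance alone."""
    collision_probability = 2.0 ** -bit_width  # one spurious match
    count = 1
    while collision_probability ** count > 1.0 - confidence_level:
        count += 1
    return count


def earliest_detectable_mrca_rank(
    common_ranks: Iterable[int],
    bit_width: int,
    confidence_level: float = 0.95,
) -> Optional[int]:
    """Rank of the first common stratum at which enough matches could
    have accumulated, or None if too few common ranks exist."""
    needed = min_implausible_collisions(bit_width, confidence_level)
    ranks = sorted(common_ranks)
    if len(ranks) < needed:
        return None
    return ranks[needed - 1]


print(min_implausible_collisions(1, 0.95))
print(earliest_detectable_mrca_rank([0, 2, 4, 6, 8, 10], bit_width=1))
```

With single-bit differentia, five consecutive matches are needed at 95% confidence, so detection only becomes possible at the fifth common rank. With byte-width differentia a single match already suffices.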

calc_rank_of_mrca_bounds_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], confidence_level: float = 0.95, strict=True) Tuple[int, int] | None

Within what generation range did MRCA fall?

Calculate bounds on the estimate for the number of depositions elapsed along the line of descent before the most recent common ancestor of first and second.

Parameters

prior : {“arbitrary”}

Prior probability density distribution over possible generations of the MRCA.

Currently only “arbitrary” supported.

confidence_level : float, optional

Bounds must capture what probability of containing the true rank of the MRCA? Default 0.95.

Returns

(int, int), optional

Inclusive lower and then exclusive upper bound on estimate or None if no common ancestor between first and second can be resolved with sufficient confidence. (Sufficient confidence depends on confidence_level.)

See Also

calc_rank_of_mrca_uncertainty_between :

Wrapper to report uncertainty of calculated bounds.

calc_rank_of_earliest_detectable_mrca_between :

Could any MRCA be detected between first and second? What is the rank of the earliest MRCA that could be reliably detected?

calc_rank_of_mrca_bounds_provided_confidence_level :

With what actual confidence (i.e., more than requested) is the true rank of the MRCA captured within the calculated bounds?

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

Notes

The true rank of the MRCA is guaranteed to never fall above the returned bounds but may fall below.

An alternate approach could be to construct the bounds such that the true rank of the MRCA will fall above or below the bounds with equal probability. This would involve setting the confidence level for calculating the first disparity with second to significance_level/2 and the confidence level for calculating the last commonality with second to 1 - significance_level/2. This means the confidence level applied to calculating the first disparity with second would always be <= 0.5. However, shifting the calculated first disparity with second below the definitive max first retained disparity requires confidence level >= 0.5. So, in practice such a symmetric approach would only result in the lower bound being shifted downward. For this reason, it is no longer provided as an option.

In the absence of evidence to the contrary (i.e., more common strata than spurious differentia collisions alone could plausibly cause), this function assumes no common ancestry between first and second, returning None. This means that if few enough common ranks are shared between first and second (and the differentia bit width is small enough), it may not be possible to detect any common ancestry after accounting for the possibility of spurious differentia collisions (even if common ancestry did exist). In that case, calls to this function always return None. Likewise, MRCAs at very early ranks may not be reliably detectable due to insufficient evidence. This can lead to cases where columns with true common ancestry have MRCA bounds estimated as None at much higher than the expected failure rate at the given confidence level. Note that with sufficient differentia bit width (i.e., so that even one collision is implausible at the given confidence level) this issue does not occur. Use calc_rank_of_earliest_detectable_mrca_between to determine the earliest rank at which an MRCA could be reliably detected between first and second.
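The core idea behind the bounds (last commonality as inclusive lower bound, first disparity as exclusive upper bound) can be sketched without hstrat. As a simplifying assumption, each column is modeled here as a dict mapping retained rank to differentia, and spurious collisions are ignored entirely, which corresponds to the wide-differentia case mentioned in the notes. The function name is hypothetical.

```python
from typing import Dict, Optional, Tuple


def mrca_rank_bounds(
    first: Dict[int, int],
    second: Dict[int, int],
) -> Optional[Tuple[int, int]]:
    """Return (inclusive lower, exclusive upper) bounds on MRCA rank,
    or None if no common ancestry is evidenced."""
    last_commonality = None
    for rank in sorted(first.keys() & second.keys()):
        if first[rank] != second[rank]:
            # First disparity: the MRCA must strictly precede this rank.
            if last_commonality is None:
                return None
            return last_commonality, rank
        last_commonality = rank
    if last_commonality is None:
        return None
    # No disparity observed: the MRCA can be no later than the older tip.
    return last_commonality, min(max(first), max(second)) + 1


a = {0: 7, 4: 3, 8: 9, 10: 1}
b = {0: 7, 4: 3, 8: 2, 9: 5}
print(mrca_rank_bounds(a, b))
```

Here ranks 0 and 4 match and rank 8 differs, so the MRCA lies somewhere in ranks 4 through 7. A true MRCA later than the last retained common rank cannot be ruled in, which is why the bounds can only err toward earlier ranks.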

calc_rank_of_mrca_bounds_provided_confidence_level(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], requested_confidence_level: float = 0.95) float

Calculate provided confidence for a MRCA generation estimate.

With what actual confidence is the true rank of the MRCA captured within the calculated estimate bounds for a requested confidence level? Guaranteed greater than or equal to the requested confidence level.

The same argument may be provided for focal and other.

calc_rank_of_mrca_uncertainty_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], confidence_level: float = 0.95) int | None

How wide is the estimate window for generation of MRCA?

Calculate uncertainty of the estimate for the number of depositions elapsed along the line of descent before the most recent common ancestor of first and second.

Returns 0 if no common ancestor between first and second can be resolved with sufficient confidence. If insufficient common ranks between first and second are available to resolve any common ancestor, returns None.

See Also

calc_rank_of_mrca_bounds_between :

Calculates bound whose uncertainty this method reports. See the corresponding docstring for explanation of parameters.
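As a rough illustration of what "uncertainty" means for half-open bounds, the sketch below uses one plausible convention, assumed here rather than taken from the library source: the number of candidate ranks in the interval beyond the first. The function name is hypothetical, and the case of insufficient common ranks (where None is returned) is outside the sketch's scope.

```python
from typing import Optional, Tuple


def mrca_rank_uncertainty(bounds: Optional[Tuple[int, int]]) -> int:
    """Width of an (inclusive lower, exclusive upper) interval, counted
    so that an exactly pinned-down rank has uncertainty 0."""
    if bounds is None:
        return 0  # no common ancestor resolved
    lower, upper = bounds
    return upper - lower - 1


print(mrca_rank_uncertainty((4, 8)))
print(mrca_rank_uncertainty((3, 4)))
```

Under this convention, bounds of (4, 8) admit ranks 4, 5, 6, and 7, giving an uncertainty of 3, while bounds of (3, 4) admit only rank 3 and give 0.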

calc_ranks_since_earliest_detectable_mrca_with(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], confidence_level: float = 0.95) int | None

How many generations have elapsed since the first where common ancestry with other could be detected?

How many depositions have elapsed along focal’s lineage since the earliest rank at which an MRCA of focal and other could be reliably detected?

Even if a true MRCA of focal and other exists, if it occurred earlier than the rank calculated here it could not be reliably detected with sufficient confidence after accounting for the possibility of spurious differentia collisions. (Although subsequent spurious differentia collisions after the true MRCA of focal and other could lead to MRCA detection at such a rank.)

Returns None if insufficient common ranks exist between focal and other to ever conclude at the given confidence level the existence of any common ancestry between focal and other (even if all strata at common ranks had equivalent differentiae).

calc_ranks_since_mrca_bounds_with(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], confidence_level: float = 0.95) Tuple[int, int] | None

How many generations have elapsed since MRCA?

Calculate bounds on estimate for the number of depositions elapsed along focal column’s line of descent since the most recent common ancestor with other.

Parameters

prior : {“arbitrary”}

Prior probability density distribution over possible generations of the MRCA.

Currently only “arbitrary” supported.

confidence_level : float, optional

With what probability should the true rank of the MRCA fall within the calculated bounds? Default 0.95.

Returns

(int, int), optional

Inclusive lower bound and then exclusive upper bound on estimate or None if no common ancestor between focal and other can be resolved with sufficient confidence. (Sufficient confidence depends on confidence_level.)

See Also

calc_ranks_since_mrca_uncertainty_with :

Wrapper to report uncertainty of calculated bounds.

calc_ranks_since_earliest_detectable_mrca_with :

Could any MRCA be detected between focal and other? How many ranks have elapsed since the earliest MRCA that could be reliably detected?

calc_ranks_since_mrca_bounds_provided_confidence_level :

With what actual confidence (i.e., more than requested) is the true rank of the MRCA captured within the calculated bounds?

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

Notes

The true number of ranks since the MRCA is guaranteed to never fall below the bounds but may fall above.

An alternate approach could be to construct the bounds such that the true number of ranks since the MRCA will fall above or below the bounds with equal probability. This would involve setting the confidence level for calculating the first disparity with other to significance_level/2 and the confidence level for calculating the last commonality with other to 1 - significance_level/2. This means the confidence level applied to calculating the first disparity with other would always be <= 0.5. However, shifting the calculated first disparity with other above the definitive min first retained disparity requires confidence level >= 0.5. So, in practice such a symmetric approach would only result in the upper bound being shifted upward. For this reason, it is no longer provided as an option.

In the absence of evidence to the contrary (i.e., more common strata than spurious differentia collisions alone could plausibly cause), this function assumes no common ancestry between focal and other, returning None. This means that if few enough common ranks are shared between focal and other (and the differentia bit width is small enough), it may not be possible to detect any common ancestry after accounting for the possibility of spurious differentia collisions (even if common ancestry did exist). In that case, calls to this function always return None. Likewise, MRCAs at very early ranks may not be reliably detectable due to insufficient evidence. This can lead to cases where columns with true common ancestry have MRCA bounds estimated as None at much higher than the expected failure rate at the given confidence level. Note that with sufficient differentia bit width (i.e., so that even one collision is implausible at the given confidence level) this issue does not occur. Use calc_ranks_since_earliest_detectable_mrca_with to determine the earliest rank at which an MRCA could be reliably detected between focal and other.
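Bounds on ranks elapsed since the MRCA mirror bounds on the rank of the MRCA, flipped around the focal column's most recent rank. The sketch below illustrates that conversion under stated assumptions: focal_tip_rank (the rank of focal's newest stratum) and the function name are hypothetical, and both intervals use the inclusive-lower, exclusive-upper convention.

```python
from typing import Optional, Tuple


def ranks_since_mrca_bounds(
    focal_tip_rank: int,
    mrca_rank_bounds: Optional[Tuple[int, int]],
) -> Optional[Tuple[int, int]]:
    """Convert [lower, upper) bounds on MRCA rank into [lower, upper)
    bounds on ranks elapsed along focal's lineage since the MRCA."""
    if mrca_rank_bounds is None:
        return None  # no common ancestor resolved
    mrca_lower, mrca_upper = mrca_rank_bounds
    # Latest admissible MRCA (mrca_upper - 1) means fewest ranks elapsed;
    # earliest admissible MRCA (mrca_lower) means most ranks elapsed.
    fewest = focal_tip_rank - (mrca_upper - 1)
    most = focal_tip_rank - mrca_lower
    return fewest, most + 1


print(ranks_since_mrca_bounds(10, (4, 8)))
```

The flip explains why the guarantees swap direction: the rank of the MRCA can only be underestimated, so the ranks elapsed since the MRCA can only be overestimated.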

calc_ranks_since_mrca_bounds_provided_confidence_level(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], requested_confidence_level: float = 0.95) float

Calculate provided confidence for a MRCA generation estimate.

With what actual confidence is the true rank of the MRCA captured within the calculated estimate bounds for a requested confidence level? Guaranteed greater than or equal to the requested confidence level.

The same argument may be provided for focal and other.

calc_ranks_since_mrca_uncertainty_with(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], prior: Literal['arbitrary'], confidence_level: float = 0.95) int | None

How wide is the estimation window for generations elapsed since MRCA?

Calculates uncertainty of the estimate for the number of depositions elapsed along focal column’s line of descent since the most recent common ancestor with other.

Returns 0 if no common ancestor between focal and other can be resolved with sufficient confidence. If insufficient common ranks between focal and other are available to resolve any common ancestor, returns None.

See Also

calc_ranks_since_mrca_bounds_with :

Calculates bound whose uncertainty this method reports. See the corresponding docstring for explanation of parameters.

does_definitively_have_no_common_ancestor(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]) bool

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

If the founding strata of first and second (i.e., generation 0) have unequal differentia, then first and second cannot possibly share common ancestry.

Note equal differentia at generation 0 does not necessarily imply common ancestry; colliding differentia values could have been independently generated by chance.

Note also that stratum retention policies are strictly required to permanently retain the most ancient stratum.

See Also

does_have_any_common_ancestor :

Can we conclude with confidence_level confidence that first and second share a common ancestor?

does_have_any_common_ancestor(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], confidence_level: float = 0.95) bool | None

Determine if common ancestry between first and second is evidenced.

If insufficient common ranks between first and second are available to resolve any common ancestor, returns None.

Note that stratum retention policies are strictly required to permanently retain the most ancient stratum.

Parameters

confidence_level : float, optional

The probability that we will correctly conclude no common ancestor is shared with second if, indeed, no common ancestor is actually shared. Default 0.95.

See Also

does_definitively_have_no_common_ancestor :

Can we definitively conclude that first and second share no common ancestor?
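The difference between the definitive check and the confidence-based check can be sketched as follows. This is a simplified model, not the library's code: columns are dicts mapping retained rank to differentia, the function names are hypothetical, and required_matches is a hypothetical stand-in for the number of consecutive matching strata deemed implausible by chance at the chosen confidence level.

```python
from typing import Dict, Optional


def definitively_no_common_ancestor(
    first: Dict[int, int], second: Dict[int, int]
) -> bool:
    """Unequal founding (rank 0) differentia rule out shared ancestry."""
    return first[0] != second[0]


def has_any_common_ancestor(
    first: Dict[int, int],
    second: Dict[int, int],
    required_matches: int,
) -> Optional[bool]:
    """True or False when enough common ranks exist to decide, else None."""
    common = sorted(first.keys() & second.keys())
    if len(common) < required_matches:
        return None  # too little evidence either way
    earliest = common[:required_matches]
    return all(first[rank] == second[rank] for rank in earliest)


a = {0: 1, 1: 0, 2: 1, 3: 1}
b = {0: 1, 1: 0, 2: 0, 3: 1}
print(definitively_no_common_ancestor(a, b))
print(has_any_common_ancestor(a, b, required_matches=2))
print(has_any_common_ancestor(a, b, required_matches=3))
print(has_any_common_ancestor(a, b, required_matches=5))
```

Note that a False from the definitive check does not imply a True from the evidence check: matching founding differentia may be a chance collision, so more matches may be required before common ancestry is accepted.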

estimate_patristic_distance_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], estimator: str, prior: Literal['arbitrary', 'uniform'] | PriorBase) float | None

Estimate the total phylogenetic distance along the branch path connecting the columns.

Branch length here is in terms of number of generations elapsed. So the calculated distance estimates the sum of the number of generations elapsed from each to their most recent common ancestor.

Parameters

estimator : {“maximum_likelihood”, “unbiased”}

What estimation method should be used? Options are “maximum_likelihood” or “unbiased”.

See estimate_rank_of_mrca_between for discussion of estimator options.

prior : {“arbitrary”, “uniform”} or object implementing prior interface

Prior probability density distribution over possible generations of the MRCA.

See estimate_rank_of_mrca_between for discussion of prior options.

Returns

float, optional

Estimate of patristic distance, unless first and second definitively share no common ancestor in which case None will be returned.

See Also

calc_patristic_distance_bounds_between :

Calculates confidence intervals for patristic distance between two hereditary stratigraphic columns.

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?

estimate_rank_of_mrca_between(first: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], second: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], estimator: str, prior: Literal['arbitrary', 'uniform'] | PriorBase) float | None

At what generation did the most recent common ancestor of first and second occur?

Parameters

estimator : {“maximum_likelihood”, “unbiased”}

What estimation method should be used? Options are “maximum_likelihood” or “unbiased”.

The “maximum_likelihood” estimator is faster to compute than the “unbiased” estimator.

prior : {“arbitrary”, “uniform”} or object implementing prior interface

Prior probability density distribution over possible generations of the MRCA.

Implementations for arbitrary, geometric, exponential, and uniform priors are available in hstrat.phylogenetic_inference.priors. User-defined classes specifying custom priors can also be provided.

Returns

float, optional

Estimate of MRCA rank, unless first and second definitively share no common ancestor in which case None will be returned.

See Also

calc_rank_of_mrca_bounds_between :

Calculates confidence intervals for generation of most recent common ancestor between two hereditary stratigraphic columns.

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?
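To give a feel for how a prior turns an interval into a point estimate, the sketch below computes the expected MRCA rank within a single half-open interval under a uniform prior. This is a deliberately reduced illustration: it ignores spurious differentia collisions and the weighting across multiple candidate intervals that a full estimator would need, and the function name is hypothetical.

```python
from typing import Tuple


def uniform_interval_mean(bounds: Tuple[int, int]) -> float:
    """Mean of integer ranks lower, ..., upper - 1, each equally likely."""
    lower, upper = bounds  # upper bound is exclusive
    if upper <= lower:
        raise ValueError("interval must contain at least one rank")
    return (lower + upper - 1) / 2


print(uniform_interval_mean((4, 8)))
print(uniform_interval_mean((3, 4)))
```

For bounds of (4, 8) the candidate ranks are 4, 5, 6, and 7, so the mean is 5.5, which is why the estimate is a float rather than an int. Non-uniform priors, such as geometric or exponential, would shift this value toward one end of the interval.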

estimate_ranks_since_mrca_with(focal: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], other: Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen], estimator: str, prior: str | Any) float | None

How many generations have elapsed since focal’s most recent common ancestor with other?

More specifically, estimate the number of depositions elapsed along focal column’s line of descent since the most recent common ancestor with other.

Parameters

estimator : {“maximum_likelihood”, “unbiased”}

What estimation method should be used? Options are “maximum_likelihood” or “unbiased”.

See estimate_rank_of_mrca_between for discussion of estimator options.

prior : {“arbitrary”, “uniform”} or object implementing prior interface

Prior probability density distribution over possible generations of the MRCA.

See estimate_rank_of_mrca_between for discussion of prior options.

Returns

float, optional

Estimate of generations elapsed since MRCA, unless focal and other definitively share no common ancestor in which case None will be returned.

See Also

calc_ranks_since_mrca_bounds_with :

Calculates confidence intervals for generations elapsed along one column’s line of descent since most recent common ancestor with another column.

does_definitively_have_no_common_ancestor :

Does the hereditary stratigraphic record definitively prove that first and second could not possibly share a common ancestor?