population
Functions to infer phylogenetic history among a population of extant hstrat columns.
Functions
- build_distance_matrix_biopython: Compute all-pairs patristic distance among a population of extant hereditary stratigraphic columns as a BioPython DistanceMatrix.
- build_distance_matrix_numpy: Compute all-pairs patristic distance among a population of extant hereditary stratigraphic columns.
- calc_rank_of_earliest_detectable_mrca_among: After what generation is common ancestry robustly detectable?
- calc_rank_of_mrca_bounds_among: Within what generation range did MRCA fall?
- calc_rank_of_mrca_uncertainty_among: How wide is the estimate window for generation of MRCA?
- does_definitively_share_no_common_ancestor: Could the population possibly share a common ancestor?
- does_share_any_common_ancestor: Determine if common ancestry is evidenced within the population.
- build_distance_matrix_biopython(population: Sequence[Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]], estimator: str, prior: str | Any, taxon_labels: Iterable | None = None, force_common_ancestry: bool | None = False) DistanceMatrix
Compute all-pairs patristic distance among a population of extant hereditary stratigraphic columns as a BioPython DistanceMatrix.
Parameters
- population : Sequence[HereditaryStratigraphicArtifact]
The extant hereditary stratigraphic columns to compare.
The ordering of rows and columns within the returned matrix will correspond to the ordering of columns in this sequence.
- estimator : {“maximum_likelihood”, “unbiased”}
What patristic distance estimation method should be used? Options are “maximum_likelihood” or “unbiased”.
See estimate_ranks_since_mrca_with for discussion of estimator options.
- prior :
Prior probability density distribution over possible generations of the MRCA for MRCA estimation.
See estimate_ranks_since_mrca_with for discussion of prior options.
- taxon_labels : Iterable[str], optional
How should leaf nodes representing extant hereditary stratigraphic columns be named?
Label order should correspond to the order of corresponding hereditary stratigraphic columns within population. If None, taxons will be named according to their numerical index.
- force_common_ancestry : Optional[bool], optional
How should columns that definitively share no common ancestry be handled?
If set to True, treat columns with no common ancestry as if they shared a common ancestor immediately before the genesis of the lineages. If set to False, set the patristic distance between columns with no common ancestry as NaN.
If set to None, the presence of columns that definitively share no common ancestry will raise a ValueError. Note that the signature above lists False, not None, as the default.
Returns
- DistanceMatrix
The patristic distance matrix of the population, as a BioPython DistanceMatrix with one row and column per member of population.
Raises
- ValueError
If the population contains columns that definitively share no common ancestry and force_common_ancestry is None.
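As an illustration of the return format and the default taxon naming described above, the following is a minimal sketch, not hstrat's implementation. It assumes default labels are the string form of each column's index, and it relies on BioPython's DistanceMatrix accepting a lower-triangular nested list with the diagonal included. The helper names `to_lower_triangular` and `default_taxon_labels` are hypothetical.

```python
from typing import List, Optional, Sequence


def to_lower_triangular(square: Sequence[Sequence[float]]) -> List[List[float]]:
    """Keep row i's entries for columns 0..i (diagonal included)."""
    return [list(row[: i + 1]) for i, row in enumerate(square)]


def default_taxon_labels(
    num_taxa: int, taxon_labels: Optional[Sequence[str]] = None
) -> List[str]:
    """Fall back to numerical indices when no labels are supplied.

    Assumption: indices are rendered as strings; the library may differ.
    """
    if taxon_labels is None:
        return [str(i) for i in range(num_taxa)]
    labels = list(taxon_labels)
    if len(labels) != num_taxa:
        raise ValueError("need exactly one label per column in population")
    return labels


# A made-up square patristic distance matrix for three taxa.
square = [
    [0.0, 4.0, 6.0],
    [4.0, 0.0, 2.0],
    [6.0, 2.0, 0.0],
]
names = default_taxon_labels(len(square))
lower = to_lower_triangular(square)
# With BioPython installed, these could be passed as
# Bio.Phylo.TreeConstruction.DistanceMatrix(names, lower).
```

Label order follows row order, which mirrors the requirement that taxon_labels correspond to the ordering of columns within population.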
- build_distance_matrix_numpy(population: Sequence[Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]], estimator: str, prior: str | Any, force_common_ancestry: bool | None = False) ndarray
Compute all-pairs patristic distance among a population of extant hereditary stratigraphic columns.
Parameters
- population : Sequence[HereditaryStratigraphicArtifact]
The extant hereditary stratigraphic columns to compare.
The ordering of rows and columns within the returned matrix will correspond to the ordering of columns in this sequence.
- estimator : {“maximum_likelihood”, “unbiased”}
What patristic distance estimation method should be used? Options are “maximum_likelihood” or “unbiased”.
See estimate_ranks_since_mrca_with for discussion of estimator options.
- prior :
Prior probability density distribution over possible generations of the MRCA for MRCA estimation.
See estimate_ranks_since_mrca_with for discussion of prior options.
- force_common_ancestry : Optional[bool], optional
How should columns that definitively share no common ancestry be handled?
If set to True, treat columns with no common ancestry as if they shared a common ancestor immediately before the genesis of the lineages. If set to False, set the patristic distance between columns with no common ancestry as NaN.
If set to None, the presence of columns that definitively share no common ancestry will raise a ValueError. Note that the signature above lists False, not None, as the default.
Returns
- np.ndarray
The patristic distance matrix of the population, a square numpy array of shape (len(population), len(population)).
Raises
- ValueError
If the population contains columns that definitively share no common ancestry and force_common_ancestry is None.
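The three-way handling of force_common_ancestry can be sketched with a toy matrix builder. This is an illustrative stand-in written in pure Python, not the library's code: `build_toy_distance_matrix`, `toy_pairwise_distance`, and the `fallback_distance` argument are all hypothetical, and a pairwise result of None is assumed to mean "definitively no common ancestry".

```python
import math
from typing import Callable, List, Optional, Sequence


def build_toy_distance_matrix(
    population: Sequence[str],
    pairwise_distance: Callable[[str, str], Optional[float]],
    fallback_distance: float,
    force_common_ancestry: Optional[bool],
) -> List[List[float]]:
    """Fill a symmetric matrix, applying the three-way policy to unrelated pairs."""
    n = len(population)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            dist = pairwise_distance(population[i], population[j])
            if dist is None:  # pair definitively shares no common ancestry
                if force_common_ancestry is None:
                    raise ValueError("columns share no common ancestry")
                # True: pretend a shared ancestor just before lineage genesis.
                # False: mark the entry as undefined.
                dist = fallback_distance if force_common_ancestry else math.nan
            matrix[i][j] = matrix[j][i] = dist
    return matrix


def toy_pairwise_distance(first: str, second: str) -> Optional[float]:
    """Hypothetical stand-in: taxa whose labels share a first letter are related."""
    return 2.0 if first[0] == second[0] else None


taxa = ["a1", "a2", "b1"]
with_nan = build_toy_distance_matrix(taxa, toy_pairwise_distance, 10.0, False)
forced = build_toy_distance_matrix(taxa, toy_pairwise_distance, 10.0, True)
```

Because NaN compares unequal to itself, downstream code should test such entries with math.isnan (or numpy.isnan on an array) rather than with ==.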
- calc_rank_of_earliest_detectable_mrca_among(population: Iterable[Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]], confidence_level: float = 0.95) int | None
After what generation is common ancestry robustly detectable?
Calculates the earliest rank at which an MRCA among the population could be reliably detected.
Even if a true MRCA of the population exists, if it occurred earlier than the rank calculated here, it could not be detected with sufficient confidence after accounting for the possibility of spurious differentia collisions. (However, spurious differentia collisions after the true MRCA of the population could still lead to MRCA detection at such a rank.)
Returns None if insufficient common ranks exist within the population to ever conclude, at the given confidence level, the existence of any common ancestry among the population (even if all strata at common ranks had equivalent differentiae).
Also returns None if population is empty or singleton.
Notes
The current implementation uses a naive O(n^2) approach. A more efficient implementation should be possible.
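The role of spurious differentia collisions can be made concrete with a small calculation. The sketch below assumes each differentia is an independent, uniformly random value of a fixed bit width, so that two unrelated strata match with probability 2^-bits; the function name `min_common_ranks_for_confidence` is hypothetical and is not part of the library.

```python
def min_common_ranks_for_confidence(
    differentia_bit_width: int, confidence_level: float = 0.95
) -> int:
    """Smallest count of matching common ranks that chance alone is unlikely to produce.

    Assumes independent, uniformly random differentiae (an illustrative model).
    """
    if differentia_bit_width < 1:
        raise ValueError("differentia_bit_width must be at least 1")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    collision_probability = 2.0 ** -differentia_bit_width
    num_ranks = 1
    # Probability that num_ranks unrelated strata all collide by chance.
    while collision_probability ** num_ranks > 1.0 - confidence_level:
        num_ranks += 1
    return num_ranks
```

Under this model, single-bit differentiae need several consecutive matches before common ancestry can be asserted at 95% confidence, whereas wide differentiae need only one. A population with fewer common ranks than this threshold corresponds to the None case described above.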
- calc_rank_of_mrca_bounds_among(population: Iterable[Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]], prior: str, confidence_level: float = 0.95) Tuple[int, int] | None
Within what generation range did MRCA fall?
Calculate bounds on estimate for the number of depositions elapsed along the line of descent before the most recent common ancestor among population.
Parameters
- prior : {“arbitrary”}
Prior probability density distribution over possible generations of the MRCA.
Currently only “arbitrary” supported.
- confidence_level : float, optional
With what probability must the bounds contain the true rank of the MRCA? Default 0.95.
Returns
- (int, int), optional
Inclusive lower and then exclusive upper bound on the estimate, or None if no common ancestor among the population can be resolved with sufficient confidence. (Sufficient confidence depends on confidence_level.) Also returns None for an empty or singleton population.
Notes
The current implementation uses a naive O(n^2) approach. A more efficient implementation should be possible.
The true rank of the MRCA is guaranteed to never fall above the bounds but may fall below.
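Since the returned bounds are half-open (inclusive lower, exclusive upper) and may be None, a caller might unpack them as in the following minimal sketch. The helper `candidate_mrca_ranks` is hypothetical and only illustrates how the documented return value could be interpreted.

```python
from typing import List, Optional, Tuple


def candidate_mrca_ranks(bounds: Optional[Tuple[int, int]]) -> List[int]:
    """List every rank inside [lower, upper); empty when no bounds are available."""
    if bounds is None:
        # No common ancestor resolved, or population was empty or singleton.
        return []
    lower, upper = bounds
    # range() is itself half-open, matching the documented convention.
    return list(range(lower, upper))
```

For example, bounds of (3, 6) would cover ranks 3, 4, and 5, while rank 6 is excluded.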
- calc_rank_of_mrca_uncertainty_among(population: Iterable[Union[HereditaryStratigraphicColumn, HereditaryStratigraphicSpecimen]], prior: str, confidence_level: float = 0.95) int | None
How wide is the estimate window for generation of MRCA?
Calculate uncertainty of the estimate for the number of depositions elapsed along the line of descent before the most recent common ancestor among population.
Returns 0 if no common ancestor among the population can be resolved with sufficient confidence. If insufficient common ranks within the population are available to resolve any common ancestor, returns None. If population is empty or singleton, also returns None.
See Also
- calc_rank_of_mrca_bounds_among :
Calculates the bounds whose uncertainty this function reports. See the corresponding docstring for explanation of parameters.
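One plausible way to relate the uncertainty to the bounds is sketched below. It assumes, without confirmation from this page, that uncertainty is the number of candidate ranks in the half-open window minus one, so that a window pinning the MRCA to a single rank has zero uncertainty. The helper `window_uncertainty` is hypothetical, and it does not model the cases in which the documented function returns None.

```python
from typing import Optional, Tuple


def window_uncertainty(bounds: Optional[Tuple[int, int]]) -> int:
    """Width of a half-open [lower, upper) window, beyond its first rank.

    Assumed convention for illustration only; the library's formula may differ.
    """
    if bounds is None:
        # Mirrors the documented behavior of returning 0 when no common
        # ancestor can be resolved with sufficient confidence.
        return 0
    lower, upper = bounds
    return upper - lower - 1
```

Under this convention a result of 0 is ambiguous on its own: it arises both from unresolved ancestry and from a single-rank window, so inspecting the bounds directly is the safer way to tell the two apart.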
- does_definitively_share_no_common_ancestor
Could the population possibly share a common ancestor?
Note that stratum retention policies are strictly required to permanently retain the most ancient stratum.
Returns None if population is empty or singleton.
See Also
- does_share_any_common_ancestor:
Can we conclude with confidence_level confidence that the population shares a common ancestor?
- does_share_any_common_ancestor
Determine if common ancestry is evidenced within the population.
If insufficient common ranks between strata are available to resolve any common ancestor, returns None.
Note that stratum retention policies are strictly required to permanently retain the most ancient stratum.
Parameters
- confidence_level : float, optional
The probability that we will correctly conclude no common ancestor is shared if, indeed, no common ancestor is actually shared. Default 0.95.
See Also
- does_definitively_share_no_common_ancestor :
Can we definitively conclude that the population shares no common ancestor?
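Both predicates can return None as well as a boolean, so a plain truthiness check would conflate "indeterminate" with "no". The following minimal sketch shows one way a caller might keep the three outcomes apart; `summarize_common_ancestry` is a hypothetical helper, not a library function.

```python
from typing import Optional


def summarize_common_ancestry(shares_any: Optional[bool]) -> str:
    """Map the three possible results onto distinct messages."""
    if shares_any is None:
        # Too few common ranks, or an empty or singleton population.
        return "indeterminate"
    return "common ancestry evidenced" if shares_any else "no common ancestry evidenced"
```

Testing `is None` first matters because `not None` evaluates to True in Python, which would otherwise report missing evidence as a negative finding.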