frozen_instrumentation

Frozen representations of genome annotations for efficient postprocessing and analysis.

Classes

HereditaryStratigraphicAssemblage

A collection of HereditaryStratigraphicSpecimens, padded to include entries for all ranks retained by any specimen within the assemblage.

HereditaryStratigraphicAssemblageSpecimen

Postprocessing representation of the differentia retained by an extant HereditaryStratigraphicColumn, indexed by deposition rank.

HereditaryStratigraphicSpecimen

Postprocessing representation of the differentia retained by an extant HereditaryStratigraphicColumn, indexed by deposition rank.

class HereditaryStratigraphicAssemblage

A collection of HereditaryStratigraphicSpecimens, padded to include entries for all ranks retained by any specimen within the assemblage.

This allows for more efficient comparisons between specimens, due to direct alignment.

Parameters

specimens : iterable of HereditaryStratigraphicSpecimen

The specimens that make up the assemblage.

See Also

HereditaryStratigraphicSpecimen

Type alias for a postprocessing representation of the differentia retained by an extant HereditaryStratigraphicColumn, indexed by deposition rank.

assemblage_from_records

Deserialize a HereditaryStratigraphicAssemblage from a dict composed of builtin data types.

pop_to_assemblage

Create a HereditaryStratigraphicAssemblage from a collection of `HereditaryStratigraphicColumn`s.

BuildSpecimens() -> Iterator[HereditaryStratigraphicAssemblageSpecimen]

Iterate over the specimens in the assemblage as potentially padded HereditaryStratigraphicAssemblageSpecimen objects.

__init__(specimens: Iterable[HereditaryStratigraphicSpecimen]) -> None

Construct a new HereditaryStratigraphicAssemblage instance.

Takes a collection of HereditaryStratigraphicSpecimen instances and creates a new HereditaryStratigraphicAssemblage instance containing those specimens.
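
The padding described above can be pictured with a minimal pure-Python sketch. It is not the library's implementation: each specimen is modeled as a plain dict mapping rank to differentia, and `pad_specimens` is a hypothetical helper introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a specimen: a dict mapping rank -> differentia.
Specimen = Dict[int, int]


def pad_specimens(
    specimens: List[Specimen],
) -> Tuple[List[int], List[List[Optional[int]]]]:
    """Align specimens on the union of their retained ranks.

    Ranks that a given specimen does not retain are filled with None.
    """
    # Union of every rank retained by any specimen, in ascending order.
    all_ranks = sorted(set().union(*specimens))
    rows = [[spec.get(rank) for rank in all_ranks] for spec in specimens]
    return all_ranks, rows


specimens = [
    {0: 11, 2: 42, 4: 7},
    {0: 11, 1: 93, 3: 58, 4: 60},
]
all_ranks, padded = pad_specimens(specimens)
for row in padded:
    print(row)
```

After padding, position i in every row refers to the same rank, so two specimens can be compared position by position without first searching for matching ranks.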

class HereditaryStratigraphicAssemblageSpecimen

Postprocessing representation of the differentia retained by an extant HereditaryStratigraphicColumn, indexed by deposition rank.

Differentia are stored using a nullable integer representation, which allows for inclusion of entries for all ranks retained by any specimen within the assemblage, even if a particular rank is not retained by this specimen. This allows for more efficient comparisons between specimens, due to direct alignment.
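
To illustrate why direct alignment helps, the sketch below walks two padded specimens in lockstep and reports the last rank at which both retain matching differentia before the first mismatch. It uses plain lists with None for missing entries rather than the Pandas-backed storage, and `last_common_rank` is a hypothetical helper, not part of this module.

```python
from typing import Optional, Sequence


def last_common_rank(
    ranks: Sequence[int],
    first: Sequence[Optional[int]],
    second: Sequence[Optional[int]],
) -> Optional[int]:
    """Return the last rank with matching differentia before any mismatch.

    Returns None if the specimens share no matching rank.
    """
    result = None
    for rank, a, b in zip(ranks, first, second):
        if a is None or b is None:
            continue  # rank not retained by both specimens; nothing to compare
        if a != b:
            break  # first disparity found; stop scanning
        result = rank
    return result


ranks = [0, 1, 2, 3, 4]
first = [11, None, 42, None, 7]
second = [11, 93, 42, 58, 60]
print(last_common_rank(ranks, first, second))
```

Because both sequences share one rank axis, the comparison is a single pass with no per-specimen rank lookup.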

See Also

HereditaryStratigraphicSpecimen

Specimen representation that can contain only ranks retained by that specimen.

HereditaryStratigraphicAssemblage

Gathers a collection of `HereditaryStratigraphicSpecimen`s and facilitates creation of corresponding aligned `HereditaryStratigraphicAssemblageSpecimen`s.

GetData() -> Series[DataType(UInt64)]

Get the underlying Pandas Series containing differentia values indexed by rank.

Notes

This function directly returns the specimen’s underlying Series data, so mutation of the returned object will alter or invalidate this specimen.

GetDifferentiaVals() -> ndarray

Get the underlying integer values in the stored Pandas NullableInteger Series.

Returns

differentia : np.ndarray

A 1-dimensional NumPy integer ndarray containing differentia values, including garbage values where the underlying Series is null.

Notes

This function returns a direct view into the Series data, so no copy is made. Changes to the returned array will propagate to the Series object’s underlying values, and vice versa.
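
The view semantics noted above are ordinary NumPy behavior. The following sketch uses a standalone array as a stand-in for a specimen's backing data (an assumption for illustration; no specimen object is constructed) and shows how a view differs from a defensive copy.

```python
import numpy as np

# Hypothetical backing buffer standing in for a specimen's differentia values.
backing = np.array([11, 93, 42, 58, 60], dtype=np.uint64)

view = backing[:]          # basic slicing yields a view onto the same memory
snapshot = backing.copy()  # an independent copy

view[0] = 0  # writes through to the backing buffer

print(backing[0], snapshot[0])
print(np.shares_memory(backing, view), np.shares_memory(backing, snapshot))
```

If the returned array needs to be modified, taking a `.copy()` first avoids altering the specimen it came from.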

GetNumDiscardedStrata() -> int

How many deposited strata have been discarded?

Determined by number of generations elapsed and the configured column retention policy.

GetNumStrataDeposited() -> int

How many strata have been deposited on the column?

Note that a first stratum is deposited on the column during initialization.

GetNumStrataRetained() -> int

How many strata are currently stored within the column?

May be fewer than the number of strata deposited if strata have been discarded as part of the configured stratum retention policy.
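
The three counts above are related by simple arithmetic. The sketch below works from a bare list of retained ranks and assumes that ranks are zero-based and that the most recent stratum is always retained, so the number deposited is the highest rank plus one. `count_strata` is a hypothetical helper, not a function of this module.

```python
from typing import Dict, Sequence


def count_strata(retained_ranks: Sequence[int]) -> Dict[str, int]:
    """Derive deposited, retained, and discarded counts from retained ranks."""
    # ASSUMPTION: zero-based ranks, and the newest stratum is always retained.
    num_deposited = retained_ranks[-1] + 1
    num_retained = len(retained_ranks)
    return {
        "deposited": num_deposited,
        "retained": num_retained,
        "discarded": num_deposited - num_retained,
    }


counts = count_strata([0, 4, 6, 7])
has_discarded = counts["discarded"] > 0
print(counts, has_discarded)
```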

GetRankAtColumnIndex(index: int) -> int

Map array position to generation of deposition.

What is the deposition rank of the stratum positioned at index i among retained strata? Index order is from most ancient (index 0) to most recent.
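
As a small illustration of the index-to-rank mapping, the sketch below stores retained ranks in ascending order, so position 0 holds the most ancient stratum. Both helper functions are hypothetical; the reverse lookup is included only to show that the mapping is invertible for retained ranks.

```python
from bisect import bisect_left
from typing import Optional, Sequence

# Hypothetical retained ranks, ordered from most ancient to most recent.
retained_ranks = [0, 4, 6, 7]


def rank_at_column_index(ranks: Sequence[int], index: int) -> int:
    """Map an array position among retained strata to its deposition rank."""
    return ranks[index]


def column_index_of_rank(ranks: Sequence[int], rank: int) -> Optional[int]:
    """Inverse lookup by binary search; None if the rank is not retained."""
    i = bisect_left(ranks, rank)
    if i < len(ranks) and ranks[i] == rank:
        return i
    return None


print(rank_at_column_index(retained_ranks, 2))
print(column_index_of_rank(retained_ranks, 5))
```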

GetRankIndex() -> ndarray

Get the integer index in the stored Pandas Series, representing the ranks of stratum entries.

Returns

ranks : np.ndarray

A NumPy array containing ranks of differentia entries, including null entries for differentia that are not retained.

GetStratumDifferentiaBitWidth() -> int

How many bits wide are the differentia of strata?

GetStratumMask() -> ndarray

Get a boolean mask indicating which entries in the stored Pandas NullableInteger Series are null.

I.e., which ranks does this specimen not retain differentia at?

Returns

mask : np.ndarray

A 1-dimensional boolean NumPy ndarray.

True values indicate nullness.

Notes

This function returns the underlying boolean mask used by the stored Pandas Series object to represent null values. This mask is a direct view into the Series data, so no copy is made. Changes to the mask will propagate to the stored Series object, and vice versa.
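
The mask convention (True means null) can be reproduced with a standalone Pandas nullable-integer Series. This sketch uses only public Pandas calls, which return copies rather than the direct view described above, and the sample values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical padded specimen: ranks 1 and 3 are not retained.
series = pd.Series(
    [11, pd.NA, 42, pd.NA, 7],
    index=[0, 1, 2, 3, 4],
    dtype="UInt64",
)

# True where the entry is null, i.e. the rank is not retained.
mask = series.isna().to_numpy()

# Ranks this specimen actually retains.
retained_ranks = series.index[~mask].to_numpy()

# Plain integer values, with an explicit filler of 0 at null positions.
values = series.to_numpy(dtype=np.uint64, na_value=0)

print(mask, retained_ranks, values)
```

Entries of `values` at masked positions carry no meaning and should be ignored, which mirrors the "garbage values" caveat on GetDifferentiaVals.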

HasDiscardedStrata() -> bool

Have any deposited strata been discarded?

IterRankDifferentiaZip(copyable: bool = False) -> Iterator[Tuple[int, int]]

Iterate over ranks of retained strata and their differentia.

If copyable, return an iterator that can be copied to produce a new fully-independent iterator at the same position.

Equivalent to zip(specimen.IterRetainedRanks(), specimen.IterRetainedDifferentia()), but may be more efficient.
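
One way to picture a copyable iterator is a small class that tracks its position explicitly and supports `copy.copy`. The class below is a hypothetical sketch over plain lists, not the iterator type this method returns.

```python
import copy
from typing import Sequence, Tuple


class CopyableZipIter:
    """Hypothetical iterator over (rank, differentia) pairs that can be copied."""

    def __init__(
        self, ranks: Sequence[int], differentia: Sequence[int], position: int = 0
    ) -> None:
        self._ranks = ranks
        self._differentia = differentia
        self._position = position

    def __iter__(self) -> "CopyableZipIter":
        return self

    def __next__(self) -> Tuple[int, int]:
        if self._position >= len(self._ranks):
            raise StopIteration
        item = (self._ranks[self._position], self._differentia[self._position])
        self._position += 1
        return item

    def __copy__(self) -> "CopyableZipIter":
        # The copy shares the underlying data but has its own position.
        return CopyableZipIter(self._ranks, self._differentia, self._position)


ranks = [0, 4, 6, 7]
differentia = [11, 42, 58, 60]

it = CopyableZipIter(ranks, differentia)
first_item = next(it)
fork = copy.copy(it)  # independent iterator at the same position
rest_original = list(it)
rest_fork = list(fork)
print(first_item, rest_original, rest_fork)
```

Exhausting the original leaves the fork untouched, which is useful when a search needs to backtrack to an earlier position.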

IterRetainedDifferentia() -> Iterator[int]

Iterate over differentia of strata retained in the specimen.

Differentia yielded from most ancient to most recent.

IterRetainedRanks() -> Iterator[int]

Iterate over deposition ranks of strata retained in the specimen.

__init__(stratum_differentia_series: Series[DataType(UInt64)], stratum_differentia_bit_width: int) -> None

Initialize a HereditaryStratigraphicAssemblageSpecimen object with a (potentially sparse) sequence of rank-indexed differentia and a differentia bit width.

class HereditaryStratigraphicSpecimen

Postprocessing representation of the differentia retained by an extant HereditaryStratigraphicColumn, indexed by deposition rank.

All entries correspond to retained differentia (i.e., no entries are null).

See Also

HereditaryStratigraphicAssemblageSpecimen

Specimen representation that allows for easier alignment among members of a population without perfectly homogeneous retained ranks.

specimen_from_records

Deserialize a HereditaryStratigraphicSpecimen from a dict composed of builtin data types.

col_to_specimen

Create a HereditaryStratigraphicSpecimen from a HereditaryStratigraphicColumn.

GetData() -> Series[DataType(uint64)]

Get the underlying Pandas Series containing differentia values indexed by rank.

Notes

This function directly returns the specimen’s underlying Series data, so mutation of the returned object will alter or invalidate this specimen.

GetDifferentiaVals() -> ndarray

Get the underlying integer values in the stored Pandas Series.

Returns

differentia : np.ndarray

A 1-dimensional NumPy integer ndarray containing differentia values. Because this specimen type has no null entries, every value corresponds to a retained differentia.

Notes

This function returns a direct view into the Series data, so no copy is made. Changes to the returned array will propagate to the Series object’s underlying values, and vice versa.

GetNumDiscardedStrata() -> int

How many deposited strata have been discarded?

Determined by number of generations elapsed and the configured column retention policy.

GetNumStrataDeposited() -> int

How many strata have been deposited on the column?

Note that a first stratum is deposited on the column during initialization.

GetNumStrataRetained() -> int

How many strata are currently stored within the column?

May be fewer than the number of strata deposited if strata have been discarded as part of the configured stratum retention policy.

GetRankAtColumnIndex(index: int) -> int

Map array position to generation of deposition.

What is the deposition rank of the stratum positioned at index i among retained strata? Index order is from most ancient (index 0) to most recent.

GetRankIndex() -> ndarray

Get the integer index in the stored Pandas Series, representing the ranks of stratum entries.

Returns

ranks : np.ndarray

A NumPy array containing ranks of differentia entries, including null entries for differentia that are not retained.

GetStratumDifferentiaBitWidth() -> int

How many bits wide are the differentia of strata?

HasDiscardedStrata() -> bool

Have any deposited strata been discarded?

IterRankDifferentiaZip(copyable: bool = False) -> Iterator[Tuple[int, int]]

Iterate over ranks of retained strata and their differentia.

If copyable, return an iterator that can be copied to produce a new fully-independent iterator at the same position.

Equivalent to zip(specimen.IterRetainedRanks(), specimen.IterRetainedDifferentia()), but may be more efficient.

IterRetainedDifferentia() -> Iterator[int]

Iterate over differentia of strata retained in the specimen.

Differentia yielded from most ancient to most recent.

IterRetainedRanks() -> Iterator[int]

Iterate over deposition ranks of strata retained in the specimen.

__init__(stratum_differentia_series: Series[DataType(uint64)], stratum_differentia_bit_width: int) -> None

Initialize a HereditaryStratigraphicSpecimen object with a sequence of rank-indexed differentia and a differentia bit width.