assemblage_from_dstream_df

assemblage_from_dstream_df(df: pandas.DataFrame, progress_wrap: typing.Callable = <function <lambda>>) → HereditaryStratigraphicAssemblage

Deserialize a HereditaryStratigraphicAssemblage from a pandas DataFrame containing dstream surface data.

Each row of the DataFrame represents a single hereditary stratigraphic surface, serialized as a hex string with associated dstream metadata. Surfaces are deserialized, converted to specimens, and assembled into a HereditaryStratigraphicAssemblage.

Parameters

df : pd.DataFrame

DataFrame with dstream surface data.

Required schema:
  • ‘data_hex’ : string

    Raw genome data as a hexadecimal string.

  • ‘dstream_algo’ : string or categorical

    Name of downstream curation algorithm (e.g., 'dstream.steady_algo').

  • ‘dstream_storage_bitoffset’ : integer

    Bit offset of the dstream buffer field in data_hex.

  • ‘dstream_storage_bitwidth’ : integer

    Bit width of the dstream buffer field in data_hex.

  • ‘dstream_T_bitoffset’ : integer

    Bit offset of the dstream counter (“rank”) field in data_hex.

  • ‘dstream_T_bitwidth’ : integer

    Bit width of the dstream counter field in data_hex.

  • ‘dstream_S’ : integer

    Capacity of the dstream buffer (number of differentia stored per annotation).
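
The offset and width columns above locate fields inside data_hex. The following is a minimal sketch of how such a field could be read out of a hex string. It is not the library's implementation: the helper name extract_bitfield and the row values are hypothetical, and it assumes bits are counted from the most significant bit of the first hex digit, which may differ from the convention the library uses.

```python
def extract_bitfield(data_hex: str, bitoffset: int, bitwidth: int) -> int:
    """Return the unsigned integer stored in one bit field of a hex string.

    Assumption: bit 0 is the most significant bit of the first hex digit.
    """
    total_bits = len(data_hex) * 4  # each hex digit encodes 4 bits
    if bitoffset < 0 or bitwidth < 0 or bitoffset + bitwidth > total_bits:
        raise ValueError("bit field lies outside data_hex")
    value = int(data_hex, 16)
    shift = total_bits - bitoffset - bitwidth  # bits to the right of the field
    return (value >> shift) & ((1 << bitwidth) - 1)


# Hypothetical row: a 32-bit counter followed by a 64-bit buffer.
row = {
    "data_hex": "0000000a" + "0123456789abcdef",
    "dstream_algo": "dstream.steady_algo",
    "dstream_storage_bitoffset": 32,
    "dstream_storage_bitwidth": 64,
    "dstream_T_bitoffset": 0,
    "dstream_T_bitwidth": 32,
    "dstream_S": 8,
}

counter = extract_bitfield(
    row["data_hex"], row["dstream_T_bitoffset"], row["dstream_T_bitwidth"]
)
storage = extract_bitfield(
    row["data_hex"],
    row["dstream_storage_bitoffset"],
    row["dstream_storage_bitwidth"],
)
```

In this made-up layout the two fields do not overlap and together span the whole string; a real serialization may carry additional bits outside either field.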

progress_wrap : callable, optional

Wrapper applied to the row iterator, e.g. tqdm.tqdm for a progress bar. Must accept and return an iterable. Default is the identity function (no wrapping).
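
As an illustration of the "accept and return an iterable" contract, here is a minimal sketch of two wrappers that would satisfy it. Both names (identity_wrap, logging_wrap) are hypothetical and not part of the library; tqdm.tqdm follows the same calling pattern.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def identity_wrap(iterable: Iterable[T]) -> Iterable[T]:
    """Return the iterable unchanged, mirroring the documented default."""
    return iterable


def logging_wrap(iterable: Iterable[T], every: int = 2) -> Iterator[T]:
    """Yield items unchanged, printing a message after every `every` items."""
    for count, item in enumerate(iterable, start=1):
        if count % every == 0:
            print(f"processed {count} rows")
        yield item


rows = ["row0", "row1", "row2", "row3"]
wrapped = list(logging_wrap(rows))  # same items, in the same order
```

Because the wrapper is called with a single iterable argument, any extra configuration (such as `every` here) needs a default value or should be bound beforehand, for example with functools.partial.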

Returns

HereditaryStratigraphicAssemblage

Assemblage built from the deserialized surfaces.

Raises

ValueError

If any required column is missing from the DataFrame.
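
Callers who want to fail early with their own message can check the schema before calling the function. The sketch below shows one way to do that, assuming only the column names listed under Parameters; the helper check_required_columns is hypothetical and does not reproduce the library's own validation or error text.

```python
REQUIRED_COLUMNS = (
    "data_hex",
    "dstream_algo",
    "dstream_storage_bitoffset",
    "dstream_storage_bitwidth",
    "dstream_T_bitoffset",
    "dstream_T_bitwidth",
    "dstream_S",
)


def check_required_columns(columns) -> None:
    """Raise ValueError naming every required column absent from `columns`.

    `columns` can be any iterable of names, e.g. the `columns` attribute
    of a pandas DataFrame.
    """
    present = set(columns)
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"missing required columns: {missing}")


# Passes silently when every required column is present.
check_required_columns(REQUIRED_COLUMNS)
```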

See Also

surf_from_hex :

Deserialize a single surface from a hex string.

pop_to_assemblage :

Create an assemblage from a collection of HereditaryStratigraphicColumn objects.

assemblage_from_records :

Deserialize an assemblage from a dict of builtin types.