{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "view-in-github" }, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/gist/mmore500/34d08f1f1d4ff74c7b9884c419955085/hstrat_ping_demo.ipynb)" ] }, { "cell_type": "markdown", "source": [ "## Demo: Real-time Ping Derby\n", "\n", "This demonstration showcases end-to-end application of the `hstrat` library to track the phylogenetic history of a real-time evolutionary process.\n", "The simulation dispatches genomes over the wire on round trips to various web servers via ping requests.\n", "This network transport mechanism is intended to demonstrate conditions analogous to distributed (i.e., many-CPU) evolutionary simulations, where trivial centralized ancestry tracking is not possible.\n", "\n", "Digital \"genomes\" in this simulation comprise a four-character domain name (e.g., `abcd` for `abcd.com`) that probabilistically mutates from one generation to the next.\n", "\n", "Each generation, all genomes in the population are serialized and then dispatched as payloads within ping requests to their encoded domain.\n", "The first `n` ping responses received back are deserialized and used to create the next generation.\n", "In this way, how quickly a genome's encoded domain returns ping responses influences its selection for reproduction.\n", "\n", "The functional content (i.e., domain name) of genomes is supplemented with hstrat instrumentation to allow for phylogenetic reconstruction of simulation history.\n", "This instrumentation is also sent over the wire, bundled with the functional genome content.\n", "\n", "The key features of `hstrat` demonstrated include:\n", "- Use of hstrat's stratum retention policies to balance accuracy of phylogenetic reconstruction against the constraint of payload size.\n", "- Serialization and deserialization of genomes (functional content plus 
instrumentation) for transmission in ping payloads.\n", "- Reconstruction of evolutionary history from the final population's hstrat instrumentation, allowing visualization of phylogenetic history to understand evolutionary dynamics over time.\n", "\n", "The simulation proceeds by initializing a population from a common ancestor, subjecting this population to mutations and selections across generations, and finally analyzing the resulting phylogenetic tree to infer the evolutionary history.\n" ], "metadata": { "id": "BnNKgqtFao9u" }, "id": "BnNKgqtFao9u" }, { "cell_type": "markdown", "source": [ "### Set Up: Environment, Dependencies, and Parameters\n", "\n", "In addition to hstrat, we'll use some tools from `mmore500/alife-phylogeny-tutorial` to perform ping operations." ], "metadata": { "id": "rrfs8wHQWIW_" }, "id": "rrfs8wHQWIW_" }, { "cell_type": "code", "execution_count": null, "id": "f180fff5", "metadata": { "id": "f180fff5" }, "outputs": [], "source": [ "# environment...\n", "!find . -name . 
-o -prune -exec rm -rf -- {} +\n", "!git init\n", "!git remote add origin https://github.com/mmore500/alife-phylogeny-tutorial.git\n", "!git fetch origin\n", "!git checkout 007340472e021588636b50157c7d3045269e707f\n", "!python3 -m pip install -r requirements.txt" ] }, { "cell_type": "code", "execution_count": 2, "id": "0bb0c853", "metadata": { "id": "0bb0c853", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "11571039-908f-4f88-9e10-1924f7deaee9" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Demo uses hstrat v1.8.2\n" ] } ], "source": [ "# dependencies...\n", "import random\n", "import string\n", "import typing\n", "\n", "import alifedata_phyloinformatics_convert as apc\n", "from hstrat import hstrat; print(f\"Demo uses hstrat v{hstrat.__version__}\")\n", "import pandas as pd\n", "import typing_extensions\n", "\n", "import pylib # local Python library @ ./pylib/" ] }, { "cell_type": "code", "execution_count": 3, "id": "8856885a", "metadata": { "id": "8856885a" }, "outputs": [], "source": [ "# parameters...\n", "# how many characters can genomes' domain string be?\n", "TARGET_DOMAIN_LEN: int = 4\n", "CHAR_MUTATE_RATE: float = 0.1\n", "N_POP: int = 8\n", "N_GEN: int = 10\n", "\n", "# how many copies can each genome make of itself\n", "# per reproduction event\n", "# i.e., how many outgoing pings to send at once\n", "PING_COPY_COUNT: int = 2\n", "\n", "# use 1 byte differentia values for hstrat instrumentation\n", "DIFFERENTIA_BIT_WIDTH: int = 8\n", "\n", "# use 4 bytes to store generation counter when serializing hstrat instrumentation\n", "GEN_COUNTER_BYTE_WIDTH: int = 4" ] }, { "cell_type": "markdown", "id": "efd1f4d4", "metadata": { "id": "efd1f4d4" }, "source": [ "### Choose Retention Policy\n", "\n", "Configure stratum retention policy to manage trade-off between instrumentation size (i.e., bytes occupied) vs. 
reconstruction accuracy.\n", "\n", "We don't want our ping payloads to get too large, so we will pick a retention policy that keeps instrumentation size below a fixed size cap.\n", "Let's use the curbed recency-proportional resolution stratum retention algorithm, which is a good go-to choice for space-constrained evolutionary applications of hstrat.\n", "\n", "Setting up a retention policy based on this algorithm requires calculation of the maximum number of differentia we would like to retain.\n", "\n", "For this example, suppose a 32-byte size budget for the ping payload.\n", "Allocate 4 bytes for the target domain string (functional genome content).\n", "\n", "Allocating another 4 bytes for the generation counter component of hstrat instrumentation will allow us up to 4,294,967,295 generations, which is plenty for this use case.\n", "\n", "We have 24 bytes left, so at 1 byte per differentia, we can accommodate up to 24 differentia." ] }, { "cell_type": "code", "execution_count": 4, "id": "315ab1ce", "metadata": { "id": "315ab1ce" }, "outputs": [], "source": [ "ping_stratum_retention_policy = (\n", "    hstrat.recency_proportional_resolution_curbed_algo.Policy(\n", "        size_curb=24,  # max num differentia retained at any one time\n", "    )\n", ")" ] }, { "cell_type": "markdown", "id": "6fcc17ae", "metadata": { "id": "6fcc17ae" }, "source": [ "### Define Genome\n", "\n", "Define a genome object to bundle the usual functional genetic information (i.e., information that influences phenotype/fitness) with purely-instrumentative hstrat information, which is invisible to fitness evaluation and only used to observe simulation history --- just like you might use a generation counter or mutation counter in other circumstances.\n", "\n", "The genome object applies mutation to functional genetic information (in this case, the domain to ping against) and the column update process to hstrat instrumentation when replicated to create an offspring.\n", "\n", "The `to_packet` and `from_packet` methods handle 
serialization/deserialization so that genome objects can be converted to raw bytes to be sent off in a ping payload and then read back from raw bytes when the ping payload returns.\n", "\n", "Note the use of built-in library tools to manage serialization/deserialization of hstrat instrumentation.\n", "In addition to the binary format used here, the library also includes built-in tools to serialize/deserialize to a variety of plain-text formats.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "9aed8673", "metadata": { "id": "9aed8673" }, "outputs": [], "source": [ "class PingGenome:\n", "\n", "    # where to ping this genome against\n", "    target_domain: str\n", "\n", "    # instrumentation to facilitate phylogenetic inference\n", "    hstrat_column: hstrat.HereditaryStratigraphicColumn\n", "\n", "    def __init__(\n", "        self: \"PingGenome\",\n", "        target_domain: typing.Optional[str] = None,\n", "        hstrat_column: typing.Optional[hstrat.HereditaryStratigraphicColumn] = None,\n", "    ):\n", "        if target_domain is None:\n", "            # create random target domain\n", "            target_domain = \"\".join(\n", "                random.choice(string.ascii_lowercase)\n", "                for __ in range(TARGET_DOMAIN_LEN)\n", "            )\n", "        self.target_domain = target_domain\n", "\n", "        if hstrat_column is None:\n", "            self.hstrat_column = hstrat.HereditaryStratigraphicColumn(\n", "                # stratum_retention_policy: typing.Any\n", "                # Policy struct that specifies the set of strata ranks\n", "                # that should be pruned from a hereditary\n", "                # stratigraphic column when the nth stratum is deposited.\n", "                stratum_retention_policy=ping_stratum_retention_policy,\n", "                # always_store_rank_in_stratum : bool, optional\n", "                # Should the deposition rank be stored as a data member of generated\n", "                # strata, even if not strictly necessary?\n", "                always_store_rank_in_stratum=False,\n", "                # stratum_differentia_bit_width : int, optional\n", "                # The bit width of the generated differentia. 
Default 64, allowing\n", "                # for 2^64 distinct values.\n", "                stratum_differentia_bit_width=DIFFERENTIA_BIT_WIDTH,\n", "            )\n", "        else:\n", "            self.hstrat_column = hstrat_column\n", "\n", "    def mutate(self: \"PingGenome\") -> None:\n", "        # for each target_domain character,\n", "        # apply a scramble event with CHAR_MUTATE_RATE probability\n", "        self.target_domain = \"\".join(\n", "            random.choice(string.ascii_lowercase)\n", "            if random.random() < CHAR_MUTATE_RATE\n", "            else char\n", "            for char in self.target_domain\n", "        )\n", "\n", "    def create_offspring(self: \"PingGenome\") -> \"PingGenome\":\n", "        offspring = PingGenome(\n", "            target_domain=self.target_domain,  # inherit target_domain\n", "            hstrat_column=(\n", "                # register elapsed generation w/ hstrat instrumentation,\n", "                # then pass instrumentation along to offspring\n", "                self.hstrat_column.CloneDescendant()\n", "            ),\n", "        )\n", "        offspring.mutate()  # mutate target_domain\n", "        return offspring\n", "\n", "    def to_packet(self: \"PingGenome\") -> typing_extensions.Buffer:\n", "        # serialize genome to a binary string\n", "        # that can be transmitted within ping payload\n", "        annotation_packet_bytes = hstrat.col_to_packet(\n", "            self.hstrat_column,\n", "            num_strata_deposited_byte_width=GEN_COUNTER_BYTE_WIDTH,\n", "        )\n", "        return self.target_domain.encode() + annotation_packet_bytes\n", "\n", "    @staticmethod\n", "    def from_packet(data: typing_extensions.Buffer) -> \"PingGenome\":\n", "        # deserialize genome from a binary string\n", "        # i.e., extracted from a ping payload\n", "\n", "        # first TARGET_DOMAIN_LEN bytes are target_domain string\n", "        target_domain = data[:TARGET_DOMAIN_LEN].decode()\n", "\n", "        # all the rest is the hstrat instrumentation\n", "        hstrat_column = hstrat.col_from_packet_buffer(\n", "            packet_buffer=data[TARGET_DOMAIN_LEN:],\n", "            differentia_bit_width=DIFFERENTIA_BIT_WIDTH,\n", "            num_strata_deposited_byte_width=GEN_COUNTER_BYTE_WIDTH,\n", " 
stratum_retention_policy=ping_stratum_retention_policy,\n", "        )\n", "\n", "        # put deserialized components together into a genome object\n", "        return PingGenome(\n", "            target_domain=target_domain,\n", "            hstrat_column=hstrat_column,\n", "        )" ] }, { "cell_type": "markdown", "id": "fd9a0329", "metadata": { "id": "fd9a0329" }, "source": [ "### Define Selection\n", "\n", "Process one generation of evolution on a population of `PingGenome`s and return the \"winning\" offspring that made it back first as the next population." ] }, { "cell_type": "code", "execution_count": 6, "id": "9fbf0dce", "metadata": { "id": "9fbf0dce" }, "outputs": [], "source": [ "def elapse_generation(\n", "    population: typing.List[PingGenome],\n", ") -> typing.List[PingGenome]:\n", "\n", "    # manages socket resources, etc.\n", "    pinger = pylib.PayloadPinger()\n", "\n", "    # loop until we get enough packets back\n", "    # to fill next population to same size as current population\n", "    next_population_packets: typing.List[typing_extensions.Buffer] = []\n", "    while len(next_population_packets) < len(population):\n", "\n", "        # how many more packets do we need?\n", "        num_empty_next_population_slots = len(population) - len(\n", "            next_population_packets\n", "        )\n", "\n", "        # dispatch ping requests\n", "        for __ in range(num_empty_next_population_slots):\n", "\n", "            # selection is random among current population\n", "            selection = random.choice(population)\n", "            # create several offspring and dispatch into ping payloads\n", "            for __ in range(PING_COPY_COUNT):\n", "                # create_offspring makes genome copy, applies mutation,\n", "                # & registers elapsed generation w/ hstrat instrumentation\n", "                offspring = selection.create_offspring()\n", "\n", "                # figure out where offspring points to\n", "                # and dispatch it as a ping payload\n", "                target_url = offspring.target_domain + \".com\"\n", "                pinger.send(target_url, offspring.to_packet())\n", "                # log request event\n", "                print(f\"---> packet sent to {target_url}\")\n", "\n", " 
# collect all available ping responses\n", "        # & extract their payloads into next_population_packets\n", "        # until we have enough packets for next population\n", "        while len(next_population_packets) < len(population):\n", "\n", "            maybe_packet = pinger.read()\n", "            if maybe_packet is None:\n", "                break  # no more ping responses to read right now\n", "            else:\n", "                next_population_packets.append(maybe_packet)\n", "\n", "                # log response event\n", "                packet_domain = maybe_packet[:TARGET_DOMAIN_LEN].decode()\n", "                print(f\" <=== packet returned from {packet_domain}\")\n", "\n", "    # deserialize packets back into genome objects\n", "    next_population: typing.List[PingGenome] = [\n", "        PingGenome.from_packet(packet) for packet in next_population_packets\n", "    ]\n", "    return next_population" ] }, { "cell_type": "markdown", "id": "2848a5f2", "metadata": { "id": "2848a5f2" }, "source": [ "### Do Evolution\n", "\n", "Create a common ancestor, initialize a population of `N_POP` offspring of the common ancestor, and update the population using the selection process `N_GEN` times." 
] }, { "cell_type": "code", "execution_count": 7, "id": "2c1449cc", "metadata": { "scrolled": false, "id": "2c1449cc", "outputId": "4a5d1e21-bf67-43b4-b8f3-b2a2f80ca48b", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\n", "------- generation 0 -------\n", "---> packet sent to sjqa.com\n", "---> packet sent to ooqa.com\n", "---> packet sent to ojhd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojua.com\n", "---> packet sent to kjqa.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to gjqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to bjqh.com\n", "---> packet sent to wjqh.com\n", "---> packet sent to ojqa.com\n", "---> packet sent to ojqa.com\n", " <=== packet returned from ojhd\n", " <=== packet returned from sjqa\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojua\n", " <=== packet returned from ojqd\n", " <=== packet returned from kjqa\n", "\n", "------- generation 1 -------\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqb.com\n", "---> packet sent to ojqi.com\n", "---> packet sent to kjqd.com\n", "---> packet sent to ojuv.com\n", "---> packet sent to ojua.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to zjkd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to qoua.com\n", "---> packet sent to ojua.com\n", "---> packet sent to tjqd.com\n", "---> packet sent to ojqd.com\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqa\n", " <=== packet returned from ojqa\n", " <=== packet returned from ojqi\n", " <=== packet returned from ojua\n", " <=== packet returned from ojqd\n", " <=== packet 
returned from ojqd\n", " <=== packet returned from ojqd\n", "\n", "------- generation 2 -------\n", "---> packet sent to oeqn.com\n", "---> packet sent to ujqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to onqd.com\n", "---> packet sent to ojza.com\n", "---> packet sent to bjua.com\n", "---> packet sent to ojqv.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to opua.com\n", "---> packet sent to ojua.com\n", "---> packet sent to ojua.com\n", "---> packet sent to ojua.com\n", "---> packet sent to ojpa.com\n", "---> packet sent to ojqa.com\n", " <=== packet returned from ojqd\n", " <=== packet returned from oeqn\n", " <=== packet returned from ojqd\n", " <=== packet returned from tjqd\n", " <=== packet returned from onqd\n", " <=== packet returned from ojza\n", " <=== packet returned from bjua\n", " <=== packet returned from ojqd\n", "\n", "------- generation 3 -------\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to gjuk.com\n", "---> packet sent to bjua.com\n", "---> packet sent to mjqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to tjqd.com\n", "---> packet sent to tjqd.com\n", "---> packet sent to ojga.com\n", "---> packet sent to ojza.com\n", "---> packet sent to ojqz.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to onqd.com\n", "---> packet sent to onqd.com\n", " <=== packet returned from ojpa\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqa\n", " <=== packet returned from bjua\n", " <=== packet returned from gjuk\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", "\n", "------- generation 4 -------\n", "---> packet sent to gjuk.com\n", "---> packet sent to gkuk.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojqd.com\n", 
"---> packet sent to gjuk.com\n", "---> packet sent to gjuk.com\n", "---> packet sent to ojqh.com\n", "---> packet sent to ofqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to ojqa.com\n", "---> packet sent to ojqa.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", " <=== packet returned from ojqd\n", " <=== packet returned from gjuk\n", " <=== packet returned from gkuk\n", " <=== packet returned from onqd\n", " <=== packet returned from onqd\n", " <=== packet returned from ojod\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqh\n", "\n", "------- generation 5 -------\n", "---> packet sent to ojwd.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojpd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to mpqw.com\n", "---> packet sent to gkfk.com\n", "---> packet sent to gvdk.com\n", "---> packet sent to gjuk.com\n", "---> packet sent to gjuk.com\n", "---> packet sent to onqd.com\n", "---> packet sent to onqd.com\n", "---> packet sent to oiod.com\n", "---> packet sent to ojod.com\n", " <=== packet returned from ojmd\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqa\n", " <=== packet returned from ojqa\n", " <=== packet returned from ojod\n", " <=== packet returned from ojqd\n", " <=== packet returned from ojqd\n", "\n", "------- generation 6 -------\n", "---> packet sent to ojod.com\n", "---> packet sent to ljod.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to njqd.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ouqd.com\n", "---> packet sent to opqd.com\n", "---> packet sent to ojod.com\n", "---> packet sent to opod.com\n", "---> packet sent to ojqa.com\n", 
"---> packet sent to ojqa.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ormd.com\n", "---> packet sent to ojmd.com\n", " <=== packet returned from ojod\n", " <=== packet returned from ljod\n", " <=== packet returned from ojqd\n", " <=== packet returned from oiod\n", " <=== packet returned from ojod\n", " <=== packet returned from ojod\n", " <=== packet returned from ouqd\n", " <=== packet returned from ojod\n", "\n", "------- generation 7 -------\n", "---> packet sent to ljld.com\n", "---> packet sent to ljod.com\n", "---> packet sent to ouod.com\n", "---> packet sent to cjld.com\n", "---> packet sent to oiod.com\n", "---> packet sent to oiod.com\n", "---> packet sent to oiod.com\n", "---> packet sent to oiod.com\n", "---> packet sent to ojqd.com\n", "---> packet sent to ojnd.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojod.com\n", "---> packet sent to ojom.com\n", " <=== packet returned from opod\n", " <=== packet returned from ormd\n", " <=== packet returned from ojmd\n", " <=== packet returned from ljld\n", " <=== packet returned from ljod\n", " <=== packet returned from ouod\n", " <=== packet returned from cjld\n", " <=== packet returned from ojqd\n", "\n", "------- generation 8 -------\n", "---> packet sent to ouod.com\n", "---> packet sent to ouod.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ljld.com\n", "---> packet sent to cjld.com\n", "---> packet sent to cjld.com\n", "---> packet sent to ouod.com\n", "---> packet sent to ouod.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to ocod.com\n", "---> packet sent to opeb.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ouod.com\n", "---> packet sent to onot.com\n", " <=== packet returned from ljld\n", " <=== packet returned from 
ljld\n", " <=== packet returned from cjld\n", " <=== packet returned from ouod\n", " <=== packet returned from cjld\n", " <=== packet returned from ouod\n", " <=== packet returned from ojmd\n", " <=== packet returned from ojmd\n", "\n", "------- generation 9 -------\n", "---> packet sent to djld.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to ojmz.com\n", "---> packet sent to ojid.com\n", "---> packet sent to olmd.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ljlp.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to ojmd.com\n", "---> packet sent to cjld.com\n", "---> packet sent to cjld.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ljld.com\n", "---> packet sent to ouod.com\n", "---> packet sent to ouod.com\n", " <=== packet returned from djld\n", " <=== packet returned from ljld\n", " <=== packet returned from ojmd\n", " <=== packet returned from ojid\n", " <=== packet returned from ljld\n", " <=== packet returned from olmd\n", " <=== packet returned from ljld\n", " <=== packet returned from ljld\n" ] } ], "source": [ "# create a common ancestor\n", "common_ancestor = PingGenome()\n", "\n", "# initialize population with offspring of common ancestor\n", "population = [common_ancestor.create_offspring() for __ in range(N_POP)]\n", "\n", "# update population N_GEN times\n", "for generation in range(N_GEN):\n", " print(f\"\\n------- generation {generation} -------\")\n", " population = elapse_generation(population)" ] }, { "cell_type": "markdown", "id": "29aec55b", "metadata": { "id": "29aec55b" }, "source": [ "### Extract Annotations and Build Tree\n", "\n", "Isolate hstrat columns from population at end of simulation.\n", "Use `hstrat.build_tree` to synthesize phylogeny estimate from population members' hstrat instrumentation." 
] }, { "cell_type": "code", "execution_count": 8, "id": "4b7e84e0", "metadata": { "id": "4b7e84e0" }, "outputs": [], "source": [ "# hstrat instrumentation from population at end of simulation\n", "extant_annotations = [\n", " # extract hstrat columns from genomes\n", " # & freeze dynamic instrumentation as \"specimens,\"\n", " # which are optimized for postprocessing analysis\n", " hstrat.col_to_specimen(genome.hstrat_column)\n", " for genome in population\n", "]\n", "\n", "# estimated_phylogeny is stored in alife data standards format\n", "# https://alife-data-standards.github.io/alife-data-standards/phylogeny.html\n", "estimated_phylogeny: pd.DataFrame = hstrat.build_tree(\n", " population=extant_annotations,\n", " taxon_labels=[genome.target_domain for genome in population],\n", " # the `build_tree` function tracks the current best-known general\n", " # purpose reconstruction algorithm\n", " # pin to the current version (e.g., \"1.7.2\") for long-term stability\n", " # or pin to hstrat.__version__ to track latest algorithm updates\n", " version_pin=hstrat.__version__,\n", ")" ] }, { "cell_type": "markdown", "id": "54799cb0", "metadata": { "id": "54799cb0" }, "source": [ "### Visualize Phylogeny\n", "\n", "Draw an ascii visualization of reconstructed phylogeny." 
] }, { "cell_type": "code", "execution_count": 9, "id": "fea2a097", "metadata": { "id": "fea2a097", "outputId": "d4c4f7a0-7831-420b-8665-aefc81119b81", "colab": { "base_uri": "https://localhost:8080/" } }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ " /----- ojid\n", " /----+ \n", " /---------+ \\----- olmd\n", " | | \n", " | \\---------- ojmd\n", " | \n", "+-----------------------+ /----- djld\n", " | | \n", " | /----+----- ljld\n", " | | | \n", " \\---------+ \\----- ljld\n", " | \n", " | /----- ljld\n", " \\----+ \n", " \\----- ljld\n", " \n", " \n" ] } ], "source": [ "# translate to dendropy (which provides lots of phylogenetics tools)\n", "# via alifedata phyloinformatics conversion tool\n", "dendropy_tree = apc.alife_dataframe_to_dendropy_tree(\n", " estimated_phylogeny,\n", " setup_edge_lengths=True,\n", ")\n", "\n", "# draw the reconstruction!\n", "print(dendropy_tree.as_ascii_plot(plot_metric=\"age\", width=50))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" }, "colab": { "provenance": [], "include_colab_link": true } }, "nbformat": 4, "nbformat_minor": 5 }