feat: add public sample_count_het() and cohort_heterozygosity() methods (#775) #1077

Open
Tanisha127 wants to merge 3 commits into malariagen:master from Tanisha127:feature/775-heterozygosity-support

@Tanisha127
Contributor

Summary

This PR addresses the remaining work from issue #775 by improving the
heterozygosity API with two new public methods and unit tests.

Changes

New Methods

sample_count_het()

  • Adds a public counterpart to the existing private _sample_count_het() method
  • Returns a pandas DataFrame with one row per window, containing:
    sample_id, window_start, window_stop, and heterozygosity
  • Makes heterozygosity data directly accessible to users without
    relying on private methods
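To illustrate the output shape described above, here is a minimal sketch in plain NumPy and pandas. It does not call the library: `windowed_het_frame` is a hypothetical helper, the non-overlapping windows of a fixed number of sites are an assumption, and heterozygosity is taken here to be the fraction of heterozygous calls in each window. The actual method in this PR may define windows and the statistic differently.

```python
import numpy as np
import pandas as pd


def windowed_het_frame(sample_id, pos, is_het, window_size):
    """Hypothetical helper: one row per non-overlapping window of sites.

    Assumes heterozygosity = fraction of heterozygous calls in the window.
    """
    pos = np.asarray(pos)
    is_het = np.asarray(is_het, dtype=bool)
    n_windows = len(pos) // window_size  # trailing partial window is dropped
    rows = []
    for i in range(n_windows):
        start, stop = i * window_size, (i + 1) * window_size
        rows.append(
            {
                "sample_id": sample_id,
                "window_start": int(pos[start]),
                "window_stop": int(pos[stop - 1]),
                "heterozygosity": float(is_het[start:stop].mean()),
            }
        )
    columns = ["sample_id", "window_start", "window_stop", "heterozygosity"]
    return pd.DataFrame(rows, columns=columns)


# Toy data: seven sites for one made-up sample, windows of three sites.
df_het = windowed_het_frame(
    sample_id="S1",
    pos=[10, 20, 30, 40, 50, 60, 70],
    is_het=[True, False, False, True, True, True, False],
    window_size=3,
)
print(df_het)
```

With this toy input the frame has two rows, because the seventh site does not fill a complete window.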

cohort_heterozygosity()

  • Enables cohort-level comparisons of heterozygosity across a genome region
  • Accepts the standard cohorts parameter (e.g. "taxon", "admin1_year")
    consistent with other cohort-level methods in the API
  • Returns a DataFrame with cohort, n_samples, and mean_heterozygosity
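The cohort-level summary can be pictured as a group-by over per-sample values. The sketch below is an illustration only, not the implementation in this PR: `summarise_by_cohort` is a hypothetical helper, the sample IDs and heterozygosity values are made up, and it assumes one heterozygosity value per sample has already been computed for the region.

```python
import pandas as pd

# Made-up per-sample values joined to a cohort column ("taxon" here).
df_samples = pd.DataFrame(
    {
        "sample_id": ["S1", "S2", "S3", "S4"],
        "taxon": ["gambiae", "gambiae", "coluzzii", "coluzzii"],
        "heterozygosity": [0.010, 0.014, 0.020, 0.030],
    }
)


def summarise_by_cohort(df, cohort_col):
    """Hypothetical helper: aggregate per-sample heterozygosity by cohort."""
    return (
        df.groupby(cohort_col)
        .agg(
            n_samples=("sample_id", "nunique"),
            mean_heterozygosity=("heterozygosity", "mean"),
        )
        .reset_index()
        .rename(columns={cohort_col: "cohort"})
    )


df_cohort = summarise_by_cohort(df_samples, "taxon")
print(df_cohort)
```

Passing a different column name (for example a hypothetical "admin1_year" column in the metadata) would group the same per-sample values by that cohort definition instead.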

Tests

  • Added test_sample_count_het() and test_cohort_heterozygosity() to
    tests/anoph/test_heterozygosity.py
  • All 24 tests pass across ag3, af1, adir1, and amin1 simulated datasets
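For reviewers who want a sense of what a shape check on the new output might look like, here is a rough sketch. It is not copied from tests/anoph/test_heterozygosity.py: `check_het_frame` is a hypothetical helper, the example frame is made up, and the assumption that heterozygosity lies in [0, 1] holds only if it is reported as a fraction.

```python
import pandas as pd

EXPECTED_COLUMNS = ["sample_id", "window_start", "window_stop", "heterozygosity"]


def check_het_frame(df):
    """Hypothetical check on the documented columns and basic value ranges."""
    assert isinstance(df, pd.DataFrame)
    assert list(df.columns) == EXPECTED_COLUMNS
    # Each window should start at or before the position where it stops.
    assert (df["window_start"] <= df["window_stop"]).all()
    # Assumption: heterozygosity is a fraction, so it lies between 0 and 1.
    assert df["heterozygosity"].between(0, 1).all()
    return True


# Made-up frame standing in for the method's return value.
df_example = pd.DataFrame(
    {
        "sample_id": ["S1", "S1"],
        "window_start": [10, 40],
        "window_stop": [30, 60],
        "heterozygosity": [0.25, 0.5],
    }
)
result = check_het_frame(df_example)
```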

Related Issue

Closes #775


Development

Successfully merging this pull request may close these issues.

More heterozygosity support
