cdot documentation

Reference and how-to docs for cdot. These live in the repo, versioned with the code.

Getting started

Using local downloaded JSON.gz files - load release files into the HGVS libraries.
Biocommons HGVS examples - c→g / g→c, plus a T2T-CHM13v2.0 example.
PyHGVS examples - legacy PyHGVS integration (prefer biocommons).
FastaSeqFetcher - local FASTA sequence fetching (SeqRepo replacement).

Advanced usage

Advanced usage - fixing messy HGVS input (fix_hgvs / clean_hgvs) and read-ahead batch retrieval for bulk processing (RESTDataProvider.prefetch).
Transcript-version safety - the opt-in safe version fallback: how cdot decides a version substitution is coordinate-preserving (is_version_substitution_safe), and the study behind it.

Data format reference

JSON data format - every field in a cdot JSON(.gz) file, auto-generated from the typed models in cdot/models.py. A machine-readable JSON Schema (cdot-json-schema.json) is provided alongside it.
Coordinates & exon alignments - how exon coordinates, exon IDs and the alignment gap (CIGAR-like) strings work, with worked examples.
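
As a rough illustration of what a CIGAR-like gap string encodes, the sketch below parses one into operations and totals the bases consumed on each side of the alignment. It assumes the GFF3 "Gap" attribute style (space-separated tokens such as `M185 I3 M250`, with `M` = match, `I` = bases present only in the target, `D` = bases present only in the reference). The helper names `parse_gap` and `aligned_lengths` are hypothetical and not part of cdot; the Coordinates & exon alignments doc is the authority on the actual format.

```python
import re

# Assumption: each token is one operation letter (M, I or D) followed by a length,
# as in the GFF3 "Gap" attribute. This is an illustrative parser, not cdot's own code.
_GAP_OP = re.compile(r"^([MID])(\d+)$")


def parse_gap(gap):
    """Split a gap string into a list of (operation, length) tuples."""
    ops = []
    for token in gap.split():
        match = _GAP_OP.match(token)
        if match is None:
            raise ValueError(f"Unrecognised gap operation: {token!r}")
        ops.append((match.group(1), int(match.group(2))))
    return ops


def aligned_lengths(ops):
    """Return (reference_length, target_length) consumed by the operations.

    M consumes both sequences, D only the reference, I only the target.
    """
    ref_len = sum(n for op, n in ops if op in "MD")
    target_len = sum(n for op, n in ops if op in "MI")
    return ref_len, target_len


ops = parse_gap("M185 I3 M250")
print(ops)                   # [('M', 185), ('I', 3), ('M', 250)]
print(aligned_lengths(ops))  # (435, 438)
```

In this made-up example the target is 3 bases longer than the reference span, which is the kind of mismatch a gap string records for an exon.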

Data files & generation

GitHub release file details - what each released .json.gz file contains.
Create data from scratch - build the JSON files yourself from GTF/GFF3.
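
To get a quick feel for what a .json.gz file contains before reading the details, a file can be opened with the Python standard library alone. The sketch below is a minimal example that assumes the top level of the file is a JSON object. The function name `summarize_json_gz` is hypothetical, and the demo file with its keys and values is a made-up stand-in, not real release data; see the JSON data format doc for the actual fields.

```python
import gzip
import json
import os
import tempfile


def summarize_json_gz(path):
    """Map each top-level key to its item count (containers) or its value (scalars).

    Assumes the gzipped file holds a single JSON object at the top level.
    """
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        data = json.load(fh)
    summary = {}
    for key, value in data.items():
        if isinstance(value, (dict, list)):
            summary[key] = len(value)
        else:
            summary[key] = value
    return summary


# Demo on a tiny stand-in file; these keys and values are illustrative only.
sample = {
    "cdot_version": "0.0.0",
    "transcripts": {"NM_000000.1": {}, "NM_000000.2": {}},
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as fh:
        json.dump(sample, fh)
    demo_summary = summarize_json_gz(demo_path)

print(demo_summary)  # {'cdot_version': '0.0.0', 'transcripts': 2}
```

Note that `json.load` reads the whole file into memory, so a full-size release file may need a fair amount of RAM.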

Background

cdot vs UTA - how cdot compares to the Universal Transcript Archive.
Design notes & project direction - why JSON, known issues, project goals.