DNA Registry Model

Sequence registry and canonical ID system for DNA-inspired vault services

Overview

The DNA Registry Model is a canonical sequence registry. Each entry has a unique identifier, genomic coordinates on a named reference build, a strand orientation, and cross-references to blueprints, circuits, and measurement evidence. The registry is the foundation of the DNA vault service architecture.

Registry Structure

Each registry entry has the following fields, shown here with example values:

{
  "sequence_id": "ACTB_Malkuth_promoter",
  "label": "ACTB_Malkuth",
  "reference_build": "GRCh38",
  "chromosome": "7",
  "start": 5563902,
  "end": 5563932,
  "strand": -1,
  "sequence": "GGAATCACTTGCACCCGGGAGGCGGAGGCTG",
  "family": "promoter",
  "source_file": "ACTB_Malkuth_promoter.fa",
  "blueprint_refs": [],
  "circuit_refs": [],
  "evidence_refs": []
}
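
The entry above can be loaded and sanity-checked with a small amount of code. The following is a minimal sketch, not the registry's actual implementation: the `RegistryEntry` class and its `validate` method are hypothetical names, and the check assumes that `start` and `end` are 1-based and inclusive, which matches the example (a 31-base sequence spanning 5563902 to 5563932).

```python
import json
from dataclasses import dataclass, field

EXAMPLE_JSON = """
{
  "sequence_id": "ACTB_Malkuth_promoter",
  "label": "ACTB_Malkuth",
  "reference_build": "GRCh38",
  "chromosome": "7",
  "start": 5563902,
  "end": 5563932,
  "strand": -1,
  "sequence": "GGAATCACTTGCACCCGGGAGGCGGAGGCTG",
  "family": "promoter",
  "source_file": "ACTB_Malkuth_promoter.fa",
  "blueprint_refs": [],
  "circuit_refs": [],
  "evidence_refs": []
}
"""

VALID_BASES = set("ACGT")


@dataclass
class RegistryEntry:
    """Hypothetical in-memory form of one registry entry."""
    sequence_id: str
    label: str
    reference_build: str
    chromosome: str
    start: int
    end: int
    strand: int
    sequence: str
    family: str
    source_file: str
    blueprint_refs: list = field(default_factory=list)
    circuit_refs: list = field(default_factory=list)
    evidence_refs: list = field(default_factory=list)

    def validate(self) -> list:
        """Return a list of problems; an empty list means no problem was found."""
        problems = []
        if self.strand not in (1, -1):
            problems.append("strand must be 1 or -1")
        if self.start > self.end:
            problems.append("start must not exceed end")
        # Assumption: coordinates are 1-based and inclusive.
        span = self.end - self.start + 1
        if span != len(self.sequence):
            problems.append(
                f"coordinate span {span} != sequence length {len(self.sequence)}"
            )
        if not set(self.sequence) <= VALID_BASES:
            problems.append("sequence contains characters other than A, C, G, T")
        return problems


entry = RegistryEntry(**json.loads(EXAMPLE_JSON))
print(entry.validate())
```

Returning a list of problems rather than raising on the first one lets a caller report every inconsistency in an entry at once.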

Sequence Categories

Promoter Fragments

GRCh38-mapped promoter regions with strand orientation and genomic coordinates

Circuit Families

DNA-inspired circuit families (10bp, 20bp, 34bp) with measured execution data

Cross-Linking Model

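Registry entries support cross-linking to the following objects. The first three correspond to the blueprint_refs, circuit_refs, and evidence_refs fields of an entry: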

  • Blueprints: Canonical blueprint objects with hashes and certification
  • Circuits: Derived circuit families with execution metadata
  • Evidence: Measurement reports with entropy and state diversity
  • Provenance: Immutable audit chain for lifecycle tracking
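
As an illustration of how cross-links might be attached to an entry, the sketch below appends a reference ID to the matching `*_refs` list. The function name `add_ref` and the ID strings are hypothetical, and the source does not specify the format of reference IDs. Provenance is left out because the example entry carries no provenance field.

```python
# Maps a link kind to the entry field that stores it (field names from the entry example).
REF_FIELDS = {
    "blueprint": "blueprint_refs",
    "circuit": "circuit_refs",
    "evidence": "evidence_refs",
}


def add_ref(entry: dict, kind: str, ref_id: str) -> dict:
    """Attach ref_id to the entry under the given kind, skipping duplicates."""
    if kind not in REF_FIELDS:
        raise ValueError(f"unknown reference kind: {kind!r}")
    refs = entry.setdefault(REF_FIELDS[kind], [])
    if ref_id not in refs:
        refs.append(ref_id)
    return entry


entry = {
    "sequence_id": "ACTB_Malkuth_promoter",
    "blueprint_refs": [],
    "circuit_refs": [],
    "evidence_refs": [],
}
add_ref(entry, "blueprint", "bp-0001")   # hypothetical blueprint ID
add_ref(entry, "evidence", "ev-0001")    # hypothetical evidence report ID
add_ref(entry, "blueprint", "bp-0001")   # duplicate, ignored
```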

Integration with Vault Service

The registry is the first of six layers that make up the DNA vault service:

  1. Sequence Layer: Promoter/fragment registry with canonical IDs
  2. Blueprint Layer: Canonical blueprint objects with hashes and certification
  3. Circuit Layer: Derived circuit families (10bp, 20bp, 34bp)
  4. Evidence Layer: Run reports, entropy, state diversity, temporal window behavior
  5. Provenance Layer: Immutable audit chain for every transformation
  6. Vault Layer: Descriptor-bound storage/recovery service
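
The source does not say how the Provenance Layer's immutable audit chain is built. One common way to make such a chain tamper-evident is a hash chain, in which each record stores the hash of the record before it. The sketch below shows that technique only as an assumption; `append_record`, `verify`, and the event fields are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, event: dict) -> str:
    """SHA-256 over the previous hash and the event, serialized deterministically."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    chain.append(record)
    return record


def verify(chain: list) -> bool:
    """Recompute every hash and link; False if any record was altered."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(record["prev"], record["event"]):
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"layer": "sequence", "action": "register",
                      "sequence_id": "ACTB_Malkuth_promoter"})
append_record(chain, {"layer": "circuit", "action": "derive",
                      "sequence_id": "ACTB_Malkuth_promoter"})
print(verify(chain))
```

Because each hash covers the previous one, changing any earlier record invalidates every record after it, which is what makes after-the-fact edits detectable.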

Public vs Internal

Public-facing (QuamTX):

  • Advanced vault services
  • Structured storage research
  • Blueprint-driven integrity models
  • High-assurance archive systems

Internal/Restricted:

  • Sequence registry with genomic coordinates
  • Gene names and symbolic layer names
  • Bio-identity implications
  • Internal implementation details
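
One way to enforce this split is to strip restricted fields from an entry before it leaves the internal boundary. The sketch below is an assumption throughout: the source does not say which entry fields count as restricted, so `RESTRICTED_FIELDS` is a hypothetical mapping of the list above onto the entry fields, and `public_view` is a hypothetical helper.

```python
# Assumed mapping of the restricted categories onto entry fields:
#   genomic coordinates   -> reference_build, chromosome, start, end, strand
#   gene / layer names    -> sequence_id, label, source_file
#   bio-identity concerns -> sequence
RESTRICTED_FIELDS = frozenset({
    "reference_build", "chromosome", "start", "end", "strand",
    "sequence_id", "label", "source_file",
    "sequence",
})


def public_view(entry: dict) -> dict:
    """Return a copy of the entry with every restricted field removed."""
    return {k: v for k, v in entry.items() if k not in RESTRICTED_FIELDS}


internal_entry = {
    "sequence_id": "ACTB_Malkuth_promoter",
    "label": "ACTB_Malkuth",
    "reference_build": "GRCh38",
    "chromosome": "7",
    "start": 5563902,
    "end": 5563932,
    "strand": -1,
    "sequence": "GGAATCACTTGCACCCGGGAGGCGGAGGCTG",
    "family": "promoter",
    "source_file": "ACTB_Malkuth_promoter.fa",
    "blueprint_refs": [],
    "circuit_refs": [],
    "evidence_refs": [],
}
print(public_view(internal_entry))
```

Listing the restricted fields explicitly means that any field added to the entry later is exposed by default. A stricter variant would instead list the fields that may be published and drop everything else.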