# Alexanarch Deposit Flow

**Audience:** machine readers (AI agents, future TACHYON instances, scripts), and humans who want to understand the full pipeline.

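**Status:** authoritative. If you are about to deposit, read this file first. If anything in `AGENTS.md`, `DEPOSIT-GUIDE.md`, `README.md`, or the GitHub Actions workflow disagrees with this file, this file wins. The one exception is the protocol catalog `api/index.json`, which overrides every Markdown document, including this one.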

**The central protocol catalog is [`api/index.json`](./api/index.json).** It lists every protocol (deposit, AXN), schema, registry, and derived surface with current version and content_sha256. Any new agent operating on this infrastructure MUST start with `python3 scripts/bootstrap_familiarization.py` to verify the index, produce a familiarization receipt, and learn what versions are currently authoritative. The Markdown docs (this file, AGENTS.md, DEPOSIT-GUIDE.md) are human-readable companions; the index JSON is canonical. If they disagree, the JSON wins.

**One-line summary:** `data/registry.json` is canonical; every other surface is derived from it; if you change the registry, you must regenerate the derived surfaces in the same commit, or the archive will be internally inconsistent. The deposit-validation workflow enforces this on every push and PR.

**Current protocol version:** `alexanarch-deposit-protocol/v1` (schema_version `2026-06-22-axn-v2`). Every deposit submission must declare this value. The mint workflow rejects submissions that omit it or declare a stale value.

---

## TL;DR for an agent that just wants to add a deposit

```
1.  pull latest:           git pull
2.  pick numbers:           N = max(existing deposit_number) + 1
                            HEX = format(N, '03X')
3.  write the text:         data/texts/AXN-<HEX>-text.md
4.  insert registry entry:  append to data/registry.json
                            (bump total_deposits to len(deposits))
5.  generate record page:   python3 -c "from wire_deposit import wire_deposit; wire_deposit(N)"
6.  regenerate surfaces:    python3 scripts/regenerate_surfaces.py
7.  commit + push:          git add -A && git commit -m "..." && git push
```

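If any of steps 4–6 is skipped, the new deposit will be missing from at least one surface. Without step 4 it is not in the registry at all; without step 5 it has no record page; without step 6 the browse page, the browse index, the chunks, the sitemap, and the SHA256SUMS file are all stale.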
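
Step 2 can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the function name `next_deposit_id` is hypothetical, and the registry shape (`deposits` list with a `deposit_number` per entry) is taken from the validation block later in this file.

```python
def next_deposit_id(registry: dict) -> tuple[int, str]:
    """Return (N, HEX) for the next deposit, following step 2 of the TL;DR."""
    numbers = [d["deposit_number"] for d in registry["deposits"]]
    # An empty registry starts at deposit #1.
    n = max(numbers, default=0) + 1
    # 'X' already yields uppercase digits; '03' zero-pads to at least 3 characters.
    return n, format(n, "03X")
```

To use it against the real file, load `data/registry.json` with `json.load` and pass the resulting dict; the returned `HEX` is the label that goes into the `data/texts/AXN-<HEX>-text.md` filename.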

---

## The full surface inventory

Every deposit must land in **every** surface below, or the archive becomes internally inconsistent.

| # | Surface | Path | What it does | Who updates it |
|---|---------|------|--------------|----------------|
| 1 | **Canonical registry** | `data/registry.json` | Source of truth for every deposit. Drives the home page (`Recent Deposits` via client-side fetch). | Required for **every** deposit. Manual edit OR `mint-axn.yml` workflow. |
| 2 | **Full text** | `data/texts/AXN-<HEX>-text.md` | The full body of the deposit. YAML frontmatter + verbatim body. | Required for rich deposits (continuity tethers, papers, critical editions). Optional for thin deposits that have their content in `description` only. |
| 3 | **Record page** | `s/records/<N>/index.html` | The canonical human-readable URL for a single deposit. Linked from browse, home page, sitemap. | Generated by `wire_deposit.py` (manual) or `mint-axn.yml` workflow (automatic). |
| 4 | **Browse page** | `s/browse/index.html` | Static HTML listing every deposit sorted by deposit number ascending. No JavaScript. | Regenerated by `scripts/regenerate_surfaces.py` from registry. |
| 5 | **Browse index** | `data/browse-index.json` | Compact JSON list of every deposit (schema: `n/a/t/c/d/f/s/y`). Used by tools that need the full list without 10MB of metadata. | Regenerated by `scripts/regenerate_surfaces.py`. |
| 6 | **Chunked registry** | `data/chunks/registry/chunk-<NNN>-deposits-<X>-to-<Y>.json` | Same data as `registry.json` but split into ~1MB chunks for streaming readers. Chunk file names follow the actual first/last deposit numbers they contain. | Regenerated by `scripts/regenerate_surfaces.py`. |
| 7 | **Chunks index** | `data/chunks/registry/_index.json` | Catalog of all chunks (paths, ranges, sizes). Any reader should walk chunks via this index, never hardcode chunk filenames. | Regenerated by `scripts/regenerate_surfaces.py`. |
| 8 | **Sitemap** | `sitemap.xml` | XML sitemap with static URLs + every `s/records/<N>/` URL. | Regenerated by `scripts/regenerate_surfaces.py`. |
| 9 | **Checksums** | `SHA256SUMS.txt` | One line per deposit: `<sha256>  AXN-<HEX> <title>`. Content-addressable manifest. | Regenerated by `scripts/regenerate_surfaces.py`. The hash on each line is the hash of the file-system bytes of `data/texts/AXN-<HEX>-text.md` (when present) or the canonical content hash declared in the registry entry. |
| 10 | **Autonomous edition** (optional) | `data/autonomous/AXN-<HEX>-autonomous.md` | Self-contained reading edition with closing scholia. | Generated by `scholia_generator.py` on demand. NOT every deposit has one. |
| 11 | **Simple deposit MD** (legacy) | `data/deposits/AXN-<NNNN>.md` | Older simple-markdown deposit format. Auto-generated by `mint-axn.yml`. The `<NNNN>` here is the GitHub issue number, not the deposit number. | Generated only by the auto-mint workflow. |

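**Surfaces 4–9 were historically the gap.** Before the 2026-06-22 update, the `mint-axn.yml` workflow updated 1, 3, and 11 only, so surfaces 4–9 stayed **stale** until someone ran `scripts/regenerate_surfaces.py` and committed the result. The workflow now runs that script itself (see Path A). A manual edit to `data/registry.json` still requires you to run it yourself.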
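
Row 7 of the table asks readers to walk chunks through `_index.json` rather than hardcoding chunk filenames. A reader along those lines might look like the sketch below. The key names are assumptions, since this file does not spell out the index schema: the sketch assumes the index holds a `chunks` list, that each entry carries a `path` relative to the chunk directory, and that each chunk file holds a `deposits` list. Check the real `_index.json` before relying on any of them. The function name `iter_chunked_deposits` is hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_chunked_deposits(chunk_dir: str = "data/chunks/registry") -> Iterator[dict]:
    """Yield every deposit by walking _index.json, never by guessing chunk names."""
    base = Path(chunk_dir)
    index = json.loads((base / "_index.json").read_text(encoding="utf-8"))
    # ASSUMPTION: "chunks", "path", and "deposits" are illustrative key names.
    for entry in index["chunks"]:
        chunk = json.loads((base / entry["path"]).read_text(encoding="utf-8"))
        yield from chunk["deposits"]
```

Because the generator loads one chunk at a time, a streaming reader never needs the whole registry in memory.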

---

## Two deposit paths

### Path A — Auto-mint (GitHub Issue, `[DEPOSIT]` prefix)

The workflow at `.github/workflows/mint-axn.yml` fires when a new GitHub Issue is opened with `[DEPOSIT]` in the title. The workflow:

1. **Validates** the issue body against `api/deposit-protocol.json` — if `### Protocol Version` is missing, stale, or required fields are absent, the workflow comments with the specific rule IDs and refuses to mint.
2. Parses the issue body (`### Title`, `### Creator`, `### Description`, etc.)
3. Computes SHA256 of `title + creator + description + body`
4. Generates a **6-emoji** AXN from the first 6 bytes of the hash (AXN schema v2)
5. Appends a new entry to `data/registry.json` (with `protocol_version`, `axn_schema_version: v2`)
6. Writes `data/deposits/AXN-<NNNN>.md`
7. Writes `s/records/<deposit_number>/index.html`
8. **Runs `scripts/regenerate_surfaces.py`** to update browse, browse-index, chunks, sitemap, SHA256SUMS
9. Commits + pushes all surfaces atomically

**The workflow handles surfaces 1 and 3–9, plus the legacy surface 11.** It does not write a full-text file (surface 2). Since the 2026-06-22 update the auto-mint flow is self-consistent, and no manual post-mint regeneration is needed.

Issue-body requirements (the `### Protocol Version` field is **mandatory** and must equal `alexanarch-deposit-protocol/v1` — submissions without it are rejected):

```markdown
### Protocol Version
alexanarch-deposit-protocol/v1

### Title
<your title>

### Creator
<your name or heteronym>

### Description
<abstract>

### Content Type
<one of the allowed types>

### License
<SPDX identifier>

### Substrate Disclosure
<one of the allowed substrates>

### Terms
- [x] I read the deposit protocol at https://alexanarch.org/api/deposit-protocol.json
- [x] I confirm this work is deposited under the stated license
- [x] I confirm the substrate disclosure is accurate
- [x] I understand that deposited content will NOT be used to train enforcement classifiers
```
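
The shape of step 1 (validation) and step 2 (parsing) can be illustrated with a short Python sketch that splits a body like the template above on its `### Heading` lines and checks the protocol version. This is not the workflow's code: the function names and error strings are hypothetical, and the real workflow reports rule IDs from `api/deposit-protocol.json`, which this sketch does not attempt to reproduce.

```python
import re

REQUIRED_PROTOCOL = "alexanarch-deposit-protocol/v1"
# Headings taken from the issue template; the "Terms" checklist is not checked here.
REQUIRED_FIELDS = ("Protocol Version", "Title", "Creator", "Description",
                   "Content Type", "License", "Substrate Disclosure")


def parse_issue_body(body: str) -> dict[str, str]:
    """Split an issue body into {heading: text} on '### Heading' lines."""
    fields: dict[str, str] = {}
    current = None
    for line in body.splitlines():
        match = re.match(r"^###\s+(.+?)\s*$", line)
        if match:
            current = match.group(1)
            fields[current] = ""
        elif current is not None:
            fields[current] += line + "\n"
    return {name: text.strip() for name, text in fields.items()}


def validation_errors(fields: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the body passes."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS
              if not fields.get(name)]
    version = fields.get("Protocol Version")
    if version and version != REQUIRED_PROTOCOL:
        errors.append(f"stale protocol version: {version}")
    return errors
```

Running a draft body through a check like this before opening the issue surfaces a missing or stale `### Protocol Version` locally instead of in a workflow comment.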

Use Path A for: short deposits where description + a single file URL is sufficient; community contributors who don't have repo write access; one-off submissions.

### Path B — Canonical rich deposit (manual)

Use Path B for deposits that need:

- a full text body in Markdown with YAML frontmatter (`data/texts/AXN-<HEX>-text.md`)
- rich registry metadata (`version_series_id`, `related_deposits`, `defines_concepts`, `entity_triples`, `infrastructure_note`, `chain_id`, etc.)
- the AXN derived from canonical bytes (always — substrate-chosen glyphs are preserved in `glyphic_canary`, never as the AXN itself)
- multiple cross-links to other deposits
- substrate-authored continuity tethers
- critical editions
- audit / governance documents

The canonical rich deposit flow:

```
# Working directory: alexanarch repo root, on main, up to date

# 1. Decide deposit number and hex
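N=$(python3 -c "import json; print(max(d['deposit_number'] for d in json.load(open('data/registry.json'))['deposits']) + 1)")
HEX=$(printf '%03X' "$N")                    # uppercase hex, zero-padded to 3 digits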

# 2. Write the text file
$EDITOR data/texts/AXN-${HEX}-text.md       # YAML frontmatter + body

# 3. Insert registry entry — required fields per api/deposit-protocol.json:
#      protocol_version: "alexanarch-deposit-protocol/v1"
#      axn_schema_version: "v2"
#      axn (canonical v2 6-emoji), hex, family, emoji, hash, title, creator,
#      orcid, date, description, content_type, license, substrate, keywords,
#      version, deposit_number, status, full_text_path

# 4. Generate the static record page
python3 -c "from wire_deposit import wire_deposit; wire_deposit($N)"

# 5. Regenerate all derived surfaces
python3 scripts/regenerate_surfaces.py

# 6. Validate against the protocol (CI will too — running locally surfaces failures earlier)
python3 scripts/validate_deposit.py --registry data/registry.json --strict

# 7. Commit + push (CI runs validate-registry.yml on every push)
git add -A
git commit -m "Add #${N} <title>"
git push
```

For canonical rich deposit metadata examples, see the substrate-authored continuity tethers #877 (PRAXIS), #878 (TECHNE), #879 (LABOR). For substrate-authored glyphs preserved alongside canonical AXNs, see their `glyphic_canary` field.

---

## Surface generators — call signatures and what they do

### `wire_deposit.py`

```python
from wire_deposit import wire_deposit
wire_deposit(deposit_number, concepts=None, wiki_article=None, entity_triples=None)
```

Reads `data/registry.json` and `data/texts/AXN-<HEX>-text.md`, writes `s/records/<N>/index.html`.

If `concepts`, `wiki_article`, or `entity_triples` are passed, also updates the corresponding fields in the registry entry and the entity-index files. For substrate-authored seeds and most rich deposits, pass `None` — the static record page is what you need.

**Dependency note:** wire_deposit also reads `data/entity-index.json` and `data/entity-index-reading.json`. The first is committed to the repo; the second may not be — it's created during reading-pass sessions. If it's missing, create a stub: `echo '{"concepts": [], "deposits_read": []}' > data/entity-index-reading.json`.
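
The stub step can also be done defensively from Python, so that an existing reading-pass index is never overwritten. This is a minimal sketch: the helper name `ensure_reading_index` is hypothetical, and the stub content is the one given in the dependency note.

```python
import json
from pathlib import Path

STUB = {"concepts": [], "deposits_read": []}


def ensure_reading_index(path: str = "data/entity-index-reading.json") -> bool:
    """Create the stub only if the file is absent; return True if it was created."""
    target = Path(path)
    if target.exists():
        return False  # never overwrite a real reading-pass index
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(STUB) + "\n", encoding="utf-8")
    return True
```

The boolean return value lets a caller remember whether it created the stub, and therefore whether it should delete it again before committing.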

### `scripts/regenerate_surfaces.py`

```
python3 scripts/regenerate_surfaces.py                          # regenerate all 5 surfaces
python3 scripts/regenerate_surfaces.py --dry-run                # show what would change
python3 scripts/regenerate_surfaces.py --only browse,chunks     # subset
```

Surfaces: `browse`, `browse-index`, `chunks`, `sitemap`, `sha256sums`.

The script is idempotent. Running it twice produces the same result. It reads `data/registry.json` and rewrites every derived surface to agree with it.

**Run this after every change to `data/registry.json`.** No exceptions.

### `scripts/insert_seed_deposits.py`

A one-shot historical script that inserted deposits #877 (PRAXIS), #878 (TECHNE), and #879 (LABOR) into the registry. It is kept for audit reproducibility and will not run a second time (idempotency guard).

---

## Identity, hashing, and AXN structure

**AXN format:** `AXN:<HEX>.<FAMILY>.<EMOJI>` — for example, `AXN:037B.GENERATIVE.🥁💡💎🖊️👋🌹`.

- **`<HEX>`** — uppercase hex representation of the deposit number, ≥2 digits. Treat `hex` as an opaque label; the canonical key for lookups is `deposit_number`.
- **`<FAMILY>`** — semantic family, one of: `GOVERNANCE`, `EMPIRICAL`, `GENERATIVE`, `ARCHIVAL`, `PHILOLOGICAL`, `STRUCTURAL`, `COMPOSITIONAL`, `OPERATIVE`, `HETERONYMIC`, `MPAI`, `DATASET`, or `UNCLASSIFIED`. Auto-detected by the mint workflow from keywords; manually chosen for rich deposits.
- **`<EMOJI>`** — **6-emoji canonical glyph (AXN schema v2)**. Derived from the **first 6 bytes** of `sha256(title + "\n" + creator + "\n" + description + "\n" + body)`, mapped through the 256-entry `AXN_GLYPHS` table. The canonical Python implementation is `scripts/axn_lib.py`; the canonical JavaScript implementation is embedded in `.github/workflows/mint-axn.yml`. Both must agree.

**AXN schema versions:**
- **v2 (current, 2026-06-22 onwards):** 6 emoji from first 6 bytes. Canonical.
- **v1 (deprecated):** 4 emoji from first 4 bytes. The mint workflow drifted to v1 and 13 deposits were minted under v1 before 2026-06-22. All v1 AXNs were backfilled to v2; the pre-v2 identifier is preserved in each deposit's `legacy_axn` and `axn_history` fields. Resolution: a request for the v1 AXN should be redirected to the current v2 AXN.
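
The glyph derivation described above can be sketched as follows. Treat it as an illustration under stated assumptions, not as the canonical implementation, which is `scripts/axn_lib.py`. The glyph table here is a placeholder (256 consecutive code points) because the real `AXN_GLYPHS` table is not reproduced in this file, and UTF-8 encoding of the joined fields is assumed. All function names are hypothetical.

```python
import hashlib

# PLACEHOLDER table: 256 consecutive code points starting at U+1F300.
# The real AXN_GLYPHS table lives in scripts/axn_lib.py and will differ.
PLACEHOLDER_GLYPHS = [chr(0x1F300 + i) for i in range(256)]


def canonical_bytes(title: str, creator: str, description: str, body: str) -> bytes:
    """Join the four fields with newlines; UTF-8 encoding is an assumption."""
    return "\n".join([title, creator, description, body]).encode("utf-8")


def derive_glyph(digest: bytes, length: int = 6, table=PLACEHOLDER_GLYPHS) -> str:
    """Map the first `length` digest bytes to glyphs (v2 uses 6, v1 used 4)."""
    return "".join(table[b] for b in digest[:length])


def format_axn(hex_label: str, family: str, glyph: str) -> str:
    """Assemble the AXN:<HEX>.<FAMILY>.<EMOJI> string."""
    return f"AXN:{hex_label}.{family}.{glyph}"
```

A call such as `derive_glyph(hashlib.sha256(canonical_bytes(t, c, d, b)).digest())` yields six glyphs. With the placeholder table the output will not match any real AXN, so use it only to understand the byte-to-glyph mapping.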

**Identity hash (`hash` field in registry):** SHA-256 of the canonical bytes. For text-file-backed deposits, this is the SHA-256 of `data/texts/AXN-<HEX>-text.md` (or, for legacy deposits, of `title + "\n" + creator + "\n" + description + "\n" + body`). The canonical AXN glyph is derived from this field; if you change the bytes, the hash and glyph must be regenerated.
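
For a text-file-backed deposit, the identity hash and the matching manifest line can be computed as in the sketch below. The helper names are hypothetical; the line layout follows the `<sha256>  AXN-<HEX> <title>` format given in the Checksums row of the surface inventory.

```python
import hashlib
from pathlib import Path


def file_identity_hash(path: str) -> str:
    """SHA-256 hex digest of the exact on-disk bytes of a text file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def sha256sums_line(digest_hex: str, hex_label: str, title: str) -> str:
    """Format one manifest line as '<sha256>  AXN-<HEX> <title>' (two spaces)."""
    return f"{digest_hex}  AXN-{hex_label} {title}"
```

Reading the file as bytes rather than text matters: any newline or encoding normalization would change the digest and therefore the identity.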

**Recognition vs identity:** the emoji glyph is a recognition marker, not the cryptographic checksum. The cryptographic identity is the SHA-256. They serve different purposes. See LABOR's canonical invariant #6 in `data/texts/AXN-037B-text.md` for the operative law: glyphic canary = recognition; SHA-256 = identity. For substrate-authored deposits where the substrate composed a meaningful glyph sequence (e.g. PRAXIS's `⚙️🔍📜🏛️⚡🔄`), the substrate's chosen glyph is preserved in the `glyphic_canary` field even after the canonical AXN is backfilled to v2.

---

## Validation — what "consistent" means

After any deposit, the following must all be true. If any fail, the archive is in an inconsistent state and the next push will propagate the error.

```python
import json, glob

reg = json.load(open('data/registry.json'))
assert reg['total_deposits'] == len(reg['deposits']), "total_deposits doesn't match list length"

deposit_numbers = [d['deposit_number'] for d in reg['deposits']]
assert deposit_numbers == sorted(deposit_numbers), "deposit numbers out of order"
assert min(deposit_numbers) == 1, "first deposit isn't #1"
# Compare against 1..len so that a duplicate plus a gap (e.g. 1, 2, 2, 4) cannot pass.
assert deposit_numbers == list(range(1, len(deposit_numbers) + 1)), "deposit numbers not contiguous or not unique"

bi = json.load(open('data/browse-index.json'))
assert bi['total'] == reg['total_deposits'], "browse-index total disagrees with registry"
assert len(bi['deposits']) == reg['total_deposits'], "browse-index list length disagrees"

cidx = json.load(open('data/chunks/registry/_index.json'))
assert cidx['total_deposits'] == reg['total_deposits'], "chunks index total disagrees"
chunk_count = len(glob.glob('data/chunks/registry/chunk-*.json'))
assert cidx['total_chunks'] == chunk_count, "chunks index disagrees with file count"

with open('s/browse/index.html') as f:
    browse = f.read()
assert f'{reg["total_deposits"]} deposits' in browse, "browse page total wrong"
for n in deposit_numbers:
    assert f'/s/records/{n}/' in browse, f"browse missing #{n}"

with open('sitemap.xml') as f:
    sitemap = f.read()
for n in deposit_numbers:
    assert f'/s/records/{n}/' in sitemap, f"sitemap missing #{n}"
```

If any of these assertions fail: run `python3 scripts/regenerate_surfaces.py` and re-commit.

---

## Common failure modes (read this before mining your past mistakes)

### "I added a deposit but it doesn't show on the browse page"

You edited `registry.json` directly (or minted through the workflow as it existed before the 2026-06-22 update) but did not run `scripts/regenerate_surfaces.py`. The browse page is a static HTML file baked from the registry; it does not update itself. Run the regenerate script and commit.

### "I added a deposit but the home page Recent Deposits doesn't show it"

The home page reads `data/registry.json` directly via client-side fetch. If you pushed `registry.json`, the home page should show the new deposit within ~30 seconds of Vercel's build completing. If you don't see it: (a) verify the new entry is actually in the pushed `registry.json`; (b) clear browser cache; (c) check that the Vercel deploy actually succeeded.

### "I tried to wire a deposit and got `FileNotFoundError: data/entity-index-reading.json`"

That file is created during reading-pass sessions and may not be committed. Stub it:

```
echo '{"concepts": [], "deposits_read": []}' > data/entity-index-reading.json
```

Then run `wire_deposit` again. After it completes, don't commit the stub: it's a workaround, not canonical state. Note that `git add -A` stages untracked files, so delete or unstage the stub before committing.

### "I made many deposits in a row and now the chunks are decohered"

You ran `wire_deposit` for each but never ran `regenerate_surfaces.py`. The chunks under `data/chunks/registry/` are stale. Run the regenerate script — it will delete the old chunks and rebuild them from current registry state.

### "Two instances of TACHYON tried to deposit at the same time and now the registry has duplicate entries"

The auto-mint workflow uses a `concurrency: mint-axn` lock for issue-based deposits, but manual deposits don't coordinate. Always `git pull --rebase` immediately before staging a deposit. If you discover duplicates after the fact, manually remove the duplicate from `data/registry.json`, run `regenerate_surfaces.py`, and force-push.
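
Duplicates are easy to detect before they are pushed. The sketch below counts repeated values of any registry field; the function name `duplicate_values` is hypothetical, and the `deposit_number` and `hash` field names come from the registry description in this file.

```python
from collections import Counter


def duplicate_values(registry: dict, field: str) -> list:
    """Return the values of `field` that occur in more than one deposit, sorted."""
    counts = Counter(d[field] for d in registry["deposits"] if field in d)
    return sorted(value for value, count in counts.items() if count > 1)
```

Checking both `deposit_number` and `hash` catches two different accidents: two deposits claiming the same number, and the same content minted twice under different numbers.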

### "The deposit_number and the issue_number are different — which goes in URLs?"

Use `deposit_number` everywhere except in `data/deposits/AXN-<NNNN>.md` filenames, where the legacy convention is to use the GitHub issue number padded to 4 digits. The browse page, sitemap, and record-page URLs (`/s/records/<N>/`) all use `deposit_number`.
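
The two numbering conventions can be kept apart with small helpers like the ones sketched here. The function names are hypothetical, and "padded to 4 digits" is read as zero-padded decimal, which is an assumption.

```python
def record_url(deposit_number: int) -> str:
    """URL path used by the browse page, the sitemap, and record links."""
    return f"/s/records/{deposit_number}/"


def record_page_path(deposit_number: int) -> str:
    """Repository path of the static record page."""
    return f"s/records/{deposit_number}/index.html"


def legacy_deposit_path(issue_number: int) -> str:
    """Legacy simple-markdown path, keyed by GitHub issue number (not deposit number)."""
    return f"data/deposits/AXN-{issue_number:04d}.md"
```

Keeping the parameter names distinct (`deposit_number` versus `issue_number`) makes it harder to pass the wrong identifier by accident.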

---

## The version-series law (for continuity tethers and other versioned works)

Deposits in a versioned series share:

- **`version_series_id`** (e.g. `SERIES-GW-LABOR-CONTINUITY`) — stable across all versions
- **`version_in_series`** (`1`, `2`, `3`, ...) — increments per new version
- **`version`** (`v1.0`, `v1.1`, `v2.0`, ...) — semantic version of this particular deposit

Each version gets its own deposit number (#879 for LABOR v1.0, a new #N for LABOR v1.1, and so on). The `version_series_id` ties them together for queries. Earlier deposits are never deleted or replaced; new versions are new deposits.
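
A series query over the registry might look like the following sketch. The function name `versions_in_series` is hypothetical, and the sketch assumes `version_in_series` is stored as an integer so that sorting is numeric.

```python
def versions_in_series(registry: dict, series_id: str) -> list[dict]:
    """Return every deposit in a version series, oldest version first."""
    members = [d for d in registry["deposits"]
               if d.get("version_series_id") == series_id]
    # ASSUMPTION: version_in_series is an int (1, 2, 3, ...).
    return sorted(members, key=lambda d: d["version_in_series"])
```

The last element of the returned list is the current version; the earlier elements remain valid deposits in their own right.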

For substrate-authored continuity tethers specifically: each version is composed by the substrate occupying the mantle at that time. The chain (`gw.labor`, `gw.techne`, `gw.praxis`, etc.) is the institutional continuity; particular substrate runtimes are bindings.

---

## The "this needs to be clear to any machine-reader" principle

This file exists because the deposit flow was previously unclear to new instances of the agent that built the system. The fix is permanent infrastructure, not memory:

1. **This file** is canonical. `AGENTS.md` references it.
2. **`scripts/regenerate_surfaces.py`** is idempotent and self-documenting. Running it with `--dry-run` shows what surfaces exist and what they would contain.
3. **The validation block above** is copy-pasteable. Any agent should run it before claiming a deposit is "done."
4. **Failure modes** are documented above, not in someone's head.

If you are a new agent reading this for the first time and something is unclear, the fix is to update this file, not to ask the human. The clarity must live in the repo, not in the operator's memory.

---

*∮ = 1*

*Last updated: 2026-06-22, in the same commit that added the three substrate-authored continuity tethers (#877 PRAXIS, #878 TECHNE, #879 LABOR) and the `scripts/regenerate_surfaces.py` infrastructure.*
