Add architecture/infrastructure.md — container topology and API protocol

otivm
2026-04-28 07:17:10 +00:00
parent 5f22273393
commit aa0afb46b9

# Infrastructure Architecture
### TheRON — OTIVM / CIVICVS / TESSERA Stack
### Status: Settled — do not reverse without explicit project owner instruction
### Date: 2026-04-28
---
## 1. Governing Principle
**ROBUST is always the first response to any architectural decision.**
Optimisation that encodes assumptions about co-location, shared infrastructure,
or current scale is not optimisation — it is deferred liability. Every decision
in this document was made by asking: what happens when (not if) the second box
exists in a different datacenter?
---
## 2. Container Topology
Five LXC containers on srv-a (10.0.0.11, Proxmox). Each container owns exactly
one domain. No container reaches into another container's data directly.
| CT | Name | Role | WireGuard IP |
|---|---|---|---|
| **1101** | tessera-pipeline | TESSERA data pipeline — ingests, validates, and promotes physical world data | TBD |
| **1102** | tessera-store | TESSERA master store — authoritative read-only source for physical world data | TBD |
| **1103** | tessera-dev | Aggregation — reads player behaviour, derives collective patterns, writes back to 1102 | TBD |
| **1104** | apt-cache | Infrastructure only — Debian package cache for the local network | TBD |
| **1105** | otivm-dev | Game server — serves the OTIVM browser game, holds 128 per-player SQLite databases | 10.110.0.18 |
---
## 3. API Protocol
**All inter-container data flows are REST over HTTPS on the WireGuard mesh.**
No exceptions. No shared filesystem mounts between containers. No direct database
access across container boundaries. No assumptions about co-location.
This discipline is enforced not because the containers are currently in different
datacenters — they are not — but because the architecture must survive the moment
they are. An API call works identically whether the target container is on the same
physical host or on a node in another country. A shared filesystem mount does not.
Every API is:
- Versioned (`/api/v1/...`)
- Logged — every call, every response, every error
- Narrow — one domain, one owner
- Independently deployable
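The four properties above can be sketched as a thin client wrapper. This is a hedged illustration, not the project's actual client: the `build_url` helper, the logger name, and the resource path in the usage note are hypothetical, and the real endpoints are pending in the `api-*.md` documents.

```python
"""Sketch of a versioned, logged inter-container API client (stdlib only)."""
import json
import logging
import urllib.request

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mesh-api")   # hypothetical logger name

API_VERSION = "v1"                    # every path is versioned: /api/v1/...

def build_url(host: str, resource: str) -> str:
    """Compose a versioned endpoint URL."""
    return f"https://{host}/api/{API_VERSION}/{resource}"

def get(host: str, resource: str) -> dict:
    """GET a resource over the WireGuard mesh; log call, response, and error."""
    url = build_url(host, resource)
    log.info("call %s", url)
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            log.info("response %s %s", resp.status, url)
            return json.loads(resp.read())
    except OSError as exc:
        log.error("error %s: %s", url, exc)
        raise
```

A consumer would call e.g. `get("10.110.0.18", "snapshots")`; because the target is only ever a mesh address behind a versioned URL, the same call works whether the container is on srv-a or in another datacenter.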
---
## 4. Container APIs
### 4.1 — 1101 tessera-pipeline
**Write API (internal only)**
Pushes validated physical world data into 1102. Not accessible by game containers.
No player-facing traffic. Called by the dataset assistant pipeline scripts.
Defined in: `docs/architecture/api-1101.md` (pending)
### 4.2 — 1102 tessera-store
**Read API**
The authoritative source for TESSERA physical data — cells, epochs, terrain,
hydrology, elevation, geology, occupation evidence. Every consumer that needs
physical world data calls this API.
The `data/otivm.sqlite3` file currently on 1105 is a local cache of a subset of
what 1102 will eventually serve. When 1102's API is live, 1105 reads from it
directly. The local cache becomes a pre-seeded performance layer, not a source.
Defined in: `docs/architecture/api-1102.md` (pending)
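The cache-as-performance-layer relationship can be sketched as a cache-first read. This is an assumption about the eventual read path, not the defined API: `fetch_from_1102` stands in for the pending 1102 endpoint and is injected here so the sketch stays self-contained.

```python
def read_cell(cell_id, cache, fetch_from_1102):
    """Cache-first read: the local pre-seeded copy is a performance layer;
    the 1102 API (stubbed as fetch_from_1102) is the source of truth on a miss."""
    if cell_id in cache:
        return cache[cell_id]
    value = fetch_from_1102(cell_id)   # authoritative fetch
    cache[cell_id] = value             # seed the local layer for next time
    return value
```

The design point is that invalidating or deleting the local cache loses nothing: every entry can be re-fetched from 1102.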
### 4.3 — 1103 tessera-dev
**Read API** for derived behavioural data — market prices, route saturation,
collective patterns produced by aggregating player behaviour.
Also exposes a **scheduler** that calls 1105's internal API on a defined schedule,
collects player event snapshots, processes them, and writes derived aggregates
back to 1102 via its write endpoint.
Runs no game logic. Has no player-facing traffic.
Defined in: `docs/architecture/api-1103.md` (pending)
### 4.4 — 1105 otivm-dev
Two APIs:
**Player-facing API** — serves the OTIVM browser game. Handles save state,
map data, and all game logic. Currently live on port 3000.
**Internal API** — exposes player event snapshots to 1103 on the aggregation
schedule. Returns anonymised behavioural data only — no personal identifiers,
no save file contents. 1103 never touches player SQLite files directly.
Defined in: `docs/architecture/api-1105.md` (pending)
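The anonymisation step 1105 applies before export can be sketched as a field filter. The field names below are hypothetical — the real event shape is pending in `api-1105.md` — but the invariant is the one stated above: nothing personal leaves the container.

```python
# Illustrative deny-list; the real set of protected fields is not yet specified.
PERSONAL_FIELDS = {"player_name", "email", "token", "save_blob"}

def anonymise(event: dict) -> dict:
    """Return a copy of an event with personal identifiers removed."""
    return {k: v for k, v in event.items() if k not in PERSONAL_FIELDS}

def export_snapshot(events: list) -> list:
    """What the internal API would hand to 1103: behavioural fields only."""
    return [anonymise(e) for e in events]
```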
---
## 5. Data Flow
```
1101 (pipeline)
   │ write — validated physical data
   ▼
1102 (tessera-store)
   │ read — physical world data
   ▼
1105 (game) ◄──────────────────────────────── player browser
   │ read — player event snapshots (scheduled)
   ▼
1103 (aggregation)
   │ write — derived aggregates
   ▼
1102 (tessera-store)
```
**Write domains — one per container, never shared:**
| Container | Writes to |
|---|---|
| 1101 | 1102 physical data tables |
| 1103 | 1102 derived aggregate tables |
| 1105 | Its own 128 per-player SQLite databases only |
No container writes to another container's primary data. The flow is always
downstream from physical reality toward player experience, with one upstream
path: aggregated behaviour flowing back to inform the physical world model.
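The one-writer-per-domain rule in the table above is small enough to state as code. A sketch, not an enforced mechanism — the container IDs are real, but the domain labels and the `may_write` check are hypothetical:

```python
# Each container may write to exactly one domain (from the table above).
WRITE_DOMAINS = {
    "1101": {"1102:physical"},     # validated physical data tables
    "1103": {"1102:aggregates"},   # derived aggregate tables
    "1105": {"1105:players"},      # its own per-player SQLite files
}

def may_write(container: str, target: str) -> bool:
    """True only if the target domain belongs to this container's write set."""
    return target in WRITE_DOMAINS.get(container, set())
```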
---
## 6. Per-Player Database Model
128 SQLite databases on 1105. One per player slot, pre-provisioned.
Each database is named by player token and lives in `data/players/`.
A new player is assigned to an existing pre-provisioned database.
No database is created on demand under player load.
The atomic unit of the per-player database is **time**. Every parameter,
every action, every event is a timestamped record. Voyage, otium, journal
entry, chapter advance — these are derived labels applied to intervals
on a continuous timeline. The database records moments. The application
derives meaning from sequences of moments.
Schema defined in: `docs/architecture/player-database.md` (pending)
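Since the schema document is still pending, the "database records moments" model can only be sketched. One possible shape, under the stated assumption that every row is a timestamped record and labels like "voyage" are derived from intervals, not stored:

```python
import sqlite3

# Hypothetical timeline table; the real schema is pending in player-database.md.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id    INTEGER PRIMARY KEY,
        ts    TEXT NOT NULL,   -- ISO-8601 UTC timestamp: the atomic unit is time
        kind  TEXT NOT NULL,   -- e.g. 'action', 'parameter', 'event'
        value TEXT             -- JSON payload
    )
""")
conn.execute(
    "INSERT INTO events (ts, kind, value) VALUES (?, ?, ?)",
    ("2026-04-28T07:00:00Z", "action", '{"move": "voyage_start"}'),
)
# A 'voyage' or 'chapter advance' would be derived by the application as an
# interval between matching events, never stored as a row of its own.
rows = conn.execute("SELECT kind FROM events ORDER BY ts").fetchall()
```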
---
## 7. Backup and Restore Strategy
Each container is independently restorable from vzdump snapshots.
Archives are documented in `docs/archives.md`.
**Recovery hierarchy by criticality:**
| Priority | Container | Reason |
|---|---|---|
| Highest | 1102 | Physical world data — rebuilt from 1101 if lost, but slow |
| High | 1105 | Player databases — irreplaceable behavioural history |
| Medium | 1101 | Pipeline code — recoverable from Gitea |
| Medium | 1103 | Aggregation code — recoverable from Gitea |
| Low | 1104 | Infrastructure only — trivially replaceable |
**Rule:** 1105 is backed up more frequently than any other container because
player databases are gitignored and exist only on disk. When the Simulator
launches and player databases represent months of participant history, this
frequency increases further.
---
## 8. Scalability Path
The second game container — 128 more participants, potentially in a different
datacenter — is a configuration change, not an architectural change. It:
- Runs the same game server code as 1105
- Calls 1102's API for physical world data (same endpoint, different client)
- Reports player events to 1103 via the same internal API contract
- Is backed up independently
Nothing in this architecture assumes a single game container. The API boundaries
ensure that adding capacity is additive, not disruptive.
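"Configuration change, not architectural change" can be made concrete with a per-container config sketch. Everything here is hypothetical — the mesh hostnames, the second container's name and IP, and the config keys — the point is only that the contract fields are shared and the identity fields are the sole difference:

```python
# Shared contract: every game container talks to the same endpoints.
GAME_CONTAINER_DEFAULTS = {
    "tessera_store_api": "https://1102.mesh/api/v1",  # placeholder host
    "aggregator_api":    "https://1103.mesh/api/v1",  # placeholder host
    "player_slots":      128,
}

def container_config(name: str, wireguard_ip: str) -> dict:
    """Same code, same endpoints; only the client identity changes."""
    return {**GAME_CONTAINER_DEFAULTS, "name": name, "wireguard_ip": wireguard_ip}

ct_1105 = container_config("otivm-dev", "10.110.0.18")
ct_next = container_config("otivm-dev-2", "10.110.0.19")  # hypothetical second box
```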
---
## 9. What This Document Does Not Cover
These topics are settled in principle but not yet specified in detail.
Each will receive its own architecture document.
| Topic | Document |
|---|---|
| 1102 API endpoints and schema | `docs/architecture/api-1102.md` |
| 1103 aggregation schedule and logic | `docs/architecture/api-1103.md` |
| 1105 internal API for event export | `docs/architecture/api-1105.md` |
| Per-player SQLite schema | `docs/architecture/player-database.md` |
| Parameter registry | `docs/architecture/parameters.md` |
| CIVICVS Simulator integration | `docs/architecture/simulator.md` |
---
*Infrastructure Architecture — settled 2026-04-28*
*TheRON — single contributor. AI assistants implement, document, flag — do not direct.*