Initial push
docs/DUNITER-RPC-FINDINGS.md
# Duniter Node Architecture & Substrate Storage Key Derivation

**Status:** Verified working (2026-06-12)
**Context:** cry01 Value Layer, Ğ1 balance lookup feature

This document records findings from implementing and debugging the Ğ1 balance
lookup feature in `cry01`. They were established through direct
experimentation against the live Ğ1 mainnet via the orchestrator's Duniter
nodes and are not assumptions: every claim below was verified against
real RPC responses.

---

## 1. Substrate Storage Key Derivation

To read any value from a Substrate chain's state (e.g. an account's balance),
you construct a storage key and call `state_getStorage`. For a `StorageMap`
like `System.Account`, the key is (with `.` denoting byte concatenation):

```
storage_key = Twox128(PalletName) . Twox128(StorageItemName) . Hasher(map_key)
```

For `System.Account(account_id)`:

```
storage_key = Twox128("System") . Twox128("Account") . Blake2_128Concat(account_id)
            = Twox128("System") . Twox128("Account") . Blake2b_128(account_id) . account_id
```

### 1.1 Twox128: THE CRITICAL GOTCHA

**Substrate's "Twox128" is NOT the same algorithm as the generic "xxHash128"
(xxh128) that PHP's `hash()` function natively supports.** They produce
different 16-byte outputs for the same input, despite the similar name and
identical output size. This distinction cost most of a debugging session and
must not be re-litigated.

**Correct Twox128 construction:**

```
Twox128(data) = reverse(xxh64(data, seed=0)) . reverse(xxh64(data, seed=1))
```

That is: two separate 64-bit xxHash digests (seeds 0 and 1), each
**byte-reversed**, then concatenated to form 16 bytes. The reversal is needed
because PHP's `hash()` emits the xxh64 digest big-endian, whereas Substrate
serializes each 64-bit result little-endian.

**PHP implementation** (verified correct, PHP 8.1+):

```php
// Substrate Twox128: two xxh64 digests (seeds 0 and 1), each byte-reversed.
// Requires PHP 8.1+ (xxh64 and the 'seed' option of hash()).
function cry01_twox128(string $data): string {
    // hash() returns the 8-byte digest big-endian; strrev() makes it little-endian.
    $h0 = strrev(hash('xxh64', $data, true, ['seed' => 0]));
    $h1 = strrev(hash('xxh64', $data, true, ['seed' => 1]));
    return $h0 . $h1; // 16 raw bytes
}
```

**Verification:** `Twox128("System") = 26aa394eea5630e07c48ae0c9558cef7` and
`Twox128("Account") = b99d880ec681799c0cf30e8886371da9`. These match the
canonical `System::Account` storage prefix published throughout Substrate/
Polkadot documentation. This is strong independent confirmation: any
Substrate-based chain explorer or tool will recognize this prefix.
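
The PHP result can also be cross-checked without any third-party package. The
following is a minimal pure-Python sketch of xxh64 and of the Twox128
construction described above. The function names (`xxh64`, `twox128`) are
illustrative and not part of cry01, and the code favors readability over speed.

```python
"""Cross-check sketch: pure-Python xxh64 and Substrate-style Twox128."""

MASK = (1 << 64) - 1
P1 = 0x9E3779B185EBCA87
P2 = 0xC2B2AE3D27D4EB4F
P3 = 0x165667B19E3779F9
P4 = 0x85EBCA77C2B2AE63
P5 = 0x27D4EB2F165667C5


def _rotl(x: int, r: int) -> int:
    return ((x << r) | (x >> (64 - r))) & MASK


def _round(acc: int, lane: int) -> int:
    return (_rotl((acc + lane * P2) & MASK, 31) * P1) & MASK


def _merge(h: int, v: int) -> int:
    return ((h ^ _round(0, v)) * P1 + P4) & MASK


def xxh64(data: bytes, seed: int = 0) -> int:
    """Return the 64-bit xxHash of data as an integer."""
    n, i = len(data), 0
    if n >= 32:
        # Four accumulators consume the input in 32-byte stripes.
        v = [(seed + P1 + P2) & MASK, (seed + P2) & MASK, seed & MASK, (seed - P1) & MASK]
        while i + 32 <= n:
            for k in range(4):
                v[k] = _round(v[k], int.from_bytes(data[i:i + 8], "little"))
                i += 8
        h = (_rotl(v[0], 1) + _rotl(v[1], 7) + _rotl(v[2], 12) + _rotl(v[3], 18)) & MASK
        for lane in v:
            h = _merge(h, lane)
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    # Remaining input: 8-byte words, then one 4-byte word, then single bytes.
    while i + 8 <= n:
        h ^= _round(0, int.from_bytes(data[i:i + 8], "little"))
        h = (_rotl(h, 27) * P1 + P4) & MASK
        i += 8
    if i + 4 <= n:
        h ^= (int.from_bytes(data[i:i + 4], "little") * P1) & MASK
        h = (_rotl(h, 23) * P2 + P3) & MASK
        i += 4
    while i < n:
        h ^= (data[i] * P5) & MASK
        h = (_rotl(h, 11) * P1) & MASK
        i += 1
    # Final avalanche.
    h ^= h >> 33
    h = (h * P2) & MASK
    h ^= h >> 29
    h = (h * P3) & MASK
    h ^= h >> 32
    return h


def twox128(data: bytes) -> bytes:
    # Each 64-bit digest is serialized little-endian, which is what the PHP
    # version achieves by reversing hash()'s big-endian output.
    return xxh64(data, 0).to_bytes(8, "little") + xxh64(data, 1).to_bytes(8, "little")
```

Calling `twox128(b"System").hex()` and `twox128(b"Account").hex()` should
reproduce the two prefix halves quoted above.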

**What does NOT work:** `hash('xxh128', $data, true)`. This is a different,
single-pass 128-bit xxHash variant. It passes the generic xxh128 test vectors
(e.g. `hash('xxh128', 'php.watch')` = `16c27099bd855aff3b3efe27980515ad`,
which IS correct for plain xxh128), but plain xxh128 is simply the wrong
algorithm for Substrate storage prefixes. A test vector passing for "xxh128
in general" tells you nothing about whether it's the right primitive for
"Substrate's Twox128": these are unrelated facts that happen to share a
name fragment.

### 1.2 Blake2_128Concat: confirmed correct

`Blake2_128Concat(key) = Blake2b_128(key) . key`, i.e. the Blake2b-128 hash
of the key, followed by the raw key bytes (appended, not replaced).

Blake2b-128 is **RFC 7693 parameterized** output (the output length is part
of the hash's parameter block, NOT a truncation of Blake2b-512). PHP's
`hash()` function does **not** support `blake2b` at all on this PHP 8.2.31
build (`hash_algos()` lists neither `blake2b` nor `blake2b512`). We vendor
`deemru/Blake2b` (pure PHP, MIT license,
`hubzilla/addon/cry01/vendor/Blake2b.php`) for this.

Verified test vectors (RFC 7693, cross-checked via Python `hashlib.blake2b`):
```
Blake2b-128("")    = cae66941d9efbd404e4d88758ea67670
Blake2b-128("abc") = cf4ab791c62b8d2b2109c90275287816
```
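
The parameterized-versus-truncated distinction is easy to demonstrate with
Python's standard library. This is a small sketch; the helper names
(`blake2b_128`, `blake2_128_concat`) are illustrative and not cry01 functions.

```python
import hashlib


def blake2b_128(data: bytes) -> bytes:
    # digest_size=16 goes into the BLAKE2b parameter block, so this is a
    # distinct hash output, not a prefix of the 64-byte digest.
    return hashlib.blake2b(data, digest_size=16).digest()


def blake2_128_concat(key: bytes) -> bytes:
    # Substrate's Blake2_128Concat hasher: 16-byte hash followed by the raw key.
    return blake2b_128(key) + key


truncated = hashlib.blake2b(b"abc").digest()[:16]  # NOT Blake2b-128
print("parameterized:", blake2b_128(b"abc").hex())
print("truncated 512:", truncated.hex())
```

The two printed values differ, which is why truncating a Blake2b-512 digest
cannot stand in for Blake2b-128 when building storage keys.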

### 1.3 Full worked example

For account `g1LvTpYXkKEASMiBYLp8RQmSN5kZyXtoHX8XE2FqQ9hDjqp5B`:

```
account_id (32 bytes) = 55f2d285cf400d2da003d43fe0ccd5207b6f08780bfdd62999e00d14dd731938
storage_key = 0x26aa394eea5630e07c48ae0c9558cef7    <- Twox128("System")
                b99d880ec681799c0cf30e8886371da9    <- Twox128("Account")
                b157780e8874e1d5aeee0f3620cf7f76    <- Blake2b_128(account_id)
                55f2d285cf400d2da003d43fe0ccd5207b6f08780bfdd62999e00d14dd731938    <- account_id
```
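
Putting the pieces together, here is a minimal Python sketch of the key
assembly and of the JSON-RPC request body. It takes the two Twox128 prefix
values quoted in §1.1 as constants and uses `hashlib` for Blake2b-128; the
function names and the request `id` are illustrative assumptions.

```python
import hashlib
import json

# Twox128("System") followed by Twox128("Account"), as quoted in section 1.1.
SYSTEM_ACCOUNT_PREFIX = bytes.fromhex(
    "26aa394eea5630e07c48ae0c9558cef7" "b99d880ec681799c0cf30e8886371da9"
)


def system_account_key(account_id: bytes) -> str:
    """Return the 0x-prefixed storage key for System.Account(account_id)."""
    if len(account_id) != 32:
        raise ValueError("account_id must be 32 bytes")
    hashed = hashlib.blake2b(account_id, digest_size=16).digest()
    return "0x" + (SYSTEM_ACCOUNT_PREFIX + hashed + account_id).hex()


def get_storage_request(storage_key: str, request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 body for state_getStorage (current state)."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "state_getStorage",
            "params": [storage_key],
        }
    )


account_id = bytes.fromhex(
    "55f2d285cf400d2da003d43fe0ccd5207b6f08780bfdd62999e00d14dd731938"
)
key = system_account_key(account_id)
print(key)
print(get_storage_request(key))
```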

`state_getStorage` on this key returns the SCALE-encoded `AccountInfo`
struct (see §3).

---

## 2. SS58 Address Decoding (Ğ1 addresses)

Ğ1 addresses (e.g. `g1LvTpYXkKEASMiBYLp8RQmSN5kZyXtoHX8XE2FqQ9hDjqp5B`) are
SS58-encoded. **The leading "g1" is NOT a literal prefix string.** It is
simply the first two characters of the base58 encoding; the actual network
identifier is encoded in the leading bytes of the decoded payload.

**Confirmed format for Ğ1 (verified against a real address with valid
checksum):**

- Base58-decode the full address string → **36 bytes total**
- Byte layout: `2-byte network prefix (0x5891) + 32-byte account ID + 2-byte checksum`
- Checksum = first 2 bytes of `Blake2b-512("SS58PRE" + prefix + account_id)`

This is the 14-bit extended SS58 prefix format (prefixes ≥ 64 use 2 bytes).
Unpacking the 14 bits from Ğ1's raw prefix `0x5891` gives network ID 4450;
for the balance lookup only the raw 2-byte form was needed and verified.
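
The layout above translates into a short decoder. The following Python sketch
is not the cry01 implementation: it uses Python's arbitrary-precision integers
for base58 rather than a byte-array accumulator, and all function names are
illustrative. It also unpacks the 14-bit network ID from the two prefix bytes,
following the SS58 two-byte prefix rules.

```python
import hashlib
from typing import Tuple

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text: str) -> bytes:
    num = 0
    for ch in text:
        num = num * 58 + B58.index(ch)  # ValueError on characters outside the alphabet
    body = num.to_bytes((num.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))  # each leading '1' encodes one zero byte
    return b"\x00" * pad + body


def base58_encode(raw: bytes) -> str:
    num = int.from_bytes(raw, "big")
    out = ""
    while num:
        num, rem = divmod(num, 58)
        out = B58[rem] + out
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + out


def ss58_checksum(prefix: bytes, account_id: bytes) -> bytes:
    # First 2 bytes of Blake2b-512 over "SS58PRE" + prefix + account_id.
    return hashlib.blake2b(b"SS58PRE" + prefix + account_id).digest()[:2]


def network_id(prefix: bytes) -> int:
    # Two-byte SS58 prefix: 14 bits of network ID packed into bytes b0, b1.
    b0, b1 = prefix
    return (((b0 << 2) | (b1 >> 6)) & 0xFF) | ((b1 & 0x3F) << 8)


def ss58_decode_2byte(address: str) -> Tuple[int, bytes]:
    """Decode a 36-byte SS58 address into (network_id, account_id)."""
    raw = base58_decode(address)
    if len(raw) != 36:
        raise ValueError(f"unexpected decoded length: {len(raw)}")
    prefix, account_id, checksum = raw[:2], raw[2:34], raw[34:]
    if ss58_checksum(prefix, account_id) != checksum:
        raise ValueError("checksum mismatch")
    return network_id(prefix), account_id
```

For the prefix bytes `0x58 0x91`, `network_id()` evaluates to
`0x62 | (0x11 << 8) = 4450`. `base58_encode()` is included only so the decoder
can be exercised with a round trip.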

**Implementation:** `cry01_ss58_decode()` in `cry01_chain.php`. Generic
base58 decode is `cry01_base58_decode()`: pure PHP, byte-array accumulator,
no bcmath/gmp dependency, handles arbitrary-length input.

**Caveat:** other Substrate chains and older Duniter v1 addresses may decode
to a different total length. For example, an old Cesium v1-era address decoded
to 32 bytes with no checksum at all during testing and was correctly rejected
by `cry01_ss58_decode()` as "unexpected decoded length". The 36-byte
/ 2-prefix-byte format is specific to (at least) Ğ1 v2 addresses as currently
generated.

---

## 3. AccountInfo Decoding

`state_getStorage` on a `System.Account` key returns a SCALE-encoded
`AccountInfo` struct:

```
nonce:            u32   (4 bytes)   offset 0
consumers:        u32   (4 bytes)   offset 4
providers:        u32   (4 bytes)   offset 8
sufficients:      u32   (4 bytes)   offset 12
data.free:        u128  (16 bytes)  offset 16  <- the spendable balance
data.reserved:    u128  (16 bytes)  offset 32
data.frozen:      u128  (16 bytes)  offset 48
data.flags:       u128  (16 bytes)  offset 64
```

All fields are little-endian, concatenated with no padding/separators
(total 80 bytes when all fields present, though trailing zero fields may be
omitted/truncated in the raw response; always check actual length).

`free` is at byte offset 16, length 16 (u128, little-endian). Duniter v2 uses
**centimes** (1 Ğ1 = 100 units) as the smallest unit, same as Duniter v1.
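
As an illustration of the layout, the following Python sketch decodes the hex
string returned by `state_getStorage`. The names `AccountInfo`,
`decode_account_info` and `format_g1` are hypothetical, and the sketch
deliberately rejects anything other than the full 80-byte layout instead of
guessing at shorter responses.

```python
import struct
from typing import NamedTuple


class AccountInfo(NamedTuple):
    nonce: int
    consumers: int
    providers: int
    sufficients: int
    free: int
    reserved: int
    frozen: int
    flags: int


def decode_account_info(hex_result: str) -> AccountInfo:
    """Decode the hex string returned by state_getStorage for System.Account."""
    raw = bytes.fromhex(hex_result[2:] if hex_result.startswith("0x") else hex_result)
    if len(raw) != 80:
        raise ValueError(f"expected 80 bytes, got {len(raw)}")
    counters = struct.unpack_from("<4I", raw, 0)  # four little-endian u32
    balances = [int.from_bytes(raw[o:o + 16], "little") for o in (16, 32, 48, 64)]
    return AccountInfo(*counters, *balances)


def format_g1(centimes: int) -> str:
    # 1 Ğ1 = 100 centimes; integer arithmetic avoids float rounding.
    return f"{centimes // 100}.{centimes % 100:02d} Ğ1"
```

Python integers are arbitrary-precision, so the u128 fields need no special
handling here; the PHP side has to work around that, as described next.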

**u128 arithmetic without bcmath/gmp:** `cry01_le_bytes_to_decimal_string()`
implements little-endian byte → base-10 string conversion using only
string-based big-integer add/multiply (`cry01_decimal_string_add()`,
`cry01_decimal_string_multiply()`). No PHP extensions required.
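
The same idea can be sketched in Python for reference (Python has native big
integers, so this is only meant to illustrate the algorithm). The helper names
below are hypothetical and do not mirror the PHP functions' actual signatures.
The sketch walks the bytes from most significant to least significant and
computes `acc = acc * 256 + byte` on decimal digit strings.

```python
def dec_add(a: str, b: str) -> str:
    """Add two non-negative decimal strings digit by digit."""
    digits, carry = [], 0
    a, b = a[::-1], b[::-1]
    for i in range(max(len(a), len(b))):
        total = carry + (int(a[i]) if i < len(a) else 0) + (int(b[i]) if i < len(b) else 0)
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits)).lstrip("0") or "0"


def dec_mul_small(a: str, m: int) -> str:
    """Multiply a decimal string by a small non-negative integer (here 256)."""
    digits, carry = [], 0
    for ch in reversed(a):
        total = int(ch) * m + carry
        digits.append(str(total % 10))
        carry = total // 10
    while carry:
        digits.append(str(carry % 10))
        carry //= 10
    return "".join(reversed(digits)).lstrip("0") or "0"


def le_bytes_to_decimal(raw: bytes) -> str:
    """Convert little-endian bytes to a base-10 string without big integers."""
    acc = "0"
    for byte in reversed(raw):  # most significant byte first
        acc = dec_add(dec_mul_small(acc, 256), str(byte))
    return acc
```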

**Verified result:** account with 1 Ğ1 → `free` raw value `100` → formatted
as `1.00 Ğ1`.

---

## 4. Node Architecture: Light vs. Full

### 4.1 Light mirror node (`duniter-mirror.service`, pre-existing)

- `--state-pruning 256` (default-ish), no explicit `--sync` flag
- Disk usage: ~2GB at block ~1.39M
- **Can serve `state_getStorage` for CURRENT state** (verified; this works
  fine for balance lookups)
- Cannot serve state for blocks older than the pruning window (~256 blocks,
  roughly 25 minutes of history at 6s block time)
- RPC originally bound to `127.0.0.1:9944` and `[::1]:9944` only (loopback),
  so it was **not reachable from the Hubzilla node over Wireguard** until
  fixed (see §5)

### 4.2 Full-state node (`duniter-full.service`, new tonight)

- `--sync fast --state-pruning 256`
- "fast" sync: downloads blocks without executing them, downloads latest
  state with proofs; much faster than `full` sync (full block execution
  from genesis)
- Disk usage: **under 5GB** after sync to chain head (~1.39M blocks),
  significantly smaller than initially estimated; the 32GB volume resize
  done tonight was generously oversized
- Sync time from genesis to chain head: **roughly 10-15 minutes** at
  ~1500-2500 blocks/sec, ~600-900 KiB/s
- Same current-state query capability as the light node. **For the balance
  lookup use case, this node was not strictly necessary**; the Twox128 fix
  alone would have made the light node work too (confirmed by testing the
  corrected storage key against both nodes: identical correct result)
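
The figures quoted for the two nodes are mutually consistent, which a few
lines of arithmetic confirm. The sketch below only uses numbers from this
section (256-block pruning window, 6 s block time, ~1.39M blocks,
1500-2500 blocks/sec); the variable names are illustrative.

```python
PRUNING_WINDOW_BLOCKS = 256
BLOCK_TIME_S = 6
CHAIN_HEIGHT = 1_390_000  # approximate, "~1.39M"
SYNC_RATE_BLOCKS_PER_S = (1500, 2500)

# History retained by --state-pruning 256, in minutes.
window_minutes = PRUNING_WINDOW_BLOCKS * BLOCK_TIME_S / 60

# Fast-sync duration at the slowest and fastest observed rates, in minutes.
slow_minutes = CHAIN_HEIGHT / SYNC_RATE_BLOCKS_PER_S[0] / 60
fast_minutes = CHAIN_HEIGHT / SYNC_RATE_BLOCKS_PER_S[1] / 60

print(f"pruning window: {window_minutes:.1f} min")
print(f"sync time: {fast_minutes:.1f} to {slow_minutes:.1f} min")
```

This gives a window of about 25.6 minutes and a sync time of roughly 9 to
15 minutes, in line with the observed values.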

### 4.3 What NEITHER node provides: full transaction history

Both nodes above use `--state-pruning 256`, so only recent state is
retrievable. **Neither supports querying historical balances at arbitrary
past blocks, nor provides transaction history.** The planned future feature
(paste an address, see full transaction history) requires either:

- `--state-pruning archive` (keep state for every historical block;
  significantly larger disk footprint, not yet measured)
- A separate indexer (e.g. Subsquid/Squid, mentioned in Duniter's own docs
  for "public RPC" setups) that processes blocks and stores an indexed
  transaction database. This is likely the more practical path for a
  transaction-history UI, since raw archive-node state queries don't give
  you "all transactions for address X" without scanning every block

This is future work, scoped separately.

### 4.4 Smith / validator node: explicitly out of scope here

A Smith (validator) node requires session keys, `rotateKeys`, and on-chain
Smith certification within the Ğ1 web of trust. This is a substantially
larger, separate project (new Proxmox container, 786GB available on
`/var/lib/vz` on `proxmox1`) and was **not** undertaken tonight. The
`duniter-full` instance described in §4.2 is a plain full node, not a
validator.

---

## 5. systemd Configuration Changes

### 5.1 `duniter-mirror.service`: RPC bind fix

**Problem:** RPC server only listened on `127.0.0.1:9944` / `[::1]:9944`,
so the Hubzilla node (on the Wireguard network, 10.0.0.x) could not reach it
(`Connection refused`).

**Fix:** drop-in override at
`/etc/systemd/system/duniter-mirror.service.d/override.conf`:

```ini
[Service]
# An empty ExecStart= clears the command inherited from the main unit.
ExecStart=
ExecStart=/usr/bin/duniter --chain ${DUNITER_CHAIN_NAME} --name ${DUNITER_NODE_NAME}_mirror --listen-addr ${DUNITER_LISTEN_ADDR} --state-pruning ${DUNITER_PRUNING_PROFILE} --base-path ${BASE_PATH} --experimental-rpc-endpoint "listen-addr=127.0.0.1:9944,methods=safe" --experimental-rpc-endpoint "listen-addr=10.0.0.105:9944,methods=safe"
```

**Important gotchas encountered:**

- `--experimental-rpc-endpoint` and the legacy `--rpc-cors` flag are
  **mutually exclusive**; using both is a hard error
  (`the argument '--rpc-cors <ORIGINS>' cannot be used with '--experimental-rpc-endpoint <EXPERIMENTAL_RPC_ENDPOINT>...'`)
- The value of `--experimental-rpc-endpoint` is a comma-separated list of
  `key=value` pairs, so passing a comma-separated list of CORS origins as
  `cors=http://a,http://b,...` breaks the parser (each origin gets
  misinterpreted as a separate `key=value` attempt). **We omitted `cors=`
  entirely**; it is not needed for server-to-server JSON-RPC (no browser
  involved).
- `--experimental-rpc-endpoint` **replaces** the legacy RPC config wholesale,
  including the default localhost binding. The Oracle
  (`ORACLE_RPC_URL=ws://127.0.0.1:9944`) depends on a localhost endpoint, so
  **two** `--experimental-rpc-endpoint` flags are needed: one for
  `127.0.0.1` (Oracle) and one for `10.0.0.105` (Wireguard/Hubzilla access).
- `methods=safe` exposes only the RPC methods Substrate classifies as safe,
  blocking node-administration calls such as `author_rotateKeys`. That is
  appropriate for both endpoints here, since neither the Oracle nor cry01
  needs anything beyond read access through these nodes.

**Result confirmed:**
```
Running JSON-RPC server: addr=127.0.0.1:9944,10.0.0.105:9944
```
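
A quick way to confirm that each endpoint actually answers is a
`system_health` call. The following Python sketch uses only the standard
library. The endpoint addresses come from the override above; the helper
names and the timeout are assumptions, and the sketch presumes the RPC port
accepts plain HTTP POST, as the `http://` endpoint configured in §6 suggests.

```python
import json
import urllib.request

ENDPOINTS = ["http://127.0.0.1:9944", "http://10.0.0.105:9944"]


def build_request(method: str, params=None, request_id: int = 1) -> bytes:
    """Serialize a JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params or []}
    return json.dumps(body).encode("utf-8")


def rpc_call(url: str, method: str, params=None, timeout: float = 5.0):
    """POST one JSON-RPC request and return its 'result' field."""
    req = urllib.request.Request(
        url,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def check_endpoints(urls=ENDPOINTS) -> dict:
    """Map each URL to its system_health result, or to the error text."""
    status = {}
    for url in urls:
        try:
            status[url] = rpc_call(url, "system_health")
        except (OSError, RuntimeError, ValueError) as exc:
            status[url] = f"unreachable: {exc}"
    return status
```

The loopback endpoint only answers when `check_endpoints()` runs on the
orchestrator itself; from the Hubzilla node, only the `10.0.0.105` address is
expected to respond.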

### 5.2 `duniter-full.service`: new unit

New standalone systemd unit at `/etc/systemd/system/duniter-full.service`:

```ini
[Unit]
Description=Duniter full-state node.
After=network.target

[Service]
Type=simple
User=duniter
Group=duniter
ExecStart=/usr/bin/duniter --chain g1 --name CivicInfrastructure-G1-Full_full --listen-addr /ip4/0.0.0.0/tcp/30334/ws --sync fast --state-pruning 256 --base-path /home/duniter/.local/share/duniter-full --experimental-rpc-endpoint "listen-addr=127.0.0.1:9945,methods=safe" --experimental-rpc-endpoint "listen-addr=10.0.0.105:9945,methods=safe"
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

**Gotcha:** an IPv6 `--listen-addr /ip6/[::]/tcp/30334/ws` was attempted
first and failed with `multiaddr parsing error: invalid IPv6 address syntax`.
At the time this was attributed to the shell's bracket-glob handling of
`[::]` mangling the argument before it reached the binary, even when quoted
in the heredoc (the unit file itself stores it correctly, but
constructing/testing such strings interactively via shell is error-prone).
Note that multiaddr syntax writes IPv6 addresses without brackets (e.g.
`/ip6/::/tcp/30334/ws`), so the bracketed form itself is a plausible
alternative cause. **The IPv6 listen-addr was omitted entirely.** The existing
`duniter-mirror` unit does the same (IPv4-only `/ip4/0.0.0.0/tcp/30333/ws`),
so this is consistent with existing practice, not a regression.

**Data directory:** `/home/duniter/.local/share/duniter-full`, owned by
`duniter:duniter`, created fresh (separate from the mirror's
`/home/duniter/.local/share/duniter`).

**Disk:** orchestrator's root filesystem (`/dev/loop4`) was resized from
~8GB to 32GB ahead of this to provide headroom. Actual usage after full sync:
under 5GB. The resize was generous relative to actual need, but a 32GB
volume with ~27GB free leaves comfortable room for future growth (the state
trie grows over time as the chain progresses and more accounts/identities are
created).

---

## 6. cry01 Configuration

`hubzilla/addon/cry01/config.json` (host-only, not in repo):

```json
"g1_rpc_endpoint": "http://10.0.0.105:9945"
```

Currently points at the new full node (port 9945). Per §4.2, the light
node (port 9944) would also work for balance lookups now that the Twox128
fix is in place: both were verified to return identical correct results
for the test account. The choice of which to point at is not
load-bearing for correctness; it is an operational/redundancy decision left
open for now.

---

## 7. Tools Used for Diagnosis

- **scalecodec** (Python, `pip install scalecodec`): decodes
  `state_getMetadata` output to enumerate pallets/storage items and confirm
  hasher types. Installed in the orchestrator's existing venv at
  `/srv/civic-orchestrator/venv`.
- **xxhash** (Python, `pip install xxhash`): used to independently compute
  and cross-check Twox128/xxh64 values against the PHP implementation.
- Both are isolated to the orchestrator's Python venv and are not installed
  on the Hubzilla node.

---

## 8. Summary of Verified Facts (quick reference)

| Claim | Status |
|---|---|
| Twox128 ≠ xxh128; Twox128 = reverse(xxh64(d,0)) + reverse(xxh64(d,1)) | ✅ Verified against live chain |
| Blake2_128Concat = Blake2b-128(key) + key, Blake2b-128 is parameterized (not truncated) | ✅ Verified against RFC 7693 vectors |
| Ğ1 addresses: 36-byte SS58, 2-byte prefix (0x5891), Blake2b-512/SS58PRE checksum | ✅ Verified, checksum matched |
| AccountInfo.free at offset 16, 16 bytes LE, divide by 100 for Ğ1 | ✅ Verified: 1 Ğ1 account → correct result |
| Light node (header-sync) can serve current-state `state_getStorage` | ✅ Verified: works identically to full node for current balances |
| Light/full node disk usage at block ~1.39M | Light: ~2GB. Full (fast sync): <5GB |
| Full sync (fast mode) time from genesis | ~10-15 minutes |
| Neither node supports historical/archive queries or tx history | By design (`--state-pruning 256`); archive node or indexer needed for that |