Annex IV dossier bundle
The dossier is the single artefact a Market Surveillance Authority can request to verify your compliance with Arts. 11, 12, 15, 53, and 73 for a given reporting period.
Endpoint
The response is application/zip with the header
Content-Disposition: attachment; filename="dossier_<org_slug>_<YYYYMMDD>_<YYYYMMDD>.zip".
Implemented in
backend/app/services/dossier_service.py.
Zip layout
dossier_<org>_<period>/
├── annex_iv.pdf # Art. 11 technical documentation
├── audit_log_period.jsonl # Art. 12 event logs (HMAC chain)
├── integrity_attestation.json # Chain verify result at bundle time
├── metrics.json # Art. 15 aggregates
├── provider_manifest.json # Art. 53 upstream GPAI trail
├── incidents.json # Art. 73 incidents in period
└── MANIFEST.json # File inventory with sha256 per member
annex_iv.pdf
Generated by report_generator.generate(). Sections:
- Cover page
- System description
- Data governance
- Interaction summary (period aggregates)
- Human oversight (actions breakdown from audit_log_oversight)
- Bias monitoring
- Integrity verification (chain status)
- Risk assessment (boilerplate in OSS; commercial edition customises per deployer)
If WeasyPrint's system dependencies are missing on the host, the
generator falls back to annex_iv.html. The files[] array in
MANIFEST.json records the actual filename, so verification works
either way.
audit_log_period.jsonl
One JSON line per audit entry in the period, ordered by
sequence_number ASC. Each line contains every field that went into
the HMAC payload, plus the stored previous_hash and current_hash.
A regulator can verify the chain offline given the per-org HMAC key.
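Even without the key, ordering and hash linkage can be sanity-checked. The sketch below is illustrative only: check_linkage is a hypothetical helper, and it assumes that each entry's previous_hash equals the current_hash of the entry before it and that sequence_number strictly increases. The first exported entry is not checked, because its predecessor may fall outside the period.

```python
import json
from typing import Iterable, Optional


def check_linkage(lines: Iterable[str]) -> Optional[int]:
    """Return the sequence_number of the first entry that breaks ordering
    or hash linkage, or None if the exported lines are consistent.

    ASSUMPTION: previous_hash of entry n equals current_hash of entry n-1.
    """
    prev_seq = None
    prev_hash = None
    for line in lines:
        if not line.strip():
            continue  # tolerate a trailing blank line
        entry = json.loads(line)
        seq = entry["sequence_number"]
        if prev_seq is not None:
            # Ordering: sequence numbers must strictly increase.
            if seq <= prev_seq:
                return seq
            # Linkage: each entry must point at its predecessor's hash.
            if entry["previous_hash"] != prev_hash:
                return seq
        prev_seq, prev_hash = seq, entry["current_hash"]
    return None
```

A typical call would be check_linkage(open("audit_log_period.jsonl", encoding="utf-8")). This does not replace the keyed HMAC check; it only catches reordered or dropped lines.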
integrity_attestation.json
The result of audit_service.verify_chain() at the moment the bundle
was built. Fields: verified, total_entries, entries_checked,
first_broken_at, message, plus bundle metadata.
metrics.json
Art. 15 aggregates over the period:
{
"format": "lex-custis/metrics/v1",
"period_start": "...",
"period_end": "...",
"total_interactions": 847,
"confidence": { "avg": 0.72, "min": 0.31, "max": 0.98 },
"pii": { "detections": 12, "rate": 0.0142 },
"bias": { "flagged_responses": 3, "rate": 0.0035 },
"human_oversight": { "actions_recorded": 102, "rate": 0.1204 },
"regulation": "EU AI Act Art. 15 — accuracy, robustness, cybersecurity (aggregates only; drift detection in commercial edition)"
}
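The rate fields in the sample are consistent with count / total_interactions rounded to four decimal places (for example, 12 / 847 ≈ 0.0142). That relationship is inferred from the sample values, not a documented contract. Under that assumption, a consumer could cross-check a metrics.json like this (rates_match is a hypothetical helper):

```python
import math


def rates_match(metrics: dict, ndigits: int = 4) -> bool:
    """Check that each reported rate equals count / total_interactions.

    ASSUMPTION: rates are rounded to `ndigits` places; inferred from the
    sample document, not confirmed by the generator's source.
    """
    total = metrics["total_interactions"]
    if total == 0:
        return False  # rates are undefined without interactions
    checks = [
        (metrics["pii"]["detections"], metrics["pii"]["rate"]),
        (metrics["bias"]["flagged_responses"], metrics["bias"]["rate"]),
        (metrics["human_oversight"]["actions_recorded"],
         metrics["human_oversight"]["rate"]),
    ]
    return all(
        math.isclose(round(count / total, ndigits), rate, abs_tol=1e-9)
        for count, rate in checks
    )


# Values copied from the sample above.
sample = {
    "total_interactions": 847,
    "pii": {"detections": 12, "rate": 0.0142},
    "bias": {"flagged_responses": 3, "rate": 0.0035},
    "human_oversight": {"actions_recorded": 102, "rate": 0.1204},
}
```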
provider_manifest.json
Distinct (provider, model) pairs seen in the period:
{
"format": "lex-custis/provider-manifest/v1",
"period_start": "...",
"period_end": "...",
"providers": [
{
"provider": "mistral",
"model": "mistral-large-latest",
"call_count": 842,
"first_seen": "...",
"last_seen": "...",
"upstream_disclosure": {
"policy_url": "https://mistral.ai/legal/",
"note": "Mistral models — see provider's Art. 53 GPAI disclosure..."
}
},
{
"provider": "ollama",
"model": "gemma3:4b",
"call_count": 5,
"first_seen": "...",
"last_seen": "...",
"upstream_disclosure": {
"policy_url": "https://ollama.com/",
"note": "Self-hosted Ollama runtime..."
}
}
],
"regulation": "EU AI Act Art. 53 — GPAI provider disclosures"
}
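The providers[] rows can be reproduced from the audit entries for the period. The following is a minimal sketch, not the service's actual code: summarise_providers is a hypothetical name, and it assumes one audit entry per upstream call and that timestamp_iso values share one format and timezone so they compare correctly as strings. The upstream_disclosure block is omitted because its source is not described here.

```python
from typing import Dict, Iterable, List, Tuple


def summarise_providers(entries: Iterable[dict]) -> List[dict]:
    """Aggregate distinct (provider, model) pairs with call counts and
    first/last timestamps.

    ASSUMPTION: timestamp_iso strings are uniformly formatted, so that
    lexicographic order equals chronological order.
    """
    seen: Dict[Tuple[str, str], dict] = {}
    for e in entries:
        key = (e["llm_provider"], e["llm_model"])
        ts = e["timestamp_iso"]
        row = seen.setdefault(key, {
            "provider": key[0],
            "model": key[1],
            "call_count": 0,
            "first_seen": ts,
            "last_seen": ts,
        })
        row["call_count"] += 1
        row["first_seen"] = min(row["first_seen"], ts)
        row["last_seen"] = max(row["last_seen"], ts)
    # Most-used pair first, matching the ordering of the sample above.
    return sorted(seen.values(), key=lambda r: -r["call_count"])
```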
incidents.json
The list of Art. 73 incidents whose detection_ts falls within the
period, each rendered via regulator_export(incident) plus an
sla_status_at_bundle snapshot.
MANIFEST.json
The bundle's root of trust:
{
"format": "lex-custis/annex-iv-dossier/v1",
"organization_id": "...",
"organization_name": "...",
"period_start": "...",
"period_end": "...",
"generated_at": "...",
"generated_by_user_id": "...",
"files": [
{ "name": "annex_iv.pdf",
"size_bytes": 283415,
"sha256": "abc123..." },
{ "name": "audit_log_period.jsonl",
"size_bytes": 1128345,
"sha256": "def456..." },
...
],
"regulation_coverage": {
"art_11": "annex_iv.pdf",
"art_12": "audit_log_period.jsonl",
"art_15": "metrics.json",
"art_53": "provider_manifest.json",
"art_73": "incidents.json"
},
"integrity_verified_at_bundle_time": true
}
In commercial editions, MANIFEST.json is signed with ed25519 and
time-stamped via RFC 3161. The public verification key is published at
a well-known URL and in the repository, so a regulator can verify the
bundle without contacting the provider.
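The files[] inventory can be checked with nothing but the standard library. This is a sketch under stated assumptions: verify_manifest is a hypothetical helper, sha256 values are taken to be lowercase hex digests, and names in files[] are taken to be relative to the dossier's root directory inside the zip.

```python
import hashlib
import json
import zipfile
from typing import List


def verify_manifest(zip_path: str) -> List[str]:
    """Return a list of problems; an empty list means every member listed
    in MANIFEST.json exists with the recorded size and sha256.

    ASSUMPTIONS: hex-encoded digests; names relative to the bundle root.
    """
    problems: List[str] = []
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
        manifest_name = next(
            (n for n in names if n.endswith("MANIFEST.json")), None
        )
        if manifest_name is None:
            return ["MANIFEST.json not found"]
        # Everything before the filename is the bundle root directory.
        root = manifest_name[: -len("MANIFEST.json")]
        manifest = json.loads(zf.read(manifest_name))
        for item in manifest["files"]:
            member = root + item["name"]
            if member not in names:
                problems.append(f"missing: {item['name']}")
                continue
            data = zf.read(member)
            if len(data) != item["size_bytes"]:
                problems.append(f"size mismatch: {item['name']}")
            if hashlib.sha256(data).hexdigest() != item["sha256"]:
                problems.append(f"sha256 mismatch: {item['name']}")
    return problems
```

This only proves the members match the manifest. Without a signature over MANIFEST.json itself, an attacker who can rewrite the zip can also rewrite the manifest.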
What's not in the bundle
- Raw training-data statistics (Art. 10 full dataset registry is paid).
- Art. 9 risk register (paid).
- Art. 27 FRIA outputs (paid).
- Deployer-specific IFUs (commercial customises the PDF per deployer).
If a regulator asks specifically for any of these, OSS v0.1 users need to supply them separately. The commercial edition adds them to the same bundle.
Offline verification
Even without a CLI, the verification procedure is reproducible by hand:
1. Extract the zip.
2. Confirm every file listed in MANIFEST.json.files[] exists and its sha256 matches.
3. Read audit_log_period.jsonl in sequence_number order.
4. For each entry, recompute:

       payload  = previous_hash | org_id | user_id | sequence_number | llm_provider | llm_model | timestamp_iso | sha256(user_prompt) | sha256(ai_response)
       expected = HMAC-SHA-256(k_org, payload)

   where k_org = HKDF-SHA-256(K, salt=org_id.bytes, info=b"lex-custis/audit-log/per-org-hmac-key/v1", 32 bytes).
5. Confirm expected == entry.current_hash for all entries.
If this walk succeeds, the chain is intact. If step 5 fails, the
first failing sequence_number localises the tampering.
A v0.2 CLI (verify-dossier) will automate this.
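Until that CLI exists, the walk can be sketched in Python using only the standard library. Several details below are assumptions, because the byte-level encoding is not specified here: fields are stringified and joined with a literal "|" in UTF-8, sha256(...) and current_hash are lowercase hex, org_id is a UUID string, and each entry carries the raw user_prompt and ai_response. The function names are hypothetical. Check these against audit_service before relying on the result.

```python
import hashlib
import hmac
import uuid
from typing import Iterable, Optional

INFO = b"lex-custis/audit-log/per-org-hmac-key/v1"


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def entry_hmac(k_org: bytes, e: dict) -> str:
    # ASSUMPTION: "|"-joined, UTF-8 encoded payload; hex-encoded output.
    payload = "|".join(str(x) for x in (
        e["previous_hash"], e["org_id"], e["user_id"], e["sequence_number"],
        e["llm_provider"], e["llm_model"], e["timestamp_iso"],
        sha256_hex(e["user_prompt"]), sha256_hex(e["ai_response"]),
    ))
    return hmac.new(k_org, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def first_broken(master_key: bytes, org_id: str,
                 entries: Iterable[dict]) -> Optional[int]:
    """Return the first sequence_number whose recomputed HMAC differs from
    current_hash, or None if every entry verifies."""
    # ASSUMPTION: org_id is a UUID string, so org_id.bytes is its 16 raw bytes.
    k_org = hkdf_sha256(master_key, uuid.UUID(org_id).bytes, INFO)
    for e in sorted(entries, key=lambda e: e["sequence_number"]):
        if not hmac.compare_digest(entry_hmac(k_org, e), e["current_hash"]):
            return e["sequence_number"]
    return None
```

hmac.compare_digest is used for the comparison so the check runs in constant time, which matters little offline but costs nothing.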