Valid `sh.tangled.repo.pull` records can remain permanently absent from the AppView database when the pull ingester fails on, or misses, the corresponding Jetstream event even once.
Observed during real E2E testing:
- A CLI-created PR on a fresh repo was ingested and is visible in AppView:
  - AppView: https://tangled.org/onev.cat/tang-e2e-pr-did-20260517220149/pulls/1/round/0
  - Record: `at://did:plc:kl2ejrmz5zmxnno3ll4luz76/sh.tangled.repo.pull/3mm2gfugldv22`
- A CLI-created PR on `onev.cat/tang` exists in PDS and Constellation, but never received an AppView pull number:
  - Record: `at://did:plc:kl2ejrmz5zmxnno3ll4luz76/sh.tangled.repo.pull/3mm2jn2yllp22`
  - It appears in protocol/Constellation-based listings, but `/onev.cat/tang/pulls/<next id>` does not exist.
- A manually created AppView PR for the same branch does appear:
  - AppView: https://tangled.org/onev.cat/tang/pulls/3/round/0
  - Record: `at://did:plc:kl2ejrmz5zmxnno3ll4luz76/sh.tangled.repo.pull/3mm2jyxol5s22`
The skipped CLI record appears structurally valid:
- `target.repo` is the repo DID: `did:plc:vglueuiqgjehmkqpjloszu2q`
- `source.repo` is the same repo DID for a branch-based PR
- the patch blob can be fetched from the author's PDS and decompressed as a valid gzip patch
- its shape matches the successful fresh-repo CLI PR closely
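
The gzip check on the patch blob can be repeated with a few lines of Go. This is a minimal sketch that assumes the blob bytes have already been downloaded from the author's PDS (the fetch itself is not shown). `decompressPatch` and `gzipBytes` are hypothetical helper names, and the checks here are not necessarily the ones AppView performs.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// decompressPatch is a hypothetical helper: it gunzips a patch blob that has
// already been downloaded and returns the plain-text patch.
func decompressPatch(blob []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return "", fmt.Errorf("not a gzip stream: %w", err)
	}
	defer zr.Close()

	raw, err := io.ReadAll(zr)
	if err != nil {
		return "", fmt.Errorf("corrupt gzip payload: %w", err)
	}
	if len(raw) == 0 {
		return "", fmt.Errorf("patch is empty after decompression")
	}
	return string(raw), nil
}

// gzipBytes compresses data. It exists only to build a sample blob for
// trying out decompressPatch without touching a real PDS.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Passing the output of `gzipBytes` to `decompressPatch` returns the original patch text, while passing uncompressed bytes returns an error. A blob that passes this check rules out a corrupt payload as the reason for the skipped record.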
The likely failure mode is in the AppView ingester path:
- `Ingester.Ingest` dispatches `sh.tangled.repo.pull` to `ingestPull`.
- `ingestPull` fetches patch blobs, parses `models.PullFromRecord`, validates, and calls `db.PutPull`.
- If any step returns an error, the outer ingester logs `failed to ingest record, skipping`.
- The cursor is still advanced with `SaveLastTimeUs(e.TimeUS + 1)`.
That makes ingestion one-shot, so a single transient or data-dependent failure becomes permanent: the valid PDS record remains visible to protocol consumers, but the AppView DB never repairs itself unless there is a manual backfill/resync or a later record update that happens to retrigger ingestion.
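
The one-shot behaviour can be reproduced in isolation. The following is a minimal Go sketch, not the AppView code: `event`, `cursorStore`, `consume`, and `failOnce` are hypothetical stand-ins, and only the log message and the `SaveLastTimeUs(e.TimeUS + 1)` call shape come from the description above.

```go
package main

import (
	"errors"
	"log"
)

// event is a simplified, hypothetical stand-in for a Jetstream commit event.
type event struct {
	TimeUS int64
	URI    string
}

// cursorStore is a hypothetical in-memory stand-in for the persisted cursor.
type cursorStore struct{ lastTimeUS int64 }

func (c *cursorStore) SaveLastTimeUs(t int64) { c.lastTimeUS = t }

// consume mirrors the behaviour described above: a failed ingest is logged
// and skipped, and the cursor moves past the event either way.
func consume(events []event, cur *cursorStore, ingest func(event) error) (ingested []string) {
	for _, e := range events {
		if e.TimeUS < cur.lastTimeUS {
			// Simulates replaying from the saved cursor: events behind
			// the cursor are never delivered again.
			continue
		}
		if err := ingest(e); err != nil {
			log.Printf("failed to ingest record, skipping: uri=%s err=%v", e.URI, err)
		} else {
			ingested = append(ingested, e.URI)
		}
		cur.SaveLastTimeUs(e.TimeUS + 1)
	}
	return ingested
}

// failOnce returns an ingest function that fails the first time it sees uri
// and succeeds on every later call, modelling a transient error.
func failOnce(uri string) func(event) error {
	failed := false
	return func(e event) error {
		if e.URI == uri && !failed {
			failed = true
			return errors.New("transient blob fetch error")
		}
		return nil
	}
}
```

Feeding three events through `consume` with `failOnce` set to the second URI ingests the first and third and leaves the cursor past all three. Replaying the same events from the saved cursor ingests nothing, even though the failing record would now succeed.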
Possible fixes:
- Do not advance the Jetstream cursor for failed records that should be retried.
- Add a dead-letter/retry queue for failed ingests, keyed by AT URI and collection.
- Add an admin or internal resync path for a specific AT URI / DID+rkey.
- Add a lazy repair path when AppView can resolve a pull from PDS/Constellation but has no DB row.
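
As an illustration of the dead-letter/retry option, here is a minimal Go sketch of a queue keyed by AT URI. Every name in it (`failedIngest`, `retryQueue`, `RetryAll`, `errBlobFetch`) is hypothetical, the queue is in-memory only, and the retry policy (a fixed attempt cap) is an assumption; a real implementation would need to persist the queue alongside the cursor.

```go
package main

import "errors"

// errBlobFetch is a sample transient error used when trying out the queue.
var errBlobFetch = errors.New("blob fetch timed out")

// failedIngest describes one record that could not be ingested.
type failedIngest struct {
	URI        string // AT URI of the record
	Collection string // for example "sh.tangled.repo.pull"
	Attempts   int    // number of failed ingest attempts so far
	LastErr    string // message of the most recent failure
}

// retryQueue is a hypothetical in-memory dead-letter queue keyed by AT URI.
type retryQueue struct {
	items map[string]*failedIngest
}

func newRetryQueue() *retryQueue {
	return &retryQueue{items: make(map[string]*failedIngest)}
}

// Add records a failed ingest. Repeated failures for the same URI update the
// existing entry instead of creating a duplicate.
func (q *retryQueue) Add(uri, collection string, err error) {
	it, ok := q.items[uri]
	if !ok {
		it = &failedIngest{URI: uri, Collection: collection}
		q.items[uri] = it
	}
	it.Attempts++
	it.LastErr = err.Error()
}

// RetryAll re-runs ingest for every queued record. Successes are removed from
// the queue. Records that have reached maxAttempts are not retried; they stay
// queued and are returned so they can be inspected or resynced manually.
func (q *retryQueue) RetryAll(ingest func(uri, collection string) error, maxAttempts int) (exhausted []string) {
	for uri, it := range q.items {
		if it.Attempts >= maxAttempts {
			exhausted = append(exhausted, uri)
			continue
		}
		if err := ingest(uri, it.Collection); err != nil {
			q.Add(uri, it.Collection, err)
			continue
		}
		delete(q.items, uri)
	}
	return exhausted
}

// Len reports how many records are waiting for a retry or manual attention.
func (q *retryQueue) Len() int { return len(q.items) }
```

With a queue like this, the ingester could keep advancing the cursor on failure, which avoids stalling the stream on a record that never succeeds, while still giving failed records further attempts. The same entries could also feed the admin resync path, since each one already carries the AT URI and collection.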
This is separate from CLI record schema generation: the affected record is present in PDS/Constellation and has a repo-DID based target.