Monorepo for Tangled tangled.org

AppView pull ingester should retry or resync skipped records #533

open opened by onev.cat

Valid sh.tangled.repo.pull records can remain permanently absent from the AppView database if the pull ingester fails on, or misses, the corresponding Jetstream event even once.

Observed during real E2E testing:

  • A CLI-created PR on a fresh repo was ingested and is visible in AppView.
  • A CLI-created PR on onev.cat/tang exists in PDS and Constellation, but never received an AppView pull number:
    • Record: at://did:plc:kl2ejrmz5zmxnno3ll4luz76/sh.tangled.repo.pull/3mm2jn2yllp22
    • It appears in protocol/Constellation-based listings, but /onev.cat/tang/pulls/<next id> does not exist.
  • A manually created AppView PR for the same branch does appear.

The skipped CLI record appears structurally valid:

  • target.repo is the repo DID: did:plc:vglueuiqgjehmkqpjloszu2q
  • source.repo is the same repo DID for a branch-based PR
  • the patch blob can be fetched from the author's PDS and decompressed as a valid gzip patch
  • its shape matches the successful fresh-repo CLI PR closely

The likely failure mode is in the AppView ingester path:

  1. Ingester.Ingest dispatches sh.tangled.repo.pull to ingestPull.
  2. ingestPull fetches patch blobs, parses models.PullFromRecord, validates, and calls db.PutPull.
  3. If any step returns an error, the outer ingester logs "failed to ingest record, skipping".
  4. The cursor is still advanced with SaveLastTimeUs(e.TimeUS + 1).

That makes a transient or data-dependent ingest failure one-shot: the valid PDS record remains visible to protocol consumers, but the AppView DB never repairs itself unless there is a manual backfill/resync or a later record update that happens to retrigger ingestion.
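
The one-shot behaviour can be illustrated with a minimal Go sketch. All names here (`event`, `consume`, `errTransient`) are hypothetical stand-ins and not the AppView's actual code; only the shape of the loop (log, skip, advance the cursor to `TimeUS + 1`) follows the steps listed in this issue.

```go
package main

import (
	"errors"
	"log"
)

// errTransient simulates a one-off failure such as a blob fetch timeout.
var errTransient = errors.New("transient ingest failure")

// event is a hypothetical stand-in for a Jetstream commit event.
type event struct {
	TimeUS int64  // event timestamp in microseconds
	URI    string // AT URI of the record
}

// consume mimics the loop shape described in this issue: an ingest error
// is logged and the cursor is advanced anyway. It returns the final
// cursor and the URIs that were skipped.
func consume(events []event, ingest func(event) error) (cursor int64, skipped []string) {
	for _, e := range events {
		if err := ingest(e); err != nil {
			log.Printf("failed to ingest record, skipping: %s: %v", e.URI, err)
			skipped = append(skipped, e.URI)
		}
		// Advanced unconditionally, so replaying from this cursor
		// never delivers the failed event again.
		cursor = e.TimeUS + 1
	}
	return cursor, skipped
}
```

If `ingest` returns `errTransient` for one event, that event's URI ends up in `skipped` while the cursor still moves past it, which is the state the affected record appears to be in.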

Possible fixes:

  • Do not advance the Jetstream cursor for failed records that should be retried.
  • Add a dead-letter/retry queue for failed ingests, keyed by AT URI and collection.
  • Add an admin or internal resync path for a specific AT URI / DID+rkey.
  • Add a lazy repair path when AppView can resolve a pull from PDS/Constellation but has no DB row.
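
As an illustration of the dead-letter/retry option, here is a minimal in-memory sketch in Go. Everything in it (`failedKey`, `retryQueue`, `RetryAll`, the `maxAttempts` cap) is hypothetical; a real implementation would presumably persist the queue next to the cursor and re-fetch the record from the PDS on retry.

```go
package main

import (
	"errors"
	"sync"
)

// errStillFailing is a sample error for retries that do not succeed yet.
var errStillFailing = errors.New("ingest still failing")

// failedKey identifies a failed ingest by AT URI and collection.
type failedKey struct {
	URI        string
	Collection string
}

// retryQueue is a hypothetical in-memory dead-letter queue that counts
// failed ingest attempts per key.
type retryQueue struct {
	mu       sync.Mutex
	attempts map[failedKey]int
}

func newRetryQueue() *retryQueue {
	return &retryQueue{attempts: make(map[failedKey]int)}
}

// Add records one failed attempt for k.
func (q *retryQueue) Add(k failedKey) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.attempts[k]++
}

// Len reports how many keys are still queued.
func (q *retryQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.attempts)
}

// RetryAll re-runs ingest for every key with fewer than maxAttempts
// failures, removes the keys that succeed, and returns how many were
// repaired. Keys at the cap stay queued for a manual resync.
func (q *retryQueue) RetryAll(ingest func(failedKey) error, maxAttempts int) (repaired int) {
	q.mu.Lock()
	keys := make([]failedKey, 0, len(q.attempts))
	for k, n := range q.attempts {
		if n < maxAttempts {
			keys = append(keys, k)
		}
	}
	q.mu.Unlock()

	for _, k := range keys {
		if err := ingest(k); err != nil {
			q.Add(k) // still failing: count the attempt and keep the key
			continue
		}
		q.mu.Lock()
		delete(q.attempts, k)
		q.mu.Unlock()
		repaired++
	}
	return repaired
}
```

With this shape the ingester could keep advancing the Jetstream cursor while calling `Add` on every failure, and a periodic `RetryAll` would repair records whose failure was transient. Keys that reach the cap remain visible for the admin resync path.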

This is separate from CLI record schema generation: the affected record is present in PDS/Constellation and has a repo-DID-based target.

AT URI
at://did:plc:kl2ejrmz5zmxnno3ll4luz76/sh.tangled.repo.issue/3mm2krb7uy22p