thank you for getting this going! some old thoughts from a previous private discussion about this:
thing 1: i'm wondering if the cursor should hold the initial length of the backlinks vec, pass it along on subsequent calls, and treat that as the end of the range, even if newer links have become available since paging began. (we could even flag to clients that newer links are available, and perhaps let clients start paging again from the end of a cursor?)
this would give something closer to a consistent view of a point in time, which is nice. of course it's not actually fully consistent: a link could be deleted while we're paging, so the total set that a client sees is "all live when they started, plus some undefined combination of links deleted while paging". but it's probably close enough.
(we could do an actual db snapshot and identify it with the cursor, but that's adding a lot of new stuff to handle, so, not for now at least.)
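to make thing 1 concrete, here's a rough sketch of what the cursor could carry. all the names (`BacklinkCursor`, `page_backlinks`, the `Page` shape) are made up for illustration, not what's in the codebase: the idea is just that the cursor records the vec length when paging starts, every later page is clamped to that, and a flag tells clients that newer links have shown up since.

```rust
/// hypothetical cursor: captures the vec length when paging starts so
/// later pages never walk past links appended mid-pagination
struct BacklinkCursor {
    /// index where the next page ends (paging newest-first, counting down)
    next: usize,
    /// length of the backlinks vec when the cursor was created
    snapshot_len: usize,
}

struct Page<'a, T> {
    /// slice in underlying vec order; callers can reverse for newest-first display
    items: &'a [T],
    /// None once we've walked back to the start of the vec
    cursor: Option<BacklinkCursor>,
    /// true when the live vec has grown past snapshot_len, so the client
    /// knows it can start a fresh cursor to pick up newer links
    newer_links_available: bool,
}

fn page_backlinks<'a, T>(
    backlinks: &'a [T],
    cursor: Option<BacklinkCursor>,
    limit: usize,
) -> Page<'a, T> {
    // first call: snapshot the current length and start from the newest entry
    let cursor = cursor.unwrap_or(BacklinkCursor {
        next: backlinks.len(),
        snapshot_len: backlinks.len(),
    });
    // clamp to the snapshot, ignoring anything appended since paging began
    // (also clamp to the live length in case the vec can ever shrink)
    let end = cursor.next.min(cursor.snapshot_len).min(backlinks.len());
    let start = end.saturating_sub(limit);
    Page {
        items: &backlinks[start..end],
        cursor: (start > 0).then_some(BacklinkCursor {
            next: start,
            snapshot_len: cursor.snapshot_len,
        }),
        newer_links_available: backlinks.len() > cursor.snapshot_len,
    }
}
```

a client that wants the newer links can just drop the cursor and start over, which is roughly the "start paging again from the end of a cursor" idea above.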
thing 2 (related): the way record updates are handled right now is to look up the previous version, generate deletes for all links in it, and then generate creates for all links in the new version. this winds up being correct-ish, but for links the update didn't actually change, it churns them and pushes their backlinks to the end of the vec.
in the context of this PR, that means you can actually get duplicate backlinks when paging in reverse: you might see a link from a record that then gets updated, causing the same link to reappear later in the paging.
that might be acceptable, but it would need documentation noting the behaviour. (or, if we stop iterating at the original vec length, it's a non-issue because the duplicates would land after the end)
(separately: we should fix record update handling so it actually diffs the set of links, to avoid churning unchanged links)
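for that diffing fix, something like this is probably all it takes. again just a sketch: `Link` here stands in for however a link is actually identified in the index, and it assumes links within a single record version are unique.

```rust
use std::collections::HashSet;

/// stand-in for whatever uniquely identifies a link in the index
/// (e.g. path within the record plus the target uri)
type Link = String;

enum LinkOp {
    Create(Link),
    Delete(Link),
}

/// on a record update, emit deletes only for links that disappeared and
/// creates only for links that are genuinely new; links present in both
/// versions are untouched, so their position in the backlinks vec doesn't churn
fn diff_links(old: &[Link], new: &[Link]) -> Vec<LinkOp> {
    let old_set: HashSet<&Link> = old.iter().collect();
    let new_set: HashSet<&Link> = new.iter().collect();
    let mut ops = Vec::new();
    for link in old {
        if !new_set.contains(link) {
            ops.push(LinkOp::Delete(link.clone()));
        }
    }
    for link in new {
        if !old_set.contains(link) {
            ops.push(LinkOp::Create(link.clone()));
        }
    }
    ops
}
```

with that in place the duplicate-on-reverse-paging case from thing 2 mostly goes away too, since unchanged links never get re-appended.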