encode memory trust hierarchy in phi's personality and operational instructions

phi hallucinated a user's name because synthesized memory summaries were
treated with the same weight as verbatim exchanges. this adds principled
trust levels — grounded in anthropic's constitution — to the system prompt,
personality doc, context labels, and extraction prompt so phi hedges on
low-trust data and treats user corrections as authoritative.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>