experiment: round 2 complete, SAME_TYPE_REFINE_THRESHOLD is the key lever
Ran 8 more experiments after the inferEdgeType code fix. Setting the new
SAME_TYPE_REFINE_THRESHOLD parameter to 0.2 was the biggest win, dropping
D-rate from 47% to 9%. Final score: 0.9640. The remaining gap is Auth v2
type accuracy (67%), caused by ambiguous substring matching in the gold
standard.