experiment: round 2 complete, SAME_TYPE_REFINE_THRESHOLD is the key lever

Reference implementation for the Phoenix Architecture. Work in progress. aicoding.leaflet.pub/

Ran 8 more experiments after the inferEdgeType code fix. Setting the new
SAME_TYPE_REFINE_THRESHOLD parameter to 0.2 was the big win, dropping
D-rate from 47% to 9%. Final score: 0.9640. The remaining gap is Auth v2
type accuracy (67%), caused by ambiguous substring matching in the gold
standard.

Chad Fowler 1 month ago 7a75354b 1462844a

1 changed file


experiments/results.tsv

···
18 18   2026-03-26T22:46:38.390Z 0.9061 100.0 94.4 95.5 47.4 100.0 6.2 nq75ia
19 19   2026-03-26T23:12:45.269Z 0.9265 100.0 94.4 95.5 33.8 100.0 6.2 x0da3a
20 20   2026-03-26T23:13:07.799Z 0.9640 100.0 94.4 95.5 8.8 100.0 6.2 42knqt
   21 + 2026-03-26T23:14:22.740Z 0.9640 100.0 94.4 95.5 8.8 100.0 6.2 42knqt
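
The rows in this diff can be checked against the numbers in the commit message with a short script. The sketch below is illustrative only. The header of results.tsv is not shown here, so the column positions are assumptions inferred from the message: the second field is taken as the score (it ends at 0.9640) and the sixth as D-rate (47.4 falling to 8.8). The names `SCORE_COL`, `D_RATE_COL`, `run_id`, `parse_rows`, and `summarize` are hypothetical, and the file is assumed to be tab-separated.

```python
# Assumed column positions; the real results.tsv header is not visible in the diff.
SCORE_COL = 1    # 0.9061 -> 0.9640, matches "Final score: 0.9640"
D_RATE_COL = 5   # 47.4 -> 8.8, matches "D-rate from 47% to 9%"

# The four rows shown in the diff, written as tab-separated text.
ROWS = """\
2026-03-26T22:46:38.390Z\t0.9061\t100.0\t94.4\t95.5\t47.4\t100.0\t6.2\tnq75ia
2026-03-26T23:12:45.269Z\t0.9265\t100.0\t94.4\t95.5\t33.8\t100.0\t6.2\tx0da3a
2026-03-26T23:13:07.799Z\t0.9640\t100.0\t94.4\t95.5\t8.8\t100.0\t6.2\t42knqt
2026-03-26T23:14:22.740Z\t0.9640\t100.0\t94.4\t95.5\t8.8\t100.0\t6.2\t42knqt
"""


def parse_rows(text):
    """Parse tab-separated result rows into dicts, skipping short lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        rows.append({
            "timestamp": fields[0],
            "score": float(fields[SCORE_COL]),
            "d_rate": float(fields[D_RATE_COL]),
            "run_id": fields[-1],  # assumed meaning of the trailing token
        })
    return rows


def summarize(rows):
    """Report the best-scoring row and the D-rate drop from first to last row."""
    best = max(rows, key=lambda r: r["score"])
    return {
        "best_score": best["score"],
        "best_run": best["run_id"],
        "d_rate_drop": round(rows[0]["d_rate"] - rows[-1]["d_rate"], 1),
    }


summary = summarize(parse_rows(ROWS))
print(summary)
```

Reading the real file would replace `ROWS` with the file's contents; if the file has a header row, it would need to be skipped before converting fields to floats.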
