providers/anthropic: drop thinking from retry_kwargs when forcing tool use

Live validation of the L3 sense pilot surfaced a real bug in L1's
Anthropic structured-output fallback path: when the primary
output_config call raises BadRequestError, the fallback to forced
tool_use kept the `thinking` parameter, which Anthropic's API rejects
("Thinking may not be enabled when tool_choice forces tool use"). The
fallback then bubbled a confusing secondary 400 instead of recovering.

Drop `thinking` from retry_kwargs in both the sync and async paths.
Restore the temperature value that thinking originally displaced (the
primary path sets thinking xor temperature). Add a regression test
asserting that the retry kwargs omit thinking and carry temperature
forward.
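
For illustration, the fallback transformation described above can be
sketched as a standalone function. This is a sketch, not the code in
think/providers/anthropic.py: the helper name
`build_forced_tool_retry_kwargs` is hypothetical, the `input_schema`
tool field is assumed because the diff truncates the tool definition,
and the `output_config` value is a placeholder.

```python
from typing import Any


def build_forced_tool_retry_kwargs(
    request_kwargs: dict[str, Any],
    temperature: float,
    tool_name: str,
    json_schema: dict[str, Any],
) -> dict[str, Any]:
    """Hypothetical helper: derive forced-tool-use kwargs from the primary request."""
    # Shallow copy so the primary request kwargs are left untouched.
    retry_kwargs = dict(request_kwargs)
    retry_kwargs.pop("output_config", None)
    # Forced tool_choice cannot be combined with thinking, so drop it and
    # fall back to the temperature that thinking displaced.
    if retry_kwargs.pop("thinking", None) is not None:
        retry_kwargs.setdefault("temperature", temperature)
    # Assumption: the tool definition carries the schema as `input_schema`.
    retry_kwargs["tools"] = [{"name": tool_name, "input_schema": json_schema}]
    retry_kwargs["tool_choice"] = {"type": "tool", "name": tool_name}
    return retry_kwargs


primary_kwargs = {
    "max_tokens": 8192,
    "thinking": {"type": "enabled", "budget_tokens": 4096},
    "output_config": {"placeholder": True},  # placeholder value, shape not shown in the diff
}
retry_kwargs = build_forced_tool_retry_kwargs(
    primary_kwargs, temperature=0.5, tool_name="response", json_schema={"type": "object"}
)
```

Because `setdefault` is used, a temperature already present in the
request kwargs is kept rather than overwritten.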

Two pre-existing Anthropic constraints surfaced during the same live
test but are out of scope here:
1. max_tokens must be > thinking.budget_tokens (production sense
   defaults satisfy this)
2. The SDK requires streaming when max_tokens is large enough that the
   request could take >10 min (~30k+ for sonnet); the production sense
   default of 49152 hits this

Both affect any thinking-enabled Anthropic caller, schema or no
schema. Filed as separate VPE follow-up notes.
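
As a rough illustration of the first constraint only, a caller could
check the request before sending it. This is a sketch under stated
assumptions, not part of this commit: `check_thinking_budget` is a
hypothetical name, and the streaming threshold from the second
constraint is not modeled because the note above gives only an
approximate figure.

```python
from typing import Any


def check_thinking_budget(request_kwargs: dict[str, Any]) -> None:
    """Hypothetical preflight: raise if max_tokens does not exceed the thinking budget."""
    thinking = request_kwargs.get("thinking")
    if not thinking or thinking.get("type") != "enabled":
        return  # thinking is off, so the constraint does not apply
    budget = thinking.get("budget_tokens")
    max_tokens = request_kwargs.get("max_tokens")
    if budget is None or max_tokens is None:
        return  # not enough information to check
    if max_tokens <= budget:
        raise ValueError(
            f"max_tokens ({max_tokens}) must be greater than "
            f"thinking.budget_tokens ({budget})"
        )


# A request shaped like the production sense default mentioned above passes.
check_thinking_budget(
    {"max_tokens": 49152, "thinking": {"type": "enabled", "budget_tokens": 4096}}
)
```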

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Jer Miller 1 month ago c4a4f0c0 8c952dc4

2 changed files (+50)

tests/test_anthropic.py (+40)

@@ -496,6 +496,46 @@
         assert "output_config" not in retry_kwargs
         assert result["text"] == json.dumps({"key": "value"})
 
+    def test_fallback_drops_thinking_when_forcing_tool_use(self, monkeypatch):
+        # Anthropic rejects `tool_choice` forcing combined with `thinking`.
+        # Verify the fallback strips thinking and restores temperature.
+        provider = importlib.reload(
+            importlib.import_module("think.providers.anthropic")
+        )
+        mock_client = MagicMock()
+
+        class DummyBadRequestError(Exception):
+            pass
+
+        fallback_response = MagicMock()
+        fallback_response.content = [
+            SimpleNamespace(type="tool_use", input={"key": "value"}),
+        ]
+        fallback_response.usage = None
+        fallback_response.stop_reason = "end_turn"
+        mock_client.messages.create.side_effect = [
+            DummyBadRequestError("bad schema"),
+            fallback_response,
+        ]
+
+        monkeypatch.setattr(provider, "BadRequestError", DummyBadRequestError)
+        monkeypatch.setattr(provider, "_get_anthropic_client", lambda: mock_client)
+        schema = {"type": "object"}
+
+        provider.run_generate(
+            "hello", json_schema=schema, thinking_budget=4096, temperature=0.5
+        )
+
+        primary_kwargs = mock_client.messages.create.call_args_list[0].kwargs
+        assert primary_kwargs.get("thinking") == {
+            "type": "enabled",
+            "budget_tokens": 4096,
+        }
+        retry_kwargs = mock_client.messages.create.call_args_list[1].kwargs
+        assert "thinking" not in retry_kwargs
+        assert retry_kwargs.get("temperature") == 0.5
+        assert retry_kwargs["tool_choice"] == {"type": "tool", "name": "response"}
+
     def test_async_with_schema_uses_output_config(self, monkeypatch):
         provider = importlib.reload(
             importlib.import_module("think.providers.anthropic")

think/providers/anthropic.py (+10)

@@ -519,6 +519,11 @@
     except BadRequestError:
         retry_kwargs = dict(request_kwargs)
         retry_kwargs.pop("output_config", None)
+        # Anthropic rejects `tool_choice` forcing combined with `thinking`.
+        # When falling back to forced tool use, drop thinking and restore
+        # the temperature path that thinking originally displaced.
+        if retry_kwargs.pop("thinking", None) is not None:
+            retry_kwargs.setdefault("temperature", temperature)
         retry_kwargs["tools"] = [
             {
                 "name": tool_name,
@@ -602,6 +607,11 @@
     except BadRequestError:
         retry_kwargs = dict(request_kwargs)
         retry_kwargs.pop("output_config", None)
+        # Anthropic rejects `tool_choice` forcing combined with `thinking`.
+        # When falling back to forced tool use, drop thinking and restore
+        # the temperature path that thinking originally displaced.
+        if retry_kwargs.pop("thinking", None) is not None:
+            retry_kwargs.setdefault("temperature", temperature)
         retry_kwargs["tools"] = [
             {
                 "name": tool_name,
