Expose structured thinking without polluting normal assistant output

Extended thinking has to travel end-to-end through the API,
runtime, and CLI so that the client can request a thinking
budget, preserve streamed reasoning blocks, and present them in
a collapsed, text-first form. The implementation keeps thinking
strictly opt-in, adds a session-local toggle, and reuses the
existing flag, slash-command, and reporting surfaces instead of
introducing a new UI layer.

Constraint: Existing non-thinking text/tool flows had to remain backward compatible by default
Constraint: Terminal UX needed a lightweight collapsed representation rather than an interactive TUI widget
Rejected: Heuristic CLI-only parsing of reasoning text | brittle against structured stream payloads
Rejected: Expanded raw thinking output by default | too noisy for normal assistant responses
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep thinking blocks structurally separate from answer text unless the upstream API contract changes
Tested: cargo fmt --all; cargo clippy --workspace --all-targets -- -D warnings; cargo test -q
Not-tested: Live upstream thinking payloads against the production API contract
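
The collapsed, opt-in presentation described in the message can be sketched roughly as follows. This is a std-only illustration, not the code from this commit: the `Block` enum, the `render` function, the `show_thinking` flag, and the summary line format are all hypothetical names and choices made for the example.

```rust
/// Hypothetical stand-in for a finished content block. The crate's real
/// `OutputContentBlock` has more variants and serde attributes.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    Thinking(String),
}

/// Render blocks for the terminal. Thinking is omitted unless
/// `show_thinking` is set, and even then only a one-line summary is
/// printed, so answer text is never mixed with reasoning text.
fn render(blocks: &[Block], show_thinking: bool) -> String {
    let mut out = String::new();
    for block in blocks {
        match block {
            Block::Text(text) => {
                out.push_str(text);
                out.push('\n');
            }
            Block::Thinking(thinking) if show_thinking => {
                // Assumed summary format; the commit's actual format is not shown here.
                let len = thinking.chars().count();
                out.push_str(&format!("[thinking: {len} chars hidden]\n"));
            }
            // Opt-in is off: drop the thinking block from the output.
            Block::Thinking(_) => {}
        }
    }
    out
}

fn main() {
    let blocks = vec![
        Block::Thinking("compare both options".to_string()),
        Block::Text("Use option B.".to_string()),
    ];
    print!("{}", render(&blocks, false));
    print!("{}", render(&blocks, true));
}
```

With the flag off, the output matches what a non-thinking session would print, which is the backward-compatibility constraint listed above.
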
Yeachan-Heo
2026-04-01 01:08:18 +00:00
parent d6341d54c1
commit c14196c730
9 changed files with 353 additions and 31 deletions

@@ -912,6 +912,7 @@ mod tests {
system: None,
tools: None,
tool_choice: None,
thinking: None,
stream: false,
};

@@ -13,5 +13,5 @@ pub use types::{
     ContentBlockDelta, ContentBlockDeltaEvent, ContentBlockStartEvent, ContentBlockStopEvent,
     InputContentBlock, InputMessage, MessageDelta, MessageDeltaEvent, MessageRequest,
     MessageResponse, MessageStartEvent, MessageStopEvent, OutputContentBlock, StreamEvent,
-    ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
+    ThinkingConfig, ToolChoice, ToolDefinition, ToolResultContentBlock, Usage,
 };

@@ -12,6 +12,8 @@ pub struct MessageRequest {
    pub tools: Option<Vec<ToolDefinition>>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub tool_choice: Option<ToolChoice>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub thinking: Option<ThinkingConfig>,
    #[serde(default, skip_serializing_if = "std::ops::Not::not")]
    pub stream: bool,
}
@@ -24,6 +26,23 @@ impl MessageRequest {
    }
}

#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct ThinkingConfig {
    #[serde(rename = "type")]
    pub kind: String,
    pub budget_tokens: u32,
}

impl ThinkingConfig {
    #[must_use]
    pub fn enabled(budget_tokens: u32) -> Self {
        Self {
            kind: "enabled".to_string(),
            budget_tokens,
        }
    }
}

#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct InputMessage {
    pub role: String,
@@ -130,6 +149,11 @@ pub enum OutputContentBlock {
    Text {
        text: String,
    },
    Thinking {
        thinking: String,
        #[serde(default, skip_serializing_if = "Option::is_none")]
        signature: Option<String>,
    },
    ToolUse {
        id: String,
        name: String,
@@ -189,6 +213,8 @@ pub struct ContentBlockDeltaEvent {
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentBlockDelta {
    TextDelta { text: String },
    ThinkingDelta { thinking: String },
    SignatureDelta { signature: String },
    InputJsonDelta { partial_json: String },
}
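
One way a consumer might fold these delta variants into separate buffers is sketched below, so that thinking never leaks into answer text. The enum mirrors the variants in the hunk above without the serde attributes so that it builds with std alone. The `Accumulated` struct and its `apply` method are hypothetical and not part of this commit, and appending signature fragments is an assumption about how the upstream stream behaves.

```rust
/// Simplified mirror of the `ContentBlockDelta` variants from the diff,
/// without serde so the sketch compiles using only the standard library.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlockDelta {
    TextDelta { text: String },
    ThinkingDelta { thinking: String },
    SignatureDelta { signature: String },
    InputJsonDelta { partial_json: String },
}

/// Hypothetical accumulator that keeps each kind of streamed content
/// in its own buffer.
#[derive(Debug, Default, PartialEq)]
struct Accumulated {
    text: String,
    thinking: String,
    signature: Option<String>,
    tool_input_json: String,
}

impl Accumulated {
    /// Route one delta to the matching buffer.
    fn apply(&mut self, delta: ContentBlockDelta) {
        match delta {
            ContentBlockDelta::TextDelta { text } => self.text.push_str(&text),
            ContentBlockDelta::ThinkingDelta { thinking } => self.thinking.push_str(&thinking),
            ContentBlockDelta::SignatureDelta { signature } => {
                // Assumption: a signature may arrive in pieces, so append.
                self.signature
                    .get_or_insert_with(String::new)
                    .push_str(&signature);
            }
            ContentBlockDelta::InputJsonDelta { partial_json } => {
                self.tool_input_json.push_str(&partial_json);
            }
        }
    }
}

fn main() {
    let mut acc = Accumulated::default();
    acc.apply(ContentBlockDelta::ThinkingDelta { thinking: "weigh options; ".to_string() });
    acc.apply(ContentBlockDelta::ThinkingDelta { thinking: "pick B".to_string() });
    acc.apply(ContentBlockDelta::SignatureDelta { signature: "sig-abc".to_string() });
    acc.apply(ContentBlockDelta::TextDelta { text: "Use option B.".to_string() });
    acc.apply(ContentBlockDelta::InputJsonDelta { partial_json: "{}".to_string() });
    println!("{acc:?}");
}
```

Keeping a dedicated `thinking` buffer follows the commit's directive to keep thinking blocks structurally separate from answer text.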

@@ -258,6 +258,7 @@ async fn live_stream_smoke_test() {
system: None,
tools: None,
tool_choice: None,
thinking: None,
stream: false,
})
.await
@@ -438,6 +439,7 @@ fn sample_request(stream: bool) -> MessageRequest {
}),
}]),
tool_choice: Some(ToolChoice::Auto),
thinking: None,
stream,
}
}
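
The `thinking: None` entries in these fixtures rely on the field being left out of the request body when unset. The following std-only sketch imitates that wire behaviour by hand; it is not how the crate serializes (the diff uses serde derives), and `request_json` is a hypothetical helper that emits only the two fields relevant here.

```rust
/// Mirror of `ThinkingConfig` from the diff, minus the serde derives.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ThinkingConfig {
    kind: String,
    budget_tokens: u32,
}

impl ThinkingConfig {
    fn enabled(budget_tokens: u32) -> Self {
        Self {
            kind: "enabled".to_string(),
            budget_tokens,
        }
    }
}

/// Hand-written stand-in for serde. It imitates two attributes from the diff:
/// `skip_serializing_if = "Option::is_none"` on `thinking` and
/// `skip_serializing_if = "std::ops::Not::not"` on `stream`.
/// No string escaping is done; `kind` is a fixed literal in this sketch.
fn request_json(thinking: Option<&ThinkingConfig>, stream: bool) -> String {
    let mut fields = Vec::new();
    if let Some(cfg) = thinking {
        // `kind` goes out under the key "type", as `#[serde(rename = "type")]` would do.
        fields.push(format!(
            "\"thinking\":{{\"type\":\"{}\",\"budget_tokens\":{}}}",
            cfg.kind, cfg.budget_tokens
        ));
    }
    if stream {
        fields.push("\"stream\":true".to_string());
    }
    format!("{{{}}}", fields.join(","))
}

fn main() {
    // Default path: neither field appears, so older request bodies are unchanged.
    println!("{}", request_json(None, false));
    // Opt-in path: the thinking object carries its type tag and token budget.
    let cfg = ThinkingConfig::enabled(2048);
    println!("{}", request_json(Some(&cfg), true));
}
```

The budget value 2048 is arbitrary and chosen only for the example.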