How llama.cpp controls thinking (TODO)

================================================================================
[common/chat.cpp | apply]
<|im_start|>user
1+1=？<|im_end|>
<|im_start|>assistant
<think>

================================================================================
[common/chat.cpp | common_chat_templates_apply]
<|im_start|>user
1+1=？<|im_end|>
<|im_start|>assistant
<think>
</think>
================================================================================
[common/common.cpp | common_tokenize]
<|im_start|>user
1+1=？<|im_end|>
<|im_start|>assistant
<think>
</think>
================================================================================
[llama_tokenize]
<|im_start|>user
1+1=？<|im_end|>
<|im_start|>assistant
<think>
</think>
================================================================================

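What common_chat_templates_apply does

This is the top-level entry point for prompt rendering (chat.cpp:3346). Its responsibilities are:

It checks inputs.use_jinja to decide between the Jinja template path and the legacy path.
On the Jinja path, common_chat_templates_apply_jinja looks for characteristic strings in the template source to select a model-specific handler (DeepSeek R1, Hermes 2 Pro, Llama 3.x, Qwen3 Coder, and so on), then calls the matching common_chat_params_init_xxx function to produce the final prompt and grammar.
Finally, it records the result through the PromptLogger::log call you added.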
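
The dispatch described above can be pictured with a small sketch. It is not the llama.cpp implementation: the Handler enum, the pick_handler function, and the marker string are hypothetical names chosen for illustration. It only shows the idea of choosing a handler by searching the template source for a characteristic substring.

```cpp
#include <string>

// Hypothetical handler identifiers. The real code calls a model-specific
// common_chat_params_init_xxx function instead of returning an enum.
enum class Handler { Legacy, Hermes2Pro, Generic };

// Choose a handler from the template source.
// The "<tool_call>" marker is an illustrative assumption.
inline Handler pick_handler(bool use_jinja, const std::string & template_src) {
    if (!use_jinja) {
        return Handler::Legacy;      // non-Jinja path
    }
    if (template_src.find("<tool_call>") != std::string::npos) {
        return Handler::Hermes2Pro;  // marker found in the template source
    }
    return Handler::Generic;         // no known marker matched
}
```

In this sketch the order of the checks matters: the first marker that matches wins, so more specific markers would have to be tested before more general ones.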

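Why `</think>` was appended

Judging from your output:

First block (apply output): the rendered prompt ends with <think>\n. The Jinja template itself generates this, signalling that the template expects the model to start generating in thinking mode.
Second block (common_chat_templates_apply output): the prompt now ends with <think>\n</think>.

This </think> is added by the following pattern inside the handler (for example the Hermes 2 Pro handler at chat.cpp:2404-2406, or another handler with the same logic):
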
if (string_ends_with(data.prompt, "<think>\n")) {
    if (!inputs.enable_thinking) {
        data.prompt += "</think>";     // ← </think> is appended here
    } else {
        data.thinking_forced_open = true;
    }
}

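What the logic means:

The template always emits <think>\n at the start of the assistant reply; that is how the template is designed.
If thinking is not enabled (enable_thinking == false), the code immediately appends </think> to close the thinking block, so the model skips the thinking phase and goes straight to the final answer.
If enable_thinking == true, no </think> is appended. Instead thinking_forced_open is set to true, which leaves the block open so the model can generate its reasoning inside <think>...</think>.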
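
To check this behavior outside llama.cpp, the same branch can be reproduced in a self-contained sketch. PromptData, ends_with, and finish_prompt are hypothetical stand-ins for the real data struct, string_ends_with, and the handler body; only the branch logic is taken from the snippet above.

```cpp
#include <string>
#include <utility>

// Minimal stand-in for the two fields the handler touches.
struct PromptData {
    std::string prompt;
    bool thinking_forced_open = false;
};

// Local helper playing the role of string_ends_with.
inline bool ends_with(const std::string & s, const std::string & suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Apply the handler's post-processing to an already rendered prompt.
inline PromptData finish_prompt(std::string rendered, bool enable_thinking) {
    PromptData data;
    data.prompt = std::move(rendered);
    if (ends_with(data.prompt, "<think>\n")) {
        if (!enable_thinking) {
            data.prompt += "</think>";         // close the block right away
        } else {
            data.thinking_forced_open = true;  // leave the block open
        }
    }
    return data;
}
```

Called with the prompt from the first log block and enable_thinking set to false, finish_prompt returns a prompt ending in <think>\n</think>, which matches the second log block. A prompt that does not end in <think>\n is returned unchanged.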

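In short: you see </think> because the run did not have thinking/reasoning enabled (enable_thinking was false). The code "disables" the model's thinking by closing the <think> tag right away, so the model answers directly.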

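To keep the thinking phase, thinking has to be enabled at startup. As traced below, that depends on --reasoning-budget (it must not be 0) together with Jinja templates and a template that supports thinking, not on --reasoning-format.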

CLI parameters

--reasoning-budget N (arg.cpp:3037)

↓

common_params.reasoning_budget (common.h:535, default -1)

↓

server_context_impl::init() (server-context.cpp:889)

const bool enable_thinking =
    params_base.use_jinja                                    // Jinja must be enabled
    && params_base.reasoning_budget != 0                     // budget is not 0 (-1 = unlimited)
    && common_chat_templates_support_enable_thinking(...);   // the template supports thinking
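
The three-way condition is easy to test in isolation. The sketch below assumes a hypothetical Params struct and passes template support in as a plain bool, since the arguments of common_chat_templates_support_enable_thinking are elided in the excerpt above. Only the default of -1 for reasoning_budget comes from the text; the use_jinja default is a choice made for this sketch.

```cpp
// Hypothetical stand-in for the parameters involved.
struct Params {
    bool use_jinja        = false; // sketch default, not a claim about llama.cpp
    int  reasoning_budget = -1;    // -1 = unlimited, 0 = thinking disabled
};

// Thinking is enabled only when all three conditions hold.
inline bool compute_enable_thinking(const Params & p, bool template_supports_thinking) {
    return p.use_jinja
        && p.reasoning_budget != 0
        && template_supports_thinking;
}
```

Any single false condition is enough to turn thinking off, so a prompt ending in </think> can have three different causes: Jinja is off, the budget is 0, or the template does not support thinking.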

↓

server_chat_params.enable_thinking (server-context.cpp:900)

↓

cli.cpp::format_chat() (cli.cpp:178-194)

common_chat_templates_inputs inputs;
inputs.enable_thinking = chat_params.enable_thinking;  // (cli.cpp:191)

↓

common_chat_templates_apply(tmpls, inputs) (chat.cpp:3346)

↓

common_chat_templates_apply_jinja(tmpls, inputs) (chat.cpp:3032)

↓

A specific handler is matched (for example hermes_2_pro or nemotron_v2)

↓

Check inside the handler (for example chat.cpp:2404-2410)

if (string_ends_with(data.prompt, "<think>\n")) {
    if (!inputs.enable_thinking) {
        data.prompt += "</think>";       // ← thinking disabled: close the think tag immediately
    } else {
        data.thinking_forced_open = true; // ← thinking enabled: leave the think tag open
    }
}
