跳转至内容
  • 版块
  • 最新
  • 标签
  • 热门
  • 用户
  • 群组
皮肤
  • 浅色
  • Brite
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • 深色
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • 默认(不使用皮肤)
  • 不使用皮肤
折叠
品牌标识

抡锤者

A

alanwoo

@alanwoo
取消关注 关注
关于
帖子
3
主题
1
分享
0
群组
0
粉丝
0
关注
0

帖子

最新 最佳 有争议的

  • Hermes Agent 最新版本 v0.17.0 部署本地模型 bug
    A alanwoo

    Hermes Agent 最新版本 v0.17.0 (June 19,2026)部署本地模型 bug

    本地模型:qwen3.6-27b-fp8
    推理引擎:vLLM 0.23.0
    顯卡:NVIDIA RTX PRO 6000 Blackwell Workstation Edition
    OS: Ubuntu 24.04.4 LTS

    今天當我更新 Hermes Agent 從 v0.14.0 到 最新版本 v.017.0 後頻繁出現一下報錯碼,在加入 max_tokens 設定後問題得以解決。

    報錯碼

    API call failed (attempt 1/3): BadRequestError [HTTP 400]
       Provider: custom  Model: qwen36-27b
       Endpoint: http://127.0.0.1:8000/v1
       Error: HTTP 400: This model's maximum context length is 65536 tokens. However, you requested 65536 output tokens and your prompt contains 81579 characters (more than 0 characters, which is the upper bound for 0 input tokens). Please reduce the length of the input prompt or the number of requested output tokens. (par
       Details: {'message': "This model's maximum context length is 65536 tokens. However, you requested 65536 output tokens and your prompt contains 81579 characters (more than 0 characters, which is the upper bound for 0 input tokens). Please reduce the length of the input prompt or the number of requested output
       Elapsed: 0.21s  Context: 2 msgs, ~4,861 tokens
       Output cap too large for current prompt — retrying with max_tokens=38,279 (available_tokens=38,343; context_length unchanged at 65,536)
    
    API call failed (attempt 1/3): BadRequestError [HTTP 400]
       Provider: custom  Model: qwen36-27b
       Endpoint: http://127.0.0.1:8000/v1
       Error: HTTP 400: This model's maximum context length is 65536 tokens. However, you requested 65536 output tokens and your prompt contains 98728 characters (more than 0 characters, which is the upper bound for 0 input tokens). Please reduce the length of the input prompt or the number of requested output tokens. (par
       Details: {'message': "This model's maximum context length is 65536 tokens. However, you requested 65536 output tokens and your prompt contains 98728 characters (more than 0 characters, which is the upper bound for 0 input tokens). Please reduce the length of the input prompt or the number of requested output
       Elapsed: 0.04s  Context: 5 msgs, ~9,388 tokens
       Output cap too large for current prompt — retrying with max_tokens=32,562 (available_tokens=32,626; context_length unchanged at 65,536)
    

    修改前 config.yaml 設置

    model:
      provider: custom
      default: qwen36-27b
      base_url: http://127.0.0.1:8000/v1
    

    解決方法:
    加入 max_tokens: 8192
    *可以根據需求調整 max_tokens 參數

    修改後 config.yaml 設置

    model:
      provider: custom
      default: qwen36-27b
      base_url: http://127.0.0.1:8000/v1
      max_tokens: 8192
    

    希望這個對大家有幫助

    Alan

    AI Agent
  • 登录

  • 没有帐号? 注册

  • 第一个帖子
    最后一个帖子
0
  • 版块
  • 最新
  • 标签
  • 热门
  • 用户
  • 群组