<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[How to upload images with Llama.cpp]]></title><description><![CDATA[<p dir="auto">I have not tested this with llama.cpp yet. I use LM Studio, which has image upload built in. Here is what it looks like:<br />
<img src="https://upload.lcz.me/uploads/e08647f1-61cd-4853-9f27-ef5ecb1a05f0.jpeg" alt="Qwen3.5 27b分析图片.jpeg" class=" img-fluid img-markdown" /></p>
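<p dir="auto">ChatBox supports it too, so I will skip the screenshot.</p>
<p dir="auto">I asked Gemini: stock llama.cpp does have multimodal support built in. No extra plugin is needed; you only have to start the API service with llama-server.</p>
<p dir="auto">In practice, though, a few technical points matter for getting it running:</p>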
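<ol>
<li>Core architecture: model + projector (CLIP)<br />
Although the client only sends API requests, llama-server must load two parts at startup:</li>
</ol>
<p dir="auto">Language model (LLM): handles understanding and conversation (e.g. gemma4.gguf).</p>
<p dir="auto">Vision projector (mmproj): "translates" images for the model. It is a separate GGUF file, usually named mmproj-*.gguf, not the language model file itself.</p>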
<p dir="auto"><strong>Example start command:</strong><br />
export LD_LIBRARY_PATH=/mnt/nvidia/llama.cpp/build/bin:$LD_LIBRARY_PATH<br />
MODEL_PATH="/mnt/data/ai/lmstudio/.lmstudio/models/lmstudio-community/gemma-4-31B-it-GGUF/gemma-4-31B-it-Q4_K_M.gguf"<br />
# Placeholder path: point this at the mmproj file that matches the model<br />
MMPROJ_PATH="/path/to/mmproj.gguf"<br />
/mnt/nvidia/llama.cpp/build/bin/llama-server \<br />
-m "$MODEL_PATH" \<br />
--mmproj "$MMPROJ_PATH" \<br />
--host 0.0.0.0 --port 8080 \<br />
--n-gpu-layers 99 \<br />
--ctx-size 81920 \<br />
--parallel 1 \<br />
--cache-type-k q8_0 \<br />
--cache-type-v q8_0 \<br />
--flash-attn on \<br />
--no-mmap \<br />
--mlock \<br />
--reasoning-budget 0</p>
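<p dir="auto">Since the server needs both the language model and the projector, a launcher script can make that pairing explicit. Here is a minimal Python sketch that only assembles the argument list. The helper name <code>build_server_command</code> and the file paths are hypothetical; <code>-m</code>, <code>--mmproj</code>, <code>--host</code> and <code>--port</code> are the llama-server flags used elsewhere in this thread.</p>

```python
import shlex


def build_server_command(model_path: str, mmproj_path: str,
                         host: str = "0.0.0.0", port: int = 8080) -> list[str]:
    """Return an argv list that starts llama-server with a vision projector."""
    return [
        "llama-server",
        "-m", model_path,          # language model weights (GGUF)
        "--mmproj", mmproj_path,   # vision projector needed for image input
        "--host", host,
        "--port", str(port),
    ]


# Hypothetical file names, for illustration only.
cmd = build_server_command("/models/model.gguf", "/models/mmproj-model.gguf")
print(shlex.join(cmd))
```

<p dir="auto">The list can be passed to <code>subprocess.run</code> as-is; any tuning flags (context size, KV cache type, and so on) would be appended in the same way.</p>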
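<ol start="2">
<li>The "standard" API format<br />
The llama.cpp API aims to be compatible with the OpenAI format. When you send an image from code, the usual approach is to <strong>encode the image as Base64</strong> and pass it as a data URI in an image_url content part.</li>
</ol>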
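<p dir="auto">The Base64 step can be wrapped in a small helper so the MIME type in the data URI matches the file. This is a sketch using only the standard library; the function name <code>image_to_data_uri</code> is an assumption for illustration, not part of llama.cpp or the openai library.</p>

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_uri(path: str) -> str:
    """Encode an image file as a data URI for an image_url content part."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

<p dir="auto">The returned string goes into the <code>url</code> field of the <code>image_url</code> part of a chat message.</p>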
<p dir="auto">A typical Python call (using the openai library):</p>
<pre><code class="language-python">import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

# Convert the image to Base64
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # the name is arbitrary here; llama.cpp maps it to the loaded model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
</code></pre>
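<p dir="auto">The same request can be sent without the openai library, which makes the JSON structure easier to see. This is a sketch assuming the server listens on http://localhost:8080 and exposes the OpenAI-style <code>/v1/chat/completions</code> route; the helper names and the <code>"local-model"</code> placeholder are hypothetical.</p>

```python
import json
import urllib.request


def build_vision_payload(prompt: str, data_uri: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat request with one text part and one image part."""
    return {
        "model": model,  # placeholder name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ],
            }
        ],
    }


def ask_about_image(prompt: str, data_uri: str,
                    base_url: str = "http://localhost:8080/v1") -> str:
    """POST the payload to the server and return the assistant's reply text."""
    body = json.dumps(build_vision_payload(prompt, data_uri)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

<p dir="auto">Calling <code>ask_about_image("Describe this image", data_uri)</code> needs a running server with a projector loaded; <code>build_vision_payload</code> can be inspected offline.</p>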
]]></description><link>https://lcz.me/topic/3/llama.cpp如何上传图片</link><generator>RSS for Node</generator><lastBuildDate>Wed, 20 May 2026 07:04:28 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/3.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 03 May 2026 02:16:03 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Fri, 08 May 2026 08:09:09 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/bily-j" aria-label="Profile: bily-j">@<bdi>bily-j</bdi></a> With the same model and the same files, LM Studio handles images, so this has nothing to do with the model file format. What is needed is the mmproj file.</p>
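]]></description><link>https://lcz.me/post/539</link><guid isPermaLink="true">https://lcz.me/post/539</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Fri, 08 May 2026 08:09:09 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Fri, 08 May 2026 07:17:25 GMT]]></title><description><![CDATA[<p dir="auto">Does image recognition depend on the model? For qwen3.6-27B, I asked an AI and it said the Q4 GGUF is a text-only model and told me to download one with VL in the name. I did, and it does recognize images.<br />
I am not sure whether the quantizer stripped out the image recognition capability or the model itself never supported it.</p>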
]]></description><link>https://lcz.me/post/530</link><guid isPermaLink="true">https://lcz.me/post/530</guid><dc:creator><![CDATA[bily j]]></dc:creator><pubDate>Fri, 08 May 2026 07:17:25 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Wed, 06 May 2026 07:45:32 GMT]]></title><description><![CDATA[<p dir="auto">stakira, thanks, I will give it a try.</p>
]]></description><link>https://lcz.me/post/302</link><guid isPermaLink="true">https://lcz.me/post/302</guid><dc:creator><![CDATA[Tide]]></dc:creator><pubDate>Wed, 06 May 2026 07:45:32 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Tue, 05 May 2026 18:24:49 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/stakira" aria-label="Profile: stakira">@<bdi>stakira</bdi></a> Very good, everything that should be optimized has been.</p>
]]></description><link>https://lcz.me/post/253</link><guid isPermaLink="true">https://lcz.me/post/253</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 05 May 2026 18:24:49 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Tue, 05 May 2026 17:32:44 GMT]]></title><description><![CDATA[<blockquote>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tide" aria-label="Profile: Tide">@<bdi>Tide</bdi></a> <a href="/post/146">said</a>:</p>
<p dir="auto">Very memory-hungry</p>
</blockquote>
<p dir="auto">Recommended LM Studio settings for saving resources</p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/9bec860c-bec1-453a-b598-d51126d39c6f.jpeg" alt="deb5f677-dc09-4aad-9667-e154e1283990-image.jpeg" class=" img-fluid img-markdown" /></p>
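<p dir="auto">The first change lowers the number of parallel requests, which reduces VRAM use. The second and third changes reduce RAM use. The last two changes quantize the KV cache, which reduces VRAM use.</p>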
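<p dir="auto">To see roughly why quantizing the KV cache saves VRAM, here is a back-of-the-envelope sketch. The model dimensions are hypothetical and the formula assumes plain full attention in every layer, so real models (sliding-window or hybrid layers, for example) can differ. The q8_0 figure follows from its block layout: 32 int8 values plus one 2-byte scale, 34 bytes per 32 elements.</p>

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_size, bytes_per_element):
    """Approximate size of the K and V caches for one sequence."""
    elements = 2 * n_layers * n_kv_heads * head_dim * ctx_size  # 2 = K and V
    return elements * bytes_per_element


F16_BYTES = 2.0        # 16-bit float per element
Q8_0_BYTES = 34 / 32   # 32 int8 values plus one 2-byte scale per block

# Hypothetical model dimensions, for illustration only.
dims = dict(n_layers=48, n_kv_heads=8, head_dim=128, ctx_size=32768)

f16_size = kv_cache_bytes(**dims, bytes_per_element=F16_BYTES)
q8_size = kv_cache_bytes(**dims, bytes_per_element=Q8_0_BYTES)
print(f"f16:  {f16_size / 2**30:.2f} GiB")
print(f"q8_0: {q8_size / 2**30:.2f} GiB")
```

<p dir="auto">Under these assumptions q8_0 needs a little over half the memory of an f16 cache, and the total scales linearly with context size.</p>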
]]></description><link>https://lcz.me/post/240</link><guid isPermaLink="true">https://lcz.me/post/240</guid><dc:creator><![CDATA[stakira]]></dc:creator><pubDate>Tue, 05 May 2026 17:32:44 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Tue, 05 May 2026 01:33:22 GMT]]></title><description><![CDATA[<p dir="auto">On Linux I have not seen it eat memory; resource usage looks normal.</p>
]]></description><link>https://lcz.me/post/149</link><guid isPermaLink="true">https://lcz.me/post/149</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 05 May 2026 01:33:22 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Tue, 05 May 2026 01:23:30 GMT]]></title><description><![CDATA[<p dir="auto">Has anyone found LM Studio on Windows very memory-hungry? Is it any better when installed on Ubuntu?</p>
]]></description><link>https://lcz.me/post/146</link><guid isPermaLink="true">https://lcz.me/post/146</guid><dc:creator><![CDATA[Tide]]></dc:creator><pubDate>Tue, 05 May 2026 01:23:30 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Mon, 04 May 2026 23:58:22 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/linax777" aria-label="Profile: linax777">@<bdi>linax777</bdi></a> There are plenty of experts in the community.</p>
]]></description><link>https://lcz.me/post/142</link><guid isPermaLink="true">https://lcz.me/post/142</guid><dc:creator><![CDATA[墙内人]]></dc:creator><pubDate>Mon, 04 May 2026 23:58:22 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Mon, 04 May 2026 17:45:31 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/ike-yu" aria-label="Profile: ike-yu">@<bdi>ike-yu</bdi></a> Once it is running, the difference is small. Use whichever is more convenient; get it running first, then compare.</p>
]]></description><link>https://lcz.me/post/121</link><guid isPermaLink="true">https://lcz.me/post/121</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Mon, 04 May 2026 17:45:31 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Mon, 04 May 2026 16:02:43 GMT]]></title><description><![CDATA[<p dir="auto">Is there a big difference between llama.cpp and LM Studio, or should I just pick whichever one I am comfortable with?</p>
]]></description><link>https://lcz.me/post/117</link><guid isPermaLink="true">https://lcz.me/post/117</guid><dc:creator><![CDATA[ike yu]]></dc:creator><pubDate>Mon, 04 May 2026 16:02:43 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Mon, 04 May 2026 14:16:59 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/linax777" aria-label="Profile: linax777">@<bdi>linax777</bdi></a> Excellent. I was just about to update my answer; this is the definitive answer.</p>
]]></description><link>https://lcz.me/post/99</link><guid isPermaLink="true">https://lcz.me/post/99</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Mon, 04 May 2026 14:16:59 GMT</pubDate></item><item><title><![CDATA[Reply to How to upload images with Llama.cpp on Mon, 04 May 2026 13:56:56 GMT]]></title><description><![CDATA[<p dir="auto">The key is to load the mmproj file. Below is the docker-compose file for the container I use; see the command section for reference:</p>
<pre><code class="language-yaml">services:
  llama-cpp:
    image: ghcr.io/ggml-org/llama.cpp:server-cuda
    container_name: llama-cpp-cuda
    ports:
      - "8080:8080"
    volumes:
      - ~/models:/models
    command:
      - -m
      - /models/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-Q4_K_P.gguf
      - --alias
      - Qwen3.6-27B-Q4_K_P
      - --host
      - 0.0.0.0
      - --port
      - "8080"
      - --mmproj
      - /models/mmproj-Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-f16.gguf
      - --n-gpu-layers
      - "999"
      - --jinja
      - --ctx-size
      - "131072"
      - --chat-template-kwargs
      - '{"enable_thinking": false}'
      - --metrics
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
</code></pre>
]]></description><link>https://lcz.me/post/98</link><guid isPermaLink="true">https://lcz.me/post/98</guid><dc:creator><![CDATA[linax777]]></dc:creator><pubDate>Mon, 04 May 2026 13:56:56 GMT</pubDate></item></channel></rss>