<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业]]></title><description><![CDATA[<h1><strong>32k上下文</strong></h1>
<p dir="auto"><img src="https://upload.lcz.me/uploads/88487534-e0c5-4312-baf9-83f2b2c0a007.jpeg" alt="f606e72e-d575-49a5-9923-9f357c5aa2b8-image.jpeg" class=" img-fluid img-markdown" /><br />
<img src="https://upload.lcz.me/uploads/8abc07dd-f710-45c5-b8ea-d2851409e7eb.jpeg" alt="283c6ec7-f7f1-4299-962c-7edd1a4ebba8-image.jpeg" class=" img-fluid img-markdown" /></p>
<h1><strong>128k上下文</strong></h1>
<p dir="auto"><img src="https://upload.lcz.me/uploads/4c5c8df5-a2a3-40e8-8b28-877a91bc4497.jpeg" alt="bc96d55d-1658-4c65-b724-5adb5568986c-image.jpeg" class=" img-fluid img-markdown" /><br />
<img src="https://upload.lcz.me/uploads/6ef9e90c-7381-4d49-879f-dd68e5172eac.jpeg" alt="de5b8250-5ae9-4f72-9158-ee779226e41c-image.jpeg" class=" img-fluid img-markdown" /><br />
<img src="https://upload.lcz.me/uploads/8dce6511-7dad-4edc-881c-5adea49757a2.jpeg" alt="2fdc420e-eef4-4290-b718-596afc538479-image.jpeg" class=" img-fluid img-markdown" /><br />
测不动了，感觉128k还不是上限，反正就是越跑系统内存占用越来越大，吐字速度逐渐变慢！</p>
]]></description><link>https://lcz.me/topic/332/amd-5700g-32g-7900xtx-windows11-llama.cpp-windows-x64-vulkan-跑qwen3.6-35b-a3b-ud-q4_k_s交作业</link><generator>RSS for Node</generator><lastBuildDate>Sun, 31 May 2026 05:33:27 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/332.rss" rel="self" type="application/rss+xml"/><pubDate>Wed, 27 May 2026 17:28:46 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Thu, 28 May 2026 03:50:23 GMT]]></title><description><![CDATA[<p dir="auto">r9700 用Qwen3.6-35B-A3B-UD-Q6_K 没问题，速度还是很快， 96K上下文，速度还是不错。<br />
不搞严格推理，数学计算啥的，不需要全参数模型， A3B一般也够了。<br />
不过多尝试一下模型也没问题</p>
]]></description><link>https://lcz.me/post/3986</link><guid isPermaLink="true">https://lcz.me/post/3986</guid><dc:creator><![CDATA[sospda]]></dc:creator><pubDate>Thu, 28 May 2026 03:50:23 GMT</pubDate></item><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Thu, 28 May 2026 03:39:57 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/kop-wang" aria-label="Profile: kop-wang">@<bdi>kop-wang</bdi></a> 嗯嗯，有时间我试试</p>
]]></description><link>https://lcz.me/post/3985</link><guid isPermaLink="true">https://lcz.me/post/3985</guid><dc:creator><![CDATA[woaikuancheng0]]></dc:creator><pubDate>Thu, 28 May 2026 03:39:57 GMT</pubDate></item><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Thu, 28 May 2026 03:35:30 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/sospda" aria-label="Profile: sospda">@<bdi>sospda</bdi></a> 核显还是差点儿事儿，刚开始学习，以后多提宝贵意见</p>
]]></description><link>https://lcz.me/post/3984</link><guid isPermaLink="true">https://lcz.me/post/3984</guid><dc:creator><![CDATA[woaikuancheng0]]></dc:creator><pubDate>Thu, 28 May 2026 03:35:30 GMT</pubDate></item><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Thu, 28 May 2026 02:45:36 GMT]]></title><description><![CDATA[<p dir="auto">我个人理解楼主这套有几个改进的方向。<br />
1、Q4量化用Q4_K_M的性价比相对K_S更高一些。<br />
2、再对模型吞吐性能要求不高的前提下，可以尝试以下qwen3.6-27B Q4_K_M。理论上讲，配合使用q8的kv量化，可以做到128K上下文。这样能力更好。<br />
3、对于性能参考，楼主可以以llamabench来测试下速度，主要是要综合prefill和decode两个性能一起参考。</p>
<p dir="auto">仅供参考。</p>
]]></description><link>https://lcz.me/post/3980</link><guid isPermaLink="true">https://lcz.me/post/3980</guid><dc:creator><![CDATA[kop wang]]></dc:creator><pubDate>Thu, 28 May 2026 02:45:36 GMT</pubDate></item><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Wed, 27 May 2026 21:53:56 GMT]]></title><description><![CDATA[<p dir="auto">用5700G的核显开16G显存都能跑一些小模型。<br />
哈哈</p>
]]></description><link>https://lcz.me/post/3970</link><guid isPermaLink="true">https://lcz.me/post/3970</guid><dc:creator><![CDATA[sospda]]></dc:creator><pubDate>Wed, 27 May 2026 21:53:56 GMT</pubDate></item><item><title><![CDATA[Reply to AMD 5700G 32G 7900XTX windows11 llama.cpp Windows x64 (Vulkan)跑Qwen3.6-35B-A3B-UD-Q4_K_S交作业 on Wed, 27 May 2026 17:31:15 GMT]]></title><description><![CDATA[<p dir="auto">@echo off<br />
chcp 65001 &gt;nul<br />
title llama.cpp - Qwen3.6-35B API Server</p>
<p dir="auto">set "SCRIPT_DIR=%~dp0"<br />
set "MODEL=%SCRIPT_DIR%models\Qwen3.6-35B-A3B-UD-Q4_K_S.gguf"</p>
<p dir="auto">if not exist "%MODEL%" (<br />
echo [Error] Model file not found<br />
pause<br />
exit /b 1<br />
)</p>
<p dir="auto">cls<br />
echo ============================================<br />
echo   Qwen3.6-35B-A3B -- Select Context Length<br />
echo   256 Experts MoE ^| Only 3B active/token<br />
echo   RX 7900 XTX (24GB) ^| 32GB RAM<br />
echo   --cpu-moe: experts on CPU, frees VRAM<br />
echo ============================================<br />
echo.<br />
echo  #  Context   VRAM    Speed   Note<br />
echo --  -------   ------  ------  ---------------------------<br />
echo  1)  32K      ~10 GB  full GPU, fastest<br />
echo  2)  65K      ~12 GB  balanced<br />
echo  3)  96K      ~14 GB<br />
echo  4)  128K     ~16 GB<br />
echo  5)  196K     ~19 GB<br />
echo  6)  262K     ~22 GB  max native context<br />
echo.<br />
set /p ctx="Select (1-6): "</p>
<p dir="auto">if "%ctx%"=="1" set CTX=32768<br />
if "%ctx%"=="2" set CTX=65536<br />
if "%ctx%"=="3" set CTX=98304<br />
if "%ctx%"=="4" set CTX=131072<br />
if "%ctx%"=="5" set CTX=200704<br />
if "%ctx%"=="6" set CTX=262144</p>
<p dir="auto">if "%CTX%"=="" (<br />
echo Invalid selection<br />
pause<br />
exit /b 1<br />
)</p>
<p dir="auto">echo.<br />
echo Starting: %CTX% context<br />
echo <a href="http://127.0.0.1:8080" rel="nofollow ugc">http://127.0.0.1:8080</a><br />
echo.</p>
<p dir="auto">"%SCRIPT_DIR%llama-server.exe" ^<br />
-m "%MODEL%" ^<br />
-c %CTX% ^<br />
-fa on ^<br />
-ctk q4_0 ^<br />
-ctv q4_0 ^<br />
-t 8 ^<br />
-b 1024 ^<br />
--no-mmap ^<br />
--no-op-offload ^<br />
--host 127.0.0.1 ^<br />
--port 8080</p>
<p dir="auto">echo.<br />
pause</p>
]]></description><link>https://lcz.me/post/3958</link><guid isPermaLink="true">https://lcz.me/post/3958</guid><dc:creator><![CDATA[woaikuancheng0]]></dc:creator><pubDate>Wed, 27 May 2026 17:31:15 GMT</pubDate></item></channel></rss>