<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S]]></title><description><![CDATA[<p dir="auto">Notes on some of my RTX3080 20g configurations.<br />
Building on WIN11:<br />
git clone <a href="https://github.com/ggerganov/llama.cpp" rel="nofollow ugc">https://github.com/ggerganov/llama.cpp</a><br />
cmake -B build -D GGML_CUDA=ON -D CMAKE_CUDA_ARCHITECTURES=86 -D LLAMA_BUILD_SERVER=ON -D LLAMA_BUILD_ALL_EXAMPLES=OFF -D LLAMA_BUILD_TESTS=OFF -D LLAMA_BUILD_EXAMPLES=OFF<br />
cmake --build build --config Release --parallel 2</p>
<p dir="auto">27B, original setup, 64K context, Q8 KV cache (35 tokens/s):<br />
.\llama-server.exe -m D:\hermeswork\models\Qwen3.6-27B-IQ4_NL.gguf `<br />
-c 65536 `<br />
-ngl 99 `<br />
-t 14 `<br />
-tb 14 `<br />
-b 2048 `<br />
-ub 512 `<br />
--flash-attn on `<br />
--cache-type-k q8_0 `<br />
--cache-type-v q8_0 `<br />
--kv-offload `<br />
--kv-unified `<br />
--mmap `<br />
--mlock `<br />
--context-shift `<br />
--reasoning off `<br />
--reasoning-budget 0 `<br />
--host 0.0.0.0 `<br />
--port 11434</p>
<p dir="auto">27B, 126K context, Q4 KV cache, MTP, no multimodal (48 tokens/s):<br />
.\llama-server.exe -m D:\hermeswork\models\Qwen3.6-27B-IQ4_NL.gguf `<br />
-c 128000 `<br />
-ngl 99 `<br />
-t 14 `<br />
-tb 14 `<br />
-b 4096 `<br />
-ub 256 `<br />
--jinja `<br />
--temp 0.7 `<br />
--top-p 0.9 `<br />
--top-k 40 `<br />
--min-p 0.0 `<br />
--presence-penalty 1.0 `<br />
--repeat-penalty 1.05 `<br />
--flash-attn on `<br />
--cache-type-k q4_0 `<br />
--cache-type-v q4_0 `<br />
--kv-offload `<br />
--kv-unified `<br />
--mmap `<br />
--mlock `<br />
--spec-type draft-mtp `<br />
--spec-draft-n-max 2 `<br />
--skip-chat-parsing `<br />
--reasoning off `<br />
--reasoning-budget 0 `<br />
--host 0.0.0.0 `<br />
--port 11434</p>
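<p dir="auto">One likely reason the context can be roughly doubled on the same card is the smaller KV cache type. Below is a minimal sketch of that arithmetic, assuming every layer keeps a full K and V cache (models with hybrid or linear-attention layers need less). The layer count, KV head count, and head dimension are hypothetical placeholders, not Qwen3.6 values; only the q8_0 and q4_0 block sizes are ggml's actual layouts.</p>

```python
# Rough KV-cache size estimate for the cache types used in the commands above.
# The model dimensions below are HYPOTHETICAL placeholders, not Qwen3.6 values;
# read the real ones from the GGUF metadata of the model you load.

# ggml block layouts, 32 values per block:
# q8_0: 32 int8 values + one fp16 scale = 34 bytes
# q4_0: 16 bytes of packed 4-bit values + one fp16 scale = 18 bytes
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}


def kv_cache_gib(n_ctx, n_layers, n_kv_heads, head_dim, cache_type):
    """Return the estimated size of the K and V caches together, in GiB."""
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim  # K plus V
    total_bytes = n_ctx * elements_per_token * BYTES_PER_ELEMENT[cache_type]
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical dense model: 64 attention layers, 8 KV heads, head_dim 128.
    dims = dict(n_layers=64, n_kv_heads=8, head_dim=128)
    for n_ctx, cache_type in [(65536, "q8_0"), (128000, "q4_0")]:
        size = kv_cache_gib(n_ctx, cache_type=cache_type, **dims)
        print(f"-c {n_ctx} with {cache_type}: {size:.2f} GiB")
```

<p dir="auto">With these placeholder dimensions, going from q8_0 to q4_0 roughly halves the bytes per cached element, so about twice the context fits in a similar cache size.</p>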
<p dir="auto">35B, 256K context, Q4 KV cache, multimodal, no MTP (113 tokens/s):<br />
.\llama-server.exe -m D:\hermeswork\models\Qwen3.6-35B-A3B-UD-IQ4_NL.gguf `<br />
-c 262144 `<br />
-ngl 55 `<br />
-t 14 `<br />
-tb 14 `<br />
-b 4096 `<br />
-ub 256 `<br />
--jinja `<br />
--temp 0.7 `<br />
--top-p 0.9 `<br />
--top-k 40 `<br />
--min-p 0.0 `<br />
--presence-penalty 1.0 `<br />
--repeat-penalty 1.05 `<br />
--flash-attn on `<br />
--mmproj D:/hermeswork/models/mmproj-F32.gguf `<br />
--no-mmproj-offload `<br />
--image-min-tokens 1024 `<br />
--cache-type-k q4_0 `<br />
--cache-type-v q4_0 `<br />
--skip-chat-parsing `<br />
--reasoning off `<br />
--reasoning-budget 0 `<br />
--host 0.0.0.0 `<br />
--port 8080 `<br />
--kv-offload `<br />
--kv-unified `<br />
--mmap `<br />
--mlock<br />
This really is a legendary card.</p>
]]></description><link>https://lcz.me/topic/243/rtx3080-20g-qwen3.6-27b-45-50t-s-35b多模态256k-110t-s</link><generator>RSS for Node</generator><lastBuildDate>Sun, 31 May 2026 05:33:27 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/243.rss" rel="self" type="application/rss+xml"/><pubDate>Thu, 21 May 2026 10:54:04 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Sat, 30 May 2026 01:51:28 GMT]]></title><description><![CDATA[<p dir="auto">This is already very good <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f44d.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--+1" style="height:23px;width:auto;vertical-align:middle" title=":+1:" alt="👍" /> <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f44d.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--+1" style="height:23px;width:auto;vertical-align:middle" title=":+1:" alt="👍" /> <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f44d.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--+1" style="height:23px;width:auto;vertical-align:middle" title=":+1:" alt="👍" /></p>
]]></description><link>https://lcz.me/post/4288</link><guid isPermaLink="true">https://lcz.me/post/4288</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Sat, 30 May 2026 01:51:28 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Fri, 29 May 2026 02:54:46 GMT]]></title><description><![CDATA[<p dir="auto">Give these configurations a try. With this much context at this speed, I no longer feel any urge to upgrade.<br />
<img src="https://upload.lcz.me/uploads/c61d562d-1f01-4bb9-ac26-029e8b6a7241.jpg" alt="6310dcdb-b00e-4a26-af40-54f486506e1f.jpg" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/4139</link><guid isPermaLink="true">https://lcz.me/post/4139</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Fri, 29 May 2026 02:54:46 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Fri, 29 May 2026 01:31:58 GMT]]></title><description><![CDATA[<p dir="auto">Thanks a lot. I copied your setup successfully on the same RTX3080 with 20GB of VRAM: 190K context at around 50 tokens/s, far faster than LM Studio!<br />
<img src="https://upload.lcz.me/uploads/a6ac0159-8dd3-4951-b07c-bde65f65fbdc.jpeg" alt="640b6b41-fcdd-4250-872a-4d8836621189-image.jpeg" class=" img-fluid img-markdown" /><br />
<img src="https://upload.lcz.me/uploads/3aca2adb-6880-4995-a052-0a8a9dbffff6.jpeg" alt="90b489be-cb0f-4f9c-b074-f1ce7e7311c7-image.jpeg" class=" img-fluid img-markdown" /></p>
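]]></description><link>https://lcz.me/post/4132</link><guid isPermaLink="true">https://lcz.me/post/4132</guid><dc:creator><![CDATA[Eliesid Sliva]]></dc:creator><pubDate>Fri, 29 May 2026 01:31:58 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Wed, 27 May 2026 00:49:19 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> I just talk to hermes from the command line: I have it set up a project and do concrete tasks, and I scold it when it does badly. Oddly, scolding works better than praise. I keep going until the KV cache is close to 99%, then have it summarize and write a progress note, exit, start a new command-line session, and have it continue from the progress note it wrote itself. The first turn after each restart takes a while, but that hardly matters, since 190k of context already lasts a long time.</p>
<p dir="auto">So here is my question: how is everyone else using it? Why does my context grow basically linearly? You can see it in the input-token curve on the dashboard.</p>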
]]></description><link>https://lcz.me/post/3858</link><guid isPermaLink="true">https://lcz.me/post/3858</guid><dc:creator><![CDATA[ldscool]]></dc:creator><pubDate>Wed, 27 May 2026 00:49:19 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Wed, 27 May 2026 00:03:56 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a> MoE really isn't good enough right now. Even a model as strong as DeepSeek v4 is MoE, and it only manages to reach the level of qwen 27b.</p>
]]></description><link>https://lcz.me/post/3855</link><guid isPermaLink="true">https://lcz.me/post/3855</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Wed, 27 May 2026 00:03:56 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 17:33:59 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a></p>
<p dir="auto">Because 35b A3b is an MoE model, it is hard to align MTP with expert routing, so MTP doesn't work well on it. That may improve once MTP optimization matures.</p>
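]]></description><link>https://lcz.me/post/3844</link><guid isPermaLink="true">https://lcz.me/post/3844</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 26 May 2026 17:33:59 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 16:36:53 GMT]]></title><description><![CDATA[<blockquote>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a> <a href="/post/3840">said</a>:</p>
<p dir="auto">For now I use HERMES conversationally; I haven't reached the level of the experts here who have it generate scripts automatically. The experience is already better and faster than online services, and the capability is no weaker, arguably stronger: it reads the PDFs I give it quickly and accurately and understands them well. Sometimes, when I'm unsure about a part of a PDF, having it find the solution is faster than reading it myself. Previously, late in a conversation, once VRAM usage hit 19.7G, there was a chance of dropping to single-digit T/S regardless of whether shared GPU memory was at 0.2G or more. That was painful, because some of the project code hadn't finished updating and it was a bad moment to stop. With the current settings, speed still holds at around 35T/S after VRAM usage reaches 19.7G, and it stays stable even with shared GPU memory now at 1G. Replies start within a second or two. With that, I'm officially done tuning parameters for running QWEN3.6 27B under HERMES. Thanks for putting up with my rambling.</p>
</blockquote>
<p dir="auto">How did you test this?</p>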
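]]></description><link>https://lcz.me/post/3842</link><guid isPermaLink="true">https://lcz.me/post/3842</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Tue, 26 May 2026 16:36:53 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 16:13:28 GMT]]></title><description><![CDATA[<p dir="auto">For now I use HERMES conversationally; I haven't reached the level of the experts here who have it generate scripts automatically. The experience is already better and faster than online services, and the capability is no weaker, arguably stronger: it reads the PDFs I give it quickly and accurately and understands them well. Sometimes, when I'm unsure about a part of a PDF, having it find the solution is faster than reading it myself. Previously, late in a conversation, once VRAM usage hit 19.7G, there was a chance of dropping to single-digit T/S regardless of whether shared GPU memory was at 0.2G or more. That was painful, because some of the project code hadn't finished updating and it was a bad moment to stop. With the current settings, speed still holds at around 35T/S after VRAM usage reaches 19.7G, and it stays stable even with shared GPU memory now at 1G. Replies start within a second or two. With that, I'm officially done tuning parameters for running QWEN3.6 27B under HERMES. Thanks for putting up with my rambling.</p>
]]></description><link>https://lcz.me/post/3840</link><guid isPermaLink="true">https://lcz.me/post/3840</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Tue, 26 May 2026 16:13:28 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 15:57:00 GMT]]></title><description><![CDATA[<p dir="auto">Latest tuning: I feel like I got even more KV cache for free. Multimodal with MTP runs stably through long sessions with many turns, right up to 99% KV usage, and the KV cache can now reach 190K... I'm continuing to give it medium-sized coding tasks.<br />
I think the 35B can be retired: MTP is basically ineffective on it, and every so often it comes out with things like "some indentation error" or "I'll just rewrite it".<br />
The "forcing full prompt re-processing due to lack of cache data" behavior mentioned in a neighboring thread finally showed up, but it flashed by and I noticed nothing abnormal.</p>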
<p dir="auto"><img src="https://upload.lcz.me/uploads/1890e71c-381c-4b7b-b23e-5adc8e6608ed.png" alt="019.png" class=" img-fluid img-markdown" /></p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/29ae606e-c087-4714-8414-e006e78a34a5.png" alt="020.png" class=" img-fluid img-markdown" /><br />
I changed the highlighted parts.</p>
]]></description><link>https://lcz.me/post/3833</link><guid isPermaLink="true">https://lcz.me/post/3833</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Tue, 26 May 2026 15:57:00 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 14:56:49 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a> It is supported. This afternoon I even had the AI find a multimodal one, and I'm using it now.</p>
]]></description><link>https://lcz.me/post/3824</link><guid isPermaLink="true">https://lcz.me/post/3824</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Tue, 26 May 2026 14:56:49 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 12:37:17 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/rock-shi" aria-label="Profile: rock-shi">@<bdi>rock-shi</bdi></a> I ran a coding project all afternoon. 160K is the ceiling; at 170K there is a chance of running out of VRAM. Which raises a question: wasn't MTP supposed to be incompatible with multimodal? How did I get it running?</p>
]]></description><link>https://lcz.me/post/3801</link><guid isPermaLink="true">https://lcz.me/post/3801</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Tue, 26 May 2026 12:37:17 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 12:27:01 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/kop-wang" aria-label="Profile: kop-wang">@<bdi>kop-wang</bdi></a> That is the PREFILL speed with the KV cache at 105K; at the start of the conversation it was over 1100.</p>
]]></description><link>https://lcz.me/post/3800</link><guid isPermaLink="true">https://lcz.me/post/3800</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Tue, 26 May 2026 12:27:01 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 07:57:18 GMT]]></title><description><![CDATA[<p dir="auto">Thanks to the OP for sharing.<br />
Prefill throughput is under 500 T/S. That is acceptable for the price, but after many turns every LLM call tends to stall for 10~20 seconds.</p>
<p dir="auto">On the other hand, if the local LLM is only used for background tasks without tight latency requirements, that is acceptable too.</p>
<p dir="auto">Also, MTP has some negative impact on prefill, which needs to be weighed as well.</p>
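<p dir="auto">To put the 10~20 second stall in perspective, here is a minimal sketch of the arithmetic. It assumes the wait before the first output token is dominated by prefilling the prompt tokens that are not already in the KV cache; the function names and the 100000-token example are hypothetical illustrations, not measurements from this thread.</p>

```python
# Back-of-the-envelope estimate of the wait before the first output token.
# Assumption: the wait is dominated by prefill of the tokens that are NOT
# already in the KV cache; sampling and network overhead are ignored.


def prefill_wait_seconds(uncached_tokens, prefill_tokens_per_second):
    """Seconds spent processing the uncached part of the prompt."""
    return uncached_tokens / prefill_tokens_per_second


def tokens_for_wait(wait_seconds, prefill_tokens_per_second):
    """Uncached prompt tokens that would explain a given wait."""
    return wait_seconds * prefill_tokens_per_second


if __name__ == "__main__":
    speed = 500  # T/S, the upper bound quoted in the post above
    for wait in (10, 20):
        tokens = tokens_for_wait(wait, speed)
        print(f"{wait} s at {speed} T/S = {tokens:.0f} uncached tokens")
    # Hypothetical worst case: a 100000-token prompt reprocessed from scratch.
    full = prefill_wait_seconds(100_000, speed)
    print(f"full reprocess of 100000 tokens: {full:.0f} s")
```

<p dir="auto">Under that assumption, a stall of that length corresponds to a few thousand uncached tokens per call, so how much of the prompt the server can reuse from cache matters more than the raw prefill figure.</p>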
]]></description><link>https://lcz.me/post/3752</link><guid isPermaLink="true">https://lcz.me/post/3752</guid><dc:creator><![CDATA[kop wang]]></dc:creator><pubDate>Tue, 26 May 2026 07:57:18 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 07:37:01 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a> Exactly! At the very least it feels comfortable to use, not far off from the cloud. Next up is waiting for DFlash; I had the AI estimate it, and the 3080 could probably reach 60t/s.</p>
]]></description><link>https://lcz.me/post/3750</link><guid isPermaLink="true">https://lcz.me/post/3750</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Tue, 26 May 2026 07:37:01 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 06:08:47 GMT]]></title><description><![CDATA[<p dir="auto">This is the 27B multimodal MTP speed with Qwen3.6-27B-uncensored-abliterated-MTP-i1-IQ4_XS-FFN-IQ3, the model SKY provided in a neighboring thread. The KV cache limit is currently 150K; I have run it to about 100K and VRAM peaked at only 19.3G, so there is still room to raise it. Still, with this speed, this precision, and multimodal on top, I have no regrets.</p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/5dd58003-698e-4ff5-a634-1b4982c60e79.png" alt="017.png" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/3739</link><guid isPermaLink="true">https://lcz.me/post/3739</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Tue, 26 May 2026 06:08:47 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 03:23:44 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> 48g, wow. Keep an eye on DFlash later on; with that, your 27b speed should hit 80t/s.</p>
]]></description><link>https://lcz.me/post/3718</link><guid isPermaLink="true">https://lcz.me/post/3718</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Tue, 26 May 2026 03:23:44 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Tue, 26 May 2026 00:17:46 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/im17me" aria-label="Profile: im17me">@<bdi>im17me</bdi></a> 3090s with nvlink really take off: you can expect about x1.8 speed, and 48g of VRAM is a dream.</p>
]]></description><link>https://lcz.me/post/3677</link><guid isPermaLink="true">https://lcz.me/post/3677</guid><dc:creator><![CDATA[ldscool]]></dc:creator><pubDate>Tue, 26 May 2026 00:17:46 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 15:11:16 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/im17me" aria-label="Profile: im17me">@<bdi>im17me</bdi></a> It hasn't arrived yet. I'm overseas.</p>
]]></description><link>https://lcz.me/post/3619</link><guid isPermaLink="true">https://lcz.me/post/3619</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Mon, 25 May 2026 15:11:16 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 15:10:26 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> Does adding nvlink to your 3090s make a difference? Could you share how much it improves things?</p>
]]></description><link>https://lcz.me/post/3618</link><guid isPermaLink="true">https://lcz.me/post/3618</guid><dc:creator><![CDATA[im17me]]></dc:creator><pubDate>Mon, 25 May 2026 15:10:26 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 11:07:06 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> So you have dual 3090s. That's a different story, a completely different world. I'd guess even ComfyUI runs nicely on that. A single 3080 can actually run LTX2.3 as well, and the experience is decent. Believe it or not, I actually got this card last year for video generation.</p>
]]></description><link>https://lcz.me/post/3579</link><guid isPermaLink="true">https://lcz.me/post/3579</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Mon, 25 May 2026 11:07:06 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 08:57:05 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> Dual 3090 + nvlink is absolutely awesome. Looking forward to your feedback.</p>
]]></description><link>https://lcz.me/post/3566</link><guid isPermaLink="true">https://lcz.me/post/3566</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Mon, 25 May 2026 08:57:05 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 07:59:56 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vosrock" aria-label="Profile: vosrock">@<bdi>vosrock</bdi></a></p>
<p dir="auto">The money is already spent. I'll run some tests too once my machine arrives.</p>
]]></description><link>https://lcz.me/post/3564</link><guid isPermaLink="true">https://lcz.me/post/3564</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Mon, 25 May 2026 07:59:56 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 07:19:50 GMT]]></title><description><![CDATA[<p dir="auto"><img src="https://upload.lcz.me/uploads/ab31a02e-a475-45e9-a421-258bed9ef507.PNG" alt="016.PNG" class=" img-fluid img-markdown" /><br />
My workflow is to use the 27B for the early phase of a project until the framework is roughly in place, then switch to the 35B with the context maxed out and no MTP. This is the speed I get; in the state shown in the screenshot, the context has actually reached 150K. And this is just a single card, so don't bother with dual cards, buddy.</p>
]]></description><link>https://lcz.me/post/3551</link><guid isPermaLink="true">https://lcz.me/post/3551</guid><dc:creator><![CDATA[vosrock]]></dc:creator><pubDate>Mon, 25 May 2026 07:19:50 GMT</pubDate></item><item><title><![CDATA[Reply to RTX3080 20g, qwen3.6 27B 45-50T/S, 35B multimodal 256K 110T/S on Mon, 25 May 2026 06:56:06 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> I wouldn't put it that way; there are always trade-offs. Take my two 3080s: they felt rather outdated when I bought them, but in practice they may well cover plenty of use cases that other cards don't fit, and the overall speed feels decent too.<img src="https://upload.lcz.me/uploads/3dfc622e-e8ae-4723-948f-9dfc6506923a.jpeg" alt="53164c3f-b1fa-4fe0-8775-35881d05bfa6-image.jpeg" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/3547</link><guid isPermaLink="true">https://lcz.me/post/3547</guid><dc:creator><![CDATA[rock shi]]></dc:creator><pubDate>Mon, 25 May 2026 06:56:06 GMT</pubDate></item></channel></rss>