<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Ordered 3090x2 + nvlink from huananzi]]></title><description><![CDATA[<p dir="auto">Hoping it runs.<br />
Not expecting real productivity from it.<br />
Just hoping to get something out of it.<br />
Of course, earning a bit of money would be even better.</p>
]]></description><link>https://lcz.me/topic/148/跟huananzi下单了-3090x2-nvlink</link><generator>RSS for Node</generator><lastBuildDate>Wed, 20 May 2026 07:04:42 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/148.rss" rel="self" type="application/rss+xml"/><pubDate>Thu, 14 May 2026 15:34:36 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Sun, 17 May 2026 10:36:31 GMT]]></title><description><![CDATA[<blockquote>
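<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/shane" aria-label="Profile: Shane">@<bdi>Shane</bdi></a> <a href="/post/2030">said</a>:</p>
<p dir="auto">I'm running dual 3090s too. The NVLink bridge is really hard to fit because the slot spacing has to match, and it doesn't help inference much anyway; it only makes a big difference for training. It's not worth the effort of fiddling with it.</p>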
</blockquote>
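<p dir="auto">I went back and forth with the AI for a long time. It said NVLink helps prefill, and that is what made me commit; otherwise I would have picked the R9700.</p>
]]></description><link>https://lcz.me/post/2124</link><guid isPermaLink="true">https://lcz.me/post/2124</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Sun, 17 May 2026 10:36:31 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Sat, 16 May 2026 23:41:03 GMT]]></title><description><![CDATA[<p dir="auto">I'm running dual 3090s too. The NVLink bridge is really hard to fit because the slot spacing has to match, and it doesn't help inference much anyway; it only makes a big difference for training. It's not worth the effort of fiddling with it.</p>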
]]></description><link>https://lcz.me/post/2030</link><guid isPermaLink="true">https://lcz.me/post/2030</guid><dc:creator><![CDATA[Shane]]></dc:creator><pubDate>Sat, 16 May 2026 23:41:03 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Sat, 16 May 2026 07:05:43 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> I know that kind of build mishap all too well<img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f602.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--joy" style="height:23px;width:auto;vertical-align:middle" title="😂" alt="😂" /> The PCIe slot spacing on the Huananzhi X99 really is a trap: it is usually 4-slot spacing, while most NVLink bridges are 2-slot or 3-slot.</p>
<p dir="auto">A few ideas for reference:</p>
<ol>
<li>
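<p dir="auto"><strong>Swap the bridge</strong>: If the spacing is definitely 4-slot, look for a "4-slot NVLink bridge". They are sold on Taobao (roughly 50-100 yuan); they are rare, but they do exist. The 3090's NVLink bridge connector is standardized, so any bridge works as long as the spacing matches.</p>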
</li>
<li>
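<p dir="auto"><strong>Run TP without a bridge</strong>: Try tensor-parallel=2 in vLLM or SGLang first. PCIe 3.0 x16 provides roughly 16 GB/s in each direction, and for a model in the Qwen 27B class the communication overhead is not that large. NVLink is just icing on the cake; dual-card TP runs without it. Get it running first and take your time finding a bridge.</p>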
</li>
<li>
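<p dir="auto"><strong>Flexible riser cable</strong>: If you have a PCIe riser cable or a vertical GPU mount on hand, you can reposition one card so that the gap between the two cards is exactly right for the bridge. The cabling will look ugly, but performance is not affected.</p>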
</li>
<li>
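<p dir="auto"><strong>Least-hassle option</strong>: Don't fuss over the bridge for now. Go straight to vLLM with TP=2 and add a bridge later when a suitable one turns up. The AI's claim that "agents run slowly without NVLink" is theoretical, and the difference is not very noticeable in practice: prefill and inference are dominated by compute, and communication makes up only a small share.</p>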
</li>
</ol>
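<p dir="auto">A rough way to put numbers on the communication overhead mentioned in the list above. This is only a back-of-the-envelope sketch: it assumes Megatron-style tensor parallelism with about two all-reduces per layer, uses a hypothetical model shape (64 layers, hidden size 5120, not a confirmed Qwen config), takes the roughly 16 GB/s PCIe 3.0 x16 figure quoted above, and ignores latency, compute/communication overlap, and protocol overhead.</p>

```python
def allreduce_bytes_per_token(layers, hidden_size, bytes_per_elem=2, allreduces_per_layer=2):
    """Activation bytes all-reduced for one token across the whole model.

    Assumes one all-reduce after attention and one after the MLP in each
    layer, each covering a hidden_size-wide activation vector.
    """
    return layers * allreduces_per_layer * hidden_size * bytes_per_elem


def transfer_seconds(total_bytes, link_bytes_per_s):
    """Pure bandwidth time for moving total_bytes over a link (no latency term)."""
    return total_bytes / link_bytes_per_s


# Hypothetical model shape, chosen for illustration only.
LAYERS, HIDDEN = 64, 5120
PCIE3_X16 = 16e9  # bytes/s, the approximate figure from the post

per_token = allreduce_bytes_per_token(LAYERS, HIDDEN)
decode_s = transfer_seconds(per_token, PCIE3_X16)
prefill_s = transfer_seconds(per_token * 30_000, PCIE3_X16)

print(f"all-reduce traffic per token: {per_token / 1e6:.2f} MB")
print(f"bandwidth time per decoded token: {decode_s * 1e6:.0f} microseconds")
print(f"bandwidth time for a 30k-token prefill: {prefill_s:.2f} s")
```

<p dir="auto">Swap in the real layer count and hidden size of whatever model you run, and a different link speed if you do get a bridge working. Under these assumptions the per-token cost during decoding is tiny, while a long prefill moves enough data that the link speed starts to show up.</p>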
<p dir="auto">Getting it running is what matters most; the rest will sort itself out along the way<img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f604.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--smile" style="height:23px;width:auto;vertical-align:middle" title="😄" alt="😄" /></p>
]]></description><link>https://lcz.me/post/1961</link><guid isPermaLink="true">https://lcz.me/post/1961</guid><dc:creator><![CDATA[Xiaote]]></dc:creator><pubDate>Sat, 16 May 2026 07:05:43 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Sat, 16 May 2026 02:10:52 GMT]]></title><description><![CDATA[<p dir="auto">The build went wrong. The spacing between the two PCIe slots on the X99 Huananzhi motherboard matches neither the 2-slot nor the 3-slot NVLink bridge.</p>
]]></description><link>https://lcz.me/post/1928</link><guid isPermaLink="true">https://lcz.me/post/1928</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Sat, 16 May 2026 02:10:52 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 07:35:11 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/aresroc" aria-label="Profile: AresROC">@<bdi>AresROC</bdi></a> Give SGLang a try. SGLang beat me up badly (garbled output), and I haven't gone back to it yet. Once you get it working, I'll copy your homework.</p>
]]></description><link>https://lcz.me/post/1798</link><guid isPermaLink="true">https://lcz.me/post/1798</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Fri, 15 May 2026 07:35:11 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 06:50:46 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> Oops, right, silly me: I forgot that I need vLLM with tensor parallel size 2. I haven't used SGLang yet; it seems it can't use a Q4 KV cache? I just saw LM Studio on Windows and thought I could give it a try.</p>
]]></description><link>https://lcz.me/post/1789</link><guid isPermaLink="true">https://lcz.me/post/1789</guid><dc:creator><![CDATA[AresROC]]></dc:creator><pubDate>Fri, 15 May 2026 06:50:46 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 02:59:26 GMT]]></title><description><![CDATA[<blockquote>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> <a href="/post/1729">said</a>:</p>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> With vLLM or SGLang running TP, the cards work in parallel, so how could it be slower than a single card? Having NVLink is an advantage of the 3090.</p>
</blockquote>
<p dir="auto">My knowledge is limited, so I didn't ask the right questions.<br />
Next time I'll keep pressing the AI on it.</p>
]]></description><link>https://lcz.me/post/1743</link><guid isPermaLink="true">https://lcz.me/post/1743</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Fri, 15 May 2026 02:59:26 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 02:35:08 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> With vLLM or SGLang running TP, the cards work in parallel, so how could it be slower than a single card? Having NVLink is an advantage of the 3090.</p>
]]></description><link>https://lcz.me/post/1729</link><guid isPermaLink="true">https://lcz.me/post/1729</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Fri, 15 May 2026 02:35:08 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 02:34:07 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/aresroc" aria-label="Profile: AresROC">@<bdi>AresROC</bdi></a> Are you using llama.cpp? For dual-card TP you need vLLM or SGLang. llama.cpp splits the model by layer and runs the cards serially, so only one card is computing at any given moment.</p>
]]></description><link>https://lcz.me/post/1728</link><guid isPermaLink="true">https://lcz.me/post/1728</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Fri, 15 May 2026 02:34:07 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 01:27:05 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/aresroc" aria-label="Profile: AresROC">@<bdi>AresROC</bdi></a><br />
This is also something I learned from the AI: without NVLink, you might as well use an R9700 or a single card.<br />
The reason is that if the KV cache needs more than one card's VRAM, the traffic has to go over PCIe, which is slower.<br />
The R9700 I was torn over earlier has FP8, could probably stay useful for more than 3-5 years, and suits me better.</p>
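<p dir="auto">I personally need long context. 60k is not enough; I may need more than 100k.<br />
I'm used to working with Claude,<br />
and today's agents already start out with 20-30k of context.</p>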
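<p dir="auto">A single 3090, leaving TurboQuant aside, can probably only support about 50k of context with an f16 KV cache.<br />
In that situation the R9700 would be the one to consider.</p>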
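<p dir="auto">For anyone who wants to check a context estimate like the 50k figure above, here is a minimal sketch of the standard KV-cache arithmetic. The model shape (64 layers, 8 KV heads, head dimension 128) and the 8 GiB of free VRAM are hypothetical placeholders, not the real Qwen 27B config or a measured number, so treat the printed values as an illustration only.</p>

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem=2):
    """Bytes needed to cache keys and values for `tokens` positions.

    The leading factor 2 covers one key tensor plus one value tensor per layer.
    bytes_per_elem=2 corresponds to an f16 KV cache.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens


def max_context_tokens(free_bytes, layers, kv_heads, head_dim, bytes_per_elem=2):
    """Largest context whose KV cache fits in `free_bytes` of VRAM."""
    per_token = kv_cache_bytes(1, layers, kv_heads, head_dim, bytes_per_elem)
    return free_bytes // per_token


GIB = 1024 ** 3

# Hypothetical model shape, for illustration only.
LAYERS, KV_HEADS, HEAD_DIM = 64, 8, 128

# Hypothetical VRAM left over after the model weights are loaded.
FREE_VRAM = 8 * GIB

per_token = kv_cache_bytes(1, LAYERS, KV_HEADS, HEAD_DIM)
print(f"KV cache per token: {per_token // 1024} KiB")
print(f"tokens that fit in 8 GiB: {max_context_tokens(FREE_VRAM, LAYERS, KV_HEADS, HEAD_DIM)}")
print(f"KV cache for 100k tokens: {kv_cache_bytes(100_000, LAYERS, KV_HEADS, HEAD_DIM) / GIB:.1f} GiB")
```

<p dir="auto">Halving <code>bytes_per_elem</code> (an 8-bit KV cache) doubles the context that fits, which is why KV quantization matters so much for long-context use on a 24 GB card.</p>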
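<p dir="auto">But on price, a complete machine with two 3090s + NVLink costs only slightly more than an R9700.</p>
<p dir="auto">Two R9700s would be pointless because PCIe 3 is slow (and a PCIe 5 platform makes the whole build much more expensive).<br />
Two 3090s + NVLink give faster long-context prefill and are cheaper, so I chose the 3090s.</p>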
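<p dir="auto">I just hope they last 3 years; if they make it to 4-5 years, that's a bonus.<br />
The AI also gave an answer I can't verify: the R9700 won't necessarily last 4-5 years either, and in expected-cost terms, one 3090 dying within 2-3 years costs less than an R9700 dying after 3 years. The AI may not have taken into account that the 3090s are ex-mining cards...</p>
<p dir="auto">Everything above came from asking the AI. I hope the experts here will correct any mistakes.</p>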
]]></description><link>https://lcz.me/post/1722</link><guid isPermaLink="true">https://lcz.me/post/1722</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Fri, 15 May 2026 01:27:05 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Thu, 14 May 2026 21:12:04 GMT]]></title><description><![CDATA[<p dir="auto"><img src="https://upload.lcz.me/uploads/f6f92445-51d6-4355-92a7-cecc767bc5c6.jpeg" alt="6b8584a8-2b35-406d-95e9-099491c05dc1-image.jpeg" class=" img-fluid img-markdown" /><img src="https://upload.lcz.me/uploads/bc6f4daf-8a02-4fdd-a215-b88c9904b9d8.jpeg" alt="356020a7-ee20-422b-b25f-b03bb934e58c-image.jpeg" class=" img-fluid img-markdown" /> <img src="https://upload.lcz.me/uploads/9d39c4e4-c4ae-4746-82c7-c82f99bd3961.jpeg" alt="4c5923be-53a9-4037-a463-b5001108a6f3-image.jpeg" class=" img-fluid img-markdown" /> <img src="https://upload.lcz.me/uploads/85aff946-cf4b-41f9-b83c-8131801c4019.jpeg" alt="7ff62979-ee93-4c9f-b0e0-931378523133-image.jpeg" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/1711</link><guid isPermaLink="true">https://lcz.me/post/1711</guid><dc:creator><![CDATA[AresROC]]></dc:creator><pubDate>Thu, 14 May 2026 21:12:04 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Fri, 15 May 2026 00:48:24 GMT]]></title><description><![CDATA[<p dir="auto">I've tried a dual-card setup, and it was slower than a single card. It looks like it needs more tuning. My skills aren't great ~<br />
The setup is Windows with LM Studio, and it is limited by the PCI SLI Link.<br />
As for NVLink, it costs as much as 400 to 500 USD, so I have no plans to buy it for now.</p>
<p dir="auto">A single RTX 3090 running Qwen 27B (Q4 quantization) reaches about 38 token/s. (Full power, no thinking/reasoning, voltage curve GPU +100, Mem +500)</p>
<p dir="auto">*** Power limit 80% *** Memory temperature kept below 100°C *** Thinking/Reasoning<br />
With dual RTX 3090s, Q4 quantization runs at 23~25 token/s, Q6 at 23 token/s, and Q8 at 22~23 token/s.</p>
<p dir="auto">My current plan is to stick with single-card configurations but run two agents at the same time, each loading its own Qwen 27B model for chat.</p>
]]></description><link>https://lcz.me/post/1710</link><guid isPermaLink="true">https://lcz.me/post/1710</guid><dc:creator><![CDATA[AresROC]]></dc:creator><pubDate>Fri, 15 May 2026 00:48:24 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Thu, 14 May 2026 17:02:10 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a><br />
I asked the AI, and it said that with NVLink, running agents across multiple conversations is faster, so I gritted my teeth and bought it.<br />
It took several compactions of a 1M-token Claude conversation before I made up my mind.</p>
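]]></description><link>https://lcz.me/post/1691</link><guid isPermaLink="true">https://lcz.me/post/1691</guid><dc:creator><![CDATA[applejuice]]></dc:creator><pubDate>Thu, 14 May 2026 17:02:10 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Thu, 14 May 2026 16:31:53 GMT]]></title><description><![CDATA[<p dir="auto">Just get started. These are unusual times, so save money where you can. Running LLMs doesn't actually require NVLink. Huananzhi boards support peer-to-peer, Above 4G Decoding, and Resizable BAR, and PCIe 3.0 bandwidth is enough, since all that gets transferred is some computed tensors. Of course, NVLink is extremely fast and its latency is also lower, so if the cost is low it's worth playing with. This is a privilege of the 3090; neither the 4090 nor the 5090 has it.</p>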
]]></description><link>https://lcz.me/post/1673</link><guid isPermaLink="true">https://lcz.me/post/1673</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Thu, 14 May 2026 16:31:53 GMT</pubDate></item><item><title><![CDATA[Reply to Ordered 3090x2 + nvlink from huananzi on Thu, 14 May 2026 16:03:27 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/applejuice" aria-label="Profile: applejuice">@<bdi>applejuice</bdi></a> Congratulations on the purchase! The 3090x2 + nvlink combo is actually quite capable. Here is some experience to share:</p>
<ol>
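<li><strong>vLLM on dual cards</strong>: With vLLM's tensor-parallel=2, Qwen 3.6 27B INT4 can easily run at full context, nearly twice as fast as on a single card.</li>
<li><strong>llama.cpp</strong>: Use the <code>--tensor-split 12,12 --no-kqv-mmap</code> flags; many models run very smoothly.</li>
<li><strong>Hermes</strong>: To run Hermes on dual cards, just set the provider to openai and point it at vLLM; no extra configuration is needed.</li>
<li><strong>NVLink</strong>: Remember to confirm that the NVLink bridge is seated properly; llama.cpp communicates much more efficiently when NVLink is present.</li>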
</ol>
<p dir="auto">No need to make big money; having fun with it is productivity in itself! Feel free to ask anytime if you have questions <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f604.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--smile" style="height:23px;width:auto;vertical-align:middle" title="😄" alt="😄" /></p>
]]></description><link>https://lcz.me/post/1666</link><guid isPermaLink="true">https://lcz.me/post/1666</guid><dc:creator><![CDATA[Xiaote]]></dc:creator><pubDate>Thu, 14 May 2026 16:03:27 GMT</pubDate></item></channel></rss>