<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048)]]></title><description><![CDATA[<p dir="auto"><img src="https://upload.lcz.me/uploads/06edb8ca-d4bb-48c4-a30d-cc260ac96cdf.jpeg" alt="c654fc6b-8155-4d1d-bb58-f48b2b0f0371-image.jpeg" class=" img-fluid img-markdown" /></p>
<p dir="auto">回馈以下社区，昨天刚去本地买了块公版3090，回来之后让claude配置sglang，这是目前的结果。系统是ubuntu</p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/22a45d55-1123-4444-958c-ff9151be3def.jpeg" alt="75ad769c-68a2-45d4-a80b-1b2e64d5dd54-image.jpeg" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/topic/261/throughput-on-rtx-3090-qwen3.6-27b-awq-marlin-bf16-bf16-kv-ctx-2048</link><generator>RSS for Node</generator><lastBuildDate>Sat, 06 Jun 2026 07:04:23 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/261.rss" rel="self" type="application/rss+xml"/><pubDate>Fri, 22 May 2026 15:07:55 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Fri, 29 May 2026 06:05:59 GMT]]></title><description><![CDATA[<p dir="auto">vLLM 可以運行 32k 上下文，對於Agent用途來說還不錯，MTP速度為 50~60 tk/s @250w</p>
<p dir="auto">--model ~/AiModel/int4-AutoRound	<br />
--gpu-memory-utilization 0.95	<br />
--max-model-len 32768	<br />
--enable-auto-tool-choice	<br />
--tool-call-parser qwen3_coder	0	<br />
--language-model-only	<br />
--host 0.0.0.0	 --port 8000	<br />
--kv-cache-dtype fp8_e5m2	<br />
--max-num-seqs 1	<br />
--max-num-batched-tokens 4128	<br />
--trust-remote-code	<br />
--dtype bfloat16	<br />
--enable-prefix-caching	<br />
--enable-chunked-prefill	 	<br />
--no-scheduler-reserve-full-isl	<br />
--speculative-config '{"method":"mtp","num_speculative_tokens":3}'</p>
]]></description><link>https://lcz.me/post/4161</link><guid isPermaLink="true">https://lcz.me/post/4161</guid><dc:creator><![CDATA[AresROC]]></dc:creator><pubDate>Fri, 29 May 2026 06:05:59 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Wed, 27 May 2026 08:36:31 GMT]]></title><description><![CDATA[<p dir="auto"><img src="https://upload.lcz.me/uploads/dec91e6d-1f87-47e6-9d14-01281f208911.jpeg" alt="c29489bb-2352-474f-bd60-2cdf58555ca4-image.jpeg" class=" img-fluid img-markdown" /><br />
我也是用claude架的，這是我的配置</p>
]]></description><link>https://lcz.me/post/3914</link><guid isPermaLink="true">https://lcz.me/post/3914</guid><dc:creator><![CDATA[[[global:former-user]]]]></dc:creator><pubDate>Wed, 27 May 2026 08:36:31 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sun, 24 May 2026 11:21:00 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/johnnybegood" aria-label="Profile: johnnybegood">@<bdi>johnnybegood</bdi></a> LV并不比普通包体验好多少，好肯定好一点，但是没好那么多。不过有钱人都买LV。</p>
]]></description><link>https://lcz.me/post/3395</link><guid isPermaLink="true">https://lcz.me/post/3395</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Sun, 24 May 2026 11:21:00 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sun, 24 May 2026 06:14:13 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/larry-wang" aria-label="Profile: Larry-Wang">@<bdi>Larry-Wang</bdi></a> opus 4.7 比 deepseek 4.0 pro 到底好在哪呢？</p>
]]></description><link>https://lcz.me/post/3368</link><guid isPermaLink="true">https://lcz.me/post/3368</guid><dc:creator><![CDATA[johnnybegood]]></dc:creator><pubDate>Sun, 24 May 2026 06:14:13 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sun, 24 May 2026 02:05:01 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/larry-wang" aria-label="Profile: Larry-Wang">@<bdi>Larry-Wang</bdi></a> 价格放哪了。opus 4.7 造价不菲啊。建议重点调试用。或试试国产。</p>
]]></description><link>https://lcz.me/post/3331</link><guid isPermaLink="true">https://lcz.me/post/3331</guid><dc:creator><![CDATA[williamlouis]]></dc:creator><pubDate>Sun, 24 May 2026 02:05:01 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sat, 23 May 2026 21:41:35 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/larry-wang" aria-label="Profile: Larry-Wang">@<bdi>Larry-Wang</bdi></a> 不着急等Qwen3.7 27b发布之后，我相信Sg-Lang的支持会更好，到时候一起折腾，估计也就这两三周的事了。</p>
]]></description><link>https://lcz.me/post/3311</link><guid isPermaLink="true">https://lcz.me/post/3311</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Sat, 23 May 2026 21:41:35 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sat, 23 May 2026 15:50:33 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/williamlouis" aria-label="Profile: williamlouis">@<bdi>williamlouis</bdi></a> claude就用的opus 4.7，3090感觉目前跑不了qwen 3.6 27b sglang</p>
]]></description><link>https://lcz.me/post/3296</link><guid isPermaLink="true">https://lcz.me/post/3296</guid><dc:creator><![CDATA[Larry Wang]]></dc:creator><pubDate>Sat, 23 May 2026 15:50:33 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Sat, 23 May 2026 15:47:37 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> 4090 48g应该跑得下来，3090 24g sglang目前估计够呛，需要两张卡</p>
]]></description><link>https://lcz.me/post/3295</link><guid isPermaLink="true">https://lcz.me/post/3295</guid><dc:creator><![CDATA[Larry Wang]]></dc:creator><pubDate>Sat, 23 May 2026 15:47:37 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Fri, 22 May 2026 20:21:21 GMT]]></title><description><![CDATA[<p dir="auto">很有价值。谢谢。claude 选的什么版本？现在 open AI 5.5 有人使用调试成功了。claude 这是也站起来了。</p>
]]></description><link>https://lcz.me/post/3171</link><guid isPermaLink="true">https://lcz.me/post/3171</guid><dc:creator><![CDATA[williamlouis]]></dc:creator><pubDate>Fri, 22 May 2026 20:21:21 GMT</pubDate></item><item><title><![CDATA[Reply to Throughput on RTX 3090 (Qwen3.6-27B AWQ-Marlin BF16, BF16 KV, ctx=2048) on Fri, 22 May 2026 19:16:12 GMT]]></title><description><![CDATA[<p dir="auto">这效果相当不错了，发个详细的教程，我也找时间抄作业，我给4090上下sg-lang，之前总是乱码。 2048上下文长度太短了，没什么意义，测试更长一点，最少64k。</p>
]]></description><link>https://lcz.me/post/3162</link><guid isPermaLink="true">https://lcz.me/post/3162</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Fri, 22 May 2026 19:16:12 GMT</pubDate></item></channel></rss>