<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[M5 Pro 64GB LLM performance reference.]]></title><description><![CDATA[<p dir="auto">Reporting my test results.</p>
<p dir="auto">My machine is an M5 Pro with 64GB (18 CPU + 20 GPU cores). I tested several runtimes: Ollama, LM Studio, and MTPLX.<br />
The models are mainly Qwen3.6 27B and Qwen3.6 35B-A3B, both Q4-quantized.</p>
<p dir="auto">Conclusions first:</p>
<ul>
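<li>35B-A3B on the MLX runtime (LM Studio) still reaches 50+ tok/s at 64K context, which is already usable, though it is slightly less capable than the dense 27B.</li>
<li>The dense 27B on MLX + MTP (MTPLX) reaches 19+ tok/s at 64K context. That is a huge improvement, but still only barely usable.</li>
<li>MTPLX's speculative decoding acceptance rate is still high at 64K; I have not tested longer contexts.</li>
<li>The M5 Max has up to twice the memory bandwidth of the Pro. Once the MTPLX ecosystem matures, I expect a 27B LLM to be usable on it.</li>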
</ul>
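<p dir="auto">A rough way to see why the MoE model decodes so much faster than the dense one, and why doubling bandwidth should help: decode is usually limited by how fast the active weights can be read from memory. The sketch below is only a back-of-the-envelope ceiling. The function name and the 300 GB/s figure are illustrative placeholders, not measured or official M5 Pro specs; Q4 is treated as exactly 4 bits per weight; and KV-cache reads and compute are ignored.</p>

```python
def decode_upper_bound_tok_s(bandwidth_gb_s: float,
                             active_params_billion: float,
                             bits_per_weight: float) -> float:
    """Bandwidth-bound ceiling on decode speed, in tokens per second.

    Assumes each generated token reads every active weight exactly once.
    """
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Hypothetical bandwidth, NOT a measured or official M5 Pro number.
ASSUMED_PRO_BANDWIDTH_GB_S = 300.0

dense_27b = decode_upper_bound_tok_s(ASSUMED_PRO_BANDWIDTH_GB_S, 27, 4)
moe_a3b = decode_upper_bound_tok_s(ASSUMED_PRO_BANDWIDTH_GB_S, 3, 4)
dense_27b_2x = decode_upper_bound_tok_s(2 * ASSUMED_PRO_BANDWIDTH_GB_S, 27, 4)

print(f"dense 27B Q4 ceiling: {dense_27b:.1f} tok/s")
print(f"35B-A3B Q4 ceiling (about 3B active): {moe_a3b:.1f} tok/s")
print(f"dense 27B Q4 at 2x bandwidth: {dense_27b_2x:.1f} tok/s")
```

<p dir="auto">Measured speeds will land well below these ceilings, especially at 64K context, but the ratios show the trend: fewer active parameters or more bandwidth both raise the limit proportionally.</p>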
<p dir="auto">35B test results:<br />
<img src="https://upload.lcz.me/uploads/2d8cf337-fee5-40cc-b3cc-5ffcfb8ffd7f.png" alt="35b.png" class=" img-fluid img-markdown" /></p>
<p dir="auto">27B test results:<br />
<img src="https://upload.lcz.me/uploads/5793029a-53db-463b-97ba-b5fcdb620ce0.png" alt="27b.png" class=" img-fluid img-markdown" /></p>
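<p dir="auto">Notes:<br />
1. Tests at 4K and 8K context are not meaningful, so focus on the results at 16K and above.<br />
2. It is hard to get a clean environment on a Mac. My MBP is my daily driver and runs many services and apps, so these numbers should not be treated as a baseline, but the relative comparisons are still meaningful.<br />
3. I tested vMLX earlier and it was very unstable, so I dropped it.<br />
4. oMLX reportedly performs about the same as LM Studio, so I did not test it either.</p>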
]]></description><link>https://lcz.me/topic/191/m5pro-64g-llm性能参考.</link><generator>RSS for Node</generator><lastBuildDate>Wed, 20 May 2026 06:05:02 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/191.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 17 May 2026 23:55:43 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Wed, 20 May 2026 01:08:38 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> I only found out by chance that AMD cards can run models too. I had been researching Macs and looking at the Mac Studio, since NVIDIA cards are just too expensive. Then I discovered that AMD cards work now, for around 5k, so I plan to sell my 3060 12GB, upgrade, and tinker with an AMD card.</p>
]]></description><link>https://lcz.me/post/2699</link><guid isPermaLink="true">https://lcz.me/post/2699</guid><dc:creator><![CDATA[janebo]]></dc:creator><pubDate>Wed, 20 May 2026 01:08:38 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 17:51:43 GMT]]></title><description><![CDATA[<p dir="auto"><img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f602.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--joy" style="height:23px;width:auto;vertical-align:middle" title="😂" alt="😂" />That does not change the fact that Macs are useless for running AI. It is a hardware handicap, and hard to fix after the fact.</p>
]]></description><link>https://lcz.me/post/2681</link><guid isPermaLink="true">https://lcz.me/post/2681</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 17:51:43 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 16:37:11 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a></p>
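<p dir="auto">Look at the screenshot: with 27B oQ4 MTP, prompt processing (pp) has reached 1131.8 and token generation (tg) still holds at 17.3.</p>
<p dir="auto">However, I asked an AI, and it said SpecPrefill is not suited to multi-turn conversations because the output degrades. It only suits one-shot analysis of long text and conversations of a few turns.</p>
<p dir="auto">So for AI agents it is still not much use. For my LLM wiki it helps a bit, but that is a standalone analysis job where I can simply wait a while, so speed is not urgent. This technique feels marginal.</p>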
]]></description><link>https://lcz.me/post/2676</link><guid isPermaLink="true">https://lcz.me/post/2676</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 16:37:11 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 16:31:18 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> Then let's try to rescue it. Post some data; that would be really meaningful.</p>
]]></description><link>https://lcz.me/post/2673</link><guid isPermaLink="true">https://lcz.me/post/2673</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 16:31:18 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 16:01:49 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a></p>
<p dir="auto">Wow, I think this can still be rescued. I just enabled SpecPrefill in oMLX, adding a qwen3.5 2B Q4 model as the predictor, and PP shot up.</p>
<p dir="auto">This is quite valuable for my LLM wiki work.</p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/6b073e51-b7f1-413e-9200-6910fac0e48e.png" alt="Screenshot 2026-05-19 at 11.45.22 AM.png" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/2666</link><guid isPermaLink="true">https://lcz.me/post/2666</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 16:01:49 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:45:01 GMT]]></title><description><![CDATA[<blockquote>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> <a href="/post/2656">said</a>:</p>
<p dir="auto">Pro 5000 or 6000 would all work.</p>
</blockquote>
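<p dir="auto">I am sensitive to noise, so I have already ruled out the 5000 and 6000. If I were not, I would have bought two of the 9700 cards you recommended. I mainly need compute for LLMs; video is just for fun, not for production.</p>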
]]></description><link>https://lcz.me/post/2664</link><guid isPermaLink="true">https://lcz.me/post/2664</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 15:45:01 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:30:30 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> Sure. The 5090, Pro 5000, or 6000 would all work.</p>
]]></description><link>https://lcz.me/post/2656</link><guid isPermaLink="true">https://lcz.me/post/2656</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 15:30:30 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:14:29 GMT]]></title><description><![CDATA[<p dir="auto">Looks like I have to get a 5090 and find some way to reduce the noise, whatever it takes <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f61e.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--disappointed" style="height:23px;width:auto;vertical-align:middle" title=":(" alt="😞" /></p>
<p dir="auto">I will get started once I am back home.</p>
]]></description><link>https://lcz.me/post/2652</link><guid isPermaLink="true">https://lcz.me/post/2652</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 15:14:29 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:13:26 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> Actually, someone on a YouTube channel said it is pretty much useless: slow.</p>
]]></description><link>https://lcz.me/post/2651</link><guid isPermaLink="true">https://lcz.me/post/2651</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 15:13:26 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:12:33 GMT]]></title><description><![CDATA[<p dir="auto">Right, so the conclusion stands: the M5 Pro cannot handle 27B, but the Max might.</p>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a>  Calling on anyone with an M5 Max to test this. It might bring some hope for LLMs. <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f642.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--slightly_smiling_face" style="height:23px;width:auto;vertical-align:middle" title=":)" alt="🙂" /></p>
]]></description><link>https://lcz.me/post/2648</link><guid isPermaLink="true">https://lcz.me/post/2648</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 15:12:33 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:10:14 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> So it is still pretty much useless.</p>
]]></description><link>https://lcz.me/post/2647</link><guid isPermaLink="true">https://lcz.me/post/2647</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 15:10:14 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 15:09:48 GMT]]></title><description><![CDATA[<p dir="auto">oMLX released 0.3.9rc1 today with native MTP support. I retested 27B oQ4 with MTP enabled: decode improved noticeably, while PP stayed basically the same.</p>
<p dir="auto"><img src="https://upload.lcz.me/uploads/ad072ce5-9cd4-438d-983f-7117ed48cfb9.png" alt="Screenshot 2026-05-19 at 11.03.01 AM.png" class=" img-fluid img-markdown" /></p>
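<p dir="auto">A minimal sketch of why MTP helps decode but not prefill: with speculative decoding, one verification pass can emit more than one token, whereas prefill already processes the whole prompt in parallel. The formula below is the usual idealized estimate, assuming every drafted token is accepted independently with the same probability. It is not oMLX's actual implementation, and the acceptance rates in the loop are made-up values, not measurements.</p>

```python
def expected_tokens_per_step(acceptance_rate: float, draft_tokens: int) -> float:
    """Expected tokens emitted per verification pass.

    Assumes each drafted token is accepted independently with the same
    probability and that drafting stops at the first rejection. The target
    model always contributes one token of its own, hence the i = 0 term.
    """
    return sum(acceptance_rate ** i for i in range(draft_tokens + 1))


# Hypothetical acceptance rates with a single drafted token per step.
for rate in (0.5, 0.7, 0.9):
    tokens = expected_tokens_per_step(rate, draft_tokens=1)
    print(f"acceptance {rate:.1f}: {tokens:.2f} tokens per pass")
```

<p dir="auto">The real speedup is smaller than this ratio, because each pass also pays for running the draft head and verifying its output.</p>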
]]></description><link>https://lcz.me/post/2646</link><guid isPermaLink="true">https://lcz.me/post/2646</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Tue, 19 May 2026 15:09:48 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 14:47:48 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/82445418" aria-label="Profile: 82445418">@<bdi>82445418</bdi></a> That is simply impossible.</p>
]]></description><link>https://lcz.me/post/2634</link><guid isPermaLink="true">https://lcz.me/post/2634</guid><dc:creator><![CDATA[Vittoria Veloso]]></dc:creator><pubDate>Tue, 19 May 2026 14:47:48 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 14:47:13 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> You should really use oMLX; it is optimized specifically for Mac. LM Studio and Ollama really are not great.</p>
]]></description><link>https://lcz.me/post/2627</link><guid isPermaLink="true">https://lcz.me/post/2627</guid><dc:creator><![CDATA[Vittoria Veloso]]></dc:creator><pubDate>Tue, 19 May 2026 14:47:13 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 08:23:46 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/janebo" aria-label="Profile: janebo">@<bdi>janebo</bdi></a> So many experts on this forum use the XTX. Qwen3.6 27B alone makes it worth the price, and it runs ComfyUI too. Just copy their setups.</p>
]]></description><link>https://lcz.me/post/2557</link><guid isPermaLink="true">https://lcz.me/post/2557</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Tue, 19 May 2026 08:23:46 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 08:22:32 GMT]]></title><description><![CDATA[<p dir="auto">This completely killed my idea of running large models on a Mac. I was still considering one, and this saves me nine thousand. I will buy a 7900 XTX instead. Being able to run Qwen for around five thousand is great value!</p>
]]></description><link>https://lcz.me/post/2555</link><guid isPermaLink="true">https://lcz.me/post/2555</guid><dc:creator><![CDATA[janebo]]></dc:creator><pubDate>Tue, 19 May 2026 08:22:32 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 05:21:48 GMT]]></title><description><![CDATA[<p dir="auto">Macs are better suited to online compute. Build an API aggregation router and make heavy use of free APIs. A Mac is definitely not good enough for production work, but cutting office costs is achievable. Recommended for the chat ecosystem.</p>
]]></description><link>https://lcz.me/post/2533</link><guid isPermaLink="true">https://lcz.me/post/2533</guid><dc:creator><![CDATA[williamlouis]]></dc:creator><pubDate>Tue, 19 May 2026 05:21:48 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Tue, 19 May 2026 00:34:10 GMT]]></title><description><![CDATA[<p dir="auto">This completely killed my idea of running local video generation on a Mac.</p>
]]></description><link>https://lcz.me/post/2500</link><guid isPermaLink="true">https://lcz.me/post/2500</guid><dc:creator><![CDATA[82445418]]></dc:creator><pubDate>Tue, 19 May 2026 00:34:10 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 14:59:35 GMT]]></title><description><![CDATA[<p dir="auto">Same here. I like how quiet and elegant Apple hardware is, and the seamless integration across the whole ecosystem. But you cannot have that and compute power too. <img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f61e.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--disappointed" style="height:23px;width:auto;vertical-align:middle" title=":(" alt="😞" /></p>
]]></description><link>https://lcz.me/post/2431</link><guid isPermaLink="true">https://lcz.me/post/2431</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Mon, 18 May 2026 14:59:35 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 14:56:13 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> No worries, your test data gives me more ammunition the next time I bash Apple<img src="https://lcz.me/assets/plugins/nodebb-plugin-emoji/emoji/android/1f602.png?v=d348ca29232" class="not-responsive emoji emoji-android emoji--joy" style="height:23px;width:auto;vertical-align:middle" title="😂" alt="😂" />. Honestly, I hope Apple steps up. I really like the Studio form factor; it is just hopeless at running ComfyUI, otherwise I would buy one.</p>
]]></description><link>https://lcz.me/post/2426</link><guid isPermaLink="true">https://lcz.me/post/2426</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Mon, 18 May 2026 14:56:13 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 14:23:34 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/eddie-hk" aria-label="Profile: eddie-hk">@<bdi>eddie-hk</bdi></a></p>
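<p dir="auto">The Studio is certainly a better experience; the Air suits being a daily driver because it is light and portable.</p>
<p dir="auto">The GPU still matters a lot, because the prefill stage is mainly compute-bound. Running 27B on my machine, prefill is a little over 300 tok/s, which is quite slow. With thinking enabled, the first useful token often takes more than a minute, which is a poor experience.</p>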
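<p dir="auto">To make the one-minute wait concrete, here is a rough estimate of the time before the first useful token appears: prefill time plus the time spent generating hidden thinking tokens. The 300 tok/s prefill is rounded from the figure above; the 15 tok/s decode speed, the prompt length, and the thinking length are all assumed example values, and the function name is just for illustration.</p>

```python
def seconds_to_first_useful_token(prompt_tokens: int,
                                  prefill_tok_s: float,
                                  thinking_tokens: int = 0,
                                  decode_tok_s: float = 1.0) -> float:
    """Prefill time plus the time spent generating hidden thinking tokens."""
    prefill_seconds = prompt_tokens / prefill_tok_s
    thinking_seconds = thinking_tokens / decode_tok_s
    return prefill_seconds + thinking_seconds


# Assumed workload: a 12K-token prompt and 450 thinking tokens.
wait = seconds_to_first_useful_token(
    prompt_tokens=12_000, prefill_tok_s=300, thinking_tokens=450, decode_tok_s=15
)
print(f"{wait:.0f} s")  # prints "70 s"
```

<p dir="auto">In this example 40 of the 70 seconds are prefill, so a faster GPU shortens the wait more than a faster decode would.</p>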
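<p dir="auto">Memory, in my opinion, matters less. 64GB is enough for a dedicated machine (if you also use it for browsing, office work, and video editing, it is not). A MoE of around 70B will not be smarter than a dense model of around 30B, unless you need broad knowledge, for example for writing.</p>
<p dir="auto">For Macs, I still agree with <a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a>: only the Max and above are competitive, and even then only for LLMs and images. Forget about video; it is fine for playing around but definitely not for production.</p>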
]]></description><link>https://lcz.me/post/2412</link><guid isPermaLink="true">https://lcz.me/post/2412</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Mon, 18 May 2026 14:23:34 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 14:05:18 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/vittoria-veloso" aria-label="Profile: Vittoria-Veloso">@<bdi>Vittoria-Veloso</bdi></a> It is certainly barely usable, but prefill is too slow, and with decode in the teens of tok/s it feels bad.</p>
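]]></description><link>https://lcz.me/post/2411</link><guid isPermaLink="true">https://lcz.me/post/2411</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Mon, 18 May 2026 14:05:18 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 14:04:02 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/terry" aria-label="Profile: terry">@<bdi>terry</bdi></a> For chat alone it is basically enough. But when connected to Hermes, prefill is only a little over 300 tok/s, which takes too long, and waiting for responses gets painful.</p>
<p dir="auto">It is not cheap, a bit over 5,000 CAD, but there was no way around it. This is my daily driver, and an Air would normally have been enough. But I am returning home soon, and a desktop would be hard to bring back, so I splurged on this configuration. I did not dare buy the Max, partly out of concern about cooling and partly for fear it would turn out to be of marginal value.</p>
<p dir="auto">Now, though, it looks like the M5 Max may well be able to run the dense 27B; after all, its top-spec GPU and bandwidth are both twice mine.</p>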
]]></description><link>https://lcz.me/post/2409</link><guid isPermaLink="true">https://lcz.me/post/2409</guid><dc:creator><![CDATA[Tony Wang]]></dc:creator><pubDate>Mon, 18 May 2026 14:04:02 GMT</pubDate></item><item><title><![CDATA[Reply to M5 Pro 64GB LLM performance reference. on Mon, 18 May 2026 08:22:54 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/tony-wang" aria-label="Profile: Tony-Wang">@<bdi>Tony-Wang</bdi></a> It works. My M5 with 32GB runs Qwen 3.6 27B with context up to 90K, though I use oMLX rather than LM Studio. It still manages a dozen or so tok/s. With your configuration you could run Qwen 3.6 27B at 8-bit (27GB) with context up to around 96K.</p>
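<p dir="auto">The 27GB figure follows from simple arithmetic: parameters times bits per weight, divided by eight. A small sketch, with hypothetical function names, that also shows how much memory would remain for the KV cache on a 64GB machine. The 16GB reserve for macOS and other apps is a guess, not a measurement, and real quantized files run somewhat larger than this estimate because they also store scales and metadata.</p>

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def headroom_gb(total_memory_gb: float, weights_gb: float, reserved_gb: float) -> float:
    """Memory left for KV cache and activations after weights and a system reserve."""
    return total_memory_gb - weights_gb - reserved_gb


weights_8bit = weight_footprint_gb(27, 8)  # 27.0 GB
weights_4bit = weight_footprint_gb(27, 4)  # 13.5 GB

# The 16 GB reserve is an assumed value for the OS and other apps.
print(headroom_gb(64, weights_8bit, reserved_gb=16))
print(headroom_gb(64, weights_4bit, reserved_gb=16))
```

<p dir="auto">How much context fits in that headroom depends on the model's KV-cache size per token, which this sketch does not try to estimate.</p>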
]]></description><link>https://lcz.me/post/2331</link><guid isPermaLink="true">https://lcz.me/post/2331</guid><dc:creator><![CDATA[Vittoria Veloso]]></dc:creator><pubDate>Mon, 18 May 2026 08:22:54 GMT</pubDate></item></channel></rss>