oMLX can use the SSD for the KV cache by default.
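Token speed is still limited by memory bandwidth. Even with this much memory, the speed hasn't gone up much. Was this run with oMLX or with LM Studio? oMLX should have some edge, especially on prefill, since it can use the large memory as a buffer and raise the cache hit rate.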
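To put the bandwidth point in numbers, here is a rough back-of-envelope sketch. It assumes decoding is purely bandwidth-bound, so each generated token has to stream the active weights from memory once. The function name and all the figures (800 GB/s, 40 GB of weights, the two memory sizes) are made up for illustration and are not measurements from this thread.

```python
def decode_tokens_per_second_upper_bound(bandwidth_gb_s: float,
                                         active_weights_gb: float) -> float:
    """Rough ceiling on decode speed for a bandwidth-bound model.

    Assumption: every generated token reads all active weights once,
    so tokens/s cannot exceed bandwidth / bytes read per token.
    """
    if bandwidth_gb_s <= 0 or active_weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    return bandwidth_gb_s / active_weights_gb


if __name__ == "__main__":
    # Hypothetical machine: same memory bandwidth, two memory capacities.
    bandwidth_gb_s = 800.0      # assumed figure, not from the thread
    active_weights_gb = 40.0    # assumed size of weights read per token
    for capacity_gb in (128, 512):
        # Capacity never enters the formula, so the ceiling is identical.
        ceiling = decode_tokens_per_second_upper_bound(bandwidth_gb_s,
                                                       active_weights_gb)
        print(f"{capacity_gb} GB RAM: at most ~{ceiling:.0f} tokens/s")
```

Memory capacity does not appear in the estimate at all, which is why more RAM alone does not speed up decoding. Where the extra memory can help is prefill: a bigger cache can hold more already-processed prompt prefixes, so fewer tokens need to be recomputed.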