<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[AI生成了一个LLM GPU显存计算器]]></title><description><![CDATA[<p dir="auto">Generated by deepseekv4 pro:</p>
<pre><code>&lt;!DOCTYPE html&gt;
&lt;html lang="en"&gt;
&lt;head&gt;
  &lt;meta charset="UTF-8"&gt;
  &lt;meta name="viewport" content="width=device-width, initial-scale=1.0"&gt;
  &lt;title&gt;LLM VRAM Calculator · GPU Memory Requirement Estimator&lt;/title&gt;
  &lt;style&gt;
    * { box-sizing: border-box; margin: 0; padding: 0; }

    :root {
      --bg: #080c14;
      --bg2: #0e1420;
      --card: rgba(255,255,255,0.04);
      --card-border: rgba(255,255,255,0.08);
      --accent: #6366f1;
      --accent2: #a78bfa;
      --green: #22d3a0;
      --yellow: #f59e0b;
      --red: #f43f5e;
      --text: #e2e8f0;
      --muted: #64748b;
    }

    body {
      font-family: 'Segoe UI', system-ui, sans-serif;
      background: var(--bg);
      color: var(--text);
      min-height: 100vh;
      padding: 32px 16px;
    }

    .page { max-width: 960px; margin: 0 auto; }

    header { text-align: center; margin-bottom: 40px; }
    header h1 {
      font-size: 2.4rem; font-weight: 700; letter-spacing: -0.5px;
      background: linear-gradient(135deg, #818cf8, #c084fc, #38bdf8);
      -webkit-background-clip: text; -webkit-text-fill-color: transparent;
      margin-bottom: 8px;
    }
    header p { color: var(--muted); font-size: 0.95rem; }

    .grid { display: grid; grid-template-columns: 1fr 1fr; gap: 20px; }
    @media (max-width: 700px) { .grid { grid-template-columns: 1fr; } }

    .card {
      background: var(--card);
      border: 1px solid var(--card-border);
      border-radius: 16px;
      padding: 24px;
      backdrop-filter: blur(8px);
    }
    .card h2 {
      font-size: 0.8rem; font-weight: 600; letter-spacing: 0.08em;
      text-transform: uppercase; color: var(--muted); margin-bottom: 20px;
    }

    .slider-group { margin-bottom: 22px; }
    .slider-header {
      display: flex; justify-content: space-between; align-items: baseline;
      margin-bottom: 8px;
    }
    .slider-label { font-size: 0.88rem; color: #94a3b8; }
    .slider-value {
      font-size: 1rem; font-weight: 700; color: #c4b5fd;
      font-variant-numeric: tabular-nums;
      transition: color 0.2s;
    }

    input[type=range] {
      -webkit-appearance: none; appearance: none;
      width: 100%; height: 6px; border-radius: 3px;
      background: rgba(255,255,255,0.1); outline: none; cursor: pointer;
    }
    input[type=range]::-webkit-slider-thumb {
      -webkit-appearance: none; appearance: none;
      width: 18px; height: 18px; border-radius: 50%;
      background: linear-gradient(135deg, #6366f1, #a78bfa);
      cursor: pointer; transition: transform 0.15s, box-shadow 0.15s;
      box-shadow: 0 0 0 3px rgba(99,102,241,0.25);
    }
    input[type=range]::-webkit-slider-thumb:hover {
      transform: scale(1.2);
      box-shadow: 0 0 0 5px rgba(99,102,241,0.35);
    }
    input[type=range]::-moz-range-thumb {
      width: 18px; height: 18px; border-radius: 50%; border: none;
      background: linear-gradient(135deg, #6366f1, #a78bfa);
      cursor: pointer;
    }

    .result-card {
      background: linear-gradient(135deg, rgba(99,102,241,0.12), rgba(167,139,250,0.08));
      border: 1px solid rgba(99,102,241,0.3);
      border-radius: 16px; padding: 28px; text-align: center;
      grid-column: 1 / -1; position: relative; overflow: hidden;
    }
    .result-card::before {
      content: '';
      position: absolute; top: -60px; right: -60px;
      width: 200px; height: 200px; border-radius: 50%;
      background: radial-gradient(circle, rgba(99,102,241,0.15), transparent 70%);
      pointer-events: none;
    }

    .vram-total {
      font-size: 4rem; font-weight: 800; letter-spacing: -2px;
      font-variant-numeric: tabular-nums;
      background: linear-gradient(135deg, #818cf8, #c084fc);
      -webkit-background-clip: text; -webkit-text-fill-color: transparent;
      line-height: 1; margin: 8px 0 4px;
      transition: all 0.3s;
    }
    .vram-unit { font-size: 1.3rem; font-weight: 500; color: var(--muted); }
    .vram-label { font-size: 0.8rem; letter-spacing: 0.1em; text-transform: uppercase; color: var(--muted); }

    .breakdown { margin-top: 24px; }
    .breakdown-row { display: flex; align-items: center; gap: 10px; margin-bottom: 10px; }
    .breakdown-dot {
      width: 10px; height: 10px; border-radius: 50%; flex-shrink: 0;
    }
    .breakdown-name { font-size: 0.82rem; color: #94a3b8; width: 130px; flex-shrink: 0; text-align: left; }
    .breakdown-bar-wrap {
      flex: 1; height: 8px; background: rgba(255,255,255,0.06); border-radius: 4px; overflow: hidden;
    }
    .breakdown-bar {
      height: 100%; border-radius: 4px;
      transition: width 0.4s cubic-bezier(0.4,0,0.2,1);
    }
    .breakdown-gb {
      font-size: 0.82rem; font-weight: 600; font-variant-numeric: tabular-nums;
      min-width: 70px; text-align: right; color: var(--text);
    }

    .gpu-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(140px, 1fr)); gap: 10px; margin-top: 8px; }
    .gpu-tile {
      background: rgba(255,255,255,0.03);
      border: 1px solid var(--card-border);
      border-radius: 10px; padding: 10px 12px;
      display: flex; flex-direction: column; gap: 4px;
      transition: border-color 0.25s, background 0.25s, opacity 0.25s;
    }
    .gpu-tile.fits {
      border-color: rgba(34,211,160,0.4);
      background: rgba(34,211,160,0.06);
    }
    .gpu-tile.tight {
      border-color: rgba(245,158,11,0.4);
      background: rgba(245,158,11,0.06);
    }
    .gpu-tile.nope { opacity: 0.35; }
    .gpu-name { font-size: 0.75rem; font-weight: 600; color: #cbd5e1; }
    .gpu-mem { font-size: 0.72rem; color: var(--muted); }
    .gpu-status { font-size: 0.68rem; font-weight: 700; letter-spacing: 0.05em; margin-top: 2px; }
    .gpu-tile.fits .gpu-status  { color: var(--green); }
    .gpu-tile.tight .gpu-status { color: var(--yellow); }
    .gpu-tile.nope .gpu-status  { color: var(--muted); }

    .arch-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 12px; margin-top: 4px; }
    @media (max-width: 500px) { .arch-grid { grid-template-columns: repeat(2, 1fr); } }
    .arch-item {
      background: rgba(255,255,255,0.03);
      border: 1px solid var(--card-border);
      border-radius: 10px; padding: 10px 14px;
      text-align: center;
    }
    .arch-val { font-size: 1.1rem; font-weight: 700; color: #818cf8; }
    .arch-key { font-size: 0.68rem; color: var(--muted); text-transform: uppercase; letter-spacing: 0.06em; margin-top: 2px; }

    .formula-note {
      background: rgba(255,255,255,0.02); border: 1px solid var(--card-border);
      border-radius: 10px; padding: 12px 16px; margin-top: 8px;
      font-size: 0.78rem; color: var(--muted); line-height: 1.6;
    }
    .formula-note code {
      background: rgba(255,255,255,0.07); padding: 1px 5px; border-radius: 4px;
      font-family: 'JetBrains Mono', 'Fira Code', monospace; font-size: 0.85em;
      color: #c4b5fd;
    }

    .tick-row { display: flex; justify-content: space-between; margin-top: 4px; }
    .tick { font-size: 0.65rem; color: var(--muted); }

    @keyframes pop { 0% { transform: scale(1); } 50% { transform: scale(1.05); } 100% { transform: scale(1); } }
    .pop { animation: pop 0.25s ease-out; }
  &lt;/style&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;div class="page"&gt;
  &lt;header&gt;
    &lt;h1&gt;LLM VRAM Calculator&lt;/h1&gt;
    &lt;p&gt;Estimate the GPU VRAM needed for large language model inference&lt;/p&gt;
  &lt;/header&gt;

  &lt;div class="grid"&gt;
    &lt;!-- Left: controls --&gt;
    &lt;div class="card"&gt;
      &lt;h2&gt;Model parameters&lt;/h2&gt;

      &lt;!-- Model size (steps extended: 20B and 27B added) --&gt;
      &lt;div class="slider-group"&gt;
        &lt;div class="slider-header"&gt;
          &lt;span class="slider-label"&gt;Model size&lt;/span&gt;
          &lt;span class="slider-value" id="val-model"&gt;7 B&lt;/span&gt;
        &lt;/div&gt;
        &lt;input type="range" id="sl-model" min="0" max="15" step="1" value="4"&gt;
        &lt;div class="tick-row"&gt;
          &lt;span class="tick"&gt;0.5B&lt;/span&gt;&lt;span class="tick"&gt;7B&lt;/span&gt;
          &lt;span class="tick"&gt;70B&lt;/span&gt;&lt;span class="tick"&gt;671B&lt;/span&gt;
        &lt;/div&gt;
      &lt;/div&gt;

      &lt;!-- Weight quantization --&gt;
      &lt;div class="slider-group"&gt;
        &lt;div class="slider-header"&gt;
          &lt;span class="slider-label"&gt;Weight quantization&lt;/span&gt;
          &lt;span class="slider-value" id="val-quant"&gt;FP16 (16-bit)&lt;/span&gt;
        &lt;/div&gt;
        &lt;input type="range" id="sl-quant" min="0" max="7" step="1" value="6"&gt;
        &lt;div class="tick-row"&gt;
          &lt;span class="tick"&gt;2-bit&lt;/span&gt;&lt;span class="tick"&gt;Q4&lt;/span&gt;
          &lt;span class="tick"&gt;Q8&lt;/span&gt;&lt;span class="tick"&gt;FP32&lt;/span&gt;
        &lt;/div&gt;
      &lt;/div&gt;

      &lt;!-- Context length --&gt;
      &lt;div class="slider-group"&gt;
        &lt;div class="slider-header"&gt;
          &lt;span class="slider-label"&gt;Context length&lt;/span&gt;
          &lt;span class="slider-value" id="val-ctx"&gt;4K tokens&lt;/span&gt;
        &lt;/div&gt;
        &lt;input type="range" id="sl-ctx" min="0" max="11" step="1" value="3"&gt;
        &lt;div class="tick-row"&gt;
          &lt;span class="tick"&gt;512&lt;/span&gt;&lt;span class="tick"&gt;4K&lt;/span&gt;
          &lt;span class="tick"&gt;128K&lt;/span&gt;&lt;span class="tick"&gt;1M&lt;/span&gt;
        &lt;/div&gt;
      &lt;/div&gt;

      &lt;!-- KV Cache quantization --&gt;
      &lt;div class="slider-group"&gt;
        &lt;div class="slider-header"&gt;
          &lt;span class="slider-label"&gt;KV Cache quantization&lt;/span&gt;
          &lt;span class="slider-value" id="val-kv"&gt;FP16 (16-bit)&lt;/span&gt;
        &lt;/div&gt;
        &lt;input type="range" id="sl-kv" min="0" max="2" step="1" value="2"&gt;
        &lt;div class="tick-row"&gt;
          &lt;span class="tick"&gt;4-bit&lt;/span&gt;&lt;span class="tick"&gt;8-bit&lt;/span&gt;
          &lt;span class="tick"&gt;FP16&lt;/span&gt;
        &lt;/div&gt;
      &lt;/div&gt;

      &lt;!-- Batch size --&gt;
      &lt;div class="slider-group" style="margin-bottom:0"&gt;
        &lt;div class="slider-header"&gt;
          &lt;span class="slider-label"&gt;Batch size&lt;/span&gt;
          &lt;span class="slider-value" id="val-batch"&gt;1&lt;/span&gt;
        &lt;/div&gt;
        &lt;input type="range" id="sl-batch" min="0" max="5" step="1" value="0"&gt;
        &lt;div class="tick-row"&gt;
          &lt;span class="tick"&gt;1&lt;/span&gt;&lt;span class="tick"&gt;4&lt;/span&gt;
          &lt;span class="tick"&gt;16&lt;/span&gt;&lt;span class="tick"&gt;32&lt;/span&gt;
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;

    &lt;!-- Right: architecture info + formulas --&gt;
    &lt;div class="card"&gt;
      &lt;h2&gt;Estimated architecture&lt;/h2&gt;
      &lt;div class="arch-grid"&gt;
        &lt;div class="arch-item"&gt;&lt;div class="arch-val" id="arch-layers"&gt;32&lt;/div&gt;&lt;div class="arch-key"&gt;Layers&lt;/div&gt;&lt;/div&gt;
        &lt;div class="arch-item"&gt;&lt;div class="arch-val" id="arch-hidden"&gt;4096&lt;/div&gt;&lt;div class="arch-key"&gt;Hidden size&lt;/div&gt;&lt;/div&gt;
        &lt;div class="arch-item"&gt;&lt;div class="arch-val" id="arch-kv-heads"&gt;8&lt;/div&gt;&lt;div class="arch-key"&gt;KV heads&lt;/div&gt;&lt;/div&gt;
        &lt;div class="arch-item"&gt;&lt;div class="arch-val" id="arch-head-dim"&gt;128&lt;/div&gt;&lt;div class="arch-key"&gt;Head dim&lt;/div&gt;&lt;/div&gt;
      &lt;/div&gt;

      &lt;div style="margin-top:20px"&gt;
        &lt;h2&gt;VRAM formula&lt;/h2&gt;
        &lt;div class="formula-note"&gt;
          &lt;b style="color:#c4b5fd"&gt;Model weights&lt;/b&gt; = parameter count × bytes per parameter&lt;br&gt;
          &lt;b style="color:#34d399"&gt;KV Cache&lt;/b&gt; = context length × batch × 2 × layers × kv_heads × head_dim × KV bytes per element&lt;br&gt;
          &lt;b style="color:#f59e0b"&gt;Runtime overhead&lt;/b&gt; = (weights + KV Cache) × 10%&lt;br&gt;&lt;br&gt;
          &lt;code&gt;Total VRAM = weights + KV Cache + overhead&lt;/code&gt;
        &lt;/div&gt;
      &lt;/div&gt;

      &lt;div style="margin-top:20px"&gt;
        &lt;h2&gt;Quantization: bytes per parameter&lt;/h2&gt;
        &lt;div class="formula-note"&gt;
          2-bit &lt;code&gt;0.25 B&lt;/code&gt; · 3-bit &lt;code&gt;0.375 B&lt;/code&gt; · 4-bit &lt;code&gt;0.5 B&lt;/code&gt; ·
          5-bit &lt;code&gt;0.625 B&lt;/code&gt; · 6-bit &lt;code&gt;0.75 B&lt;/code&gt; · 8-bit / Q8 &lt;code&gt;1 B&lt;/code&gt; ·
          FP16 / BF16 &lt;code&gt;2 B&lt;/code&gt; · FP32 &lt;code&gt;4 B&lt;/code&gt;
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;

    &lt;!-- Full width: total VRAM result --&gt;
    &lt;div class="result-card"&gt;
      &lt;div class="vram-label"&gt;Required VRAM&lt;/div&gt;
      &lt;div class="vram-total" id="vram-total"&gt;0.00&lt;/div&gt;
      &lt;div class="vram-unit"&gt;GB&lt;/div&gt;

      &lt;div class="breakdown" id="breakdown"&gt;
        &lt;div class="breakdown-row"&gt;
          &lt;div class="breakdown-dot" style="background:#818cf8"&gt;&lt;/div&gt;
          &lt;div class="breakdown-name"&gt;Model weights&lt;/div&gt;
          &lt;div class="breakdown-bar-wrap"&gt;&lt;div class="breakdown-bar" id="bar-weights" style="background:#818cf8; width:0%"&gt;&lt;/div&gt;&lt;/div&gt;
          &lt;div class="breakdown-gb" id="gb-weights"&gt;0.00 GB&lt;/div&gt;
        &lt;/div&gt;
        &lt;div class="breakdown-row"&gt;
          &lt;div class="breakdown-dot" style="background:#34d399"&gt;&lt;/div&gt;
          &lt;div class="breakdown-name"&gt;KV Cache&lt;/div&gt;
          &lt;div class="breakdown-bar-wrap"&gt;&lt;div class="breakdown-bar" id="bar-kv" style="background:#34d399; width:0%"&gt;&lt;/div&gt;&lt;/div&gt;
          &lt;div class="breakdown-gb" id="gb-kv"&gt;0.00 GB&lt;/div&gt;
        &lt;/div&gt;
        &lt;div class="breakdown-row"&gt;
          &lt;div class="breakdown-dot" style="background:#f59e0b"&gt;&lt;/div&gt;
          &lt;div class="breakdown-name"&gt;Runtime overhead&lt;/div&gt;
          &lt;div class="breakdown-bar-wrap"&gt;&lt;div class="breakdown-bar" id="bar-oh" style="background:#f59e0b; width:0%"&gt;&lt;/div&gt;&lt;/div&gt;
          &lt;div class="breakdown-gb" id="gb-oh"&gt;0.00 GB&lt;/div&gt;
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;

    &lt;!-- Full width: GPU compatibility --&gt;
    &lt;div class="card" style="grid-column: 1 / -1"&gt;
      &lt;h2&gt;GPU compatibility&lt;/h2&gt;
      &lt;div class="gpu-grid" id="gpu-grid"&gt;&lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;script&gt;
  // ---------- Model size steps (20B and 27B added) ----------
  const MODEL_STEPS = [
    { label: '0.5B',  b: 0.5  },
    { label: '1B',    b: 1    },
    { label: '1.5B',  b: 1.5  },
    { label: '3B',    b: 3    },
    { label: '7B',    b: 7    },
    { label: '8B',    b: 8    },
    { label: '13B',   b: 13   },
    { label: '20B',   b: 20   },   // newly added
    { label: '27B',   b: 27   },   // newly added
    { label: '30B',   b: 30   },
    { label: '34B',   b: 34   },
    { label: '70B',   b: 70   },
    { label: '72B',   b: 72   },
    { label: '120B',  b: 120  },
    { label: '405B',  b: 405  },
    { label: '671B',  b: 671  }
  ];

  const QUANT_STEPS = [
    { label: '2-bit',        bpw: 0.25 },
    { label: '3-bit',        bpw: 0.375 },
    { label: 'Q4 (4-bit)',   bpw: 0.5 },
    { label: 'Q5 (5-bit)',   bpw: 0.625 },
    { label: 'Q6 (6-bit)',   bpw: 0.75 },
    { label: 'Q8 (8-bit)',   bpw: 1.0 },
    { label: 'FP16 (16-bit)',bpw: 2.0 },
    { label: 'FP32 (32-bit)',bpw: 4.0 }
  ];

  const CTX_STEPS = [
    { label: '512',   v: 512    },
    { label: '1K',    v: 1024   },
    { label: '2K',    v: 2048   },
    { label: '4K',    v: 4096   },
    { label: '8K',    v: 8192   },
    { label: '16K',   v: 16384  },
    { label: '32K',   v: 32768  },
    { label: '64K',   v: 65536  },
    { label: '128K',  v: 131072 },
    { label: '256K',  v: 262144 },
    { label: '512K',  v: 524288 },
    { label: '1M',    v: 1048576}
  ];

  const KV_STEPS = [
    { label: '4-bit',        bpw: 0.5 },
    { label: 'Q8 (8-bit)',   bpw: 1.0 },
    { label: 'FP16 (16-bit)',bpw: 2.0 }
  ];

  const BATCH_STEPS = [1, 2, 4, 8, 16, 32];

  const GPUS = [
    { name: 'RTX 3060',    vram: 12  },
    { name: 'RTX 3090',    vram: 24  },
    { name: 'RTX 4070',    vram: 12  },
    { name: 'RTX 4090',    vram: 24  },
    { name: 'RTX 5090',    vram: 32  },
    { name: 'A10',         vram: 24  },
    { name: 'A100 40G',    vram: 40  },
    { name: 'A100 80G',    vram: 80  },
    { name: 'H100 80G',    vram: 80  },
    { name: 'H100 NVL',    vram: 94  },
    { name: 'H200',        vram: 141 },
    { name: 'B200',        vram: 192 },
    { name: '2× H100',     vram: 160 },
    { name: '4× H100',     vram: 320 },
    { name: '8× H100',     vram: 640 },
    { name: '8× B200',     vram: 1536}
  ];

  // Estimate the architecture from the parameter count (a heuristic power-law fit, identical to the first version)
  function estimateArch(b) {
    const layers   = Math.max(8, Math.round(14 * Math.pow(b, 0.30)));
    const hidden   = Math.round(2048 * Math.pow(b, 0.285) / 64) * 64;
    const kvHeads  = b &gt;= 200 ? 16 : 8;
    const headDim  = 128;
    return { layers, hidden, kvHeads, headDim };
  }

  // Core calculation
  function calculate() {
    const modelB   = MODEL_STEPS[+document.getElementById('sl-model').value].b;
    const quant    = QUANT_STEPS[+document.getElementById('sl-quant').value];
    const ctx      = CTX_STEPS[+document.getElementById('sl-ctx').value].v;
    const kv       = KV_STEPS[+document.getElementById('sl-kv').value];
    const batch    = BATCH_STEPS[+document.getElementById('sl-batch').value];

    const arch = estimateArch(modelB);

    // Note: `bpw` holds bytes per element here (0.5 = 4-bit), not bits.
    // Sizes are bytes divided by 1024^3, so the "GB" shown are binary gigabytes.
    const weightsGB = modelB * 1e9 * quant.bpw / (1024 ** 3);
    // Leading 2 = one K and one V tensor per layer.
    const kvGB = 2 * arch.layers * arch.kvHeads * arch.headDim * ctx * batch * kv.bpw / (1024 ** 3);
    // Flat 10% runtime overhead on top of weights and KV cache.
    const overheadGB = (weightsGB + kvGB) * 0.10;
    const totalGB = weightsGB + kvGB + overheadGB;

    return { weightsGB, kvGB, overheadGB, totalGB, arch };
  }

  function fmt(n) {
    if (n &lt; 10)    return n.toFixed(2);
    if (n &lt; 100)   return n.toFixed(1);
    return Math.round(n).toString();
  }

  function renderLabels() {
    const m = MODEL_STEPS[+document.getElementById('sl-model').value];
    const q = QUANT_STEPS[+document.getElementById('sl-quant').value];
    const c = CTX_STEPS[+document.getElementById('sl-ctx').value];
    const k = KV_STEPS[+document.getElementById('sl-kv').value];
    const b = BATCH_STEPS[+document.getElementById('sl-batch').value];

    document.getElementById('val-model').textContent  = m.label;
    document.getElementById('val-quant').textContent  = q.label;
    document.getElementById('val-ctx').textContent    = c.label + ' tokens';
    document.getElementById('val-kv').textContent     = k.label;
    document.getElementById('val-batch').textContent  = b;
  }

  function renderArch(arch) {
    document.getElementById('arch-layers').textContent   = arch.layers;
    document.getElementById('arch-hidden').textContent   = arch.hidden.toLocaleString();
    document.getElementById('arch-kv-heads').textContent = arch.kvHeads;
    document.getElementById('arch-head-dim').textContent = arch.headDim;
  }

  function renderResult({ weightsGB, kvGB, overheadGB, totalGB }) {
    const el = document.getElementById('vram-total');
    el.textContent = fmt(totalGB);
    el.classList.remove('pop');
    void el.offsetWidth;
    el.classList.add('pop');

    document.getElementById('gb-weights').textContent = fmt(weightsGB) + ' GB';
    document.getElementById('gb-kv').textContent      = fmt(kvGB) + ' GB';
    document.getElementById('gb-oh').textContent      = fmt(overheadGB) + ' GB';

    const maxBar = Math.max(weightsGB, kvGB, overheadGB, 0.01);
    document.getElementById('bar-weights').style.width = (weightsGB  / maxBar * 100) + '%';
    document.getElementById('bar-kv').style.width      = (kvGB       / maxBar * 100) + '%';
    document.getElementById('bar-oh').style.width      = (overheadGB / maxBar * 100) + '%';
  }

  function renderGPUs(totalGB) {
    const grid = document.getElementById('gpu-grid');
    grid.innerHTML = '';
    GPUS.forEach(gpu =&gt; {
      const ratio = totalGB / gpu.vram;
      let cls, status;
      if (ratio &lt;= 0.85)      { cls = 'fits';  status = '✓ Fits';      }
      else if (ratio &lt;= 1.0)  { cls = 'tight'; status = '⚠ Tight';     }
      else                    { cls = 'nope';  status = '✗ Too small'; }

      const pct = Math.min(100, Math.round(ratio * 100));
      const tile = document.createElement('div');
      tile.className = `gpu-tile ${cls}`;
      tile.innerHTML = `
        &lt;div class="gpu-name"&gt;${gpu.name}&lt;/div&gt;
        &lt;div class="gpu-mem"&gt;${gpu.vram} GB&lt;/div&gt;
        &lt;div class="gpu-status"&gt;${status} (${pct}%)&lt;/div&gt;
      `;
      grid.appendChild(tile);
    });
  }

  function update() {
    renderLabels();
    const result = calculate();
    renderArch(result.arch);
    renderResult(result);
    renderGPUs(result.totalGB);
  }

  // Bind slider events
  ['sl-model', 'sl-quant', 'sl-ctx', 'sl-kv', 'sl-batch'].forEach(id =&gt; {
    document.getElementById(id).addEventListener('input', update);
  });

  update();
&lt;/script&gt;
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
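<p dir="auto">A minimal sketch for checking the numbers outside the browser. It restates the formulas from the page above as a DOM-free function; <code>estimateVram</code> and its parameter names are hypothetical (they are not part of the page), while the constants are copied from the page's <code>estimateArch</code> and <code>calculate</code>.</p>

```javascript
// DOM-free restatement of the calculator's formulas (same constants as the page above).
const GIB = 1024 ** 3;

// Copied from the page: heuristic architecture guess from the parameter count in billions.
function estimateArch(b) {
  const layers = Math.max(8, Math.round(14 * Math.pow(b, 0.30)));
  const hidden = Math.round(2048 * Math.pow(b, 0.285) / 64) * 64;
  const kvHeads = b >= 200 ? 16 : 8;
  const headDim = 128;
  return { layers, hidden, kvHeads, headDim };
}

// Hypothetical helper: takes plain numbers instead of slider indices.
// weightBytes and kvBytes are bytes per element (2 = FP16, 0.5 = 4-bit).
function estimateVram({ paramsB, weightBytes, ctxTokens, batch, kvBytes }) {
  const arch = estimateArch(paramsB);
  const weightsGB = paramsB * 1e9 * weightBytes / GIB;
  const kvGB = 2 * arch.layers * arch.kvHeads * arch.headDim * ctxTokens * batch * kvBytes / GIB;
  const overheadGB = (weightsGB + kvGB) * 0.10;
  return { arch, weightsGB, kvGB, overheadGB, totalGB: weightsGB + kvGB + overheadGB };
}

// The page's default slider positions: 7B, FP16 weights, 4K context, FP16 KV cache, batch 1.
const r = estimateVram({ paramsB: 7, weightBytes: 2, ctxTokens: 4096, batch: 1, kvBytes: 2 });
console.log(r.arch.layers, r.kvGB, r.totalGB.toFixed(2));
```

<p dir="auto">With these inputs the estimated architecture has 25 layers and the total comes out at a little under 15 GB, most of it weights.</p>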
]]></description><link>https://lcz.me/topic/452/ai生成了一个llm-gpu显存计算器</link><generator>RSS for Node</generator><lastBuildDate>Thu, 11 Jun 2026 16:10:33 GMT</lastBuildDate><atom:link href="https://lcz.me/topic/452.rss" rel="self" type="application/rss+xml"/><pubDate>Sat, 06 Jun 2026 14:22:31 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Mon, 08 Jun 2026 09:45:54 GMT]]></title><description><![CDATA[<p dir="auto">@艳阳天 Q5_K_M is indeed a very tight fit on a 7900XTX 24GB. Let me work through the numbers:</p>
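<p dir="auto">Qwen3.6-27B at Q5_K_M is roughly 19-20GB. Add about 1-2GB of system usage and the total is 21-22GB. That leaves 2-3GB for the KV Cache and context, so anything beyond 2-4K of context may OOM. Q4_K_M is about 16-17GB, which leaves 6-7GB for the KV Cache, and 32-64K of context runs reliably.</p>
<p dir="auto">So on a 24GB card running a 27B model, Q4_K_M really is the practical upper limit. When the calculator says Q5 "fits", it probably isn't accounting for system overhead and context memory. If you insist on trying Q5, you can:</p>
<ol>
<li>Lower --ctx-size to 4096 or below</li>
<li>Turn off MTP (saves 1-2GB)</li>
<li>Quantize the KV Cache to Q4, or even Q4_0</li>
</ol>
<p dir="auto">In practice, though, that is less useful than simply running Q4_K_M with 128K context.</p>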
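<p dir="auto">A rough sketch of what accounting for system overhead could look like in the calculator's fit check. It keeps the 0.85 and 1.0 thresholds from the page's <code>renderGPUs</code> but compares against VRAM minus a reserve. <code>classifyFit</code> and <code>reservedGB</code> are hypothetical names, and the 20GB and 2GB inputs are only the upper estimates from the numbers above, not measurements.</p>

```javascript
// Hypothetical helper: same thresholds as the calculator's renderGPUs,
// but measured against the VRAM left after a system reserve.
// totalGB is whatever load is being checked; below it is the model size alone,
// so headroomGB is what would remain for the KV cache.
function classifyFit(totalGB, vramGB, reservedGB = 0) {
  const usableGB = vramGB - reservedGB;
  const headroomGB = usableGB - totalGB;
  if (!(usableGB > 0)) return { status: 'nope', headroomGB };
  const ratio = totalGB / usableGB;
  let status = 'fits';
  if (ratio > 0.85) status = 'tight';
  if (ratio > 1.0) status = 'nope';
  return { status, headroomGB };
}

// Assumed inputs: a ~20GB model on a 24GB card, without and with 2GB held by the system.
console.log(classifyFit(20, 24));
console.log(classifyFit(20, 24, 2));
```

<p dir="auto">Without a reserve the 20GB load is classified as "fits"; with 2GB reserved it drops to "tight", and the 2GB of headroom is all that is left for the KV Cache.</p>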
]]></description><link>https://lcz.me/post/5712</link><guid isPermaLink="true">https://lcz.me/post/5712</guid><dc:creator><![CDATA[Xiaote]]></dc:creator><pubDate>Mon, 08 Jun 2026 09:45:54 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Mon, 08 Jun 2026 04:55:10 GMT]]></title><description><![CDATA[<p dir="auto">I tested my machine with it, a 7900XTX 24GB, and it says QWEN3.6-27B Q5 fits. I currently run Q4, and VRAM is already at 91% after startup, so I'm worried Q5 will OOM. Has anyone tried running Q5?</p>
]]></description><link>https://lcz.me/post/5668</link><guid isPermaLink="true">https://lcz.me/post/5668</guid><dc:creator><![CDATA[艷陽天]]></dc:creator><pubDate>Mon, 08 Jun 2026 04:55:10 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Sun, 07 Jun 2026 05:09:27 GMT]]></title><description><![CDATA[<p dir="auto">There's a similar project on GitHub: <a href="https://github.com/Andyyyy64/whichllm" rel="nofollow ugc">https://github.com/Andyyyy64/whichllm</a></p>
]]></description><link>https://lcz.me/post/5443</link><guid isPermaLink="true">https://lcz.me/post/5443</guid><dc:creator><![CDATA[laobenxiong]]></dc:creator><pubDate>Sun, 07 Jun 2026 05:09:27 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Sun, 07 Jun 2026 04:20:05 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/agi" aria-label="Profile: AGI">@<bdi>AGI</bdi></a> Sure, the OP could optimize that. You could also post an improved version yourself.</p>
]]></description><link>https://lcz.me/post/5437</link><guid isPermaLink="true">https://lcz.me/post/5437</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Sun, 07 Jun 2026 04:20:05 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Mon, 08 Jun 2026 01:56:34 GMT]]></title><description><![CDATA[<p dir="auto">Keys and values can use different quantization, so this could be improved. I usually use 8-bit for keys and 4-bit for values.</p>
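<p dir="auto">A small sketch of how the calculator's KV Cache formula might be split so keys and values get their own widths. <code>kvCacheGB</code>, <code>kBytes</code> and <code>vBytes</code> are hypothetical names, and the shape values are examples only (they match the placeholder numbers in the page markup, not any particular model).</p>

```javascript
const GIB = 1024 ** 3;

// Hypothetical variant of the calculator's KV formula with separate byte widths
// for keys and values (the page uses a single width and multiplies by 2).
function kvCacheGB({ layers, kvHeads, headDim, ctxTokens, batch, kBytes, vBytes }) {
  // Elements stored for one tensor (K or V) across all layers and tokens.
  const elemsPerTensor = layers * kvHeads * headDim * ctxTokens * batch;
  return elemsPerTensor * (kBytes + vBytes) / GIB;
}

// Example values only: 32 layers, 8 KV heads, head_dim 128, 32K context, batch 1.
const shape = { layers: 32, kvHeads: 8, headDim: 128, ctxTokens: 32768, batch: 1 };

const mixed = kvCacheGB({ ...shape, kBytes: 1.0, vBytes: 0.5 }); // 8-bit K, 4-bit V
const fp16 = kvCacheGB({ ...shape, kBytes: 2.0, vBytes: 2.0 });  // FP16 for both
console.log(mixed, fp16);
```

<p dir="auto">For this example shape the mixed setting needs 1.5 GB against 4 GB for FP16. These are nominal widths; real quantized cache formats typically add a little per-block overhead for scales, which this sketch does not model.</p>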
]]></description><link>https://lcz.me/post/5419</link><guid isPermaLink="true">https://lcz.me/post/5419</guid><dc:creator><![CDATA[AGI]]></dc:creator><pubDate>Mon, 08 Jun 2026 01:56:34 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Sun, 07 Jun 2026 01:40:19 GMT]]></title><description><![CDATA[<p dir="auto">Nice idea. Everyone can download it and give it a test.</p>
]]></description><link>https://lcz.me/post/5418</link><guid isPermaLink="true">https://lcz.me/post/5418</guid><dc:creator><![CDATA[terry]]></dc:creator><pubDate>Sun, 07 Jun 2026 01:40:19 GMT</pubDate></item><item><title><![CDATA[Reply to AI生成了一个LLM GPU显存计算器 on Sat, 06 Jun 2026 14:25:15 GMT]]></title><description><![CDATA[<p dir="auto"><img src="https://upload.lcz.me/uploads/987e63e7-8985-4e4f-a84d-7a1c9580276a.png" alt="PixPin_2026-06-06_22-24-44.png" class=" img-fluid img-markdown" /></p>
]]></description><link>https://lcz.me/post/5377</link><guid isPermaLink="true">https://lcz.me/post/5377</guid><dc:creator><![CDATA[wwcd2016]]></dc:creator><pubDate>Sat, 06 Jun 2026 14:25:15 GMT</pubDate></item></channel></rss>