Is the R9700 really a good pick for a newcomer?
-
$: llama-bench-vulkan -m 'Qwen3.6-27B-UD-Q4_K_XL.gguf'
WARNING: radv is not a conformant Vulkan implementation, testing use only.
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon AI PRO R9700 (RADV GFX1201) (radv) | uma: 0 | fp16: 1 | bf16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 27B Q4_K - Medium       |  16.39 GiB |    26.90 B | Vulkan     |  99 |           pp512 |       1050.13 ± 0.54 |
| qwen35 27B Q4_K - Medium       |  16.39 GiB |    26.90 B | Vulkan     |  99 |           tg128 |         31.26 ± 0.01 |

build: 97895129e (8863)

Run parameters:
llama-server-vulkan -m '/Qwen3.6-27B-UD-Q4_K_XL.gguf' --mmproj '/mmproj-BF16(3).gguf' -np 1 -ngl 99 --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.00 --presence_penalty 0.00 --jinja --chat-template-kwargs '{"preserve_thinking": true}' -ub 2048 -fa 1 --spec-type ngram-mod --spec-ngram-size-n 24 --draft-min 12 --draft-max 48 --host 0.0.0.0 --port 8180

--- Prompt Processing (PPS) Statistics ---
Mean: 549.60 t/s
Median: 519.19 t/s
P95: 936.60 t/s
StdDev: 240.80 (Stability)
Range: 64.18 - 1015.91 t/s

--- Token Generation (Tok/s) Statistics ---
Mean: 28.80 t/s
Median: 28.20 t/s
P95: 45.34 t/s
StdDev: 6.78 (Stability)
Range: 16.49 - 53.63 t/s
Total Tokens Generated: 87840

$:~/Documents/llama_perf$ python3 parse_performance_stats_full.py
== Prompt Processing (PPS) Analysis ==
Effective Avg: 549.60 t/s (Token-Weighted)
Median (P50): 519.19 t/s
Tail (P99): 958.31 t/s
Stability(CV): 43.8% (JITTERY)
Skewness: 0.04 (Symmetric)

== Token Generation (Tok/s) Analysis ==
Effective Avg: 1697.20 t/s (Token-Weighted)
Median (P50): 28.20 t/s
Tail (P99): 51.39 t/s
Stability(CV): 23.5% (JITTERY)
Skewness: 1.40 (Burst Heavy)

It looks at least better than vLLM, but honestly only by a little.
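For anyone who wants to compute the same kind of summary from their own logs, here is a minimal Python sketch. The actual parse_performance_stats_full.py is not shown in this thread, so everything below is an assumption: the function names, the (tokens, seconds) input format, the linear-interpolation percentile, the population standard deviation, and both candidate readings of "Token-Weighted".

```python
import statistics


def percentile(values, q):
    """Percentile with linear interpolation between ranks; q is in [0, 100]."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize(samples):
    """Summarize per-request speeds.

    samples: list of (tokens, seconds) pairs, one pair per request.
    This input format is hypothetical; adapt it to whatever your logs contain.
    """
    rates = [tokens / seconds for tokens, seconds in samples]
    mean = statistics.fmean(rates)
    std = statistics.pstdev(rates)  # population standard deviation
    total_tokens = sum(tokens for tokens, _ in samples)
    total_seconds = sum(seconds for _, seconds in samples)
    # Skewness as the third standardized moment (population form).
    skew = 0.0
    if std > 0:
        skew = sum((r - mean) ** 3 for r in rates) / len(rates) / std ** 3
    return {
        "mean": mean,
        "median": statistics.median(rates),
        "p95": percentile(rates, 95),
        "stddev": std,
        "cv_pct": 100 * std / mean,  # coefficient of variation, in percent
        "skewness": skew,
        # Reading 1 of "token-weighted": each request's rate weighted by its token count.
        "token_weighted": sum(t * (t / s) for t, s in samples) / total_tokens,
        # Reading 2: overall throughput, total tokens over total time.
        "throughput": total_tokens / total_seconds,
    }


if __name__ == "__main__":
    # Made-up illustrative data, not measurements from this thread.
    demo = [(100, 4.0), (200, 10.0), (300, 10.0)]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.2f}")
```

Under either reading, the weighted average has to land between the lowest and highest per-request rate. That makes a quick sanity check for a figure like the 1697.20 t/s Effective Avg against a 16.49 - 53.63 t/s range.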
-
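@rolex-lo
I run Gemma 4 / Qwen 3.6 and 3.7 through opencode paired with LiteLLM.
My main tools are codex max + claude code max 200. My job is mobile full-stack development plus LLM devops.
I routinely feed large volumes of on-device logs straight in for analysis, and I also let the AI run E2E tests directly.
I also use it alongside BDD for testing and development.
-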
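@566656661 I've read this over and over. If I went with a Blackwell 4500 with 32GB VRAM instead, how big is the difference compared to the R9700? Apart from the price...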
-
@566656661 Thanks, bro. I'd still like to know: at double the price, would it be 50% better than the R9700?

-
@566656661 I'm looking forward to it too. Maybe we could both benchmark the same metric?
Then the R9700 really is of marginal value for you. How large did you set your context?