
update inference comparison
Juude committed Jan 23, 2025
1 parent d1b7306 commit fbddd0c
Showing 4 changed files with 11 additions and 3 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -5,7 +5,7 @@
[MNN Homepage](http://www.mnn.zone)

## News 🔥
- [2025/01/23] We released our full multimodal LLM Android App:[MNN-LLM](./project/android/apps/MnnLlmApp/README.md). including text-to-text, image-to-text, audio-to-text, and text-to-image generation.
- [2025/01/23] We released our full multimodal LLM Android App: [MNN-LLM-Android](./project/android/apps/MnnLlmApp/README.md), including text-to-text, image-to-text, audio-to-text, and text-to-image generation.
<p align="center">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_home.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_diffusion.jpg" style="margin: 0 10px;">
5 changes: 4 additions & 1 deletion project/android/apps/MnnLlmApp/README.md
@@ -15,7 +15,10 @@ This is our full multimodal language model (LLM) Android app

+ **Multimodal Support:** Enables functionality across diverse tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** MNN-LLM demonstrates exceptional performance in CPU benchmarking in Android, achieving prefill speed improvements of 8.6x over llama.cpp and 20.5x over fastllm, with decoding speeds that are 2.3x and 8.9x faster, respectively.
+ **CPU Inference Optimization:** MNN-LLM demonstrates exceptional performance in CPU benchmarking on Android, achieving prefill speed improvements of 8.6x over llama.cpp and 20.5x over fastllm, with decoding speeds that are 2.3x and 8.9x faster, respectively. The following is a comparison between llama.cpp and MNN-LLM running Qwen-7B inference on Android.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>
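The prefill and decode figures above are separate throughput measurements: prefill is how fast the model consumes the prompt before the first output token, and decode is how fast it emits subsequent tokens. As a rough illustration of how such numbers are typically obtained (this is not MNN-LLM's or llama.cpp's actual benchmark harness, and `dummy_generate` is a hypothetical stand-in for a real streaming model), the two phases can be timed around the first yielded token:

```python
import time

def measure_speeds(generate_fn, prompt_tokens, max_new_tokens):
    """Return (prefill_tok_s, decode_tok_s) for one generation run.

    generate_fn(prompt_tokens) must yield output tokens one at a time;
    the first yield marks the end of the prefill phase.
    """
    start = time.perf_counter()
    stream = generate_fn(prompt_tokens)
    first_token = next(stream)                # prefill ends at the first token
    prefill_end = time.perf_counter()

    produced = [first_token]
    for tok in stream:                        # decode phase: remaining tokens
        produced.append(tok)
        if len(produced) >= max_new_tokens:
            break
    decode_end = time.perf_counter()

    prefill_tok_s = len(prompt_tokens) / max(prefill_end - start, 1e-9)
    decode_tok_s = (len(produced) - 1) / max(decode_end - prefill_end, 1e-9)
    return prefill_tok_s, decode_tok_s

# Hypothetical stand-in for a real model: emits a fixed token with a small delay.
def dummy_generate(prompt_tokens):
    for _ in range(32):
        time.sleep(0.001)
        yield 0

pf, dec = measure_speeds(dummy_generate, prompt_tokens=list(range(64)), max_new_tokens=16)
print(f"prefill: {pf:.0f} tok/s, decode: {dec:.0f} tok/s")
```

A real comparison would swap `dummy_generate` for each engine's streaming API, run the same prompt and token budget on the same device, and average over several runs; the 8.6x/2.3x ratios quoted above are the ratio of these per-engine tok/s numbers.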

+ **Broad Model Compatibility:** Supports multiple leading model providers, such as Qwen, Gemma, Llama (including TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

7 changes: 6 additions & 1 deletion project/android/apps/MnnLlmApp/README_CN.md
@@ -10,11 +10,16 @@
<img width="20%" alt="Icon" src="./assets/image_image.jpg" style="margin: 0 10px;">
</p>


### Feature Highlights

+ **Multimodal Support:** Provides functionality across diverse tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** On Android, MNN-LLM delivers outstanding CPU performance, with prefill speeds 8.6x faster than llama.cpp and 20.5x faster than fastllm, and decode speeds 2.3x and 8.9x faster, respectively.
+ **CPU Inference Optimization:** On Android, MNN-LLM delivers outstanding CPU performance, with prefill speeds 8.6x faster than llama.cpp and 20.5x faster than fastllm, and decode speeds 2.3x and 8.9x faster, respectively. The figure below compares llama.cpp and MNN-LLM.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>


+ **Broad Model Compatibility:** Supports multiple leading model providers, including Qwen, Gemma, Llama (covering TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

