
update inference comparison
Juude committed Jan 23, 2025
1 parent d1b7306 commit fbddd0c
Showing 4 changed files with 11 additions and 3 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -5,7 +5,7 @@
[MNN Homepage](http://www.mnn.zone)

## News 🔥
- [2025/01/23] We released our full multimodal LLM Android App:[MNN-LLM](./project/android/apps/MnnLlmApp/README.md). including text-to-text, image-to-text, audio-to-text, and text-to-image generation.
- [2025/01/23] We released our full multimodal LLM Android App: [MNN-LLM-Android](./project/android/apps/MnnLlmApp/README.md), including text-to-text, image-to-text, audio-to-text, and text-to-image generation.
<p align="center">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_home.jpg" style="margin: 0 10px;">
<img width="20%" alt="Icon" src="./project/android/apps/MnnLlmApp/assets/image_diffusion.jpg" style="margin: 0 10px;">
5 changes: 4 additions & 1 deletion project/android/apps/MnnLlmApp/README.md
@@ -15,7 +15,10 @@ This is our full multimodal language model (LLM) Android app

+ **Multimodal Support:** Enables functionality across diverse tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** MNN-LLM demonstrates exceptional performance in CPU benchmarking in Android, achieving prefill speed improvements of 8.6x over llama.cpp and 20.5x over fastllm, with decoding speeds that are 2.3x and 8.9x faster, respectively.
+ **CPU Inference Optimization:** MNN-LLM demonstrates exceptional performance in CPU benchmarking on Android, achieving prefill speed improvements of 8.6x over llama.cpp and 20.5x over fastllm, with decoding speeds that are 2.3x and 8.9x faster, respectively. The following is a comparison between llama.cpp and MNN-LLM running Qwen-7B inference on Android.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>
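The prefill and decode figures above are separate throughput measurements: prefill is how fast the model consumes the prompt before the first output token, and decode is how fast it emits subsequent tokens. As a rough illustration of how such numbers are typically obtained (this is not MNN-LLM's or llama.cpp's actual benchmark harness, and `dummy_generate` is a hypothetical stand-in for a real streaming model), the two phases can be timed around the first yielded token:

```python
import time

def measure_speeds(generate_fn, prompt_tokens, max_new_tokens):
    """Return (prefill_tok_s, decode_tok_s) for one generation run.

    generate_fn(prompt_tokens) must yield output tokens one at a time;
    the first yield marks the end of the prefill phase.
    """
    start = time.perf_counter()
    stream = generate_fn(prompt_tokens)
    first_token = next(stream)                # prefill ends at the first token
    prefill_end = time.perf_counter()

    produced = [first_token]
    for tok in stream:                        # decode phase: remaining tokens
        produced.append(tok)
        if len(produced) >= max_new_tokens:
            break
    decode_end = time.perf_counter()

    prefill_tok_s = len(prompt_tokens) / max(prefill_end - start, 1e-9)
    decode_tok_s = (len(produced) - 1) / max(decode_end - prefill_end, 1e-9)
    return prefill_tok_s, decode_tok_s

# Hypothetical stand-in for a real model: emits a fixed token with a small delay.
def dummy_generate(prompt_tokens):
    for _ in range(32):
        time.sleep(0.001)
        yield 0

pf, dec = measure_speeds(dummy_generate, prompt_tokens=list(range(64)), max_new_tokens=16)
print(f"prefill: {pf:.0f} tok/s, decode: {dec:.0f} tok/s")
```

A real comparison would swap `dummy_generate` for each engine's streaming API, run the same prompt and token budget on the same device, and average over several runs; the 8.6x/2.3x ratios quoted above are the ratio of these per-engine tok/s numbers.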

+ **Broad Model Compatibility:** Supports multiple leading model providers, such as Qwen, Gemma, Llama (including TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

7 changes: 6 additions & 1 deletion project/android/apps/MnnLlmApp/README_CN.md
@@ -10,11 +10,16 @@
<img width="20%" alt="Icon" src="./assets/image_image.jpg" style="margin: 0 10px;">
</p>


### Feature Highlights

+ **Multimodal Support:** Provides functionality across diverse tasks, including text-to-text, image-to-text, audio-to-text, and text-to-image generation (via diffusion models).

+ **CPU Inference Optimization:** On Android, MNN-LLM delivers outstanding CPU performance, with prefill speeds 8.6x faster than llama.cpp and 20.5x faster than fastllm, and decode speeds 2.3x and 8.9x faster, respectively.
+ **CPU Inference Optimization:** On Android, MNN-LLM delivers outstanding CPU performance, with prefill speeds 8.6x faster than llama.cpp and 20.5x faster than fastllm, and decode speeds 2.3x and 8.9x faster, respectively. The figure below compares llama.cpp and MNN-LLM.
<p align="center">
<img width="60%" src="./assets/compare.gif" style="margin: 0 10px;">
</p>


+ **Broad Model Compatibility:** Supports multiple leading model providers, including Qwen, Gemma, Llama (covering TinyLlama and MobileLLM), Baichuan, Yi, DeepSeek, InternLM, Phi, ReaderLM, and Smolm.

