This repository demonstrates how to run inference with the PaliGemma Vision Language Model on Android using the Hugging Face Gradio Client API, for tasks such as zero-shot object detection, image captioning, and visual question-answering.
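
Under the hood, this pattern boils down to sending the image and a task prompt to a Gradio Space over HTTP and reading back the model's response. The snippet below is a minimal Kotlin sketch of such a call using OkHttp; the Space URL, the `/call/predict` endpoint path, and the payload fields are assumptions that depend on how the Space's API is defined, not code taken from this repository.

```kotlin
import okhttp3.MediaType.Companion.toMediaType
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.RequestBody.Companion.toRequestBody
import org.json.JSONArray
import org.json.JSONObject
import java.util.concurrent.TimeUnit

// Hypothetical Space URL and endpoint path; replace with the Space you actually call.
private const val SPACE_URL = "https://<your-paligemma-space>.hf.space"
private const val PREDICT_ENDPOINT = "$SPACE_URL/call/predict"

private val client = OkHttpClient.Builder()
    .readTimeout(120, TimeUnit.SECONDS) // VLM inference can take a while
    .build()

/**
 * Sends a prompt and a base64-encoded image to the Space and returns the raw response body.
 * The {"data": [...]} payload shape follows Gradio's HTTP API convention, but the exact
 * inputs expected depend on how the Space's predict function is defined.
 * Call this off the main thread (e.g. from a coroutine on Dispatchers.IO).
 */
fun queryPaliGemma(base64Image: String, prompt: String): String {
    val payload = JSONObject()
        .put("data", JSONArray(listOf(base64Image, prompt)))
        .toString()
        .toRequestBody("application/json".toMediaType())

    val request = Request.Builder()
        .url(PREDICT_ENDPOINT)
        .post(payload)
        .build()

    client.newCall(request).execute().use { response ->
        check(response.isSuccessful) { "HTTP ${response.code}" }
        // Newer /call/<api_name> endpoints return an event_id that must be polled or
        // streamed for the final result; older Spaces return the prediction directly.
        return response.body?.string().orEmpty()
    }
}
```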
- Visual question-answering, zero-shot object detection, and image captioning
- Referring expression segmentation (model used: Florence-2)
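
These tasks map to different PaliGemma prompt prefixes ("caption", "detect", "answer", "segment"). The Kotlin helper below is an illustrative sketch of building such prompt strings; it is not taken from this repository, and the exact prefix strings and separators (e.g. " ; " between detection classes) should be checked against the PaliGemma documentation.

```kotlin
// Illustrative prompt builder; prefixes follow the PaliGemma model card conventions.
sealed class PaliGemmaTask {
    abstract fun toPrompt(): String

    data class Caption(val language: String = "en") : PaliGemmaTask() {
        override fun toPrompt() = "caption $language"
    }

    data class Detect(val objects: List<String>) : PaliGemmaTask() {
        // Multiple classes are assumed to be separated by " ; " for zero-shot detection.
        override fun toPrompt() = "detect " + objects.joinToString(" ; ")
    }

    data class Answer(val question: String, val language: String = "en") : PaliGemmaTask() {
        override fun toPrompt() = "answer $language $question"
    }

    data class Segment(val target: String) : PaliGemmaTask() {
        override fun toPrompt() = "segment $target"
    }
}

fun main() {
    println(PaliGemmaTask.Caption().toPrompt())                        // caption en
    println(PaliGemmaTask.Detect(listOf("car", "person")).toPrompt())  // detect car ; person
    println(PaliGemmaTask.Answer("how many people are there?").toPrompt())
}
```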
- Colab notebooks for PaliGemma
- Official Gemma Cookbook
- Medium blog with a step-by-step implementation guide
- Big Vision HF 🤗 Spaces
If you find this project useful for your work, please cite it using the following BibTeX entry:
@misc{paligemma-android-hf,
  author       = {Nitin Tiwari and Sagar Malhotra and Savio Rodrigues},
  title        = {PaliGemma on Android using Hugging Face API},
  year         = {2024},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/NSTiwari/PaliGemma-Android-HF}},
}
This project was developed during Google's ML Developer Programs AI Sprint. Thanks to the MLDP team for providing Google Cloud credits to support this project.