A Python wrapper for the Tavily search API

Python 527 63 Updated Feb 11, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,480 1,659 Updated Feb 26, 2025

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 11,786 1,167 Updated Mar 10, 2025

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 16,641 1,085 Updated Mar 9, 2025

A feature-rich command-line audio/video downloader

Python 103,745 8,135 Updated Mar 7, 2025
Jupyter Notebook 37 2 Updated May 21, 2024

VS Code in the browser

TypeScript 70,196 5,806 Updated Mar 10, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,570 609 Updated Mar 7, 2025

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Python 8,761 1,151 Updated Apr 2, 2024

Background Matting: The World is Your Green Screen

Python 4,793 664 Updated Nov 22, 2022

[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation

Jupyter Notebook 1,944 142 Updated Mar 11, 2025

CLIP⚡NCNN⚡Natural-language-based image search⚡Search images by text⚡x86⚡Android

C++ 240 23 Updated Jul 12, 2023

[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Jupyter Notebook 789 51 Updated Jul 30, 2024

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 27,806 3,483 Updated Jul 23, 2024

A curated list of recent diffusion models for video generation, editing, and various other applications.

4,123 242 Updated Mar 10, 2025

Fine-Tuning Dataset Auto-Generation for Graph Query Languages.

Python 27 8 Updated Mar 10, 2025

TuGraph: A High Performance Graph Database.

C++ 1,501 200 Updated Mar 11, 2025

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 51,736 6,109 Updated Mar 11, 2025

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,333 899 Updated Feb 27, 2025

An embodied robot running on iPhone + MacBook + Arduino + GPT API

C 14 4 Updated Apr 26, 2024

A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.

Python 459 88 Updated Mar 2, 2025

Learning audio concepts from natural language supervision

Python 532 39 Updated Sep 18, 2024

ESC-50: Dataset for Environmental Sound Classification

Python 1,498 293 Updated Mar 20, 2024

Sample Repository for the AlibabaCloud Bailian Speech SDK

111 8 Updated Feb 14, 2025

An infant cry audio corpus that's being built through the Donate-a-cry campaign - see http://donateacry.com

170 65 Updated Sep 1, 2020

A ChatGPT demo web page built with Express and Vue3

Vue 31,896 11,229 Updated Aug 16, 2024

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Python 1,920 144 Updated Jan 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 41,105 6,196 Updated Mar 11, 2025