-
Notifications
You must be signed in to change notification settings - Fork 6
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
6ada7c4
commit 5368cc4
Showing
2 changed files
with
80 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
--- | ||
title: 7 月 8 日 | ||
date: 2024-07-08 | ||
cover: https://oss.justin3go.com/blogs/fav0-001.jpg | ||
|
||
--- | ||
|
||
每天仅需 1 分钟,全面获取 AI 技术发展、行业动态和市场趋势。 | ||
|
||
内容涵盖但不限于**前沿 AI 资讯**、**AI 工具**、**AI 绘画**、**开源项目**和**学习教程**等等。 | ||
|
||
关注 AI 日报,紧跟 AI 潮流,希望对你有所帮助。对于重要信息,会独立发帖进行详细介绍。 | ||
|
||
以下是 7 月 8 日的最新 AI 信息。 | ||
|
||
### 前沿技术 | ||
|
||
**1、发现一个新的实时物体检测器 RT-DETR。** | ||
|
||
它是第一个实时端到端目标检测器,在速度和精度方面都优于相同规模的 YOLO 检测器。 | ||
|
||
GitHub:https://github.com/lyuwenyu/RT-DETR | ||
|
||
在线体验:https://huggingface.co/spaces/merve/RT-DETR-tracking-coco | ||
|
||
![image-20240708211818487](https://p.ipic.vip/xnhjac.png) | ||
|
||
**2、VAST 开源了一个 3D 角色生成模型 CharacterGen。** | ||
|
||
可将单张图像转换为高质量、外观一致的 3D 角色。非常适合游戏和动画的工作流程。 | ||
|
||
详细介绍:https://charactergen.github.io/ | ||
|
||
GitHub:https://github.com/zjp-shadow/CharacterGen | ||
|
||
![teaser](https://github.com/zjp-shadow/CharacterGen/raw/main/materials/teaser.png) | ||
|
||
|
||
|
||
### AI 绘画 | ||
|
||
**1、可用于图像生成和编辑的 ControlNet Plus 模型。** | ||
|
||
基于原始 ControlNet 架构扩展开发,能够在条件文本生成图像中支持 10 多种控制类型,并且能够生成视觉效果媲美 Midjourney 的高分辨率图像。 | ||
|
||
模型下载:https://huggingface.co/xinsir/controlnet-union-sdxl-1.0 | ||
|
||
![image-20240708205305070](https://p.ipic.vip/r93bk5.png) | ||
|
||
|
||
|
||
### 开源项目 | ||
|
||
**1、斯坦福开源的 Prompt 编程框架 DSPy,目前已获得 14.1k Star。** | ||
|
||
具有如下特性: | ||
|
||
- 模块化编程:提供标准模块帮助你写 Prompt。 | ||
- 自动编译器:自动为特定 LLM 微调 Prompt 与参数。 | ||
- 类似 HippoRAG 支持解决复杂多跳检索。 | ||
- 支持强大的主流大模型,如 GPT-4o、Claude 3、Gemin Pro、Llama 等。 | ||
|
||
GitHub:https://github.com/stanfordnlp/dspy | ||
|
||
提供了详细的入门教程,官方用的是 Python 语言,还有一个非官方 Typescript 版本,可以看下。 | ||
|
||
GitHub:https://github.com/ax-llm/ax | ||
|
||
![image-20240708203343152](https://p.ipic.vip/0yluae.png) | ||
|
||
**2、用 160 行代码实现 GPT-4o 发布会的实时音视频通话能力。** | ||
|
||
使用 OpenCV 搞定视频画面捕获,再使用 GPT-4o 来进行文本处理和多模态,而音频则是基于 Whisper 和 TTS 处理。 | ||
|
||
目前代码已经开源到 GitHub,并且作者还录制了一条教程视频,感兴趣的可以看看。 | ||
|
||
GitHub:https://github.com/svpino/alloy-voice-assistant | ||
|
||
![img](https://img.youtube.com/vi/zVttVCQvACQ/maxresdefault.jpg) |