Releases: nomic-ai/gpt4all
v3.3.0
What's New
- UI Improvements: The minimum window size now adapts to the font size. Several labels and links have been fixed. The "Auto"/"Application default" option for Embeddings Device works again. The window icon is now set on Linux, and the antenna icon now displays when the API server is listening.
- Single Instance: GPT4All now enforces that only one instance can be open at a time.
- Greedy Sampling: Set temperature to zero to enable greedy sampling.
- API Server Changes: The built-in API server now responds correctly to both legacy completions and chats with message history. It also uses the system prompt configured in the UI. See the example request after this list.
- Translation Improvements: The Italian, Romanian, and Traditional Chinese translations have been updated.
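As a quick illustration of the greedy sampling and API server changes, here is a minimal chat request against the built-in server. The port and model name are assumptions: 4891 is the documented default port, and the model name must match one installed in your GPT4All.

```python
import requests

# Chat completion against the built-in GPT4All API server.
# Assumes the server is enabled in Settings and listening on its
# default port, 4891; adjust the URL if you changed the port.
response = requests.post(
    "http://localhost:4891/v1/chat/completions",
    json={
        # Hypothetical model name; use one installed in your GPT4All.
        "model": "Llama 3 8B Instruct",
        "messages": [
            {"role": "user", "content": "Summarize what greedy sampling does."},
        ],
        "max_tokens": 64,
        "temperature": 0,  # temperature 0 now selects greedy sampling
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```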
Contributors
- Jared Van Bortel (Nomic AI)
- Adam Treat (Nomic AI)
- 3Simplex (@3Simplex)
- Riccardo Giovanetti (@Harvester62)
- Victor Emanuel (@SINAPSA-IC)
- Dominik (@cosmic-snow)
- Shiranui (@supersonictw)
Full Changelog: CHANGELOG.md
v3.2.1
v3.2.0
Added
- Add Qwen2-1.5B-Instruct to models3.json (by @ThiloteE in #2759)
- Enable translation feature for seven languages: English, Spanish, Italian, Portuguese, Chinese Simplified, Chinese Traditional, Romanian (#2830)
Changed
- Add missing entries to Italian translation (by @Harvester62 in #2783)
- Use llama_kv_cache ops to shift context faster (#2781)
- Don't stop generating at end of context (#2781)
Fixed
- Case-insensitive LocalDocs source icon detection (by @cosmic-snow in #2761)
- Fix comparison of pre- and post-release versions for update check and models3.json (#2762, #2772)
- Fix several backend issues (#2778)
- Make reverse prompt detection work more reliably and prevent it from breaking output (#2781)
- Disallow context shift for chat name and follow-up generation to prevent bugs (#2781)
- Explicitly target macOS 12.6 in CI to fix Metal compatibility on older macOS (#2846)
New Contributors
- @SINAPSA-IC made their first contribution in #2828
Full Changelog: CHANGELOG.md
v3.1.1-web_search_beta_2
This is version 2 of the web search beta, which contains some important fixes, including upstream llama.cpp fixes for Llama 3.1.
Fixes
- Update to latest llama.cpp which includes RoPE fix
- Fix a problem where only one source was displayed for tool-call excerpts
- Add the extra snippets to the source excerpts
- Fix the way we're injecting the context back into the model for web search
- Enable suggestion mode by default for tool calls
WARNING:
There was a synchronization problem between this beta release and models.json. To make this release work, perform the following steps:
- Rename the file Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf to Meta-Llama-3.1-8B-Instruct-Q4_0.gguf
- Copy the following into the prompt template:
<|start_header_id|>user<|end_header_id|>
%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>
%2
- Copy the following into the system prompt:
<|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search
Cutting Knowledge Date: December 2023
Today Date: 25 Jul 2024
You are a helpful assistant.<|eot_id|>
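If you prefer to script the rename step above, here is a minimal sketch using Python's pathlib. The models directory shown is an assumption (the Linux default); substitute the download directory from your GPT4All settings on other platforms.

```python
from pathlib import Path

# Assumed default models directory on Linux; substitute the download
# path configured in your GPT4All settings on other platforms.
models_dir = Path.home() / ".local" / "share" / "nomic.ai" / "GPT4All"

old = models_dir / "Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf"
new = models_dir / "Meta-Llama-3.1-8B-Instruct-Q4_0.gguf"
if old.exists():
    old.rename(new)  # make the filename match what models.json expects
```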
v3.1.1
v3.1.0-web_search_beta
This is the beta version of GPT4All, which includes a new web search feature powered by Llama 3.1. To use this version, consult the guide located here: https://github.com/nomic-ai/gpt4all/wiki/Web-Search-Beta-Release
For questions and feedback, please join the Discord channel: https://discord.com/invite/4M2QFmTt2k
v3.1.0
Added
- Generate suggested follow-up questions (#2634)
- Scaffolding for translations (#2612)
- Spanish (MX) translation (by @jstayco in #2654)
- Chinese (Simplified) translation by mikage (#2657)
- Dynamic changes of language and locale at runtime (#2659, #2677)
- Romanian translation by @SINAPSA-IC (#2662)
- Chinese (Traditional) translation (by @supersonictw in #2661)
- Italian translation (by @Harvester62 in #2700)
Changed
- Customize combo boxes and context menus to fit the new style (#2535)
- Improve view bar scaling and Model Settings layout (#2520)
- Make the logo spin while the model is generating (#2557)
- Server: Reply to wrong GET/POST method with HTTP 405 instead of 404 (by @cosmic-snow in #2615)
- Update theme for menus (by @3Simplex in #2578)
- Move the "stop" button to the message box (#2561)
- Build with CUDA 11.8 for better compatibility (#2639)
- Make links in latest news section clickable (#2643)
- Support translation of settings choices (#2667, #2690)
- Improve LocalDocs view's error message (by @cosmic-snow in #2679)
- Ignore case of LocalDocs file extensions (#2642, #2684)
- Update llama.cpp to commit 87e397d00 from July 19th (#2694)
  - Add support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support)
  - Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo
- Show scrollbar in chat collections list as needed (by @cosmic-snow in #2691)
Fixed
- Fix placement of thumbs-down and datalake opt-in dialogs (#2540)
- Select the correct folder with the Linux fallback folder dialog (#2541)
- Fix clone button sometimes producing blank model info (#2545)
- Fix jerky chat view scrolling (#2555)
- Fix "reload" showing for chats with missing models (#2520
- Fix property binding loop warning (#2601)
- Fix UI hang with certain chat view content (#2543)
- Fix crash when Kompute falls back to CPU (#2640)
- Fix several Vulkan resource management issues (#2694)
v3.0.0
What's New
- Complete UI overhaul (#2396)
- LocalDocs improvements (#2396)
  - Use nomic-embed-text-v1.5 as local model instead of SBert
  - Ship local model with application instead of downloading afterwards
  - Store embeddings flat in SQLite DB instead of in hnswlib index
  - Do exact KNN search with usearch instead of approximate KNN search with hnswlib (see the sketch after this list)
- Markdown support (#2476)
- Support CUDA/Metal device option for embeddings (#2477)
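To make the retrieval change above concrete, here is a minimal sketch of exact cosine-similarity KNN over embeddings stored as flat rows. It uses plain numpy instead of the usearch library the application actually uses, and the random 768-dimensional vectors merely stand in for nomic-embed-text-v1.5 output.

```python
import numpy as np

def exact_knn(query: np.ndarray, embeddings: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k nearest rows by cosine similarity.

    Unlike an hnswlib graph index, this scans every stored embedding,
    so results are exact at the cost of a full pass over the data.
    """
    # Normalize so a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q
    return np.argsort(-scores)[:k]

# Toy usage with random vectors standing in for stored chunk embeddings.
rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 768)).astype(np.float32)
print(exact_knn(db[0], db, k=5))
```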
Fixes
- Fix embedding tokenization after #2310 (#2381)
- Fix a crash when loading certain models with "code" in their name (#2382)
- Fix an embedding crash with large chunk sizes after #2310 (#2383)
- Fix inability to load models with non-ASCII path on Windows (#2388)
- CUDA: Do not show non-fatal DLL errors on Windows (#2389)
- LocalDocs fixes (#2396)
  - Always use requested number of snippets even if there are better matches in unselected collections
  - Check for deleted files on startup
- CUDA: Fix PTX errors with some GPT4All builds (#2421)
- Fix blank device in UI after model switch and improve usage stats (#2409)
- Use CPU instead of CUDA backend when GPU loading fails the first time (ngl=0 is not enough) (#2477)
- Fix crash when sending a message greater than n_ctx tokens after #1970 (#2498)
New Contributors
- @woheller69 made their first contribution in #2339
- @patcher9 made their first contribution in #2386
- @sunsided made their first contribution in #2414
- @johnwparent made their first contribution in #2319
- @mcembalest made their first contribution in #2488
Full Changelog: v2.8.0...v3.0.0
v2.8.0
What's New
- Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
- Context Menu: Hide Copy/Cut when nothing is selected (#2324)
- Improve speed of context switch after quickly switching between several chats (#2343)
- New Chat: Always switch to the new chat when the button is clicked (#2330)
- New Chat: Always scroll to the top of the list when the button is clicked (#2330)
- Update to latest llama.cpp as of May 9, 2024 (#2310)
- Add support for the llama.cpp CUDA backend (#2310, #2357)
  - Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings (see the bindings sketch after this list)
  - When in use: Greatly improved prompt processing and generation speed on some devices
  - When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
- Add support for InternLM models (#2310)
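For readers using the Python bindings rather than the chat UI, device selection looks roughly like the sketch below. The model filename is a placeholder, and "gpu" simply asks the bindings to pick a supported GPU backend; device strings may differ across binding versions.

```python
from gpt4all import GPT4All

# "gpu" lets the bindings choose a supported GPU backend; pass "cpu"
# to force CPU inference. The model filename is a placeholder.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", device="gpu")
with model.chat_session():
    print(model.generate("Name three uses of a local LLM.", max_tokens=128))
```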
Fixes
- Do not allow sending a message while the LLM is responding (#2323)
- Fix poor quality of generated chat titles with many models (#2322)
- Set the window icon correctly on Windows (#2321)
- Fix a few memory leaks (#2328, #2348, #2310)
- Do not crash if a model file has no architecture key (#2346)
- Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
- New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
- macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
- Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
- Fix GPT4All.desktop being created by offline installers on macOS (#2361)
Full Changelog: v2.7.5...v2.8.0
v2.8.0-pre1
What's New
- Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
- Context Menu: Hide Copy/Cut when nothing is selected (#2324)
- Improve speed of context switch after quickly switching between several chats (#2343)
- New Chat: Always switch to the new chat when the button is clicked (#2330)
- New Chat: Always scroll to the top of the list when the button is clicked (#2330)
- Update to latest llama.cpp as of May 9, 2024 (#2310)
- Add support for the llama.cpp CUDA backend (#2310)
  - Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
  - When in use: Greatly improved prompt processing and generation speed on some devices
  - When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
- Add support for InternLM models (#2310)
Fixes
- Do not allow sending a message while the LLM is responding (#2323)
- Fix poor quality of generated chat titles with many models (#2322)
- Set the window icon correctly on Windows (#2321)
- Fix a few memory leaks (#2328, #2348, #2310)
- Do not crash if a model file has no architecture key (#2346)
- Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
- New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
- macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
- Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
Full Changelog: v2.7.5...v2.8.0-pre1