Releases: nomic-ai/gpt4all

v3.3.0

23 Sep 15:54
da21174

What's New

  • UI Improvements: The minimum window size now adapts to the font size. A few labels and links have been fixed. The Embeddings Device selection of "Auto"/"Application default" works again. The window icon is now set on Linux. The antenna icon now displays when the API server is listening.
  • Single Instance: GPT4All now enforces that only one instance of the application can be open at a time.
  • Greedy Sampling: Set temperature to zero to enable greedy sampling.
  • API Server Changes: The built-in API server now responds correctly to both legacy completion requests and chat requests that include message history. It also now uses the system prompt configured in the UI.
  • Translation Improvements: The Italian, Romanian, and Traditional Chinese translations have been updated.
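
The greedy sampling item above can be illustrated with a minimal sketch. This is not GPT4All's sampler; the function name `sample_token` and the treatment of any non-positive temperature as greedy are assumptions made for illustration. It shows why a temperature of zero is handled as a special case: dividing logits by zero is undefined, so the sampler simply picks the highest-scoring token.

```python
import math
import random

def sample_token(logits, temperature, rng=random):
    """Pick a token index from raw logits (hypothetical helper)."""
    if temperature <= 0:
        # Greedy: always take the highest-scoring token (ties go to the lowest index).
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature sampling: softmax over scaled logits, then a weighted draw.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [1.2, 3.4, 0.5]
greedy_pick = sample_token(logits, temperature=0.0)  # deterministic
```

With temperature zero the same logits always yield the same token, which makes output reproducible at the cost of variety.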

Contributors

  • Jared Van Bortel (Nomic AI)
  • Adam Treat (Nomic AI)
  • 3Simplex (@3Simplex)
  • Riccardo Giovanetti (@Harvester62)
  • Victor Emanuel (@SINAPSA-IC)
  • Dominik (@cosmic-snow)
  • Shiranui (@supersonictw)

Full Changelog: CHANGELOG.md

v3.2.1

13 Aug 23:47
8ccf1fa

Fixed

  • Fix a potential Vulkan crash on application exit on some Linux systems (#2843)
  • Fix a bad CUDA build option that led to gibberish on newer NVIDIA GPUs (#2855)

Full Changelog: v3.2.0...v3.2.1

v3.2.0

12 Aug 21:32
3e0ad62

Added

  • Add Qwen2-1.5B-Instruct to models3.json (by @ThiloteE in #2759)
  • Enable translation feature for seven languages: English, Spanish, Italian, Portuguese, Chinese Simplified, Chinese Traditional, Romanian (#2830)

Changed

  • Add missing entries to Italian translation (by @Harvester62 in #2783)
  • Use llama_kv_cache ops to shift context faster (#2781)
  • Don't stop generating at end of context (#2781)

Fixed

  • Case-insensitive LocalDocs source icon detection (by @cosmic-snow in #2761)
  • Fix comparison of pre- and post-release versions for update check and models3.json (#2762, #2772)
  • Fix several backend issues (#2778)
    • Restore leading space removal logic that was incorrectly removed in #2694
    • CUDA: Cherry-pick the llama.cpp fix for the DMMV cols requirement, which had caused a crash with long conversations since #2694
  • Make reverse prompt detection work more reliably and prevent it from breaking output (#2781)
  • Disallow context shift for chat name and follow-up generation to prevent bugs (#2781)
  • Explicitly target macOS 12.6 in CI to fix Metal compatibility on older macOS (#2846)
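
The version comparison fix above concerns ordering pre-release builds against final releases. A minimal sketch of one way to do this follows. It assumes version strings of the form `vX.Y.Z` or `vX.Y.Z-preN` (the form seen in the `v2.8.0-pre1` tag); the names `version_key` and `is_newer` are hypothetical, post-release suffixes are left out, and this is not the application's actual update-check code.

```python
import re

# Assumed format: optional "v", three numeric parts, optional "-preN" suffix.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-pre(\d+))?$")

def version_key(text):
    """Return a tuple that sorts pre-releases before their final release."""
    m = _VERSION_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    if m.group(4) is None:
        # Final release: the flag 1 places it after any pre-release of X.Y.Z.
        return (major, minor, patch, 1, 0)
    return (major, minor, patch, 0, int(m.group(4)))

def is_newer(candidate, installed):
    """True if `candidate` should be offered as an update over `installed`."""
    return version_key(candidate) > version_key(installed)
```

A plain string comparison would rank `2.8.0-pre1` above `2.8.0` because the longer string sorts later, which is the kind of mistake a structured key avoids.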

Full Changelog: CHANGELOG.md

v3.1.1-web_search_beta_2

27 Jul 22:45

This is version 2 of the web search beta. It contains some important fixes, including upstream llama.cpp fixes for Llama 3.1.

Fixes

  • Update to latest llama.cpp which includes RoPE fix
  • Fix problem with only displaying one source for tool call excerpts
  • Add the extra snippets to the source excerpts
  • Fix the way we're injecting the context back into the model for web search
  • Change the suggestion mode to turn on for tool calls by default

WARNING:

There was a synchronization problem between this beta release and the models.json. To make this release work, perform the following steps:

  1. Rename the file Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf to Meta-Llama-3.1-8B-Instruct-Q4_0.gguf
  2. Copy the following into the prompt template:
<|start_header_id|>user<|end_header_id|>

%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>

%2
  3. Copy the following into the system prompt:
<|start_header_id|>system<|end_header_id|>
Environment: ipython
Tools: brave_search

Cutting Knowledge Date: December 2023
Today Date: 25 Jul 2024

You are a helpful assistant.<|eot_id|> 
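
To show how the prompt template in step 2 is meant to be read, here is a minimal sketch. It assumes that `%1` stands for the user's message and `%2` for the model's response; the helper `fill_template` is hypothetical and is not GPT4All's own code.

```python
import re

# The prompt template from step 2, written as a Python string.
PROMPT_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "%1<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    "%2"
)

def fill_template(template, user_message, response=""):
    """Substitute %1 and %2 in a single pass (hypothetical helper)."""
    values = {"%1": user_message, "%2": response}
    # One pass over the template, so a literal "%2" typed by the user is
    # inserted as-is and never treated as a placeholder.
    return re.sub(r"%[12]", lambda m: values[m.group(0)], template)

prompt = fill_template(PROMPT_TEMPLATE, "What is the weather in Paris?")
```

Leaving `response` empty produces the text the model sees just before it starts generating its reply.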

v3.1.1

27 Jul 21:54

Added

  • Ability to add OpenAI compatible remote models (#2683)

Fixed

  • Update llama.cpp to cherry-pick Llama 3.1 RoPE fix (#2758)

v3.1.0-web_search_beta

25 Jul 21:30

This is the beta version of GPT4All, which includes a new web search feature powered by Llama 3.1. To use this version, consult the guide located here: https://github.com/nomic-ai/gpt4all/wiki/Web-Search-Beta-Release

For questions and feedback, please join the Discord channel: https://discord.com/invite/4M2QFmTt2k

v3.1.0

24 Jul 16:36

Changed

  • Customize combo boxes and context menus to fit the new style (#2535)
  • Improve view bar scaling and Model Settings layout (#2520)
  • Make the logo spin while the model is generating (#2557)
  • Server: Reply to wrong GET/POST method with HTTP 405 instead of 404 (by @cosmic-snow in #2615)
  • Update theme for menus (by @3Simplex in #2578)
  • Move the "stop" button to the message box (#2561)
  • Build with CUDA 11.8 for better compatibility (#2639)
  • Make links in latest news section clickable (#2643)
  • Support translation of settings choices (#2667, #2690)
  • Improve LocalDocs view's error message (by @cosmic-snow in #2679)
  • Ignore case of LocalDocs file extensions (#2642, #2684)
  • Update llama.cpp to commit 87e397d00 from July 19th (#2694)
    • Add support for GPT-NeoX, Gemma 2, OpenELM, ChatGLM, and Jais architectures (all with Vulkan support)
    • Enable Vulkan support for StarCoder2, XVERSE, Command R, and OLMo
  • Show scrollbar in chat collections list as needed (by @cosmic-snow in #2691)
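
The server change above (HTTP 405 instead of 404 for a wrong method) can be sketched as a small routing check. The route table and the function `status_for` are hypothetical and only illustrate the distinction: 404 means the path is unknown, while 405 means the path exists but does not accept the method used. HTTP also expects a 405 response to carry an `Allow` header listing the accepted methods.

```python
# Hypothetical route table; the paths follow the OpenAI-style naming and are
# assumptions, not a statement of what the built-in server exposes.
ROUTES = {
    "/v1/models": {"GET"},
    "/v1/completions": {"POST"},
    "/v1/chat/completions": {"POST"},
}

def status_for(method, path):
    """Return (status_code, extra_headers) for a request line."""
    allowed = ROUTES.get(path)
    if allowed is None:
        return 404, {}  # unknown path
    if method not in allowed:
        # Known path, wrong method: 405 plus the methods that would work.
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}
```

For example, `status_for("GET", "/v1/chat/completions")` yields 405 with `Allow: POST`, which tells the client to retry with a different method rather than a different URL.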

Fixed

  • Fix placement of thumbs-down and datalake opt-in dialogs (#2540)
  • Select the correct folder with the Linux fallback folder dialog (#2541)
  • Fix clone button sometimes producing blank model info (#2545)
  • Fix jerky chat view scrolling (#2555)
  • Fix "reload" showing for chats with missing models (#2520)
  • Fix property binding loop warning (#2601)
  • Fix UI hang with certain chat view content (#2543)
  • Fix crash when Kompute falls back to CPU (#2640)
  • Fix several Vulkan resource management issues (#2694)

v3.0.0

02 Jul 16:13
4c26726

What's New

  • Complete UI overhaul (#2396)
  • LocalDocs improvements (#2396)
    • Use nomic-embed-text-v1.5 as local model instead of SBert
    • Ship local model with application instead of downloading afterwards
    • Store embeddings flat in SQLite DB instead of in hnswlib index
    • Do exact KNN search with usearch instead of approximate KNN search with hnswlib
  • Markdown support (#2476)
  • Support CUDA/Metal device option for embeddings (#2477)
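
The LocalDocs change from approximate to exact KNN search can be illustrated with a brute-force sketch. It does not use the usearch API or the application's SQLite schema; the row layout `(chunk_id, vector)`, the function names, and the choice of cosine similarity as the metric are all assumptions. The point is that an exact search scores every stored embedding, so no true nearest neighbour can be missed.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def exact_knn(query, rows, k):
    """Score every (chunk_id, vector) row and return the k best chunk ids."""
    scored = [(cosine_similarity(query, vec), chunk_id) for chunk_id, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]

# Toy data standing in for embeddings stored flat in a database table.
rows = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [0.9, 0.1])]
nearest = exact_knn([1.0, 0.0], rows, k=2)
```

The cost grows linearly with the number of stored embeddings, which is the trade-off against an approximate index.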

Fixes

  • Fix embedding tokenization after #2310 (#2381)
  • Fix a crash when loading certain models with "code" in their name (#2382)
  • Fix an embedding crash with large chunk sizes after #2310 (#2383)
  • Fix inability to load models with non-ASCII path on Windows (#2388)
  • CUDA: Do not show non-fatal DLL errors on Windows (#2389)
  • LocalDocs fixes (#2396)
    • Always use requested number of snippets even if there are better matches in unselected collections
    • Check for deleted files on startup
  • CUDA: Fix PTX errors with some GPT4All builds (#2421)
  • Fix blank device in UI after model switch and improve usage stats (#2409)
  • Use CPU instead of CUDA backend when GPU loading fails the first time (ngl=0 is not enough) (#2477)
  • Fix crash when sending a message greater than n_ctx tokens after #1970 (#2498)

Full Changelog: v2.8.0...v3.0.0

v2.8.0

24 May 03:29
09dd3dc

What's New

  • Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
  • Context Menu: Hide Copy/Cut when nothing is selected (#2324)
  • Improve speed of context switch after quickly switching between several chats (#2343)
  • New Chat: Always switch to the new chat when the button is clicked (#2330)
  • New Chat: Always scroll to the top of the list when the button is clicked (#2330)
  • Update to latest llama.cpp as of May 9, 2024 (#2310)
  • Add support for the llama.cpp CUDA backend (#2310, #2357)
    • Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
    • When in use: Greatly improved prompt processing and generation speed on some devices
    • When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
  • Add support for InternLM models (#2310)

Fixes

  • Do not allow sending a message while the LLM is responding (#2323)
  • Fix poor quality of generated chat titles with many models (#2322)
  • Set the window icon correctly on Windows (#2321)
  • Fix a few memory leaks (#2328, #2348, #2310)
  • Do not crash if a model file has no architecture key (#2346)
  • Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
  • New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
  • macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
  • Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
  • Fix GPT4All.desktop being created by offline installers on macOS (#2361)

Full Changelog: v2.7.5...v2.8.0

v2.8.0-pre1

15 May 23:37
a92d266
Pre-release

What's New

  • Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
  • Context Menu: Hide Copy/Cut when nothing is selected (#2324)
  • Improve speed of context switch after quickly switching between several chats (#2343)
  • New Chat: Always switch to the new chat when the button is clicked (#2330)
  • New Chat: Always scroll to the top of the list when the button is clicked (#2330)
  • Update to latest llama.cpp as of May 9, 2024 (#2310)
  • Add support for the llama.cpp CUDA backend (#2310)
    • Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
    • When in use: Greatly improved prompt processing and generation speed on some devices
    • When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
  • Add support for InternLM models (#2310)

Fixes

  • Do not allow sending a message while the LLM is responding (#2323)
  • Fix poor quality of generated chat titles with many models (#2322)
  • Set the window icon correctly on Windows (#2321)
  • Fix a few memory leaks (#2328, #2348, #2310)
  • Do not crash if a model file has no architecture key (#2346)
  • Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
  • New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
  • macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
  • Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)

Full Changelog: v2.7.5...v2.8.0-pre1