Update docker-compose files and test endpoints
Update to llama.cpp

python: bump bindings version for AMD fixes

update llama.cpp-mainline

vulkan support for typescript bindings, gguf support (nomic-ai#1390)

* adding some native methods to cpp wrapper

* gpu seems to work

* typings and add availibleGpus method

* fix spelling

* fix syntax

* more

* normalize methods to conform to py

* remove extra dynamic linker deps when building with vulkan

* bump python version (library linking fix)

* Don't link against libvulkan.

* vulkan python bindings on windows fixes

* Bring the vulkan backend to the GUI.

* When device is Auto (the default), we only consider discrete GPUs; otherwise we fall back to CPU.

* Show the device we're currently using.

* Fix up the name and formatting.

* init at most one vulkan device, submodule update

fixes issues when multiple identical GPUs are present

* Update the submodule.

* Add version 2.4.15 and bump the version number.

* Fix a bug where we're not properly falling back to CPU.

* Sync to a newer version of llama.cpp with bugfix for vulkan.

* Report the actual device we're using.

* Only show GPU when we're actually using it.

* Bump to new llama with new bugfix.

* Release notes for v2.4.16 and bump the version.

* Fall back to CPU more robustly.

* Release notes for v2.4.17 and bump the version.

* Bump the Python version to python-v1.0.12 to restrict the quants that vulkan recognizes.

* Link against ggml in bin so we can get the available devices without loading a model.

* Send actual and requested device info for those who have opted in.

* Actually bump the version.

* Release notes for v2.4.18 and bump the version.

* Fix for crashes on systems where vulkan is not installed properly.

* Release notes for v2.4.19 and bump the version.

* fix typings and vulkan build works on win

* Add flatpak manifest

* Remove unnecessary stuff from the manifest

* Update to 2.4.19

* appdata: update software description

* Latest rebase on llama.cpp with gguf support.

* macos build fixes

* llamamodel: metal supports all quantization types now

* gpt4all.py: GGUF

* pyllmodel: print specific error message

* backend: port BERT to GGUF

* backend: port MPT to GGUF

* backend: port Replit to GGUF

* backend: use gguf branch of llama.cpp-mainline

* backend: use llamamodel.cpp for StarCoder

* conversion scripts: cleanup

* convert scripts: load model as late as possible

* convert_mpt_hf_to_gguf.py: better tokenizer decoding

* backend: use llamamodel.cpp for Falcon

* convert scripts: make them directly executable

* fix references to removed model types

* modellist: fix the system prompt

* backend: port GPT-J to GGUF

* gpt-j: update inference to match latest llama.cpp insights

- Use F16 KV cache
- Store transposed V in the cache
- Avoid unnecessary Q copy

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

ggml upstream commit 0265f0813492602fec0e1159fe61de1bf0ccaf78

* chatllm: grammar fix

* convert scripts: use bytes_to_unicode from transformers

* convert scripts: make gptj script executable

* convert scripts: add feed-forward length for better compatibility

This GGUF key is used by all llama.cpp models with upstream support.
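
As a rough sketch of the writer side of this (illustrative only; the `gguf` package derives the actual key name, `{arch}.feed_forward_length`, from the model architecture):

```python
import gguf

# Illustrative sketch: record the feed-forward length the way the convert
# scripts now do; the architecture and value are made up for the example.
writer = gguf.GGUFWriter("out.gguf", arch="mpt")
writer.add_feed_forward_length(4 * 4096)  # d_ff for a hypothetical 4096-wide model
```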

* gptj: remove unused variables

* Refactor for subgroups on mat * vec kernel.

* Add q6_k kernels for vulkan.

* python binding: print debug message to stderr

* Fix regenerate button to be deterministic and bump the llama version to the latest we have for gguf.

* Bump to the latest fixes for vulkan in llama.

* llamamodel: fix static vector in LLamaModel::endTokens

* Switch to new models2.json for new gguf release and bump our version to
2.5.0.

* Bump to latest llama/gguf branch.

* chat: report reason for fallback to CPU

* chat: make sure to clear fallback reason on success

* more accurate fallback descriptions

* differentiate between init failure and unsupported models

* backend: do not use Vulkan with non-LLaMA models

* Add q8_0 kernels to kompute shaders and bump to latest llama/gguf.

* backend: fix build with Visual Studio generator

Use the $<CONFIG> generator expression instead of CMAKE_BUILD_TYPE. This
is needed because Visual Studio is a multi-configuration generator, so
we do not know what the build type will be until `cmake --build` is
called.

Fixes nomic-ai#1470

* remove old llama.cpp submodules

* Reorder and refresh our models2.json.

* rebase on newer llama.cpp

* python/embed4all: use gguf model, allow passing kwargs/overriding model
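
A minimal sketch of the resulting Embed4All usage (the override filename below is illustrative, not shipped by this change):

```python
from gpt4all import Embed4All

# Default embedding model is now a gguf file per this change.
embedder = Embed4All()
vector = embedder.embed("Who is Michael Jordan?")

# Overriding the model, which this change allows; the filename is illustrative.
custom = Embed4All("my-embedding-model.gguf")
```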

* Add starcoder, rift and sbert to our models2.json.

* Push a new version number for llmodel backend now that it is based on gguf.

* fix stray comma in models2.json

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* Speculative fix for build on mac.

* chat: clearer CPU fallback messages

* Fix crasher with an empty string for prompt template.

* Update the language here to avoid misunderstanding.

* added EM German Mistral Model

* make codespell happy

* issue template: remove "Related Components" section

* cmake: install the GPT-J plugin (nomic-ai#1487)

* Do not delete saved chats if we fail to serialize properly.

* Restore state from text if necessary.

* Another codespell attempted fix.

* llmodel: do not call magic_match unless build variant is correct (nomic-ai#1488)

* chatllm: do not write uninitialized data to stream (nomic-ai#1486)

* mat*mat for q4_0, q8_0

* do not process prompts on gpu yet

* python: support Path in GPT4All.__init__ (nomic-ai#1462)

* llmodel: print an error if the CPU does not support AVX (nomic-ai#1499)

* python bindings should be quiet by default

* disable llama.cpp logging unless GPT4ALL_VERBOSE_LLAMACPP envvar is
  nonempty
* make verbose flag for retrieve_model default false (but also be
  overridable via gpt4all constructor)

You should be able to run a basic test:

```python
import gpt4all
model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf')
print(model.generate('def fib(n):'))
```

and see no non-model output when successful.
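
Conversely, to bring the suppressed output back while debugging, something like this should work (the `verbose` keyword is assumed from the constructor override noted above):

```python
import os

# Assumed from the notes above: any non-empty value re-enables llama.cpp's
# native logging, and verbose=True overrides retrieve_model's new quiet default.
os.environ["GPT4ALL_VERBOSE_LLAMACPP"] = "1"

import gpt4all

model = gpt4all.GPT4All('/Users/aaron/Downloads/rift-coder-v0-7b-q4_0.gguf', verbose=True)
print(model.generate('def fib(n):'))
```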

* python: always check status code of HTTP responses (nomic-ai#1502)

* Always save chats to disk, but save them as text by default. This also changes
the UI behavior to always open a 'New Chat' and set it as current, instead
of setting a restored chat as current. This improves usability by not requiring
the user to wait if they want to start chatting immediately.

* Update README.md

Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>

* fix embed4all filename

https://discordapp.com/channels/1076964370942267462/1093558720690143283/1161778216462192692

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* Improve Java API signatures while maintaining backward compatibility

* python: replace deprecated pkg_resources with importlib (nomic-ai#1505)

* Updated chat wishlist (nomic-ai#1351)

* q6k, q4_1 mat*mat

* update mini-orca 3b to gguf2, license

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>

* convert scripts: fix AutoConfig typo (nomic-ai#1512)

* publish config https://docs.npmjs.com/cli/v9/configuring-npm/package-json#publishconfig (nomic-ai#1375)

merge into my branch

* fix appendBin

* fix gpu not initializing first

* sync up

* progress, still wip on destructor

* some detection work

* untested dispose method

* add js side of dispose

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/index.cc

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/gpt4all.d.ts

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/gpt4all.js

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* Update gpt4all-bindings/typescript/src/util.js

Co-authored-by: cebtenzzre <cebtenzzre@gmail.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>

* fix tests

* fix circleci for nodejs

* bump version

---------

Signed-off-by: Aaron Miller <apage43@ninjawhale.com>
Signed-off-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Signed-off-by: Jacob Nguyen <76754747+jacoobes@users.noreply.github.com>
Co-authored-by: Aaron Miller <apage43@ninjawhale.com>
Co-authored-by: Adam Treat <treat.adam@gmail.com>
Co-authored-by: Akarshan Biswas <akarshan.biswas@gmail.com>
Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com>
Co-authored-by: Jan Philipp Harries <jpdus@users.noreply.github.com>
Co-authored-by: umarmnaq <102142660+umarmnaq@users.noreply.github.com>
Co-authored-by: Alex Soto <asotobu@gmail.com>
Co-authored-by: niansa/tuxifan <tuxifan@posteo.de>

ts/tooling (nomic-ai#1602)

Updated readme for correct install instructions (nomic-ai#1607)

Co-authored-by: aj-gameon <aj@gameontechnology.com>

llmodel_c: improve quality of error messages (nomic-ai#1625)

Add .gguf files to .gitignore and remove unused
Dockerfile argument and app/__init__.py file

Delete gpt4all-api/gpt4all_api/app/api_v1/routes/__init__.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>

Delete gpt4all-api/test.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>

Delete gpt4all-api/completiontest.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>

Revert "Delete gpt4all-api/completiontest.py"

This reverts commit 08e8eea.

Revert "Delete gpt4all-api/test.py"

This reverts commit 7de26be.

Delete test files for local LLM development

Refactor code for improved readability and
performance.

Delete gpt4all-api/completiontest.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>

Delete gpt4all-api/test.py

Signed-off-by: Daniel Salvatierra <dsalvat1@gmail.com>

Refactor code for improved readability and
performance.

Resolve

Delete test batched completion function with
OpenAI API.
dpsalvatierra committed Nov 12, 2023
1 parent bc88271 commit a5baa8b
Showing 42 changed files with 6,032 additions and 4,438 deletions.
4 changes: 3 additions & 1 deletion .circleci/continue_config.yml
@@ -856,6 +856,7 @@ jobs:
- node/install-packages:
app-dir: gpt4all-bindings/typescript
pkg-manager: yarn
override-ci-command: yarn install
- run:
command: |
cd gpt4all-bindings/typescript
@@ -885,6 +886,7 @@ jobs:
- node/install-packages:
app-dir: gpt4all-bindings/typescript
pkg-manager: yarn
override-ci-command: yarn install
- run:
command: |
cd gpt4all-bindings/typescript
@@ -994,7 +996,7 @@ jobs:
command: |
cd gpt4all-bindings/typescript
npm set //registry.npmjs.org/:_authToken=$NPM_TOKEN
npm publish --access public --tag alpha
npm publish
workflows:
version: 2
5 changes: 4 additions & 1 deletion .gitignore
@@ -183,4 +183,7 @@ build_*
build-*

# IntelliJ
.idea/
.idea/

# gguf files
*.gguf
4 changes: 3 additions & 1 deletion gpt4all-api/docker-compose.gpu.yaml
@@ -8,8 +8,10 @@ services:
environment:
- HUGGING_FACE_HUB_TOKEN=token
- USE_FLASH_ATTENTION=false
- MODEL_ID=''
- MODEL_ID=${EMBEDDING}
- NUM_SHARD=1
env_file:
- ./gpt4all_api/.env
command: --model-id $MODEL_ID --num-shard $NUM_SHARD
volumes:
- ./:/data
5 changes: 4 additions & 1 deletion gpt4all-api/docker-compose.yaml
@@ -7,13 +7,16 @@ services:
restart: always #restart on error (usually code compilation from save during bad state)
ports:
- "4891:4891"
env_file:
- './gpt4all_api/.env'
environment:
- APP_ENVIRONMENT=dev
- WEB_CONCURRENCY=2
- LOGLEVEL=debug
- PORT=4891
- model=ggml-mpt-7b-chat.bin
- model=${MODEL_BIN}
- inference_mode=cpu
volumes:
- './gpt4all_api/app:/app'
- './gpt4all_api/models:/models'
command: ["/start-reload.sh"]
7 changes: 7 additions & 0 deletions gpt4all-api/gpt4all.code-workspace
@@ -0,0 +1,7 @@
{
"folders": [
{
"path": ".."
}
]
}
2 changes: 0 additions & 2 deletions gpt4all-api/gpt4all_api/Dockerfile.buildkit
@@ -1,8 +1,6 @@
# syntax=docker/dockerfile:1.0.0-experimental
FROM tiangolo/uvicorn-gunicorn:python3.11

ARG MODEL_BIN=ggml-mpt-7b-chat.bin

# Put first so anytime this file changes other cached layers are invalidated.
COPY gpt4all_api/requirements.txt /requirements.txt

Empty file.
Empty file.
Empty file.
22 changes: 16 additions & 6 deletions gpt4all-api/gpt4all_api/app/tests/test_endpoints.py
@@ -2,24 +2,34 @@
Use the OpenAI python API to test gpt4all models.
"""
from typing import List, get_args
import os
from dotenv import load_dotenv

import openai

openai.api_base = "http://localhost:4891/v1"

openai.api_key = "not needed for a local LLM"

# Load the .env file
env_path = 'gpt4all-api/gpt4all_api/.env'
load_dotenv(dotenv_path=env_path)

# Fetch MODEL_ID from .env file
model_id = os.getenv('MODEL_BIN', 'default_model_id')
embedding = os.getenv('EMBEDDING', 'default_embedding_model_id')
print (model_id)
print (embedding)

def test_completion():
model = "ggml-mpt-7b-chat.bin"
model = model_id
prompt = "Who is Michael Jordan?"
response = openai.Completion.create(
model=model, prompt=prompt, max_tokens=50, temperature=0.28, top_p=0.95, n=1, echo=True, stream=False
)
assert len(response['choices'][0]['text']) > len(prompt)

def test_streaming_completion():
model = "ggml-mpt-7b-chat.bin"
model = model_id
prompt = "Who is Michael Jordan?"
tokens = []
for resp in openai.Completion.create(
@@ -38,7 +48,7 @@ def test_streaming_completion():


def test_batched_completion():
model = "ggml-mpt-7b-chat.bin"
model = model_id
prompt = "Who is Michael Jordan?"
response = openai.Completion.create(
model=model, prompt=[prompt] * 3, max_tokens=50, temperature=0.28, top_p=0.95, n=1, echo=True, stream=False
@@ -48,12 +58,12 @@ def test_batched_completion():


def test_embedding():
model = "ggml-all-MiniLM-L6-v2-f16.bin"
model = embedding
prompt = "Who is Michael Jordan?"
response = openai.Embedding.create(model=model, input=prompt)
output = response["data"][0]["embedding"]
args = get_args(List[float])

assert response["model"] == model
assert isinstance(output, list)
assert all(isinstance(x, args) for x in output)
assert all(isinstance(x, args) for x in output)
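
Both compose files and these tests now read `gpt4all_api/.env`, which is expected to define `MODEL_BIN` (a gguf chat model placed under `gpt4all_api/models`) and `EMBEDDING` (an embedding model). A minimal sketch of exercising the running API the way the tests do (the fallback filename is illustrative, not part of this commit):

```python
import os

import openai
from dotenv import load_dotenv

openai.api_base = "http://localhost:4891/v1"
openai.api_key = "not needed for a local LLM"

# Same file the tests load; MODEL_BIN and EMBEDDING are the keys this commit introduces.
load_dotenv("gpt4all-api/gpt4all_api/.env")
model = os.getenv("MODEL_BIN", "mistral-7b-openorca.Q4_0.gguf")  # fallback name is illustrative

response = openai.Completion.create(
    model=model, prompt="Who is Michael Jordan?", max_tokens=50, temperature=0.28
)
print(response["choices"][0]["text"])
```
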
1 change: 1 addition & 0 deletions gpt4all-api/gpt4all_api/models/README.md
@@ -0,0 +1 @@
## Place your gguf models and embeddings in this folder
10 changes: 5 additions & 5 deletions gpt4all-api/makefile
@@ -28,19 +28,19 @@ clean_testenv:
fresh_testenv: clean_testenv testenv

venv:
if [ ! -d $(ROOT_DIR)/env ]; then $(PYTHON) -m venv $(ROOT_DIR)/env; fi
if [ ! -d $(ROOT_DIR)/venv ]; then $(PYTHON) -m venv $(ROOT_DIR)/venv; fi

dependencies: venv
source $(ROOT_DIR)/env/bin/activate; $(PYTHON) -m pip install -r $(ROOT_DIR)/$(APP_NAME)/requirements.txt
source $(ROOT_DIR)/venv/bin/activate; $(PYTHON) -m pip install -r $(ROOT_DIR)/$(APP_NAME)/requirements.txt

clean: clean_testenv
# Remove existing environment
rm -rf $(ROOT_DIR)/env;
rm -rf $(ROOT_DIR)/venv;
rm -rf $(ROOT_DIR)/$(APP_NAME)/*.pyc;


black:
source $(ROOT_DIR)/env/bin/activate; black -l 120 -S --target-version py38 $(APP_NAME)
source $(ROOT_DIR)/venv/bin/activate; black -l 120 -S --target-version py38 $(APP_NAME)

isort:
source $(ROOT_DIR)/env/bin/activate; isort --ignore-whitespace --atomic -w 120 $(APP_NAME)
source $(ROOT_DIR)/venv/bin/activate; isort --ignore-whitespace --atomic -w 120 $(APP_NAME)
2 changes: 1 addition & 1 deletion gpt4all-backend/llama.cpp-mainline
23 changes: 18 additions & 5 deletions gpt4all-backend/llamamodel.cpp
@@ -385,22 +385,35 @@ DLL_EXPORT const char *get_build_variant() {
}

DLL_EXPORT bool magic_match(const char * fname) {

struct ggml_context * ctx_meta = NULL;
struct gguf_init_params params = {
/*.no_alloc = */ true,
/*.ctx = */ &ctx_meta,
};
gguf_context *ctx_gguf = gguf_init_from_file(fname, params);
if (!ctx_gguf)
if (!ctx_gguf) {
std::cerr << __func__ << ": gguf_init_from_file failed\n";
return false;
}

bool valid = true;

int gguf_ver = gguf_get_version(ctx_gguf);
if (valid && gguf_ver > 3) {
std::cerr << __func__ << ": unsupported gguf version: " << gguf_ver << "\n";
valid = false;
}

bool isValid = gguf_get_version(ctx_gguf) <= 3;
auto arch = get_arch_name(ctx_gguf);
isValid = isValid && (arch == "llama" || arch == "starcoder" || arch == "falcon" || arch == "mpt");
if (valid && !(arch == "llama" || arch == "starcoder" || arch == "falcon" || arch == "mpt")) {
if (!(arch == "gptj" || arch == "bert")) { // we support these via other modules
std::cerr << __func__ << ": unsupported model architecture: " << arch << "\n";
}
valid = false;
}

gguf_free(ctx_gguf);
return isValid;
return valid;
}

DLL_EXPORT LLModel *construct() {
7 changes: 7 additions & 0 deletions gpt4all-backend/llmodel.cpp
@@ -123,11 +123,18 @@ const std::vector<LLModel::Implementation> &LLModel::Implementation::implementat
}

const LLModel::Implementation* LLModel::Implementation::implementation(const char *fname, const std::string& buildVariant) {
bool buildVariantMatched = false;
for (const auto& i : implementationList()) {
if (buildVariant != i.m_buildVariant) continue;
buildVariantMatched = true;

if (!i.m_magicMatch(fname)) continue;
return &i;
}

if (!buildVariantMatched) {
std::cerr << "LLModel ERROR: Could not find any implementations for build variant: " << buildVariant << "\n";
}
return nullptr;
}

28 changes: 8 additions & 20 deletions gpt4all-backend/llmodel_c.cpp
@@ -11,45 +11,33 @@ struct LLModelWrapper {
~LLModelWrapper() { delete llModel; }
};


thread_local static std::string last_error_message;


llmodel_model llmodel_model_create(const char *model_path) {
auto fres = llmodel_model_create2(model_path, "auto", nullptr);
const char *error;
auto fres = llmodel_model_create2(model_path, "auto", &error);
if (!fres) {
fprintf(stderr, "Invalid model file\n");
fprintf(stderr, "Unable to instantiate model: %s\n", error);
}
return fres;
}

llmodel_model llmodel_model_create2(const char *model_path, const char *build_variant, llmodel_error *error) {
llmodel_model llmodel_model_create2(const char *model_path, const char *build_variant, const char **error) {
auto wrapper = new LLModelWrapper;
int error_code = 0;

try {
wrapper->llModel = LLModel::Implementation::construct(model_path, build_variant);
if (!wrapper->llModel) {
last_error_message = "Model format not supported (no matching implementation found)";
}
} catch (const std::exception& e) {
error_code = EINVAL;
last_error_message = e.what();
}

if (!wrapper->llModel) {
delete std::exchange(wrapper, nullptr);
// Get errno and error message if none
if (error_code == 0) {
if (errno != 0) {
error_code = errno;
last_error_message = std::strerror(error_code);
} else {
error_code = ENOTSUP;
last_error_message = "Model format not supported (no matching implementation found)";
}
}
// Set error argument
if (error) {
error->message = last_error_message.c_str();
error->code = error_code;
*error = last_error_message.c_str();
}
}
return reinterpret_cast<llmodel_model*>(wrapper);
15 changes: 2 additions & 13 deletions gpt4all-backend/llmodel_c.h
@@ -23,17 +23,6 @@ extern "C" {
*/
typedef void *llmodel_model;

/**
* Structure containing any errors that may eventually occur
*/
struct llmodel_error {
const char *message; // Human readable error description; Thread-local; guaranteed to survive until next llmodel C API call
int code; // errno; 0 if none
};
#ifndef __cplusplus
typedef struct llmodel_error llmodel_error;
#endif

/**
* llmodel_prompt_context structure for holding the prompt context.
* NOTE: The implementation takes care of all the memory handling of the raw logits pointer and the
@@ -105,10 +94,10 @@ DEPRECATED llmodel_model llmodel_model_create(const char *model_path);
* Recognises correct model type from file at model_path
* @param model_path A string representing the path to the model file; will only be used to detect model type.
* @param build_variant A string representing the implementation to use (auto, default, avxonly, ...),
* @param error A pointer to a llmodel_error; will only be set on error.
* @param error A pointer to a string; will only be set on error.
* @return A pointer to the llmodel_model instance; NULL on error.
*/
llmodel_model llmodel_model_create2(const char *model_path, const char *build_variant, llmodel_error *error);
llmodel_model llmodel_model_create2(const char *model_path, const char *build_variant, const char **error);

/**
* Destroy a llmodel instance.
4 changes: 2 additions & 2 deletions gpt4all-bindings/golang/Makefile
@@ -139,7 +139,7 @@ $(info I CXX: $(CXXV))
$(info )

llmodel.o:
mkdir buildllm
[ -e buildllm ] || mkdir buildllm
cd buildllm && cmake ../../../gpt4all-backend/ $(CMAKEFLAGS) && make
cd buildllm && cp -rf CMakeFiles/llmodel.dir/llmodel_c.cpp.o ../llmodel_c.o
cd buildllm && cp -rf CMakeFiles/llmodel.dir/llmodel.cpp.o ../llmodel.o
@@ -150,7 +150,7 @@ clean:
rm -rf buildllm
rm -rf example/main

binding.o:
binding.o: binding.cpp binding.h
$(CXX) $(CXXFLAGS) binding.cpp -o binding.o -c $(LDFLAGS)

libgpt4all.a: binding.o llmodel.o
7 changes: 3 additions & 4 deletions gpt4all-bindings/golang/binding.cpp
@@ -17,11 +17,10 @@

void* load_model(const char *fname, int n_threads) {
// load the model
llmodel_error new_error{};
const char *new_error;
auto model = llmodel_model_create2(fname, "auto", &new_error);
if (model == nullptr ){
fprintf(stderr, "%s: error '%s'\n",
__func__, new_error.message);
if (model == nullptr) {
fprintf(stderr, "%s: error '%s'\n", __func__, new_error);
return nullptr;
}
if (!llmodel_loadModel(model, fname)) {
Empty file.
@@ -1,6 +1,7 @@
package com.hexadevlabs.gpt4all;

import jnr.ffi.Pointer;
import jnr.ffi.byref.PointerByReference;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -176,7 +177,7 @@ public LLModel(Path modelPath) {
modelName = modelPath.getFileName().toString();
String modelPathAbs = modelPath.toAbsolutePath().toString();

LLModelLibrary.LLModelError error = new LLModelLibrary.LLModelError(jnr.ffi.Runtime.getSystemRuntime());
PointerByReference error = new PointerByReference();

// Check if model file exists
if(!Files.exists(modelPath)){
@@ -192,7 +193,7 @@ public LLModel(Path modelPath) {
model = library.llmodel_model_create2(modelPathAbs, "auto", error);

if(model == null) {
throw new IllegalStateException("Could not load, gpt4all backend returned error: " + error.message);
throw new IllegalStateException("Could not load, gpt4all backend returned error: " + error.getValue().getString(0));
}
library.llmodel_loadModel(model, modelPathAbs);

@@ -631,4 +632,4 @@ public void close() throws Exception {
library.llmodel_model_destroy(model);
}

}
}