# @elizaos/plugin-llama

Core LLaMA plugin for Eliza OS that provides local Large Language Model capabilities.

## Overview

The LLaMA plugin serves as a foundational component of Eliza OS, providing local LLM capabilities using LLaMA models. It enables efficient and customizable text generation with both CPU and GPU support.

## Features

- **Local LLM Support**: Run LLaMA models locally
- **GPU Acceleration**: CUDA support for faster inference
- **Flexible Configuration**: Customizable parameters for text generation
- **Message Queuing**: Efficient handling of multiple requests
- **Automatic Model Management**: Download and verification systems

## Installation

```bash
npm install @elizaos/plugin-llama
```

## Configuration

The plugin can be configured through environment variables:

### Core Settings

```env
LLAMALOCAL_PATH=your_model_storage_path
OLLAMA_MODEL=optional_ollama_model_name
```
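
For orientation, here is a minimal sketch of how these variables might be consumed at startup; the fallback value and log lines are illustrative assumptions, not documented plugin behavior:

```typescript
// Sketch: reading the plugin's environment variables (assumed fallback).
const modelStoragePath = process.env.LLAMALOCAL_PATH ?? "./models"; // assumed default
const ollamaModel = process.env.OLLAMA_MODEL; // optional; unset means the bundled LLaMA model is used

console.log(`Model storage path: ${modelStoragePath}`);
if (ollamaModel) {
  console.log(`Using Ollama model override: ${ollamaModel}`);
}
```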

## Usage

```typescript
import { createLlamaPlugin } from "@elizaos/plugin-llama";

// Initialize the plugin
const llamaPlugin = createLlamaPlugin();

// Register with Eliza OS
elizaos.registerPlugin(llamaPlugin);
```

## Services

### LlamaService

Provides local LLM capabilities using LLaMA models.

#### Technical Details

- **Model**: Hermes-3-Llama-3.1-8B (8-bit quantized)
- **Source**: Hugging Face (NousResearch/Hermes-3-Llama-3.1-8B-GGUF)
- **Context Size**: 8192 tokens
- **Inference**: CPU and GPU (CUDA) support

#### Features

1. **Text Generation** (parameters illustrated in the sketch after this list)

   - Completion-style inference
   - Temperature control
   - Stop token configuration
   - Frequency and presence penalties
   - Maximum token limit control

2. **Model Management**

   - Automatic model downloading
   - Model file verification
   - Automatic retry on initialization failures
   - GPU detection for acceleration

3. **Performance**

   - Message queuing system
   - CUDA acceleration when available
   - Configurable context size
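
The generation parameters and queuing behavior above can be made concrete with a small sketch. Everything here is illustrative: the `GenerationOptions` fields mirror the parameters listed, and `CompletionQueue`/`runModel` are hypothetical stand-ins for the plugin's internals, not its actual classes.

```typescript
// Hypothetical option names mirroring the parameters listed above.
interface GenerationOptions {
  temperature: number;      // sampling randomness (0 = near-deterministic)
  stop: string[];           // stop tokens that end generation early
  frequencyPenalty: number; // discourages repeating frequent tokens
  presencePenalty: number;  // discourages reusing tokens already present
  maxTokens: number;        // hard cap on generated tokens
}

// Serializes requests so only one completion runs at a time,
// the usual reason a local-LLM plugin queues incoming messages.
class CompletionQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task, task);
    this.tail = result.catch(() => undefined); // keep the chain alive on errors
    return result;
  }
}

// Placeholder for the actual model call.
async function runModel(prompt: string, opts: GenerationOptions): Promise<string> {
  return `<completion of "${prompt}" with up to ${opts.maxTokens} tokens>`;
}

const queue = new CompletionQueue();
const options: GenerationOptions = {
  temperature: 0.7,
  stop: ["</s>"],
  frequencyPenalty: 0.5,
  presencePenalty: 0.5,
  maxTokens: 256,
};

queue.enqueue(() => runModel("Hello", options)).then(console.log);
queue.enqueue(() => runModel("World", options)).then(console.log);
```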

## Troubleshooting

### Common Issues

1. **Model Initialization Failures** (a quick diagnostic sketch follows this list)

   ```
   Error: Model initialization failed
   ```

   - Verify the model file exists and is not corrupted
   - Check available system memory
   - Ensure CUDA is properly configured (if using GPU)

2. **Performance Issues**

   ```
   Warning: No CUDA detected - local response will be slow
   ```

   - Verify your CUDA installation if using GPU
   - Check system resources
   - Consider reducing the context size
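
The first two checks above can be scripted with Node's standard library alone. A minimal diagnostic sketch; the path resolution is an assumption (adjust it if `LLAMALOCAL_PATH` points at a directory rather than a file):

```typescript
import { existsSync, statSync } from "node:fs";
import { freemem, totalmem } from "node:os";

// Assumption: LLAMALOCAL_PATH points directly at the GGUF model file.
const modelPath = process.env.LLAMALOCAL_PATH ?? "./models/model.gguf";

if (!existsSync(modelPath)) {
  console.error(`Model file missing at ${modelPath}; the plugin may need to re-download it.`);
} else {
  const sizeGB = statSync(modelPath).size / 1024 ** 3;
  console.log(`Model file present: ${sizeGB.toFixed(1)} GB`);
}

// An 8-bit 8B model needs several GB of free memory to load.
const toGB = (bytes: number) => (bytes / 1024 ** 3).toFixed(1);
console.log(`Memory: ${toGB(freemem())} GB free of ${toGB(totalmem())} GB`);
```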

### Debug Mode

Enable debug logging for detailed troubleshooting:

```typescript
process.env.DEBUG = "eliza:plugin-llama:*";
```
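
Equivalently, the variable can usually be set in the shell before launching the agent, e.g. `DEBUG=eliza:plugin-llama:* node agent.js` (where `agent.js` stands in for your actual entry point).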

## System Requirements

- Node.js 16.x or higher (see the runtime check after this list)
- Minimum 8GB RAM recommended
- CUDA-compatible GPU (optional, for acceleration)
- Sufficient storage for model files
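
The Node.js requirement can be verified at runtime. This check is a generic sketch, not something the plugin performs itself:

```typescript
// Fail fast if the Node.js runtime is older than the required 16.x.
const major = Number(process.versions.node.split(".")[0]);
if (major < 16) {
  throw new Error(`Node.js 16.x or higher required, found ${process.versions.node}`);
}
```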

## Performance Optimization

1. **Model Selection** (see the sketch after this list)

   - Choose an appropriate model size
   - Use quantized versions when possible
   - Balance quality vs. speed

2. **Resource Management**

   - Monitor memory usage
   - Configure an appropriate context size
   - Optimize batch processing

3. **GPU Utilization**

   - Enable CUDA when available
   - Monitor GPU memory
   - Balance the CPU/GPU workload
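
To make these trade-offs concrete, here is a hedged sketch; `ModelProfile` and the specific values are hypothetical illustrations of the balance points, not settings the plugin exposes:

```typescript
// Hypothetical profile type: each field trades speed and memory against quality.
interface ModelProfile {
  quantization: "Q4_0" | "Q8_0" | "F16"; // lower precision: smaller and faster, lower quality
  contextSize: number;                   // larger contexts cost more memory per request
  gpuLayers: number;                     // layers offloaded to CUDA (0 = CPU only)
}

// Rough rules of thumb, not measured benchmarks:
const lowMemory: ModelProfile = { quantization: "Q4_0", contextSize: 2048, gpuLayers: 0 };
const balanced: ModelProfile = { quantization: "Q8_0", contextSize: 8192, gpuLayers: 0 };
const gpuAccelerated: ModelProfile = { quantization: "Q8_0", contextSize: 8192, gpuLayers: 32 };

console.log({ lowMemory, balanced, gpuAccelerated });
```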

## Support

For issues and feature requests, please:

1. Check the troubleshooting guide above
2. Review existing GitHub issues
3. Submit a new issue with:
   - System information
   - Error logs
   - Steps to reproduce

## Credits

This plugin integrates with and builds upon:

- [node-llama-cpp](https://github.com/withcatai/node-llama-cpp): Node.js bindings for llama.cpp, used for local model execution

Special thanks to:

- The LLaMA community for model development
- The Node.js community for tooling support
- The Eliza community for testing and feedback

## License

This plugin is part of the Eliza project. See the main project repository for license information.