
@elizaos/plugin-llama

Core LLaMA plugin for Eliza OS that provides local Large Language Model capabilities.

Overview

The LLaMA plugin serves as a foundational component of Eliza OS, providing local LLM capabilities using LLaMA models. It enables efficient and customizable text generation with both CPU and GPU support.

Features

  • Local LLM Support: Run LLaMA models locally
  • GPU Acceleration: CUDA support for faster inference
  • Flexible Configuration: Customizable parameters for text generation
  • Message Queuing: Efficient handling of multiple concurrent requests (see the sketch after this list)
  • Automatic Model Management: Download and verification systems
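
The queue itself is internal to the plugin, but the idea is simple: requests are accepted immediately and executed one at a time. Below is a minimal TypeScript sketch of that pattern; the generate function is a hypothetical stand-in for the actual model call, not the plugin's real implementation.

// Hypothetical stand-in for the actual model call.
async function generate(prompt: string): Promise<string> {
  return `response to: ${prompt}`;
}

// Serializes async jobs so only one model call runs at a time.
class MessageQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(job: () => Promise<T>): Promise<T> {
    const result = this.tail.then(job, job); // run after the current tail
    this.tail = result.catch(() => undefined); // keep the chain alive on errors
    return result;
  }
}

// Both calls are accepted immediately but execute one after another.
const queue = new MessageQueue();
queue.enqueue(() => generate("first prompt"));
queue.enqueue(() => generate("second prompt"));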

Installation

npm install @elizaos/plugin-llama

Configuration

The plugin can be configured through environment variables:

Core Settings

LLAMALOCAL_PATH=your_model_storage_path
OLLAMA_MODEL=optional_ollama_model_name
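
For example, in a shell profile or .env file (both values below are placeholders, not defaults):

LLAMALOCAL_PATH=./models
OLLAMA_MODEL=hermes3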

Usage

import { createLlamaPlugin } from "@elizaos/plugin-llama";

// Initialize the plugin
const llamaPlugin = createLlamaPlugin();

// Register with Eliza OS
elizaos.registerPlugin(llamaPlugin);

Services

LlamaService

Provides local LLM capabilities using LLaMA models.

Technical Details

  • Model: Hermes-3-Llama-3.1-8B (8-bit quantized)
  • Source: Hugging Face (NousResearch/Hermes-3-Llama-3.1-8B-GGUF)
  • Context Size: 8192 tokens
  • Inference: CPU and GPU (CUDA) support

Features

  1. Text Generation (a call sketch follows this list)

    • Completion-style inference
    • Temperature control
    • Stop token configuration
    • Frequency and presence penalties
    • Maximum token limit control
  2. Model Management

    • Automatic model downloading
    • Model file verification
    • Automatic retry on initialization failures
    • GPU detection for acceleration
  3. Performance

    • Message queuing system
    • CUDA acceleration when available
    • Configurable context size
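
The service's exact interface is internal, but a call that exercises the generation controls above might look like the following sketch. The handle llamaService, the method name queueTextCompletion, and the parameter order are assumptions for illustration, not a confirmed public API.

// Hypothetical service handle; the real lookup goes through the
// Eliza OS runtime and is not shown here.
declare const llamaService: {
  queueTextCompletion(
    prompt: string,
    temperature: number,
    stop: string[],
    frequencyPenalty: number,
    presencePenalty: number,
    maxTokens: number
  ): Promise<string>;
};

async function demo(): Promise<void> {
  const reply = await llamaService.queueTextCompletion(
    "Summarize the plot of Hamlet in two sentences.",
    0.7,      // temperature: higher means more varied output
    ["</s>"], // stop tokens: end generation at the sequence terminator
    0.5,      // frequency penalty: discourage verbatim repetition
    0.5,      // presence penalty: nudge the model toward new topics
    256       // cap on generated tokens
  );
  console.log(reply);
}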

Troubleshooting

Common Issues

  1. Model Initialization Failures
Error: Model initialization failed
  • Verify the model file exists and is not corrupted (a quick file check is sketched below)
  • Check available system memory
  • Ensure CUDA is properly configured (if using GPU)
  2. Performance Issues
Warning: No CUDA detected - local response will be slow
  • Verify CUDA installation if using GPU
  • Check system resources
  • Consider reducing context size
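
For the first failure mode, a quick sanity check can rule out a truncated download. This sketch assumes a placeholder file name (model.gguf) under LLAMALOCAL_PATH; adjust both to your setup.

import { existsSync, statSync } from "node:fs";
import { join } from "node:path";

// Placeholder values: point this at your LLAMALOCAL_PATH and model file name.
const modelPath = join(process.env.LLAMALOCAL_PATH ?? ".", "model.gguf");

if (!existsSync(modelPath)) {
  console.error(`Model file missing: ${modelPath}`);
} else {
  const { size } = statSync(modelPath);
  // An 8-bit quantized 8B model weighs several gigabytes; a much smaller
  // file usually means the download was interrupted.
  if (size < 1_000_000_000) {
    console.warn(`Model file suspiciously small (${size} bytes); re-download it.`);
  }
}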

Debug Mode

Enable debug logging for detailed troubleshooting:

process.env.DEBUG = "eliza:plugin-llama:*";
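
Set the variable before the plugin is loaded. Assuming the conventional DEBUG-style namespace filtering, the shell equivalent is (your-agent.js is a placeholder entry point):

DEBUG="eliza:plugin-llama:*" node your-agent.js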

System Requirements

  • Node.js 16.x or higher
  • Minimum 8GB RAM recommended
  • CUDA-compatible GPU (optional, for acceleration)
  • Sufficient storage for model files

Performance Optimization

  1. Model Selection

    • Choose appropriate model size
    • Use quantized versions when possible
    • Balance quality vs speed
  2. Resource Management (a context-size heuristic is sketched after this list)

    • Monitor memory usage
    • Configure appropriate context size
    • Optimize batch processing
  3. GPU Utilization

    • Enable CUDA when available
    • Monitor GPU memory
    • Balance CPU/GPU workload
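
One way to pick a context size is to scale it down when free memory is low. The thresholds below are a rough heuristic chosen for illustration (an assumption, not a benchmarked rule), starting from the model's 8192-token window.

import { freemem } from "node:os";

// Rough heuristic: the KV cache grows roughly linearly with context length,
// so shrink the window when RAM is tight.
function pickContextSize(maxContext = 8192): number {
  const freeGiB = freemem() / 1024 ** 3;
  if (freeGiB < 4) return maxContext / 4; // 2048 tokens
  if (freeGiB < 8) return maxContext / 2; // 4096 tokens
  return maxContext; // the full 8192-token window
}

console.log(`Using context size: ${pickContextSize()}`);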

Support

For issues and feature requests, please:

  1. Check the troubleshooting guide above
  2. Review existing GitHub issues
  3. Submit a new issue with:
    • System information
    • Error logs
    • Steps to reproduce

Credits

This plugin integrates with and builds upon the LLaMA model family and the surrounding open-source tooling.

Special thanks to:

  • The LLaMA community for model development
  • The Node.js community for tooling support
  • The Eliza community for testing and feedback

License

This plugin is part of the Eliza project. See the main project repository for license information.
