# @elizaos/plugin-llama

Core LLaMA plugin for Eliza OS that provides local Large Language Model capabilities.

## Overview

The LLaMA plugin serves as a foundational component of Eliza OS, providing local LLM capabilities using LLaMA models. It enables efficient and customizable text generation with both CPU and GPU support.

## Features

- **Local LLM Support**: Run LLaMA models locally
- **GPU Acceleration**: CUDA support for faster inference
- **Flexible Configuration**: Customizable parameters for text generation
- **Message Queuing**: Efficient handling of multiple requests
- **Automatic Model Management**: Download and verification systems

## Installation

```bash
npm install @elizaos/plugin-llama
```

## Configuration

The plugin can be configured through environment variables:

### Core Settings

```env
LLAMALOCAL_PATH=your_model_storage_path
OLLAMA_MODEL=optional_ollama_model_name
```
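
For orientation, here is a minimal sketch of how these variables might be consumed at startup; the fallback value and log lines are illustrative assumptions, not documented plugin behavior:

```typescript
// Sketch: reading the plugin's environment variables (assumed fallback).
const modelStoragePath = process.env.LLAMALOCAL_PATH ?? "./models"; // assumed default
const ollamaModel = process.env.OLLAMA_MODEL; // optional; unset means the bundled LLaMA model is used

console.log(`Model storage path: ${modelStoragePath}`);
if (ollamaModel) {
  console.log(`Using Ollama model override: ${ollamaModel}`);
}
```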

## Usage

```typescript
import { createLlamaPlugin } from "@elizaos/plugin-llama";

// Initialize the plugin
const llamaPlugin = createLlamaPlugin();

// Register with Eliza OS
elizaos.registerPlugin(llamaPlugin);
```

## Services

### LlamaService

Provides local LLM capabilities using LLaMA models.

#### Technical Details

- **Model**: Hermes-3-Llama-3.1-8B (8-bit quantized)
- **Source**: Hugging Face (NousResearch/Hermes-3-Llama-3.1-8B-GGUF)
- **Context Size**: 8192 tokens
- **Inference**: CPU and GPU (CUDA) support

#### Features

1. **Text Generation** (parameters illustrated in the sketch after this list)

   - Completion-style inference
   - Temperature control
   - Stop token configuration
   - Frequency and presence penalties
   - Maximum token limit control

2. **Model Management**

   - Automatic model downloading
   - Model file verification
   - Automatic retry on initialization failures
   - GPU detection for acceleration

3. **Performance**

   - Message queuing system
   - CUDA acceleration when available
   - Configurable context size
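
The generation parameters and queuing behavior above can be made concrete with a small sketch. Everything here is illustrative: the `GenerationOptions` fields mirror the parameters listed, and `CompletionQueue`/`runModel` are hypothetical stand-ins for the plugin's internals, not its actual classes.

```typescript
// Hypothetical option names mirroring the parameters listed above.
interface GenerationOptions {
  temperature: number;      // sampling randomness (0 = near-deterministic)
  stop: string[];           // stop tokens that end generation early
  frequencyPenalty: number; // discourages repeating frequent tokens
  presencePenalty: number;  // discourages reusing tokens already present
  maxTokens: number;        // hard cap on generated tokens
}

// Serializes requests so only one completion runs at a time,
// the usual reason a local-LLM plugin queues incoming messages.
class CompletionQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task, task);
    this.tail = result.catch(() => undefined); // keep the chain alive on errors
    return result;
  }
}

// Placeholder for the actual model call.
async function runModel(prompt: string, opts: GenerationOptions): Promise<string> {
  return `<completion of "${prompt}" with up to ${opts.maxTokens} tokens>`;
}

const queue = new CompletionQueue();
const options: GenerationOptions = {
  temperature: 0.7,
  stop: ["</s>"],
  frequencyPenalty: 0.5,
  presencePenalty: 0.5,
  maxTokens: 256,
};

queue.enqueue(() => runModel("Hello", options)).then(console.log);
queue.enqueue(() => runModel("World", options)).then(console.log);
```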

## Troubleshooting

### Common Issues

1. **Model Initialization Failures** (a quick diagnostic sketch follows this list)

   ```
   Error: Model initialization failed
   ```

   - Verify the model file exists and is not corrupted
   - Check available system memory
   - Ensure CUDA is properly configured (if using GPU)

2. **Performance Issues**

   ```
   Warning: No CUDA detected - local response will be slow
   ```

   - Verify your CUDA installation if using GPU
   - Check system resources
   - Consider reducing the context size
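
The first two checks above can be scripted with Node's standard library alone. A minimal diagnostic sketch; the path resolution is an assumption (adjust it if `LLAMALOCAL_PATH` points at a directory rather than a file):

```typescript
import { existsSync, statSync } from "node:fs";
import { freemem, totalmem } from "node:os";

// Assumption: LLAMALOCAL_PATH points directly at the GGUF model file.
const modelPath = process.env.LLAMALOCAL_PATH ?? "./models/model.gguf";

if (!existsSync(modelPath)) {
  console.error(`Model file missing at ${modelPath}; the plugin may need to re-download it.`);
} else {
  const sizeGB = statSync(modelPath).size / 1024 ** 3;
  console.log(`Model file present: ${sizeGB.toFixed(1)} GB`);
}

// An 8-bit 8B model needs several GB of free memory to load.
const toGB = (bytes: number) => (bytes / 1024 ** 3).toFixed(1);
console.log(`Memory: ${toGB(freemem())} GB free of ${toGB(totalmem())} GB`);
```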

### Debug Mode

Enable debug logging for detailed troubleshooting:

```typescript
process.env.DEBUG = "eliza:plugin-llama:*";
```
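
Equivalently, the variable can usually be set in the shell before launching the agent, e.g. `DEBUG=eliza:plugin-llama:* node agent.js` (where `agent.js` stands in for your actual entry point).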

## System Requirements

- Node.js 16.x or higher (see the runtime check after this list)
- Minimum 8GB RAM recommended
- CUDA-compatible GPU (optional, for acceleration)
- Sufficient storage for model files
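
The Node.js requirement can be verified at runtime. This check is a generic sketch, not something the plugin performs itself:

```typescript
// Fail fast if the Node.js runtime is older than the required 16.x.
const major = Number(process.versions.node.split(".")[0]);
if (major < 16) {
  throw new Error(`Node.js 16.x or higher required, found ${process.versions.node}`);
}
```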

## Performance Optimization

1. **Model Selection** (see the sketch after this list)

   - Choose an appropriate model size
   - Use quantized versions when possible
   - Balance quality vs. speed

2. **Resource Management**

   - Monitor memory usage
   - Configure an appropriate context size
   - Optimize batch processing

3. **GPU Utilization**

   - Enable CUDA when available
   - Monitor GPU memory
   - Balance the CPU/GPU workload
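
To make these trade-offs concrete, here is a hedged sketch; `ModelProfile` and the specific values are hypothetical illustrations of the balance points, not settings the plugin exposes:

```typescript
// Hypothetical profile type: each field trades speed and memory against quality.
interface ModelProfile {
  quantization: "Q4_0" | "Q8_0" | "F16"; // lower precision: smaller and faster, lower quality
  contextSize: number;                   // larger contexts cost more memory per request
  gpuLayers: number;                     // layers offloaded to CUDA (0 = CPU only)
}

// Rough rules of thumb, not measured benchmarks:
const lowMemory: ModelProfile = { quantization: "Q4_0", contextSize: 2048, gpuLayers: 0 };
const balanced: ModelProfile = { quantization: "Q8_0", contextSize: 8192, gpuLayers: 0 };
const gpuAccelerated: ModelProfile = { quantization: "Q8_0", contextSize: 8192, gpuLayers: 32 };

console.log({ lowMemory, balanced, gpuAccelerated });
```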

## Support

For issues and feature requests, please:

1. Check the troubleshooting guide above
2. Review existing GitHub issues
3. Submit a new issue with:
   - System information
   - Error logs
   - Steps to reproduce

## Credits

This plugin integrates with and builds upon:

- [node-llama-cpp](https://github.com/withcatai/node-llama-cpp): Node.js bindings for llama.cpp, used for local model execution

Special thanks to:

- The LLaMA community for model development
- The Node.js community for tooling support
- The Eliza community for testing and feedback

## License

This plugin is part of the Eliza project. See the main project repository for license information.