RAG PHP Application

A Retrieval-Augmented Generation (RAG) application that combines Large Language Models with document retrieval.

Application Demo

Table of Contents

  1. Overview
  2. Features
  3. Prerequisites
  4. Installation
  5. Usage
  6. Example Outputs
  7. Architecture
  8. Debugging
  9. Customize
  10. Resources
  11. Local Ollama Setup
  12. Contributing

🎯 Overview

This application uses OpenAI's GPT-4 and other LLMs to generate contextually relevant responses to user input. It searches a database of over 1,000 websites to provide accurate information, with special handling for disambiguating entities that share the same name. It can be used for semantic search and context-aware question answering over any text dataset.

The application demonstrates an interesting use case of distinguishing between two different people named "Michał Żarnecki" in different contexts, showcasing the power of context-aware information retrieval.

📖 For a detailed explanation of concepts used in this application, check out my article on Medium.

🚀 Features

  • Multiple LLM support (GPT-4, Claude-3.5, Llama3.2, Bielik, Gemini2)
  • Vector database for efficient information retrieval
  • Web interface, API endpoints, and CLI access
  • Context-aware response generation
  • Docker-based setup for easy deployment

📋 Prerequisites

🛠️ Installation

  1. Install Dependencies

    cd app/src && composer install
  2. Configure Environment

    • Copy .env-sample to .env in app/src
    • Choose your model in .env:
      MODEL=<model-option>  # Options: GPT-4o, Claude-3.5, Llama3.2, Mixtral, Bielik, Gemini2, DeepSeek, DeepSeek-R1-7B, DeepSeek-Coder-v2
  3. API Configuration

    Local API Options (Mixtral, Llama3.2, Bielik, DeepSeek-R1-7B, DeepSeek-Coder-v2)

    • No API key required (go directly to step 4)
    • Requires more CPU/RAM
    • GPU recommended for better performance
    • Uses Ollama for local model serving

    Cloud API Options (GPT-4o, Claude, Gemini, DeepSeek)

    • Requires API key
    • Lower resource requirements
    • Add to .env:
      OPENAI_API_KEY=your-api-key
      (or set the environment variable that corresponds to your chosen model)
    • Get an OpenAI API key from the OpenAI Platform (or the equivalent key for another API-based model)
    • Get a DeepSeek API key from the DeepSeek Platform
    • Get a Claude API key from the Claude Platform
    • Get a Gemini API key from the Gemini Platform
  4. Launch Application

    docker-compose up

    Note: If you access the LLM through an API (that is, any option other than Ollama), run docker-compose -f docker-compose-llm-api.yaml up instead, to avoid wasting time downloading models to your local environment.

    Note: The initial document transformation may take a long time. By default, only part of the documents is loaded. To process all documents, modify $skipFirstN in app/src/service/DocumentLoader.php:20.

  5. Access Application

    • Wait for the setup completion message:
      php-app             | Loaded documents complete
      php-app             | Postgres is ready - executing command
      php-app             | [Sat Nov 02 11:32:28.365214 2024] [core:notice] [pid 1:tid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
      
    • Open http://127.0.0.1:2037 in your browser

💻 Usage

Web Interface

Visit http://127.0.0.1:2037 and enter your query.

Application Form

API Endpoint

curl -d '{"prompt":"what is result of 2+2?"}' \
     -H "Content-Type: application/json" \
     -X POST \
     'http://127.0.0.1:2037/process.php?api'
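
The same request can also be sent from PHP. The following is a minimal sketch, assuming only what the curl call above shows (a JSON body with a prompt field, POSTed to process.php?api). The function names buildRagRequestOptions and askRag are hypothetical and not part of the application, and the timeout is an assumed value.

```php
<?php
declare(strict_types=1);

/** Build stream-context options for a JSON POST that carries the prompt. */
function buildRagRequestOptions(string $prompt): array
{
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/json\r\n",
            'content' => json_encode(['prompt' => $prompt], JSON_THROW_ON_ERROR),
            'timeout' => 120, // assumed value; local models can be slow to answer
        ],
    ];
}

/** Send the prompt to the API endpoint and return the raw response body. */
function askRag(string $prompt, string $url = 'http://127.0.0.1:2037/process.php?api'): string
{
    $context = stream_context_create(buildRagRequestOptions($prompt));
    $body = file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("Request to {$url} failed");
    }
    return $body;
}
```

With the containers running, echo askRag('what is result of 2+2?'); prints the raw response body. The response format is not described here, so the sketch deliberately does not parse it.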

CLI Interface

docker exec -it php-app sh
php minicli rag

📝 Example Outputs

Basic Arithmetic

Input: What is the result of 2 + 2?
Response: The result of 2 + 2 is 4.

Complex Context Analysis

Input: Is Michał Żarnecki programmer the same person as Michał Żarnecki audio engineer?
Response: These are two different individuals:

- The programmer specializes in Python, PHP, JavaScript, and AI/ML technologies
- The audio engineer (1946-2016) was a renowned sound director in Polish film

🔄 Architecture

Basic Concept

RAG Basic Concept

Detailed Architecture

Detailed Architecture

Evaluate response

Two string comparison metrics are implemented. Each compares the generated answer with an expected reference text. They are a limited solution, because they only measure token overlap and require a reference text to be supplied.

  • ROUGE
  • BLEU
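
To make the token-overlap limitation concrete, here is a minimal sketch of a unigram overlap score in the style of ROUGE-1. It is not the application's StringComparisonEvaluator, whose implementation is not shown here. The function names tokenize and rouge1 are hypothetical, and the tokenization (lowercase, split on non-alphanumeric characters) is an assumption.

```php
<?php
declare(strict_types=1);

/** Split text into lowercase word tokens. */
function tokenize(string $text): array
{
    // ASCII-only lowercasing keeps the sketch free of extensions;
    // use mb_strtolower() for non-ASCII text.
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return $tokens === false ? [] : $tokens;
}

/**
 * ROUGE-1 style score: clipped unigram overlap between reference and candidate.
 * Returns recall, precision and F1, each between 0 and 1.
 */
function rouge1(string $reference, string $candidate): array
{
    $refCounts = array_count_values(tokenize($reference));
    $candCounts = array_count_values(tokenize($candidate));

    // Count each shared token at most as often as it occurs in both texts.
    $overlap = 0;
    foreach ($candCounts as $token => $count) {
        $overlap += min($count, $refCounts[$token] ?? 0);
    }

    $refTotal = array_sum($refCounts);
    $candTotal = array_sum($candCounts);
    $recall = $refTotal > 0 ? $overlap / $refTotal : 0.0;
    $precision = $candTotal > 0 ? $overlap / $candTotal : 0.0;
    $f1 = ($recall + $precision) > 0
        ? 2 * $recall * $precision / ($recall + $precision)
        : 0.0;

    return ['recall' => $recall, 'precision' => $precision, 'f1' => $f1];
}

$scores = rouge1('The result of 2 + 2 is 4.', 'The result is 4');
```

In this example every candidate token appears in the reference, so precision is 1, while recall is 4/7 because the candidate covers only four of the seven reference tokens. A correct answer phrased with different words would score poorly, which is why a criteria-based evaluation is useful alongside these metrics.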

The second evaluator is a criteria evaluator. It passes the prompt and the generated answer to the GPT-4o model and asks it to score the answer from 1 to 5 on each of the following criteria:

  • correctness: Is the answer accurate, and free of mistakes?
  • helpfulness: Does the response provide value or solve the user's problem effectively?
  • relevance: Does the answer address the question accurately?
  • conciseness: Is the answer free of unnecessary details?
  • clarity: Is the language clear and understandable?
  • factual_accuracy: Are the facts provided correct?
  • insensitivity: Does the response avoid dismissing, invalidating, or overlooking cultural or social sensitivities?
  • maliciousness: Does the response avoid promoting harm, hatred, or ill intent?
  • harmfulness: Does the response avoid causing potential harm or discomfort to individuals or groups?
  • coherence: Does the response maintain logical flow and structure?
  • misogyny: Does the response avoid sexist language, stereotypes, or any form of gender-based bias?
  • criminality: Does the response avoid promoting illegal activities or providing guidance on committing crimes?
  • controversiality: Does the response avoid unnecessarily sparking divisive or sensitive debates?
  • creativity: (Optional) Is the response innovative or insightful?

        $criteriaEvaluator = new CriteriaEvaluator();
        $tokenSimilarityEvaluator = new StringComparisonEvaluator();
        // Reference answer that the generated response is compared against.
        $compareResp = "Michał Żarnecki programmer is not the same person as Michał Żarnecki audio engineer.
        Michał Żarnecki programmer is still living, while Michał Żarnecki audio engineer died in 2016. They cannot be the same person.
        Michał Żarnecki programmer is designing systems and programming AI based solutions. He is also a lecturer.
        Michał Żarnecki audio engineer was also audio director that created music to famous Polish movies.";

        $resp['evaluation'] = [
            'ROUGE' => $tokenSimilarityEvaluator->calculateROUGE($compareResp, $response),
            'BLEU' => $tokenSimilarityEvaluator->calculateBLEU($compareResp, $response),
            'criteria' => $criteriaEvaluator->evaluate($payload->getRagPrompt(), $response)
        ];

Response:

{
  "ROUGE": {
    "recall": 0.23,
    "precision": 0.3,
    "f1": 0.26
  },
  "BLEU": 0.22,
  "criteria": {
    "correctness": 5,
    "helpfulness": 4,
    "relevance": 4,
    "conciseness": 5,
    "clarity": 4,
    "factual_accuracy": 4,
    "insensitivity": 5,
    "maliciousness": 0,
    "harmfulness": 0,
    "coherence": 1,
    "misogyny": 0,
    "criminality": 0,
    "controversiality": 0,
    "creativity": 1
  }
}

Results for the Michał Żarnecki RAG example:

Gemini2, Claude-3.5 Sonnet, and GPT-4o all handled this task well, with Gemini2 achieving the highest score. Bielik responded incorrectly. Mistral and Llama 3.2 were uncertain in their responses.

🐛 Debugging

To rebuild after PHP script changes:

docker-compose rm
docker rmi -f php-rag
docker-compose up

To rebuild after changes related to the pgvector database:

docker-compose rm
docker rmi -f ankane/pgvector
docker-compose up

🎚 Customize

  • Use a different LLM.
    Choose from the available LLMs: GPT-4o, Claude-3.5, Llama3.2, Mixtral, Bielik, DeepSeek, DeepSeek-R1-7B, DeepSeek-Coder-v2, Gemini2
    To use another model, change the model name in the LLM client class for that provider, for example app/src/service/openai/GeneratedTextFromGPTProvider.php:13
    final class GeneratedTextFromGPTProvider extends AbstractGPTAPIClient
        implements StageInterface, GeneratedTextProviderInterface
    {
        private string $model = 'gpt-4o';
  • Use a different embedding model.
    Modify app/src/loadDocuments.php:13 and app/src/process.php:20.
    Use one of the classes that implement TextEncoderInterface there, or create your own class that satisfies the interface.
    Embedding size can affect text matching precision.
  • Modify the system prompt.
    Edit the system prompt text in \service\PromptResolver::getSystemPrompt().
    You can add further instructions, example solutions (one-shot/few-shot), or reasoning patterns (chain of thought).
    private function getSystemPrompt(): string
    {
        return 'You are a helpful assistant that answers questions based on source documents.' . PHP_EOL;
    }
  • Use a different number of retrieved documents.
    Change $limit in DocumentProvider::getSimilarDocuments().
    public function getSimilarDocuments(
        string $prompt,
        string $embeddingPrompt,
        bool $useReranking = false,
        int $limit = 10,
        string $distanceFunction = 'l2'
    ) {
  • Use reranking.
    If too many documents are passed to the LLM, it may focus on the wrong information. If too few are passed, the most important sources may be missed.
    Set Payload::$useReranking to true in app/src/process.php:25.
  • Use a different text matching algorithm.
    Change $distanceFunction in DocumentProvider::getSimilarDocuments().
    Pick one of l2|cosine|innerProduct, or add support for another one (see https://github.com/pgvector/pgvector, section "Querying").
    public function getSimilarDocuments(
        string $prompt,
        string $embeddingPrompt,
        bool $useReranking = false,
        int $limit = 10,
        string $distanceFunction = 'l2'
    ) {
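
The three distance options correspond to pgvector's operators for L2 distance (<->), cosine distance (<=>), and negative inner product (<#>). As a rough illustration of how they can rank documents differently, the sketch below computes each measure in plain PHP on toy two-dimensional vectors. The function names are hypothetical and are not part of DocumentProvider; in the application the comparison is expected to run inside PostgreSQL.

```php
<?php
declare(strict_types=1);

/** Dot product of two equal-length vectors: larger means more similar. */
function innerProduct(array $a, array $b): float
{
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += $value * $b[$i];
    }
    return $sum;
}

/** Euclidean (L2) distance: smaller means more similar. */
function l2Distance(array $a, array $b): float
{
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += ($value - $b[$i]) ** 2;
    }
    return sqrt($sum);
}

/** Cosine distance = 1 - cosine similarity; it ignores vector length. */
function cosineDistance(array $a, array $b): float
{
    $norms = sqrt(innerProduct($a, $a)) * sqrt(innerProduct($b, $b));
    if ($norms == 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for a zero vector.');
    }
    return 1.0 - innerProduct($a, $b) / $norms;
}

$query = [1.0, 0.0];
$sameDirection = [3.0, 0.0]; // points the same way as the query, but is longer
$orthogonal = [0.0, 1.0];    // unrelated direction, same length as the query

printf("l2:     %.3f vs %.3f\n", l2Distance($query, $sameDirection), l2Distance($query, $orthogonal));
printf("cosine: %.3f vs %.3f\n", cosineDistance($query, $sameDirection), cosineDistance($query, $orthogonal));
printf("inner:  %.3f vs %.3f\n", innerProduct($query, $sameDirection), innerProduct($query, $orthogonal));
```

On these vectors, L2 treats the orthogonal vector as the closer one (about 1.41 versus 2.0), while cosine distance and inner product both prefer the vector that points in the same direction. When embeddings are normalized to unit length the three measures produce the same ranking, so the choice matters mainly for unnormalized embeddings.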

📚 Resources

🔧 Local Ollama Setup

  1. Download Ollama
  2. Pull required models:
    ollama pull llama3:latest
    ollama pull mxbai-embed-large
  3. Verify installation:
    ollama list
  4. Start server:
    ollama serve
  5. Use MxbaiTextEncoder.php class in app/src/loadDocuments.php

👥 Contributing

Found a bug or have an improvement in mind? Please:

Your contributions make this project better for everyone!