
FAIR-Sensei

FAIR-Sensei is an open-source, Perplexity-like search engine for Research Data Management, based on sensei.

📸 Screenshots

Light Mode

Explain RDM

Dark Mode

Templates for DMPs

💡 Insights from Utilizing Open Source LLMs

The key takeaways and experiences from working with open-source Large Language Models are summarized in a detailed discussion, available in full on Reddit.

🛠️ Tech Stack

FAIR-Sensei is built using the following technologies:

  • Frontend: Next.js, Tailwind CSS
  • Backend: FastAPI, OpenAI client
  • LLMs: Command-R, Qwen-2-72b-instruct, WizardLM-2 8x22B, Claude Haiku, GPT-3.5-turbo
  • Search: SearxNG, Bing
  • Memory: Redis
  • Deployment: AWS, Paka
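
To illustrate how these pieces fit together, here is a minimal Python sketch of how a Perplexity-style backend might fold search-engine snippets (e.g. from SearxNG or Bing) into a chat prompt for the LLM. The helper name, snippet fields, and prompt wording are hypothetical illustrations, not taken from the sensei codebase.

```python
# Hypothetical sketch: turning search results into an OpenAI-style
# chat prompt. Names and prompt wording are illustrative, not sensei's.

def build_rag_messages(query: str, snippets: list[dict]) -> list[dict]:
    """Assemble numbered search snippets into an OpenAI-style message list."""
    context = "\n".join(
        f"[{i}] {s['title']}: {s['content']}" for i, s in enumerate(snippets, 1)
    )
    system = (
        "Answer the question using only the numbered sources below. "
        "Cite sources as [n].\n\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": query},
    ]

messages = build_rag_messages(
    "What is a data management plan?",
    [{"title": "RDM guide", "content": "A DMP describes how data is handled."}],
)
```

The resulting message list could then be sent through the OpenAI client, pointed either at a hosted API or at a locally served model.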

🏃‍♂️ How to Run Sensei Search

You can run Sensei Search either locally on your machine or in the cloud.

Running Locally

Follow these steps to run Sensei Search locally:

  1. Clone the repository:

    git clone https://github.com/UB-Mannheim/FAIR-sensei
  2. Prepare the backend environment:

    cd FAIR-sensei/backend/
    mv .env.development.example .env.development

    Edit .env.development as needed. The example environment assumes you run models through Ollama. Make sure you have reasonably powerful GPUs to run the Command-R, Qwen-2-72b-instruct, or WizardLM-2 8x22B models.

  3. No need to do anything for the frontend.

  4. Run the app with the following command:

    cd sensei_root_folder/  # i.e. the root of the cloned FAIR-sensei repository
    docker compose up
  5. Open your browser and go to http://localhost:3000
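
Before starting the stack, it can help to confirm that Ollama is actually reachable. A small stdlib-only sketch, assuming Ollama's default port 11434 (the check itself is not part of the repository):

```python
# Preflight sketch: verify a local Ollama server responds before
# running `docker compose up`. Port 11434 is Ollama's default.
import urllib.error
import urllib.request

def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    ok = is_reachable("http://localhost:11434/api/tags")
    print("Ollama reachable" if ok else "Ollama not reachable; is it running?")
```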

Running in the Cloud

We deploy the app to AWS using paka. Please note that the models require GPU instances to run.

Before you start, make sure you have:

  • An AWS account
  • Requested GPU quota in your AWS account

The configuration for the cluster is located in the cluster.yaml file. You'll need to replace the HF_TOKEN value in cluster.yaml with your own Hugging Face token. This is necessary because the mistral-7b and command-r models require your account to have accepted their terms and conditions.
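
Rather than committing the token, you could substitute it in at deploy time. A sketch assuming cluster.yaml stores the token on a line of the form `HF_TOKEN: <value>` (the exact paka schema may differ):

```python
# Sketch: inject a Hugging Face token into cluster.yaml at deploy time
# instead of committing it. Assumes a "HF_TOKEN: <value>" line; the
# exact paka cluster.yaml schema may differ.
import os
import re

def inject_hf_token(yaml_text: str, token: str) -> str:
    """Replace the value of the HF_TOKEN key, preserving indentation."""
    return re.sub(
        r"^(\s*HF_TOKEN:\s*).*$",
        lambda m: m.group(1) + token,  # lambda avoids backslash escaping issues
        yaml_text,
        flags=re.M,
    )

text = open("cluster.yaml").read() if os.path.exists("cluster.yaml") else "HF_TOKEN: placeholder\n"
patched = inject_hf_token(text, os.environ.get("HF_TOKEN", "hf_xxx"))
```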

Follow these steps to run Sensei Search in the cloud:

  1. Install paka:

    pip install paka
  2. Provision the cluster in AWS:

    make provision-prod
  3. Deploy the backend:

    make deploy-backend
  4. Deploy the frontend:

    make deploy-frontend
  5. Get the URL of the frontend:

    paka function list
  6. Open the URL in your browser.
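
The steps above can also be wrapped in a small driver script. Purely illustrative; with `dry_run=True` it merely lists the commands from this README without executing them:

```python
# Sketch: the cloud deployment steps from this README as one script.
# Illustrative only; dry_run=True just returns the commands as strings.
import subprocess

DEPLOY_STEPS = [
    ["pip", "install", "paka"],
    ["make", "provision-prod"],
    ["make", "deploy-backend"],
    ["make", "deploy-frontend"],
    ["paka", "function", "list"],
]

def deploy(dry_run: bool = True) -> list[str]:
    """Run (or, with dry_run, just return) the deployment commands in order."""
    rendered = [" ".join(step) for step in DEPLOY_STEPS]
    if not dry_run:
        for step in DEPLOY_STEPS:
            subprocess.run(step, check=True)  # stop on the first failure
    return rendered
```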
