Stateful AI Agent for Knowledge Extraction in Medical Research

A simple human-in-the-loop, multi-state AI agent that answers medical research questions using research papers from PubMed. The project is based on the StateFlow research paper and models the research pipeline as a sequence of states, each with cascading function calls. Organizing the work into states gives the research process a structured, modular shape that is easier to manage and scale, and it makes the agent's behavior more deterministic and predictable. The backend is built with FastAPI, the OpenAI API, and DSPy, which handles Chain-of-Thought prompting of the LLM. The frontend is built with Next.js and Tailwind CSS.

Tech Stack

Frontend: Next.js (React framework)
Backend: Python with FastAPI (Python framework) and Uvicorn (ASGI server)
Data Validation: Pydantic (Type checking)
Language Model: OpenAI API (GPT models)
Prompting Framework: DSPy (Chain-of-Thought reasoning)
Styling: Tailwind CSS, Radix UI (UI components)
Animation: Framer Motion
Deployment: Render

Overview

This project combines several Python frameworks and libraries to create a multi-state AI agent for knowledge extraction in medical research. The agent is designed around five states: Start, Clarify, Research, Analyze, and Conclude. Each state has a set of functions used to extract knowledge from the research papers. The functions for each state are listed under Functions below.
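
As a rough illustration of the five-state design, the pipeline can be sketched as a small state machine. This is a minimal sketch, not the project's actual code: the names `State`, `TRANSITIONS`, `next_state`, and `run_pipeline` are hypothetical, and the strictly linear ordering is an assumption based on the state numbering (only the Start to Clarify transition is stated explicitly, under `solve_task`).

```python
from enum import Enum
from typing import Dict, List, Optional


class State(Enum):
    START = "Start"
    CLARIFY = "Clarify"
    RESEARCH = "Research"
    ANALYZE = "Analyze"
    CONCLUDE = "Conclude"


# Assumed linear ordering; the real agent may branch or loop back
# (for example, returning to Clarify after a weak search).
TRANSITIONS: Dict[State, Optional[State]] = {
    State.START: State.CLARIFY,
    State.CLARIFY: State.RESEARCH,
    State.RESEARCH: State.ANALYZE,
    State.ANALYZE: State.CONCLUDE,
    State.CONCLUDE: None,  # terminal state
}


def next_state(current: State) -> Optional[State]:
    """Return the state that follows `current`, or None at the end."""
    return TRANSITIONS[current]


def run_pipeline(start: State = State.START) -> List[str]:
    """Walk the pipeline from `start` and return the visited state names."""
    visited = []
    current: Optional[State] = start
    while current is not None:
        visited.append(current.value)
        current = next_state(current)
    return visited
```

Keeping the transitions in a single table makes the allowed moves explicit, which is one way to get the predictable behavior that a state-based design aims for.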

Functions

State 1: Start

solve_task: Initializes the research process and transitions to the Clarify state

State 2: Clarify

generate_clarifying_questions: Generates relevant questions to better understand the user's research needs
ClarifyQuestions (DSPy Signature): Processes the task description to generate targeted clarifying questions

State 3: Research

fetch_research_papers: Retrieves research papers from PubMed based on the query
process_research_papers: Processes and evaluates retrieved papers
enhance_search_query: Optimizes the search query for better results
enhance_query_with_dspy: Enhances the query using clarifying answers
check_paper_accessibility: Checks if papers are openly accessible
PaperEvaluation (DSPy Signature): Evaluates papers for relevance and scientific merit
relevancy_score: Ranks each research paper based on its relevance to the user's query
citation_score: Ranks each research paper based on its methodology, study design, and other factors
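
To make the PubMed retrieval step more concrete, here is a minimal sketch of how a search could be issued against the NCBI E-utilities `esearch` endpoint. Whether `fetch_research_papers` calls E-utilities directly is an assumption, and the helper names `build_esearch_url` and `parse_pmids` are hypothetical. The sketch only builds the request URL and parses a JSON response; the HTTP call itself is left out.

```python
import json
from typing import List
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(query: str, max_results: int = 20) -> str:
    """Build an esearch URL that asks PubMed for matching article IDs as JSON."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": query,          # the (possibly enhanced) search query
        "retmax": max_results,  # cap on the number of IDs returned
        "retmode": "json",      # request JSON instead of the default XML
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_pmids(response_text: str) -> List[str]:
    """Extract the list of PubMed IDs from an esearch JSON response body."""
    data = json.loads(response_text)
    return data.get("esearchresult", {}).get("idlist", [])
```

The returned IDs would then be passed to a second request that fetches abstracts and metadata before the papers are scored.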

State 4: Analyze

analyze_papers: Performs a comprehensive analysis of all selected papers together
analyze_paper_content: Analyzes individual paper content using full text or abstract
fetch_pdf_content: Retrieves and extracts text from PDF papers using URL
fetch_pmc_paper_content: Fetches paper content from PubMed Central
PaperAnalysis (DSPy Signature): Extracts supporting and opposing evidence from papers
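
The "full text or abstract" behavior of `analyze_paper_content` can be pictured as a simple fallback. The following is a minimal sketch under stated assumptions: the helper name `select_paper_text` and the `min_chars` threshold are hypothetical and only illustrate the idea of preferring full text when it is usable.

```python
from typing import Optional, Tuple


def select_paper_text(
    full_text: Optional[str],
    abstract: Optional[str],
    min_chars: int = 200,  # assumed cutoff for "usable" full text
) -> Tuple[str, str]:
    """Return (source, text), preferring full text over the abstract."""
    # Use the full text only if extraction produced a meaningful amount of it.
    if full_text and len(full_text.strip()) >= min_chars:
        return "full_text", full_text.strip()
    # Otherwise fall back to the abstract when one is available.
    if abstract and abstract.strip():
        return "abstract", abstract.strip()
    # Nothing to analyze.
    return "none", ""
```

Returning the source label alongside the text lets later steps record whether a conclusion rests on the whole paper or only on its abstract.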

State 5: Conclude (WIP)

conclude_research: Generates final conclusions based on analyzed papers
Conclude (DSPy Signature): Processes all findings to create a comprehensive conclusion

Roadmap

Implement the Conclude state
Implement Embase API access
Implement PICO Search Option for research paper literature review
Implement paper access type (Open Access, Paywalled, etc.) tag
Implement citation scoring of entire research paper pdf rather than abstract
Improve the UI/UX on the frontend

Potential Roadmaps

Implement research question critique agent state

Backlog

This project is a work in progress and needs more work to be fully functional. Below is a list of tasks.

Add state transition to move between states (02/01/2025)
Implement proper component structure and file path organization on the frontend (03/01/2025)
Create classes.ts to consolidate Tailwind classes (05/01/2025)
Refactor functions for each state for better readability and maintainability
Add more descriptive logging and error handling for debugging and troubleshooting
Add proper DSPy instantiation of prompt optimization
Break up main app.py (serverless function) into smaller callable functions

Getting Started

Prerequisites

Node.js (v14 or later)
Python (v3.7 or later)
OpenAI API key

Installation

Clone the repository:

```
git clone https://github.com/kallemickelborg/agentic-ai.git
cd agentic-ai
```

Set up the frontend:
```
cd frontend
npm install
```

Set up the backend:

```
cd backend
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
pip install -r requirements.txt
```

Create a .env file in the backend directory with your OpenAI API key:
```
OPENAI_API_KEY=your_api_key_here
```
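
A backend can fail fast when the key is missing or still set to the placeholder. This is a minimal sketch, not the project's actual startup code: the function name `get_openai_api_key` is hypothetical, and it assumes the `.env` values have already been loaded into the process environment (for example by a dotenv loader, which is an assumption about how the backend is set up).

```python
import os
from typing import Mapping, Optional


def get_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key, or raise if it is missing or a placeholder."""
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to backend/.env")
    return key
```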

Development

Start the backend server:
```
cd backend
uvicorn app:app --reload
```
In a new terminal, start the frontend development server:
```
cd frontend
npm run dev
```
Open your browser and navigate to http://localhost:3000

Deployment

This project is configured for deployment on Render. Follow these steps:

Fork this repository to your GitHub account.
Create a new Web Service on Render, connecting to your forked repository.
Set up the environment variables in Render, including your OpenAI API key.
Deploy the service on Render.

For detailed deployment instructions, refer to the Render documentation.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. If you are a medical researcher or student who is familiar with how research is conducted, please write to me and help me understand the medical research process better.

Contact

If you have any questions or feedback, please feel free to contact me at kallemickelborg@gmail.com
