This project helps researchers keep up with the papers published on arXiv each day. It includes:
- A backend to fetch and rank papers by relevance using an LLM.
- A frontend to view paper titles, abstracts, and relevance scores.
- An integration with Notion for easy organization.
![demo](https://private-user-images.githubusercontent.com/110317291/399498917-032487af-17a0-46dc-b337-82679a62c5fd.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0MTg0NjUsIm5iZiI6MTczOTQxODE2NSwicGF0aCI6Ii8xMTAzMTcyOTEvMzk5NDk4OTE3LTAzMjQ4N2FmLTE3YTAtNDZkYy1iMzM3LTgyNjc5YTYyYzVmZC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjEzJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxM1QwMzQyNDVaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1hM2NiZDM3NWY2MjVmZDM4ZmYyNjhkZWI1YzI2MTI2M2Q1ZDgzNjk2YjIwYWM3NDI5YTY5NWJkMzQ5Y2IwNmI1JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.Oi-aVmHfsC7DqVPKXWa2VpiCqZGOgiPsXyeWkpzcGRA)
- Fetch Papers
  - Retrieves papers from arXiv within a specific category published in the last 24 hours.
- Relevance Calculation with a Language Model
  - Uses a pretrained text embedding model (`jinaai/jina-embeddings-v3`) to generate embeddings for paper abstracts.
  - Calculates cosine similarity between fetched papers and predefined reference texts.
  - Sorts papers by relevance score.
- Notion Integration
  - Adds selected papers to a Notion database via the Notion API.
- Web UI
  - Displays titles, abstracts, and relevance scores of fetched papers.
  - Allows users to add papers to Notion with a single click.
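
The relevance step can be illustrated with a minimal sketch. It assumes the embeddings have already been computed (for example by `jinaai/jina-embeddings-v3`) and are available as plain lists of floats. The function names, the `embedding` and `score` keys, and the choice to score each paper by its best match across the reference texts are assumptions made for illustration; the project's backend may combine similarities differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_papers(papers, reference_embeddings):
    """Return papers sorted by relevance, highest score first.

    Each paper is a dict with an "embedding" key (hypothetical layout).
    The score is the highest similarity to any reference embedding,
    which is an assumption, not necessarily what the backend does.
    """
    scored = []
    for paper in papers:
        score = max(
            (cosine_similarity(paper["embedding"], ref)
             for ref in reference_embeddings),
            default=0.0,  # no reference texts: every paper scores 0
        )
        scored.append({**paper, "score": score})
    return sorted(scored, key=lambda p: p["score"], reverse=True)
```

Calling `rank_papers(papers, reference_embeddings)` returns new dicts with a `score` field added, so the input list is left unchanged and the result can be passed straight to a UI that shows titles alongside their scores.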
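- Clone the repository: `git clone https://github.com/Liangym1225/arXiv_app.git`
- Install PyTorch and Transformers.
- Create a `.env` file containing your Notion API token, following the template in `/backend/.env.sample`.
- Start the backend: `cd arXiv_app`, then `cd backend`, then `python -m uvicorn main:app --reload`
- Start the frontend: `npm install`, then `npm run start`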
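
To show how the token from `.env` might be used, here is a rough sketch that parses a `.env`-style file and builds the request for the Notion "create a page" endpoint. The variable names `NOTION_TOKEN` and `NOTION_DATABASE_ID`, and the database property names `Name` and `URL`, are hypothetical. Check `/backend/.env.sample` and your own Notion database for the real ones. The sketch only builds the request and does not send it.

```python
NOTION_PAGES_URL = "https://api.notion.com/v1/pages"
NOTION_VERSION = "2022-06-28"  # a published Notion API version string


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_notion_request(token, database_id, title, url):
    """Build headers and JSON payload for adding one paper to a database.

    "Name" (title property) and "URL" (url property) are assumed
    property names; they must match the columns of your database.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Notion-Version": NOTION_VERSION,
        "Content-Type": "application/json",
    }
    payload = {
        "parent": {"database_id": database_id},
        "properties": {
            "Name": {"title": [{"text": {"content": title}}]},
            "URL": {"url": url},
        },
    }
    return headers, payload
```

With the `requests` library installed, the result could be sent with `requests.post(NOTION_PAGES_URL, headers=headers, json=payload)`. Keeping request construction separate from sending makes the payload easy to inspect before anything is written to Notion.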