ChiCurate
Date: 2024-11-17
Last Updated:
Introducing ChiCurate: Your Personalized ArXiv Feed
In the fast-paced world of scientific research, staying up-to-date with
the latest publications is crucial. However, with hundreds of papers
published daily on ArXiv, finding the most relevant ones for your
research can be like searching for a needle in a haystack. Enter
ChiCurate, a simple tool designed to help researchers work more
efficiently and surface the papers they actually care about.
Get it here!
What is ChiCurate?
ChiCurate is an open-source tool that creates a personalized ArXiv feed
tailored to your specific research interests. By leveraging the power of
local Large Language Models (LLMs) and the simplicity of RSS feeds,
ChiCurate ranks papers based on their relevance to your work, saving you
valuable time and ensuring you never miss an important publication in
your field.
Key Advantages
- Personalization: Unlike traditional RSS feeds, ChiCurate allows
you to describe your research interests in plain language. The AI
then uses this information to rank papers, ensuring you see the most
relevant content first.
- Local Processing: ChiCurate runs on your local machine, ensuring
privacy and giving you full control over the curation process.
- Open Source: The tool is open-source, allowing for community
contributions and customizations to fit various research needs.
- Efficient: While it may take about 15 minutes to process a
day's worth of papers, you can set it up to run automatically each
morning, having your personalized feed ready before you start your
day.
- Flexible: You can easily adjust your research interests or try
different LLM models to optimize the curation process.
- Resource-Friendly: ChiCurate can run on modest hardware, making
it accessible to researchers without high-end computing resources.
How It Works
ChiCurate combines two key technologies:
- Local LLMs: Using Ollama, an open-source tool for running LLMs
locally, ChiCurate analyzes paper titles (and potentially abstracts)
to determine their relevance to your interests.
- RSS Feeds: The tool fetches the latest papers from ArXiv's RSS
feeds, providing a structured and machine-readable interface for the
LLM to process.
By describing your research interests to the AI, you create a
personalized algorithm that ranks papers based on their relevance to
your work. This approach gives you the benefits of a curated feed while
maintaining full control over the curation process.
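To make the idea concrete, here is a minimal Python sketch of the fetch-score-rank loop described above. It is not ChiCurate's actual code: the function names, the prompt wording, and the 0 to 10 scale are assumptions made for illustration, and it scores one title at a time rather than in batches. The only external piece it relies on is Ollama's local REST endpoint (`/api/generate` on port 11434).

```python
import json
import re
import urllib.request
import xml.etree.ElementTree as ET

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def parse_feed(rss_xml):
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        }
        for item in root.iter("item")
    ]


def ollama_score(title, interests, model="mistral-nemo"):
    """Ask a local model for a 0-10 relevance score (hypothetical prompt)."""
    prompt = (
        f"My research interests: {interests}\n"
        f"Paper title: {title}\n"
        "Rate the relevance from 0 to 10. Reply with the number only."
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)["response"]
    # Models do not always follow instructions, so take the first integer
    # in the reply and fall back to 0 if there is none.
    match = re.search(r"\d+", reply)
    return min(int(match.group()), 10) if match else 0


def rank_papers(papers, interests, score_fn=ollama_score):
    """Return the papers sorted from most to least relevant."""
    scored = [
        dict(paper, score=score_fn(paper["title"], interests)) for paper in papers
    ]
    return sorted(scored, key=lambda paper: paper["score"], reverse=True)
```

Passing the scoring function in as `score_fn` keeps the ranking logic independent of the model, so you can swap in a different model or a stub for testing without a running Ollama server.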
Installation Guide
To get started with ChiCurate, follow these steps:
- Install Ollama:
  - For Linux:
        curl -fsSL https://ollama.com/install.sh | sh
  - For Windows: Install WSL (Windows Subsystem for Linux) first,
    then follow the Linux instructions
  - For macOS: Download the dedicated installer from the Ollama
    website
- Pull the LLM model:
      ollama pull mistral-nemo
- Install ChiCurate:
      git clone https://github.com/CJones-Optics/ArXivCurator.git
      cd ./ArXivCurator
      python3 -m venv .venv
      source .venv/bin/activate
      pip install -r requirements.txt
- Configure ChiCurate:
  - Edit ./userData/feeds.csv to include the ArXiv RSS feeds
    you're interested in
  - Modify ./userData/userPrompt.txt to describe your research
    interests
  - Adjust config.yaml if needed (e.g., to change the LLM model or
    batch size)
- Run ChiCurate:
  Execute run.sh to start the curation process
For optimal use, consider setting up a cron job to run ChiCurate
automatically each morning.
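As a rough sketch, a crontab entry along these lines would run the curation at 06:00 every day. The install path `/home/you/ArXivCurator`, the log file name, and the time are placeholders rather than values from the project, and the entry assumes that run.sh can be launched from the repository root and finds the virtual environment on its own (if it does not, activate the environment as part of the command).

```
# minute hour day-of-month month day-of-week  command
# Add this line with `crontab -e`; the path and log name are placeholders.
0 6 * * * cd /home/you/ArXivCurator && bash run.sh >> curate.log 2>&1
```

Redirecting the output to a log file makes it easier to see why a morning's feed is missing, since cron jobs run without a terminal.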
UPDATE: September 2024
Stay tuned. I have made the program even more efficient :)