RabbitAI Documentation
Open-source AI code reviewer. Auto-reviews GitHub PRs with zero cost and full self-hosting.
How It Works
RabbitAI runs a 9-node LangGraph pipeline every time a PR is opened, reopened, or updated with new commits. Each node does one job and passes its output to the next.
fetch → graph → classify → embed → retrieve → load_memory → review → post → save_memory
| Node | File | What it does |
|---|---|---|
| fetch | nodes/fetcher.py | Pulls PR diff and metadata from GitHub API |
| graph | nodes/graph_builder.py | Builds NetworkX dependency graph, computes blast radius |
| classify | nodes/classifier.py | Detects change type — bug fix, feature, refactor, security |
| embed | nodes/embedder.py | Chunks diff, embeds via your chosen model, stores in vector DB |
| retrieve | nodes/retriever.py | Semantic search over stored chunks |
| load_memory | memory/repo_memory.py | Loads past learnings from mem0 |
| review | nodes/reviewer.py | Builds prompt from all context, calls your chosen LLM |
| post | nodes/poster.py | Posts structured comment on the PR |
| save_memory | memory/repo_memory.py | Saves new learnings to mem0 for future PRs |
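The node chain above can be pictured as a list of functions that each read a shared state and return the keys they add. The sketch below is a minimal, dependency-free illustration of that pattern only. It does not use LangGraph, it covers three of the nine nodes, and the node bodies and state keys (`diff`, `change_type`, `trace`) are hypothetical placeholders rather than RabbitAI's actual code.

```python
from typing import Callable

State = dict  # shared state handed from node to node


def fetch(state: State) -> State:
    # Placeholder: the real node pulls the diff from the GitHub API.
    return {"diff": f"diff for PR #{state['pr_number']}"}


def classify(state: State) -> State:
    # Placeholder: the real node detects the change type.
    is_fix = "fix" in state["title"].lower()
    return {"change_type": "bug fix" if is_fix else "feature"}


def review(state: State) -> State:
    # Placeholder: the real node builds a prompt and calls an LLM.
    return {"review": f"[{state['change_type']}] reviewed {state['diff']}"}


PIPELINE: list[tuple[str, Callable[[State], State]]] = [
    ("fetch", fetch),
    ("classify", classify),
    ("review", review),
]


def run_pipeline(initial: State) -> State:
    """Run each node in order, merging its output into the shared state."""
    state = dict(initial)
    state["trace"] = []
    for name, node in PIPELINE:
        state.update(node(state))
        state["trace"].append(name)
    return state


result = run_pipeline({"pr_number": 12, "title": "Fix session expiry check"})
print(result["trace"])
print(result["review"])
```

Keeping every node to the same "state in, updates out" shape is what lets nodes be added or reordered without touching their neighbours.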
Intelligence Stack
Three separate systems feed context into the review prompt simultaneously.
NetworkX Knowledge Graph
Parses import and require statements from the diff and maps file dependencies into a directed graph. Computes blast radius: how many other files in the diff depend on each changed file. Files with 5+ dependents are flagged HIGH RISK and the reviewer focuses harder on them. The graph is built fresh from each diff, so nothing needs to be persisted.
Vector DB (RAG Pipeline)
Chunks the PR diff by file, embeds each chunk using your configured embedding model, and stores it in your vector store. Before each review, semantically similar chunks are retrieved using the classification result as the query. This gives the reviewer relevant code context beyond just the raw diff.
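As a rough illustration of the chunk, embed, and retrieve flow, the sketch below splits a unified diff into one chunk per file and ranks chunks against a query by cosine similarity. It is not RabbitAI's implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, the chunks live in a plain dict instead of a vector store, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_diff_by_file(diff: str) -> dict[str, str]:
    """Split a unified diff into one chunk per file, keyed by file path."""
    chunks: dict[str, list[str]] = {}
    current = None
    for line in diff.splitlines():
        match = re.match(r"diff --git a/(\S+) b/(\S+)", line)
        if match:
            current = match.group(2)
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)
    return {path: "\n".join(lines) for path, lines in chunks.items()}


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the paths of the top_k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(
        chunks, key=lambda path: cosine(q, toy_embed(chunks[path])), reverse=True
    )
    return ranked[:top_k]


SAMPLE_DIFF = """\
diff --git a/db.ts b/db.ts
+const rows = await db.execute(sql_query + user_input)
diff --git a/ui/button.tsx b/ui/button.tsx
+export const Button = () => <button className="primary" />
"""

chunks = chunk_diff_by_file(SAMPLE_DIFF)
print(sorted(chunks))
print(retrieve("security sql query injection", chunks))
```

With a real embedding model the ranking would reflect meaning rather than shared words, but the shape of the flow (chunk, embed, compare, take the top results) stays the same.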
mem0 Persistent Memory
After every review, mem0 automatically extracts facts and patterns from the review text, such as "this repo uses Drizzle ORM", "SQL injection found in db.ts previously", or "team prefers functional components". Before the next review on the same repo, these are loaded and injected into the prompt, so each review builds on what earlier ones found.
Quick Start
1. Clone and install
git clone https://github.com/nikhilsaiankilla/rabbitai
cd rabbitai
pip install -r requirements.txt
2. Configure
cp config.example.yaml config.yaml
Fill in your keys. See Getting Your API Keys below.
3. Add the workflow to your repo
Create .github/workflows/review.yml in the repo you want reviewed:
name: RabbitAI Code Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - name: Checkout RabbitAI
        uses: actions/checkout@v4
        with:
          repository: nikhilsaiankilla/rabbitai
          path: rabbitai
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: pip install -r rabbitai/requirements.txt
      - name: Run RabbitAI
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_REPOSITORY: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          cd rabbitai
          python -c "
          import os
          from agent import run
          result = run(os.environ['GITHUB_REPOSITORY'], int(os.environ['PR_NUMBER']))
          print(result.comment_url if result.posted else result.reason)
          "
4. Add secrets to your repo
Go to your repo on GitHub → Settings → Secrets and variables → Actions → New repository secret
Add the secrets for whichever providers you chose. See Getting Your API Keys for where to get each one.
GITHUB_TOKEN is injected automatically by GitHub — do not add it manually.
5. Open a PR
That's it. RabbitAI reviews it automatically.
Getting Your API Keys
GitHub Personal Access Token
Required scope: repo (full control of private repositories)
Only needed for local development. GitHub Actions injects GITHUB_TOKEN automatically.
- →Go to github.com → click your avatar → Settings
- →Scroll to the bottom → Developer settings
- →Personal access tokens → Tokens (classic)
- →Generate new token (classic)
- →Check repo (the top-level checkbox, which also checks everything below it)
- →Set an expiration → Generate token → copy it immediately
Never share this token or commit it to your repo.
Gemini API Key
Free tier available. No credit card required.
- →Go to aistudio.google.com
- →Sign in with your Google account
- →Click Get API key → Create API key
- →Copy the key
Used when embedding.provider or llm.provider is set to "gemini".
OpenAI API Key
Required if using OpenAI for embeddings or review generation.
- →Go to platform.openai.com
- →Sign in → click your avatar → API keys
- →Create new secret key → copy it
Used when embedding.provider or llm.provider is set to "openai".
Pinecone API Key
Required if using Pinecone as your vector store.
- →Go to app.pinecone.io → sign up free
- →API Keys in the left sidebar → copy your key
- →Indexes → Create Index with these settings:
  - →Name: rabbitai (or whatever you set as index_name)
  - →Dimensions: match your embedding model (see Vector Store Providers)
  - →Metric: cosine
  - →Cloud: AWS us-east-1 (free tier)
Configuration
Copy config.example.yaml to config.yaml and fill in your values. config.yaml is gitignored — never commit it.
In GitHub Actions, all values are injected via repository secrets. config.yaml is not needed in CI.
github_token: ""          # local dev only
gemini_api_key: ""        # required if using gemini for embedding or llm
embedding:
  provider: "gemini"      # gemini | openai
  model: ""               # leave empty to use default for the provider
  api_key: ""             # openai only
llm:
  provider: "gemini"      # gemini | openai
  model: ""               # leave empty to use default for the provider
  api_key: ""             # openai only
vector_store:
  provider: "chromadb"    # chromadb | pinecone | qdrant
  path: "./chroma_db"     # chromadb only
  collection: "pr-chunks"
memory:
  enabled: true
  repo_context: |
    Describe your repo here so RabbitAI understands it from day one.
review:
  language: "typescript"
  focus:
    - bugs
    - security
    - performance
  min_risk_score: 6
  post_score: true
Embedding Providers
Controls how PR diff chunks are converted to vectors for storage and retrieval. The model's output dimension must match the dimension of your vector store index.
Gemini (default, free)
Model: models/gemini-embedding-001 — outputs 768 dimensions
embedding:
  provider: "gemini"
  model: "models/gemini-embedding-001"
Uses gemini_api_key from the top of your config. Get your key at aistudio.google.com.
Create your vector store index with 768 dimensions.
OpenAI
| Model | Dimensions | Notes |
|---|---|---|
| text-embedding-3-small | 1536 | Recommended: fast, low cost |
| text-embedding-3-large | 3072 | Higher quality, higher cost |
embedding:
  provider: "openai"
  model: "text-embedding-3-small"
  api_key: "sk-xxx"
Get your key at platform.openai.com/api-keys.
Create your vector store index with 1536 dimensions if using text-embedding-3-small.
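A mismatch between the embedding model and the index dimension is an easy mistake to make when switching providers. The sketch below shows one way such a check could look. The dimension values are the ones listed in this documentation; the `check_index_dimension` helper is hypothetical and is not part of RabbitAI.

```python
# Output dimensions as listed in this documentation.
EMBEDDING_DIMENSIONS = {
    "models/gemini-embedding-001": 768,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise ValueError if the index dimension does not match the model."""
    expected = EMBEDDING_DIMENSIONS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model!r}")
    if expected != index_dimension:
        raise ValueError(
            f"{model} outputs {expected} dimensions, "
            f"but the index was created with {index_dimension}"
        )


# A matching pair passes silently.
check_index_dimension("text-embedding-3-small", 1536)

# A mismatched pair raises with a message naming both numbers.
try:
    check_index_dimension("models/gemini-embedding-001", 1536)
except ValueError as err:
    print(err)
```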
LLM Providers
Controls which model generates the actual code review.
Gemini (free)
Model: gemini-2.0-flash — fast, capable, free tier
llm:
  provider: "gemini"
  model: "gemini-2.0-flash"
Uses gemini_api_key from the top of your config.
OpenAI
Model: gpt-4.1-mini — best balance of quality and cost for code review
llm:
  provider: "openai"
  model: "gpt-4.1-mini"
  api_key: "sk-xxx"
Get your key at platform.openai.com/api-keys.
Vector Store Providers
Stores embedded diff chunks for the RAG pipeline.
ChromaDB — local, free, no setup
vector_store:
  provider: "chromadb"
  path: "./chroma_db"
  collection: "pr-chunks"
No account or API key needed. Data persists in ./chroma_db on disk. Recommended for getting started.
Note: ChromaDB is best run on Python 3.11 or lower. Python 3.12+ may have compatibility issues.
Install:
pip install chromadb
Pinecone — cloud, free tier available
vector_store:
  provider: "pinecone"
  api_key: "YOUR_PINECONE_KEY"
  index_name: "rabbitai"
  collection: "pr-chunks"
Create your index at app.pinecone.io. Set dimensions to match your embedding model:
| Embedding model | Index dimensions |
|---|---|
| gemini-embedding-001 | 768 |
| text-embedding-3-small | 1536 |
| text-embedding-3-large | 3072 |
Install:
pip install pinecone
Qdrant — self-hosted or cloud
Self-hosted:
vector_store:
  provider: "qdrant"
  host: "localhost"
  port: 6333
  collection: "pr-chunks"
Run locally with Docker:
docker run -p 6333:6333 qdrant/qdrant
Qdrant Cloud — sign up at cloud.qdrant.io:
vector_store:
  provider: "qdrant"
  host: "https://your-cluster.qdrant.io"
  api_key: "YOUR_QDRANT_KEY"
  collection: "pr-chunks"
Install:
pip install qdrant-client
Memory
RabbitAI uses mem0 to build persistent memory across PR reviews. After each review, mem0 extracts facts and patterns and stores them. Before the next review on the same repo, those facts are retrieved and injected into the prompt.
Memory uses the same embedding provider you configured under embedding.
Enable or disable:
memory:
  enabled: true   # set to false to disable
Static context — always injected regardless of whether enabled is true:
memory:
  repo_context: |
    This is a Next.js 15 app using Drizzle ORM and TypeScript strict mode.
    Prefer functional components. No class components.
    All API routes use App Router Route Handlers.
Use repo_context to give RabbitAI baseline knowledge about your repo before it has reviewed any PRs.
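The interaction between `enabled` and `repo_context` can be sketched as follows: static context is always added to the prompt, while learned facts are added only when memory is enabled. This is an illustrative sketch, not RabbitAI's code. The `build_context` function, the section labels, and the choice to treat a missing `enabled` key as disabled are all assumptions.

```python
def build_context(config: dict, learned_facts: list[str]) -> str:
    """Combine static repo_context with learned facts for the review prompt."""
    memory_cfg = config.get("memory", {})
    parts = []

    # Static context is always included, even when memory is disabled.
    repo_context = memory_cfg.get("repo_context", "").strip()
    if repo_context:
        parts.append("Repo context:\n" + repo_context)

    # Learned facts are included only when memory is enabled.
    # Assumption: a missing "enabled" key is treated as disabled here.
    if memory_cfg.get("enabled", False) and learned_facts:
        facts = "\n".join(f"- {fact}" for fact in learned_facts)
        parts.append("Past learnings:\n" + facts)

    return "\n\n".join(parts)


facts = ["this repo uses Drizzle ORM", "SQL injection found in db.ts previously"]
config = {
    "memory": {
        "enabled": False,
        "repo_context": "Next.js 15 app, TypeScript strict mode.",
    }
}
# With memory disabled, only the static repo context appears.
print(build_context(config, facts))
```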
Blast Radius
For each changed file, RabbitAI counts how many other files in the diff import it. Files with many dependents are flagged as high risk — a bug there can cascade across the codebase.
| Dependents | Risk Level | What happens |
|---|---|---|
| 5+ | 🔴 HIGH | Flagged in prompt, reviewer focuses harder |
| 2–4 | 🟡 MEDIUM | Noted in prompt |
| 0–1 | 🟢 LOW | No special treatment |
Supported import patterns: ES modules (import), CommonJS (require), Python (from x import, import x), Go, CSS/SCSS (@import).
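The counting and thresholds described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses plain dictionaries instead of NetworkX, it recognises only relative ES module and CommonJS imports of the form `./name`, and it assumes all files sit in one directory. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Simplified pattern: relative ES module and CommonJS imports only.
IMPORT_RE = re.compile(r"""(?:from|require\()\s*['"]\./([\w/.-]+)['"]""")


def blast_radius(files: dict[str, str]) -> dict[str, int]:
    """Count, for each file, how many other files in the set import it."""
    dependents = defaultdict(set)
    for path, source in files.items():
        for target in IMPORT_RE.findall(source):
            for candidate in files:
                # Match './db' against 'db.ts' by dropping the extension.
                if candidate != path and candidate.rsplit(".", 1)[0] == target:
                    dependents[candidate].add(path)
    return {path: len(dependents[path]) for path in files}


def risk_level(dependent_count: int) -> str:
    """Map a dependent count to the risk levels in the table above."""
    if dependent_count >= 5:
        return "HIGH"
    if dependent_count >= 2:
        return "MEDIUM"
    return "LOW"


files = {
    "db.ts": "export const db = {}",
    "auth.ts": "import { db } from './db'",
    "users.ts": "const { db } = require('./db')",
    "app.ts": "import { login } from './auth'\nimport { db } from './db'",
}

radius = blast_radius(files)
for path, count in radius.items():
    print(path, count, risk_level(count))
```

In this example `db.ts` is imported by three other files and lands in the MEDIUM band, while `auth.ts` has a single dependent and stays LOW.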
Environment Variables
All config values can be overridden with environment variables. Environment variables always take priority over config.yaml. This is how GitHub Actions injects secrets.
| Environment Variable | Config key |
|---|---|
| GITHUB_TOKEN | github_token |
| GEMINI_API_KEY | gemini_api_key |
| OPENAI_API_KEY | llm.api_key + embedding.api_key |
| PINECONE_API_KEY | vector_store.api_key |
| QDRANT_HOST | vector_store.host |
| QDRANT_API_KEY | vector_store.api_key |
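The precedence rule (environment variables win over config.yaml) can be sketched as below. The variable-to-key mapping is taken from the table above; the `apply_env_overrides` function itself is a hypothetical illustration and may differ from what utils/config.py does. In particular, this sketch does not look at `vector_store.provider` when both Pinecone and Qdrant keys are set.

```python
import copy
import os

# Mapping from the table above: environment variable -> config key path(s).
ENV_OVERRIDES = {
    "GITHUB_TOKEN": [("github_token",)],
    "GEMINI_API_KEY": [("gemini_api_key",)],
    "OPENAI_API_KEY": [("llm", "api_key"), ("embedding", "api_key")],
    "PINECONE_API_KEY": [("vector_store", "api_key")],
    "QDRANT_HOST": [("vector_store", "host")],
    "QDRANT_API_KEY": [("vector_store", "api_key")],
}


def apply_env_overrides(config: dict, env=None) -> dict:
    """Return a copy of config with values replaced by any set env vars."""
    env = os.environ if env is None else env
    merged = copy.deepcopy(config)
    for var, paths in ENV_OVERRIDES.items():
        value = env.get(var)
        if not value:
            continue  # unset or empty: keep the value from the file
        for path in paths:
            node = merged
            for key in path[:-1]:
                node = node.setdefault(key, {})
            node[path[-1]] = value
    return merged


file_config = {"gemini_api_key": "from-file", "llm": {"provider": "openai", "api_key": ""}}
fake_env = {"GEMINI_API_KEY": "from-env", "OPENAI_API_KEY": "sk-test"}
print(apply_env_overrides(file_config, env=fake_env))
```

Passing `env` explicitly keeps the function easy to test; with no argument it reads from the real process environment.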
Integrating with Another Repo
You don't need to copy any RabbitAI code into your other repos. Just add the workflow file and secrets.
Step 1 — Add .github/workflows/review.yml to the target repo (see Quick Start for the full workflow YAML).
Step 2 — Add secrets to the target repo under Settings → Secrets and variables → Actions:
| Secret | Required |
|---|---|
| GEMINI_API_KEY | If using Gemini |
| OPENAI_API_KEY | If using OpenAI |
| PINECONE_API_KEY | If using Pinecone |
GITHUB_TOKEN is injected automatically — do not add it.
Step 3 — Open a PR on the target repo. RabbitAI clones itself from GitHub and runs the review automatically.
Because the workflow checks out nikhilsaiankilla/rabbitai at run time, every integrated repo picks up the latest version automatically on its next PR.
MCP Server
Run RabbitAI as an MCP server to trigger reviews directly from Claude or Cursor IDE.
Start the server:
python mcp/server.py
Add to your MCP config (claude_desktop_config.json or Cursor settings):
{
  "mcpServers": {
    "rabbitai": {
      "command": "python",
      "args": ["/absolute/path/to/rabbitai/mcp/server.py"]
    }
  }
}
Available tools:
| Tool | Description |
|---|---|
| review_pr | Review a PR; pass repo_name and pr_number |
| review_status | Check current config and memory status |
Example prompts inside Claude:
"Review PR #12 in nikhilsaiankilla/myrepo"
"Check RabbitAI status"
Local Development
git clone https://github.com/nikhilsaiankilla/rabbitai
cd rabbitai
pip install -r requirements.txt
cp config.example.yaml config.yaml
# fill in config.yaml
Create test.py:
from agent import run

result = run(
    repo_name="your-username/your-repo",
    pr_number=1,
)
print(result)

Then run it:
python test.py
The review is posted as a comment on the PR and the comment URL is printed to stdout.
Use Python 3.11 for best compatibility across all dependencies.
Review Output Format
🐇 RabbitAI Code Review · 📊 8/10
🔴 Bug
→ auth.ts line 23: user.id can be undefined if session expires before check
🟠 Security
→ db.ts line 45: query is not parameterized — SQL injection risk
🟡 Performance
→ dashboard.tsx line 89: value recalculated on every render, consider useMemo
🟢 Looks good
→ Error boundaries correctly implemented
→ TypeScript types well-defined throughout
---
🐇 RabbitAI · AI-powered code review · MIT License
Sections with no findings are skipped. Score line is hidden if post_score: false in config.
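The two rules above (empty sections are skipped, and the score is hidden when `post_score` is false) can be sketched as a small rendering function. The section titles and header text follow the sample output; the `render_review` function and the finding keys (`bugs`, `security`, `performance`, `good`) are hypothetical and not taken from nodes/poster.py.

```python
# Section title -> key in the findings dict (keys are assumed names).
SECTIONS = [
    ("🔴 Bug", "bugs"),
    ("🟠 Security", "security"),
    ("🟡 Performance", "performance"),
    ("🟢 Looks good", "good"),
]


def render_review(findings: dict[str, list[str]], score: int, post_score: bool = True) -> str:
    """Render a review comment, skipping empty sections."""
    header = "🐇 RabbitAI Code Review"
    if post_score:
        header += f" · 📊 {score}/10"
    lines = [header]
    for title, key in SECTIONS:
        items = findings.get(key, [])
        if not items:
            continue  # sections with no findings are skipped
        lines.append(title)
        lines.extend(f"→ {item}" for item in items)
    return "\n".join(lines)


findings = {
    "security": ["db.ts line 45: query is not parameterized"],
    "good": ["Error boundaries correctly implemented"],
}
print(render_review(findings, score=8))
print()
print(render_review(findings, score=8, post_score=False))
```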
Project Structure
rabbitai/
├── .github/
│   └── workflows/
│       └── review.yml        ← GitHub Action trigger
├── assets/
│   └── rabbitai.png          ← logo used in PR comments
├── nodes/
│   ├── fetcher.py            ← GitHub API, fetch PR diff + metadata
│   ├── graph_builder.py      ← NetworkX dependency graph + blast radius
│   ├── classifier.py         ← change type detection
│   ├── embedder.py           ← embeddings + vector DB storage
│   ├── retriever.py          ← semantic search over stored chunks
│   ├── reviewer.py           ← LLM review generation
│   └── poster.py             ← GitHub PR comment poster
├── memory/
│   └── repo_memory.py        ← mem0 persistent memory
├── mcp/
│   └── server.py             ← MCP server for Claude/Cursor
├── utils/
│   ├── config.py             ← config.yaml loader + env var overrides
│   └── prompts.py            ← review prompt templates
├── agent.py                  ← LangGraph 9-node workflow entry point
├── config.example.yaml       ← copy to config.yaml and fill in
├── requirements.txt
└── README.md
Roadmap
- [x] 9-node LangGraph workflow
- [x] NetworkX knowledge graph + blast radius detection
- [x] ChromaDB, Pinecone, and Qdrant support
- [x] Gemini and OpenAI embedding providers
- [x] Gemini and OpenAI LLM providers
- [x] mem0 persistent memory
- [x] MCP server for Claude/Cursor
- [ ] GitLab and Bitbucket support
- [ ] Web dashboard for review history
- [ ] Slack and Discord notifications
- [ ] Fine-tuned prompts per language
License
MIT — use it, fork it, self-host it, build on it.