RabbitAI Documentation

Open-source AI code reviewer. Automatically reviews GitHub PRs at zero cost and can be fully self-hosted.


How It Works

RabbitAI runs a 9-node LangGraph pipeline each time a PR is opened, updated, or reopened. Each node does one job and passes its output to the next.

fetch → graph → classify → embed → retrieve → load_memory → review → post → save_memory
| Node | File | What it does |
|---|---|---|
| fetch | nodes/fetcher.py | Pulls PR diff and metadata from GitHub API |
| graph | nodes/graph_builder.py | Builds NetworkX dependency graph, computes blast radius |
| classify | nodes/classifier.py | Detects change type: bug fix, feature, refactor, security |
| embed | nodes/embedder.py | Chunks diff, embeds via your chosen model, stores in vector DB |
| retrieve | nodes/retriever.py | Semantic search over stored chunks |
| load_memory | memory/repo_memory.py | Loads past learnings from mem0 |
| review | nodes/reviewer.py | Builds prompt from all context, calls your chosen LLM |
| post | nodes/poster.py | Posts structured comment on the PR |
| save_memory | memory/repo_memory.py | Saves new learnings to mem0 for future PRs |
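
The node chain above can be pictured as a list of functions that share one state dictionary. The following is a minimal plain-Python sketch of that idea, not RabbitAI's actual LangGraph code; the stub nodes, the `visited` key, and the initial state fields are placeholders assumed for illustration.

```python
from typing import Callable

State = dict[str, object]
Node = Callable[[State], State]

# Node names in the order this documentation lists them.
PIPELINE_ORDER = [
    "fetch", "graph", "classify", "embed", "retrieve",
    "load_memory", "review", "post", "save_memory",
]


def make_stub_node(name: str) -> Node:
    """Return a placeholder node that only records its name (hypothetical)."""
    def node(state: State) -> State:
        visited = list(state.get("visited", []))
        visited.append(name)
        # Return a new state so each node's output is the next node's input.
        return {**state, "visited": visited}
    return node


def run_pipeline(nodes: list[Node], state: State) -> State:
    """Run each node in order, feeding its output state to the next node."""
    for node in nodes:
        state = node(state)
    return state


nodes = [make_stub_node(name) for name in PIPELINE_ORDER]
final_state = run_pipeline(nodes, {"repo": "owner/repo", "pr_number": 1})
print(" -> ".join(final_state["visited"]))
```

In the real project each stub would be replaced by the corresponding module from the table, and LangGraph would handle the wiring between nodes.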

Intelligence Stack

Three separate systems each feed context into the review prompt.

NetworkX Knowledge Graph

Parses import and require statements from the diff and maps file dependencies into a directed graph. Computes blast radius: how many other files in the diff depend on each changed file. Files with 5+ dependents are flagged HIGH RISK and the reviewer focuses harder on them. The graph is built fresh from each diff, so no persistence is needed.
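
As a rough illustration of the dependency mapping described above, the sketch below extracts relative `import` and `require` paths with regular expressions and records which changed files import which. It uses plain dictionaries in place of NetworkX so it runs with the standard library only. The regex patterns, the function names, and the extension-insensitive path matching are assumptions for this example, not RabbitAI's actual parser.

```python
import posixpath
import re
from collections import defaultdict

# Simplified patterns for relative ES module imports and CommonJS requires.
# Assumed for illustration; a real parser would cover more syntax.
IMPORT_PATTERNS = [
    re.compile(r"""import\s+(?:[\w*{}\s,]+\s+from\s+)?['"](\.{1,2}/[^'"]+)['"]"""),
    re.compile(r"""require\(\s*['"](\.{1,2}/[^'"]+)['"]\s*\)"""),
]


def extract_imports(lines: list[str]) -> set[str]:
    """Collect the relative module paths referenced in the given source lines."""
    found: set[str] = set()
    for line in lines:
        for pattern in IMPORT_PATTERNS:
            found.update(pattern.findall(line))
    return found


def build_dependents(files: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each changed file to the set of changed files that import it."""
    # Index files without their extension so './db' can match 'src/db.ts'.
    by_stem = {posixpath.splitext(path)[0]: path for path in files}
    dependents: dict[str, set[str]] = defaultdict(set)
    for path, lines in files.items():
        base_dir = posixpath.dirname(path)
        for spec in extract_imports(lines):
            target = posixpath.normpath(posixpath.join(base_dir, spec))
            resolved = by_stem.get(posixpath.splitext(target)[0])
            if resolved and resolved != path:
                dependents[resolved].add(path)
    return dependents


# Hypothetical changed files and their added lines.
files = {
    "src/db.ts": ["export const db = connect();"],
    "src/auth.ts": ["import { db } from './db';"],
    "src/api/users.ts": ["const { db } = require('../db');"],
}
dependents = build_dependents(files)
blast_radius = {path: len(importers) for path, importers in dependents.items()}
print(blast_radius)
```

Here `src/db.ts` ends up with two dependents, which is the number the blast radius check would then compare against its thresholds.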

Vector DB: RAG Pipeline

Chunks the PR diff by file, embeds each chunk using your configured embedding model, and stores it in your vector store. Before each review, semantically similar chunks are retrieved using the classification result as the query. This gives the reviewer relevant code context beyond the raw diff.
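
The per-file chunking step could look roughly like the sketch below, which splits a unified diff on its `diff --git` headers. This is an illustration under assumptions: the function name and the sample diff are hypothetical, paths containing spaces are not handled, and the embedding and storage steps are left out.

```python
import re


def chunk_diff_by_file(diff_text: str) -> dict[str, str]:
    """Split a unified diff into one chunk per file, keyed by the new file path."""
    chunks: dict[str, str] = {}
    # Each file section in `git diff` output starts with a "diff --git" header,
    # so split right before every such header.
    sections = re.split(r"(?m)^(?=diff --git )", diff_text)
    for section in sections:
        header = re.match(r"diff --git a/(\S+) b/(\S+)", section)
        if header:
            chunks[header.group(2)] = section
    return chunks


# Hypothetical two-file diff for demonstration.
SAMPLE_DIFF = """\
diff --git a/src/db.ts b/src/db.ts
--- a/src/db.ts
+++ b/src/db.ts
@@ -1,2 +1,3 @@
 const pool = createPool();
+export const db = wrap(pool);
diff --git a/src/auth.ts b/src/auth.ts
--- a/src/auth.ts
+++ b/src/auth.ts
@@ -1 +1,2 @@
+import { db } from './db';
 export function login() {}
"""

chunks = chunk_diff_by_file(SAMPLE_DIFF)
for path, chunk in chunks.items():
    print(path, len(chunk.splitlines()), "lines")
```

Each value in `chunks` is what would be passed to the embedding model and stored alongside its file path.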

mem0 Persistent Memory

After every review, mem0 automatically extracts facts and patterns from the review text, such as "this repo uses Drizzle ORM", "SQL injection found in db.ts previously", or "team prefers functional components". Before the next review on the same repo, these are loaded and injected into the prompt, so each review can build on what earlier reviews learned.


Quick Start

1. Clone and install

git clone https://github.com/nikhilsaiankilla/rabbitai
cd rabbitai
pip install -r requirements.txt

2. Configure

cp config.example.yaml config.yaml

Fill in your keys. See Getting Your API Keys below.

3. Add the workflow to your repo

Create .github/workflows/review.yml in the repo you want reviewed:

name: RabbitAI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - name: Checkout RabbitAI
        uses: actions/checkout@v4
        with:
          repository: nikhilsaiankilla/rabbitai
          path: rabbitai

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: pip install -r rabbitai/requirements.txt

      - name: Run RabbitAI
        env:
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GITHUB_REPOSITORY: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          cd rabbitai
          python -c "
          import os
          from agent import run
          result = run(os.environ['GITHUB_REPOSITORY'], int(os.environ['PR_NUMBER']))
          print(result.comment_url if result.posted else result.reason)
          "

4. Add secrets to your repo

Go to your repo on GitHub → Settings → Secrets and variables → Actions → New repository secret

Add the secrets for whichever providers you chose. See Getting Your API Keys for where to get each one.

GITHUB_TOKEN is injected automatically by GitHub — do not add it manually.

5. Open a PR

That's it. RabbitAI reviews it automatically.


Getting Your API Keys

GitHub Personal Access Token

Required scope: repo (full control of private repositories)

Only needed for local development. GitHub Actions injects GITHUB_TOKEN automatically.

  1. Go to github.com → click your avatar → Settings
  2. Scroll to the bottom → Developer settings
  3. Personal access tokens → Tokens (classic)
  4. Generate new token (classic)
  5. Check repo (the top-level checkbox — checks everything below it)
  6. Set an expiration → Generate token → copy it immediately

Never share this token or commit it to your repo.


Gemini API Key

Free tier available. No credit card required.

  1. Go to aistudio.google.com
  2. Sign in with your Google account
  3. Click Get API key → Create API key
  4. Copy the key

Used when embedding.provider or llm.provider is set to "gemini".


OpenAI API Key

Required if using OpenAI for embeddings or review generation.

  1. Go to platform.openai.com
  2. Sign in → click your avatar → API keys
  3. Create new secret key → copy it

Used when embedding.provider or llm.provider is set to "openai".


Pinecone API Key

Required if using Pinecone as your vector store.

  1. Go to app.pinecone.io → sign up free
  2. API Keys in the left sidebar → copy your key
  3. Indexes → Create Index with these settings:
    • Name: rabbitai (or whatever you set as index_name)
    • Dimensions: match your embedding model (see Vector Store Providers)
    • Metric: cosine
    • Cloud: AWS us-east-1 (free tier)

Configuration

Copy config.example.yaml to config.yaml and fill in your values. config.yaml is gitignored — never commit it.

In GitHub Actions, all values are injected via repository secrets. config.yaml is not needed in CI.

github_token: "" # local dev only
gemini_api_key: "" # required if using gemini for embedding or llm

embedding:
  provider: "gemini" # gemini | openai
  model: "" # leave empty to use default for the provider
  api_key: "" # openai only

llm:
  provider: "gemini" # gemini | openai
  model: "" # leave empty to use default for the provider
  api_key: "" # openai only

vector_store:
  provider: "chromadb" # chromadb | pinecone | qdrant
  path: "./chroma_db" # chromadb only
  collection: "pr-chunks"

memory:
  enabled: true
  repo_context: |
    Describe your repo here so RabbitAI understands it from day one.

review:
  language: "typescript"
  focus:
    - bugs
    - security
    - performance
  min_risk_score: 6
  post_score: true

Embedding Providers

Controls how PR diff chunks are converted to vectors for storage and retrieval. The model's output dimension must match the dimension of your vector store index.

Gemini (default, free)

Model: models/gemini-embedding-001 — outputs 768 dimensions

embedding:
  provider: "gemini"
  model: "models/gemini-embedding-001"

Uses gemini_api_key from the top of your config. Get your key at aistudio.google.com.

Create your vector store index with 768 dimensions.


OpenAI

| Model | Dimensions | Notes |
|---|---|---|
| text-embedding-3-small | 1536 | Recommended: fast, low cost |
| text-embedding-3-large | 3072 | Higher quality, higher cost |

embedding:
  provider: "openai"
  model: "text-embedding-3-small"
  api_key: "sk-xxx"

Get your key at platform.openai.com/api-keys.

Create your vector store index with 1536 dimensions if using text-embedding-3-small.


LLM Providers

Controls which model generates the actual code review.

Gemini (free)

Model: gemini-2.0-flash — fast, capable, free tier

llm:
  provider: "gemini"
  model: "gemini-2.0-flash"

Uses gemini_api_key from the top of your config.


OpenAI

Model: gpt-4.1-mini — best balance of quality and cost for code review

llm:
  provider: "openai"
  model: "gpt-4.1-mini"
  api_key: "sk-xxx"

Get your key at platform.openai.com/api-keys.


Vector Store Providers

Stores embedded diff chunks for the RAG pipeline.

ChromaDB — local, free, no setup

vector_store:
  provider: "chromadb"
  path: "./chroma_db"
  collection: "pr-chunks"

No account or API key needed. Data persists in ./chroma_db on disk. Recommended for getting started.

Note: Use ChromaDB with Python 3.11 or lower. Python 3.12+ may have compatibility issues.

Install:

pip install chromadb

Pinecone — cloud, free tier available

vector_store:
  provider: "pinecone"
  api_key: "YOUR_PINECONE_KEY"
  index_name: "rabbitai"
  collection: "pr-chunks"

Create your index at app.pinecone.io. Set dimensions to match your embedding model:

| Embedding model | Index dimensions |
|---|---|
| gemini-embedding-001 | 768 |
| text-embedding-3-small | 1536 |
| text-embedding-3-large | 3072 |
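
A dimension mismatch between the embedding model and the index usually only surfaces when the first vectors are written, so it can help to check up front. The sketch below is a hypothetical guard built from the dimensions listed in the table above; `check_index_dimension` is an illustrative name, not part of RabbitAI.

```python
# Output dimensions as listed in this documentation.
EMBEDDING_DIMENSIONS = {
    "gemini-embedding-001": 768,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise ValueError if the index dimension does not match the model's output."""
    # Accept both "models/gemini-embedding-001" and "gemini-embedding-001".
    name = model.removeprefix("models/")
    expected = EMBEDDING_DIMENSIONS.get(name)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model}")
    if expected != index_dimension:
        raise ValueError(
            f"{name} outputs {expected} dimensions, "
            f"but the index was created with {index_dimension}"
        )


check_index_dimension("text-embedding-3-small", 1536)  # passes silently

try:
    check_index_dimension("models/gemini-embedding-001", 1536)
except ValueError as error:
    print(error)
```

If you switch embedding models later, the index generally has to be recreated with the new dimension, since existing vectors will not fit.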

Install:

pip install pinecone

Qdrant — self-hosted or cloud

Self-hosted:

vector_store:
  provider: "qdrant"
  host: "localhost"
  port: 6333
  collection: "pr-chunks"

Run locally with Docker:

docker run -p 6333:6333 qdrant/qdrant

Qdrant Cloud — sign up at cloud.qdrant.io:

vector_store:
  provider: "qdrant"
  host: "https://your-cluster.qdrant.io"
  api_key: "YOUR_QDRANT_KEY"
  collection: "pr-chunks"

Install:

pip install qdrant-client

Memory

RabbitAI uses mem0 to build persistent memory across PR reviews. After each review, mem0 extracts facts and patterns and stores them. Before the next review on the same repo, those facts are retrieved and injected into the prompt.

Memory uses the same embedding provider you configured under embedding.

Enable or disable:

memory:
  enabled: true # set to false to disable

Static context is always injected, even when enabled is set to false:

memory:
  repo_context: |
    This is a Next.js 15 app using Drizzle ORM and TypeScript strict mode.
    Prefer functional components. No class components.
    All API routes use App Router Route Handlers.

Use repo_context to give RabbitAI baseline knowledge about your repo before it has reviewed any PRs.
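
To make the interaction between repo_context and enabled concrete, here is a minimal sketch of how the memory portion of a prompt could be assembled. The function name, section titles, and list-of-strings input are assumptions for illustration; the actual prompt layout lives in RabbitAI's own templates.

```python
def build_memory_context(
    repo_context: str,
    learned_facts: list[str],
    enabled: bool,
) -> str:
    """Combine static repo context with learned facts into one prompt section."""
    sections = []
    # Static context is included whether or not memory is enabled.
    if repo_context.strip():
        sections.append("Repository context:\n" + repo_context.strip())
    # Learned facts are only included when memory is enabled.
    if enabled and learned_facts:
        bullets = "\n".join(f"- {fact}" for fact in learned_facts)
        sections.append("Learned from past reviews:\n" + bullets)
    return "\n\n".join(sections)


repo_context = "This is a Next.js 15 app using Drizzle ORM and TypeScript strict mode."
facts = ["SQL injection found in db.ts previously", "team prefers functional components"]

with_memory = build_memory_context(repo_context, facts, enabled=True)
without_memory = build_memory_context(repo_context, facts, enabled=False)
print(with_memory)
```

With enabled set to false, only the static block remains, which matches the behavior described above.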


Blast Radius

For each changed file, RabbitAI counts how many other files in the diff import it. Files with many dependents are flagged as high risk — a bug there can cascade across the codebase.

| Dependents | Risk Level | What happens |
|---|---|---|
| 5+ | 🔴 HIGH | Flagged in prompt, reviewer focuses harder |
| 2–4 | 🟡 MEDIUM | Noted in prompt |
| 0–1 | 🟢 LOW | No special treatment |

Supported import patterns: ES modules (import), CommonJS (require), Python (from x import, import x), Go, CSS/SCSS (@import).
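
The thresholds in the table translate directly into a small classification function. The sketch below is illustrative only; `risk_level` and the sample counts are hypothetical names and values, while the cutoffs come from the table above.

```python
def risk_level(dependents: int) -> str:
    """Classify a file by how many other changed files import it."""
    if dependents >= 5:
        return "HIGH"
    if dependents >= 2:
        return "MEDIUM"
    return "LOW"


# Hypothetical dependent counts for three changed files.
dependent_counts = {"src/db.ts": 6, "src/auth.ts": 3, "src/theme.css": 0}
levels = {path: risk_level(count) for path, count in dependent_counts.items()}

for path, level in levels.items():
    print(f"{path}: {level}")
```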


Environment Variables

All config values can be overridden with environment variables. Environment variables always take priority over config.yaml. This is how GitHub Actions injects secrets.

| Environment Variable | Config key |
|---|---|
| GITHUB_TOKEN | github_token |
| GEMINI_API_KEY | gemini_api_key |
| OPENAI_API_KEY | llm.api_key + embedding.api_key |
| PINECONE_API_KEY | vector_store.api_key |
| QDRANT_HOST | vector_store.host |
| QDRANT_API_KEY | vector_store.api_key |
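
The override rule can be sketched as a lookup from environment variable to one or more config paths. This is an assumption-laden illustration of the mapping in the table, not the code in utils/config.py: `apply_env_overrides` is a hypothetical name, and the way it resolves PINECONE_API_KEY and QDRANT_API_KEY both targeting vector_store.api_key (the later entry wins) may differ from RabbitAI's real behavior.

```python
import copy
import os
from typing import Mapping

# Environment variable -> config key paths, taken from the table above.
ENV_OVERRIDES = {
    "GITHUB_TOKEN": [("github_token",)],
    "GEMINI_API_KEY": [("gemini_api_key",)],
    "OPENAI_API_KEY": [("llm", "api_key"), ("embedding", "api_key")],
    "PINECONE_API_KEY": [("vector_store", "api_key")],
    "QDRANT_HOST": [("vector_store", "host")],
    "QDRANT_API_KEY": [("vector_store", "api_key")],
}


def apply_env_overrides(config: dict, environ: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of config where set environment variables take priority."""
    merged = copy.deepcopy(config)
    for variable, paths in ENV_OVERRIDES.items():
        value = environ.get(variable)
        if not value:
            continue  # unset or empty: keep the value from config.yaml
        for path in paths:
            node = merged
            for key in path[:-1]:
                node = node.setdefault(key, {})
            node[path[-1]] = value
    return merged


# Hypothetical values standing in for a parsed config.yaml and CI secrets.
file_config = {
    "github_token": "from-file",
    "llm": {"provider": "openai", "api_key": ""},
    "embedding": {"provider": "openai", "api_key": ""},
}
fake_env = {"GITHUB_TOKEN": "from-env", "OPENAI_API_KEY": "sk-test"}

merged = apply_env_overrides(file_config, fake_env)
print(merged["github_token"], merged["llm"]["api_key"])
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.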

Integrating with Another Repo

You don't need to copy any RabbitAI code into your other repos. Just add the workflow file and secrets.

Step 1 — Add .github/workflows/review.yml to the target repo (see Quick Start for the full workflow YAML).

Step 2 — Add secrets to the target repo under Settings → Secrets and variables → Actions:

| Secret | Required |
|---|---|
| GEMINI_API_KEY | If using Gemini |
| OPENAI_API_KEY | If using OpenAI |
| PINECONE_API_KEY | If using Pinecone |

GITHUB_TOKEN is injected automatically — do not add it.

Step 3 — Open a PR on the target repo. RabbitAI clones itself from GitHub and runs the review automatically.

When you push updates to nikhilsaiankilla/rabbitai, all integrated repos get the latest version automatically on their next PR.


MCP Server

Run RabbitAI as an MCP server to trigger reviews directly from Claude or Cursor IDE.

Start the server:

python mcp/server.py

Add to your MCP config (claude_desktop_config.json or Cursor settings):

{
  "mcpServers": {
    "rabbitai": {
      "command": "python",
      "args": ["/absolute/path/to/rabbitai/mcp/server.py"]
    }
  }
}

Available tools:

| Tool | Description |
|---|---|
| review_pr | Review a PR: pass repo_name and pr_number |
| review_status | Check current config and memory status |

Example prompts inside Claude:

"Review PR #12 in nikhilsaiankilla/myrepo"

"Check RabbitAI status"


Local Development

git clone https://github.com/nikhilsaiankilla/rabbitai
cd rabbitai
pip install -r requirements.txt
cp config.example.yaml config.yaml
# fill in config.yaml

Create test.py:

from agent import run

result = run(
    repo_name="your-username/your-repo",
    pr_number=1,
)
print(result)

Run it:

python test.py

The review is posted as a comment on the PR and the comment URL is printed to stdout.

Use Python 3.11 for best compatibility across all dependencies.


Review Output Format

🐇 RabbitAI Code Review · 📊 8/10

🔴 Bug
→ auth.ts line 23: user.id can be undefined if session expires before check

🟠 Security
→ db.ts line 45: query is not parameterized, SQL injection risk

🟡 Performance
→ dashboard.tsx line 89: value recalculated on every render, consider useMemo

🟢 Looks good
→ Error boundaries correctly implemented
→ TypeScript types well-defined throughout

---
🐇 RabbitAI · AI-powered code review · MIT License

Sections with no findings are skipped. Score line is hidden if post_score: false in config.
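
The two rules above (skip empty sections, hide the score when post_score is false) can be illustrated with a small renderer. This is a sketch under assumptions: `render_comment`, the findings dictionary shape, and the exact line layout are hypothetical and may not match nodes/poster.py.

```python
# Section order and emoji taken from the sample output in this documentation.
SECTIONS = [
    ("🔴", "Bug"),
    ("🟠", "Security"),
    ("🟡", "Performance"),
    ("🟢", "Looks good"),
]


def render_comment(score: int, findings: dict[str, list[str]], post_score: bool = True) -> str:
    """Render a review comment, skipping sections that have no findings."""
    header = "🐇 RabbitAI Code Review"
    if post_score:
        header += f" · 📊 {score}/10"
    lines = [header, ""]
    for emoji, title in SECTIONS:
        items = findings.get(title, [])
        if not items:
            continue  # sections with no findings are skipped
        lines.append(f"{emoji} {title}")
        lines.extend(f"→ {item}" for item in items)
        lines.append("")
    lines.append("---")
    lines.append("🐇 RabbitAI · AI-powered code review · MIT License")
    return "\n".join(lines)


findings = {
    "Bug": ["auth.ts line 23: user.id can be undefined if session expires before check"],
    "Looks good": ["Error boundaries correctly implemented"],
}
comment = render_comment(8, findings)
comment_without_score = render_comment(8, findings, post_score=False)
print(comment)
```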


Project Structure

rabbitai/
├── .github/
│   └── workflows/
│       └── review.yml        ← GitHub Action trigger
├── assets/
│   └── rabbitai.png          ← logo used in PR comments
├── nodes/
│   ├── fetcher.py            ← GitHub API, fetch PR diff + metadata
│   ├── graph_builder.py      ← NetworkX dependency graph + blast radius
│   ├── classifier.py         ← change type detection
│   ├── embedder.py           ← embeddings + vector DB storage
│   ├── retriever.py          ← semantic search over stored chunks
│   ├── reviewer.py           ← LLM review generation
│   └── poster.py             ← GitHub PR comment poster
├── memory/
│   └── repo_memory.py        ← mem0 persistent memory
├── mcp/
│   └── server.py             ← MCP server for Claude/Cursor
├── utils/
│   ├── config.py             ← config.yaml loader + env var overrides
│   └── prompts.py            ← review prompt templates
├── agent.py                  ← LangGraph 9-node workflow entry point
├── config.example.yaml       ← copy to config.yaml and fill in
├── requirements.txt
└── README.md

Roadmap

Done:

  • 9-node LangGraph workflow
  • NetworkX knowledge graph + blast radius detection
  • ChromaDB, Pinecone, and Qdrant support
  • Gemini and OpenAI embedding providers
  • Gemini and OpenAI LLM providers
  • mem0 persistent memory
  • MCP server for Claude/Cursor

Planned:

  • GitLab and Bitbucket support
  • Web dashboard for review history
  • Slack and Discord notifications
  • Fine-tuned prompts per language

License

MIT — use it, fork it, self-host it, build on it.