coolhead/nlp-rag-document-platform

NLP-Based Document Understanding & Semantic Search Platform

This project demonstrates how to build RAG systems that fail safely, remain auditable, and are deployable in real production environments.


Architecture Overview

[Figure: RAG Architecture]


What This System Does

A production-minded Retrieval-Augmented Generation (RAG) system for document understanding and semantic search.

It ingests documents (PDFs), chunks and embeds them, performs vector-based semantic retrieval (FAISS/Chroma), and generates grounded answers using an LLM with explicit citations and relevance gating.
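The chunking step can be sketched as follows. This is a minimal illustration, not the code in ingest.py: the `Chunk` type, the `chunk_pages` function, and the character-based `size` and `overlap` defaults are all assumptions made for the example. It takes already-extracted page texts, normalizes whitespace, and keeps the page number on every chunk so that page-level citations remain possible later.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int  # 1-based page number, kept for page-level citations
    text: str


def chunk_pages(pages: list[str], size: int = 500, overlap: int = 100) -> list[Chunk]:
    """Split each page into overlapping character windows (hypothetical helper)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: list[Chunk] = []
    for page_no, raw in enumerate(pages, start=1):
        text = " ".join(raw.split())  # collapse runs of whitespace and line breaks
        for start in range(0, len(text), step):
            chunks.append(Chunk(page=page_no, text=text[start:start + size]))
            if start + size >= len(text):
                break  # this window already reached the end of the page
    return chunks
```

Overlapping windows reduce the chance that a sentence split across a chunk boundary becomes unretrievable. Chunking per page, rather than across the whole document, keeps each chunk attributable to exactly one page.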


Key Features

  • Document ingestion (PDFs)
  • Text cleaning and chunking
  • Sentence‑transformer embeddings
  • Vector search with FAISS
  • RAG with Ollama / OpenAI support
  • Page‑level citations with excerpts
  • Hallucination guardrails ("I don’t know" on weak evidence)
  • Relevance thresholding and deduplication
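Relevance thresholding and deduplication, the last feature above, might look roughly like this. The `Hit` type and `gate_and_dedupe` function are hypothetical names, and the sketch assumes that a higher score means a more relevant hit; the 0.10 default mirrors the `MIN_RELEVANCE` example value in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    page: int
    text: str
    score: float  # assumption: higher means more relevant


def gate_and_dedupe(hits: list[Hit], min_relevance: float = 0.10) -> list[Hit]:
    """Drop weak hits, then keep only the best-scoring copy of each excerpt."""
    kept: list[Hit] = []
    seen: set[tuple[int, str]] = set()
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if hit.score < min_relevance:
            break  # sorted descending, so every remaining hit is weaker
        # Same page and same text after case/whitespace normalization = duplicate
        key = (hit.page, " ".join(hit.text.lower().split()))
        if key in seen:
            continue
        seen.add(key)
        kept.append(hit)
    return kept
```

An empty result is the signal for the "I don’t know" path: nothing passed the gate, so there is no evidence to ground an answer in.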

Project Structure

nlp-semantic-search-rag/
├── app/
│   ├── api.py          # FastAPI endpoints
│   ├── rag.py          # LLM + RAG logic
│   ├── retriever.py    # Vector store interaction
│   ├── ingest.py       # PDF ingestion
│   ├── embeddings.py  # Embedding generation
│   ├── schemas.py     # API contracts
│   └── settings.py
├── data/
│   ├── raw/            # Input PDFs (ignored by git)
│   ├── processed/
│   └── index/          # FAISS index + metadata (generated)
├── scripts/
├── tests/
├── bootstrap.sh
├── Makefile
├── requirements.txt
├── .env.example
└── README.md

Quickstart

# from the root of the cloned repo, create a local .env from the template
cp .env.example .env

# create virtualenv
python -m venv .venv
source .venv/bin/activate

# install deps
pip install -r requirements.txt

# start API
make run

API will be available at:

http://localhost:8000

Usage

1️⃣ Ingest documents

curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '{"path":"./data/raw"}'

2️⃣ Ask questions (RAG)

curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"query":"List the projects mentioned and their status","top_k":6}'

Each response includes:

  • Direct answer
  • Page‑level citations
  • Text excerpts
  • Relevance scores
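The same /ask call can be made from Python using only the standard library. This is a sketch: `build_ask_request` and `ask` are hypothetical helper names, and because the README does not spell out the response schema, the decoded JSON is returned as-is rather than mapped to named fields.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Quickstart


def build_ask_request(query: str, top_k: int = 6) -> urllib.request.Request:
    """Build the POST /ask request with the same JSON body as the curl example."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, top_k: int = 6) -> dict:
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_ask_request(query, top_k)) as resp:
        return json.load(resp)
```

With the API running, `ask("List the projects mentioned and their status")` returns the parsed response as a dictionary.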

Environment Variables

Defined in .env:

LLM_PROVIDER=ollama      # or openai
OLLAMA_MODEL=llama3.2:latest
MIN_RELEVANCE=0.10      # relevance gate
# OPENAI_API_KEY=...
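One way these variables could be read and validated at startup is sketched below. It is not the contents of settings.py: the `Settings` class, the `load_settings` function, and the fallback defaults (copied from the example values above) are assumptions for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_provider: str
    ollama_model: str
    min_relevance: float
    openai_api_key: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment (hypothetical loader, assumed defaults)."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "ollama").strip().lower()
    if provider not in {"ollama", "openai"}:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    api_key = env.get("OPENAI_API_KEY") or None
    if provider == "openai" and api_key is None:
        raise ValueError("OPENAI_API_KEY is required when LLM_PROVIDER=openai")
    return Settings(
        llm_provider=provider,
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2:latest"),
        min_relevance=float(env.get("MIN_RELEVANCE", "0.10")),
        openai_api_key=api_key,
    )
```

Failing at startup on a missing key or an unknown provider is cheaper than discovering the problem on the first /ask request.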

Why This Exists

Most RAG demos fail in production because they:

  • hallucinate confidently
  • ignore evidence quality
  • mix retrieval with reasoning

This project demonstrates how to build RAG correctly:

  • Retrieval and reasoning are separated
  • Answers are grounded strictly in evidence
  • Weak evidence returns "I don’t know"
  • Citations are explicit and inspectable
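The principles above can be illustrated with a small sketch in which retrieval and generation are passed in as separate callables. All names here (`Evidence`, `build_prompt`, `answer`, and the shape of the returned dictionary) are hypothetical and do not describe the actual code in rag.py.

```python
from typing import Callable

Evidence = tuple[int, str, float]  # (page, excerpt, relevance score)
FALLBACK = "I don't know"


def build_prompt(query: str, evidence: list[Evidence]) -> str:
    """Number each excerpt so the model can cite it explicitly."""
    sources = "\n".join(
        f"[{i}] (page {page}) {text}"
        for i, (page, text, _score) in enumerate(evidence, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them like [1]. "
        f"If they do not contain the answer, reply exactly: {FALLBACK}\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


def answer(
    query: str,
    retrieve: Callable[[str], list[Evidence]],
    generate: Callable[[str], str],
    min_relevance: float = 0.10,
) -> dict:
    """Retrieve first, gate on relevance, and only then call the LLM."""
    evidence = [e for e in retrieve(query) if e[2] >= min_relevance]
    if not evidence:
        # Weak or missing evidence: refuse without ever calling the model
        return {"answer": FALLBACK, "citations": []}
    text = generate(build_prompt(query, evidence))
    citations = [{"page": p, "excerpt": t, "score": s} for p, t, s in evidence]
    return {"answer": text, "citations": citations}
```

Because the model is never called when nothing passes the gate, the refusal path is deterministic and easy to test. The citations come from the retrieved evidence, not from the model's output, so they stay inspectable.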

This makes the system suitable for real‑world knowledge access, audits, and decision support.


Status

✅ MVP complete

Possible next steps:

  • /debug/search endpoint for tuning
  • Multi‑document knowledge bases
  • Authentication & access control
  • CI + tests
  • UI layer

RAG Design Tradeoffs & Considerations

  • Why relevance gating is critical for production RAG
  • How deduplication prevents citation spam
  • Tradeoffs between FAISS and Chroma
  • Why retrieval and reasoning must be separated

Built with a strong focus on correctness, explainability, and production realism.

About

Production-ready NLP & RAG platform for document understanding, semantic search, and grounded question answering.
