LangCore API

HTTP service for LangCore — production-ready structured document extraction powered by FastAPI, Celery, and Redis.

Overview

langcore-api wraps the entire LangCore plugin ecosystem into a deployable HTTP service. Submit a document URL or raw text via REST, and get back structured entities — asynchronously via polling or webhooks. It integrates all LangCore plugins (LiteLLM, audit, guardrails, hybrid, DSPy, RAG) and adds production concerns: task queuing, caching, observability, security, and idempotency.

Features

Async task queue — FastAPI receives requests, Redis brokers tasks, Celery workers execute extractions
Single and batch extraction — POST /extract for one document, POST /extract/batch for many
Multi-pass extraction with confidence scoring — run multiple passes and score entities by consistency (0.0–1.0)
Consensus mode — run the same extraction through multiple LLM providers and keep only agreed-upon entities
Multi-tier caching — LLM response cache (LiteLLM + Redis) and extraction-result cache (SHA-256 keyed) for near-instant re-runs
Webhook delivery — receive results via HMAC-signed webhook callbacks with custom headers
Idempotency — prevent duplicate tasks with idempotency_key
SSRF protection — private IP blocking, domain allow-lists, DNS timeout, redirect validation
Prometheus metrics — task counters, cache hit/miss rates, processing histograms
Structured logging — JSON-formatted logs via structlog
Full plugin integration — audit logging, output guardrails, hybrid rules, DSPy optimization, and RAG query parsing — all configurable via environment variables
Docker-ready — multi-stage Dockerfile with web, worker, and Flower profiles
100+ LLM support — any model accessible through LiteLLM (OpenAI, Gemini, Anthropic, Azure, Groq, Mistral, Ollama, vLLM, etc.)

Architecture

┌──────────┐      ┌──────────┐      ┌──────────┐
│   API    │─────▶│ FastAPI  │─────▶│  Redis   │
│  Client  │◀──── │  (web)   │      │ (broker) │
└────▲─────┘      └──────────┘      └────┬─────┘
     │                                   │
     │                              ┌────▼─────┐
     │          Webhook / Poll      │  Celery  │
     └──────────────────────────────┤  Worker  │
                                    └──────────┘

Client submits via POST /api/v1/extract (or /extract/batch)
FastAPI validates, enqueues a Celery task in Redis, returns a task ID
A Celery worker downloads the document text and runs the LangCore pipeline
Results are stored in Redis (TTL via RESULT_EXPIRES)
Client polls GET /api/v1/tasks/{task_id} or receives a webhook callback

Quick Start

Docker (Recommended)

cp .env.example .env          # Add your GEMINI_API_KEY or OPENAI_API_KEY
docker compose up --build      # API on :8000, Flower on :5555

Local Development

uv sync                                        # Install dependencies
docker run -d -p 6379:6379 redis:8-alpine      # Start Redis

export REDIS_HOST=localhost

# Terminal 1 — API
uv run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

# Terminal 2 — Worker
uv run celery -A app.workers.celery_app worker --loglevel=info

Production

docker compose --profile production up --build -d

Multi-worker Uvicorn (4 procs), multiple Celery replicas, resource limits, health checks.

API Reference

Method	Path	Description
`POST`	`/api/v1/extract`	Submit single extraction
`POST`	`/api/v1/extract/batch`	Submit batch of extractions
`GET`	`/api/v1/tasks/{task_id}`	Poll task status / result
`DELETE`	`/api/v1/tasks/{task_id}`	Revoke a running task
`POST`	`/api/v1/dspy/optimize`	Optimize extraction prompts with DSPy
`POST`	`/api/v1/rag/parse`	Parse a query for hybrid RAG retrieval
`GET`	`/api/v1/health`	Liveness probe
`GET`	`/api/v1/health/celery`	Worker readiness probe
`GET`	`/api/v1/metrics`	Task counters (submitted / completed / failed)

Interactive docs at http://localhost:8000/api/v1/docs (Swagger UI).

Submit Extraction (URL)

curl -X POST http://localhost:8000/api/v1/extract \
  -H "Content-Type: application/json" \
  -d '{
    "document_url": "https://example.com/contract.txt",
    "callback_url": "https://my-app.com/webhooks/done",
    "callback_headers": {"Authorization": "Bearer eyJhbGciOi..."},
    "provider": "gpt-4o",
    "passes": 2
  }'

{
  "task_id": "a1b2c3d4-...",
  "status": "submitted",
  "message": "Extraction submitted for https://example.com/contract.txt"
}

Submit Extraction (Raw Text)

curl -X POST http://localhost:8000/api/v1/extract \
  -H "Content-Type: application/json" \
  -d '{
    "raw_text": "AGREEMENT between Acme Corp and Beta LLC ...",
    "provider": "gpt-4o"
  }'

Poll Status

curl http://localhost:8000/api/v1/tasks/{task_id}

Idempotency

Pass an idempotency_key to prevent duplicate tasks:

{
  "raw_text": "...",
  "idempotency_key": "my-unique-key-123"
}

Webhook Delivery

Receive results via HMAC-signed webhook with custom headers:

{
  "raw_text": "...",
  "callback_url": "https://my-app.com/webhooks/done",
  "callback_headers": {"Authorization": "Bearer <token>"}
}

Response Schema

{
  "status": "completed",
  "source": "https://example.com/contract.txt",
  "data": {
    "entities": [
      {
        "extraction_class": "party",
        "extraction_text": "Acme Corporation",
        "attributes": {"role": "Seller", "jurisdiction": "Delaware"},
        "char_start": 52,
        "char_end": 68,
        "confidence_score": 1.0
      }
    ],
    "metadata": {
      "provider": "gpt-4o",
      "tokens_used": 1234,
      "processing_time_ms": 1200
    }
  }
}

Integration with LangCore Ecosystem

langcore-api consumes the full suite of LangCore plugins, all configurable via environment variables:

LiteLLM (100+ LLM Providers)

Every extraction routes through langcore-litellm. Override the model per-request via the provider field:

{"raw_text": "...", "provider": "gemini-2.5-flash"}

Supports OpenAI, Gemini, Anthropic, Azure, Groq, Mistral, Ollama, vLLM, and any LiteLLM-compatible backend.

Audit Logging (langcore-audit)

Structured audit trails for every extraction call with configurable sinks:

Variable	Default	Description
`AUDIT_ENABLED`	`true`	Enable structured audit logging
`AUDIT_SINK`	`logging`	Sink type: `logging`, `jsonfile`, `otel`
`AUDIT_LOG_PATH`	`audit.jsonl`	NDJSON file path (when sink=`jsonfile`)
`AUDIT_SAMPLE_LENGTH`	(unset)	Truncate prompt/response samples (chars)

Guardrails (langcore-guardrails)

Output validation with automatic retry:

Variable	Default	Description
`GUARDRAILS_ENABLED`	`true`	Enable output validation with retry
`GUARDRAILS_MAX_RETRIES`	`3`	Max retry attempts on validation failure
`GUARDRAILS_INCLUDE_OUTPUT_IN_CORRECTION`	`true`	Include invalid output in correction prompt

DSPy Prompt Optimization (langcore-dspy)

Optimize extraction prompts automatically:

Variable	Default	Description
`DSPY_ENABLED`	`false`	Enable the `/dspy/optimize` endpoint
`DSPY_MODEL_ID`	`gemini/gemini-2.5-flash`	LLM for optimization
`DSPY_OPTIMIZER`	`miprov2`	Strategy: `miprov2` or `gepa`

RAG Query Parsing (langcore-rag)

Parse natural-language queries for hybrid retrieval:

Variable	Default	Description
`RAG_ENABLED`	`false`	Enable the `/rag/parse` endpoint
`RAG_MODEL_ID`	`gpt-4o`	LLM for query decomposition
`RAG_TEMPERATURE`	`0.0`	Sampling temperature

Multi-Pass, Early Stopping & Consensus Mode

Multi-Pass with Confidence Scoring

Set passes > 1 to run multiple extraction passes. Each entity receives a confidence_score (0.0–1.0) indicating the fraction of passes that found it. Early stopping kicks in automatically when two consecutive passes produce identical results.

Consensus Mode

Run the same extraction through multiple LLM providers and keep only entities they agree on:

{
  "raw_text": "AGREEMENT between Acme Corp and Beta LLC ...",
  "passes": 2,
  "extraction_config": {
    "consensus_providers": ["gpt-4o", "gemini-2.5-pro"],
    "consensus_threshold": 0.7
  }
}

Customising Extraction

Override the default prompt and examples per-request:

{
  "raw_text": "Take Aspirin 81 mg daily.",
  "extraction_config": {
    "prompt_description": "Extract medication names and dosages.",
    "examples": [
      {
        "text": "Take Aspirin 81 mg daily.",
        "extractions": [
          {
            "extraction_class": "medication",
            "extraction_text": "Aspirin 81 mg",
            "attributes": {"dosage": "81 mg", "frequency": "daily"}
          }
        ]
      }
    ],
    "temperature": 0.2
  }
}

`extraction_config` key	Type	Description
`prompt_description`	`string`	Custom extraction prompt
`examples`	`list[dict]`	Few-shot examples
`temperature`	`float`	LLM temperature (0.0–2.0)
`consensus_providers`	`list[str]`	≥ 2 model IDs for consensus mode
`consensus_threshold`	`float`	Similarity threshold (0.0–1.0, default 0.6)
`structured_output`	`bool\|null`	Enable/disable LLM-level `response_format`
`guardrails`	`object`	Output validation config
`audit`	`object`	Audit logging config

Multi-Tier Caching

Tier 1 — LLM Response Cache

Every litellm.completion() call is cached in Redis. Identical prompts hit the cache directly with zero API cost. Multi-pass bypass ensures fresh responses on repeat passes.

Tier 2 — Extraction-Result Cache

Complete extraction results are cached (keyed by SHA-256 of text + prompt + model + settings). Cache hits return in < 500 ms with zero API cost.

Backend	Env Value	Use Case
`redis`	`EXTRACTION_CACHE_BACKEND=redis`	Default. Cross-worker, cross-job.
`disk`	`EXTRACTION_CACHE_BACKEND=disk`	Local dev / offline.
`none`	`EXTRACTION_CACHE_BACKEND=none`	Completely disabled.

Security

SSRF protection — private IP / localhost blocking, subdomain matching, URL length limits, DNS timeout, redirect-hop re-validation
Domain allow-list — ALLOWED_URL_DOMAINS restricts accepted document URLs
Webhook HMAC signing — WEBHOOK_SECRET signs outbound webhooks (HMAC-SHA256)
Provider validation — model IDs validated against strict regex

See docs/security.md for details.

Configuration

All settings are driven by environment variables (.env file supported):

General

Variable	Default	Description
`APP_NAME`	LangCore API	Display name
`API_V1_STR`	`/api/v1`	API version prefix
`DEBUG`	`false`	Enable debug mode
`LOG_LEVEL`	`info`	Logging level
`CORS_ORIGINS`	`["*"]`	Allowed CORS origins

Redis / Celery

Variable	Default	Description
`REDIS_HOST`	`redis`	Redis hostname
`REDIS_PORT`	`6379`	Redis port
`RESULT_EXPIRES`	`86400`	Result TTL in seconds
`TASK_TIME_LIMIT`	`3600`	Hard task timeout (seconds)

LLM / Extraction

Variable	Default	Description
`DEFAULT_PROVIDER`	`gpt-4o`	Default model (overridable per-request)
`OPENAI_API_KEY`	(empty)	OpenAI key
`GEMINI_API_KEY`	(empty)	Google Gemini key
`ANTHROPIC_API_KEY`	(empty)	Anthropic key
`EXTRACTION_CACHE_ENABLED`	`true`	Enable result caching
`EXTRACTION_CACHE_TTL`	`86400`	Cache TTL (seconds)
`EXTRACTION_CACHE_BACKEND`	`redis`	`redis`, `disk`, or `none`

Security

Variable	Default	Description
`ALLOWED_URL_DOMAINS`	(empty)	Comma-separated domain allow-list
`WEBHOOK_SECRET`	(empty)	HMAC-SHA256 signing key
`DOC_DOWNLOAD_TIMEOUT`	`30`	Download timeout (seconds)
`DOC_DOWNLOAD_MAX_BYTES`	`50000000`	Max document size

Supported Models

Any model accessible through LiteLLM:

Provider	Example Models	Key Variable
OpenAI	`gpt-4o`, `gpt-4o-mini`	`OPENAI_API_KEY`
Google	`gemini-2.5-pro`, `gemini-2.0-flash`	`GEMINI_API_KEY`
Anthropic	`claude-3.5-sonnet`, `claude-3-haiku`	`ANTHROPIC_API_KEY`
Azure OpenAI	`azure/gpt-4o`	`AZURE_API_KEY`
Groq	`groq/llama-3.1-70b`	`GROQ_API_KEY`
Mistral	`mistral/mistral-large-latest`	`MISTRAL_API_KEY`
Ollama	`ollama/llama3.1`	`OLLAMA_API_BASE`
vLLM	`hosted_vllm/meta-llama/Llama-3.1-8B`	Custom `api_base`

Project Structure

langcore-api/
├── app/
│   ├── main.py                    # App factory, middleware, lifespan
│   ├── core/                      # Config, logging, metrics, security, Redis
│   ├── services/                  # Extraction, caching, webhooks, providers
│   ├── workers/                   # Celery app, tasks (single + batch)
│   ├── api/routes/                # FastAPI route handlers
│   └── schemas/                   # Pydantic request/response models
├── tests/                         # pytest suite (219 tests)
├── docs/                          # security.md, deployment.md, recipes.md
├── examples/                      # curl, Python, TypeScript, Go clients
├── docker/                        # Multi-stage Dockerfile + entrypoint
├── docker-compose.yml
├── pyproject.toml
└── Makefile

Development

make install   # uv sync
make lint      # ruff check + format check
make format    # Auto-format
make test      # pytest -v
make test-cov  # pytest with coverage
make dev       # docker compose up --build
make clean     # docker compose down -v

Running Tests

uv run pytest -v                           # All tests
uv run pytest --cov=app --cov-report=term  # With coverage
uv run pytest tests/test_tasks.py -v       # Single file

License

Apache License 2.0 — see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
.github		.github
app		app
docker		docker
docs		docs
examples		examples
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
.ruff.toml		.ruff.toml
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

LangCore API

Overview

Features

Architecture

Quick Start

Docker (Recommended)

Local Development

Production

API Reference

Submit Extraction (URL)

Submit Extraction (Raw Text)

Poll Status

Idempotency

Webhook Delivery

Response Schema

Integration with LangCore Ecosystem

LiteLLM (100+ LLM Providers)

Audit Logging (langcore-audit)

Guardrails (langcore-guardrails)

DSPy Prompt Optimization (langcore-dspy)

RAG Query Parsing (langcore-rag)

Multi-Pass, Early Stopping & Consensus Mode

Multi-Pass with Confidence Scoring

Consensus Mode

Customising Extraction

Multi-Tier Caching

Tier 1 — LLM Response Cache

Tier 2 — Extraction-Result Cache

Security

Configuration

General

Redis / Celery

LLM / Extraction

Security

Supported Models

Project Structure

Development

Running Tests

Further Reading

License

About

Topics

Resources

License

Security policy

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages