Python bindings for the Laurus search engine. Provides lexical search, vector search, and hybrid search from Python via a native Rust extension built with PyO3 and Maturin.
- Lexical Search -- Full-text search powered by an inverted index with BM25 scoring
- Vector Search -- Approximate nearest neighbor (ANN) search using Flat, HNSW, or IVF indexes
- Hybrid Search -- Combine lexical and vector results with fusion algorithms (RRF, WeightedSum)
- Rich Query DSL -- Term, Phrase, Fuzzy, Wildcard, NumericRange, Geo, Boolean, Span queries
- Text Analysis -- Tokenizers, filters, stemmers, and synonym expansion
- Flexible Storage -- In-memory (ephemeral) or file-based (persistent) indexes
- Pythonic API -- Clean, intuitive Python classes with full type information

```bash
pip install laurus
```

To build from source (requires Rust toolchain):

```bash
pip install maturin
maturin develop
```

```python
import laurus

# Create an in-memory index
index = laurus.Index()

# Index documents
index.put_document("doc1", {"title": "Introduction to Rust", "body": "Systems programming language."})
index.put_document("doc2", {"title": "Python for Data Science", "body": "Data analysis with Python."})
index.commit()

# Search with a DSL string
results = index.search("title:rust", limit=5)
for r in results:
    print(f"[{r.id}] score={r.score:.4f} {r.document['title']}")

# Search with a query object
results = index.search(laurus.TermQuery("body", "python"), limit=5)
```

In-memory (ephemeral) index:

```python
index = laurus.Index()
```

File-based (persistent) index with a schema:

```python
schema = laurus.Schema()
schema.add_text_field("title")
schema.add_text_field("body")
schema.add_hnsw_field("embedding", dimension=384)
index = laurus.Index(path="./myindex", schema=schema)
```

| Query class | Description |
|---|---|
| `TermQuery(field, term)` | Exact term match |
| `PhraseQuery(field, [terms])` | Ordered phrase match |
| `FuzzyQuery(field, term, max_edits)` | Approximate term match |
| `WildcardQuery(field, pattern)` | Wildcard pattern match (`*`, `?`) |
| `NumericRangeQuery(field, min, max)` | Numeric range (int or float) |
| `GeoQuery(field, lat, lon, radius_km)` | Geo-distance radius search |
| `BooleanQuery(must, should, must_not)` | Compound boolean logic |
| `SpanNearQuery(field, [terms], slop)` | Proximity / ordered span match |
| `VectorQuery(field, vector)` | Pre-computed vector similarity |
| `VectorTextQuery(field, text)` | Text-to-vector similarity (requires embedder) |
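
To give a feel for what an "approximate term match" with `max_edits` means, here is a minimal pure-Python sketch. It assumes `max_edits` bounds the classic Levenshtein distance (insertions, deletions, substitutions); this README does not say which distance or algorithm Laurus actually uses, so treat `edit_distance` and `fuzzy_matches` as hypothetical helpers for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, using two rolling rows."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def fuzzy_matches(term, vocabulary, max_edits=2):
    """Return the vocabulary terms within max_edits of term (hypothetical helper)."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_edits]


vocabulary = ["rust", "trust", "rusty", "python", "dust"]
matches = fuzzy_matches("rust", vocabulary, max_edits=1)
print(matches)
```

A real engine would not scan the whole vocabulary like this; the sketch only illustrates which terms fall inside a given edit budget.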
```python
# query_vec is a pre-computed query embedding for the "embedding" field
request = laurus.SearchRequest(
    lexical_query=laurus.TermQuery("body", "rust"),
    vector_query=laurus.VectorQuery("embedding", query_vec),
    fusion=laurus.RRF(k=60.0),
    limit=10,
)
results = index.search(request)
```

| Class | Description |
|---|---|
| `RRF(k=60.0)` | Reciprocal Rank Fusion (rank-based, default for hybrid) |
| `WeightedSum(lexical_weight=0.5, vector_weight=0.5)` | Score-normalised weighted sum |
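
The two fusion strategies can be sketched in plain Python to show how they differ. This is an illustration of the general techniques, not Laurus's implementation: the `rrf` and `weighted_sum` functions below are hypothetical, and the min-max normalisation used for the weighted sum is an assumption, since this README does not say how scores are normalised.

```python
from collections import defaultdict


def rrf(rankings, k=60.0):
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def weighted_sum(lexical, vector, lexical_weight=0.5, vector_weight=0.5):
    """Min-max normalise each score set (assumed), then combine with weights."""
    def normalise(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    lex, vec = normalise(lexical), normalise(vector)
    fused = {
        d: lexical_weight * lex.get(d, 0.0) + vector_weight * vec.get(d, 0.0)
        for d in set(lex) | set(vec)
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Made-up scores for illustration only
lexical_scores = {"doc1": 2.0, "doc2": 1.0, "doc3": 0.5}
vector_scores = {"doc2": 0.9, "doc3": 0.8, "doc4": 0.1}

# RRF only looks at the order of each result list
lexical_ranking = sorted(lexical_scores, key=lexical_scores.get, reverse=True)
vector_ranking = sorted(vector_scores, key=vector_scores.get, reverse=True)

rrf_order = [doc for doc, _ in rrf([lexical_ranking, vector_ranking])]
ws_order = [doc for doc, _ in weighted_sum(lexical_scores, vector_scores)]
print(rrf_order)
print(ws_order)
```

With these made-up scores, both methods put `doc2` first, but RRF places `doc3` ahead of `doc1` because it appears in both lists, while the weighted sum keeps `doc1` ahead because of its high lexical score. That is the practical difference between rank-based and score-based fusion.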
```python
syn_dict = laurus.SynonymDictionary()
syn_dict.add_synonym_group(["ml", "machine learning"])

tokenizer = laurus.WhitespaceTokenizer()
filt = laurus.SynonymGraphFilter(syn_dict, keep_original=True, boost=0.8)

tokens = tokenizer.tokenize("ml tutorial")
tokens = filt.apply(tokens)
for tok in tokens:
    print(tok.text, tok.position, tok.boost)
```

Usage examples are in the examples/ directory:

| Example | Description |
|---|---|
| quickstart.py | Basic indexing and full-text search |
| lexical_search.py | All query types (Term, Phrase, Boolean, Fuzzy, Wildcard, Range, Geo, Span) |
| vector_search.py | Semantic similarity search with embeddings |
| hybrid_search.py | Combining lexical and vector search with fusion |
| synonym_graph_filter.py | Synonym expansion in the analysis pipeline |
| search_with_openai.py | Cloud-based embeddings via OpenAI |
| multimodal_search.py | Text-to-image and image-to-image search |
This project is licensed under the MIT License - see the LICENSE file for details.