Rehan Malik rehan243

About Me — AI/ML Engineer | Generative AI | LLM Systems

i'm an AI/ML engineer based in the US, currently building production AI systems at Reallytics.ai and Verticiti. most of my work revolves around getting large language models to do useful things in production — not toy demos, actual systems handling real traffic.

before this, i spent years at Afiniti and Cloud Kinetics doing the grunt work of making ML models reliable at scale. fraud detection, voice analytics, enterprise search: the kind of stuff that breaks at 3am and that you then have to fix.

what keeps me going: that moment when an AI agent you built actually solves a problem you didn't explicitly program it for. still hits different every time.

right now i'm deep into:

multi-agent systems that coordinate without falling apart
RAG pipelines that actually find what you're looking for
writing daily about what i learn — AI Engineering Notes

Featured Projects — AI Agents, RAG, LLM Fine-Tuning

Agentic AI Workflows: Production AI Agents
8 specialized AI agents with LangChain + OpenAI function calling. multi-agent orchestration with planning loops and guardrails. the project i'm most excited about.

RAG Enterprise Search: Retrieval-Augmented Generation
production retrieval pipeline over 2TB+ data. LangChain, FAISS, ChromaDB, cross-encoder re-ranking. deployed on AWS SageMaker.

Voice AI Platform: Real-Time Speech AI
real-time voice infrastructure handling 500+ concurrent calls. WebSockets, Apache Kafka, gRPC with CUDA. speech-to-text, sentiment analysis.

LLM Fine-Tuning (LoRA/QLoRA): Parameter-Efficient Fine-Tuning
fine-tuning LLaMA-2 and Mistral with LoRA/QLoRA/PEFT. 40% cost reduction vs hosted APIs. vLLM serving on SageMaker.

RLHF LLM Optimization: Reinforcement Learning from Human Feedback
full RLHF pipeline: supervised fine-tuning, reward modeling, PPO with KL constraints. 68% win rate, 96% safety compliance.

Sentinel Fraud Detection: Explainable AI
ensemble XGBoost + Isolation Forest with 650+ engineered features. SHAP explainability, UMAP clustering, GenAI reports via Amazon Bedrock.

Tech Stack — Python, PyTorch, LangChain, AWS, Docker

i'm not going to pretend i use everything equally. here's what i actually reach for day-to-day:

daily drivers: Python, PyTorch, FastAPI, Docker, Git, VS Code
LLM & GenAI: LangChain, LlamaIndex, HuggingFace Transformers, vLLM, PEFT/LoRA/QLoRA
vector & data: FAISS, ChromaDB, Pinecone, PostgreSQL, MongoDB, Redis, Kafka, Elasticsearch
cloud & MLOps: AWS (SageMaker, Bedrock, Lambda, ECS), GCP Vertex AI, Azure OpenAI
ML frameworks: TensorFlow, scikit-learn, XGBoost, LightGBM, ONNX
infrastructure: Kubernetes, Terraform, GitHub Actions, MLflow, Weights & Biases

GitHub Stats

i commit a lot. sometimes it's good code, sometimes it's "fix: typo in typo fix".

Latest AI Research Articles

i publish research notes daily — not polished papers, just honest writeups of what i'm learning and building. think of it as a public lab notebook for generative AI, LLM fine-tuning, RAG, and agentic systems.

Explainable AI For Time Series Forecasting (2026-04-17)
Retrieval Augmented Generation RAG With Low Late (2026-04-16)
Fine Tuning LLMs With Synthetic Data For Enterpris (2026-04-16)
Model Context Protocol And Tool Use (2026-04-15)

📚 View all articles →

Recent Open-Source Activity

💬 Commented on Sparse cotangent decorator (short implementation example) in jax-ml/jax (2026-04-17)

💬 Commented on Problem 23 (A=B): All test inputs are multiples of 100, caus in FrontierCS/Frontier-CS (2026-04-17)

💬 Commented on How to backfill offline store with UDF-transformed data for in feast-dev/feast (2026-04-17)

💬 Commented on very short answer bug in zai-org/CogVLM (2026-04-17)

💬 Commented on [Feature][AutoDeploy]: Piecewise CUDA graph for MTP in NVIDIA/TensorRT-LLM (2026-04-17)

💬 Commented on Having a single baseline is kinda wrong? in lmarena/arena-hard-auto (2026-04-17)

💬 Commented on Automation rule filter: fields accepted by API are silently in comet-ml/opik (2026-04-17)

💬 Commented on Skills + S3 filesystem Latency - Skills are SUPER slow (abou in mastra-ai/mastra (2026-04-17)

Currently Researching

topics discovered daily by a multi-model AI research engine (GPT-4.1, Grok-3, DeepSeek R1, Llama-4)

🔬 Graph Neural Networks for Recommendation Systems

🔬 Real-time Data Quality Monitoring for ML Pipelines

🔬 Explainable AI for Time Series Forecasting

🔬 Fine-Tuning LLMs with Synthetic Data for Enterprise Customization

🔬 Retrieval-Augmented Generation (RAG) with Low-Latency Vector Databases

🔬 Model Context Protocol and Tool Use

Code Snippets & Gists

📌 Prompt Version Control & A/B Testing Registry (Python) (2026-04-17)

📌 Configuration-Driven ML Pipeline Runner with Validation (Python) (2026-04-16)

📌 Token Budget Manager: LLM Context Window Optimization (Python) (2026-04-15)

🤖 Profile auto-updated on 2026-04-17 19:10 UTC

if you made it this far, you should probably just say hi
