Raj Shekhar
Backend Engineer & Agentic AI Builder
Final-year CS student at LPU building production AI systems: RAG pipelines, LangGraph agents, and distributed microservices. Currently at NERVESPARKS, building enterprise automation systems.
Tech Stack
Experience
- Architected production Agentic AI systems using LangGraph with multi-agent workflows, memory persistence, and state management — automating complex enterprise tasks end-to-end.
- Built RAG pipelines with ChromaDB, achieving a 40% improvement in retrieval accuracy and handling 1,000+ concurrent requests with 60% lower latency.
- Designed an Audio RAG System using PyAnnote speaker diarization, OpenAI Whisper, and timestamp-based retrieval — enabling fully searchable audio knowledge bases.
- Built scalable FastAPI microservices with JWT auth and WebSocket support; optimized on-device GGUF model inference in Kotlin (Iris Android App), improving response times by 35%.
- Deployed containerized microservices via Docker + CI/CD on AWS ensuring zero-downtime releases.
Education
- CGPA: 8.0 / 10 — Final year, graduating 2026.
- Focus on distributed systems, system design, DSA, and microservices architecture.
Projects
Enterprise monitoring dashboard with fully LLM-powered anomaly detection, processing 10,000+ metrics/min. Cuts incident resolution time by 70% via automated Gemini 2.0 Flash diagnosis. Multi-tenant microservices provide full data isolation for 20+ concurrent users.
AI-powered multi-cloud deployment agent that converts GitHub README files into production-ready Terraform configs. Reduces deployment errors by 60% and eliminates manual IaC setup. Security guardrails prevent 100% of unsafe default configs across AWS and GCP.
Full-stack streaming platform with synchronized Party Rooms — real-time playback, WebRTC video/audio calls, live reactions via Socket.IO at sub-100ms latency for 100+ concurrent users. Gemini AI chatbot for natural language music discovery.
Offline-first Android AI chat app with local llama.cpp inference and a full RAG pipeline for document-based Q&A. On-device GGUF model inference in Kotlin — no cloud dependency, 35% faster response than baseline.
Contact
something great.
Open to backend engineering, AI/ML engineering, and full-stack roles. Graduating July 2026 — actively seeking opportunities.
Send me an email →