My Projects
HR Assistant Platform
Built enterprise-scale AI chatbot serving 10k+ employees, streamlining employee support
- Engineered a multi-format document extraction pipeline, processing 1,200+ files, with OCR and semantic analysis
- Architected an efficient Graph RAG pipeline with robust chunking strategy, improving retrieval accuracy by 30%
- Designed modular retrieval pipelines leveraging Qdrant, Graph RAG connectivity and Cross-Encoder re-ranking
- Leveraged LangChain for multi-turn chats, improving coherence, clarity of responses, and personalization by 20%
Real-Time TTS Engine
Delivered robust and scalable cross-lingual TTS APIs powering HelloJio and Jio Translate
- Engineered end-to-end data preprocessing with enhanced normalization to enable multilingual TTS model fine-tuning
- Developed a Python API wrapper to orchestrate LLM-based semantic tokenization and a VQGAN-based vocoder
- Enabled 100+ concurrent TTS streams on NVIDIA MIG instances, sustaining 0.2 RTF at 90% GPU utilization
- Deployed Gunicorn/Nginx with round-robin distribution, reducing API errors to 0.5% during peak traffic spikes
Multilingual OCR System
Developed a regional multi-language OCR with a high-accuracy language classifier model
- Fine-tuned the PaddleClas PULC-based language classifier model for 10 languages, achieving 93.54% F1 score
- Designed a multi-stage OCR pipeline with a confidence-based adaptive model routing, improving accuracy by 25%
- Engineered asynchronous OCR task handling using Celery with Redis broker, webhook callbacks, and retry logic
- Validated the OCR pipeline with MLOps practices, delivering 85% accuracy for 10 languages in production workflows
Object & Landmark Detection
Enhanced Multi-Object and Landmark Recognition with Scalable Deployment
- Fine-tuned RT-DETR (AP = 0.884 for object detection) and DINOv2 (93.51% F1 score for landmark classification)
- Designed a high-performance pipeline using Flask API, JWT, batch processing, and optimized for real-time response
- Devised scalable Docker/Kubernetes deployments with HPA auto-scaling and ELK stack, handling 4x peak loads
Additional Projects
AI-Powered SQL Assistant
Natural Language Database Interaction & Visualization
- Architected an AI SQL agent with an LLM for complex natural language to SQL translation
- Integrated diverse MCP tools for database operations, temporal queries, and dynamic QuickChart visualizations seamlessly
- Implemented a structured reasoning pipeline: planning, command preview, precise tool use, and insightful result analysis
- Deployed an intuitive Gradio WebUI, enhancing user interaction and system observability via detailed logging
Finance Domain Language Model
Domain-Specific LLM Fine-tuning
- Fine-tuned LLaMA 8B on 500k+ financial instructions to create a specialized financial LLM with domain expertise
- Leveraged Unsloth, 4-bit quantization, and PEFT/LoRA techniques for memory-efficient LLM fine-tuning processes
- Implemented efficient model serving using FP16/INT4 GGUF quantization and Ollama with domain-specific prompts
Telephonic Voice Assistant
Speech-Driven Generative AI
- Developed a voice assistant using Twilio's API, integrating ASR, LLM inference, and TTS pipelines for sub-2s latency
- Implemented LLM-driven user intent recognition and robust response validation to enable context-aware interactions
- Automated export of user interaction data as JSON logs for integration with analytics pipelines and CRM systems
COVID-19 Detection System
Deep Learning-Based Image Classification
- Developed a modular PyTorch Dataset & DataLoader pipeline to efficiently preprocess and batch 10k+ CT scans
- Architected a hierarchical EfficientNet-V2 CNN with a custom image classifier, achieving 94% validation accuracy
- Implemented ROC-AUC analysis, confusion matrix, & F1-score evaluation for robust diagnostic model assessment
Blockchain-Based Voting Solution
Decentralized Voting Platform
- Developed an end-to-end decentralized voting platform to organize secure elections using the Azure Blockchain
- Applied the concepts of cryptography to ensure the anonymity of voters on a public blockchain during the election
- Constructed the front-end using Angular and implemented the service to encrypt and decrypt votes using Python
Automatic Speech Recognition System
Speech Recognition using Vector Quantization
- Built a Speech Recognition system using vector quantized Codebook Matching and MFCC feature extraction
- Achieved 75.89% accuracy on clean set and 47.65% accuracy on noisy set of Google Speech Commands dataset
IITB-RISC Processor
Processor Design
- Designed 16-Bit, 6-Stage Pipeline processor, based on Turing-Complete Instruction Set in VHDL from scratch
- Programmed 14 Instructions including branch, arithmetic, & memory interface; tested it on Altera Nano FPGA
- Implemented a 6-stage pipelined processor with Branch Predictors, Priority Encoders and the Hazard Detection
Social Networking Website
ITSP IIT Bombay
- Developed a Social Networking Website with features of a Chat, News Feed, and Posts aimed at college students
- Implemented a feature of Auto Delete of the posts in the News Feed after all the targeted users have viewed it
- Designed the front-end of the website using CSS & HTML and the back-end using MySQL Connector & PHP