My Projects

HR Assistant Platform

Built an enterprise-scale AI chatbot serving 10k+ employees to streamline employee support

  • Engineered a multi-format document extraction pipeline with OCR and semantic analysis, processing 1,200+ files
  • Architected an efficient Graph RAG pipeline with a robust chunking strategy, improving retrieval accuracy by 30%
  • Designed modular retrieval pipelines leveraging Qdrant, Graph RAG connectivity, and Cross-Encoder re-ranking
  • Leveraged LangChain for multi-turn chats, improving response coherence, clarity, and personalization by 20%

Real-Time TTS Engine

Delivered robust and scalable cross-lingual TTS APIs powering HelloJio and Jio Translate

  • Engineered end-to-end data preprocessing with enhanced normalization to enable multilingual TTS model fine-tuning
  • Developed a Python API wrapper to orchestrate LLM-based semantic tokenization and a VQGAN-based vocoder
  • Enabled 100+ concurrent TTS streams on NVIDIA MIG instances, sustaining 0.2 RTF at 90% GPU utilization
  • Deployed Gunicorn/Nginx with round-robin distribution, reducing API errors to 0.5% during peak traffic spikes
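
For context on the RTF figure above: real-time factor is conventionally the ratio of synthesis time to the duration of the audio produced, so values below 1.0 mean faster than real time. A minimal Python sketch follows; the function name and the timings are illustrative assumptions, not measurements from this project.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return RTF = time spent synthesizing / duration of audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Illustrative numbers only: 2 s of compute for a 10 s utterance.
rtf = real_time_factor(synthesis_seconds=2.0, audio_seconds=10.0)
print(f"RTF = {rtf:.2f}")
```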

Multilingual OCR System

Developed a multilingual OCR system for regional languages with a high-accuracy language classifier

  • Fine-tuned the PaddleClas PULC-based language classifier for 10 languages, achieving a 93.54% F1 score
  • Designed a multi-stage OCR pipeline with confidence-based adaptive model routing, improving accuracy by 25%
  • Engineered asynchronous OCR task handling using Celery with Redis broker, webhook callbacks, and retry logic
  • Validated the OCR pipeline with MLOps practices, delivering 85% accuracy across 10 languages in production workflows
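
One way confidence-based routing like that described above can be sketched in Python: try a cheap model first and fall through to heavier ones only when confidence is low. The `route_ocr` function, the 0.8 threshold, and the stand-in models are hypothetical and do not reflect the production implementation.

```python
from typing import Callable, List, Tuple

# An OCR model is modelled as a callable returning (text, confidence in [0, 1]).
OcrModel = Callable[[bytes], Tuple[str, float]]


def route_ocr(image: bytes, models: List[OcrModel], threshold: float = 0.8) -> Tuple[str, float]:
    """Try models in order; stop at the first result meeting the threshold.

    If no model qualifies, return the highest-confidence result seen.
    """
    if not models:
        raise ValueError("at least one model is required")
    best: Tuple[str, float] = ("", -1.0)
    for model in models:
        text, confidence = model(image)
        if confidence > best[1]:
            best = (text, confidence)
        if confidence >= threshold:
            break  # confident enough, skip the remaining stages
    return best


# Stand-in models for illustration: a fast first stage and a slower fallback.
def fast_model(image: bytes) -> Tuple[str, float]:
    return ("he1lo", 0.55)


def accurate_model(image: bytes) -> Tuple[str, float]:
    return ("hello", 0.93)


print(route_ocr(b"fake-image-bytes", [fast_model, accurate_model]))
```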

Object & Landmark Detection

Enhanced multi-object and landmark recognition with scalable deployment

  • Fine-tuned RT-DETR (AP = 0.884 for object detection) and DINOv2 (93.51% F1 score for landmark classification)
  • Designed a high-performance pipeline using a Flask API, JWT authentication, and batch processing, optimized for real-time response
  • Devised scalable Docker/Kubernetes deployments with HPA autoscaling and the ELK stack, handling 4x peak loads

Additional Projects

AI-Powered SQL Assistant

Natural Language Database Interaction & Visualization

  • Architected an AI SQL agent with an LLM for complex natural language to SQL translation
  • Seamlessly integrated diverse MCP tools for database operations, temporal queries, and dynamic QuickChart visualizations
  • Implemented a structured reasoning pipeline: planning, command preview, precise tool use, and insightful result analysis
  • Deployed an intuitive Gradio WebUI, enhancing user interaction and system observability via detailed logging

Finance Domain Language Model

Domain-Specific LLM Fine-tuning

  • Fine-tuned LLaMA 8B on 500k+ financial instructions to create a specialized financial LLM with domain expertise
  • Leveraged Unsloth, 4-bit quantization, and PEFT/LoRA techniques for memory-efficient LLM fine-tuning
  • Implemented efficient model serving using FP16/INT4 GGUF quantization and Ollama with domain-specific prompts

Telephonic Voice Assistant

Speech-Driven Generative AI

  • Developed a voice assistant using Twilio's API, integrating ASR, LLM inference, and TTS pipelines and achieving sub-2s latency
  • Implemented LLM-driven user intent recognition and robust response validation to enable context-aware interactions
  • Automated export of user interaction data as JSON logs for integration with analytics pipelines and CRM systems

COVID-19 Detection System

Deep Learning-Based Image Classification

  • Developed a modular PyTorch Dataset & DataLoader pipeline to efficiently preprocess and batch 10k+ CT scans
  • Architected a hierarchical EfficientNet-V2 CNN with a custom image classifier, achieving 94% validation accuracy
  • Implemented ROC-AUC analysis, confusion matrix, & F1-score evaluation for robust diagnostic model assessment

Blockchain-Based Voting Solution

Decentralized Voting Platform

  • Developed an end-to-end decentralized voting platform for organizing secure elections on Azure Blockchain
  • Applied cryptographic techniques to ensure voter anonymity on a public blockchain during the election
  • Built the front end in Angular and implemented a Python service to encrypt and decrypt votes

Automatic Speech Recognition System

Speech Recognition using Vector Quantization

  • Built a speech recognition system using vector-quantized codebook matching and MFCC feature extraction
  • Achieved 75.89% accuracy on the clean set and 47.65% on the noisy set of the Google Speech Commands dataset
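
Codebook matching of the kind described above can be sketched as follows: each word gets a codebook of centroid vectors, and an utterance is assigned to the word whose codebook quantizes its frames with the least average distortion. This is a minimal Python illustration under assumed details. The 2-D toy vectors stand in for MFCC frames, and the codebooks are hand-written rather than trained.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_distortion(frames: List[Vector], codebook: List[Vector]) -> float:
    """Mean distance from each frame to its nearest codeword."""
    return sum(min(squared_distance(f, c) for c in codebook) for f in frames) / len(frames)


def classify(frames: List[Vector], codebooks: Dict[str, List[Vector]]) -> str:
    """Pick the label whose codebook quantizes the frames with least distortion."""
    return min(codebooks, key=lambda label: average_distortion(frames, codebooks[label]))


# Toy 2-D features standing in for MFCC vectors (hypothetical values).
codebooks = {
    "yes": [(0.0, 0.0), (1.0, 1.0)],
    "no": [(5.0, 5.0), (6.0, 4.0)],
}
utterance = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)]
print(classify(utterance, codebooks))
```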

IITB-RISC Processor

Processor Design

  • Designed a 16-bit, 6-stage pipelined processor from scratch in VHDL, based on a Turing-complete instruction set
  • Implemented 14 instructions, including branch, arithmetic, and memory-interface operations; tested on an Altera Nano FPGA
  • Built the 6-stage pipeline with branch predictors, priority encoders, and hazard detection

Social Networking Website

ITSP IIT Bombay

  • Developed a social networking website with chat, news feed, and post features aimed at college students
  • Implemented auto-deletion of news feed posts once all targeted users have viewed them
  • Built the front end with HTML and CSS and the back end with PHP and MySQL Connector
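
The auto-delete rule above reduces to a set comparison: a post is removed once the set of viewers covers the set of targeted users. The sketch below uses Python for illustration only, since the original site used PHP and MySQL. The `Post` and `NewsFeed` classes and the user names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Post:
    post_id: int
    target_users: Set[str]
    viewed_by: Set[str] = field(default_factory=set)


class NewsFeed:
    def __init__(self) -> None:
        self.posts: Dict[int, Post] = {}

    def add(self, post: Post) -> None:
        self.posts[post.post_id] = post

    def mark_viewed(self, post_id: int, user: str) -> bool:
        """Record a view; delete the post once every targeted user has seen it.

        Returns True if the post was deleted by this call.
        """
        post = self.posts[post_id]
        if user in post.target_users:
            post.viewed_by.add(user)
        if post.viewed_by >= post.target_users:  # superset check
            del self.posts[post_id]
            return True
        return False


feed = NewsFeed()
feed.add(Post(post_id=1, target_users={"asha", "ravi"}))
feed.mark_viewed(1, "asha")  # one targeted user has not viewed yet
feed.mark_viewed(1, "ravi")  # last targeted viewer, so the post is removed
print(1 in feed.posts)
```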