
Santosh Adabala
ML Engineer • LLM Specialist • AI Systems Architect
I build ML systems that make large language models faster, smaller, and production-ready. Specializing in LLM fine-tuning, knowledge distillation, prompt optimization, and distributed ML at scale.

About Me
Building ML systems that make LLMs production-ready
With 6+ years of experience, I specialize in LLM customization workflows: fine-tuning, knowledge distillation, prompt optimization, and model compression.
Technology Radar
Education
M.S. Data Science — University of Colorado Boulder
B.S. Electronics & Communications — JNTU Kakinada
Location & Availability
Colorado, United States
Available for new opportunities
Awards & Hackathons
2023 DataScience Hackathon
Extra Mile Award (Team & Individual)
Skills & Expertise
Technologies and tools I work with
Languages
ML/AI Frameworks
MLOps & Infrastructure
LLM & NLP Specializations
Data & Visualization
Featured Projects
A selection of ML/AI projects I've built
Wisdom Vault — The Inheritable Agent
An AI-powered inheritance system that lets a parent's decision patterns be inherited through cryptographically scoped tokens via Auth0 Token Vault. Uses Claude for wisdom extraction and multi-generational delegation with 2-of-3 trustee multi-sig.
Agentic AI Parenting Assistant
An AI-driven modular Parenting Agent built with Google's Agent Development Kit (ADK) and FatSecret integration. Features specialized sub-agents for parenting advice, nutrition meal planning, and basic medical guidance with stateful sessions.
Cold Email Generator
A cold email generator for service companies built with Groq, LangChain, and Streamlit. Extracts job listings from career pages and crafts personalized emails with relevant portfolio links from a vector database.
SemEval-2024 Task 8a — AI Text Detection
Binary classification on 119K+ texts
Research project for SemEval-2024 on detecting machine-generated text. Experimented with DistilBERT, DeBERTa, RoBERTa, and ALBERT on 119K+ samples. The combination of the DeBERTa tokenizer and a RoBERTa model achieved the best results.
Space Data Analytics Dashboard
Comprehensive analytics project covering 4,630+ space missions from 1957–2022. Features interactive Power BI dashboards analyzing launch trends, mission success rates, rocket usage, and global cooperation patterns.
PHI/PII Parser — FHIR Data Redaction
Redacts 10+ FHIR resource types across PII/PHI fields
A service that reads HL7 FHIR Bundle JSON files from AWS S3, detects and redacts PII/PHI fields using key-name matching and regex patterns, and outputs cleaned CSVs. Supports both FastAPI local deployment and serverless AWS Lambda with automatic S3 triggers.
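The redaction step described above can be illustrated with a minimal sketch. It is not the project's actual code: the key names in SENSITIVE_KEYS, the two regex patterns, the "[REDACTED]" placeholder, and the function names are all assumptions chosen for illustration, and reading from S3 and the FastAPI/Lambda wiring are left out.

```python
import csv
import io
import re

# Hypothetical lists: the real service's key names and patterns are not given here.
SENSITIVE_KEYS = {"name", "telecom", "address", "birthDate", "identifier"}
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-like number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email-like string
]
REDACTED = "[REDACTED]"


def redact(value, key=None):
    """Recursively redact a JSON value by key name, then by regex on strings."""
    if key in SENSITIVE_KEYS:
        return REDACTED  # key-name match: drop the whole subtree
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        for pattern in PATTERNS:
            value = pattern.sub(REDACTED, value)
    return value


def flatten(value, prefix=""):
    """Flatten nested JSON into a dict of dotted-path -> scalar."""
    if isinstance(value, dict):
        out = {}
        for k, v in value.items():
            out.update(flatten(v, f"{prefix}.{k}" if prefix else k))
        return out
    if isinstance(value, list):
        out = {}
        for i, v in enumerate(value):
            out.update(flatten(v, f"{prefix}[{i}]"))
        return out
    return {prefix: value}


def bundle_to_csv(bundle):
    """Redact every resource in a FHIR Bundle dict and return CSV text."""
    rows = [flatten(redact(e["resource"])) for e in bundle.get("entry", [])]
    fields = sorted({k for row in rows for k in row})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)  # missing columns become ""
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

In this sketch a key-name match replaces the entire value under that key, while the regex pass only touches free-text strings elsewhere in the resource. Fields such as "id" are deliberately left intact here; whether the real service redacts them is not stated.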
Data-Center Scale Computing
Projects focused on distributed computing at data-center scale, implementing scalable data processing pipelines and cloud-native architectures for large-scale ML workloads.

Experience
My career journey in ML and AI
Machine Learning Engineer
Blue Cross Blue Shield Association
Designed LLM fine-tuning and distillation pipelines using PyTorch and HuggingFace, compressing models from 175M to 45M parameters at 90% accuracy. Built distributed ML pipelines processing 10M+ monthly records. Achieved 3x inference speedup via structured pruning and INT8 quantization. Developed agentic AI workflows with LangChain reducing manual intervention by 25%. Implemented anomaly detection flagging $2M+ in potential fraud.
Data Science Intern
Parlay (Techstars '23)
Engineered PySpark data lake architecture on AWS S3 with Apache Hudi, achieving 40% reduction in batch processing time for 500K+ daily records. Automated ETL pipelines using Python multiprocessing and SQLAlchemy. Designed REST APIs reducing integration time for new data sources by 60%.
Software Engineer / ML Engineer
Accenture
Rebuilt ML training and data ingestion pipelines using BigQuery, Spark, Hadoop, and Databricks, improving throughput by 50%. Deployed distributed ML pipelines processing 100+ TB of financial data. Developed predictive analytics models improving forecast accuracy by 18%. Led Oracle-to-PostgreSQL migration with zero downtime.
Application Development Analyst
Accenture
Built enterprise data applications using SQL and Python for Fortune 500 clients in healthcare and financial services. Implemented automated testing frameworks reducing deployment errors by 15%.
Associate Application Developer
Accenture
Developed backend systems and database solutions using SQL and Java, contributing to 3 major client deliverables.
2023 DataScience Hackathon
Hackathon Award
Recognized for building innovative data science solutions. Also received Extra Mile Awards in both team and individual categories at Accenture.
Publications & Writing
Articles, papers, and talks
Certifications
Professional credentials and continuous learning
AWS Certified Machine Learning Engineer - Associate
Amazon Web Services · Mar 2026 · Expires Mar 2029
AWS Certified Solutions Architect
Amazon Web Services
Applications of AI for Anomaly Detection
NVIDIA · Nov 2024
The Structured Query Language (SQL)
University of Colorado Boulder · May 2023
Data Analysis with R Programming
Google · May 2024
Share Data Through the Art of Visualization
Google · May 2024
Analyze Data to Answer Questions
Google · May 2024
Process Data from Dirty to Clean
Google · May 2024
Ask Questions to Make Data-Driven Decisions
Prepare Data for Exploration
Introduction to Programming and Tidyverse
University of Colorado Boulder · Aug 2021
R Programming and Tidyverse Capstone Project
University of Colorado Boulder · Sep 2022
Data Analysis with Tidyverse
University of Colorado Boulder · Aug 2022
The Ultimate MySQL Bootcamp
Udemy

Get in Touch
Let's connect and discuss opportunities
What People Say
“Santosh is great at taking a messy ML problem and turning it into something that actually works in production. He helped us rethink our model compression approach and was always willing to dig into the details when things did not go as expected. Solid engineer, easy to work with.”
David Chen
ML Engineering Lead, Blue Cross Blue Shield Association
“He came in as an intern and immediately started making an impact on our data pipelines. No hand-holding needed — he figured out the PySpark setup, proposed improvements, and shipped them. Would definitely work with him again.”
Rachel Martinez
Data Engineering Manager, Parlay (Techstars '23)
“What I appreciate about Santosh is that he does not just build models — he thinks about the whole system. He is good at communicating trade-offs to non-technical stakeholders and does not overcomplicate things. Reliable teammate.”
Priya Sharma
Senior Product Manager, Blue Cross Blue Shield Association