
Santosh Adabala
ML Engineer • LLM Specialist • AI Systems Architect

About Me
Building ML systems that make LLMs production-ready
I build ML systems that make large language models faster, smaller, and production-ready. With 5+ years of experience applying ML, NLP, and predictive analytics to healthcare and insurance data, I specialize in clinical text analysis, model compression, and production ML pipelines.
Impact Metrics
Location
Colorado, United States
🟢 Available for opportunities
Awards
2023 DataScience Hackathon
Extra Mile Award (Team & Individual)
Currently Learning
RLHF (Reinforcement Learning from Human Feedback) & Reward Modeling
System Studio
Choose a real problem, tune the priority, and see the architecture tradeoffs behind the work
Latency-sensitive healthcare ML
Clinical NLP Compression
Make a clinical entity model smaller and faster without giving up useful accuracy.
Clinical Text
Stage 1/5: Validate schema, source quality, and entity coverage before training.
Optimization Focus
Decision
Serve the compact model first
Prioritize the distilled model and keep the larger model as a quality reference.
Result
Inference latency reduced from 39 ms to 11 ms, with a much smaller model artifact.
92% confidence
Production judgment: smaller models, measured tradeoffs, and deployment-aware evaluation.
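The page does not say how the compact model was distilled. As an illustration only, a common knowledge distillation objective combines a temperature-softened KL term against the larger model's outputs with ordinary cross-entropy on the gold label. The function names, `temperature`, and `alpha` below are assumptions chosen for the sketch, not details from this project.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical distillation objective for one example.

    alpha weights the soft-target term; (1 - alpha) weights the
    hard-label cross-entropy. Both values are illustrative.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student against the gold label at T = 1.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-target gradient scale comparable.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # made-up logits for three entity classes
    student = [2.5, 1.5, 0.2]
    print(distillation_loss(student, teacher, label=0))
```

When the student reproduces the teacher's logits exactly, the KL term is zero and only the hard-label term remains, which is one reason to keep the larger model around as a quality reference.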
Skills & Expertise
An interactive constellation of technologies — drag, hover, and filter
Featured Projects

Experience
My career journey in ML and AI
Machine Learning Engineer
Blue Cross Blue Shield of Colorado
Used ClinicalBERT, Python, and Scikit-learn to analyze clinical notes, claims narratives, and prior authorization text, helping medical review teams find relevant clinical patterns faster. Built Python and SQL-based models to prioritize high-risk claims, predict likely denials, and surface high-cost cases. Built HIPAA-compliant data pipelines with Azure Data Factory, Azure Synapse, and Python to process claims from 6+ payer systems, reducing manual data preparation by 30%. Developed Power BI dashboards to track denial rates, turnaround time, and provider performance.
Data Science Intern
Parlay (Techstars '23)
Engineered a PySpark data lake architecture on AWS S3 with Apache Hudi, achieving a 40% reduction in batch processing time for 500K+ daily records. Automated data ingestion and transformation pipelines using Python multiprocessing and SQLAlchemy. Designed REST APIs to standardize internal data access, reducing integration time for new data sources by 60%.
Machine Learning Engineer
Accenture – Sun Life Insurance Client
Built predictive models with Random Forest, XGBoost, and Scikit-learn to analyze insurance claims, policyholder behavior, and risk patterns. Developed anomaly detection logic to flag suspicious billing activity and high-risk policy behavior. Prepared ML-ready datasets using Azure Data Factory, Azure Synapse, PySpark, and Python, improving ingestion throughput by 50% and reducing reporting costs by 31%. Deployed Azure Databricks pipelines with data quality checks, logging, and SLA monitoring.
2023 DataScience Hackathon
Hackathon Award
Recognized for building innovative data science solutions. Also received Extra Mile Awards in both team and individual categories at Accenture.
Publications & Writing
Articles, papers, and talks
Certifications
Professional credentials and continuous learning
AWS Certified Machine Learning Engineer - Associate
Amazon Web Services · Mar 2026 · Expires Mar 2029
AWS Certified Solutions Architect
Amazon Web Services
Applications of AI for Anomaly Detection
NVIDIA · Nov 2024
The Structured Query Language (SQL)
University of Colorado Boulder · May 2023
Data Analysis with R Programming
Google · May 2024
Share Data Through the Art of Visualization
Google · May 2024
Analyze Data to Answer Questions
Google · May 2024
Process Data from Dirty to Clean
Google · May 2024
Ask Questions to Make Data-Driven Decisions
Prepare Data for Exploration
Introduction to Programming and Tidyverse
University of Colorado Boulder · Aug 2021
R Programming and Tidyverse Capstone Project
University of Colorado Boulder · Sep 2022
Data Analysis with Tidyverse
University of Colorado Boulder · Aug 2022
The Ultimate MySQL Bootcamp
Udemy
GitHub Activity
Live stats and contributions
Currently Building
AlignLLM — LLM Alignment Pipeline on AWS
Building an alignment pipeline for a 7B-parameter chat model using SFT, DPO, RLHF, and LoRA/QLoRA on AWS SageMaker, with infrastructure managed in Terraform.
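For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is a generic illustration of the published DPO loss, not code from AlignLLM; the function name, argument names, and the `beta` default are assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of a full response
    under the trainable policy or the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) written as log(1 + exp(-z)), branched for stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


if __name__ == "__main__":
    # Made-up log-probabilities: the policy has shifted toward the chosen reply.
    print(dpo_loss(-12.0, -20.0, -14.0, -18.0))
```

When the policy assigns the same log-probabilities as the reference, the loss is log 2 (about 0.693). It falls as the policy raises the chosen response relative to the rejected one.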

Get in Touch
Let's connect and discuss opportunities
What People Say
“Santosh is great at taking a messy ML problem and turning it into something that actually works in production. He helped us rethink our model compression approach and was always willing to dig into the details when things did not go as expected. Solid engineer, easy to work with.”
David Chen
ML Engineering Lead, Blue Cross Blue Shield Association