Santosh Adabala — AI & ML Engineer
Open to Work
Currently Learning: RLHF (Reinforcement Learning from Human Feedback) & Reward Modeling

Santosh Adabala

ML Engineer • LLM Specialist • AI Systems Architect

I build ML systems that make large language models faster, smaller, and production-ready. I specialize in LLM fine-tuning, knowledge distillation, prompt optimization, and distributed ML at scale.

6+ Years Experience
10M+ Records Processed
3x Inference Speedup
$2M+ Fraud Flagged

About Me

Building ML systems that make LLMs production-ready

With 6+ years of experience, I specialize in LLM customization workflows: fine-tuning, knowledge distillation, prompt optimization, and model compression.

175M→45M Model Compression
3x Faster Inference Speed
$2M+ Fraud Detected
10M+ Records/Month

Technology Radar

Core → Proficient → Familiar

PyTorch, HuggingFace, Python, LangChain, QLoRA/PEFT, PySpark, Docker, K8s, AWS, MLflow, Databricks, TensorFlow, scikit-learn, SQL, Kafka, Kubeflow, GCP, Terraform, Hudi, Plotly, Streamlit, spaCy, Scala

Education

M.S. Data Science — University of Colorado Boulder

B.S. Electronics & Communications — JNTU Kakinada


Location & Availability

Colorado, United States

Available for new opportunities

Awards & Hackathons

2023 DataScience Hackathon

Extra Mile Award (Team & Individual)

Skills & Expertise

Technologies and tools I work with

Languages

Python: 95%
SQL: 90%
Scala: 70%
Java: 65%
C++: 60%

ML/AI Frameworks

PyTorch: 92%
HuggingFace Transformers: 90%
TensorFlow: 80%
scikit-learn: 88%
LangChain: 85%
spaCy: 78%

MLOps & Infrastructure

Docker: 88%
Kubernetes: 80%
MLflow: 85%
Kubeflow: 75%
AWS (S3, SageMaker): 85%
GCP (BigQuery): 78%
Databricks: 85%
Terraform: 72%

LLM & NLP Specializations

LLM Fine-Tuning (QLoRA, PEFT): 92%
Knowledge Distillation: 90%
Model Compression & Quantization: 88%
Prompt Engineering: 90%
RAG Systems: 85%
Agentic AI Workflows: 82%
Text Classification: 86%
Anomaly Detection: 84%

Data & Visualization

PySpark: 90%
Pandas: 93%
Apache Kafka: 78%
Apache Hudi: 80%
Power BI: 82%
Plotly / Dash: 78%
Streamlit: 80%

Featured Projects

A selection of ML/AI projects I've built

LLM Architecture

Wisdom Vault — The Inheritable Agent

An AI-powered inheritance system that lets a parent's decision patterns be inherited through cryptographically scoped tokens via Auth0 Token Vault. Uses Claude for wisdom extraction and multi-generational delegation with 2-of-3 trustee multi-sig.

Python, Flask, Auth0, Claude API, JWT, RSA
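
The 2-of-3 trustee rule in the Wisdom Vault description can be illustrated as a simple threshold check. This is a minimal sketch, not the project's implementation: the trustee identifiers, the `approve_delegation` function, and the use of plain strings in place of verified signatures are all assumptions made for illustration.

```python
from typing import Iterable

# Hypothetical trustee identifiers; a real system would verify signatures
# rather than trust plain strings.
TRUSTEES = frozenset({"trustee_a", "trustee_b", "trustee_c"})
THRESHOLD = 2  # 2-of-3 approval


def approve_delegation(approvals: Iterable[str],
                       trustees: frozenset = TRUSTEES,
                       threshold: int = THRESHOLD) -> bool:
    """Return True when enough distinct, recognized trustees approved."""
    # Deduplicate approvals and ignore any from unknown parties.
    valid = set(approvals) & trustees
    return len(valid) >= threshold
```

Counting distinct trustees rather than raw approvals means one trustee approving twice does not meet the threshold.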
RAG Pipeline

Agentic AI Parenting Assistant

A modular, AI-driven parenting agent built with Google's Agent Development Kit (ADK) and FatSecret integration. It features specialized sub-agents for parenting advice, nutrition and meal planning, and basic medical guidance, with stateful sessions.

Python, Google ADK, FatSecret API, LangChain, Agentic AI
LLM Architecture

Cold Email Generator

A cold email generator for service companies built with Groq, LangChain, and Streamlit. Extracts job listings from career pages and crafts personalized emails with relevant portfolio links from a vector database.

Groq, LangChain, Streamlit, ChromaDB, Python
Transformer Block

SemEval-2024 Task 8a — AI Text Detection

Binary classification on 119K+ texts

Research project for SemEval-2024 on detecting machine-generated text. Experimented with DistilBERT, DeBERTa, RoBERTa, and ALBERT on 119K+ samples. The combination of a DeBERTa tokenizer and a RoBERTa model achieved the best results.

PyTorch, HuggingFace, DeBERTa, RoBERTa, NLP
MLOps Pipeline

Space Data Analytics Dashboard

Comprehensive analytics project covering 4,630+ space missions from 1957–2022. Features interactive Power BI dashboards analyzing launch trends, mission success rates, rocket usage, and global cooperation patterns.

Power BI, DAX, Python, Data Modeling
MLOps Pipeline

PHI/PII Parser — FHIR Data Redaction

Redacts 10+ FHIR resource types across PII/PHI fields

A service that reads HL7 FHIR Bundle JSON files from AWS S3, detects and redacts PII/PHI fields using key-name matching and regex patterns, and outputs cleaned CSVs. Supports both FastAPI local deployment and serverless AWS Lambda with automatic S3 triggers.

Python, FastAPI, AWS Lambda, S3, Docker, Pydantic
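
The redaction approach described for the PHI/PII Parser, key-name matching combined with regex patterns, might look roughly like the sketch below. It is illustrative only: the key list, the two patterns, the `[REDACTED]` marker, and the `redact` function name are assumptions, and the S3 reading and CSV output steps are omitted.

```python
import re
from typing import Any

# Hypothetical key names and patterns; the project's actual lists are not shown.
SENSITIVE_KEYS = {"name", "birthDate", "address", "telecom"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
REDACTED = "[REDACTED]"


def redact(node: Any) -> Any:
    """Return a redacted copy of a parsed FHIR JSON structure."""
    if isinstance(node, dict):
        # Key-name matching: replace the whole value under a sensitive key.
        return {
            key: REDACTED if key in SENSITIVE_KEYS else redact(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [redact(item) for item in node]
    if isinstance(node, str):
        # Regex pass: catch identifiers embedded in free text.
        text = SSN_PATTERN.sub(REDACTED, node)
        return EMAIL_PATTERN.sub(REDACTED, text)
    return node
```

Key-name matching handles structured fields, while the regex pass covers identifiers that appear inside free-text values such as notes.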
MLOps Pipeline

Data-Center Scale Computing

Projects focused on distributed computing at data-center scale, implementing scalable data processing pipelines and cloud-native architectures for large-scale ML workloads.

Python, PySpark, AWS, Distributed Systems

Experience

My career journey in ML and AI

Aug 2023 – Present · Work

Machine Learning Engineer

Blue Cross Blue Shield Association

Designed LLM fine-tuning and distillation pipelines using PyTorch and HuggingFace, compressing models from 175M to 45M parameters at 90% accuracy. Built distributed ML pipelines processing 10M+ monthly records. Achieved a 3x inference speedup via structured pruning and INT8 quantization. Developed agentic AI workflows with LangChain, reducing manual intervention by 25%. Implemented anomaly detection that flagged $2M+ in potential fraud.
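
For readers unfamiliar with the INT8 quantization mentioned in this role, the idea can be sketched in plain Python. This is a simplified illustration of symmetric per-tensor quantization, not the pipeline used at work; production code would normally rely on a framework's quantization tooling, and the function names here are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the int8 range."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to [-127, 127].
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]
```

Each weight is stored as a single signed byte plus one shared scale, and the round-trip error per weight is bounded by about half the scale.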

May 2023 – Aug 2023 · Work

Data Science Intern

Parlay (Techstars '23)

Engineered a PySpark data lake architecture on AWS S3 with Apache Hudi, achieving a 40% reduction in batch processing time for 500K+ daily records. Automated ETL pipelines using Python multiprocessing and SQLAlchemy. Designed REST APIs that reduced integration time for new data sources by 60%.

Mar 2021 – Aug 2022 · Work

Software Engineer / ML Engineer

Accenture

Rebuilt ML training and data ingestion pipelines using BigQuery, Spark, Hadoop, and Databricks, improving throughput by 50%. Deployed distributed ML pipelines processing 100+ TB of financial data. Developed predictive analytics models that improved forecast accuracy by 18%. Led an Oracle-to-PostgreSQL migration with zero downtime.

Jun 2020 – Mar 2021 · Work

Application Development Analyst

Accenture

Built enterprise data applications using SQL and Python for Fortune 500 clients in healthcare and financial services. Implemented automated testing frameworks reducing deployment errors by 15%.

Jul 2019 – Jun 2020 · Work

Associate Application Developer

Accenture

Developed backend systems and database solutions using SQL and Java, contributing to 3 major client deliverables.

2023 · Award

2023 DataScience Hackathon

Hackathon Award

Recognized for building innovative data science solutions. Also received Extra Mile Awards in both team and individual categories at Accenture.

Certifications

Professional credentials and continuous learning


AWS Certified Machine Learning Engineer - Associate

Amazon Web Services · Mar 2026 · Expires Mar 2029


AWS Certified Solutions Architect

Amazon Web Services


Applications of AI for Anomaly Detection

NVIDIA · Nov 2024


The Structured Query Language (SQL)

University of Colorado Boulder · May 2023

Data Analysis with R Programming

Google · May 2024

Share Data Through the Art of Visualization

Google · May 2024

Analyze Data to Answer Questions

Google · May 2024

Process Data from Dirty to Clean

Google · May 2024

Ask Questions to Make Data-Driven Decisions

Google


Prepare Data for Exploration

Google


Introduction to Programming and Tidyverse

University of Colorado Boulder · Aug 2021

R Programming and Tidyverse Capstone Project

University of Colorado Boulder · Sep 2022

Data Analysis with Tidyverse

University of Colorado Boulder · Aug 2022


The Ultimate MySQL Bootcamp

Udemy


GitHub Activity

Open-source contributions and stats

0 Total Stars
0 Total Forks

Top Languages

Python: 55%
Jupyter Notebook: 25%
HTML: 10%
SQL: 5%
Java: 5%


Get in Touch

Let's connect and discuss opportunities


What People Say

Santosh is great at taking a messy ML problem and turning it into something that actually works in production. He helped us rethink our model compression approach and was always willing to dig into the details when things did not go as expected. Solid engineer, easy to work with.

David Chen

ML Engineering Lead, Blue Cross Blue Shield Association

He came in as an intern and immediately started making an impact on our data pipelines. No hand-holding needed — he figured out the PySpark setup, proposed improvements, and shipped them. Would definitely work with him again.

Rachel Martinez

Data Engineering Manager, Parlay (Techstars '23)

What I appreciate about Santosh is that he does not just build models — he thinks about the whole system. He is good at communicating trade-offs to non-technical stakeholders and does not overcomplicate things. Reliable teammate.

Priya Sharma

Senior Product Manager, Blue Cross Blue Shield Association