Machine Learning Engineer

Neeraj Kumar Pola

Applied Data & ML Engineer building production-ready systems across data, machine learning, and analytics.

I’ve delivered end-to-end data and ML pipelines that cut manual processing time 12× (from ~60 to ~5 minutes per document), raised processing accuracy from ~95% to ~98%, and supported real-time, data-driven decision-making in production.

My work includes fine-tuning LLMs and speech models such as Wav2Vec2 for low-resource languages, building scalable data pipelines, and deploying ML systems that bridge research and real-world use cases.

Machine Learning · Data Science · Agentic AI · Data Analytics

Neeraj Kumar Pola
MS in AI
01 / EDUCATION

Background

University at Buffalo

Master's in Artificial Intelligence

GPA: 3.46/4 · Aug 2024 - Present

Deep Learning, Machine Learning, Data Intensive Computing, Reinforcement Learning, Algorithms, Artificial Intelligence, Computer Vision, Pattern Recognition

VBIT

B.Tech in CS (AI & ML)

GPA: 8.38/10 · Sept 2020 - Apr 2024

Data Structures, Cloud Architecture, Computer Vision, NLP, Algorithms, Machine Learning, Database Management Systems, Software Engineering

02 / EXPERIENCE

Work History

Machine Learning Engineer Intern

98% Accuracy · 12× Speedup

Media Sales Plus · Aug 2025 - Dec 2025

  • Architected and deployed a production-grade NLP system leveraging spaCy NER, rule-based supervision, regex pipelines, entity normalization, and linguistic feature engineering to convert unstructured text into high-fidelity structured outputs with full correction traceability.
  • Built end-to-end data pipelines with automated validation, error detection, profanity screening, and relationship extraction, improving overall processing accuracy from ~95% to ~98% while enabling scalable analytics and downstream data consumption.
  • Reduced manual processing time from ~60 minutes to ~5 minutes per document (a 12× productivity gain) by automating editorial workflows. Deployed the system as a Flask-based service on Azure App Service for reliable, scalable, low-latency access in production.

Artificial Intelligence Intern

WER 38.7 · 45+ Hours

CAIR, DRDO · Oct 2023 - Mar 2024

  • Fine-tuned Wav2Vec2 transformer ASR models on 45+ hours of Bengali and Assamese audio.
  • Achieved WER 38.7 and CER 10.2 on Bengali, with 12–18% WER improvement on Assamese.
  • Engineered a custom 512-unit neural layer for BPE-based transfer learning on resource-scarce datasets.
  • Collaborated with senior researchers to design and implement methodologies from research literature.

Machine Learning Intern

15% Improvement

Feynn Labs · Mar 2023 - May 2023

  • Built time-series forecasting models using Python, SQL, and Prophet, improving demand prediction accuracy by 15%.
  • Performed statistical analysis, feature engineering, and customer segmentation to support data-driven decisions.
  • Created interactive dashboards in Tableau and Power BI for KPI reporting to stakeholders.

College Club Activities

Machine Learning Associate

200+ Participants · 500+ Attendees

EpsilonPi (University ML Club) · Aug 2022 - Oct 2023

  • Organized and led Enigma, a large-scale machine learning competition with 200+ participants, designing problem statements and datasets across multiple difficulty levels.
  • Delivered a technical lecture on machine learning models and practical applications to an audience of 500+ students.
  • Actively mentored peers on machine learning fundamentals, model optimization, and real-world use cases.

Senior Software Developer

Full-Stack Projects · IEEE Competitions

coding.Studio() (University Tech Club) · Aug 2022 - Mar 2024

  • Played a key role in building and maintaining college-wide full-stack projects used by students across departments.
  • Mentored peers in Python programming, software architecture, and project structuring best practices.
  • Participated in multiple IEEE and inter-college coding competitions, collaborating in team-based problem-solving environments.
03 / PROJECTS

Applied Projects

ASR System

Wav2Vec2-for-customized-language

Fine-tuned Wav2Vec2 models for underrepresented languages with custom retrainable modules for Bengali and Assamese speech-to-text conversion.

WER 38.7 · 45+ Hours
PyTorch · Wav2Vec2 · Transformers · ASR

AI Mental Health Platform

MindMate

End-to-end RAG platform with TiDB vector DB, LangChain orchestration, and OpenAI APIs for contextual PHQ-9 and GAD-7 mental health assessments.

1000+ Sessions · Real-time
RAG · LangChain · FastAPI · Vector DB

Yoga Pose and Sequence Detection

Surya Namaskar with Real-time Feedback

Real-time yoga pose classification using MediaPipe for accurate posture analysis with instant feedback.

Published Research
Computer Vision · MediaPipe · Python

Demand Prediction Pipeline

Time Series Forecasting

ETL pipelines feeding Prophet and LSTM models for improved forecast accuracy on 10,000+ time series records.

15% ↑ Accuracy · 10K+ Records
Prophet · LSTM · ETL

Can we generate new music? That is what we tried to find out

Music-Generator-using-Genetic-Algorithm

Music generation using genetic algorithms to explore evolutionary approaches to creative composition.

Python · FastAPI · Genetic Algorithms

Reinforcement Learning Project

Comparative Analysis of Algorithms in Discrete and Continuous Action Spaces

Implemented and benchmarked RL algorithms (A3C, PPO, TD3, DDPG, DQN, DDQN) across CartPole, LunarLander, MountainCar, Pendulum, and Atari environments.

PyTorch · OpenAI Gym · NumPy · Matplotlib

Optimization System

Time Table Generator Using Linear Programming

Flexible timetable generator using linear programming to efficiently handle constraints like subjects, faculty, and class frequency.

Python · PuLP · Optimization
04 / BLOGS

Technical Writing & Insights

Sequence Modeling

Connectionist Temporal Classification (CTC)

A deep dive into the CTC algorithm, from the basics to its core use in sequence-based problems.

15-20 min read
CTC · Sequence Modeling · Deep Learning · Automatic Speech Recognition · Viterbi Algorithm · Expectations

RAGs and LangChain

Retrieval-Augmented Generation

A deep dive into how RAG systems work and why they are needed, with a code-along to get you started.

12-15 min read
RAG · LangChain · Vector Databases · Retrieval-Augmented Generation · AI
05 / SKILLS

Technical Stack

Languages

Python
SQL
Java
C
R
JavaScript

ML & AI

PyTorch
TensorFlow
Scikit-learn
RAG
LLMs
n8n

Cloud & Infra

AWS
Snowflake
Azure
GCP
Docker
Kubernetes

Data Tools

Spark
Airflow
FastAPI
Snowflake
MongoDB
Databricks

Core Areas of Expertise

Natural Language Processing
Multilingual ASR
RAG Systems
Time Series Forecasting
Model Deployment
Statistical Modeling

Certification

AWS Certified Cloud Practitioner

Additional Badges

Snowflake Hands-On Essentials: Data Warehouse
Snowflake Hands-On Essentials: Collaboration & Marketplace
Snowflake Hands-On Essentials: Data Applications
Snowflake Hands-On Essentials: Data Lake
Snowflake Hands-On Essentials: Data Engineering
Snowflake Hands-On Essentials: Data Science

Click any badge to view verification on Credly

06 / CONTACT

Get In Touch

I'm currently looking for new opportunities in ML engineering and data science. Feel free to reach out if you'd like to collaborate or just say hello.

Let's build something impactful together.

Whether it's a complex ML system, model fine-tuning, or an innovative AI application, I'm ready to contribute.

Send a Message