Overview

MLOps applies software engineering and DevOps discipline to machine learning. This track assumes you can already train a basic model and are comfortable with Python and git. You will learn to make experiments reproducible, package and serve models, automate the path from data to deployment, and keep models healthy in production. Each week ends with a build, so you have a running pipeline at every stage.

Month 1: Reproducibility and pipelines

Week 1: ML engineering foundations

The ML lifecycle: data, train, evaluate, deploy, monitor, retrain
Why notebooks do not survive production
Project structure, configuration, and environments
Reproducibility: seeds, pinned dependencies, deterministic data splits
Code quality for ML: typing, tests, linting
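
One way to get a deterministic data split is to derive it from a hash of each record's ID instead of a random shuffle. The sketch below is a minimal illustration using only the standard library; the function name `assign_split`, the `salt` value, and the `user-N` IDs are assumptions made for the example.

```python
import hashlib


def assign_split(record_id: str, test_fraction: float = 0.2, salt: str = "v1") -> str:
    """Assign a record to 'train' or 'test' from a hash of its ID.

    The same ID and salt always land in the same split, regardless of
    row order or how many other records exist.
    """
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


# Hypothetical record IDs, for illustration only.
ids = [f"user-{i}" for i in range(1000)]
splits = {record_id: assign_split(record_id) for record_id in ids}
test_share = sum(1 for s in splits.values() if s == "test") / len(ids)
```

Changing the salt produces a different but equally stable split, which is useful when you want a fresh holdout set without losing reproducibility.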

Tools and libraries

uv or conda: environments
ruff and mypy: quality
pydantic: config validation

Build

Convert a notebook model into a clean, configurable, tested Python project

Week 2: Data and model versioning

Data versioning and why git alone is not enough
Dataset splits, lineage, and snapshots
Model versioning and the model registry concept
Feature definitions and consistency between train and serve
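
The idea behind data versioning can be sketched without any tool: keep the large file out of git, and commit a small pointer that records its content hash. The example below is a simplified stand-in for what a data versioning tool does, not DVC's actual file format; `file_digest`, `write_pointer`, and the `.pointer.json` suffix are hypothetical names.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small JSON pointer file that could be committed to git."""
    pointer = data_path.with_name(data_path.name + ".pointer.json")
    record = {
        "path": data_path.name,
        "sha256": file_digest(data_path),
        "size": data_path.stat().st_size,
    }
    pointer.write_text(json.dumps(record, indent=2))
    return pointer


# Demo: two versions of the same file yield two different pointers.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "train.csv"
    data.write_bytes(b"id,label\n1,0\n2,1\n")
    pointer_v1 = json.loads(write_pointer(data).read_text())
    data.write_bytes(b"id,label\n1,0\n2,1\n3,1\n")
    pointer_v2 = json.loads(write_pointer(data).read_text())
```

Because the pointer changes whenever the data changes, a git commit that includes the pointer pins the exact dataset used for a run.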

Tools and libraries

DVC: data and pipeline versioning
Git LFS for large files when needed

Build

Version a dataset and a trained model so any run is reproducible from a commit

Week 3: Experiment tracking

Tracking parameters, metrics, and artifacts
Comparing runs and choosing the best model
The model registry: staging, production, archived
Reproducible training runs
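
Experiment tracking boils down to recording parameters and metrics per run, then querying them. The sketch below uses plain JSON files to show the concept; it is a stand-in for a tracking tool, not the MLflow or Weights and Biases API, and `log_run` and `best_run` are hypothetical helpers.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


def log_run(root: Path, params: dict, metrics: dict) -> str:
    """Store one run's parameters and metrics under its own directory."""
    run_id = uuid.uuid4().hex[:8]
    run_dir = root / run_id
    run_dir.mkdir(parents=True)
    record = {
        "run_id": run_id,
        "params": params,
        "metrics": metrics,
        "logged_at": time.time(),
    }
    (run_dir / "run.json").write_text(json.dumps(record))
    return run_id


def best_run(root: Path, metric: str) -> dict:
    """Return the logged run with the highest value for the given metric."""
    runs = [json.loads(path.read_text()) for path in root.glob("*/run.json")]
    return max(runs, key=lambda run: run["metrics"][metric])


# Demo with made-up numbers: two runs, then pick the better one.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    log_run(root, {"lr": 0.1}, {"accuracy": 0.81})
    log_run(root, {"lr": 0.01}, {"accuracy": 0.86})
    winner = best_run(root, "accuracy")
```

A real tracker adds artifact storage, a UI, and a registry on top, but the run record itself looks much like this.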

Tools and libraries

MLflow: tracking and registry
Weights and Biases: tracking and dashboards

Build

Track every training run, log metrics and artifacts, and register the best model

Week 4: Pipelines and orchestration

Turning steps into a pipeline: ingest, validate, train, evaluate, register
Scheduling and dependencies between steps
Idempotency, retries, and backfills
Data validation gates
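
Idempotency and validation gates can be illustrated in a few lines. In this sketch, a pipeline step is keyed by partition, so a rerun or backfill of a completed partition returns the stored result, and invalid data stops the step before training. The column names, the `completed` dictionary standing in for persistent state, and the `train` stub are all assumptions.

```python
from typing import Callable


class ValidationFailed(Exception):
    """Raised when a batch does not pass the data validation gate."""


def validate(rows: list) -> None:
    """Check required fields and value ranges; raise on any violation."""
    errors = []
    for index, row in enumerate(rows):
        if row.get("user_id") is None:
            errors.append(f"row {index}: missing user_id")
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            errors.append(f"row {index}: bad amount {amount!r}")
    if errors:
        raise ValidationFailed("; ".join(errors))


def run_partition(partition: str, rows: list, completed: dict, train: Callable) -> dict:
    """Run validate then train once per partition; reruns are no-ops."""
    if partition in completed:
        return completed[partition]
    validate(rows)  # the gate: nothing downstream runs on bad data
    completed[partition] = train(rows)
    return completed[partition]


calls = []


def train(rows: list) -> dict:
    """Stub training step that records how often it was invoked."""
    calls.append(len(rows))
    return {"rows": len(rows)}


completed = {}
good = [{"user_id": 1, "amount": 9.5}, {"user_id": 2, "amount": 0}]
first = run_partition("2024-01-01", good, completed, train)
second = run_partition("2024-01-01", good, completed, train)  # does not retrain
```

In an orchestrator the `completed` state would live in durable storage, so retries after a crash see the same record.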

Tools and libraries

Apache Airflow or Prefect: orchestration
Great Expectations or pandera: data validation

Build

An orchestrated pipeline that ingests data, validates it, trains, evaluates, and registers a model

Month 2: Serving and automation

Week 5: Model packaging and serving

Packaging a model behind an API
Synchronous serving versus batch scoring
Input validation and output contracts
Containerizing the service
Latency, throughput, and batching
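
Input validation and output contracts can be expressed as typed request and response objects that reject bad data at the boundary. The sketch below uses standard library dataclasses so it runs anywhere; in a FastAPI service, pydantic models would usually play this role. The fields, the scoring formula, and the version string are placeholders, not a real model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictRequest:
    """Input contract: validated once, at construction time."""
    age: int
    income: float

    def __post_init__(self) -> None:
        if not 0 <= self.age <= 120:
            raise ValueError("age must be between 0 and 120")
        if self.income < 0:
            raise ValueError("income must be non-negative")


@dataclass(frozen=True)
class PredictResponse:
    """Output contract: every response carries the model version."""
    score: float
    model_version: str


def predict(request: PredictRequest) -> PredictResponse:
    """Placeholder scoring function standing in for a trained model."""
    raw = 0.01 * request.age + 0.00001 * request.income
    score = max(0.0, min(1.0, raw))  # keep the score inside [0, 1]
    return PredictResponse(score=score, model_version="demo-0")


response = predict(PredictRequest(age=30, income=50000.0))
```

Returning the model version with every prediction makes it possible to trace a bad output back to the exact model that produced it.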

Tools and libraries

FastAPI: serving API
BentoML or a custom container: packaging
Docker: containerize

Build

Serve your registered model behind a FastAPI endpoint in a container

Week 6: CI and CD for ML

Continuous integration for ML code and data tests
Continuous training: retrain when data or code changes
Continuous delivery of the model service
Automated evaluation gates before promotion
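
An automated evaluation gate is a function the CI pipeline calls before promotion. A minimal sketch follows, assuming a single higher-is-better metric; the `threshold` and `min_gain` defaults are illustrative values, not recommendations.

```python
from typing import Optional, Tuple


def should_promote(
    candidate: float,
    production: Optional[float],
    threshold: float = 0.80,
    min_gain: float = 0.0,
) -> Tuple[bool, str]:
    """Decide whether a candidate model may replace the production model.

    candidate and production are scores on the same held-out set;
    production is None when no model has been deployed yet.
    """
    if candidate < threshold:
        return False, f"candidate {candidate:.3f} is below threshold {threshold:.3f}"
    if production is not None and candidate - production < min_gain:
        return False, f"candidate {candidate:.3f} does not beat production {production:.3f}"
    return True, "candidate passed all gates"


decision, reason = should_promote(candidate=0.85, production=0.90)
```

In a CI job, a False decision would fail the step so that the deploy stage never runs.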

Tools and libraries

GitHub Actions: pipelines
Model registry: promotion gate

Build

A pipeline that tests, trains, evaluates against a threshold, and only then promotes and deploys the model

Week 7: Scalable serving and inference

Autoscaling inference services
GPU serving and batching for deep models
Caching and request coalescing
Canary and shadow deployments for models
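
Canary routing is often implemented by hashing a stable request or user ID into a bucket, so the same caller consistently hits the same model version. The sketch below shows that idea in isolation; on Kubernetes the traffic split would normally be configured in the serving layer, and the `route` function here is a hypothetical illustration.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of IDs to the canary, the rest to stable."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99 for this ID
    return "canary" if bucket < canary_percent else "stable"


# Hypothetical request IDs, for illustration only.
request_ids = [f"req-{i}" for i in range(10000)]
canary_share = sum(route(r, 10) == "canary" for r in request_ids) / len(request_ids)
```

Raising `canary_percent` step by step only moves new buckets to the canary, so callers already on the canary stay there during the rollout.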

Tools and libraries

KServe or Seldon: model serving on Kubernetes
vLLM or Triton: high-throughput inference for large models

Build

Deploy the model service with autoscaling and a canary rollout

Week 8: Feature stores and data freshness

The train-serve skew problem
Online and offline feature stores
Point-in-time correctness
Feature freshness and backfills
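
Point-in-time correctness means that a training example may only see feature values that were known at the time of the event it describes. A minimal sketch with a sorted feature history follows; the tuple layout and the function name are assumptions, and a feature store performs the same lookup at scale.

```python
from bisect import bisect_right
from typing import Any, List, Optional, Tuple


def point_in_time_lookup(history: List[Tuple[int, Any]], event_time: int) -> Optional[Any]:
    """Return the latest feature value with timestamp <= event_time.

    history is a list of (timestamp, value) pairs sorted by timestamp.
    Returns None if no value was known yet at event_time.
    """
    timestamps = [timestamp for timestamp, _ in history]
    index = bisect_right(timestamps, event_time)
    return history[index - 1][1] if index else None


# Made-up history: the feature changed at t=1, t=5, and t=9.
purchases_7d = [(1, 10), (5, 20), (9, 30)]
value_at_4 = point_in_time_lookup(purchases_7d, 4)
```

Joining on the latest value regardless of time would leak future information into training, which is one common source of train-serve skew.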

Tools and libraries

Feast: open-source feature store

Build

Move features into a feature store and serve them consistently online and offline

Month 3: Monitoring and LLMOps

Week 9: Monitoring and drift

Service monitoring: latency, errors, throughput
Data drift and concept drift
Prediction monitoring and ground truth delay
Alerting and automated retraining triggers
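
One common data drift score is the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a reference sample. The sketch below is a simple equal-width implementation using only the standard library; the 0.2 alert threshold is a widely quoted rule of thumb, treated here as an assumption rather than a fixed standard, and the data is synthetic.

```python
import math
import random
from typing import List


def psi(expected: List[float], actual: List[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a current sample."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # avoid zero width for constant data

    def fractions(values: List[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp outliers to edge bins
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(count / len(values), eps) for count in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data: the current sample has its mean shifted by 1.5.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(2000)]
needs_retrain = psi(reference, shifted) > 0.2
```

A monitoring job could compute this per feature on a schedule and raise an alert or start the training pipeline when the score crosses the chosen threshold.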

Tools and libraries

Evidently: drift and quality monitoring
Prometheus and Grafana: service metrics

Build

Add drift detection and service monitoring, and trigger a retrain when drift crosses a threshold

Week 10: LLMOps

Serving and evaluating large language models
Prompt versioning and management
Tracing LLM calls and cost tracking
Evaluation harnesses and regression tests for prompts
Guardrails and safety
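
Prompt versioning and regression testing can be combined in a small harness: prompts live under versioned IDs, and each version is run against fixed cases with checks. The sketch below uses a stub in place of a real LLM call so it runs offline; `stub_model`, the prompt ID, the cases, and the per-character cost rate are all hypothetical.

```python
from typing import Callable, Dict, List

# Versioned prompt templates; a new version gets a new ID instead of an edit.
PROMPT_VERSIONS: Dict[str, str] = {
    "shout@v1": "Repeat the text in capitals.\n{text}",
}

# Regression cases: an input plus a check on the model output.
CASES: List[dict] = [
    {"text": "hello", "check": lambda output: output == "HELLO"},
    {"text": "mlops", "check": lambda output: "MLOPS" in output},
]


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call: upper-cases the last line of the prompt."""
    return prompt.splitlines()[-1].upper()


def evaluate(prompt_id: str, model: Callable[[str], str], cases: List[dict],
             cost_per_1k_chars: float = 0.002) -> dict:
    """Run every case against one prompt version and summarize the results."""
    template = PROMPT_VERSIONS[prompt_id]
    passed = 0
    chars = 0
    for case in cases:
        prompt = template.format(text=case["text"])
        output = model(prompt)
        chars += len(prompt) + len(output)  # crude proxy for token usage
        passed += bool(case["check"](output))
    return {
        "prompt_id": prompt_id,
        "pass_rate": passed / len(cases),
        "estimated_cost": chars / 1000 * cost_per_1k_chars,
    }


report = evaluate("shout@v1", stub_model, CASES)
```

Running the same cases against each new prompt version in CI turns prompt changes into reviewable, testable changes like any other code.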

Tools and libraries

Langfuse: LLM observability
promptfoo or deepeval: LLM evaluation

Build

Add tracing, cost tracking, and an evaluation harness to an LLM feature

Week 11: Governance and reliability

Model cards and documentation
Reproducibility audits and lineage
Access control for models and data
Rollback and incident response for models
Responsible AI: bias checks, PII handling
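
A rollback procedure is easier to rehearse when "what is in production" is a single pointer with history. The sketch below models that pointer in memory; the class name and the version strings are hypothetical, and a real registry would persist this state and record who made each change.

```python
from typing import List, Optional


class ProductionPointer:
    """Tracks which model version is live and allows stepping back."""

    def __init__(self) -> None:
        self._history: List[str] = []  # promoted versions, newest last

    @property
    def current(self) -> Optional[str]:
        """The live version, or None if nothing has been promoted."""
        return self._history[-1] if self._history else None

    def promote(self, version: str) -> None:
        """Make a new version live while remembering the previous one."""
        self._history.append(version)

    def rollback(self) -> str:
        """Retire the live version and restore the one before it."""
        if len(self._history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self._history.pop()
        return self._history[-1]


pointer = ProductionPointer()
pointer.promote("model:1")
pointer.promote("model:2")
restored = pointer.rollback()
```

Because the serving layer reads only the pointer, rolling back does not require rebuilding or retraining anything.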

Tools and libraries

Model registry: lineage tracking
Presidio: PII detection

Build

Add a model card, lineage, and a rollback procedure to your service

Week 12: Capstone

Build

An end-to-end MLOps system: versioned data and models, tracked experiments, an orchestrated training pipeline with validation gates, automated CI and CD, a scalable served model, a feature store, drift monitoring with automated retraining, and full documentation

Resource master reference

Books

Designing Machine Learning Systems by Chip Huyen

Machine Learning Engineering by Andriy Burkov

Reliable Machine Learning by Cathy Chen and others

Repositories

Awesome MLOps curated list

Made With ML by Goku Mohandas

Tools master list

uv, DVC, MLflow, Weights and Biases, Airflow, Prefect, Great Expectations, FastAPI, BentoML, Docker, KServe, Seldon, vLLM, Triton, Feast, Evidently, Langfuse, Prometheus, Grafana

Interview focus

Walk through taking a model from notebook to production

How do you detect and respond to data drift?

Explain train-serve skew and how a feature store addresses it

Design a continuous training pipeline with safe promotion

How do you monitor and evaluate an LLM feature in production?

Trade-offs of batch scoring versus real-time serving