MLOps and Production ML
Take machine learning from a notebook to a reliable, monitored, automated production system. Duration: twelve weeks. Target outcome: train, version, deploy, serve, monitor, and retrain models with reproducible pipelines.
Overview
MLOps applies software engineering and DevOps discipline to machine learning. This track assumes you can train a basic model and use Python and Git. You will learn to make experiments reproducible, package and serve models, automate the path from data to deployment, and keep models healthy in production. At each stage, you build a running pipeline.
Month 1: Reproducibility and pipelines
Week 1: ML engineering foundations
- uv or conda: environments
- ruff and mypy: quality
- pydantic: config validation
- Convert a notebook model into a clean, configurable, tested Python project
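To make the config-validation step concrete, here is a minimal sketch that uses only the standard library. It is a stand-in for what pydantic would do with less hand-written code, not pydantic's API. The field names (`learning_rate`, `epochs`, `data_path`) and the accepted ranges are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training config; the fields are illustrative only."""

    learning_rate: float
    epochs: int
    data_path: str

    def __post_init__(self) -> None:
        # Fail fast on bad values instead of discovering them mid-training.
        if not 0.0 < self.learning_rate < 1.0:
            raise ValueError(f"learning_rate must be in (0, 1), got {self.learning_rate}")
        if self.epochs < 1:
            raise ValueError(f"epochs must be >= 1, got {self.epochs}")
        if not self.data_path:
            raise ValueError("data_path must not be empty")


def load_config(raw: dict) -> TrainConfig:
    """Build a validated config from a plain dict, e.g. parsed YAML or JSON."""
    return TrainConfig(
        learning_rate=float(raw["learning_rate"]),
        epochs=int(raw["epochs"]),
        data_path=str(raw["data_path"]),
    )
```

Calling `load_config` at the top of the training entry point means a typo in a config file stops the run before any compute is spent.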
Week 2: Data and model versioning
- DVC: data and pipeline versioning
- Git LFS for large files when needed
- Version a dataset and a trained model so any run is reproducible from a commit
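The idea behind versioning data alongside code can be sketched as follows: keep a small pointer file with a content hash in git, and keep the large file elsewhere. This illustrates the concept only. It is not DVC's file format or API, and the `.pointer.json` suffix and function names are assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small JSON pointer next to the data file; commit this, not the data."""
    pointer = data_path.with_name(data_path.name + ".pointer.json")
    record = {"path": data_path.name, "sha256": file_digest(data_path)}
    pointer.write_text(json.dumps(record, indent=2))
    return pointer


def is_unchanged(data_path: Path, pointer: Path) -> bool:
    """Check that the data on disk still matches the committed pointer."""
    recorded = json.loads(pointer.read_text())
    return recorded["sha256"] == file_digest(data_path)
```

Because the pointer changes whenever the data changes, a git commit pins both the code and the exact dataset it was run against.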
Week 3: Experiment tracking
- MLflow: tracking and registry
- Weights and Biases: tracking and dashboards
- Track every training run, log metrics and artifacts, and register the best model
Week 4: Pipelines and orchestration
- Apache Airflow or Prefect: orchestration
- Great Expectations or pandera: data validation
- An orchestrated pipeline that ingests data, validates it, trains, evaluates, and registers a model
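The validation gate in this week's project can be sketched in plain Python, assuming rows arrive as dictionaries. An orchestrator such as Airflow or Prefect would run each step as a task, and Great Expectations or pandera would replace the hand-written check. The function names and the `required` column set are hypothetical.

```python
from typing import Callable

Row = dict[str, float]


def validate(rows: list[Row], required: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the batch passed."""
    problems = []
    if not rows:
        problems.append("batch is empty")
    for i, row in enumerate(rows):
        missing = required - set(row)
        if missing:
            problems.append(f"row {i}: missing {sorted(missing)}")
    return problems


def run_pipeline(
    ingest: Callable[[], list[Row]],
    train: Callable[[list[Row]], object],
    required: set[str],
) -> object:
    """Ingest, validate, then train; bad data stops the run before training."""
    rows = ingest()
    problems = validate(rows, required)
    if problems:
        raise ValueError("validation failed: " + "; ".join(problems))
    return train(rows)
```

Raising before `train` is called is the point of the gate: a failed check should fail the run loudly rather than produce a model trained on bad data.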
Month 2: Serving and automation
Week 5: Model packaging and serving
- FastAPI: serving API
- BentoML or a custom container: packaging
- Docker: containerize
- Serve your registered model behind a FastAPI endpoint in a container
Week 6: CI and CD for ML
- GitHub Actions: pipelines
- the model registry as the promotion gate
- A pipeline that tests, trains, evaluates against a threshold, and only then promotes and deploys the model
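One way to express the "evaluate against a threshold, and only then promote" rule is a small pure function that the CI job calls after evaluation. This is a sketch under stated assumptions: the `EvalResult` type, the choice of accuracy as the metric, and the default `min_accuracy` of 0.90 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical evaluation summary for one model version."""

    version: str
    accuracy: float


def should_promote(
    candidate: EvalResult,
    production: Optional[EvalResult],
    min_accuracy: float = 0.90,
    min_gain: float = 0.0,
) -> bool:
    """Promote only if the candidate clears an absolute floor and is not worse than production."""
    if candidate.accuracy < min_accuracy:
        return False
    if production is None:
        # Nothing is deployed yet, so the absolute floor is the only gate.
        return True
    return candidate.accuracy >= production.accuracy + min_gain
```

Keeping the rule in one testable function means the promotion policy is reviewed and versioned like any other code.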
Week 7: Scalable serving and inference
- KServe or Seldon: model serving on Kubernetes
- vLLM or Triton: high-throughput inference for large models
- Deploy the model service with autoscaling and a canary rollout
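A canary rollout sends a small, fixed share of traffic to the new model version. On Kubernetes the serving layer normally does this split for you; the sketch below only illustrates the routing idea, and the `route` function and its labels are hypothetical.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Return 'canary' for roughly canary_percent of request IDs, else 'stable'."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the ID so the same caller always lands on the same version.
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_percent else "stable"
```

Hashing rather than random sampling keeps routing sticky per request ID, which makes it easier to compare the two versions on the same callers.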
Week 8: Feature stores and data freshness
- Feast: open source feature store
- Move features into a feature store and serve them consistently online and offline
Month 3: Monitoring and LLMOps
Week 9: Monitoring and drift
- Evidently: drift and quality monitoring
- Prometheus and Grafana: service metrics
- Add drift detection and service monitoring, and trigger a retrain when drift crosses a threshold
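As one possible drift score for the retrain trigger, the sketch below computes the population stability index (PSI) for a single numeric feature. It is a stdlib stand-in, not Evidently's API. The bin edges are supplied by the caller, and the 0.2 threshold is a common rule of thumb assumed here rather than a value from this track.

```python
import math
from collections import Counter


def bin_shares(values: list[float], edges: list[float], eps: float = 1e-6) -> list[float]:
    """Fraction of values in each bin; eps keeps empty bins out of log(0)."""
    if not values:
        raise ValueError("need at least one value")
    # A value's bin index is the number of edges it is greater than or equal to.
    counts = Counter(sum(v >= e for e in edges) for v in values)
    return [max(counts[b] / len(values), eps) for b in range(len(edges) + 1)]


def psi(reference: list[float], current: list[float], edges: list[float]) -> float:
    """Population stability index between a reference sample and a current sample."""
    ref = bin_shares(reference, edges)
    cur = bin_shares(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def should_retrain(score: float, threshold: float = 0.2) -> bool:
    """threshold=0.2 is an assumed rule of thumb; tune it per feature."""
    return score > threshold
```

In a monitoring job, `reference` would be the training data for a feature and `current` a recent window of production inputs; a `True` from `should_retrain` would kick off the training pipeline.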
Week 10: LLMOps
- Langfuse: LLM observability
- promptfoo or deepeval: LLM evaluation
- Add tracing, cost tracking, and an evaluation harness to an LLM feature
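Cost tracking for an LLM feature comes down to recording token counts per call and multiplying by a price. The sketch below shows that bookkeeping with the standard library; an observability tool such as Langfuse would store these records for you. The `CostTracker` class and the per-1,000-token prices are hypothetical, so substitute your provider's actual rates.

```python
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Accumulates per-call cost; prices are in USD per 1,000 tokens (assumed unit)."""

    input_price_per_1k: float
    output_price_per_1k: float
    calls: list = field(default_factory=list)

    def record(self, trace_id: str, input_tokens: int, output_tokens: int) -> float:
        """Store one call and return its cost."""
        cost = (
            input_tokens / 1000 * self.input_price_per_1k
            + output_tokens / 1000 * self.output_price_per_1k
        )
        self.calls.append({"trace_id": trace_id, "cost": cost})
        return cost

    def total(self) -> float:
        """Total cost across all recorded calls."""
        return sum(call["cost"] for call in self.calls)
```

Keying each record by a trace ID lets you join cost to the trace of the request that incurred it.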
Week 11: Governance and reliability
- a registry with lineage
- Presidio: PII detection
- Add a model card, lineage, and a rollback procedure to your service
Week 12: Capstone
- An end-to-end MLOps system: versioned data and models, tracked experiments, an orchestrated training pipeline with validation gates, automated CI and CD, a scalable served model, a feature store, drift monitoring with automated retraining, and full documentation
Resource master reference
Books
Designing Machine Learning Systems by Chip Huyen
Machine Learning Engineering by Andriy Burkov
Reliable Machine Learning by Cathy Chen and others
Repositories
Awesome MLOps curated list
Made With ML by Goku Mohandas
Tools master list
uv, DVC, MLflow, Weights and Biases, Airflow, Prefect, Great Expectations, FastAPI, BentoML, Docker, KServe, Seldon, vLLM, Triton, Feast, Evidently, Langfuse, Prometheus, Grafana
Interview focus
Walk through taking a model from notebook to production
How do you detect and respond to data drift?
Explain training-serving skew and how a feature store fixes it
Design a continuous training pipeline with safe promotion
How do you monitor and evaluate an LLM feature in production?
Trade-offs of batch scoring versus real-time serving