Lesson 1 · Technical Roles

The Machine Learning Engineer

Building the systems that train, deploy, and sustain AI models at scale

What does it actually take to turn a research model into something that works in the real world?

When Google researchers published Attention Is All You Need in June 2017, they had a working prototype of the Transformer architecture — but a prototype is not a product. The gap between that research code and a system capable of processing billions of translation requests per day was bridged by ML engineers: people who understood the mathematics well enough to implement it, and systems well enough to deploy it. That gap is where the ML engineer lives.

What an ML Engineer Actually Does

Machine Learning Engineers sit at the intersection of software engineering and data science. They are not primarily researchers who invent new algorithms, nor are they pure software engineers building CRUD applications. Their core job is to design, build, and maintain systems that learn from data — and to make those systems reliable, fast, and scalable in production.

In practice the role involves five recurring activities: data pipeline construction (ingesting and cleaning training data at scale), model training infrastructure (writing distributed training code, managing GPU clusters), experiment tracking (logging runs, comparing hyperparameter sweeps), model serving (deploying trained weights as low-latency APIs), and monitoring (detecting model drift and performance degradation over time).
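Of the five activities, experiment tracking is the easiest to picture in a few lines. The sketch below assumes a hypothetical in-memory `RunTracker` class and invented hyperparameters and accuracy values. It is not the API of MLflow or Weights & Biases, which persist this information and add far more.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One training run: the hyperparameters used and the metrics logged."""
    run_id: int
    params: dict
    metrics: dict = field(default_factory=dict)


class RunTracker:
    """Hypothetical in-memory tracker; not the API of any real tool."""

    def __init__(self):
        self.runs = []

    def start_run(self, **params):
        run = Run(run_id=len(self.runs) + 1, params=params)
        self.runs.append(run)
        return run

    def best_run(self, metric, higher_is_better=True):
        # Only compare runs that actually logged the metric.
        candidates = [r for r in self.runs if metric in r.metrics]
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])


tracker = RunTracker()
# Illustrative sweep over learning rates; the accuracies are invented numbers.
for lr, acc in [(0.1, 0.81), (0.01, 0.88), (0.001, 0.84)]:
    run = tracker.start_run(learning_rate=lr, batch_size=32)
    run.metrics["val_accuracy"] = acc

best = tracker.best_run("val_accuracy")
print(best.run_id, best.params["learning_rate"])
```

The point of the design is that every run records its parameters next to its results, so comparing a sweep becomes a query rather than a search through notebooks.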

Real Case — Meta's PyTorch Team, 2019

When Meta (then Facebook) released PyTorch 1.0 in late 2018 and pushed the framework toward production use, the engineering challenge was enormous. PyTorch had already been open source since early 2017; what the 1.0 release added was production tooling. ML engineers on the team built TorchScript, a way to serialize PyTorch models and run them without a Python runtime, specifically because research models written in Python could not be served efficiently from C++ production servers. That tooling shaped how much of the industry deploys PyTorch models.

Core Technical Skills

Python (NumPy, pandas)

PyTorch / TensorFlow

Distributed training (DDP)

Docker & Kubernetes

SQL & data pipelines

REST / gRPC APIs

MLflow / Weights & Biases

Cloud platforms (AWS/GCP)

Linear algebra & calculus

Statistics & probability

The Production Gap

Research papers report accuracy on benchmark datasets. Production systems must also satisfy latency requirements (a recommendation model that takes 500 ms per request is too slow for an interactive page), cost constraints (inference at a billion requests per day is expensive), and reliability targets (at that volume, a model that returns errors 0.1% of the time fails a million requests every day). ML engineers manage these trade-offs with techniques such as model quantization, knowledge distillation, caching, and batching.
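To make one of those techniques concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. The function names and example weights are illustrative assumptions. Production frameworks quantize per tensor or per channel and calibrate on real data, so treat this only as the core idea.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the stored integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.05, 0.0, 0.64]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is now stored as one small integer plus a shared scale, which is where the memory saving over 32-bit floats comes from. The cost is the bounded rounding error measured above.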

When OpenAI deployed GPT-3 as an API in 2020, the engineering challenge was not the model itself — it was serving a 175-billion-parameter model at acceptable cost and latency. The team had to optimize inference across custom Microsoft Azure hardware specifically provisioned for that purpose. That optimization work is quintessential ML engineering.

Salary Context (2024 US Market)

ML Engineer roles at major tech companies (Google, Meta, OpenAI, Anthropic) typically range from $180,000–$350,000 total compensation for mid-level positions, with senior roles exceeding $450,000. Startups often pay less base but offer meaningful equity. The role commands a premium because it requires both mathematical depth and systems engineering breadth — a rare combination.

Key Terms

Model Drift: Degradation in model performance over time as the real-world data distribution shifts away from training data.

Inference: Using a trained model to make predictions on new inputs, as opposed to training, which adjusts model weights.

Quantization: Reducing model precision (e.g., from 32-bit to 8-bit weights) to decrease memory and compute requirements with minimal accuracy loss.

Feature Engineering: Transforming raw data into informative representations that models can learn from effectively.

Lesson 1 Quiz — The ML Engineer

Three questions · Select the best answer for each

1. What is the primary distinction between an ML engineer and an AI researcher?

Correct. The ML engineer's role is production-focused: taking models that may originate in research and making them work reliably, efficiently, and at scale in real systems.

Not quite. The key distinction is research-vs-production orientation, not language choice or work setting.

2. When OpenAI deployed GPT-3 in 2020, what was the central ML engineering challenge?

Correct. The model existed; the engineering challenge was inference at scale — serving billions of parameters efficiently enough to offer it as an API product.

The training was already complete. The deployment challenge — cost and latency at scale — was the core ML engineering problem.

3. What does "model drift" refer to?

Correct. Drift occurs when the world changes and the model's learned patterns no longer match current inputs — a key monitoring concern for ML engineers.

Model drift refers to performance degradation over time as the real-world data distribution changes relative to what the model was trained on.

Lab 1 — ML Engineer Role Exploration

Interactive AI conversation · Complete 3 exchanges to finish the lab

Your Task

You are preparing for an informational interview with an ML Engineer at a mid-size AI startup. Use this AI assistant to practice asking sharp, informed questions — and to deepen your understanding of what the role actually involves day-to-day.

Suggested openers: Ask about the difference between a junior and senior ML engineer's responsibilities. Ask what tools matter most on day one. Ask how much time is spent on research vs. systems work.

ML Engineer Role Advisor

Hello! I'm here to help you understand the ML Engineer role from the inside. Whether you're exploring it as a career path or preparing for interviews, ask me anything — day-to-day work, required skills, how the role differs at big tech vs. startups, or how to break in. What would you like to know?

Lesson 2 · Technical Roles

The Data Scientist

Extracting insight and driving decisions from the data that powers AI

When a company says "data science," what are they really hiring for — and how has that role shifted as AI matures?

In October 2006, Netflix announced a $1 million competition: improve the accuracy of its recommendation algorithm by 10% and win the prize. Over three years, more than 40,000 teams from 186 countries registered. The winning team, BellKor's Pragmatic Chaos, crossed the 10% threshold in mid-2009 and was awarded the prize on September 21, 2009. The methods they used: collaborative filtering, matrix factorization, ensemble learning. The people who built those solutions were doing what we now call data science, though the title barely existed yet. Netflix's own data scientists drew on insights from the competition as they built what became one of the most influential recommendation systems in history.

What a Data Scientist Does

Data science is fundamentally about extracting actionable knowledge from data. That involves statistical analysis, predictive modeling, data visualization, and communicating findings to non-technical stakeholders. The canonical data science workflow runs: frame the business question → gather and clean data → explore patterns → build models → evaluate rigorously → communicate results → monitor outcomes.

A critical and often underappreciated skill in that workflow is communication. A model that improves conversion by 4% is worthless if the data scientist cannot explain it clearly enough for a product team to act on it. The best data scientists are translators between the mathematical and the organizational.

Real Case — Airbnb's Data Science Team, 2013–2018

Airbnb's early data science team, led by figures like Elena Grewal (later Head of Data Science), built the analytical infrastructure that powered their expansion from a startup to a global platform. One documented project: analyzing host and guest behavior to understand what made listings successful. Their findings — that professional-quality photos dramatically increased bookings — led to Airbnb creating a photography program. Data science directly shaped a major product decision. Grewal later described data science at Airbnb as "part statistician, part engineer, part strategist."

The Data Scientist's Toolkit

Python / R

Statistical modeling

SQL (advanced)

scikit-learn

A/B testing

Tableau / Looker

Causal inference

Jupyter notebooks

Spark (big data)

Storytelling with data

How the Role Has Evolved

In 2012, Harvard Business Review called data scientist "the sexiest job of the 21st century." By 2024 the landscape had changed considerably. Large language models can now produce rudimentary analyses from natural language prompts. Many routine data science tasks — simple regression, basic dashboards, exploratory analysis — have been partially automated. This has not eliminated the role but has shifted it upmarket.

Today's data scientists are increasingly expected to work on causal inference (understanding why, not just what), experimentation design (rigorous A/B tests at scale), and strategic data storytelling (influencing executive decisions). The parts of the job that survive automation are the parts that require judgment, domain expertise, and communication — all deeply human skills.
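The arithmetic behind a basic A/B test readout can be sketched with a two-proportion z-test. The function name and the conversion counts below are invented for illustration. Real experimentation platforms also handle sequential peeking, multiple metrics, and variance reduction, none of which is shown here.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts: 10,000 users per arm, 500 vs. 560 conversions.
z, p_value = two_proportion_z_test(500, 10_000, 560, 10_000)
print(round(z, 2), round(p_value, 3))
```

With these made-up counts, a 12% relative lift still falls short of the conventional 0.05 significance threshold. That is the kind of result a data scientist has to interpret and explain rather than simply report.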

Data Scientist vs. ML Engineer

The two roles overlap significantly and job titles are used inconsistently across companies. A rough heuristic: data scientists ask "what does the data tell us?" and build models to answer business questions; ML engineers ask "how do we build a system that uses this model reliably?" and focus on infrastructure. At smaller companies one person often does both. At Google or Meta the roles are sharply separated with dedicated teams for each.

Key Terms

A/B Test: A controlled experiment comparing two versions of a system or product to determine which performs better on a defined metric.

Causal Inference: Statistical methods for determining cause-and-effect relationships, not just correlations, from observational data.

Feature: An individual measurable property of the data used as input to a machine learning model.

Overfitting: When a model learns the training data too specifically and fails to generalize to new, unseen examples.
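Overfitting is easiest to see with a model that memorizes. The sketch below is a toy setup with assumed names and synthetic data: a one-nearest-neighbor classifier is compared against a simple threshold rule on labels that carry 20% random noise.

```python
import random


def make_data(rng, n, noise=0.2):
    """Points on [0, 1]; the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data


def predict_1nn(train, x):
    """Copy the label of the single closest training point (pure memorization)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def predict_threshold(x):
    """A deliberately simple rule that ignores the noise in the training set."""
    return x > 0.5


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)
train, test = make_data(rng, 200), make_data(rng, 2000)

knn_train = accuracy(lambda x: predict_1nn(train, x), train)
knn_test = accuracy(lambda x: predict_1nn(train, x), test)
simple_test = accuracy(predict_threshold, test)
print(knn_train, knn_test, simple_test)
```

The memorizing model scores perfectly on the data it has seen because every training point is its own nearest neighbor. On fresh data it reproduces the training noise and falls behind the simpler rule.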

Lesson 2 Quiz — The Data Scientist

Three questions · Select the best answer for each

1. What was the significance of the Netflix Prize (2006–2009) for data science?

Correct. The Netflix Prize brought techniques like matrix factorization and ensemble learning to widespread attention and demonstrated that data-driven recommendation could be dramatically improved — helping legitimize the field.

The Netflix Prize was important for showcasing collaborative filtering and ensemble methods, not for proving deep learning's dominance (which came later).

2. According to Airbnb's experience, what insight did data scientists surface that shaped a major product decision?

Correct. Airbnb's data science team analyzed listing success factors and found photo quality was pivotal — a finding that became a real product initiative, illustrating how data science influences business decisions.

The key Airbnb data science finding was about photo quality driving booking rates, which led to a professional photography program.

3. How has the data scientist role evolved as LLMs have automated simpler analytical tasks?

Correct. Automation handles routine analysis, so data scientists increasingly focus on higher-judgment work: understanding causality, designing rigorous experiments, and influencing strategic decisions through sophisticated communication.

The role has shifted upmarket rather than disappearing or merging entirely — focusing on judgment, causal reasoning, and strategy that automation cannot yet replicate.

Lab 2 — Data Science Career Planning

Interactive AI conversation · Complete 3 exchanges to finish the lab

Your Task

You are designing a 12-month learning plan to become job-ready as a data scientist. Use this AI advisor to identify the highest-leverage skills to learn, how to build a portfolio, and how to position yourself given the role's evolution toward causal reasoning and strategic communication.

Suggested openers: Ask what the most common mistake is when building a data science portfolio. Ask how to demonstrate causal inference skills to employers. Ask whether to focus more on statistics or programming first.

Data Science Career Advisor

Welcome! I'm your data science career advisor. I can help you map a realistic path into the field — covering what skills matter most right now, how to build a portfolio that stands out, how the role differs across company types, and how to think about the data scientist vs. ML engineer distinction. What's your starting point, and where do you want to go?

Lesson 3 · Technical Roles

The AI Research Scientist

Inventing the techniques that define what AI can do next

What separates an AI researcher from everyone else in the field — and what does the path actually look like?

In November 2020, DeepMind's AlphaFold 2 achieved what structural biologists had attempted for 50 years: accurately predicting the 3D structure of proteins from their amino acid sequences. The system placed first in the CASP14 competition with a median score of 92.4 GDT — far beyond any previous method. The research team, led by John Jumper and Demis Hassabis, published the work in Nature in July 2021. By 2022 DeepMind had released structures for over 200 million proteins. This is what AI research scientists do at the frontier: they create capabilities that did not exist before.

The Research Scientist Role

AI Research Scientists develop new algorithms, architectures, and techniques that expand the frontier of what machine learning can accomplish. Unlike ML engineers (who deploy existing methods) or data scientists (who apply them to business problems), researchers ask: What new thing should exist that doesn't yet?

The work is fundamentally hypothesis-driven. A research scientist identifies an open problem — perhaps language models struggle to reason consistently over long contexts — formulates a hypothesis about why, designs experiments to test it, runs those experiments, and iterates. Publication at venues like NeurIPS, ICML, ICLR, or ACL is the primary currency of the role.

Real Case — Ilya Sutskever & the AlexNet Moment, 2012

At NeurIPS 2012 (then called NIPS), Geoffrey Hinton's group at the University of Toronto — including PhD students Ilya Sutskever and Alex Krizhevsky — presented AlexNet, which reduced the ImageNet top-5 error rate from 26% to 15.3%, a margin that stunned the computer vision community. The paper had been rejected from another venue. Sutskever, who later co-founded OpenAI and served as its Chief Scientist, credits this moment as the inflection point that reoriented the entire field toward deep learning. A single research paper changed the trajectory of AI.

Research Environments

Academic Labs

Universities like MIT, Stanford, CMU, Oxford. Freedom to pursue fundamental questions. Lower pay (often $80–180K for faculty). PhD student pipelines. Grant-funded. Slower compute access.

Industrial Research Labs

DeepMind, Google Brain (merged into Google DeepMind in 2023), Meta AI, Microsoft Research. High pay ($300–600K+). Massive compute. Publish-or-perish culture is still present, but the work is well resourced.

Frontier AI Labs

OpenAI, Anthropic, Cohere, Mistral. Mission-driven. Research directly feeds product. Very high compensation. Competitive hiring. Often blends researcher and engineer expectations.

Government / Nonprofit

Allen Institute (AI2), NIST, DARPA. Policy-adjacent research. Public interest orientation. Lower pay but significant impact potential. Emerging as important AI safety venues.

The PhD Question

Most AI Research Scientist roles at top labs list a PhD as required or strongly preferred. The practical reason: a PhD teaches you to formulate novel research problems, design rigorous experiments, and communicate findings clearly, skills not typically developed in industry roles. PhD programs that frequently place researchers into industry labs include Stanford, MIT, CMU, UC Berkeley, University of Toronto, Oxford, and Cambridge.

However, the field has notable exceptions. Many researchers at leading labs — including at OpenAI and Anthropic — entered with strong engineering backgrounds and published independently. The actual credential the field values is a strong publication record, particularly papers at top venues. A PhD is the conventional path to building that record; it is not the only one.

Key Research Venues

NeurIPS: Neural Information Processing Systems, the largest and most prestigious ML conference.

ICML: International Conference on Machine Learning.

ICLR: International Conference on Learning Representations (particularly influential for deep learning).

ACL / EMNLP: Top venues for Natural Language Processing.

CVPR / ICCV: Computer vision.

Publishing at these venues, especially as first author, is the primary signal of research excellence.

Key Terms

Ablation Study: An experiment that systematically removes components of a model or method to understand each component's individual contribution.

Benchmark: A standardized dataset and evaluation protocol used to compare the performance of different models or methods.

Preprint: A research paper posted publicly (typically on arXiv) before peer review; common in fast-moving AI research.

SOTA: State of the Art, the best known result on a given benchmark at a given time. A research paper often claims to achieve "new SOTA."

Lesson 3 Quiz — The AI Research Scientist

Three questions · Select the best answer for each

1. What made AlphaFold 2's achievement in 2020 historically significant?

Correct. AlphaFold 2 solved protein structure prediction at a level that surpassed all previous methods, releasing predictions for over 200 million proteins and transforming structural biology research.

AlphaFold 2 was significant for solving protein structure prediction — a fundamental problem in biology that had resisted solution for 50 years.

2. What happened at NeurIPS 2012 that Ilya Sutskever described as an inflection point for the field?

Correct. AlexNet's dramatic improvement on ImageNet — presented by Krizhevsky, Sutskever, and Hinton — demonstrated that deep neural networks trained on GPUs could outperform hand-engineered feature pipelines by a historically large margin.

The 2012 NeurIPS inflection point was AlexNet — the deep convolutional network that dramatically improved ImageNet performance and triggered the deep learning revolution.

3. Why is a PhD commonly required for AI Research Scientist roles at top labs?

Correct. The PhD is valued because it specifically trains researchers to identify open problems, design experiments, and communicate findings — skills central to research science but not typically required in engineering or data science roles.

The PhD requirement exists because doctoral training develops research-specific skills: problem formulation, experimental rigor, and scientific communication — not for credential or legal reasons.

Lab 3 — Research Scientist Path Exploration

Interactive AI conversation · Complete 3 exchanges to finish the lab

Your Task

You are a strong undergraduate student trying to decide whether to pursue a PhD in AI or go directly into industry as an ML engineer. Use this advisor to pressure-test both options, understand what research scientists actually do day-to-day, and identify which path better matches your goals.

Suggested openers: Ask what a typical week looks like for a research scientist at a top lab. Ask how to evaluate whether a PhD advisor is a good fit. Ask whether publishing papers matters if you don't want to be a professor.

Research Career Advisor

I'm here to help you think through the research scientist path — from PhD programs to industrial research labs to the question of whether research is right for you at all. I can speak to what research scientists actually do, how different environments (academic vs. DeepMind vs. OpenAI) shape the work, and how to build a research track record. What would you like to explore first?

Lesson 4 · Technical Roles

MLOps & AI Infrastructure

The engineers who make AI systems run — reliably, repeatably, and at scale

Why do so many AI projects succeed in the lab and fail in production — and what role exists specifically to solve that problem?

By the mid-2010s, Uber had hundreds of data scientists building models for surge pricing, ETA prediction, fraud detection, and driver matching. But each team was deploying models differently, training on different infrastructure, and monitoring in inconsistent ways. The result was a fragile zoo of ML systems that was becoming impossible to maintain. Uber's ML platform team spent two years building Michelangelo, an internal ML-as-a-service platform that standardized how models were trained, deployed, monitored, and retrained. When they published a description in September 2017, it became one of the most referenced examples of MLOps in practice. The engineers who built Michelangelo were not data scientists or researchers; they were what we now call MLOps engineers.

What MLOps Actually Means

MLOps (Machine Learning Operations) applies DevOps principles — continuous integration, continuous deployment, monitoring, automation — to machine learning systems. The discipline emerged because ML systems have failure modes that traditional software does not: they can degrade silently as data distributions shift, they require expensive retraining cycles, and their behavior is probabilistic rather than deterministic.

An MLOps engineer builds and maintains the infrastructure layer between raw data and deployed model predictions. This includes feature stores (precomputed feature repositories), model registries (version-controlled model artifacts), experiment tracking systems, automated training pipelines, serving infrastructure, and monitoring dashboards.
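The core idea of a feature store can be sketched in a few lines: one definition of the feature logic, reused by both the training path and the serving path. The `FeatureStore` class, field names, and user records below are hypothetical and held in memory. Real systems such as Feast or Tecton add storage, freshness guarantees, and point-in-time correctness.

```python
from datetime import date


def compute_features(user):
    """Single source of truth for feature logic, shared by training and serving."""
    return {
        "account_age_days": (user["as_of"] - user["signup_date"]).days,
        "orders_per_month": user["orders_90d"] / 3,
    }


class FeatureStore:
    """Hypothetical in-memory store keyed by user id; not a real library's API."""

    def __init__(self):
        self._rows = {}

    def materialize(self, users):
        # Offline path: precompute features in bulk.
        for user in users:
            self._rows[user["id"]] = compute_features(user)

    def get_online(self, user_id):
        # Online path: serving reads the same precomputed values.
        return self._rows[user_id]


users = [
    {"id": 1, "signup_date": date(2023, 1, 1), "as_of": date(2024, 1, 1), "orders_90d": 6},
    {"id": 2, "signup_date": date(2023, 12, 1), "as_of": date(2024, 1, 1), "orders_90d": 0},
]
store = FeatureStore()
store.materialize(users)

training_rows = [compute_features(u) for u in users]
serving_rows = [store.get_online(u["id"]) for u in users]
print(training_rows == serving_rows)
```

Because both paths go through `compute_features`, the values a model sees in training match what it sees at serving time. When the two paths are implemented separately, they can drift apart without anyone noticing.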

Real Case — LinkedIn's Real-Time Machine Learning Platform, 2019

LinkedIn's engineering team published a detailed account in 2019 of building a real-time ML platform that processes over 3 trillion feature values per day for applications including feed ranking, job recommendations, and "People You May Know." The infrastructure includes Venice, a derived-data store used to serve features online, along with online/offline consistency checks and automated model evaluation pipelines. The engineers who built and operate this system are MLOps and AI infrastructure engineers. Their work is invisible to users but enables every AI-powered surface on the platform.

The MLOps Stack

Data Layer

Data pipelines (Airflow, Spark), feature stores (Feast, Tecton), data validation (Great Expectations). Ensuring consistent, high-quality data reaches training and serving.

Training Layer

Distributed training orchestration (Kubeflow, Ray), hyperparameter optimization (Optuna), experiment tracking (MLflow, W&B). Reproducible training at scale.

Serving Layer

Model servers (TorchServe, Triton, BentoML), A/B testing infrastructure, canary deployments, shadow mode testing. Getting predictions to users reliably.

Monitoring Layer

Data drift detection, model performance tracking, alerting (Evidently AI, WhyLabs), retraining triggers. Keeping models healthy over time.
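As an illustration of the monitoring layer, the sketch below computes a Population Stability Index (PSI), one common way to score how far a live feature distribution has moved from its training-time reference. The function, bin count, and synthetic Gaussian samples are assumptions for the example. They are not how Evidently AI or WhyLabs implement drift checks.

```python
import math
import random


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so live values outside the reference range land in an edge bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(42)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # training-time feature
same_dist = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # live data, no drift
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]    # live data, mean shifted

print(round(psi(reference, same_dist), 3), round(psi(reference, shifted), 3))
```

A frequently quoted rule of thumb treats a PSI below about 0.1 as stable and above about 0.25 as a significant shift. A retraining trigger could be wired to a threshold of that kind, tuned per feature.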

Salary & Career Positioning

MLOps engineer roles are among the fastest-growing in the AI space. As of 2024, salaries range from $150,000–$280,000 at major tech companies, with senior ML platform engineers at companies like Stripe, Airbnb, or Lyft earning compensation comparable to that of ML engineers. The role is often less visible than model building but is increasingly recognized as critical infrastructure.

The career typically follows two paths: software engineers who specialize in ML systems (starting from SWE roles and picking up ML context), or ML engineers who move toward platform work (starting with model building and moving toward the infrastructure that supports it). Both are viable; the former is currently more common because strong systems engineering skills are harder to find than ML knowledge.

The Hidden Engineering Work of AI

A 2015 Google paper titled "Hidden Technical Debt in Machine Learning Systems" (Sculley et al.) argued that ML code is typically a small fraction of a real-world ML system, surrounded by configuration, data collection, feature extraction, process management, serving infrastructure, and monitoring. The paper applied the software engineering concept of "technical debt" to ML systems and remains one of the most cited practical ML engineering papers. It describes precisely the problem that MLOps engineers exist to solve.

Key Terms

Feature Store: A centralized repository of computed features that can be shared across models, ensuring consistency between training and serving.

CI/CD for ML: Continuous integration and deployment adapted for ML: automated testing, validation, and deployment of new model versions.

Data Drift: Statistical change in input data over time that can silently degrade model performance without any code changes.

Model Registry: A version-controlled catalog of trained model artifacts, metadata, and lineage information; in effect, the "GitHub for models."
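Two of these terms, the model registry and CI/CD for ML, fit together naturally, and a small sketch can show how. The `ModelRegistry` class, artifact paths, and AUC scores below are hypothetical. Real registries persist artifacts and lineage, and real pipelines run much richer validation than a single metric comparison.

```python
class ModelRegistry:
    """Hypothetical in-memory registry; real registries persist artifacts and lineage."""

    def __init__(self):
        self._versions = []      # one dict per registered version
        self.production = None   # version number currently serving traffic

    def register(self, artifact_uri, metrics):
        version = len(self._versions) + 1
        self._versions.append({"version": version, "uri": artifact_uri, "metrics": metrics})
        return version

    def get(self, version):
        return self._versions[version - 1]

    def promote_if_better(self, version, metric, min_gain=0.0):
        """Validation gate: promote only if the candidate beats production."""
        candidate = self.get(version)["metrics"][metric]
        if self.production is not None:
            current = self.get(self.production)["metrics"][metric]
            if candidate < current + min_gain:
                return False
        self.production = version
        return True


registry = ModelRegistry()
# Invented artifact paths and scores, for illustration only.
v1 = registry.register("models/churn/v1.pkl", {"auc": 0.81})
v2 = registry.register("models/churn/v2.pkl", {"auc": 0.79})
v3 = registry.register("models/churn/v3.pkl", {"auc": 0.84})

results = [registry.promote_if_better(v, "auc") for v in (v1, v2, v3)]
print(results, registry.production)
```

Every version stays in the registry even when it is not promoted. That keeps rollbacks and audits possible, while the gate stops a weaker model from replacing a stronger one automatically.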

Lesson 4 Quiz — MLOps & AI Infrastructure

Three questions · Select the best answer for each

1. What problem did Uber's Michelangelo platform primarily solve?

Correct. Michelangelo addressed the operational chaos of hundreds of teams deploying models inconsistently — creating shared infrastructure for training, serving, and monitoring that made ML systems maintainable at scale.

Michelangelo was an infrastructure standardization effort, not an algorithmic innovation. It solved the operational problem of managing diverse, inconsistently deployed ML systems.

2. According to the 2015 Google paper "Hidden Technical Debt in Machine Learning Systems," what proportion of a real-world ML system is typically ML code?

Correct. The Sculley et al. paper famously illustrated that ML model code is only a small box surrounded by a much larger infrastructure: data collection, feature engineering, serving, configuration, and monitoring — all of which MLOps engineers build and maintain.

The Google paper argued the opposite — that ML code is typically a small fraction of a real ML system, with most complexity in surrounding infrastructure.

3. What is a "feature store" in the context of MLOps?

Correct. Feature stores solve a critical MLOps problem: ensuring the features computed at training time are identical to those available at serving time, preventing subtle bugs caused by training-serving skew.

A feature store is shared infrastructure for precomputed features — ensuring consistency between model training and serving, a common source of bugs in ML systems.

Lab 4 — MLOps Design Challenge

Interactive AI conversation · Complete 3 exchanges to finish the lab

Your Task

You have just joined a 50-person startup as their first ML platform engineer. The company has three data scientists who have built models for customer churn prediction and product recommendation — but the models are deployed manually via Jupyter notebooks, there is no monitoring, and retraining happens whenever someone remembers to do it. Design the first three infrastructure pieces you would build and why.

Suggested openers: Ask which MLOps component provides the most immediate value in a fragile manual deployment setup. Ask how to prioritize monitoring vs. automation when resources are limited. Ask what open-source tools are best for a small team starting from scratch.

MLOps Design Advisor

Welcome to the design challenge! You're walking into a classic early-stage ML infrastructure situation — capable models, no platform underneath them. I can help you think through what to build first, how to make the case to leadership for infrastructure investment, which open-source tools fit a small team, and how to sequence the work to deliver value quickly while building toward something maintainable. What's your first question?

Module 2 — Module Test

15 questions · Score 80% or above to pass

1. An ML engineer's primary focus is best described as:

Correct. ML engineers bridge the gap between research and production — building the systems that train, deploy, and sustain models at scale.

ML engineers focus on production systems — deploying and sustaining models reliably — rather than research, visualization, or finance.

2. What technique did the Meta (Facebook) PyTorch team develop specifically to deploy Python-trained models in C++ production servers?

Correct. TorchScript was built to serialize PyTorch models in a form that could be executed without a Python runtime, enabling production deployment in C++ systems.

The Meta/Facebook PyTorch team built TorchScript specifically to solve the Python-to-C++ production deployment problem.

3. Quantization in the context of ML engineering refers to:

Correct. Quantization compresses model weights to lower bit-widths, dramatically reducing inference cost with typically small accuracy trade-offs.

Quantization reduces numerical precision of model weights to reduce memory and compute — a key inference optimization technique.

4. The Netflix Prize competition (2006–2009) primarily advanced which set of techniques?

Correct. The winning solution combined collaborative filtering, SVD-based matrix factorization, and powerful ensemble methods — bringing these techniques to widespread attention.

The Netflix Prize advanced collaborative filtering and matrix factorization — not Transformers, RLHF, or GANs, which came later.

5. What did Airbnb's data science team discover that directly led to a major product initiative?

Correct. Photo quality analysis led directly to Airbnb's professional photography program — a canonical example of data science shaping product decisions.

Airbnb's data team found that professional photos were the key driver of booking success, which led to the Airbnb photography initiative.

6. In today's data science landscape, which tasks are increasingly shifting to data scientists as LLMs automate simpler analyses?

Correct. As automation handles routine analysis, data scientists increasingly focus on higher-judgment work requiring domain expertise and communication — causal reasoning, rigorous experiments, and executive influence.

Automation is taking over routine tasks, pushing data scientists toward high-judgment work: causal inference, experiment design, and strategic storytelling.

7. AlphaFold 2 achieved a median GDT score of approximately 92.4 in CASP14. What does this represent?

Correct. GDT (Global Distance Test) measures structural accuracy; AlphaFold 2's 92.4 score was far beyond previous methods and matched experimental precision, solving a 50-year challenge.

The 92.4 GDT score in CASP14 represented near-experimental-accuracy protein structure prediction — the core achievement of AlphaFold 2.

8. What primary currency do AI Research Scientists trade in for career advancement?

Correct. In the research scientist career, publication record at top-tier venues is the primary signal of quality — it is how researchers demonstrate the ability to produce new knowledge recognized by the field.

Research scientists are primarily evaluated on their publication record at top-tier venues — NeurIPS, ICML, ICLR, ACL, CVPR — not patents, revenue, or code commits.

9. Which of these correctly characterizes an ablation study?

Correct. Ablation studies isolate the effect of individual components by removing them and measuring the resulting performance change — a standard method for understanding what parts of a model actually matter.

An ablation study removes components one at a time to understand each component's contribution, not a scaling or comparison experiment.

10. What problem did Uber's Michelangelo specifically address?

Correct. Michelangelo was Uber's ML platform built to replace inconsistent ad-hoc deployments with standardized infrastructure for the entire ML lifecycle.

Michelangelo solved the organizational and operational problem of hundreds of teams deploying models in incompatible, unmaintainable ways.

11. What is "training-serving skew" and why does it matter?

Correct. Training-serving skew is a common and insidious MLOps bug: features are computed differently in training pipelines vs. serving systems, causing models to behave differently in production than they did during evaluation.

Training-serving skew refers to feature inconsistencies between training and serving — a key problem that feature stores are specifically designed to prevent.

12. LinkedIn's ML infrastructure processes over how many feature values per day?

Correct. LinkedIn's real-time ML platform processes over 3 trillion feature values per day — an illustration of the scale at which AI infrastructure engineers operate in large consumer platforms.

LinkedIn's platform processes 3 trillion feature values per day — a number that illustrates why dedicated AI infrastructure engineering exists as a specialized discipline.

13. The 2015 Google paper "Hidden Technical Debt in Machine Learning Systems" argued that ML model code represents:

Correct. The Sculley et al. paper used a famous diagram showing ML code as a small box surrounded by a much larger system — making the case that most of the real engineering challenge is in the surrounding infrastructure.

The paper famously illustrated that ML model code is a small fraction of the real system — surrounded by data, feature, serving, monitoring, and configuration infrastructure.

14. Which best describes the primary difference between a data scientist and an ML engineer?

Correct. While the roles overlap significantly, the core orientation differs: data scientists answer analytical questions, ML engineers build and sustain the systems that embed model predictions into products.

The core distinction is orientation: data scientists focus on insight extraction, ML engineers on production systems — regardless of tool choice, math depth, or work setting.

15. When AlexNet was presented at NeurIPS 2012, what was the significance of its ImageNet result?

Correct. AlexNet's 10-percentage-point improvement stunned the computer vision community and triggered the deep learning revolution. Ilya Sutskever later described it as the moment that changed everything.

AlexNet's significance was its dramatic margin of improvement — cutting top-5 error by over 10 points — which demonstrated the power of deep learning on GPUs and transformed research priorities across AI.