diagram.mmd — flowchart
Model Version Deployment flowchart diagram

Model version deployment is the controlled process of promoting a newly trained model artifact from the registry through staging validation, canary rollout, and full production traffic — with automated rollback if live metrics degrade.

What the diagram shows

This flowchart covers the full promotion path from model registry to live serving:

1. Model registered: a new model version passes training evaluation and is written to the model registry with a version tag (see Model Training Pipeline). 2. Deploy to staging: the artifact is deployed to a staging environment that mirrors production infrastructure. 3. Offline validation: holdout evaluation and integration tests run against the staging deployment to catch serialization or runtime issues. 4. Shadow mode: the new model receives a copy of live production traffic and generates shadow predictions, but results are not returned to users. Shadow predictions are compared to the incumbent for consistency. 5. Canary rollout: a small percentage of live traffic (e.g., 1–5%) is routed to the new model version. Online metrics (latency, error rate, business KPIs) are monitored. 6. Canary health check: automated monitors compare canary metrics against baseline. A statistically significant degradation triggers an automatic rollback. 7. Progressive traffic increase: if canary metrics are healthy, traffic is progressively shifted — 10%, 25%, 50%, 100% — with monitoring gates between each step. 8. Full cutover: 100% of traffic is routed to the new version. The previous version is kept in standby for rapid rollback. 9. Decommission old version: after a stability window, the old version is retired from the registry.

Why this matters

Directly replacing a model in production is high risk. A staged deployment strategy — shadow, canary, progressive rollout — provides multiple safety nets, limiting the blast radius of any regression to a small fraction of users.

Free online editor
Edit this diagram in Graphlet
Fork, modify, and export to SVG or PNG. No sign-up required.
Open in Graphlet →

Frequently asked questions

Model version deployment is the controlled process of promoting a newly trained ML model artifact from the registry through staging validation, shadow mode, canary rollout, and progressive traffic increase — with automated rollback triggers — until the new version fully replaces the incumbent in production.
After staging validation and shadow mode (where the new model processes live traffic silently without affecting users), a small slice of real traffic (1–5%) is routed to the new model. Automated monitors compare online metrics against a baseline. If metrics remain healthy, traffic is progressively shifted in increments (10%, 25%, 50%, 100%), with automated rollback available at every gate.
Use shadow mode whenever the new model produces outputs whose quality or consistency is uncertain relative to the incumbent. Shadow mode lets you compare prediction distributions and latency without exposing users to any regression, making it especially valuable for large model version upgrades or changes to the model architecture.
Common automated rollback triggers include statistically significant latency regressions (p99 exceeding a threshold), elevated error rates (HTTP 5xx spikes), drops in business KPIs (click-through rate, conversion rate) detected during canary monitoring, or prediction distribution shifts that indicate the new model is behaving unexpectedly.
mermaid
flowchart TD A([New model version registered]) --> B[Deploy to staging environment] B --> C[Run offline validation and integration tests] C --> D{Staging tests pass?} D -- Fail --> E([Block promotion: fix model or pipeline]) D -- Pass --> F[Deploy in shadow mode alongside production model] F --> G[Compare shadow predictions to incumbent] G --> H{Shadow metrics acceptable?} H -- No --> I([Rollback: investigate divergence]) H -- Yes --> J[Canary rollout: route 5 percent of live traffic to new version] J --> K[Monitor latency, error rate, and business KPIs] K --> L{Canary metrics healthy?} L -- Degraded --> M([Auto-rollback to previous version]) L -- Healthy --> N[Progressive traffic shift: 25% then 50%] N --> O[Monitor at each traffic step] O --> P{All steps healthy?} P -- No --> M P -- Yes --> Q[Full cutover: 100 percent traffic to new version] Q --> R[Keep previous version in standby] R --> S([Decommission old version after stability window])
Copied to clipboard