Complete Deployment Workflow: 7 Steps to Safe Releases

A real-world scenario with v1, v2, v3: relative weights, header testing, and step-by-step verification

By Raghul Ravi (AI Software & Systems Engineer, Tonita)

Deploying to production is scary. One bad deployment can take down your entire service and affect every user. It doesn't have to be this way. With Istio's traffic management, you can deploy a new version alongside the old ones, test it in place, and shift traffic to it in controlled steps, with a quick way back if something goes wrong.

This guide walks through a real-world scenario with three versions (v1, v2, v3), showing how traffic weights work, how to test each step, and where the ping command fits in (it checks pod health, not routing).

Run All Steps Automatically

You can run all steps in sequence using the run_all command:

Terminal - Run all workflow steps
export GATEWAY_ENDPOINT="http://YOUR-GATEWAY-IP/api/version"
./ops/test.sh run_all microservice-1

============================================
Running All Test Steps
============================================

Service: microservice-1
Endpoint: http://10.0.128.22/api/version

===========================================
Step 0: Setup Baseline
===========================================
... (deploys v1 and v2 at 50% each)
✓ Step 0 passed

===========================================
Step 1: Test Baseline
===========================================
... (verifies v1=50%, v2=50%)
✓ Step 1 passed

===========================================
Step 2: Deploy v3
===========================================
... (deploys v3 with 0% traffic, tests header routing)
✓ Step 2 passed

===========================================
Step 3: Add v3 to Rotation
===========================================
... (adds v3 to rotation at 33.3% each)
✓ Step 3 passed

===========================================
Step 4: Retire v2
===========================================
... (retires v2, v1 and v3 split 50/50)
✓ Step 4 passed

===========================================
Step 5: Retire v1
===========================================
... (retires v1, v3 gets 100%)
✓ Step 5 passed

===========================================
Step 6: Cleanup
===========================================
... (removes all deployments)
✓ Step 6 passed

===========================================
Test Summary
===========================================

  Total steps: 7 (Step 0-6)
  Passed: 7
  Failed: 0

✓ All steps passed!

The complete workflow at a glance:

┌─────────────────────────────────────────────────────────────────────┐
│                 COMPLETE 3-VERSION DEPLOYMENT WORKFLOW              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│   Step 0: SETUP BASELINE (deploy v1 and v2 at 50% each)             │
│   Step 1: VERIFY BASELINE (v1=50%, v2=50%)                          │
│   ├── v1: 50% ████████████                                          │
│   ├── v2: 50% ████████████         Equal split                      │
│   └── v3: not deployed yet                                          │
│                                                                     │
│   Step 2: DEPLOY v3 (0% traffic)                                    │
│   ├── v1: 50% ████████████                                          │
│   ├── v2: 50% ████████████                                          │
│   ├── v3:  0%                      Running but invisible            │
│   └── Test v3 with header: x-version: v3                            │
│                                                                     │
│   Step 3: ADD v3 TO ROTATION (v1=50, v2=50, v3=50)                  │
│   ├── Total weight: 150 (weights are RELATIVE!)                     │
│   ├── v1: 33.3% ████████                                            │
│   ├── v2: 33.3% ████████                                            │
│   └── v3: 33.3% ████████           Now receiving traffic            │
│                                                                     │
│   Step 4: RETIRE v2 (v1=50, v2=0, v3=50)                            │
│   ├── v1:  50% ████████████                                         │
│   ├── v2:   0%                     No traffic, but pod still runs   │
│   └── v3:  50% ████████████        50/50 split (v1 & v3)            │
│                                                                     │
│   Step 5: RETIRE v1 (v1=0, v2=0, v3=100)                            │
│   ├── v1:   0%                     No traffic, but pod still runs   │
│   ├── v2:   0%                     No traffic, but pod still runs   │
│   └── v3: 100% ████████████████████████ Only version getting traffic│
│                                                                     │
│   Step 6: CLEAN UP ALL                                              │
│   ├── Remove all deployments                                        │
│   └── ./ops/run.sh delete_service microservice-1                    │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

🔑 Understanding Relative Weights

This is the most important concept to understand: Istio weights are relative, not absolute percentages. The actual traffic percentage for each version is calculated as:

Actual % = (version weight / total of all weights) × 100

Example 1: v1=50, v2=50, v3=0
  Total = 100
  v1 = 50/100 = 50%
  v2 = 50/100 = 50%
  v3 = 0/100 = 0%

Example 2: v1=50, v2=50, v3=50
  Total = 150
  v1 = 50/150 = 33.3%
  v2 = 50/150 = 33.3%
  v3 = 50/150 = 33.3%

Example 3: v1=0, v2=50, v3=100
  Total = 150
  v1 = 0/150 = 0%
  v2 = 50/150 = 33.3%
  v3 = 100/150 = 66.7%

This means adding a new version with weight 50 doesn't give it 50% of traffic. It depends on what the other weights are!
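
The formula above fits in a few lines of Python. This is only an illustrative sketch: the helper name traffic_percentages is an assumption and is not part of the ops scripts.

```python
def traffic_percentages(weights):
    """Convert relative weights to traffic percentages.

    `weights` maps a version name to its non-negative relative weight.
    Hypothetical helper for illustration; not part of ops/run.sh.
    """
    total = sum(weights.values())
    if total == 0:
        # With every weight at 0 there is nothing to divide by.
        raise ValueError("at least one version needs a non-zero weight")
    return {name: round(100 * w / total, 1) for name, w in weights.items()}


# The three examples from this section.
print(traffic_percentages({"v1": 50, "v2": 50, "v3": 0}))
print(traffic_percentages({"v1": 50, "v2": 50, "v3": 50}))
print(traffic_percentages({"v1": 0, "v2": 50, "v3": 100}))
```

Doubling every weight leaves the result unchanged, because only the ratios matter.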

Step 0: Setup Baseline (Deploy v1 and v2 at 50% each)

Before testing the deployment workflow, we need to set up the baseline state: deploy v1 and v2, and configure them to split traffic 50/50. This automated step handles everything in one command.

Automated Setup

Terminal - Setup baseline
./ops/test.sh test_step0_baseline microservice-1

============================================
Step 0: Setting up BASELINE (deploy v1 and v2 at 50% each)
============================================

Deploying v1 and v2 with 50% traffic each...

Deploying v1...

============================================
Starting: microservice-1 (v1)
============================================

Registry: us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO
Service: microservice-1
Version: v1

Generating tags...
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...
Checking cache...
 - services/microservice-1: Found. Tagging
Starting test...
Tags used in deployment:
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...@sha256:...
Starting deploy...
Helm release microservice-1-v1 not installed. Installing...
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:20:19 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for deployments to stabilize...
 - default:deployment/microservice-1-v1: waiting for rollout to finish: 0 of 1 updated replicas are available...
 - default:deployment/microservice-1-v1 is ready.
Deployments stabilized in 11.24 seconds
You can also run [skaffold run --tail] to get the logs

✓ Deployed: microservice-1 (v1)

Traffic weight is read from deployments/TRAFFIC.yaml
To apply traffic changes, run: ./ops/test.sh update_traffic microservice-1 v1
Deploying v2...

============================================
Starting: microservice-1 (v2)
============================================

Registry: us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO
Service: microservice-1
Version: v2

Generating tags...
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...
Checking cache...
 - services/microservice-1: Found Remotely
Starting test...
Tags used in deployment:
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...@sha256:...
Starting deploy...
Helm release microservice-1-v2 not installed. Installing...
NAME: microservice-1-v2
LAST DEPLOYED: Wed Feb 11 15:20:33 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for deployments to stabilize...
 - default:deployment/microservice-1-v2: waiting for rollout to finish: 0 of 1 updated replicas are available...
 - default:deployment/microservice-1-v2 is ready.
Deployments stabilized in 10.242 seconds
You can also run [skaffold run --tail] to get the logs

✓ Deployed: microservice-1 (v2)

Traffic weight is read from deployments/TRAFFIC.yaml
To apply traffic changes, run: ./ops/test.sh update_traffic microservice-1 v2

Applying traffic weights (v1=50%, v2=50%)...

============================================
Updating Traffic: microservice-1 (v1)
============================================
Release "microservice-1-v1" has been upgraded. Happy Helming!
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:20:46 2026
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
✓ Updated v1 weight to 50

New Traffic Distribution:
  v1: 50 → 50.0%
  v2: 50 → 50.0%

Waiting 2 seconds for Istio to propagate...
✓ Traffic updated

✓ Baseline setup complete: v1 and v2 deployed at 50% traffic each

💡 Note: This step can be run independently with: ./ops/test.sh test_step0_baseline microservice-1

Step 1: Verify Baseline (v1=50%, v2=50%)

After setting up the baseline, verify that both versions are running and receiving traffic evenly. This step tests the traffic distribution to ensure everything is working correctly.

Test Traffic Distribution

Terminal - Test baseline traffic
./ops/test.sh test_step1_baseline microservice-1

============================================
Step 1: Testing BASELINE (v1=50%, v2=50%, v3=0%)
============================================

Expected: v1=50%, v2=50%, v3=0%
Making 30 requests...

Results:
  v1: 5 (16.7%)
  v2: 12 (40.0%)
  v3: 0 (0.0%)
  Errors: 13

✗ FAIL: v1 traffic out of range (got 16.7%, expected ~50%)
✗ Step 1 failed

⚠️ Why This "FAIL" Doesn't Mean Traffic Splitting Is Broken: The run recorded 13 errors out of 30 requests, so only 17 requests succeeded. The test computes each percentage against all 30 requests (5/30 = 16.7%), which is why v1 lands outside the expected range. Among the 17 successful requests, v1 received 5 (29.4%) and v2 received 12 (70.6%). With a sample that small, a split this uneven is plausible under a true 50/50 configuration, so the result is not evidence that weighted routing is misconfigured. The errors are most likely transient (network issues, pod startup delays, etc.), but 13 failures in 30 requests is a high rate, so rerun the step and investigate if the errors persist. Over thousands of requests with no errors, the observed split approaches 50/50. This is why the tests use wide acceptance ranges: they have to tolerate both sampling variance and transient errors at small sample sizes.
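
To put numbers on the run above, the sketch below recomputes the shares two ways (against all requests and against successful requests only) and then asks how likely a fair 50/50 split is to give v1 five or fewer of 17 responses. The function names are assumptions made for this example, and the binomial model assumes each request is routed independently.

```python
from math import comb


def shares(counts, total):
    """Percentage of `total` requests that each version answered."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def binomial_tail(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p): the chance of seeing k or fewer hits."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


counts = {"v1": 5, "v2": 12}      # successful responses from the run above
sent = 30                         # requests made
answered = sum(counts.values())   # requests that did not error

print(shares(counts, sent))       # shares of all requests, as the test reports them
print(shares(counts, answered))   # shares of successful requests only
# Chance that a fair 50/50 split gives v1 five or fewer of the answered requests.
print(round(binomial_tail(5, answered), 3))
```

The tail probability comes out at roughly 7% (about 14% if you count an equally lopsided result in either direction), so the split on its own is weak evidence against a 50/50 configuration. It says nothing about the 13 errors, which need their own explanation.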

Current Configuration

TRAFFIC.yaml - Baseline configuration
# Current traffic weights (managed via update_traffic command)
versions:
  - name: v1
    weight: 50   # 50% of traffic
  - name: v2
    weight: 50   # 50% of traffic
  - name: v3
    weight: 0    # Not deployed yet

💡 Note: The ping command checks pod health, not traffic routing, so a pod can report "OK" while receiving 0% of user traffic. At this point v3 is not deployed yet. Once it is deployed in Step 2 with weight 0, it will show as healthy in ping but still receive no weighted traffic.

Step 2: Test v3 Without Affecting Users

Before giving v3 any real traffic, developers can test it using header-based routing. This bypasses the weight rules entirely.
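
To make that precedence concrete, here is a small Python model of the behaviour described above: a request with a matching x-version header is pinned to that version, and everything else is spread by relative weight. It is a sketch of the routing outcome only, not Istio's implementation, and pick_version is a hypothetical name.

```python
import random


def pick_version(headers, weights, rng=random):
    """Model of the routing behaviour described above (not Istio's code).

    A request carrying `x-version` for a known version goes straight to it;
    every other request is spread across versions by relative weight.
    """
    pinned = headers.get("x-version")
    if pinned in weights:
        return pinned  # header match wins, weights are never consulted
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = {"v1": 50, "v2": 50, "v3": 0}
rng = random.Random(0)  # seeded so the sketch is repeatable

# A tester's request reaches v3 even though its weight is 0.
print(pick_version({"x-version": "v3"}, weights, rng))
# Ordinary requests only ever land on v1 or v2.
print(sorted({pick_version({}, weights, rng) for _ in range(1000)}))
```

The model treats header names as case-sensitive for simplicity; real HTTP header names are not.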

Deploy and Test v3

Terminal - Deploy and test v3
./ops/test.sh test_step2_deploy_v3 microservice-1

============================================
Step 2: Testing DEPLOY v3 (v1=50%, v2=50%, v3=0% but accessible)
============================================

Deploying v3...

============================================
Starting: microservice-1 (v3)
============================================

Registry: us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO
Service: microservice-1
Version: v3

Generating tags...
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...
Checking cache...
 - services/microservice-1: Found Remotely
Starting test...
Tags used in deployment:
 - services/microservice-1 -> us-central1-docker.pkg.dev/YOUR-PROJECT/YOUR-REPO/services/microservice-1:snapshot-...@sha256:...
Starting deploy...
Helm release microservice-1-v3 not installed. Installing...
NAME: microservice-1-v3
LAST DEPLOYED: Wed Feb 11 15:21:00 2026
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for deployments to stabilize...
 - default:deployment/microservice-1-v3: waiting for rollout to finish: 0 of 1 updated replicas are available...
 - default:deployment/microservice-1-v3 is ready.
Deployments stabilized in 10.242 seconds
You can also run [skaffold run --tail] to get the logs

✓ Deployed: microservice-1 (v3)

Traffic weight is read from deployments/TRAFFIC.yaml
To apply traffic changes, run: ./ops/test.sh update_traffic microservice-1 v3

Ensuring traffic weights: v1=50%, v2=50%, v3=0%

============================================
Updating Traffic: microservice-1 (v1)
============================================
Release "microservice-1-v1" has been upgraded. Happy Helming!
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:21:13 2026
NAMESPACE: default
STATUS: deployed
REVISION: 3
TEST SUITE: None
✓ Updated v1 weight to 50

New Traffic Distribution:
  v1: 50 → 50.0%
  v2: 50 → 50.0%
  v3: 0 → 0.0%

Waiting 2 seconds for Istio to propagate...
✓ Traffic updated


Expected: v1=50%, v2=50%, v3=0% (normal traffic)
Expected: v3 accessible via x-version: v3 header

Testing normal traffic (without header)...
  v1: 14 (46.7%)
  v2: 16 (53.3%)
  v3: 0 (0.0%)

Testing v3 with header (x-version: v3)...
  v3 via header: 5 / 5

✓ PASS: v3 deployed and accessible via header, no normal traffic

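🔒 No User Impact: v3 is running against real production databases and dependencies, but no weighted traffic reaches it. Only requests that carry the x-version: v3 header do. That makes it well suited to QA and staging-style checks in production. Because v3 talks to real production data, keep the test requests themselves safe to run there.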

💡 Note: This step can be run independently with: ./ops/test.sh test_step2_deploy_v3 microservice-1 [ENDPOINT]

Step 3: Add v3 to Rotation (33% Each)

Now we add v3 to the traffic rotation. Here's where the relative weights become important. When all three versions have weight 50, the total is 150, so each gets 33.3% of traffic.

Test Traffic Distribution

Terminal - Test v3 in rotation
./ops/test.sh test_step3_add_v3 microservice-1

============================================
Step 3: Testing ADD v3 TO ROTATION (v1=33.3%, v2=33.3%, v3=33.3%)
============================================

Updating traffic weights: v1=50, v2=50, v3=50 (33% each)...

============================================
Updating Traffic: microservice-1 (v1)
============================================
Release "microservice-1-v1" has been upgraded. Happy Helming!
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:21:26 2026
NAMESPACE: default
STATUS: deployed
REVISION: 4
TEST SUITE: None
✓ Updated v1 weight to 50

New Traffic Distribution:
  v1: 50 → 33.3%
  v2: 50 → 33.3%
  v3: 50 → 33.3%

Waiting 2 seconds for Istio to propagate...
✓ Traffic updated

Expected: v1=33.3%, v2=33.3%, v3=33.3%
Making 30 requests...

Results:
  v1: 6 (20.0%)
  v2: 13 (43.3%)
  v3: 11 (36.7%)
  Errors: 0

✓ PASS: All three versions receiving traffic (~33% each)

⚠️ Watch the Math: You set v3 to 50, but it doesn't get 50% of traffic! Because the total is now 150, each version gets only 33.3%. This is the key insight of relative weights.
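
Turning the formula around answers the practical question: what weight does a version need to get a given share? Solving w / (w + T) = f for w, where T is the total weight of the other versions and f is the target fraction, gives w = f × T / (1 − f). The helper below is a sketch with a hypothetical name, and it returns a fractional weight; if your tooling only accepts whole numbers, round it and expect the share to shift slightly.

```python
def weight_for_share(target_percent, other_weights_total):
    """Weight a version needs to receive `target_percent` of traffic.

    Solves w / (w + T) = f for w, where T is the sum of the other versions'
    weights and f is the target share as a fraction.
    """
    if not 0 <= target_percent < 100:
        raise ValueError("target_percent must be in [0, 100)")
    f = target_percent / 100
    return f * other_weights_total / (1 - f)


# With v1=50 and v2=50 already in rotation (T = 100):
print(weight_for_share(50, 100))            # weight that would give v3 half the traffic
print(round(weight_for_share(10, 100), 1))  # weight for a 10% slice
```

So with v1 and v2 at 50 each, v3 would need a weight of 100 to reach 50% of traffic, and a weight of about 11 for a 10% slice.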

💡 Note: This step can be run independently with: ./ops/test.sh test_step3_add_v3 microservice-1 [ENDPOINT]

Step 4: Retire v2 (v1=50%, v3=50%)

v2 has been running alongside v1 and v3. Now we retire it by setting its weight to 0. Traffic now splits 50/50 between v1 and v3.

Test Traffic Distribution

Terminal - Test v2 retirement
./ops/test.sh test_step4_retire_v2 microservice-1

============================================
Step 4: Testing RETIRE v2 (v1=50%, v2=0%, v3=50%)
============================================

Updating traffic weights: v1=50, v2=0, v3=50 (50% each)...

============================================
Updating Traffic: microservice-1 (v1)
============================================
Release "microservice-1-v1" has been upgraded. Happy Helming!
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:21:39 2026
NAMESPACE: default
STATUS: deployed
REVISION: 5
TEST SUITE: None
✓ Updated v1 weight to 50

New Traffic Distribution:
  v1: 50 → 50.0%
  v2: 0 → 0.0%
  v3: 50 → 50.0%

Waiting 2 seconds for Istio to propagate...
✓ Traffic updated

Expected: v1=50%, v2=0%, v3=50%
Making 30 requests...

Results:
  v1: 15 (50.0%)
  v2: 0 (0.0%)
  v3: 15 (50.0%)
  Errors: 0

✓ PASS: v2 retired, v1 and v3 split traffic 50/50

🔄 Fast Rollback: Even though v2 gets 0% traffic, its pod is still running. If v3 has issues, set v2 back to 50 (and v3 to 0) and traffic shifts back as soon as Istio propagates the new weights; the scripts wait 2 seconds for this. No rebuild needed!
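
The rollback itself is just another weight change. The sketch below models it as a pure function over the weight table, with one guard: a version can only be given traffic if it is still deployed. The names shift_traffic and deployed are assumptions for illustration; in the real workflow you would edit deployments/TRAFFIC.yaml and run update_traffic.

```python
def shift_traffic(weights, deployed, **changes):
    """Return a new weight table with `changes` applied.

    Only versions in `deployed` can be given weight: a rollback target must
    still have a running pod. Hypothetical helper, not the ops scripts.
    """
    updated = dict(weights)
    for version, weight in changes.items():
        if weight > 0 and version not in deployed:
            raise ValueError(f"{version} is not deployed; redeploy it first")
        updated[version] = weight
    return updated


after_step4 = {"v1": 50, "v2": 0, "v3": 50}
deployed = {"v1", "v2", "v3"}  # v2 has weight 0 but its pod is still running

# v3 misbehaves: send its share back to v2.
rolled_back = shift_traffic(after_step4, deployed, v2=50, v3=0)
print(rolled_back)
```

If v2 had already been deleted, the guard would raise instead, which is the situation the cleanup step warns about.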

💡 Note: This step can be run independently with: ./ops/test.sh test_step4_retire_v2 microservice-1 [ENDPOINT]

Step 5: Retire v1 (v3 Gets 100%)

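v3 is stable and performing well. Now we retire v1 by setting its weight to 0. The test also raises v3's weight to 100, although any non-zero weight would have the same effect: v3 is the only version with a non-zero weight, so it receives 100% of traffic (for example, 50/50 = 100%).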

Test Traffic Distribution

Terminal - Test v1 retirement
./ops/test.sh test_step5_retire_v1 microservice-1

============================================
Step 5: Testing RETIRE v1 (v1=0%, v2=0%, v3=100%)
============================================

Updating traffic weights: v1=0, v2=0, v3=100 (100% v3)...

============================================
Updating Traffic: microservice-1 (v1)
============================================
Release "microservice-1-v1" has been upgraded. Happy Helming!
NAME: microservice-1-v1
LAST DEPLOYED: Wed Feb 11 15:21:52 2026
NAMESPACE: default
STATUS: deployed
REVISION: 6
TEST SUITE: None
✓ Updated v1 weight to 0

New Traffic Distribution:
  v1: 0 → 0.0%
  v2: 0 → 0.0%
  v3: 100 → 100.0%

Waiting 2 seconds for Istio to propagate...
✓ Traffic updated

Expected: v1=0%, v2=0%, v3=100%
Making 30 requests...

Results:
  v1: 0 (0.0%)
  v2: 0 (0.0%)
  v3: 30 (100.0%)
  Errors: 0

✓ PASS: v1 and v2 retired, v3 receiving 100% traffic

💡 Pro Tip: Keep old versions running with 0% traffic for a day or two. If you discover a critical bug in v3, you can instantly shift traffic back without waiting for a rebuild.

💡 Note: This step can be run independently with: ./ops/test.sh test_step5_retire_v1 microservice-1 [ENDPOINT]

Step 6: Clean Up All Deployments

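After v3 has been stable for a while, you can optionally remove the old versions to free up cluster resources. Note that the test step below goes further: it runs delete_service, which removes every version of microservice-1, including v3, so the test environment is left clean. To remove only a retired version in a real rollout, use delete_version instead (see the quick reference).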

Test Cleanup

Terminal - Test cleanup verification
./ops/test.sh test_step6_cleanup microservice-1

============================================
Step 6: Testing CLEAN UP (delete all deployments)
============================================

Deleting service: microservice-1

============================================
Deleting Service: microservice-1
============================================

WARNING: This will remove ALL versions of microservice-1

Removing Helm releases...
Uninstalling microservice-1-v1...
These resources were kept due to the resource policy:
[Service] microservice-1-service
[DestinationRule] microservice-1-dr
[VirtualService] microservice-1-vs

release "microservice-1-v1" uninstalled
Uninstalling microservice-1-v2...
release "microservice-1-v2" uninstalled
Uninstalling microservice-1-v3...
release "microservice-1-v3" uninstalled

✓ Deleted service: microservice-1

Verifying cleanup...

Waiting for pods to terminate...
Results:
  Running pods: 0
  Total pods (including Terminating): 1
  Helm releases: 0

Note: 1 pod(s) still terminating (this is normal)
NAME                                 READY   STATUS      RESTARTS   AGE
microservice-1-v3-64c97f57d4-mrvdv   0/2     Completed   0          56s
✓ PASS: All deployments cleaned up

⚠️ Note: Only run this after v3 has been stable for a sufficient period. Once a version is removed, you'll need to redeploy it before you can roll back to it.

💡 Note: This step can be run independently with: ./ops/test.sh test_step6_cleanup microservice-1
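
For a real rollout, where v3 should keep running, a small guard can list which versions are safe to delete: those with weight 0, provided something else is still serving traffic. This is a hedged sketch; removable_versions is a hypothetical helper, and the printed command simply follows the delete_version form from the quick reference.

```python
def removable_versions(weights):
    """Versions that receive no traffic and are candidates for deletion.

    Refuses to answer if nothing is serving traffic, since deleting from that
    state cannot be a routine cleanup. Illustrative helper only.
    """
    serving = [name for name, w in weights.items() if w > 0]
    if not serving:
        raise ValueError("no version is serving traffic")
    return sorted(name for name, w in weights.items() if w == 0)


after_step5 = {"v1": 0, "v2": 0, "v3": 100}
for version in removable_versions(after_step5):
    # Command form taken from the quick reference.
    print(f"./ops/run.sh delete_version microservice-1 {version}")
```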

Summary: Complete Reference

Commands Quick Reference

Terminal - All commands
# Clone the repo and enter the deployment stack directory
git clone https://github.com/tonitaco/tonita-oss.git
cd tonita-oss/microservices-deployment-stack

# Deploy a version (starts with 0% traffic)
./ops/run.sh start microservice-1 v1

# List active versions and traffic distribution
./ops/run.sh list microservice-1

# Update traffic: Edit deployments/TRAFFIC.yaml, then:
./ops/run.sh update_traffic microservice-1 v1

# Check pod health
./ops/run.sh ping microservice-1
./ops/run.sh ping microservice-1 v1  # Specific version

# Remove a specific version
./ops/run.sh delete_version microservice-1 v1

# Remove entire service (all versions)
./ops/run.sh delete_service microservice-1

# Test traffic distribution
export GATEWAY_ENDPOINT="http://YOUR-GATEWAY-IP/api/version"
./ops/test.sh test_traffic

# Test specific version (header bypass)
./ops/test.sh test_v3 microservice-1

# Run complete workflow tests
./ops/test.sh test_step0_baseline microservice-1  # Setup baseline
./ops/test.sh test_step1_baseline microservice-1  # Verify baseline
./ops/test.sh test_step2_deploy_v3 microservice-1 # Deploy v3
./ops/test.sh test_step3_add_v3 microservice-1    # Add v3 to rotation
./ops/test.sh test_step4_retire_v2 microservice-1 # Retire v2
./ops/test.sh test_step5_retire_v1 microservice-1 # Retire v1
./ops/test.sh test_step6_cleanup microservice-1   # Clean up

Weight Cheat Sheet

┌────────────────────────────────────────────────────────────────┐
│                     WEIGHT CHEAT SHEET                         │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  EQUAL SPLIT (3 versions):                                     │
│  v1=50, v2=50, v3=50 → Total=150 → 33.3% each                  │
│  v1=100, v2=100, v3=100 → Total=300 → 33.3% each (same!)       │
│                                                                │
│  RETIRE ONE VERSION:                                           │
│  v1=0, v2=50, v3=50 → Total=100 → v2=50%, v3=50%               │
│                                                                │
│  PROMOTE ONE VERSION:                                          │
│  v1=0, v2=50, v3=100 → Total=150 → v2=33.3%, v3=66.7%          │
│  v1=0, v2=25, v3=75 → Total=100 → v2=25%, v3=75%               │
│                                                                │
│  FULL ROLLOUT:                                                 │
│  v1=0, v2=0, v3=100 → Total=100 → v3=100%                      │
│                                                                │
│  QUICK ROLLBACK:                                               │
│  v1=0, v2=100, v3=0 → Total=100 → v2=100%                      │
│                                                                │
└────────────────────────────────────────────────────────────────┘

🎯 Key Takeaways: Weights are relative, not absolute. Ping checks pod health, not traffic routing. Keep old versions running for instant rollback. Test with headers before adding traffic weight. Traffic distribution tests may show variance with small sample sizes; this is expected, especially when transient errors reduce the effective sample size (see the note on the failed run in Step 1).
