Best Practices
This guide provides recommendations for running effective and reliable performance tests with virtbench.
Important Notice:
- Do not run these benchmarks directly in your production environment without thorough testing first.
- Always test in a non-production environment to understand the impact and behavior.
- Test results will vary significantly based on your underlying infrastructure, including hardware specifications, storage backend, network configuration, and cluster resources.
- Use at your own risk.
General Testing Practices
1. Start Small, Scale Gradually
- Begin with 5-10 VMs to validate your setup
- Gradually increase to 50, 100, 200+ VMs
- Identify bottlenecks at each scale
- Understand your infrastructure limits before large-scale tests
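The scale-up steps above can be scripted as a dry run first: the sketch below only echoes each command so the plan can be reviewed before anything is launched (drop the `echo` to execute for real; YOUR-STORAGE-CLASS is the same placeholder used throughout this guide).

```shell
# Dry-run sketch of a gradual scale-up: print each planned command.
# Remove 'echo' once the small runs look healthy.
for count in 10 50 100 200; do
  echo virtbench datasource-clone --start 1 --end "$count" \
    --storage-class YOUR-STORAGE-CLASS --save-results
done
```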
2. Run Multiple Tests
- Run each test at least 3 times for consistency
- Average results across multiple runs
- Identify and investigate outliers
- Account for cluster variability
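Averaging across runs can be as simple as collecting one wall-clock figure per run and feeding them to awk. A minimal sketch, assuming each run's total duration in seconds is appended to a file (the three values below are sample data):

```shell
# Sample timings from three runs, one value per line (illustration only)
printf '%s\n' 312 298 305 > runs.txt
# Average them; outliers worth investigating will stand out against this
awk '{ sum += $1; n++ } END { printf "avg: %.1f s over %d runs\n", sum/n, n }' runs.txt
# → avg: 305.0 s over 3 runs
```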
3. Save Results Consistently
Always use --save-results to track performance over time:
virtbench datasource-clone \
--start 1 \
--end 50 \
--storage-class YOUR-STORAGE-CLASS \
--save-results \
--storage-version 3.2.0
4. Use Meaningful Test Names
Organize results with storage version and configuration details:
--storage-version "portworx-3.2.0"
--storage-version "ceph-rbd-17.2"
5. Monitor Cluster Resources
Watch cluster resources during tests:
# In a separate terminal
watch kubectl top nodes
# Check storage backend metrics
# (specific to your storage solution)
VM Creation Testing
1. Validate Cluster First
Always run cluster validation before testing:
virtbench validate-cluster --storage-class YOUR-STORAGE-CLASS
2. Use Appropriate Concurrency
- Default concurrency (50) works for most scenarios
- Increase for large-scale tests (100-200 VMs)
- Decrease if experiencing resource contention
virtbench datasource-clone \
--start 1 \
--end 200 \
--concurrency 200 \
--storage-class YOUR-STORAGE-CLASS
3. Namespace Batch Creation
Create namespaces in batches for faster setup:
virtbench datasource-clone \
--start 1 \
--end 100 \
--namespace-batch-size 50 \
--storage-class YOUR-STORAGE-CLASS
4. Boot Storm Testing
- Test both single-node and multi-node boot storms
- Start with smaller VM counts (20-30)
- Gradually increase to find capacity limits
- Compare initial creation vs boot storm performance
Migration Testing
1. Verify VMs Before Migration
Ensure VMs are healthy before starting migration tests:
# Check VM status
kubectl get vm -n kubevirt-perf-test-1
# Verify network connectivity
kubectl exec -it ssh-test-pod -- ping <vm-ip>
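The per-namespace check above can be rolled into one pre-flight count across all test namespaces. A sketch, assuming the `kubevirt-perf-test-<n>` naming from the examples and that STATUS is the fourth column of `kubectl get vm -A`; on a machine without cluster access it simply reports zero:

```shell
# Count test VMs that are not yet Running before starting migrations
not_ready=$(kubectl get vm -A --no-headers 2>/dev/null \
  | awk '$1 ~ /^kubevirt-perf-test-/ && $4 != "Running"' | wc -l)
if [ "$not_ready" -gt 0 ]; then
  echo "WARN: $not_ready VM(s) not Running; investigate before migrating"
else
  echo "all test VMs report Running"
fi
```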
2. Choose Appropriate Migration Scenario
- Sequential: For baseline performance
- Parallel: For stress testing
- Evacuation: For node maintenance scenarios
- Round-robin: For load balancing validation
3. Set Realistic Timeouts
Adjust timeouts based on VM size and network:
virtbench migration \
--start 1 \
--end 10 \
--migration-timeout 600 \
--source-node worker-1   # 600 seconds = 10 minutes for large VMs
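A rough way to pick the timeout is disk size divided by usable bandwidth, with headroom for dirty-page retransfers. The numbers below are illustrative assumptions, not measurements:

```shell
# Back-of-envelope migration timeout estimate
DISK_GB=40      # VM disk size (assumption)
BW_GBPS=1.25    # ~10 Gbit/s link => ~1.25 GB/s usable (assumption)
awk -v d="$DISK_GB" -v b="$BW_GBPS" \
  'BEGIN { printf "suggested --migration-timeout: %d seconds\n", int(d / b * 3) }'
# → suggested --migration-timeout: 96 seconds
```

The 3x factor is a conservative buffer; busy VMs that keep dirtying memory pages can take far longer than a raw copy would suggest.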
Chaos Benchmark Testing
1. Understand Your Goals
- Stress Test Cluster: Run concurrent chaos operations
- Test Specific Scenarios: Use --max-iterations to limit test duration
- Skip Unsupported Features: Use --skip-resize, --skip-clone, or --skip-snapshot if needed
2. Start with Conservative Settings
virtbench chaos-benchmark \
--storage-class YOUR-STORAGE-CLASS \
--concurrency 2 \
--vms 5 \
--max-iterations 5
3. Monitor for Failures
- Watch for resource exhaustion
- Check storage backend health
- Monitor node resources
- Review logs for errors
Failure Recovery Testing
1. Test in Non-Production First
- Validate FAR configuration in test environment
- Understand recovery behavior before production use
- Document expected recovery times
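Documenting expected recovery times is easier with a quick summary over the measured values. A sketch, assuming one recovery time in seconds per line (the values below are sample data):

```shell
# Sample recovery times in seconds, one per run (illustration only)
printf '%s\n' 95 110 102 98 240 > recovery.txt
# Min / median / max give a usable RTO summary; a high max flags an outlier
sort -n recovery.txt | awk '{ a[NR] = $1 }
  END { printf "min %d  median %d  max %d\n", a[1], a[int((NR+1)/2)], a[NR] }'
# → min 95  median 102  max 240
```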
2. Use Appropriate Timeouts
Set timeouts based on your RTO requirements:
virtbench failure-recovery \
--start 1 \
--end 10 \
--recovery-timeout 600 # 10 minutes
3. Clean Up FAR Resources
Always clean up FAR resources after testing:
virtbench failure-recovery \
--start 1 \
--end 10 \
--cleanup \
--cleanup-vms
Logging and Debugging
1. Use Appropriate Log Levels
- INFO: Normal operation (default)
- DEBUG: Detailed troubleshooting
- WARNING: Important issues only
- ERROR: Critical errors only
2. Save Logs to Files
virtbench datasource-clone \
--start 1 \
--end 50 \
--storage-class YOUR-STORAGE-CLASS \
--log-file test-$(date +%Y%m%d-%H%M%S).log
3. Review Logs After Tests
- Check for errors and warnings
- Identify performance bottlenecks
- Validate test completion
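Checking for errors and warnings is a one-line grep over the saved log. A sketch; `test.log` here is filled with sample lines for illustration, and the `ERROR`/`WARNING` level names match the log levels listed above:

```shell
# Sample log contents (illustration only)
printf 'INFO start\nERROR pvc bind timeout\nINFO retry\nWARNING slow clone\n' > test.log
# Line-numbered triage of anything at WARNING or above
grep -En 'ERROR|WARNING' test.log || echo "no errors or warnings"
```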
Cleanup Practices
1. Always Clean Up After Tests
virtbench datasource-clone \
--start 1 \
--end 50 \
--storage-class YOUR-STORAGE-CLASS \
--cleanup
2. Use Dry Run First
Preview cleanup before executing:
virtbench datasource-clone \
--start 1 \
--end 50 \
--storage-class YOUR-STORAGE-CLASS \
--dry-run-cleanup
3. Clean Up on Failure
Ensure resources are cleaned up even if tests fail:
virtbench datasource-clone \
--start 1 \
--end 50 \
--storage-class YOUR-STORAGE-CLASS \
--cleanup-on-failure
Results Management
1. Organize Results by Version
Use --storage-version to organize results:
--storage-version "portworx-3.2.0"
2. Generate Dashboards Regularly
Create dashboards after each test run:
python3 dashboard/generate_dashboard.py --days 30
3. Archive Important Results
- Save dashboard HTML files
- Keep JSON/CSV results for historical comparison
- Document test conditions and configurations
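Archiving can be a timestamped tarball of the dashboard and raw result files. A sketch; the `results/` and `dashboard/` paths are assumptions (adjust to your layout), and the `mkdir`/`touch` lines only create stand-in files for illustration:

```shell
# Stand-in files so the example is self-contained (illustration only)
mkdir -p results dashboard
touch results/run.json dashboard/index.html
# Bundle everything under one timestamped archive for later comparison
stamp=$(date +%Y%m%d-%H%M%S)
tar czf "virtbench-results-$stamp.tar.gz" results dashboard
echo "archived to virtbench-results-$stamp.tar.gz"
```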
Performance Optimization
1. Tune for Your Environment
- Adjust concurrency based on cluster size
- Optimize namespace batch size
- Configure appropriate timeouts
2. Minimize External Load
- Run tests when cluster is not under load
- Avoid running multiple tests simultaneously
- Ensure storage backend is not saturated
3. Use Consistent Test Conditions
- Same time of day
- Same cluster state
- Same resource availability
See Also
- Configuration Options - All available options
- Troubleshooting - Common issues and solutions
- Output and Results - Understanding test output
- Cleanup Guide - Cleanup procedures