Integration Test Suite Specification
Date: November 7, 2025
Branch: testing/integration-suite
Status: Design Complete - Ready for Implementation
Executive Summary
Comprehensive specification for 55+ integration tests covering complete workflows across all DGX Music components. This document serves as the implementation blueprint for the integration testing agent.
Test Suite Overview
Objectives
- Validate end-to-end workflows (API → generation → export → database)
- Verify audio quality compliance (WAV format, loudness, no clipping)
- Test error handling and recovery scenarios
- Benchmark performance against MVP targets
- Ensure database consistency and integrity
Coverage Goals
- Target Coverage: 92%+
- Test Count: 55+ tests
- Execution Time: <5 minutes
- Success Rate: 100% (all tests must pass)
Test Modules Specification
1. test_e2e_complete.py (15 tests)
Purpose: Test complete end-to-end workflows
Test Classes:
TestCompleteWorkflow (7 tests)
test_simple_generation_to_database()
# Verify: Generate → Save → Database → Retrieve
test_complete_workflow_with_export()
# Verify: Generate → Export → Database with metadata
test_workflow_with_metadata_extraction()
# Verify: Full pipeline with BPM/key detection
test_multiple_generations_sequential()
# Verify: Sequential generation consistency
test_workflow_with_different_durations()
# Verify: 4s, 8s, 16s, 30s durations
test_workflow_with_status_transitions()
# Verify: PENDING → PROCESSING → COMPLETED
test_concurrent_database_writes()
# Verify: Concurrent operations don't conflictTestAsyncJobQueue (3 tests)
test_job_status_polling()
# Verify: Client can poll job status
test_retrieving_completed_jobs()
# Verify: Completed jobs retrievable
test_pending_jobs_queue()
# Verify: Pending jobs queued correctlyTestFileAndDatabaseSync (5 tests)
test_file_creation_matches_database()
# Verify: Created files match DB records
test_database_records_match_files()
# Verify: All DB records have files
test_wav_file_playable()
# Verify: Generated WAV files are valid
test_complete_workflow_quality_check()
# Verify: Quality checks pass
test_concurrent_database_writes()
# Verify: No race conditions2. test_audio_quality.py (10 tests)
Purpose: Validate audio quality and format compliance
TestWAVFormat (4 tests)
test_wav_format_pcm16_32khz_stereo()
# Verify: PCM_16, 32kHz, stereo format
test_wav_file_not_corrupted()
# Verify: No NaN/Inf values
test_audio_duration_matches_request()
# Verify: Duration within ±1s
test_stereo_channels_present()
# Verify: 2 channels presentTestLoudnessNormalization (3 tests)
test_loudness_within_target_range()
# Verify: -16 LUFS ±1
test_no_clipping()
# Verify: Peak < 0.99
test_normalization_consistent_across_files()
# Verify: Batch consistencyTestAudioProperties (3 tests)
test_metadata_extraction_accuracy()
# Verify: Accurate metadata
test_audio_statistics()
# Verify: Valid peak/RMS/dynamic range
test_stereo_balance()
# Verify: Channels balanced3. test_error_scenarios.py (12 tests)
Purpose: Test error handling and edge cases
TestInvalidInputs (7 tests)
test_empty_prompt()
test_too_long_prompt()
test_special_characters_in_prompt()
test_negative_duration()
test_zero_duration()
test_too_long_duration()
test_invalid_model_name()TestResourceFailures (4 tests)
test_disk_full_simulation()
test_output_directory_missing()
test_file_permission_error()
test_database_connection_failure()TestGPUFallback (2 tests)
test_cuda_unavailable_cpu_fallback()
test_gpu_memory_error_handling()4. test_performance.py (8 tests)
Purpose: Performance benchmarking
TestGenerationLatency (3 tests)
test_16s_generation_under_30s()
# Target: <30s for 16s audio
test_real_generation_performance() # @pytest.mark.gpu
# Target: <30s on GPU
test_multiple_short_generations_throughput()
# Measure: Jobs per minuteTestAPIResponseTime (3 tests)
test_database_query_performance()
# Target: <100ms
test_status_check_performance()
# Target: <50ms
test_bulk_query_performance()
# Target: <200ms for 100 recordsTestMemoryUsage (2 tests)
test_gpu_memory_under_budget() # @pytest.mark.gpu
# Target: <30GB peak
test_memory_cleanup_after_generation()
# Verify: No memory leaks5. test_database_consistency.py (10 tests)
Purpose: Database consistency and integrity
TestDatabaseIntegrity (5 tests)
test_all_generations_have_database_records()
test_completed_generations_have_files()
test_database_records_match_file_properties()
test_foreign_key_constraints()
test_unique_constraints()TestTransactionHandling (2 tests)
test_transaction_rollback_on_failure()
test_partial_completion_rollback()TestOrphanedData (3 tests)
test_detect_orphaned_files()
test_detect_orphaned_database_records()
test_cleanup_orphaned_files()Test Utilities Specification
audio_helpers.py
Functions:
validate_wav_file(file_path, expected_sample_rate=32000,
expected_channels=2, expected_bit_depth='PCM_16')
# Returns: dict with validation results
measure_loudness(file_path)
# Returns: float (LUFS) or None
check_no_clipping(file_path, threshold=0.99)
# Returns: (has_clipping: bool, peak: float)
verify_audio_quality(file_path, target_lufs=-16.0,
lufs_tolerance=1.0, clip_threshold=0.99)
# Returns: dict with comprehensive quality metrics
compare_audio_properties(file_path, expected_duration,
duration_tolerance=1.0)
# Returns: dict with comparison results
create_test_audio_tensor(duration=1.0, sample_rate=32000,
channels=2, frequency=440.0)
# Returns: torch.Tensordb_helpers.py
Functions:
seed_test_generations(session, count=10, status_distribution=None)
# Returns: List[Generation]
verify_database_consistency(session)
# Returns: dict with consistency check results
get_orphaned_files(session, outputs_dir)
# Returns: List[Path]
get_orphaned_records(session, outputs_dir)
# Returns: List[Generation]
create_generation_with_file(session, prompt, file_path,
status=COMPLETED)
# Returns: Generation
cleanup_test_files(generations)
# Returns: int (count deleted)mock_helpers.py
Classes:
class MockMusicGenerationEngine:
"""Fast mock engine for testing without GPU"""
def __init__(self, generation_delay=0.1):
pass
def generate_audio(self, prompt, duration, **kwargs):
# Returns: (np.ndarray, int)
def generate(self, request):
# Returns: GenerationResultShared Fixtures (conftest.py)
Directory Fixtures
@pytest.fixture
def temp_dir(tmp_path) -> Path
@pytest.fixture
def output_dir(temp_dir) -> Path
@pytest.fixture
def data_dir(temp_dir) -> PathDatabase Fixtures
@pytest.fixture
def db_session(test_db_url)
@pytest.fixture
def clean_db_session(test_db_url)
@pytest.fixture
def seeded_db_session(clean_db_session)Engine Fixtures
@pytest.fixture
def mock_engine() -> MockMusicGenerationEngine
@pytest.fixture(scope="module")
def real_engine() # Requires GPU
@pytest.fixture
def mock_settings(output_dir, monkeypatch)Audio Fixtures
@pytest.fixture
def test_audio_file(output_dir) -> Path
@pytest.fixture
def test_audio_tensor() -> torch.TensorPerformance Baselines
Target Metrics
| Metric | Target | Blocker Threshold |
|---|---|---|
| 16s generation (GPU) | <30s | <60s |
| Database query | <100ms | <500ms |
| WAV export | <1s | <5s |
| Memory peak (GPU) | <20GB | <30GB |
| Test suite runtime | <3min | <5min |
Performance Test Output
Each performance test should output:
Generation time: 24.3s for 16s audio
Real-time factor: 1.52x
✓ EXCELLENT: Within 30s target
Coverage Requirements
Module Coverage Targets
| Module | Target | Critical |
|---|---|---|
| services/generation/ | 90% | Yes |
| services/storage/ | 94% | Yes |
| services/audio/ | 90% | Yes |
| Overall | 92% | Yes |
Coverage Report Format
pytest tests/integration/ --cov=services --cov-report=termExpected output:
Name Stmts Miss Cover
-----------------------------------------------------
services/audio/export.py 142 8 94%
services/audio/metadata.py 156 12 92%
services/generation/engine.py 198 15 92%
services/storage/database.py 145 6 96%
-----------------------------------------------------
TOTAL 641 41 94%
CI/CD Integration
GitHub Actions Workflow
name: Integration Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: '3.11'
- name: Install dependencies
run: |
pip install -r requirements.txt
pip install pytest pytest-cov
- name: Run integration tests
run: |
pytest tests/integration/ -v -m "not gpu and not slow" \
--cov=services --cov-report=xml
- name: Upload coverage
uses: codecov/codecov-action@v3Implementation Checklist
Phase 1: Foundation
- Create
tests/utils/directory - Implement
audio_helpers.py(15 functions) - Implement
db_helpers.py(12 functions) - Implement
mock_helpers.py(MockEngine + 5 functions) - Create
conftest.pywith 20+ fixtures
Phase 2: Test Implementation
- Implement
test_e2e_complete.py(15 tests) - Implement
test_audio_quality.py(10 tests) - Implement
test_error_scenarios.py(12 tests) - Implement
test_performance.py(8 tests) - Implement
test_database_consistency.py(10 tests)
Phase 3: Documentation
- Create
TESTING_GUIDE.md(comprehensive guide) - Document running tests
- Document coverage analysis
- Document CI/CD integration
- Document troubleshooting
Phase 4: Validation
- Run all tests (should pass)
- Generate coverage report (should be >92%)
- Run performance benchmarks
- Verify test execution time (<5min)
Success Criteria
✅ All tests implemented: 55+ tests across 5 modules
✅ All tests pass: 100% success rate
✅ Coverage achieved: 92%+ overall coverage
✅ Performance validated: All targets met
✅ Documentation complete: Comprehensive guide
✅ CI/CD ready: Example workflows provided
Files to Create
tests/utils/__init__.pytests/utils/audio_helpers.py(~400 lines)tests/utils/db_helpers.py(~350 lines)tests/utils/mock_helpers.py(~350 lines)tests/conftest.py(~450 lines)tests/integration/test_e2e_complete.py(~400 lines)tests/integration/test_audio_quality.py(~300 lines)tests/integration/test_error_scenarios.py(~350 lines)tests/integration/test_performance.py(~400 lines)tests/integration/test_database_consistency.py(~350 lines)docs/TESTING_GUIDE.md(~700 lines)
Total: ~4,000 lines of test code and documentation
Status: Ready for implementation
Priority: High - Required for MVP completion
Dependencies: Week 1-2 code complete
Estimated Time: 4-6 hours for full implementation