Docker Testing

Overview

Running the Sensyze Dataflow test suite inside Docker containers ensures production parity: tests execute against the same image, dependencies, and services as the deployed system.

Running Tests in Docker

Run All Tests

make dev
docker compose exec dataflow-server pytest tests/

Run Dask Tests

make test-dask

Run Specific Test File

docker compose exec dataflow-server pytest tests/test_distributed_dask.py -v

Testing Commands Reference

Command                 Description
make test-dask-quick    Quick speedup test (~10 seconds)
make test-dask          All Dask tests (~45 seconds)
make test-dask-all      All tests with detailed output
make test-adapter       DataFrame adapter tests
make test-coverage      All tests with coverage
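These targets are expected to wrap docker compose exec invocations like the ones shown above. A minimal sketch of what the Makefile might contain (the target names come from the table; the exact pytest paths, markers, and flags are assumptions):

```makefile
# Hypothetical Makefile fragment -- the real targets may differ.
test-dask-quick:
	docker compose exec dataflow-server pytest tests/test_distributed_dask.py -k speedup

test-dask:
	docker compose exec dataflow-server pytest tests/test_distributed_dask.py

test-coverage:
	docker compose exec dataflow-server pytest tests/ --cov
```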

Test Coverage

Unit Tests

  • DataFrame adapter (Pandas ↔ Dask)
  • Pipeline runner components
  • Node executors
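To illustrate the unit-test style used for the adapter, here is a self-contained sketch of a round-trip test. The `from_records`/`to_records` functions are hypothetical stand-ins written inline so the example runs anywhere; the real adapter converts between Pandas and Dask DataFrames rather than plain dicts.

```python
# Sketch of a round-trip unit test for a DataFrame adapter.
# The conversion functions below are hypothetical stand-ins that
# mirror the shape of a Pandas <-> Dask round trip: rows in,
# columnar representation, rows back out unchanged.

def from_records(records):
    """Convert row-oriented records to a columnar mapping."""
    columns = {}
    for row in records:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

def to_records(columns):
    """Convert a columnar mapping back to row-oriented records."""
    keys = list(columns)
    length = len(columns[keys[0]]) if keys else 0
    return [{k: columns[k][i] for k in keys} for i in range(length)]

def test_adapter_round_trip():
    records = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
    assert to_records(from_records(records)) == records
```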

Integration Tests

  • End-to-end pipeline execution
  • Database connections
  • API endpoints

Distributed Tests

  • Parallel execution speedup
  • Dask client initialization
  • Error handling in parallel
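The speedup check follows a common pattern: run the same workload serially and in parallel, then assert the parallel run is faster. The sketch below uses a thread pool in place of a real Dask cluster (an assumption for portability; the actual tests submit work through a Dask client), but the assertion structure is the same.

```python
# Sketch of a parallel-speedup test. ThreadPoolExecutor stands in
# for a Dask client here; the real tests dispatch work to Dask.
import time
from concurrent.futures import ThreadPoolExecutor

def slow_task(_):
    time.sleep(0.2)  # simulate an I/O-bound node execution
    return True

def test_parallel_speedup():
    # Serial baseline: four tasks back to back (~0.8 s).
    start = time.monotonic()
    for i in range(4):
        slow_task(i)
    serial = time.monotonic() - start

    # Parallel run: four concurrent workers (~0.2 s).
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(slow_task, range(4)))
    parallel = time.monotonic() - start

    assert all(results)
    assert parallel < serial  # concurrent sleeps overlap, so this holds
```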

Manual Testing

Test API Endpoint

curl -X POST http://dataflow-server:8000/python/test \
  -H "Content-Type: application/json" \
  -d '{"code": "def transform(df): return df", "input_data": [{"a": 1}]}'
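The same request can be issued from Python. The sketch below builds a request matching the curl example (the `build_test_request` helper is hypothetical); actually sending it requires the dataflow-server container to be running.

```python
# Build a request equivalent to the curl example for /python/test.
import json
import urllib.request

def build_test_request(code, input_data, host="http://dataflow-server:8000"):
    """Return a urllib Request mirroring the curl command above."""
    body = json.dumps({"code": code, "input_data": input_data}).encode()
    return urllib.request.Request(
        f"{host}/python/test",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_test_request("def transform(df): return df", [{"a": 1}])
# urllib.request.urlopen(req)  # uncomment when the server is up
```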

Test Pipeline Execution

curl -X POST http://dataflow-server:8000/pipelines/run \
  -H "Content-Type: application/json" \
  -d '{"pipeline_id": "uuid"}'

CI/CD Integration

# .github/workflows/test.yml
name: Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run tests
        run: docker compose up --build test

Troubleshooting

Tests timeout

  • Increase the pytest timeout (the --timeout flag is provided by the pytest-timeout plugin): pytest --timeout=300
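The timeout can also be set once in configuration instead of on every invocation. A sketch, assuming the pytest-timeout plugin is installed in the container:

```ini
# pytest.ini -- requires the pytest-timeout plugin
[pytest]
timeout = 300
```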

Dask tests fail

  • Check Dask can initialize: pytest tests/test_distributed_dask.py::test_dask_client_initialization

Import errors

  • Ensure all dependencies are in requirements.txt
  • Rebuild container: make dev-fresh