# Environment Setup

## Overview

Environment variables in Sensyze Dataflow are organized into three files for different environments:
| File | Purpose | Usage |
|---|---|---|
| `.env` | Common/base configuration | Shared by all environments. Loaded first. |
| `.env.backend` | Development overrides | Local development settings with relaxed timeouts. |
| `.env.backend.prod` | Production overrides | Production settings with strict timeouts and optimizations. |
## File Organization

### Load Order

```shell
# Development
source .env && source .env.backend

# Production
source .env && source .env.backend.prod
```

The base `.env` is loaded first, then environment-specific files override its values.
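The override semantics can be sketched with a small stdlib helper (a hypothetical illustration; the project itself just sources the files in the shell, as shown above):

```python
import os

def load_env_files(*paths: str) -> dict:
    """Parse KEY=VALUE files in order; later files override earlier ones."""
    merged: dict[str, str] = {}
    for path in paths:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                # Skip blanks and comments; keep only KEY=VALUE lines.
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    merged[key.strip()] = value.strip()
    os.environ.update(merged)
    return merged
```

Calling `load_env_files(".env", ".env.backend")` gives development values precedence, exactly like the two `source` commands.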
### What Goes Where

#### `.env` (Base/Common)
- Database credentials (Supabase URLs, keys)
- Third-party API keys (Gemini, GitHub, etc.)
- Service discovery endpoints (Redis, Temporal, etc.)
- Feature flags and thresholds
- Base timeout values
- Frontend variables (`NEXT_PUBLIC_*`)
#### `.env.backend` (Development Only)
- DEBUG logging levels
- Local service hosts (localhost)
- Relaxed timeouts (60s+ for development)
- Development URLs
- Extended pipeline timeouts (2 hours)
#### `.env.backend.prod` (Production Only)
- Minimal logging (INFO/WARNING only)
- Production service hosts
- Strict timeouts (3-15 seconds)
- Production URLs
- SSL/TLS settings
- 30-minute pipeline timeout for production safety
## Setup Instructions

### 1. Clone Base Configuration

```shell
cp .env.example .env
```

Then fill in your Supabase and API credentials:

```shell
# Edit .env and set:
SUPABASE_DATABASE_URL=postgresql://...
SUPABASE_SERVICE_ROLE_KEY=...
GEMINI_API_KEY=...
# etc.
```
### 2. Development Setup

```shell
cp .env.backend.example .env.backend
```

The `.env.backend` file is already configured for local development:

- Points to `localhost` services
- Uses DEBUG logging
- Has relaxed timeouts

Run with:

```shell
cd dataflow-server
source ../.env && source ../.env.backend
python -m api  # or uvicorn api:app --reload
```

Or in Docker:

```shell
docker compose up --build
```
### 3. Production Deployment

The `.env.backend.prod` file is configured for production:

- Strict timeouts prevent resource exhaustion
- Minimal logging for performance
- Points to production infrastructure
## Critical Environment Variables by Service

### Temporal (Workflow Orchestration)

```shell
TEMPORAL_HOST=temporal:7233              # Server address
TEMPORAL_NAMESPACE=default               # Workflow namespace
TEMPORAL_TASK_QUEUE=dataflow-task-queue  # Task queue name
TEMPORAL_CONNECT_TIMEOUT=30              # Connection timeout (sec)
```
### Dask (Distributed Computing)

```shell
DASK_SCHEDULER_ADDRESS=                  # Leave empty for LocalCluster
DATA_PROCESSING_DASK_THRESHOLD=10000     # Rows threshold for Dask
DASK_CONNECT_TIMEOUT=10                  # Connection timeout (sec)
```
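The row threshold decides whether a dataset is routed to Dask or processed locally. A hypothetical sketch of that gate (the actual routing logic lives in the pipeline code):

```python
import os

def should_use_dask(row_count: int) -> bool:
    """Route datasets at or above the configured threshold to Dask;
    keep smaller ones on the local executor."""
    threshold = int(os.getenv("DATA_PROCESSING_DASK_THRESHOLD", "10000"))
    return row_count >= threshold
```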
### Database (Supabase/PostgreSQL)

```shell
SUPABASE_DATABASE_URL=postgresql://...   # Connection string
DB_INIT_TIMEOUT=15                       # Init timeout (sec)
DB_WRITE_TIMEOUT=5.0                     # Write timeout (sec)
```
### Redis (Cache)

```shell
REDIS_HOST=redis   # Service host
REDIS_PORT=6379    # Redis port
```
### Logging

```shell
LOG_LEVEL=INFO                   # Global level
LOG_LEVEL_API=INFO               # API-specific
LOG_LEVEL_PIPELINE_RUNNER=INFO   # Pipeline-specific
```
## Timeouts Summary

| Variable | Development | Production | Purpose |
|---|---|---|---|
| TEMPORAL_CONNECT_TIMEOUT | 60s | 15s | Connect to Temporal |
| DB_INIT_TIMEOUT | 30s | 10s | DB connection init |
| DB_WRITE_TIMEOUT | 10s | 3s | Database writes |
| SUPABASE_INIT_TIMEOUT | 20s | 8s | Supabase client init |
| SUPABASE_POLL_TIMEOUT | 20s | 8s | Supabase queries |
| LOGGER_WORKER_TIMEOUT | 60s | 15s | Log flush operations |
| PIPELINE_EXECUTION_TIMEOUT | 2h | 30m | Pipeline runtime |
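Since these values arrive as strings from the environment, a defensive reader helps avoid crashes on a missing or malformed value. A hypothetical helper (variable names follow the table above):

```python
import os

def get_timeout(name: str, default: float) -> float:
    """Read a timeout env var as a float, falling back to the
    default when the variable is unset or not a valid number."""
    raw = os.getenv(name)
    try:
        return float(raw)
    except (TypeError, ValueError):  # TypeError covers raw is None
        return default
```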
## Common Tasks

### Add a New Environment Variable

- Add to `.env` if it's shared (database URL, API key, etc.)
- Add to `.env.example` for documentation
- Update local `.env.backend` if development-specific
- Update local `.env.backend.prod` if production-specific
### Enable Debug Mode

Add to `.env.backend`:

```shell
LOG_LEVEL=DEBUG
EXECUTION_MODE=debug
```
### Validate Environment

```shell
python -c "
import os
required = ['SUPABASE_DATABASE_URL', 'TEMPORAL_HOST', 'REDIS_HOST']
for var in required:
    value = os.getenv(var)
    status = '✓' if value else '✗'
    print(f'{status} {var}: {value[:50] if value else \"MISSING\"}')
"
```
## Secrets Management Best Practices

### Never Commit Secrets

These files should be in `.gitignore`:

- `.env` (if it contains real credentials)
- `.env.backend` (if it contains real credentials)
- `.env.backend.prod` (ALWAYS - production secrets)
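A quick sanity check that the secret files are listed can be scripted. This is a naive exact-match sketch (hypothetical helper; real gitignore matching also handles globs, directory patterns, and negations):

```python
from pathlib import Path

SECRET_FILES = [".env", ".env.backend", ".env.backend.prod"]

def missing_from_gitignore(gitignore_path: str = ".gitignore") -> list[str]:
    """Return the secret files that have no exact-match line in .gitignore."""
    patterns: set[str] = set()
    path = Path(gitignore_path)
    if path.exists():
        patterns = {line.strip() for line in path.read_text().splitlines()}
    return [f for f in SECRET_FILES if f not in patterns]
```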
### Production Secrets Injection

Kubernetes:

```yaml
- name: SUPABASE_DATABASE_URL
  valueFrom:
    secretKeyRef:
      name: sensyze-secrets
      key: supabase-url
```
Docker/Compose (note `--query SecretString --output text`, which extracts the raw secret rather than the full JSON response):

```shell
docker run \
  --env-file .env \
  -e SUPABASE_DATABASE_URL="$(aws secretsmanager get-secret-value --secret-id supabase-url --query SecretString --output text)" \
  sensyze-dataflow
```