Environment Setup

Overview

Environment variables in Sensyze Dataflow are split across three files, one per environment:

File               Purpose                    Usage
.env               Common/base configuration  Shared by all environments; loaded first.
.env.backend       Development overrides      Local development settings with relaxed timeouts.
.env.backend.prod  Production overrides       Production settings with strict timeouts and optimizations.

File Organization

Load Order

# Development
source .env && source .env.backend

# Production
source .env && source .env.backend.prod

The base .env is loaded first, then environment-specific files override values.
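The override behavior can be sketched in Python (a minimal illustration only; `load_env` is a hypothetical helper, not part of the codebase):

```python
def load_env(*paths):
    """Merge KEY=VALUE files in order; later files override earlier keys."""
    merged = {}
    for path in paths:
        with open(path) as f:
            for line in f:
                line = line.strip()
                # Skip blanks and comments; split on the first '=' only
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    merged[key.strip()] = value.strip()
    return merged
```

For development, `load_env(".env", ".env.backend")` mirrors the two `source` commands above: any key present in both files takes its value from `.env.backend`.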

What Goes Where

.env (Base/Common)

  • Database credentials (Supabase URLs, keys)
  • Third-party API keys (Gemini, GitHub, etc.)
  • Service discovery endpoints (Redis, Temporal, etc.)
  • Feature flags and thresholds
  • Base timeout values
  • Frontend variables (NEXT_PUBLIC_*)

.env.backend (Development Only)

  • DEBUG logging levels
  • Local service hosts (localhost)
  • Relaxed timeouts (60s+ for development)
  • Development URLs
  • Extended pipeline timeouts (2 hours)

.env.backend.prod (Production Only)

  • Minimal logging (INFO/WARNING only)
  • Production service hosts
  • Strict timeouts (3-15 seconds)
  • Production URLs
  • SSL/TLS settings
  • 30-minute pipeline timeout for production safety
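As a concrete illustration of the split, a single variable such as DB_WRITE_TIMEOUT might appear in all three files, with the environment-specific files overriding the base (values here match the timeouts table below):

```
# .env (base)
DB_WRITE_TIMEOUT=5.0

# .env.backend (development override)
DB_WRITE_TIMEOUT=10

# .env.backend.prod (production override)
DB_WRITE_TIMEOUT=3
```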

Setup Instructions

1. Clone Base Configuration

cp .env.example .env

Then fill in your Supabase and API credentials:

# Edit .env and set:
SUPABASE_DATABASE_URL=postgresql://...
SUPABASE_SERVICE_ROLE_KEY=...
GEMINI_API_KEY=...
# etc.

2. Development Setup

cp .env.backend.example .env.backend

The .env.backend file is already configured for local development:

  • Points to localhost services
  • Uses DEBUG logging
  • Has relaxed timeouts

Run with:

cd dataflow-server
source ../.env && source ../.env.backend
python -m api # or uvicorn api:app --reload

Or in Docker:

docker compose up --build

3. Production Deployment

The .env.backend.prod file is configured for production:

  • Strict timeouts prevent resource exhaustion
  • Minimal logging for performance
  • Points to production infrastructure

Critical Environment Variables by Service

Temporal (Workflow Orchestration)

TEMPORAL_HOST=temporal:7233              # Server address
TEMPORAL_NAMESPACE=default               # Workflow namespace
TEMPORAL_TASK_QUEUE=dataflow-task-queue  # Task queue name
TEMPORAL_CONNECT_TIMEOUT=30              # Connection timeout (sec)
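The connect timeout can be enforced in application code with `asyncio.wait_for` (a minimal sketch; the coroutine passed in stands for whichever Temporal client connect call the codebase actually uses):

```python
import asyncio
import os

async def connect_with_timeout(connect_coro):
    """Await a client-connect coroutine, bounded by TEMPORAL_CONNECT_TIMEOUT."""
    timeout = float(os.getenv("TEMPORAL_CONNECT_TIMEOUT", "30"))
    return await asyncio.wait_for(connect_coro, timeout=timeout)
```

If the connection does not complete within the configured window, `asyncio.wait_for` raises `TimeoutError`, which the caller can surface as a startup failure.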

Dask (Distributed Computing)

DASK_SCHEDULER_ADDRESS=                  # Leave empty for LocalCluster
DATA_PROCESSING_DASK_THRESHOLD=10000     # Row count threshold for using Dask
DASK_CONNECT_TIMEOUT=10                  # Connection timeout (sec)
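The row threshold drives a simple dispatch decision, which might look like this (illustrative sketch; the function name is hypothetical):

```python
import os

def should_use_dask(row_count: int) -> bool:
    """Route datasets to Dask once they reach the configured row threshold."""
    threshold = int(os.getenv("DATA_PROCESSING_DASK_THRESHOLD", "10000"))
    return row_count >= threshold
```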

Database (Supabase/PostgreSQL)

SUPABASE_DATABASE_URL=postgresql://...   # Connection string
DB_INIT_TIMEOUT=15                       # Init timeout (sec)
DB_WRITE_TIMEOUT=5.0                     # Write timeout (sec)

Redis (Cache)

REDIS_HOST=redis                         # Service host
REDIS_PORT=6379                          # Redis port

Logging

LOG_LEVEL=INFO                           # Global level
LOG_LEVEL_API=INFO                       # API-specific
LOG_LEVEL_PIPELINE_RUNNER=INFO           # Pipeline-specific
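Per-component variables like these typically fall back to the global LOG_LEVEL when unset. One way to resolve them (a sketch, assuming this fallback convention; not the actual implementation):

```python
import os

def resolve_log_level(component=None):
    """Return the component-specific level if set, else the global LOG_LEVEL."""
    if component:
        specific = os.getenv(f"LOG_LEVEL_{component.upper()}")
        if specific:
            return specific
    return os.getenv("LOG_LEVEL", "INFO")
```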

Timeouts Summary

Variable                    Development  Production  Purpose
TEMPORAL_CONNECT_TIMEOUT    60s          15s         Connect to Temporal
DB_INIT_TIMEOUT             30s          10s         DB connection init
DB_WRITE_TIMEOUT            10s          3s          Database writes
SUPABASE_INIT_TIMEOUT       20s          8s          Supabase client init
SUPABASE_POLL_TIMEOUT       20s          8s          Supabase queries
LOGGER_WORKER_TIMEOUT       60s          15s         Log flush operations
PIPELINE_EXECUTION_TIMEOUT  2h           30m         Pipeline runtime
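A defensive way to read timeout values like these from the environment (illustrative sketch; falls back to the supplied default when the variable is missing or malformed):

```python
import os

def get_timeout(name, default):
    """Read a timeout env var as float seconds, falling back to a default."""
    raw = os.getenv(name)
    try:
        return float(raw) if raw else default
    except ValueError:
        # Malformed value (e.g. "abc"): prefer a safe default over crashing
        return default
```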

Common Tasks

Add a New Environment Variable

  1. Add to .env if it's shared (database URL, API key, etc.)
  2. Add to .env.example for documentation
  3. Update local .env.backend if development-specific
  4. Update local .env.backend.prod if production-specific

Enable Debug Mode

Add to .env.backend:

LOG_LEVEL=DEBUG
EXECUTION_MODE=debug

Validate Environment

python -c "
import os
required = ['SUPABASE_DATABASE_URL', 'TEMPORAL_HOST', 'REDIS_HOST']
for var in required:
    value = os.getenv(var)
    status = '✓' if value else '✗'
    print(f'{status} {var}: {value[:50] if value else \"MISSING\"}')
"

Secrets Management Best Practices

Never Commit Secrets

These files should be in .gitignore:

  • .env (if it contains real credentials)
  • .env.backend (if it contains real credentials)
  • .env.backend.prod (ALWAYS - production secrets)

Production Secrets Injection

Kubernetes:

- name: SUPABASE_DATABASE_URL
  valueFrom:
    secretKeyRef:
      name: sensyze-secrets
      key: supabase-url

Docker/Compose:

docker run \
  --env-file .env \
  -e SUPABASE_DATABASE_URL="$(aws secretsmanager get-secret-value --secret-id supabase-url --query SecretString --output text)" \
  sensyze-dataflow