Introduction

Sensyze Dataflow is a visual ETL/ELT platform for building, running, and observing data pipelines. It combines a React Flow-based mapper with a FastAPI backend, Temporal workflows, and a hybrid Pandas/Dask execution engine. Supabase provides auth, storage, and the primary database.

What It Does

  • Visual pipeline builder for sources, transforms, and destinations
  • Temporal-orchestrated execution with parallel layers
  • Hybrid data processing via DataFrameAdapter (Pandas for small data, Dask for large)
  • Observability with per-node logs, samples, and metrics
  • Usage and billing controls (bi-weekly minutes, AI ops limit, Stripe credits)

Architecture

graph TD
User((User)) --> FE[Next.js Frontend]
User --> MKT[Marketing Site]

FE <--> API[FastAPI API]
API <--> Supabase["Supabase Postgres + Auth"]
API <--> Storage["Supabase Storage / Local Storage"]

API <--> Temporal[Temporal Server]
Temporal <--> Worker[Temporal Worker]
Worker <--> Runner[PipelineRunner]
Worker <--> Redis[Redis Cache]

Runner --> Adapter[DataFrameAdapter]
Adapter --> Pandas[Pandas]
Adapter --> Dask[Dask Cluster]
Runner --> DuckDB["DuckDB (staging)"]

Worker --> Obs[Observability Logger]
Obs --> SQLite["Observability SQLite"]

Performance Strategy

  • < 10,000 rows: Pandas for low overhead
  • >= 10,000 rows: Dask for parallel execution
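
The row-count rule above can be sketched as a small selection function. This is only an illustration of the threshold logic, not the actual DataFrameAdapter code: the function name `choose_engine` and the constant `ROW_THRESHOLD` are hypothetical, and the sketch returns an engine name rather than constructing a Pandas or Dask frame.

```python
# Assumed cutoff, taken from the rule above; the real adapter may read it
# from configuration instead of a constant.
ROW_THRESHOLD = 10_000


def choose_engine(row_count: int, threshold: int = ROW_THRESHOLD) -> str:
    """Return "pandas" below the threshold and "dask" at or above it."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    return "pandas" if row_count < threshold else "dask"


print(choose_engine(2_500))    # pandas
print(choose_engine(10_000))   # dask
```

Keeping the decision in one function means the cutoff can be tuned in a single place if profiling suggests a different break-even point.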

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Node.js 18+
  • Python 3.12+

Configure Environment

cp .env.frontend.example .env.frontend
cp .env.backend.example .env.backend

Fill in Supabase, Stripe, and other service keys as needed.
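
Before starting the stack, it can help to confirm that no required key was left blank. The sketch below is a generic check, assuming the env files use plain `KEY=VALUE` lines; the helper name `blank_keys` and the `EXAMPLE_*` key names are hypothetical, and the real key names come from the `.example` files.

```python
def blank_keys(env_text: str) -> list[str]:
    """Return the names of KEY=VALUE entries whose value is empty."""
    missing = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat empty quotes ('' or "") the same as no value at all.
        if not value.strip().strip("'\""):
            missing.append(key.strip())
    return missing


# Hypothetical file contents, used only to show the behavior.
sample = "EXAMPLE_URL=http://localhost:8000\nEXAMPLE_SECRET=\n# comment\nEXAMPLE_TOKEN=''\n"
print(blank_keys(sample))  # ['EXAMPLE_SECRET', 'EXAMPLE_TOKEN']
```

To check a real file, pass its contents, for example `blank_keys(open(".env.backend").read())`.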

Then start the development stack:

make dev

Common commands:

make dev
make dev-fresh
make dev-clean
make status

Local URLs

  • App: http://localhost:3000
  • API Docs: http://localhost:8000/docs
  • Temporal UI: http://localhost:8080
  • Dask Dashboard: http://localhost:8787
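
A quick way to confirm that the services listed above are up is to probe each URL. The following is a minimal sketch using only the standard library; the helper names `is_reachable` and `check_services` are hypothetical and not part of the repository.

```python
import http.client
import urllib.error
import urllib.request

# URLs copied from the list above.
LOCAL_SERVICES = {
    "App": "http://localhost:3000",
    "API Docs": "http://localhost:8000/docs",
    "Temporal UI": "http://localhost:8080",
    "Dask Dashboard": "http://localhost:8787",
}


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a server answers at all, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a 2xx status
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # refused, timed out, or otherwise unreachable


def check_services(services: dict[str, str]) -> dict[str, bool]:
    """Map each service name to whether its URL is currently reachable."""
    return {name: is_reachable(url) for name, url in services.items()}
```

Calling `check_services(LOCAL_SERVICES)` after the stack has started returns one boolean per service, which makes it easy to spot a container that failed to come up.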

Repository Layout

dataflow-server/      # FastAPI, Temporal worker, pipeline engine
frontend/             # Next.js app (mapper, jobs, accounts)
marketing/            # Static marketing site
supabase/             # Supabase config
supabase_migrations/  # Database migrations
diagrams/             # Mermaid sequence diagrams
docs/                 # Architecture, ops, and product docs

Tech Stack

Frontend (User Interface)

  • Framework: Next.js 14 (App Router)
  • Visual Engine: React Flow (DAG visualization and manipulation)
  • Styling: Tailwind CSS (Premium Dark Mode aesthetics)
  • State Management: Zustand (Persisted drafts and UI state)

Backend (Data Flow Server)

  • Framework: FastAPI (Python 3.12)
  • Execution Engine: Custom PipelineRunner supporting topological execution and async I/O.
  • Compute Strategy:
    • Single Node: Pandas for smaller datasets (< 10k rows).
    • Distributed: Dask for large-scale processing (>= 10k rows).
  • Embedded Database: DuckDB (Staging intermediate results and SQL transformations).
  • Scheduling: Temporal + Redis (Cache).
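
Topological execution with parallel layers can be illustrated with the standard library's `graphlib`. This is a sketch of the general technique, not the PipelineRunner implementation: the function `execution_layers` and the node names are hypothetical. Each layer contains nodes whose dependencies are already complete, so the nodes within one layer could run concurrently.

```python
from graphlib import TopologicalSorter


def execution_layers(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into layers; each node maps to the set of nodes it depends on."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    layers = []
    while sorter.is_active():
        # Nodes whose dependencies have all been marked done.
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


# Hypothetical pipeline: two sources feed a join, which feeds one destination.
pipeline = {
    "join": {"source_a", "source_b"},
    "destination": {"join"},
}
print(execution_layers(pipeline))
# [['source_a', 'source_b'], ['join'], ['destination']]
```

Sorting each layer only makes the output deterministic; the order of nodes inside a layer carries no meaning.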

Integrations

  • Storage/DB: Supabase (PostgreSQL for metadata, Auth for users).
  • Connectors: REST, SQL, CSV/JSON, databases (PostgreSQL, MySQL, MongoDB, Snowflake, BigQuery), SaaS (Salesforce, Stripe, HubSpot).