dbt sources.yml & staging model generator
Drop a Parquet, CSV, JSON, Excel or Avro file to turn its schema into a ready-to-paste dbt scaffold: a sources.yml, a stg_ staging model, and a schema.yml. The schema is inferred in your browser with DuckDB-WASM. Only column names and types are used, and nothing is uploaded.
Supported formats: Parquet, Avro, CSV, TSV, JSON, JSONL, Excel. Files are read locally to infer the schema and are never uploaded.
Frequently asked questions
- What does this generate?
- From a file's inferred schema it produces a dbt sources.yml (the source definition), a staging model (models/staging/stg_<table>.sql that selects from that source), and a schema.yml models block documenting each column with its data type — plus the underlying CREATE TABLE for reference.
- What's the difference between sources.yml and schema.yml?
- sources.yml declares a raw source table so models can reference it with {{ source('source_name', 'table_name') }}. The first argument is the source's name as declared in sources.yml, which is not necessarily the warehouse schema name. schema.yml (a models block) documents a model's columns and is where you add tests and descriptions. This tool emits both, wired together: the staging model reads from the source declared in sources.yml.
- Does it work with dbt Core and dbt Cloud?
- Yes — the output is plain dbt YAML and SQL that drop straight into a models/ directory in any dbt project (Core or Cloud). Adjust the source database/schema placeholders in sources.yml to match your warehouse.
- How are the column types determined?
- The schema is inferred in your browser by having DuckDB-WASM read the file. Column names and their DuckDB types (for example VARCHAR or BIGINT) are written into the YAML data_type fields. Treat these as a starting point: type names differ between warehouses, so adjust them to match yours.
- Is my file uploaded?
- No. Everything runs locally via DuckDB-WASM. Only the column names and types are used to build the YAML and SQL — your data never leaves your device.
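
To make the answers above concrete, here is a minimal Python sketch of how three such files could be rendered from an inferred schema. It is not this tool's actual implementation. The function names, the example source name `raw`, the schema `raw_data`, the table `orders`, and the type overrides are all hypothetical, chosen only for illustration.

```python
from typing import Optional

# Hypothetical inferred schema: (column name, DuckDB type) pairs.
Column = tuple[str, str]


def render_sources_yml(source_name: str, schema: str, table: str) -> str:
    """Declare the raw table as a dbt source."""
    return (
        "version: 2\n"
        "\n"
        "sources:\n"
        f"  - name: {source_name}\n"
        f"    schema: {schema}  # placeholder: set to your warehouse schema\n"
        "    tables:\n"
        f"      - name: {table}\n"
    )


def render_staging_sql(source_name: str, table: str, columns: list[Column]) -> str:
    """Build the body of stg_<table>.sql, selecting each column from the source."""
    # Assumes column names are valid unquoted identifiers.
    select_list = ",\n".join(f"    {name}" for name, _ in columns)
    return (
        "select\n"
        f"{select_list}\n"
        # Doubled braces in the f-string emit literal Jinja delimiters.
        f"from {{{{ source('{source_name}', '{table}') }}}}\n"
    )


def render_schema_yml(
    table: str,
    columns: list[Column],
    type_map: Optional[dict[str, str]] = None,
) -> str:
    """Document the staging model's columns, optionally remapping type names."""
    type_map = type_map or {}
    lines = ["version: 2", "", "models:", f"  - name: stg_{table}", "    columns:"]
    for name, duck_type in columns:
        lines.append(f"      - name: {name}")
        # Fall back to the DuckDB type name when no override is given.
        lines.append(f"        data_type: {type_map.get(duck_type, duck_type)}")
    return "\n".join(lines) + "\n"


# Example input, standing in for what schema inference might return.
columns = [("order_id", "BIGINT"), ("customer", "VARCHAR"), ("ordered_at", "TIMESTAMP")]

sources_yml = render_sources_yml("raw", "raw_data", "orders")
staging_sql = render_staging_sql("raw", "orders", columns)
# Illustrative overrides only; check your warehouse's own type names.
schema_yml = render_schema_yml("orders", columns, {"VARCHAR": "STRING", "BIGINT": "INT64"})

print(sources_yml)
print(staging_sql)
print(schema_yml)
```

The first argument passed to source() in the staging model matches the name: entry in sources.yml, which is what wires the two files together. The optional type_map shows one way the inferred DuckDB type names could be refined per warehouse.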