# Config Reference — Overview
Every Ubunye task is driven by a single `config.yaml` file.
This page explains the top-level structure. Each section links to a dedicated reference page.
## Top-level keys

```yaml
MODEL: etl            # required — job type: etl | ml
VERSION: "1.0.0"      # required — semver string
ENGINE: ...           # optional — Spark settings and per-profile overrides
CONFIG: ...           # required — inputs, transform, outputs
ORCHESTRATION: ...    # optional — Airflow / Databricks / Prefect / Dagster metadata
```
| Key | Type | Required | Description |
|---|---|---|---|
| `MODEL` | `etl` \| `ml` | Yes | Declares the job type |
| `VERSION` | semver string | Yes | Pipeline version (MAJOR.MINOR.PATCH) |
| `ENGINE` | `EngineConfig` | No | Spark conf + per-profile overrides |
| `CONFIG` | `TaskConfig` | Yes | Inputs, transform, outputs |
| `ORCHESTRATION` | `OrchestrationConfig` | No | Export metadata for orchestrators |
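Putting the required keys together, a minimal complete config might look like the sketch below. The connector fields mirror the Jinja example later on this page; the `noop` transform and the output connector's `format` and `path` fields are illustrative assumptions — see the linked reference pages for the authoritative schema.

```yaml
MODEL: etl
VERSION: "1.0.0"
CONFIG:
  inputs:
    events:
      format: hive
      db_name: raw
      tbl_name: events
  transform:
    type: noop          # pass-through; see the Transform reference for other types
  outputs:
    curated:
      format: parquet   # illustrative output connector
      path: s3://my-bucket/curated/events
```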
## Jinja templating
Config values are rendered through Jinja2 before Pydantic validation. You can use environment variables, CLI-injected variables, and filters anywhere in the YAML:
```yaml
CONFIG:
  inputs:
    events:
      format: hive
      db_name: "{{ env.HIVE_DB | default('raw') }}"
      tbl_name: events_{{ dt | default('2024-01-01') | replace('-', '_') }}
```
See Jinja Templating for all supported syntax.
## Validation

Ubunye validates the rendered YAML against strict Pydantic v2 models, so schema errors surface before the job runs. Run validation before deploying.
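The render-then-validate flow can be sketched in a few lines. The model name `TopLevel` and its fields are illustrative assumptions, not Ubunye's real schema; the point is the order of operations — Jinja2 renders the raw YAML first, then Pydantic v2 validates the parsed result.

```python
# Sketch: render config through Jinja2, then validate with Pydantic v2.
# TopLevel is a stand-in model, not Ubunye's actual schema.
from typing import Literal

import yaml
from jinja2 import Environment
from pydantic import BaseModel

RAW = """
MODEL: etl
VERSION: "1.0.0"
CONFIG:
  inputs:
    events:
      format: hive
      db_name: "{{ env.HIVE_DB | default('raw') }}"
"""


class TopLevel(BaseModel):
    MODEL: Literal["etl", "ml"]   # rejects anything but the two job types
    VERSION: str
    CONFIG: dict


# Step 1: Jinja rendering — env.HIVE_DB is unset, so default('raw') applies.
rendered = Environment().from_string(RAW).render(env={})

# Step 2: strict Pydantic validation of the parsed YAML.
cfg = TopLevel.model_validate(yaml.safe_load(rendered))
print(cfg.CONFIG["inputs"]["events"]["db_name"])  # raw
```

A typo such as `MODEL: elt` would raise a `ValidationError` here rather than failing later at runtime.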
## Reference pages
| Section | Description |
|---|---|
| Inputs & Outputs | CONFIG.inputs and CONFIG.outputs — connector format and options |
| Engine & Profiles | ENGINE — Spark conf and dev/staging/prod profiles |
| Transform | CONFIG.transform — noop, task, model, and custom types |
| Orchestration | ORCHESTRATION — schedule, retries, tags, platform-specific settings |
| Jinja Templating | Variable interpolation, filters, and best practices |