Config Reference — Overview
Every Ubunye task is driven by a single config.yaml file.
This page explains the top-level structure. Each section links to a dedicated reference page.
Top-level keys
MODEL: etl # optional — job type: etl | ml (defaults to etl)
VERSION: "1.0.0" # optional — semver string (defaults to "0.0.0-dev")
ENGINE: ... # optional — Spark settings and per-profile overrides
CONFIG: ... # required — inputs, transform, outputs
ORCHESTRATION: ... # optional — Airflow / Databricks / Prefect / Dagster metadata
| Key | Type | Required | Default | Description |
|---|---|---|---|---|
| MODEL | etl \| ml | No | etl | Declares the job type |
| VERSION | semver string | No | "0.0.0-dev" | Pipeline version (MAJOR.MINOR.PATCH, optional -prerelease suffix) |
| ENGINE | EngineConfig | No | — | Spark conf + per-profile overrides |
| CONFIG | TaskConfig | Yes | — | Inputs, transform, outputs |
| ORCHESTRATION | OrchestrationConfig | No | — | Export metadata for orchestrators |
Set MODEL and VERSION explicitly in production pipelines where job type or
version is load-bearing (lineage records, model registry, orchestrator
metadata). For quick local iteration the defaults are fine.
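As a rough illustration of how the defaults in the table could be applied, the sketch below fills in MODEL and VERSION when they are absent and checks the version string. The function name `with_top_level_defaults` and the regular expression are assumptions made for this example; Ubunye's own loader may behave differently.

```python
import re

# Assumed pattern: MAJOR.MINOR.PATCH with an optional -prerelease suffix.
SEMVER = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")

# Defaults taken from the top-level keys table.
DEFAULTS = {"MODEL": "etl", "VERSION": "0.0.0-dev"}


def with_top_level_defaults(raw: dict) -> dict:
    """Return a copy of a parsed config with MODEL and VERSION filled in."""
    if "CONFIG" not in raw:
        raise ValueError("CONFIG is required")
    cfg = {**DEFAULTS, **raw}  # explicit values win over defaults
    if cfg["MODEL"] not in ("etl", "ml"):
        raise ValueError(f"MODEL must be 'etl' or 'ml', got {cfg['MODEL']!r}")
    if not SEMVER.match(cfg["VERSION"]):
        raise ValueError(f"VERSION is not a semver string: {cfg['VERSION']!r}")
    return cfg


minimal = with_top_level_defaults({"CONFIG": {"inputs": {}, "outputs": {}}})
print(minimal["MODEL"], minimal["VERSION"])  # etl 0.0.0-dev
```

A config that sets only CONFIG therefore runs as an etl job at version 0.0.0-dev, which is why pinning both keys matters once lineage or registry records depend on them.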
Jinja templating
Config values are rendered through Jinja2 before Pydantic validation. You can use environment variables, CLI-injected variables, and filters anywhere in the YAML:
CONFIG:
  inputs:
    events:
      format: hive
      db_name: "{{ env.HIVE_DB | default('raw') }}"
      tbl_name: events_{{ dt | default('2024-01-01') | replace('-', '_') }}
See Jinja Templating for all supported syntax.
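To see what the two templated values above resolve to, the following sketch renders them directly with the Jinja2 library. It assumes that environment variables are exposed to templates as a mapping named `env` and that CLI variables such as `dt` are passed as plain keyword arguments; the helper `render_value` is hypothetical and only stands in for whatever Ubunye does internally.

```python
from jinja2 import Environment, StrictUndefined

# StrictUndefined mirrors the strict behaviour described on this page.
jinja_env = Environment(undefined=StrictUndefined)

DB_NAME = "{{ env.HIVE_DB | default('raw') }}"
TBL_NAME = "events_{{ dt | default('2024-01-01') | replace('-', '_') }}"


def render_value(template: str, **variables) -> str:
    """Render one config value with the given template variables."""
    return jinja_env.from_string(template).render(**variables)


# HIVE_DB is not set, so the default filter supplies 'raw'.
print(render_value(DB_NAME, env={}))                     # raw
print(render_value(DB_NAME, env={"HIVE_DB": "curated"}))  # curated

# dt is supplied, then dashes are replaced with underscores.
print(render_value(TBL_NAME, dt="2024-05-17"))  # events_2024_05_17
print(render_value(TBL_NAME))                   # events_2024_01_01
```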
Validation
Ubunye validates the rendered YAML against strict Pydantic v2 models. Unknown fields are rejected with typo suggestions; undefined Jinja variables fail immediately. Run validation before deploying.
Strict field checking
All config sections use extra="forbid" — a typo like ENGNE triggers a
ConfigFieldError with a "Did you mean?" suggestion. The one exception is
IOConfig (inputs/outputs entries), which allows extra fields so that
connector plugins can read plugin-specific keys like headers, pagination,
auth, etc.
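The "Did you mean?" behaviour can be approximated with Python's standard `difflib` module. The sketch below is only an illustration of the idea: the helper names `suggest_key` and `check_keys` are hypothetical, and it returns plain messages rather than raising Ubunye's ConfigFieldError.

```python
import difflib
from typing import List, Optional

# Top-level keys listed on this page.
TOP_LEVEL_KEYS = ["MODEL", "VERSION", "ENGINE", "CONFIG", "ORCHESTRATION"]


def suggest_key(unknown: str) -> Optional[str]:
    """Return the closest known key, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(unknown, TOP_LEVEL_KEYS, n=1)
    return matches[0] if matches else None


def check_keys(raw: dict) -> List[str]:
    """Return one message per unknown top-level key (extra fields forbidden)."""
    messages = []
    for key in raw:
        if key not in TOP_LEVEL_KEYS:
            hint = suggest_key(key)
            suffix = f" Did you mean {hint!r}?" if hint else ""
            messages.append(f"Unknown field {key!r}.{suffix}")
    return messages


print(check_keys({"ENGNE": {}, "CONFIG": {}}))
# ["Unknown field 'ENGNE'. Did you mean 'ENGINE'?"]
```

A section that allows extra fields, as IOConfig does, would simply skip this check for its own keys and pass them through to the connector plugin.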
Strict template variables
The Jinja resolver uses StrictUndefined — any {{ var }} that isn't
provided as a CLI variable or environment variable raises immediately instead
of silently producing an empty string. Use | default(...) for optional
values.
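The difference between strict and lenient undefined handling is easy to reproduce with Jinja2 itself. This is a minimal sketch using the library directly; the helper `try_render` is hypothetical, and the way Ubunye surfaces the error may differ.

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

lenient_env = Environment()                          # Jinja2's default behaviour
strict_env = Environment(undefined=StrictUndefined)  # behaviour described above


def try_render(env: Environment, template: str, **variables) -> str:
    """Render a template, returning an 'error: ...' string if a variable is missing."""
    try:
        return env.from_string(template).render(**variables)
    except UndefinedError as exc:
        return f"error: {exc}"


# Lenient mode silently renders a missing variable as an empty string.
print(repr(try_render(lenient_env, "{{ dt }}")))  # ''

# Strict mode raises, so the mistake is caught before the job runs.
print(try_render(strict_env, "{{ dt }}"))

# The default filter makes the variable optional even in strict mode.
print(try_render(strict_env, "{{ dt | default('2024-01-01') }}"))  # 2024-01-01
```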
Reference pages
| Section | Description |
|---|---|
| Inputs & Outputs | CONFIG.inputs and CONFIG.outputs — connector format and options |
| Engine & Profiles | ENGINE — Spark conf and dev/staging/prod profiles |
| Transform | CONFIG.transform — noop, task, model, and custom types |
| Orchestration | ORCHESTRATION — schedule, retries, tags, platform-specific settings |
| Jinja Templating | Variable interpolation, filters, and best practices |