Transform¶
CONFIG.transform declares what happens between reading inputs and writing outputs.
Structure¶
```yaml
CONFIG:
  transform:
    type: noop    # required: transform type name
    params: {}    # optional: passed to the transform plugin
```
Built-in transform types¶
noop — pass-through¶
The default transform. Passes all inputs straight to outputs with no modification. Useful for pure connector moves (e.g. Hive → Delta).
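Conceptually, a pass-through transform hands each named input to the output side unchanged. The following is a minimal sketch of that idea, not Ubunye's actual implementation: `noop_transform` is a hypothetical name, and a plain Python list stands in for a DataFrame.

```python
from typing import Any, Dict


def noop_transform(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Return every named input unchanged (hypothetical illustration)."""
    # Shallow copy: the mapping is new, but the values are the same objects.
    return dict(inputs)


frames = {"orders": ["row1", "row2"]}
result = noop_transform(frames)
```

Because nothing is modified, the values that reach the outputs are the same objects that were read from the inputs.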
task — Python Task class¶
Loads transformations.py from the task directory and calls the transform method of the Task subclass defined there.
transformations.py:
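```python
from ubunye.core.interfaces import Task


class MyTask(Task):
    def transform(self, sources: dict) -> dict:
        # Look up the input by its logical name (a key under CONFIG.inputs).
        df = sources["input_name"]
        # Drop rows with a null value, then deduplicate on id.
        cleaned = df.filter("value IS NOT NULL").dropDuplicates(["id"])
        # Key the result by its logical name (a key under CONFIG.outputs).
        return {"output_name": cleaned}
```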
The keys of sources must match the logical names declared under CONFIG.inputs,
and the keys of the returned dict must match those declared under CONFIG.outputs.
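A mismatch between declared and returned names is easy to introduce with a typo. The sketch below shows one way to catch it early in a unit test. It assumes the config has been loaded into a plain dict keyed by logical name; `find_name_mismatches` is a hypothetical helper, not part of Ubunye.

```python
from typing import Iterable, List


def find_name_mismatches(declared: Iterable[str], actual: Iterable[str]) -> List[str]:
    """Describe every difference between declared and actual names."""
    declared_set, actual_set = set(declared), set(actual)
    problems = [f"missing: {name}" for name in sorted(declared_set - actual_set)]
    problems += [f"undeclared: {name}" for name in sorted(actual_set - declared_set)]
    return problems


# Hypothetical config fragment; only the key names matter here.
config = {"inputs": {"input_name": {}}, "outputs": {"output_name": {}}}
returned = {"output_nmae": object()}  # typo on purpose

# Iterating a dict yields its keys, so both arguments are name collections.
problems = find_name_mismatches(config["outputs"], returned)
```

Here `problems` lists both the declared output that was never produced and the misspelled key that was returned in its place.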
model — ML model train/predict¶
Runs a UbunyeModel subclass for training or inference.
See Model Contract and Model Registry.
```yaml
transform:
  type: model
  params:
    action: train                        # train | predict
    model_class: "model.FraudRiskModel"  # module.ClassName (model.py in task dir)
    registry:
      store: ".ubunye/model_store"
      use_case: fraud_detection
      auto_version: true
      promote_to: staging
      promotion_gates:
        min_auc: 0.85
        min_f1: 0.80
```
ModelTransformParams fields¶
| Field | Type | Default | Description |
|---|---|---|---|
| action | train \| predict | required | Whether to train or score |
| model_class | string | required | module.ClassName of the UbunyeModel subclass |
| model_dir | string | null | Directory containing the model file; defaults to task dir |
| model_path | string | null | Path to saved artifact (used for predict without registry) |
| input_name | string | null | Key in inputs dict to use as training/scoring data |
| registry | RegistryConfig | null | Model registry settings |
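To make the required fields and defaults in the table above concrete, here is a hypothetical mirror of them as a dataclass. It is a sketch for illustration only: `ModelParamsSketch` and its validation rules are assumptions, and Ubunye's own ModelTransformParams may validate differently.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelParamsSketch:
    """Hypothetical mirror of the documented fields, not Ubunye's own class."""

    action: str                               # required: "train" or "predict"
    model_class: str                          # required: "module.ClassName"
    model_dir: Optional[str] = None           # defaults to the task dir
    model_path: Optional[str] = None          # saved artifact, for predict without registry
    input_name: Optional[str] = None          # key in the inputs dict
    registry: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        # Assumed checks, derived only from the Type column of the table.
        if self.action not in ("train", "predict"):
            raise ValueError(f"action must be 'train' or 'predict', got {self.action!r}")
        if "." not in self.model_class:
            raise ValueError("model_class must look like 'module.ClassName'")


params = ModelParamsSketch(action="train", model_class="model.FraudRiskModel")
```

Only action and model_class have to be supplied; every other field falls back to null (None in Python).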
RegistryConfig fields¶
| Field | Type | Default | Description |
|---|---|---|---|
| store | string | required | Filesystem path for the model store |
| use_case | string | "default" | Logical grouping for the model |
| version | string | null | Explicit version; auto-generated if null |
| auto_version | bool | true | Bump patch version automatically |
| promote_to | development \| staging \| production | null | Promote after registration |
| use_stage | development \| staging \| production | "production" | Stage to load from (predict only) |
| promotion_gates | dict | null | Metric thresholds that must pass before promotion |
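The promotion_gates idea can be sketched as a simple threshold check. This is an illustration under a stated assumption, not the registry's actual logic: a gate named `min_<metric>` is taken to pass when the metric is at least the threshold, and a missing metric counts as a failure. `failed_gates` is a hypothetical helper.

```python
from typing import Dict, List


def failed_gates(metrics: Dict[str, float], gates: Dict[str, float]) -> List[str]:
    """Return the names of the gates that the metrics do not satisfy.

    Assumption: a gate named ``min_<metric>`` passes when
    ``metrics[<metric>] >= threshold``.
    """
    failures = []
    for gate, threshold in gates.items():
        if not gate.startswith("min_"):
            raise ValueError(f"unsupported gate in this sketch: {gate}")
        metric = gate[len("min_"):]
        value = metrics.get(metric)
        # A metric that was never reported cannot clear its gate.
        if value is None or value < threshold:
            failures.append(gate)
    return failures


gates = {"min_auc": 0.85, "min_f1": 0.80}
metrics = {"auc": 0.91, "f1": 0.78}
blocked = failed_gates(metrics, gates)
```

With these numbers the AUC gate passes but the F1 gate does not, so under this reading the model would not be promoted.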
Custom transforms¶
Any class registered under the ubunye.transforms entry point group can be used as a type, referenced by the name it was registered under.
See Writing a Plugin.
Transform params and extra fields¶
TransformConfig.params is a Dict[str, Any], so you can pass arbitrary keys:
```yaml
transform:
  type: my_custom_transform
  params:
    window_days: 30
    feature_cols: [f1, f2, f3]
    threshold: 0.5
```
These are passed directly to your plugin's apply(inputs, cfg, backend) call as cfg.
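A plugin might read those params along the following lines. This is a sketch under several assumptions: `ThresholdFilter` is a hypothetical class, cfg is treated as a plain dict, lists of dicts stand in for DataFrames so the example runs without Spark, and backend is left unused. Only the apply(inputs, cfg, backend) signature comes from the text above; whether a base class or registration step is required is covered in Writing a Plugin, not here.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class ThresholdFilter:
    """Hypothetical plugin: keep rows whose mean feature value meets a threshold."""

    def apply(
        self, inputs: Dict[str, List[Row]], cfg: Dict[str, Any], backend: Any
    ) -> Dict[str, List[Row]]:
        feature_cols = cfg["feature_cols"]      # required in this sketch
        threshold = cfg.get("threshold", 0.5)   # optional, with a fallback
        outputs = {}
        for name, rows in inputs.items():
            outputs[name] = [
                row for row in rows
                if sum(row[c] for c in feature_cols) / len(feature_cols) >= threshold
            ]
        return outputs


cfg = {"window_days": 30, "feature_cols": ["f1", "f2", "f3"], "threshold": 0.5}
inputs = {"events": [
    {"id": 1, "f1": 0.9, "f2": 0.8, "f3": 0.7},
    {"id": 2, "f1": 0.1, "f2": 0.2, "f3": 0.3},
]}
result = ThresholdFilter().apply(inputs, cfg, backend=None)
```

Reading required keys with `cfg[...]` and optional ones with `cfg.get(...)` makes a missing required param fail loudly while still allowing defaults. Keys the plugin does not use, such as window_days here, are simply ignored.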