In many domains, we have to run thousands of experiments to find plausible candidate models.
Model (M)
Input
Dataset {Xᵢ, Yᵢ}
Objective function J(f) to evaluate model performance
Constraints: data scientist time, accuracy requirements, etc.
Output
A trained model in the form y = f(x)
We can describe this in the form y = f(x; α)
where the set α = [α₀, α₁, α₂, …, αₙ] contains the parameters of the model
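As a minimal illustration of this input/output view (assuming scikit-learn and a synthetic dataset as hypothetical stand-ins for {Xᵢ, Yᵢ}), a trained model f(x; α) and its objective score J(f) might look like the sketch below; it is not a prescribed implementation.

```python
# Minimal sketch: a trained model y = f(x; alpha), where alpha holds the learned
# parameters and J(f) is taken here to be cross-validated accuracy.
# The dataset and the choice of estimator are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))              # {Xi}: 200 samples, 5 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # {Yi}: synthetic binary labels

model = LogisticRegression().fit(X, y)     # y = f(x; alpha)
alpha = np.concatenate([model.coef_.ravel(), model.intercept_])  # alpha = [a0, a1, ..., an]
J = cross_val_score(LogisticRegression(), X, y, cv=5).mean()     # objective J(f)
```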
Processing
Consider a vector θ. It includes all possible operations on data (e.g., ingestion, transformation, feature engineering, modeling, hyperparameter tuning)
θ = [θ₁, θ₂, …, θₙ]
Note: For simplicity, we can treat each θₙ as a simple element operation. In more elaborate settings, trees and graphs can be used to represent dependencies/hierarchy of operations.
We can define the problem statement as follows: we have a pool of preprocessing methods, feature transformation methods, ML algorithms, and hyperparameters. The goal is to select the combination of knobs that produces the best result.
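A hedged sketch of what such a pool and one concrete θ could look like follows; the specific operations and names are illustrative choices, not a fixed catalogue.

```python
# Illustrative sketch: theta is one concrete selection from a pool of candidate
# operations (preprocessing, feature transformation, algorithm + hyperparameters).
# The pool contents below are example choices, not the only possible knobs.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

pool = {
    "preprocess": [StandardScaler(), MinMaxScaler()],
    "features":   [PCA(n_components=2), "passthrough"],   # "passthrough" = no transform
    "model":      [LogisticRegression(C=1.0), RandomForestClassifier(n_estimators=100)],
}

# One theta = one element chosen per knob; in elaborate settings each element
# could itself be a tree or graph of dependent operations.
theta = [pool["preprocess"][0], pool["features"][0], pool["model"][1]]
pipeline = Pipeline([("preprocess", theta[0]), ("features", theta[1]), ("model", theta[2])])
```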
Goal
Efficiently find the set of elements in θ that produces the best α
Enable building orthogonal knobs O
Steps
Intelligently and efficiently determine a set of values in θ that is likely to produce good results.
Automate execution of the θ vector to produce α and evaluate the result against J (sketched below).
Enable creating higher-level θs and building control dials O[].
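Putting the steps together, a minimal sketch of the automation loop might look like the following; it reuses the hypothetical pool, X, and y from the sketches above, and exhaustive enumeration stands in for a smarter search strategy.

```python
# Illustrative sketch of the first two steps: enumerate theta candidates from the
# pool, execute each (fit to obtain alpha), evaluate with J (cross-validated
# accuracy), and retain the best-performing combination.
# Assumes `pool`, X, and y from the sketches above.
import itertools
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

def evaluate(theta, X, y):
    pipeline = Pipeline([("preprocess", theta[0]), ("features", theta[1]), ("model", theta[2])])
    return cross_val_score(pipeline, X, y, cv=5).mean()   # J for this theta

best_theta, best_J = None, float("-inf")
for theta in itertools.product(pool["preprocess"], pool["features"], pool["model"]):
    J = evaluate(theta, X, y)
    if J > best_J:
        best_theta, best_J = theta, J
```

Each named step ("preprocess", "features", "model") then behaves as an orthogonal knob: swapping one element of θ changes one stage of the pipeline without touching the others, which is what higher-level dials O[] would be built on.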