DecisionFocusedLearningAlgorithms.AbstractAlgorithm — Type
abstract type AbstractAlgorithm
An abstract type for decision-focused learning algorithms.
DecisionFocusedLearningAlgorithms.AbstractImitationAlgorithm — Type
abstract type AbstractImitationAlgorithm <: AbstractAlgorithm
An abstract type for imitation learning algorithms.
All subtypes must implement:
train_policy!(algorithm::AbstractImitationAlgorithm, policy::DFLPolicy, train_data; epochs, metrics)
DecisionFocusedLearningAlgorithms.AbstractMetric — Type
abstract type AbstractMetric
Abstract base type for all metrics used during training.
All concrete metric types should implement:
evaluate!(metric, context): Evaluate the metric given a training context
DecisionFocusedLearningAlgorithms.AbstractPolicy — Type
abstract type AbstractPolicy
Abstract type for policies used in decision-focused learning.
DecisionFocusedLearningAlgorithms.AnticipativeImitation — Type
struct AnticipativeImitation{A} <: AbstractImitationAlgorithm
Anticipative Imitation algorithm for supervised learning using anticipative solutions.
Trains a policy in a single shot using expert demonstrations from anticipative solutions.
Reference: https://arxiv.org/abs/2304.00789
Fields
inner_algorithm::Any: inner imitation algorithm for supervised learning
DecisionFocusedLearningAlgorithms.DAgger — Type
struct DAgger{A, S} <: AbstractImitationAlgorithm
Dataset Aggregation (DAgger) algorithm for imitation learning.
Reference: https://arxiv.org/abs/2402.04463
Fields
inner_algorithm::Any: inner imitation algorithm for supervised learning
iterations::Int64: number of DAgger iterations
epochs_per_iteration::Int64: number of epochs per DAgger iteration
α_decay::Float64: decay factor for mixing expert and learned policy
seed::Any: random seed for the expert/policy mixing coin-flip (nothing = non-reproducible)
max_dataset_size::Union{Nothing, Int64}: maximum dataset size across iterations (nothing keeps all samples, an integer caps to the most recent N samples via FIFO)
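Under the assumption that the struct exposes a keyword constructor matching its documented fields (not shown in this reference), a DAgger setup might be sketched as:

```julia
# Sketch, assuming a keyword constructor over the documented fields.
# `inner_alg` is a hypothetical inner supervised algorithm, e.g. a
# PerturbedFenchelYoungLossImitation instance.
dagger = DAgger(;
    inner_algorithm=inner_alg,
    iterations=10,
    epochs_per_iteration=5,
    α_decay=0.9,               # expert is mixed in with probability α, decayed each iteration
    seed=42,                   # reproducible expert/policy coin-flips
    max_dataset_size=10_000,   # keep only the most recent 10k samples (FIFO)
)
```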
DecisionFocusedLearningAlgorithms.DFLPolicy — Type
struct DFLPolicy{ML, CO} <: AbstractPolicy
Decision-Focused Learning Policy combining a machine learning model and a combinatorial optimizer.
DecisionFocusedLearningAlgorithms.DFLPolicy — Method
Run the policy and get the next decision on the given input features.
DecisionFocusedLearningAlgorithms.FYLLossMetric — Type
struct FYLLossMetric{D} <: AbstractMetric
Metric for evaluating Fenchel-Young Loss over a dataset.
This metric stores a dataset and computes the average Fenchel-Young Loss when evaluate! is called. Useful for tracking validation loss during training. Can also be used in the algorithms to accumulate loss over training data with update!.
Fields
dataset::Any: dataset to evaluate on
accumulator::LossAccumulator: accumulator for loss values
Examples
# Create metric with validation dataset
val_metric = FYLLossMetric(val_dataset, :validation_loss)
# Evaluate during training (called by evaluate_metrics!)
context = TrainingContext(policy=policy, epoch=5, loss=loss)
avg_loss = evaluate!(val_metric, context)
DecisionFocusedLearningAlgorithms.FYLLossMetric — Type
FYLLossMetric(dataset, name::Symbol=:fyl_loss)
Construct a FYLLossMetric for a given dataset.
Arguments
dataset: Dataset to evaluate on (should have samples with .x, .y, and .context fields)
name::Symbol: Identifier for the metric (default: :fyl_loss)
DecisionFocusedLearningAlgorithms.FunctionMetric — Type
struct FunctionMetric{F, D} <: AbstractMetric
A flexible metric that wraps a user-defined function.
This metric allows users to define custom metrics using functions. The function receives the training context and optionally any stored data. It can return:
- A single value (stored with metric.name)
- A NamedTuple (each key-value pair stored separately)
Fields
metric_fn::Any: function with signature (context) -> value or (context, data) -> value
name::Symbol: identifier for the metric
data::Any: optional data stored in the metric (default: nothing)
Examples
# Simple metric using only context
epoch_metric = FunctionMetric(ctx -> ctx.epoch, :current_epoch)
# Metric with stored data (dataset)
gap_metric = FunctionMetric(:val_gap, val_data) do ctx, data
compute_gap(benchmark, data, ctx.policy.statistical_model, ctx.policy.maximizer)
end
# Metric returning multiple values
dual_gap = FunctionMetric(:gaps, (train_data, val_data)) do ctx, datasets
train_ds, val_ds = datasets
return (
train_gap = compute_gap(benchmark, train_ds, ctx.policy.statistical_model, ctx.policy.maximizer),
val_gap = compute_gap(benchmark, val_ds, ctx.policy.statistical_model, ctx.policy.maximizer)
)
end
See also
PeriodicMetric: Wrap a metric to evaluate periodically
evaluate!
DecisionFocusedLearningAlgorithms.FunctionMetric — Method
FunctionMetric(
metric_fn,
name::Symbol
) -> FunctionMetric{_A, Nothing} where _A
Construct a FunctionMetric without stored data.
The function should have signature (context) -> value.
Arguments
metric_fn::Function: Function to compute the metric
name::Symbol: Identifier for the metric
DecisionFocusedLearningAlgorithms.LossAccumulator — Type
LossAccumulator() -> LossAccumulator
LossAccumulator(name::Symbol) -> LossAccumulator
Construct a LossAccumulator with the given name. Initializes total loss and count to zero.
DecisionFocusedLearningAlgorithms.LossAccumulator — Type
mutable struct LossAccumulator
Accumulates loss values during training and computes their average.
This metric is used internally by training loops to track training loss. It accumulates loss values via update! calls and computes the average via compute!.
Fields
name::Symbol
total_loss::Float64: Running sum of loss values
count::Int64: Number of samples accumulated
Examples
metric = LossAccumulator(:training_loss)
# During training
for sample in dataset
loss_value = compute_loss(model, sample)
update!(metric, loss_value)
end
# Get average and reset
avg_loss = compute!(metric) # Automatically resets
DecisionFocusedLearningAlgorithms.PeriodicMetric — Type
struct PeriodicMetric{M<:AbstractMetric} <: AbstractMetric
Wrapper that evaluates a metric only every N epochs.
This is useful for expensive metrics that don't need to be computed every epoch (e.g., gap computation, test set evaluation).
Fields
metric::AbstractMetric: the wrapped metric to evaluate periodically
frequency::Int64: evaluate every N epochs
offset::Int64: offset for the first evaluation
Behavior
The metric is evaluated when epoch >= offset and (epoch - offset) % frequency == 0. On other epochs, evaluate! returns nothing (which is skipped by evaluate_metrics!).
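The evaluation schedule described above can be checked directly. The snippet below is a plain-Julia illustration of the documented condition, independent of the package:

```julia
# Which epochs trigger evaluation for frequency = 5, offset = 2?
frequency, offset = 5, 2
fired = [epoch for epoch in 0:20 if epoch >= offset && (epoch - offset) % frequency == 0]
# fired == [2, 7, 12, 17]
```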
DecisionFocusedLearningAlgorithms.PeriodicMetric — Method
PeriodicMetric(
metric_fn,
frequency::Int64;
offset
) -> PeriodicMetric
Construct a PeriodicMetric from a function to be wrapped.
DecisionFocusedLearningAlgorithms.PeriodicMetric — Method
PeriodicMetric(
metric::AbstractMetric,
frequency::Int64;
offset
) -> PeriodicMetric
Construct a PeriodicMetric that evaluates the wrapped metric every N epochs.
DecisionFocusedLearningAlgorithms.PerturbedFenchelYoungLossImitation — Type
struct PerturbedFenchelYoungLossImitation{O, S} <: AbstractImitationAlgorithm
Structured imitation learning with a perturbed Fenchel-Young loss.
Reference: https://arxiv.org/abs/2002.08676
Fields
nb_samples::Int64: number of perturbation samples
ε::Float64: perturbation magnitude
threaded::Bool: whether to use threading for perturbations
training_optimizer::Any: optimizer used for training
seed::Any: random seed for perturbations
use_multiplicative_perturbation::Bool: whether to use multiplicative perturbation (else additive)
DecisionFocusedLearningAlgorithms.TrainingContext — Type
mutable struct TrainingContext{P, F, O<:NamedTuple}
Lightweight mutable context object passed to metrics during training.
Fields
policy::Any
epoch::Int64: current epoch number (mutated in-place during training)
maximizer_kwargs::Any
other_fields::NamedTuple
Notes
policy, maximizer_kwargs, and other_fields are constant after construction; only epoch is intended to be mutated.
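Following the keyword construction used in the FYLLossMetric example above, a context's epoch counter can be advanced with next_epoch! (the `policy` and `loss` variables are hypothetical placeholders):

```julia
# Sketch: build a context, then advance the epoch in-place.
ctx = TrainingContext(policy=policy, epoch=0, loss=loss)
next_epoch!(ctx)  # only `epoch` is mutated: ctx.epoch is now 1
```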
Base.getproperty — Method
getproperty(pm::PeriodicMetric, s::Symbol) -> Any
Delegate name property to the wrapped metric for seamless integration.
Base.propertynames — Function
propertynames(pm::PeriodicMetric) -> NTuple{4, Symbol}
propertynames(
pm::PeriodicMetric,
private::Bool
) -> NTuple{4, Symbol}
List available properties of PeriodicMetric.
DecisionFocusedLearningAlgorithms._store_metric_value! — Method
_store_metric_value!(
_::ValueHistories.MVHistory,
metric_name::Symbol,
_::Int64,
value
)
Fallback that throws a descriptive error for unsupported return types. Metrics must return a Number, a NamedTuple, or nothing.
DecisionFocusedLearningAlgorithms._store_metric_value! — Method
_store_metric_value!(
history::ValueHistories.MVHistory,
_::Symbol,
epoch::Int64,
value::NamedTuple
)
Internal helper to store multiple metric values from a NamedTuple. Each key-value pair is stored separately in the history.
DecisionFocusedLearningAlgorithms._store_metric_value! — Method
_store_metric_value!(
_::ValueHistories.MVHistory,
_::Symbol,
_::Int64,
_::Nothing
)
Internal helper that skips storing when value is nothing. Used by periodic metrics on epochs when they're not evaluated.
DecisionFocusedLearningAlgorithms._store_metric_value! — Method
_store_metric_value!(
history::ValueHistories.MVHistory,
metric_name::Symbol,
epoch::Int64,
value::Number
)
Internal helper to store a single metric value in the history.
DecisionFocusedLearningAlgorithms.compute! — Method
compute!(metric::FYLLossMetric) -> Float64
Compute the average loss from accumulated values.
DecisionFocusedLearningAlgorithms.compute! — Method
compute!(metric::LossAccumulator; reset) -> Float64
Compute the average loss from accumulated values.
Arguments
metric::LossAccumulator: The accumulator to compute from
reset::Bool: Whether to reset the accumulator after computing (default: true)
Returns
Float64: Average loss (or 0.0 if no values accumulated)
Examples
metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.5)
avg = compute!(metric) # Returns 2.0, then resets
DecisionFocusedLearningAlgorithms.evaluate! — Function
evaluate!(metric::AbstractMetric, context::TrainingContext)
Evaluate the metric given the current training context.
Arguments
metric::AbstractMetric: The metric to evaluate
context::TrainingContext: Current training state (model, epoch, maximizer, etc.)
Returns
Can return:
- A single value (Float64, Int, etc.): stored with metric.name
- A NamedTuple: each key-value pair stored separately
- nothing: skipped (e.g., periodic metrics on off-epochs)
DecisionFocusedLearningAlgorithms.evaluate! — Method
evaluate!(
metric::FYLLossMetric,
context::TrainingContext
) -> Float64
Evaluate the average Fenchel-Young Loss over the stored dataset.
This method iterates through the dataset, computes predictions using context.policy, and accumulates losses using context.loss. The dataset should be stored in the metric.
Arguments
metric::FYLLossMetric: The metric to evaluate
context::TrainingContext: TrainingContext with policy, loss, and other fields
DecisionFocusedLearningAlgorithms.evaluate! — Method
evaluate!(
metric::FunctionMetric,
context::TrainingContext
) -> Any
Evaluate the function metric by calling the stored function.
Arguments
metric::FunctionMetric: The metric to evaluate
context::TrainingContext: TrainingContext with current training state
Returns
- The value returned by metric.metric_fn (can be a single value or a NamedTuple)
DecisionFocusedLearningAlgorithms.evaluate! — Method
evaluate!(pm::PeriodicMetric, context) -> Any
Evaluate the wrapped metric only if the current epoch matches the frequency pattern.
Arguments
pm::PeriodicMetric: The periodic metric wrapper
context::TrainingContext: TrainingContext with current epoch
Returns
- The result of evaluate!(pm.metric, context) if the epoch matches the pattern
- nothing otherwise (which is skipped by evaluate_metrics!)
DecisionFocusedLearningAlgorithms.evaluate_metrics! — Method
evaluate_metrics!(
history::ValueHistories.MVHistory,
metrics::Tuple,
context::TrainingContext
)
Evaluate all metrics and store their results in the history.
This function handles three types of metric returns through multiple dispatch:
- Single value: Stored with the metric's name
- NamedTuple: Each key-value pair stored separately (for metrics that compute multiple values)
- nothing: Skipped (e.g., periodic metrics on epochs when not evaluated)
Arguments
history::MVHistory: MVHistory object to store metric values
metrics::Tuple: Tuple of AbstractMetric instances to evaluate
context::TrainingContext: TrainingContext with current training state (policy, epoch, etc.)
Examples
# Create metrics
val_loss = FYLLossMetric(val_dataset, :validation_loss)
epoch_metric = FunctionMetric(ctx -> ctx.epoch, :current_epoch)
# Evaluate and store
context = TrainingContext(policy=policy, epoch=5)
evaluate_metrics!(history, (val_loss, epoch_metric), context)
DecisionFocusedLearningAlgorithms.next_epoch! — Method
next_epoch!(ctx::TrainingContext)
Advance the epoch counter in the training context by one.
DecisionFocusedLearningAlgorithms.reset! — Method
reset!(metric::FYLLossMetric) -> Int64
Reset the metric's accumulated loss to zero.
DecisionFocusedLearningAlgorithms.reset! — Method
reset!(metric::LossAccumulator) -> Int64
Reset the accumulator to its initial state (zero total loss and count).
Examples
metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.0)
reset!(metric) # total_loss = 0.0, count = 0
DecisionFocusedLearningAlgorithms.train_policy! — Method
train_policy!(
algorithm::AnticipativeImitation,
policy::DFLPolicy,
train_environments;
anticipative_policy,
epochs,
metrics,
maximizer_kwargs
)
Train a DFLPolicy using the Anticipative Imitation algorithm on provided training environments.
Core training method
Generates anticipative solutions from environments and trains the policy using supervised learning.
DecisionFocusedLearningAlgorithms.train_policy! — Method
train_policy!(
algorithm::DAgger,
policy::DFLPolicy,
train_environments;
anticipative_policy,
metrics,
maximizer_kwargs
)
Train a DFLPolicy using the DAgger algorithm on the provided training environments.
Core training method
Requires train_environments and anticipative_policy as keyword arguments.
DecisionFocusedLearningAlgorithms.train_policy! — Method
train_policy!(
algorithm::PerturbedFenchelYoungLossImitation,
policy::DFLPolicy,
train_dataset::AbstractArray{<:DecisionFocusedLearningBenchmarks.Utils.DataSample};
epochs,
metrics,
maximizer_kwargs
) -> ValueHistories.MVHistory{ValueHistories.History}
Train a DFLPolicy using the Perturbed Fenchel-Young Loss Imitation Algorithm with unbatched data.
This convenience method wraps the dataset in a DataLoader with batchsize=1 and delegates to the batched training method. For custom batching behavior, create your own DataLoader and use the batched method directly.
DecisionFocusedLearningAlgorithms.train_policy! — Method
train_policy!(
algorithm::PerturbedFenchelYoungLossImitation,
policy::DFLPolicy,
train_dataset::MLUtils.DataLoader;
epochs,
metrics,
maximizer_kwargs
) -> ValueHistories.MVHistory{ValueHistories.History}
Train a DFLPolicy using the Perturbed Fenchel-Young Loss Imitation Algorithm.
The train_dataset should be a DataLoader for batched training. Gradients are computed from the sum of losses across each batch before updating model parameters.
For unbatched training with a Vector{DataSample}, use the convenience method that automatically wraps the data in a DataLoader with batchsize=1.
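The relationship between the two methods can be sketched as follows; the batch size and epoch count here are illustrative, not package defaults, and `algorithm`, `policy`, and `train_dataset` are assumed to be set up as described above:

```julia
using MLUtils

# Explicit batching: equivalent to the unbatched convenience method
# when batchsize = 1.
loader = DataLoader(train_dataset; batchsize=32, shuffle=true)
history = train_policy!(algorithm, policy, loader; epochs=100)
```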
DecisionFocusedLearningAlgorithms.train_policy — Method
train_policy(
algorithm::AbstractImitationAlgorithm,
benchmark::DecisionFocusedLearningBenchmarks.Utils.AbstractBenchmark;
target_policy,
dataset_size,
epochs,
metrics,
seed
) -> Tuple{Any, DFLPolicy}
Train a new DFLPolicy on a benchmark using any imitation learning algorithm.
Convenience wrapper that handles dataset generation, model initialization, and policy creation. Returns the training history and the trained policy.
For dynamic benchmarks, use the algorithm-specific train_policy overload that accepts environments and an anticipative policy.
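A possible end-to-end sketch, assuming a keyword constructor matching the documented PerturbedFenchelYoungLossImitation fields and an Adam optimizer from Flux/Optimisers (both assumptions, not confirmed by this reference):

```julia
# Hedged sketch: construct an imitation algorithm, then train on a benchmark.
algorithm = PerturbedFenchelYoungLossImitation(;
    nb_samples=10,                        # perturbation samples per input
    ε=0.1,                                # perturbation magnitude
    threaded=true,
    training_optimizer=Adam(1e-3),        # assumed optimizer type
    seed=42,
    use_multiplicative_perturbation=false,
)
history, policy = train_policy(algorithm, benchmark; dataset_size=100, epochs=50, seed=0)
```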
DecisionFocusedLearningAlgorithms.train_policy — Method
train_policy(
algorithm::AnticipativeImitation,
benchmark::DecisionFocusedLearningBenchmarks.Utils.ExogenousDynamicBenchmark;
dataset_size,
epochs,
metrics,
seed
) -> Tuple{Any, DFLPolicy}
Train a DFLPolicy using the Anticipative Imitation algorithm on a benchmark.
Benchmark convenience wrapper
This high-level function handles all setup from the benchmark and returns a trained policy. Uses anticipative solutions as expert demonstrations.
DecisionFocusedLearningAlgorithms.train_policy — Method
train_policy(
algorithm::DAgger,
benchmark::DecisionFocusedLearningBenchmarks.Utils.ExogenousDynamicBenchmark;
dataset_size,
metrics,
seed
) -> Tuple{ValueHistories.MVHistory{ValueHistories.History}, DFLPolicy}
Train a DFLPolicy using the DAgger algorithm on a benchmark.
Benchmark convenience wrapper
This high-level function handles all setup from the benchmark and returns a trained policy.
DecisionFocusedLearningAlgorithms.update! — Method
update!(
metric::FYLLossMetric,
loss_value::Float64
) -> Float64
Update the metric with an already-computed loss value. This avoids re-evaluating the loss inside the metric when the loss was computed during training.
DecisionFocusedLearningAlgorithms.update! — Method
update!(
metric::FYLLossMetric,
loss::InferOpt.FenchelYoungLoss,
θ,
y_target;
kwargs...
) -> Any
Update the metric with a single loss computation.
Arguments
metric::FYLLossMetric: The metric to update
loss::FenchelYoungLoss: Loss function to use
θ: Model prediction
y_target: Target value
kwargs...: Additional arguments passed to loss function
DecisionFocusedLearningAlgorithms.update! — Method
update!(
metric::LossAccumulator,
loss_value::Float64
) -> Int64
Add a loss value to the accumulator.
Examples
metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.0)
compute!(metric) # Returns 1.75
DecisionFocusedLearningBenchmarks.Utils.compute_gap — Function
compute_gap(bench, dataset, policy::DFLPolicy) -> Any
compute_gap(bench, dataset, policy::DFLPolicy, op) -> Any
Convenience overload: evaluate the optimality gap using a DFLPolicy directly, instead of unpacking policy.statistical_model and policy.maximizer.
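For illustration, with hypothetical `benchmark` and `test_dataset` variables, the overload makes these two calls equivalent:

```julia
# DFLPolicy overload: the model and maximizer are unpacked internally.
gap = compute_gap(benchmark, test_dataset, policy)
# ...equivalent to the explicit form:
gap = compute_gap(benchmark, test_dataset, policy.statistical_model, policy.maximizer)
```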