DecisionFocusedLearningAlgorithms.DAggerType
struct DAgger{A, S} <: AbstractImitationAlgorithm

Dataset Aggregation (DAgger) algorithm for imitation learning.

Reference: https://arxiv.org/abs/2402.04463

Fields

  • inner_algorithm::Any: inner imitation algorithm for supervised learning

  • iterations::Int64: number of DAgger iterations

  • epochs_per_iteration::Int64: number of epochs per DAgger iteration

  • α_decay::Float64: decay factor for mixing expert and learned policy

  • seed::Any: random seed for the expert/policy mixing coin-flip (nothing = non-reproducible)

  • max_dataset_size::Union{Nothing, Int64}: maximum dataset size across iterations (nothing keeps all samples, an integer caps to the most recent N samples via FIFO)
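A construction sketch, assuming a keyword constructor over the fields above (the inner algorithm and its zero-argument constructor are assumptions, not shown in this docstring):

```julia
using DecisionFocusedLearningAlgorithms

# Sketch only: a keyword constructor over the documented fields is assumed.
inner = PerturbedFenchelYoungLossImitation()  # assumed supervised inner learner
algorithm = DAgger(;
    inner_algorithm=inner,    # supervised inner learner
    iterations=10,            # number of DAgger iterations
    epochs_per_iteration=5,   # supervised epochs per iteration
    α_decay=0.9,              # expert/policy mixing decay
    seed=42,                  # reproducible mixing coin-flips
    max_dataset_size=10_000,  # FIFO cap on aggregated samples
)
```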

source
DecisionFocusedLearningAlgorithms.FYLLossMetricType
struct FYLLossMetric{D} <: AbstractMetric

Metric for evaluating Fenchel-Young Loss over a dataset.

This metric stores a dataset and computes the average Fenchel-Young Loss when evaluate! is called. It is useful for tracking validation loss during training, and can also be used within training algorithms to accumulate loss over training data via update!.

Fields

  • dataset::Any: dataset to evaluate on

  • accumulator::LossAccumulator: accumulator for loss values

Examples

# Create metric with validation dataset
val_metric = FYLLossMetric(val_dataset, :validation_loss)

# Evaluate during training (called by evaluate_metrics!)
context = TrainingContext(policy=policy, epoch=5, loss=loss)
avg_loss = evaluate!(val_metric, context)

source
DecisionFocusedLearningAlgorithms.FYLLossMetricType
FYLLossMetric(dataset, name::Symbol=:fyl_loss)

Construct a FYLLossMetric for a given dataset.

Arguments

  • dataset: Dataset to evaluate on (should have samples with .x, .y, and .context fields)
  • name::Symbol: Identifier for the metric (default: :fyl_loss)
source
DecisionFocusedLearningAlgorithms.FunctionMetricType
struct FunctionMetric{F, D} <: AbstractMetric

A flexible metric that wraps a user-defined function.

This metric allows users to define custom metrics using functions. The function receives the training context and optionally any stored data. It can return:

  • A single value (stored with metric.name)
  • A NamedTuple (each key-value pair stored separately)

Fields

  • metric_fn::Any: function with signature (context) -> value or (context, data) -> value

  • name::Symbol: identifier for the metric

  • data::Any: optional data stored in the metric (default: nothing)

Examples

# Simple metric using only context
epoch_metric = FunctionMetric(ctx -> ctx.epoch, :current_epoch)

# Metric with stored data (dataset)
gap_metric = FunctionMetric(:val_gap, val_data) do ctx, data
    compute_gap(benchmark, data, ctx.policy.statistical_model, ctx.policy.maximizer)
end

# Metric returning multiple values
dual_gap = FunctionMetric(:gaps, (train_data, val_data)) do ctx, datasets
    train_ds, val_ds = datasets
    return (
        train_gap = compute_gap(benchmark, train_ds, ctx.policy.statistical_model, ctx.policy.maximizer),
        val_gap = compute_gap(benchmark, val_ds, ctx.policy.statistical_model, ctx.policy.maximizer)
    )
end

source
DecisionFocusedLearningAlgorithms.FunctionMetricMethod
FunctionMetric(
    metric_fn,
    name::Symbol
) -> FunctionMetric{_A, Nothing} where _A

Construct a FunctionMetric without stored data.

The function should have signature (context) -> value.

Arguments

  • metric_fn::Function: Function to compute the metric
  • name::Symbol: Identifier for the metric
source
DecisionFocusedLearningAlgorithms.LossAccumulatorType
mutable struct LossAccumulator

Accumulates loss values during training and computes their average.

This metric is used internally by training loops to track training loss. It accumulates loss values via update! calls and computes the average via compute!.

Fields

  • name::Symbol

  • total_loss::Float64: Running sum of loss values

  • count::Int64: Number of samples accumulated

Examples

metric = LossAccumulator(:training_loss)

# During training
for sample in dataset
    loss_value = compute_loss(model, sample)
    update!(metric, loss_value)
end

# Get average and reset
avg_loss = compute!(metric)  # Automatically resets

source
DecisionFocusedLearningAlgorithms.PeriodicMetricType
struct PeriodicMetric{M<:AbstractMetric} <: AbstractMetric

Wrapper that evaluates a metric only every N epochs.

This is useful for expensive metrics that don't need to be computed every epoch (e.g., gap computation, test set evaluation).

Fields

  • metric::AbstractMetric: the wrapped metric to evaluate periodically

  • frequency::Int64: evaluate every N epochs

  • offset::Int64: offset for the first evaluation

Behavior

The metric is evaluated when epoch >= offset and (epoch - offset) % frequency == 0. On other epochs, evaluate! returns nothing (which is skipped by evaluate_metrics!).
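The evaluation schedule can be checked in isolation; a minimal sketch of the condition above (the helper function name is illustrative, not part of the package):

```julia
# The rule from the docstring: evaluate when epoch >= offset
# and (epoch - offset) is a multiple of frequency.
should_evaluate(epoch, frequency, offset) =
    epoch >= offset && (epoch - offset) % frequency == 0

# With frequency = 5 and offset = 2, evaluation happens at epochs 2, 7, 12, ...
[e for e in 0:14 if should_evaluate(e, 5, 2)]  # [2, 7, 12]
```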

source
DecisionFocusedLearningAlgorithms.PerturbedFenchelYoungLossImitationType
struct PerturbedFenchelYoungLossImitation{O, S} <: AbstractImitationAlgorithm

Structured imitation learning with a perturbed Fenchel-Young loss.

Reference: https://arxiv.org/abs/2002.08676

Fields

  • nb_samples::Int64: number of perturbation samples

  • ε::Float64: perturbation magnitude

  • threaded::Bool: whether to use threading for perturbations

  • training_optimizer::Any: optimizer used for training

  • seed::Any: random seed for perturbations

  • use_multiplicative_perturbation::Bool: whether to use multiplicative perturbation (else additive)
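A construction sketch, assuming a keyword constructor over the fields above; the Optimisers.jl Adam rule is one plausible choice for training_optimizer, not a documented requirement:

```julia
using DecisionFocusedLearningAlgorithms
using Optimisers: Adam  # assumed optimizer type

# Sketch only: a keyword constructor over the documented fields is assumed.
algorithm = PerturbedFenchelYoungLossImitation(;
    nb_samples=20,                         # perturbation samples per gradient estimate
    ε=0.1,                                 # perturbation magnitude
    threaded=true,                         # parallelize perturbations across threads
    training_optimizer=Adam(1e-3),         # optimizer rule used for training
    seed=0,                                # reproducible perturbations
    use_multiplicative_perturbation=false, # additive perturbation
)
```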

source
DecisionFocusedLearningAlgorithms.TrainingContextType
mutable struct TrainingContext{P, F, O<:NamedTuple}

Lightweight mutable context object passed to metrics during training.

Fields

  • policy::Any

  • epoch::Int64: current epoch number (mutated in-place during training)

  • maximizer_kwargs::Any

  • other_fields::NamedTuple

Notes

  • policy, maximizer_kwargs, and other_fields are constant after construction; only epoch is intended to be mutated.
source
Base.getpropertyMethod
getproperty(pm::PeriodicMetric, s::Symbol) -> Any

Delegate the name property to the wrapped metric for seamless integration.

source
Base.propertynamesFunction
propertynames(pm::PeriodicMetric) -> NTuple{4, Symbol}
propertynames(
    pm::PeriodicMetric,
    private::Bool
) -> NTuple{4, Symbol}

List available properties of PeriodicMetric.

source
DecisionFocusedLearningAlgorithms.compute!Method
compute!(metric::LossAccumulator; reset) -> Float64

Compute the average loss from accumulated values.

Arguments

  • metric::LossAccumulator - The accumulator to compute from
  • reset::Bool - Whether to reset the accumulator after computing (default: true)

Returns

  • Float64 - Average loss (or 0.0 if no values accumulated)

Examples

metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.5)
avg = compute!(metric)  # Returns 2.0, then resets
source
DecisionFocusedLearningAlgorithms.evaluate!Function
evaluate!(metric::AbstractMetric, context::TrainingContext)

Evaluate the metric given the current training context.

Arguments

  • metric::AbstractMetric: The metric to evaluate
  • context::TrainingContext: Current training state (model, epoch, maximizer, etc.)

Returns

Can return:

  • A single value (Float64, Int, etc.): stored with metric.name
  • A NamedTuple: each key-value pair stored separately
  • nothing: skipped (e.g., periodic metrics on off-epochs)
source
DecisionFocusedLearningAlgorithms.evaluate!Method
evaluate!(
    metric::FYLLossMetric,
    context::TrainingContext
) -> Float64

Evaluate the average Fenchel-Young Loss over the stored dataset.

This method iterates through the dataset, computes predictions using context.policy, and accumulates losses using context.loss. The dataset should be stored in the metric.

Arguments

  • metric::FYLLossMetric: The metric to evaluate
  • context::TrainingContext: TrainingContext with policy, loss, and other fields
source
DecisionFocusedLearningAlgorithms.evaluate!Method
evaluate!(
    metric::FunctionMetric,
    context::TrainingContext
) -> Any

Evaluate the function metric by calling the stored function.

Arguments

  • metric::FunctionMetric: The metric to evaluate
  • context::TrainingContext: TrainingContext with current training state

Returns

  • The value returned by metric.metric_fn (can be single value or NamedTuple)


source
DecisionFocusedLearningAlgorithms.evaluate!Method
evaluate!(pm::PeriodicMetric, context) -> Any

Evaluate the wrapped metric only if the current epoch matches the frequency pattern.

Arguments

  • pm::PeriodicMetric: The periodic metric wrapper
  • context::TrainingContext: TrainingContext with current epoch

Returns

  • The result of evaluate!(pm.metric, context) if epoch matches the pattern
  • nothing otherwise (which is skipped by evaluate_metrics!)
source
DecisionFocusedLearningAlgorithms.evaluate_metrics!Method
evaluate_metrics!(
    history::ValueHistories.MVHistory,
    metrics::Tuple,
    context::TrainingContext
)

Evaluate all metrics and store their results in the history.

This function handles three types of metric returns through multiple dispatch:

  • Single value: Stored with the metric's name
  • NamedTuple: Each key-value pair stored separately (for metrics that compute multiple values)
  • nothing: Skipped (e.g., periodic metrics on epochs when not evaluated)

Arguments

  • history::MVHistory: MVHistory object to store metric values
  • metrics::Tuple: Tuple of AbstractMetric instances to evaluate
  • context::TrainingContext: TrainingContext with current training state (policy, epoch, etc.)

Examples

# Create metrics
val_loss = FYLLossMetric(val_dataset, :validation_loss)
epoch_metric = FunctionMetric(ctx -> ctx.epoch, :current_epoch)

# Evaluate and store
context = TrainingContext(policy=policy, epoch=5)
evaluate_metrics!(history, (val_loss, epoch_metric), context)

source
DecisionFocusedLearningAlgorithms.reset!Method
reset!(metric::LossAccumulator) -> Int64

Reset the accumulator to its initial state (zero total loss and count).

Examples

metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.0)
reset!(metric)  # total_loss = 0.0, count = 0
source
DecisionFocusedLearningAlgorithms.train_policy!Method
train_policy!(
    algorithm::AnticipativeImitation,
    policy::DFLPolicy,
    train_environments;
    anticipative_policy,
    epochs,
    metrics,
    maximizer_kwargs
)

Train a DFLPolicy using the Anticipative Imitation algorithm on provided training environments.

Core training method

Generates anticipative solutions from environments and trains the policy using supervised learning.

source
DecisionFocusedLearningAlgorithms.train_policy!Method
train_policy!(
    algorithm::DAgger,
    policy::DFLPolicy,
    train_environments;
    anticipative_policy,
    metrics,
    maximizer_kwargs
)

Train a DFLPolicy using the DAgger algorithm on the provided training environments.

Core training method

Requires train_environments and anticipative_policy as keyword arguments.

source
DecisionFocusedLearningAlgorithms.train_policy!Method
train_policy!(
    algorithm::PerturbedFenchelYoungLossImitation,
    policy::DFLPolicy,
    train_dataset::AbstractArray{<:DecisionFocusedLearningBenchmarks.Utils.DataSample};
    epochs,
    metrics,
    maximizer_kwargs
) -> ValueHistories.MVHistory{ValueHistories.History}

Train a DFLPolicy using the Perturbed Fenchel-Young Loss Imitation Algorithm with unbatched data.

This convenience method wraps the dataset in a DataLoader with batchsize=1 and delegates to the batched training method. For custom batching behavior, create your own DataLoader and use the batched method directly.

source
DecisionFocusedLearningAlgorithms.train_policy!Method
train_policy!(
    algorithm::PerturbedFenchelYoungLossImitation,
    policy::DFLPolicy,
    train_dataset::MLUtils.DataLoader;
    epochs,
    metrics,
    maximizer_kwargs
) -> ValueHistories.MVHistory{ValueHistories.History}

Train a DFLPolicy using the Perturbed Fenchel-Young Loss Imitation Algorithm.

The train_dataset should be a DataLoader for batched training. Gradients are computed from the sum of losses across each batch before updating model parameters.

For unbatched training with a Vector{DataSample}, use the convenience method that automatically wraps the data in a DataLoader with batchsize=1.
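A usage sketch of both entry points, assuming `algorithm`, `policy`, and `train_dataset` (a `Vector{DataSample}`) are already set up:

```julia
using MLUtils: DataLoader

# Batched training: the DataLoader controls batch size and shuffling.
loader = DataLoader(train_dataset; batchsize=32, shuffle=true)
history = train_policy!(algorithm, policy, loader; epochs=100)

# Equivalent unbatched call: the convenience method wraps the vector
# in a DataLoader with batchsize=1 internally.
history = train_policy!(algorithm, policy, train_dataset; epochs=100)
```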

source
DecisionFocusedLearningAlgorithms.train_policyMethod
train_policy(
    algorithm::AbstractImitationAlgorithm,
    benchmark::DecisionFocusedLearningBenchmarks.Utils.AbstractBenchmark;
    target_policy,
    dataset_size,
    epochs,
    metrics,
    seed
) -> Tuple{Any, DFLPolicy}

Train a new DFLPolicy on a benchmark using any imitation learning algorithm.

Convenience wrapper that handles dataset generation, model initialization, and policy creation. Returns the training history and the trained policy.

For dynamic benchmarks, use the algorithm-specific train_policy overload that accepts environments and an anticipative policy.
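A usage sketch, assuming `benchmark` is an `AbstractBenchmark` instance from DecisionFocusedLearningBenchmarks and that the algorithm has a zero-argument default constructor:

```julia
using DecisionFocusedLearningAlgorithms

# The wrapper generates the dataset, initializes the model, and builds the policy.
algorithm = PerturbedFenchelYoungLossImitation()  # assumed default constructor
history, policy = train_policy(
    algorithm, benchmark;
    dataset_size=100,  # number of generated training samples
    epochs=50,
    seed=0,
)
```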

source
DecisionFocusedLearningAlgorithms.train_policyMethod
train_policy(
    algorithm::AnticipativeImitation,
    benchmark::DecisionFocusedLearningBenchmarks.Utils.ExogenousDynamicBenchmark;
    dataset_size,
    epochs,
    metrics,
    seed
) -> Tuple{Any, DFLPolicy}

Train a DFLPolicy using the Anticipative Imitation algorithm on a benchmark.

Benchmark convenience wrapper

This high-level function handles all setup from the benchmark and returns a trained policy. Uses anticipative solutions as expert demonstrations.

source
DecisionFocusedLearningAlgorithms.train_policyMethod
train_policy(
    algorithm::DAgger,
    benchmark::DecisionFocusedLearningBenchmarks.Utils.ExogenousDynamicBenchmark;
    dataset_size,
    metrics,
    seed
) -> Tuple{ValueHistories.MVHistory{ValueHistories.History}, DFLPolicy}

Train a DFLPolicy using the DAgger algorithm on a benchmark.

Benchmark convenience wrapper

This high-level function handles all setup from the benchmark and returns a trained policy.

source
DecisionFocusedLearningAlgorithms.update!Method
update!(
    metric::FYLLossMetric,
    loss_value::Float64
) -> Float64

Update the metric with an already-computed loss value. This avoids re-evaluating the loss inside the metric when the loss was computed during training.

source
DecisionFocusedLearningAlgorithms.update!Method
update!(
    metric::FYLLossMetric,
    loss::InferOpt.FenchelYoungLoss,
    θ,
    y_target;
    kwargs...
) -> Any

Update the metric with a single loss computation.

Arguments

  • metric::FYLLossMetric: The metric to update
  • loss::FenchelYoungLoss: Loss function to use
  • θ: Model prediction
  • y_target: Target value
  • kwargs...: Additional arguments passed to loss function
source
DecisionFocusedLearningAlgorithms.update!Method
update!(
    metric::LossAccumulator,
    loss_value::Float64
) -> Int64

Add a loss value to the accumulator.

Examples

metric = LossAccumulator()
update!(metric, 1.5)
update!(metric, 2.0)
compute!(metric)  # Returns 1.75
source