API Reference

Types

InferOpt.AbstractLayer - Type
AbstractLayer

Supertype for all the layers defined in InferOpt.

All of these layers are callable, and differentiable with any ChainRules-compatible autodiff backend.

Interface

  • (layer::AbstractLayer)(args...; kwargs...)
source
InferOpt.AbstractLossLayer - Type
AbstractLossLayer <: AbstractLayer

Supertype for all the loss layers defined in InferOpt.

Depending on the precise loss, the arguments to the layer might vary.

Interface

  • (layer::AbstractLossLayer)(θ; kwargs...) or
  • (layer::AbstractLossLayer)(θ, θ_true; kwargs...) or
  • (layer::AbstractLossLayer)(θ, y_true; kwargs...) or
  • (layer::AbstractLossLayer)(θ, (; θ_true, y_true); kwargs...)
source
InferOpt.AbstractOptimizationLayer - Type
AbstractOptimizationLayer <: AbstractLayer

Supertype for all the optimization layers defined in InferOpt.

Interface

  • (layer::AbstractOptimizationLayer)(θ::AbstractArray; kwargs...)
  • compute_probability_distribution(layer, θ; kwargs...) (only if the layer is probabilistic)
source
InferOpt.AbstractPerturbation - Type
abstract type AbstractPerturbation <: Distributions.Distribution{Distributions.Univariate, Distributions.Continuous}

Abstract type for a perturbation: a callable that takes a parameter θ and returns a perturbed version of θ, obtained through the distribution perturbation_dist.

Warning

All subtypes should implement a perturbation_dist field, which is a ContinuousUnivariateDistribution.

Existing implementations

  • AdditivePerturbation
  • MultiplicativePerturbation
source
InferOpt.AbstractRegularized - Type
AbstractRegularized <: AbstractOptimizationLayer

Convex regularization of a black-box linear (in θ) optimizer

ŷ(θ) = argmax_{y ∈ C} {θᵀg(y) + h(y) - Ω(y)}

with g and h functions of y.

Interface

  • (regularized::AbstractRegularized)(θ; kwargs...): return ŷ(θ)
  • compute_regularization(regularized, y): return Ω(y)
  • get_maximizer(regularized): return the associated optimizer

Available implementations

  • SoftArgmax
  • SparseArgmax
  • SoftRank
  • RegularizedFrankWolfe
source
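Example for AbstractRegularized: a minimal sketch of a custom layer implementing this interface, with a hypothetical half squared-norm regularization; the struct name and maximizer are illustrative and not part of InferOpt.

    using InferOpt

    # Hypothetical regularized layer with Ω(y) = ‖y‖²/2 (names are illustrative).
    struct MyRegularizedLayer{M} <: InferOpt.AbstractRegularized
        maximizer::M
    end

    # Interface methods:
    InferOpt.compute_regularization(::MyRegularizedLayer, y) = sum(abs2, y) / 2
    InferOpt.get_maximizer(layer::MyRegularizedLayer) = layer.maximizer

    # The layer itself must return ŷ(θ); here it simply delegates to the wrapped
    # maximizer, whereas a real layer would solve the regularized problem.
    (layer::MyRegularizedLayer)(θ; kwargs...) = layer.maximizer(θ; kwargs...)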
InferOpt.AdditivePerturbation - Type
struct AdditivePerturbation{F}

Additive perturbation: θ ↦ θ + εZ, where Z is a random variable following perturbation_dist.

Fields

  • perturbation_dist::Any: base distribution for the perturbation

  • ε::Float64: perturbation size

source
InferOpt.FenchelYoungLoss - Type
struct FenchelYoungLoss{O<:InferOpt.AbstractOptimizationLayer} <: InferOpt.AbstractLossLayer

Fenchel-Young loss associated with a given optimization layer.

L(θ, y_true) = (Ω(y_true) - θᵀy_true) - (Ω(ŷ) - θᵀŷ)

Reference: https://arxiv.org/abs/1901.02324

Fields

  • optimization_layer::AbstractOptimizationLayer: optimization layer that can be formulated as ŷ(θ) = argmax {θᵀy - Ω(y)} (either regularized or perturbed)

Compatibility

This loss is compatible with:

  • regularized optimization layers (subtypes of AbstractRegularized)
  • perturbed optimization layers (PerturbedAdditive, PerturbedMultiplicative, PerturbedOracle)
source
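Example for FenchelYoungLoss: a minimal usage sketch, assuming FenchelYoungLoss and PerturbedAdditive are exported and using a hypothetical black-box maximizer.

    using InferOpt

    # Hypothetical linear maximizer over the unit hypercube: argmax_{y ∈ {0,1}^d} θᵀy.
    maximizer(θ; kwargs...) = Float64.(θ .> 0)

    # Perturbed layer ŷ(θ) = E[argmax_y (θ + εZ)ᵀy] and its Fenchel-Young loss.
    layer = PerturbedAdditive(maximizer; ε=0.1, nb_samples=10)
    loss = FenchelYoungLoss(layer)

    θ = randn(5)
    y_true = maximizer(randn(5))
    loss(θ, y_true)  # scalar value, differentiable w.r.t. θ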
InferOpt.Fix1Kwargs - Type
struct Fix1Kwargs{F, K, T} <: Function

Callable struct that fixes the first argument of f to x, and the keyword arguments to kwargs....

Fields

  • f::Any: function

  • x::Any: fixed first argument

  • kwargs::Any: fixed keyword arguments

source
InferOpt.FixFirst - Type
struct FixFirst{F, T}

Callable struct that fixes the first argument of f to x. Compared to Base.Fix1, works on functions with more than two arguments.

source
InferOpt.FixKwargs - Type
struct FixKwargs{F, K}

Callable struct that fixes the keyword arguments of f to kwargs..., and only accepts positional arguments.

Fields

  • f::Any: function

  • kwargs::Any: fixed keyword arguments

source
InferOpt.ImitationLoss - Type
ImitationLoss <: AbstractLossLayer

Generic imitation loss of the form

L(θ, t_true) = max_y {δ(y, t_true) + α θᵀ(y - y_true) - (Ω(y) - Ω(y_true))}

Note: by default, t_true is a named tuple with field y_true, but it can be any data structure for which the get_y_true method is implemented.

Fields

  • aux_loss_maximizer: function of (θ, t_true, α) that computes the argmax in the problem above
  • δ: base loss function
  • Ω: regularization function
  • α::Float64: hyperparameter with a default value of 1.0
source
InferOpt.Interpolation - Type
Interpolation <: AbstractOptimizationLayer

Piecewise-linear interpolation of a black-box optimizer.

Fields

  • maximizer: underlying argmax function
  • λ::Float64: smoothing parameter (smaller = more faithful approximation, larger = more informative gradients)

Reference: https://arxiv.org/abs/1912.02175

source
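Example for Interpolation: a hedged sketch; the keyword constructor Interpolation(maximizer; λ) mirroring the fields above is an assumption, so check the actual constructor before relying on it.

    using InferOpt

    # Hypothetical argmax over the simplex vertices.
    maximizer(θ; kwargs...) = Float64.(θ .== maximum(θ))

    layer = Interpolation(maximizer; λ=5.0)  # assumed keyword constructor
    layer(randn(4))                          # differentiable interpolation of the maximizer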
InferOpt.LinearMaximizer - Method
LinearMaximizer(
    maximizer;
    g,
    h
) -> LinearMaximizer{_A, typeof(InferOpt.identity_kw), ComposedFunction{typeof(zero), typeof(InferOpt.eltype_kw)}} where _A

Constructor for LinearMaximizer.

source
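Example for LinearMaximizer: a small sketch of the constructor together with objective_value and apply_g (documented below); the toy maximizer over y ∈ {0,1}^d is hypothetical, and the helper functions are qualified in case they are not exported.

    using InferOpt

    # Toy objective θᵀg(y) + h(y) with g(y) = 2y and h(y) = -sum(y) over y ∈ {0,1}^d,
    # i.e. ∑ᵢ (2θᵢ - 1)yᵢ, maximized by setting yᵢ = 1 whenever θᵢ > 0.5.
    base_maximizer(θ; kwargs...) = Float64.(θ .> 0.5)
    f = LinearMaximizer(base_maximizer; g=y -> 2 .* y, h=y -> -sum(y))

    θ = randn(3)
    y = base_maximizer(θ)
    InferOpt.objective_value(f, θ, y)  # θᵀg(y) + h(y)
    InferOpt.apply_g(f, y)             # g(y) = 2 .* y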
InferOpt.MultiplicativePerturbation - Type
struct MultiplicativePerturbation{F}

Multiplicative perturbation: θ ↦ θ ⊙ exp(εZ - shift)

Fields

  • perturbation_dist::Any: base distribution for the perturbation

  • ε::Float64: perturbation size

  • shift::Float64: optional shift to have 0 mean, default value is ε²/2

source
InferOpt.NormalAdditiveGradLogdensity - Type
struct NormalAdditiveGradLogdensity

Method with parameters to compute the gradient of the logdensity of η = θ + εZ w.r.t. θ, with Z ∼ N(0, 1).

Fields

  • ε::Float64: perturbation size
source
InferOpt.NormalMultiplicativeGradLogdensity - Type
struct NormalMultiplicativeGradLogdensity

Method with parameters to compute the gradient of the logdensity of η = θ ⊙ exp(εZ - shift) w.r.t. θ, with Z ∼ N(0, 1).

Fields

  • ε::Float64: perturbation size

  • shift::Float64: optional shift to have 0 mean

source
InferOpt.PerturbedOracle - Type
struct PerturbedOracle{D, F, t, variance_reduction, G, R, S} <: InferOpt.AbstractOptimizationLayer

Differentiable perturbation of a black box optimizer of type F, with perturbation of type D.

This struct is a wrapper around Reinforce from DifferentiableExpectations.jl.

Three constructors are available and behave differently in the package:

  • PerturbedOracle
  • PerturbedAdditive
  • PerturbedMultiplicative
source
InferOpt.PerturbedOracle - Method
PerturbedOracle(
    maximizer,
    dist_constructor;
    dist_logdensity_grad,
    nb_samples,
    variance_reduction,
    threaded,
    seed,
    rng,
    kwargs...
) -> PerturbedOracle{_A, _B, _C, _D, Nothing, Random.TaskLocalRNG, Nothing} where {_A, _B, _C, _D}

Constructor for PerturbedOracle.

source
InferOpt.Pushforward - Type
struct Pushforward{O<:InferOpt.AbstractOptimizationLayer, P} <: InferOpt.AbstractLayer

Differentiable pushforward of a probabilistic optimization layer with an arbitrary post-processing function.

Pushforward can be used for direct regret minimization (aka learning by experience) when the post-processing returns a cost.

Fields

  • optimization_layer::InferOpt.AbstractOptimizationLayer: probabilistic optimization layer

  • post_processing::Any: callable

source
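Example for Pushforward: a sketch of learning by experience, where the post-processing returns a cost; it assumes the default positional constructor Pushforward(optimization_layer, post_processing) and uses a hypothetical maximizer and cost vector.

    using InferOpt
    using LinearAlgebra: dot

    maximizer(θ; kwargs...) = Float64.(θ .> 0)                  # hypothetical black box
    layer = PerturbedAdditive(maximizer; ε=0.1, nb_samples=10)  # probabilistic layer

    true_costs = [1.0, -2.0, 3.0]            # hypothetical cost vector
    cost(y; kwargs...) = dot(true_costs, y)  # post-processing returning a cost

    pushforward = Pushforward(layer, cost)   # assumed positional constructor
    pushforward(randn(3))                    # expected cost, differentiable w.r.t. θ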
InferOpt.Pushforward - Method
(pushforward::Pushforward)(θ; kwargs...)

Output the expectation of pushforward.post_processing(X), where X follows the distribution defined by pushforward.optimization_layer applied to θ.

This function is differentiable, even if pushforward.post_processing isn't.

source
InferOpt.RegularizedFrankWolfe - Type
RegularizedFrankWolfe <: AbstractRegularized

Regularized optimization layer which relies on the Frank-Wolfe algorithm to define a probability distribution while solving

ŷ(θ) = argmax_{y ∈ C} {θᵀy - Ω(y)}

Warning

Since this is a conditional dependency, you need to have loaded the following packages before using RegularizedFrankWolfe:

  • DifferentiableFrankWolfe.jl
  • FrankWolfe.jl
  • ImplicitDifferentiation.jl

Fields

  • linear_maximizer: linear maximization oracle θ -> argmax_{x ∈ C} θᵀx, implicitly defines the polytope C
  • Ω: regularization function Ω(y)
  • Ω_grad: gradient function of the regularization function ∇Ω(y)
  • frank_wolfe_kwargs: named tuple of keyword arguments passed to the Frank-Wolfe algorithm
  • implicit_kwargs: named tuple of keyword arguments passed to the implicit differentiation algorithm (in particular, the needed linear solver)

Frank-Wolfe parameters

Some values you can tune:

  • epsilon::Float64: precision target
  • max_iteration::Integer: max number of iterations
  • timeout::Float64: max runtime in seconds
  • lazy::Bool: caching strategy
  • away_steps::Bool: avoid zig-zagging
  • line_search::FrankWolfe.LineSearchMethod: step size selection
  • verbose::Bool: console output

See the documentation of FrankWolfe.jl for details.

source
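Example for RegularizedFrankWolfe: a hedged construction sketch; it assumes the default positional constructor in the documented field order, which may differ from the actual one, and uses a hypothetical simplex vertex oracle.

    # Conditional dependencies must be loaded first, as stated above.
    using DifferentiableFrankWolfe, FrankWolfe, ImplicitDifferentiation
    using InferOpt

    linear_maximizer(θ; kwargs...) = Float64.(θ .== maximum(θ))  # vertex oracle of the simplex
    Ω(y) = sum(abs2, y) / 2  # hypothetical half squared-norm regularization
    Ω_grad(y) = y

    layer = RegularizedFrankWolfe(
        linear_maximizer,
        Ω,
        Ω_grad,
        (; max_iteration=100, line_search=FrankWolfe.Agnostic()),  # frank_wolfe_kwargs
        (;),                                                       # implicit_kwargs
    )
    layer(randn(5))  # ŷ(θ), the expectation of the underlying distribution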
InferOpt.RegularizedFrankWolfe - Method
(regularized::RegularizedFrankWolfe)(θ; kwargs...)

Apply compute_probability_distribution(regularized, θ; kwargs...) and return the expectation.

source
InferOpt.SPOPlusLoss - Type
struct SPOPlusLoss{F} <: InferOpt.AbstractLossLayer

Convex surrogate of the Smart "Predict-then-Optimize" loss.

Fields

  • maximizer::Any: linear maximizer function of the form θ -> ŷ(θ) = argmax θᵀy

  • α::Float64: convexification parameter, default = 2.0

Reference: https://arxiv.org/abs/1710.08005

source
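Example for SPOPlusLoss: a usage sketch with a hypothetical maximizer; the constructor SPOPlusLoss(maximizer), with α left at its default of 2.0, is assumed from the fields above.

    using InferOpt

    maximizer(θ; kwargs...) = Float64.(θ .> 0)  # hypothetical linear maximizer
    loss = SPOPlusLoss(maximizer)               # assumed constructor, α defaults to 2.0

    θ, θ_true = randn(4), randn(4)
    y_true = maximizer(θ_true)
    loss(θ, θ_true, y_true)  # providing y_true avoids recomputing it
    loss(θ, θ_true)          # y_true is recomputed from θ_true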
InferOpt.SPOPlusLoss - Method
(spol::SPOPlusLoss)(θ, θ_true, y_true; kwargs...)

Forward pass of the SPO+ loss with given targets θ_true and y_true. Providing y_true directly, instead of recomputing it from θ_true, saves computation time.

source
InferOpt.SPOPlusLoss - Method
(spol::SPOPlusLoss)(θ, θ_true; kwargs...)

Forward pass of the SPO+ loss with given target θ_true. For better performance, you can also provide y_true directly as a third argument.

source
InferOpt.SoftArgmax - Type
SoftArgmax <: AbstractRegularized

Soft argmax activation function s(z) = (e^zᵢ / ∑ e^zⱼ)ᵢ.

Corresponds to regularized prediction on the probability simplex with entropic penalty.

source
InferOpt.SoftRank - Type
SoftRank{is_l2_regularized} <: AbstractRegularized

Fast differentiable ranking regularized layer. It uses an L2 regularization if is_l2_regularized is true, else it uses an entropic (kl) regularization.

As an AbstractRegularized layer, it can also be used for supervised learning with a FenchelYoungLoss.

Fields

  • ε::Float64: size of the regularization
  • rev::Bool: rank in ascending order if false

Reference: https://arxiv.org/abs/2002.08871

source
InferOpt.SoftRank - Method
SoftRank(; ε::Float64=1.0, rev::Bool=false, is_l2_regularized::Bool=true)

Constructor for SoftRank.

Arguments

  • ε::Float64=1.0: size of the regularization
  • rev::Bool=false: rank in ascending order if false
  • is_l2_regularized::Bool=true: use l2 regularization if true, else kl regularization
source
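Example for SoftRank: a short usage sketch, combining it with a FenchelYoungLoss as mentioned above; ranking is assumed to be exported.

    using InferOpt

    layer = SoftRank(; ε=0.1)  # L2-regularized by default
    θ = randn(5)
    layer(θ)                   # differentiable approximation of the rank vector

    # As an AbstractRegularized layer, it fits inside a Fenchel-Young loss:
    loss = FenchelYoungLoss(SoftRank(; ε=0.1))
    loss(θ, ranking(randn(5)))  # target is a hard rank vector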
InferOpt.SoftSort - Type
SoftSort{is_l2_regularized} <: AbstractOptimizationLayer

Fast differentiable sorting optimization layer. It uses an L2 regularization if is_l2_regularized is true, else it uses an entropic (kl) regularization.

Reference https://arxiv.org/abs/2002.08871

Fields

  • ε::Float64: size of the regularization
  • rev::Bool: sort in ascending order if false
source
InferOpt.SoftSort - Method
SoftSort(; ε::Float64=1.0, rev::Bool=false, is_l2_regularized::Bool=true)

Constructor for SoftSort.

Arguments

  • ε::Float64=1.0: size of the regularization
  • rev::Bool=false: sort in ascending order if false
  • is_l2_regularized::Bool=true: use l2 regularization if true, else kl regularization
source
InferOpt.SparseArgmax - Type
SparseArgmax <: AbstractRegularized

Compute the Euclidean projection of the vector z onto the probability simplex.

Corresponds to regularized prediction on the probability simplex with square norm penalty.

source
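Example comparing the two simplex layers, assuming zero-argument constructors SoftArgmax() and SparseArgmax().

    using InferOpt

    z = [1.0, 2.0, 0.5]
    SoftArgmax()(z)    # dense probability vector (softmax)
    SparseArgmax()(z)  # sparse Euclidean projection onto the simplex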

Functions

Base.rand - Method
rand(
    rng::Random.AbstractRNG,
    perturbation::InferOpt.AbstractPerturbation
) -> Any
source
Base.rand - Method
rand(
    rng::Random.AbstractRNG,
    d::InferOpt.ExponentialOf
) -> Any
source
InferOpt.PerturbedAdditive - Method
PerturbedAdditive(
    maximizer;
    ε,
    perturbation_dist,
    nb_samples,
    variance_reduction,
    seed,
    threaded,
    rng,
    dist_logdensity_grad
) -> Union{PerturbedOracle{InferOpt.AdditivePerturbation{Distributions.Normal{Float64}}, _A, _B, _C, Nothing, Random.TaskLocalRNG, Nothing} where {_A, _B, _C}, PerturbedOracle{InferOpt.AdditivePerturbation{Distributions.Normal{Float64}}, _A, _B, _C, InferOpt.NormalAdditiveGradLogdensity, Random.TaskLocalRNG, Nothing} where {_A, _B, _C}}

Constructor for PerturbedOracle with an additive perturbation.

source
InferOpt.PerturbedMultiplicative - Method
PerturbedMultiplicative(
    maximizer;
    ε,
    perturbation_dist,
    nb_samples,
    variance_reduction,
    seed,
    threaded,
    rng,
    dist_logdensity_grad
) -> Union{PerturbedOracle{InferOpt.MultiplicativePerturbation{Distributions.Normal{Float64}}, _A, _B, _C, Nothing, Random.TaskLocalRNG, Nothing} where {_A, _B, _C}, PerturbedOracle{InferOpt.MultiplicativePerturbation{Distributions.Normal{Float64}}, _A, _B, _C, InferOpt.NormalMultiplicativeGradLogdensity, Random.TaskLocalRNG, Nothing} where {_A, _B, _C}}

Constructor for PerturbedOracle with a multiplicative perturbation.

source
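Both constructors share the same keyword interface; a minimal sketch with a hypothetical maximizer.

    using InferOpt

    maximizer(θ; kwargs...) = Float64.(θ .> 0)  # hypothetical black-box maximizer

    # Additive Gaussian perturbation θ + εZ:
    add_layer = PerturbedAdditive(maximizer; ε=0.1, nb_samples=10)
    # Multiplicative perturbation θ ⊙ exp(εZ - ε²/2), which preserves the sign of θ:
    mul_layer = PerturbedMultiplicative(maximizer; ε=0.1, nb_samples=10)

    θ = randn(5)
    add_layer(θ)
    mul_layer(abs.(θ))  # typically used with nonnegative θ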
InferOpt.apply_g - Method
apply_g(f::LinearMaximizer, y; kwargs...) -> Any

Applies the function g of the LinearMaximizer f to y.

source
InferOpt.compute_probability_distribution - Function
compute_probability_distribution(layer, θ; kwargs...)

Apply a probabilistic optimization layer to an objective direction θ in order to generate a FixedAtomsProbabilityDistribution on the vertices of a polytope.

source
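A sketch with a perturbed layer, assuming it supports this probabilistic interface; functions are qualified in case they are not exported.

    using InferOpt

    maximizer(θ; kwargs...) = InferOpt.one_hot_argmax(θ)  # hypothetical vertex oracle
    layer = PerturbedAdditive(maximizer; ε=0.1, nb_samples=100)

    θ = randn(4)
    dist = InferOpt.compute_probability_distribution(layer, θ)  # distribution over vertices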
InferOpt.compute_regularization - Function
compute_regularization(regularized::AbstractRegularized, y)

Return the convex penalty Ω(y) associated with an AbstractRegularized layer.

source
InferOpt.get_y_true - Function
get_y_true(t_true::Any)

Retrieve y_true from t_true.

This method should be implemented when using a custom data structure for t_true other than a NamedTuple.

source
InferOpt.get_y_true - Method
get_y_true(t_true::NamedTuple)

Retrieve y_true from t_true. t_true must contain a y_true field.

source
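A sketch of plugging in a custom data structure, with a hypothetical sample type.

    using InferOpt

    # Hypothetical sample type carrying more than the target solution.
    struct MySample{Y,I}
        y_true::Y
        instance::I
    end

    # Implementing the interface lets imitation losses extract the target.
    InferOpt.get_y_true(t_true::MySample) = t_true.y_true

    InferOpt.get_y_true((; y_true=[1.0, 0.0]))  # the NamedTuple method works out of the box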
InferOpt.isprobadist - Method
isprobadist(p::AbstractArray{R<:Real, 1}) -> Any

Check whether the elements of p are nonnegative and sum to 1.

source
InferOpt.objective_value - Method
objective_value(f::LinearMaximizer, θ, y; kwargs...) -> Any

Compute the objective value of the LinearMaximizer f for weights θ and solution y, i.e. θᵀg(y) + h(y).

source
InferOpt.one_hot_argmax - Method
one_hot_argmax(
    z::AbstractArray{R<:Real, 1};
    kwargs...
) -> Any

One-hot encoding of the argmax function.

source
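A tiny example (functions qualified in case they are not exported).

    using InferOpt

    z = [0.2, 1.5, -0.3]
    y = InferOpt.one_hot_argmax(z)  # [0.0, 1.0, 0.0]
    InferOpt.isprobadist(y)         # true: nonnegative entries summing to 1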
InferOpt.ranking - Method
ranking(θ::AbstractVector; rev, kwargs...) -> Any

Compute the vector r such that rᵢ is the rank of θᵢ in θ.

source
InferOpt.shannon_entropy - Method
shannon_entropy(p::AbstractArray{R<:Real, 1}) -> Any

Compute the Shannon entropy of a probability distribution: H(p) = -∑ pᵢ log(pᵢ).

source
InferOpt.soft_rank - Method
soft_rank(θ::AbstractVector; ε=1.0, rev::Bool=false)

Fast differentiable ranking of vector θ.

Arguments

  • θ: vector to rank

Keyword (optional) arguments

  • ε::Float64=1.0: size of the regularization
  • rev::Bool=false: rank in ascending order if false
  • regularization=:l2: use l2 regularization if :l2, and kl regularization if :kl

See also soft_rank_l2 and soft_rank_kl.

source
InferOpt.soft_sort - Method
soft_sort(θ::AbstractVector; ε=1.0, rev::Bool=false, regularization=:l2)

Fast differentiable sort of vector θ.

Arguments

  • θ: vector to sort

Keyword (optional) arguments

  • ε::Float64=1.0: size of the regularization
  • rev::Bool=false: sort in ascending order if false
  • regularization=:l2: use l2 regularization if :l2, and kl regularization if :kl

See also soft_sort_l2 and soft_sort_kl.

source
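A short sketch of both helpers, assuming soft_sort and soft_rank are exported.

    using InferOpt

    θ = [0.8, 0.1, 0.5]
    soft_sort(θ; ε=0.1)                      # smooth approximation of sort(θ)
    soft_rank(θ; ε=0.1, regularization=:kl)  # smooth approximation of the rank vector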
InferOpt.zero_one_loss - Method
zero_one_loss(y, y_true)

0-1 loss for multiclass classification: δ(y, y_true) = 0 if y = y_true, and 1 otherwise.

source
