DifferentiableFrankWolfe
Documentation for DifferentiableFrankWolfe.jl.
Public API
DifferentiableFrankWolfe.DifferentiableFrankWolfe — Module
DifferentiableFrankWolfe
Differentiable wrapper for FrankWolfe.jl convex optimization routines.
DifferentiableFrankWolfe.DiffFW — Type
DiffFW
Callable parametrized wrapper for the Frank-Wolfe algorithm to solve θ -> argmin_{x ∈ C} f(x, θ) from a given starting point x0. The solution routine can be differentiated implicitly with respect to θ, but not with respect to x0.
Constructor
DiffFW(f, f_grad1, lmo, alg=away_frank_wolfe; implicit_kwargs=(;))
f: function f(x, θ) to minimize with respect to x
f_grad1: gradient ∇ₓf(x, θ) of f with respect to x
lmo: linear minimization oracle θ -> argmin_{x ∈ C} θᵀx from FrankWolfe.jl, which implicitly defines the convex set C
alg: optimization algorithm from FrankWolfe.jl, must return an active_set
implicit_kwargs: keyword arguments passed to the ImplicitFunction object from ImplicitDifferentiation.jl
References
Efficient and Modular Implicit Differentiation, Blondel et al. (2022)
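As an illustrative sketch only (the objective, its gradient, and the choice of oracle below are example assumptions, not fixed by the API), a DiffFW solver over the unit simplex might be set up as follows:

```julia
using DifferentiableFrankWolfe: DiffFW
using FrankWolfe: UnitSimplexOracle

# Hypothetical objective: squared distance between x and the parameter θ
f(x, θ) = 0.5 * sum(abs2, x - θ)
# Gradient of f with respect to its first argument x
f_grad1(x, θ) = x - θ
# Linear minimization oracle from FrankWolfe.jl, defining the unit simplex as C
lmo = UnitSimplexOracle(1.0)

# Callable differentiable Frank-Wolfe wrapper
dfw = DiffFW(f, f_grad1, lmo)
```

With this particular choice of f, solving the problem amounts to a Euclidean projection of θ onto the unit simplex.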
DifferentiableFrankWolfe.DiffFW — Method
(dfw::DiffFW)(θ::AbstractArray, x0::AbstractArray; kwargs...)
Apply the differentiable Frank-Wolfe algorithm defined by dfw to parameter θ with starting point x0. Keyword arguments are passed on to the Frank-Wolfe algorithm inside dfw.
Return the optimal solution x.
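A minimal usage sketch, reusing the hypothetical dfw from the previous example; the keyword arguments shown and the use of Zygote for differentiation are assumptions, not requirements:

```julia
using Zygote

θ = rand(5)
x0 = fill(1 / 5, 5)  # feasible starting point inside the simplex

# Solve θ -> argmin_{x ∈ C} f(x, θ); kwargs are forwarded to FrankWolfe.jl
x = dfw(θ, x0; max_iteration=100, epsilon=1e-4)

# Jacobian of the solution with respect to θ, obtained by implicit differentiation
J = Zygote.jacobian(_θ -> dfw(_θ, x0; max_iteration=100, epsilon=1e-4), θ)[1]
```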
Private API
DifferentiableFrankWolfe.ConditionsFW — Type
ConditionsFW
Differentiable optimality conditions for DiffFW, which rely on a custom simplex_projection implementation.
DifferentiableFrankWolfe.ForwardFW — Type
ForwardFW
Underlying solver for DiffFW, which relies on a variant of Frank-Wolfe with active set memorization.
DifferentiableFrankWolfe.detailed_output — Method
detailed_output(dfw::DiffFW, θ::AbstractArray, x0::AbstractArray; kwargs...)
Apply the differentiable Frank-Wolfe algorithm defined by dfw to parameter θ with starting point x0. Keyword arguments are passed on to the Frank-Wolfe algorithm inside dfw.
Return a couple (x, stats) where x is the solution and stats is a named tuple containing additional information (its contents are not covered by the public API and are mostly useful for debugging).
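A short hedged sketch of a call, reusing the hypothetical dfw, θ, and x0 from the examples above (detailed_output is not exported, hence the explicit import):

```julia
using DifferentiableFrankWolfe: detailed_output

x, stats = detailed_output(dfw, θ, x0; max_iteration=100)
# x is the solution; stats is a named tuple of solver internals, useful only for debugging
```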
DifferentiableFrankWolfe.simplex_projection — Method
simplex_projection(z)
Compute the Euclidean projection of the vector z onto the probability simplex.
This function is differentiable thanks to a custom chain rule.
References
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification, Martins and Astudillo (2016)
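For illustration only (the input vector is arbitrary), the projection can be checked on a small example:

```julia
using DifferentiableFrankWolfe: simplex_projection

z = [0.3, -0.2, 1.5]
p = simplex_projection(z)
# p lies on the probability simplex: nonnegative entries summing to 1
@assert all(p .>= 0) && sum(p) ≈ 1
```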