DifferentiableFrankWolfe
Documentation for DifferentiableFrankWolfe.jl.
Public API
DifferentiableFrankWolfe.DifferentiableFrankWolfe — Module
DifferentiableFrankWolfe
Differentiable wrapper for FrankWolfe.jl convex optimization routines.
DifferentiableFrankWolfe.DiffFW — Type
DiffFW
Callable parametrized wrapper for the Frank-Wolfe algorithm to solve θ -> argmin_{x ∈ C} f(x, θ) from a given starting point x0. The solution routine can be differentiated implicitly with respect to θ, but not with respect to x0.
Constructor
DiffFW(f, f_grad1, lmo, alg=away_frank_wolfe; implicit_kwargs=(;))
f: function f(x, θ) to minimize with respect to x
f_grad1: gradient ∇ₓf(x, θ) of f with respect to x
lmo: linear minimization oracle θ -> argmin_{x ∈ C} θᵀx from FrankWolfe.jl, which implicitly defines the convex set C
alg: optimization algorithm from FrankWolfe.jl, must return an active_set
implicit_kwargs: keyword arguments passed to the ImplicitFunction object from ImplicitDifferentiation.jl
References
Efficient and Modular Implicit Differentiation, Blondel et al. (2022)
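As an illustrative sketch only (the objective, its gradient, and the choice of oracle below are example assumptions, not fixed by the API), a DiffFW solver over the unit simplex might be set up as follows:

```julia
using DifferentiableFrankWolfe: DiffFW
using FrankWolfe: UnitSimplexOracle

# Hypothetical objective: squared distance between x and the parameter θ
f(x, θ) = 0.5 * sum(abs2, x - θ)
# Gradient of f with respect to its first argument x
f_grad1(x, θ) = x - θ
# Linear minimization oracle from FrankWolfe.jl, defining the unit simplex as C
lmo = UnitSimplexOracle(1.0)

# Callable differentiable Frank-Wolfe wrapper
dfw = DiffFW(f, f_grad1, lmo)
```

With this particular choice of f, solving the problem amounts to a Euclidean projection of θ onto the unit simplex.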
DifferentiableFrankWolfe.DiffFW — Method
(dfw::DiffFW)(θ::AbstractArray, x0::AbstractArray; kwargs...)
Apply the differentiable Frank-Wolfe algorithm defined by dfw to parameter θ with starting point x0. Keyword arguments are passed on to the Frank-Wolfe algorithm inside dfw.
Return the optimal solution x.
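A minimal usage sketch, reusing the hypothetical dfw from the previous example; the keyword arguments shown and the use of Zygote for differentiation are assumptions, not requirements:

```julia
using Zygote

θ = rand(5)
x0 = fill(1 / 5, 5)  # feasible starting point inside the simplex

# Solve θ -> argmin_{x ∈ C} f(x, θ); kwargs are forwarded to FrankWolfe.jl
x = dfw(θ, x0; max_iteration=100, epsilon=1e-4)

# Jacobian of the solution with respect to θ, obtained by implicit differentiation
J = Zygote.jacobian(_θ -> dfw(_θ, x0; max_iteration=100, epsilon=1e-4), θ)[1]
```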
Private API
DifferentiableFrankWolfe.ConditionsFW — Type
ConditionsFW
Differentiable optimality conditions for DiffFW, which rely on a custom simplex_projection implementation.
DifferentiableFrankWolfe.ForwardFW — Type
ForwardFW
Underlying solver for DiffFW, which relies on a variant of Frank-Wolfe with active set memorization.
DifferentiableFrankWolfe.detailed_output — Method
detailed_output(dfw::DiffFW, θ::AbstractArray, x0::AbstractArray; kwargs...)
Apply the differentiable Frank-Wolfe algorithm defined by dfw to parameter θ with starting point x0. Keyword arguments are passed on to the Frank-Wolfe algorithm inside dfw.
Return a couple (x, stats) where x is the solution and stats is a named tuple containing additional information (its contents are not covered by the public API and are mostly useful for debugging).
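A short hedged sketch of a call, reusing the hypothetical dfw, θ, and x0 from the examples above (detailed_output is not exported, hence the explicit import):

```julia
using DifferentiableFrankWolfe: detailed_output

x, stats = detailed_output(dfw, θ, x0; max_iteration=100)
# x is the solution; stats is a named tuple of solver internals, useful only for debugging
```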
DifferentiableFrankWolfe.simplex_projection — Method
simplex_projection(z)
Compute the Euclidean projection of the vector z onto the probability simplex.
This function is differentiable thanks to a custom chain rule.
References
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification, Martins and Astudillo (2016)
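For illustration only (the input vector is arbitrary), the projection can be checked on a small example:

```julia
using DifferentiableFrankWolfe: simplex_projection

z = [0.3, -0.2, 1.5]
p = simplex_projection(z)
# p lies on the probability simplex: nonnegative entries summing to 1
@assert all(p .>= 0) && sum(p) ≈ 1
```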