# DifferentiableFrankWolfe
Documentation for DifferentiableFrankWolfe.jl.
## Public API
`DifferentiableFrankWolfe.DifferentiableFrankWolfe` — Module

Differentiable wrapper for FrankWolfe.jl convex optimization routines.
`DifferentiableFrankWolfe.DiffFW` — Type

`DiffFW`: callable parametrized wrapper for the Frank-Wolfe algorithm that solves `θ -> argmin_{x ∈ C} f(x, θ)` from a given starting point `x0`. The solution routine can be differentiated implicitly with respect to `θ`, but not with respect to `x0`.
Constructor
```julia
DiffFW(f, f_grad1, lmo, alg=away_frank_wolfe; implicit_kwargs=(;))
```

- `f`: function `f(x, θ)` to minimize with respect to `x`
- `f_grad1`: gradient `∇ₓf(x, θ)` of `f` with respect to `x`
- `lmo`: linear minimization oracle `θ -> argmin_{x ∈ C} θᵀx` from FrankWolfe.jl, which implicitly defines the convex set `C`
- `alg`: optimization algorithm from FrankWolfe.jl, which must return an `active_set`
- `implicit_kwargs`: keyword arguments passed to the `ImplicitFunction` object from ImplicitDifferentiation.jl
References
Efficient and Modular Implicit Differentiation, Blondel et al. (2022)
`DifferentiableFrankWolfe.DiffFW` — Method

```julia
(dfw::DiffFW)(θ::AbstractArray, x0::AbstractArray; kwargs...)
```

Apply the differentiable Frank-Wolfe algorithm defined by `dfw` to the parameter `θ` with starting point `x0`. Keyword arguments are passed on to the Frank-Wolfe algorithm inside `dfw`.

Return the optimal solution `x`.
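To make the calling convention concrete, here is a hypothetical usage sketch for projecting a parameter vector onto the probability simplex and differentiating through the solution. The objective `f`, its gradient `f_grad1`, and the parameter `θ` are our own illustrative choices; the sketch assumes DifferentiableFrankWolfe.jl, FrankWolfe.jl, and Zygote.jl are installed, and uses `FrankWolfe.ProbabilitySimplexOracle` as the linear minimization oracle.

```julia
using DifferentiableFrankWolfe, FrankWolfe, Zygote

# Illustrative objective: squared Euclidean distance to θ,
# so the minimizer over the simplex is the projection of θ.
f(x, θ) = 0.5 * sum(abs2, x - θ)
f_grad1(x, θ) = x - θ                              # gradient with respect to x

lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)     # implicitly defines C
dfw = DiffFW(f, f_grad1, lmo)                      # default alg = away_frank_wolfe

θ = [0.6, 0.4]
x0 = FrankWolfe.compute_extreme_point(lmo, θ)      # a feasible vertex as starting point
x = dfw(θ, x0; max_iteration=1000)                 # solve for this θ

# Implicit differentiation with respect to θ (not x0):
J = Zygote.jacobian(θ -> dfw(θ, x0; max_iteration=1000), θ)[1]
```

Since this `θ` already lies in the simplex, the solution should be close to `θ` itself; the Jacobian `J` is obtained through the implicit function theorem rather than by unrolling the solver iterations.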
## Private API
`DifferentiableFrankWolfe.ConditionsFW` — Type

`ConditionsFW`: differentiable optimality conditions for `DiffFW`, which rely on a custom `simplex_projection` implementation.
`DifferentiableFrankWolfe.ForwardFW` — Type

`ForwardFW`: underlying solver for `DiffFW`, which relies on a variant of Frank-Wolfe with active set memorization.
`DifferentiableFrankWolfe.detailed_output` — Method

```julia
detailed_output(dfw::DiffFW, θ::AbstractArray, x0::AbstractArray; kwargs...)
```

Apply the differentiable Frank-Wolfe algorithm defined by `dfw` to the parameter `θ` with starting point `x0`. Keyword arguments are passed on to the Frank-Wolfe algorithm inside `dfw`.

Return a couple `(x, stats)` where `x` is the solution and `stats` is a named tuple containing additional information (its contents are not covered by the public API and are mostly useful for debugging).
`DifferentiableFrankWolfe.simplex_projection` — Method

```julia
simplex_projection(z)
```

Compute the Euclidean projection of the vector `z` onto the probability simplex.
This function is differentiable thanks to a custom chain rule.
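For reference, the standard sort-based projection from Martins and Astudillo (2016) can be sketched as follows; this is a minimal self-contained illustration (the function name `simplex_projection_sketch` is ours), and the package's own implementation may differ in details such as the custom chain rule.

```julia
# Euclidean projection of z onto the probability simplex
# {x : x ≥ 0, sum(x) == 1}, via the sparsemax algorithm.
function simplex_projection_sketch(z::AbstractVector)
    n = length(z)
    u = sort(z; rev=true)        # sort entries in decreasing order
    cssv = cumsum(u)             # cumulative sums of the sorted entries
    # Largest index j whose entry stays positive after the common shift;
    # j = 1 always qualifies, so ρ is never `nothing`.
    ρ = findlast(j -> u[j] + (1 - cssv[j]) / j > 0, 1:n)
    λ = (1 - cssv[ρ]) / ρ        # shift that makes the result sum to one
    return max.(z .+ λ, 0)       # shift and clip negatives to zero
end
```

For example, `simplex_projection_sketch([2.0, 0.0])` yields `[1.0, 0.0]`, and the output always has nonnegative entries summing to one.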
References
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification, Martins and Astudillo (2016)