Estimation & Learning in Stochastic Systems II – Invited Special Session


Session Type: Lecture
Session Code: C3L-E
Location: Room 5
Date & Time: Friday, March 24, 2023 (11:20-12:20)
Chair: James Spall
Track: 12
Papers:
Paper 3096: Dynamics of SGD with Stochastic Polyak Stepsizes: Truly Adaptive Variants and Convergence to Exact Solution
Authors: Nicolas Loizou
Abstract: Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent (SGD) with stochastic Polyak stepsize (SPS). The proposed SPS comes with strong convergence guarantees and competitive performance; however, it has two main drawbacks when used in non-over-parameterized regimes: (i) it requires a priori knowledge of the optimal mini-batch losses, which are not available when the interpolation condition is not satisfied (e.g., regularized objectives), and (ii) it guarantees convergence only to a neighborhood of the solution. In this work, we study the dynamics and convergence properties of SGD equipped with new variants of the stochastic Polyak stepsize and provide solutions to both drawbacks of the original SPS. We first show that a simple modification of the original SPS that uses lower bounds instead of the optimal function values directly resolves issue (i). Solving issue (ii), on the other hand, turns out to be more challenging and leads us to valuable insights into the method's behavior. We show that if interpolation is not satisfied, the correlation between SPS and the stochastic gradients introduces a bias that effectively distorts the expectation of the gradient signal near minimizers, leading to non-convergence even if the stepsize is scaled down during training. To fix this issue, we propose DecSPS, a novel modification of SPS that guarantees convergence to the exact minimizer without a priori knowledge of the problem parameters. For strongly convex optimization problems, DecSPS is the first stochastic adaptive optimization method that converges to the exact solution without restrictive assumptions such as bounded iterates/gradients.
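As an illustration of the stepsize rule this abstract builds on, here is a minimal sketch of SGD with a capped stochastic Polyak stepsize, following the general form in Loizou et al. (2021). The least-squares problem, the constants, and the choice of lower bound l_i = 0 are all hypothetical, chosen so that interpolation holds and the lower bound coincides with the optimal mini-batch losses:

```python
import numpy as np

# Hypothetical consistent least-squares problem: interpolation holds,
# so every optimal mini-batch loss is 0 and the lower bound l_i = 0
# (used in place of the unknown f_i*) is exact.
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true                      # consistent system: A x_true = b

def loss_i(x, i):
    r = A[i] @ x - b[i]
    return 0.5 * r ** 2

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

def sgd_sps(x0, steps=2000, c=0.5, gamma_b=10.0, lb=0.0):
    """SGD with a capped stochastic Polyak stepsize:
    gamma_k = min(gamma_b, (f_i(x_k) - l_i) / (c * ||grad f_i(x_k)||^2)),
    where l_i is a lower bound on the optimal mini-batch loss."""
    x = x0.copy()
    for _ in range(steps):
        i = rng.integers(n)
        g = grad_i(x, i)
        g2 = g @ g
        if g2 < 1e-12:              # sampled row already fit exactly
            continue
        gamma = min(gamma_b, (loss_i(x, i) - lb) / (c * g2))
        x -= gamma * g
    return x

x_hat = sgd_sps(np.zeros(d))
print(np.linalg.norm(x_hat - x_true))   # small under interpolation
```

With c = 0.5 on this problem the rule reduces to the classical Kaczmarz step, which converges for consistent systems; outside interpolation the abstract's point is precisely that this plain rule stalls in a neighborhood, motivating DecSPS.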
Paper 3168: SPSA-Based Switch Updating Algorithm for Constrained Stochastic Optimization
Authors: Zhichao Jia, Ziyi Wei
Abstract: Simultaneous perturbation stochastic approximation (SPSA) is widely used in stochastic optimization due to its high efficiency, asymptotic stability, and reduced number of required loss function measurements. However, the standard SPSA algorithm must be modified to handle constrained problems. In recent years, sequential quadratic programming (SQP)-based projection ideas and penalty ideas have been analyzed. Both ideas have convergence results and a potentially wide range of applications, but also practical limitations, such as computation time, complexity, and feasibility guarantees. We propose an SPSA-based switch updating algorithm, which updates based on either the loss function or the inequality constraints, depending on the current feasibility at each iteration. We show convergence results for this algorithm and analyze its properties relative to other methods. We also numerically compare the switch updating algorithm with the penalty function approach on a constrained example.
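A minimal sketch of the switch-updating idea described in this abstract: the standard two-measurement SPSA gradient estimate, applied to the constraint when the iterate is infeasible and to the loss otherwise. The paper's actual algorithm, gains, and convergence conditions are not reproduced here; the toy problem, the gain constants, and the simple feasibility test are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def spsa_gradient(fn, theta, c_k):
    """Standard SPSA gradient estimate: two measurements of fn per
    iteration, regardless of the problem dimension."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher perturbation
    y_plus = fn(theta + c_k * delta)
    y_minus = fn(theta - c_k * delta)
    return (y_plus - y_minus) / (2.0 * c_k * delta)

def switch_spsa(loss, constraint, theta0, iters=1000, a=0.1, c=0.1,
                alpha=0.602, gamma=0.101):
    """Switch updating (sketch): take an SPSA step on the inequality
    constraint g(theta) <= 0 while the iterate is infeasible, and on
    the loss otherwise. Standard gain sequences a_k = a/(k+1)^alpha,
    c_k = c/(k+1)^gamma."""
    theta = theta0.copy()
    for k in range(iters):
        a_k = a / (k + 1) ** alpha
        c_k = c / (k + 1) ** gamma
        target = constraint if constraint(theta) > 0 else loss
        theta = theta - a_k * spsa_gradient(target, theta, c_k)
    return theta

# Toy problem: minimize (t0 - 1)^2 + (t1 + 2)^2 subject to t0 + t1 <= 0.
# The unconstrained minimizer (1, -2) is feasible, so once feasibility is
# reached the switch settles into pure loss updates.
f = lambda t: (t[0] - 1.0) ** 2 + (t[1] + 2.0) ** 2
g = lambda t: t[0] + t[1]
theta_hat = switch_spsa(f, g, np.array([2.0, 2.0]))
print(theta_hat)
```

The appeal relative to penalty or projection approaches, per the abstract, is that each iteration needs only the usual two SPSA measurements of whichever function currently drives the update.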
Paper 3141: Piecewise Linear and Stochastic Models for the Analysis of Cyber Resilience
Authors: Michael Weisman (1), Alexander Kott (1), Joachim Vandekerckhove (2)
Abstract: We model a vehicle equipped with an autonomous cyber-defense system that also has some inherent physical resilience features. When attacked, this ensemble of cyber-physical features (i.e., "bonware") strives to resist and recover from the performance degradation caused by the malware's attack. We model the underlying differential equations governing such attacks for piecewise linear characterizations of malware and bonware, develop a discrete-time stochastic model, and show that averages of instantiations of the stochastic model approximate solutions to the continuous differential equation. We develop a theory and methodology for approximating the parameters associated with these equations.
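The relationship claimed in this abstract, that averages of sample paths of a discrete-time stochastic model approximate the continuous ODE solution, can be demonstrated with a generic linear degradation/recovery model. The equation below is a hypothetical stand-in, not the authors' model: performance x is pulled toward 0 at a malware rate m and toward 1 at a bonware rate b, and the stochastic model is an Euler discretization with zero-mean noise:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear model (not the paper's equations): performance
# x in [0, 1] evolves as dx/dt = b*(1 - x) - m*x, with bonware rate b
# driving recovery and malware rate m driving degradation.
b_rate, m_rate = 0.8, 0.3
x0, dt, T = 1.0, 0.01, 5.0
steps = int(T / dt)

def ode_solution(t):
    """Closed form: x(t) = x* + (x0 - x*) * exp(-(b + m) * t),
    with equilibrium x* = b / (b + m)."""
    x_star = b_rate / (b_rate + m_rate)
    return x_star + (x0 - x_star) * np.exp(-(b_rate + m_rate) * t)

def simulate_paths(n_paths, sigma=0.05):
    """Discrete-time stochastic model: deterministic Euler step plus
    zero-mean Gaussian noise on every path."""
    x = np.full(n_paths, x0)
    for _ in range(steps):
        drift = b_rate * (1.0 - x) - m_rate * x
        x = x + dt * drift + sigma * np.sqrt(dt) * rng.normal(size=n_paths)
    return x

avg = simulate_paths(5000).mean()
print(avg, ode_solution(T))   # path average tracks the ODE solution
```

Because the noise has zero mean, averaging many instantiations cancels the stochastic term and leaves (up to discretization error) the deterministic solution, which is the pattern of result the abstract describes for its piecewise-linear malware/bonware characterizations.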