Source author record

Carlos Alberto Gomez-Uribe

Carlos Alberto Gomez-Uribe appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

3 works
1 topic
3 close collaborators

Published work

3 published items

Preprint, 2022, arXiv

Shift-Curvature, SGD, and Generalization

A longstanding debate surrounds the related hypotheses that low-curvature minima generalize better, and that SGD discourages curvature. We offer a more complete and nuanced view in support of both. First, we show that curvature harms test performance through two new mechanisms, the shift-curvature and bias-curvature, in addition to a known parameter-covariance mechanism. The three curvature-mediated contributions to test performance are reparametrization-invariant although curvature is not. The shift in the shift-curvature is the line connecting train and test local minima, which differ due to dataset sampling or distribution shift. Although the shift is unknown at training time, the shift-curvature can still be mitigated by minimizing overall curvature. Second, we derive a new, explicit SGD steady-state distribution showing that SGD optimizes an effective potential related to but different from train loss, and that SGD noise mediates a trade-off between deep versus low-curvature regions of this effective potential. Third, combining our test performance analysis with the SGD steady state shows that for small SGD noise, the shift-curvature may be the most significant of the three mechanisms. Our experiments confirm the impact of shift-curvature on test loss, and further explore the relationship between SGD noise and curvature.
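The role of the shift can be illustrated with a toy calculation. The sketch below is not the paper's derivation: it assumes, purely for illustration, that the train and test losses are quadratics sharing one Hessian and differing only in where their minimum sits. Under that assumption, the extra test loss paid at the train minimum is half the curvature measured along the shift, so lowering overall curvature lowers the penalty even when the shift is unknown. All names and numbers are hypothetical.

```python
import numpy as np

def quadratic_loss(theta, minimum, hessian):
    """Quadratic loss with the given minimum and (constant) Hessian."""
    d = theta - minimum
    return 0.5 * d @ hessian @ d

# Hypothetical 2-D setup: one sharp direction, one flat direction.
hessian = np.array([[4.0, 0.0],
                    [0.0, 0.5]])
theta_train = np.array([0.0, 0.0])      # minimum of the train loss
shift = np.array([0.1, 0.1])            # assumed train-to-test shift
theta_test = theta_train + shift        # minimum of the test loss

# Test loss evaluated at the train minimum (excess over the test optimum, which is 0).
excess_test_loss = quadratic_loss(theta_train, theta_test, hessian)

# Curvature along the shift direction, the quantity this toy model penalizes.
curvature_along_shift = 0.5 * shift @ hessian @ shift

# Same shift, four times flatter landscape: the penalty shrinks by the same factor.
flatter_excess = quadratic_loss(theta_train, theta_test, 0.25 * hessian)

print(excess_test_loss, curvature_along_shift, flatter_excess)
```

In this toy model the penalty scales linearly with the Hessian, which is why reducing curvature in every direction helps regardless of the direction of the shift.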

Preprint, 2021, arXiv

The decoupled extended Kalman filter for dynamic exponential-family factorization models

Motivated by the needs of online large-scale recommender systems, we specialize the decoupled extended Kalman filter (DEKF) to factorization models, including factorization machines, matrix and tensor factorization, and illustrate the effectiveness of the approach through numerical experiments on synthetic and on real-world data. Online learning of model parameters through the DEKF makes factorization models more broadly useful by (i) allowing for more flexible observations through the entire exponential family, (ii) modeling parameter drift, and (iii) producing parameter uncertainty estimates that can enable explore/exploit and other applications. We use a different parameter dynamics than the standard DEKF, allowing parameter drift while encouraging reasonable values. We also present an alternate derivation of the extended Kalman filter and DEKF that highlights the role of the Fisher information matrix in the EKF.
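To make the "decoupled" idea concrete, here is a minimal sketch of one decoupled EKF step for a single matrix-factorization observation. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a Gaussian observation y = u·v + noise rather than a general exponential-family one, uses plain random-walk drift rather than the modified dynamics the abstract mentions, and the function name `dekf_mf_step` and its parameters are hypothetical. The decoupling shows up in that each factor vector keeps its own covariance block and the cross-covariance between blocks is ignored.

```python
import numpy as np

def dekf_mf_step(u, P_u, v, P_v, y, obs_var, drift_var=0.0):
    """One decoupled EKF step for a single observation y ~ N(u.v, obs_var).

    u, v     : current mean estimates of the user and item factor vectors
    P_u, P_v : their covariance blocks (cross-covariance is not tracked)
    """
    k = u.size
    # Predict: random-walk drift (an assumption here) inflates each block.
    P_u = P_u + drift_var * np.eye(k)
    P_v = P_v + drift_var * np.eye(k)

    # Jacobians of the prediction u.v with respect to each block.
    H_u, H_v = v, u

    # Shared innovation variance: sum of per-block contributions plus noise.
    S = H_u @ P_u @ H_u + H_v @ P_v @ H_v + obs_var
    err = y - u @ v

    # Per-block gains and updates, all using the same innovation.
    K_u = P_u @ H_u / S
    K_v = P_v @ H_v / S
    u_new = u + K_u * err
    v_new = v + K_v * err
    P_u_new = P_u - np.outer(K_u, H_u @ P_u)
    P_v_new = P_v - np.outer(K_v, H_v @ P_v)
    return u_new, P_u_new, v_new, P_v_new

# Hypothetical single update: prediction starts at 1.0, observation is 2.0.
u0 = np.array([1.0, 0.0])
v0 = np.array([1.0, 0.0])
u1, P_u1, v1, P_v1 = dekf_mf_step(u0, np.eye(2), v0, np.eye(2), y=2.0, obs_var=1.0)
print(u1 @ v1)  # the new prediction moves from 1.0 toward the observed 2.0
```

The updated covariance blocks are what would feed uncertainty-aware uses such as explore/exploit; only the blocks touched by an observation need updating, which is what keeps the per-observation cost small.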

Preprint, 2016, arXiv

Online Algorithms For Parameter Mean And Variance Estimation In Dynamic Regression Models

We study the problem of estimating the parameters of a regression model from a set of observations, each consisting of a response and a predictor. The response is assumed to be related to the predictor via a regression model of unknown parameters. Often, in such models the parameters to be estimated are assumed to be constant. Here we consider the more general scenario where the parameters are allowed to evolve over time, a more natural assumption for many applications. We model these dynamics via a linear update equation with additive noise that is often used in a wide range of engineering applications, particularly in the well-known and widely used Kalman filter (where the system state it seeks to estimate maps to the parameter values here). We derive an approximate algorithm to estimate both the mean and the variance of the parameter estimates in an online fashion for a generic regression model. This algorithm turns out to be equivalent to the extended Kalman filter. We specialize our algorithm to the multivariate exponential family distribution to obtain a generalization of the generalized linear model (GLM). Because the common regression models encountered in practice such as logistic, exponential and multinomial all have observations modeled through an exponential family distribution, our results are used to easily obtain algorithms for online mean and variance parameter estimation for all these regression models in the context of time-dependent parameters. Lastly, we propose to use these algorithms in the contextual multi-armed bandit scenario, where so far model parameters are assumed static and observations univariate and Gaussian or Bernoulli. Both of these restrictions can be relaxed using the algorithms described here, which we combine with Thompson sampling to show the resulting performance on a simulation.
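The simplest instance of the setting described above is linear regression with Gaussian noise and parameters that follow a random walk, where the online mean and variance updates reduce to the standard Kalman filter. The sketch below covers only that special case, not the general exponential-family algorithm; the function name, noise levels, and simulated data are assumptions chosen for illustration.

```python
import numpy as np

def kalman_regression_step(mean, cov, x, y, drift_cov, obs_var):
    """One online update of the parameter mean and covariance.

    Assumed model: theta_t = theta_{t-1} + N(0, drift_cov),
                   y_t     = x_t . theta_t + N(0, obs_var).
    """
    cov_pred = cov + drift_cov                      # predict: drift adds uncertainty
    innovation_var = x @ cov_pred @ x + obs_var     # variance of the predicted response
    gain = cov_pred @ x / innovation_var            # Kalman gain
    new_mean = mean + gain * (y - x @ mean)         # correct the mean with the residual
    new_cov = cov_pred - np.outer(gain, x @ cov_pred)
    return new_mean, new_cov

# Simulate slowly drifting parameters and track them online.
rng = np.random.default_rng(0)
dim, steps = 2, 500
drift_std, obs_var = 0.01, 0.01
drift_cov = drift_std**2 * np.eye(dim)

theta = np.array([1.0, -1.0])                       # true (hidden) parameters
mean, cov = np.zeros(dim), np.eye(dim)              # prior on the parameters
for _ in range(steps):
    theta = theta + drift_std * rng.normal(size=dim)
    x = rng.normal(size=dim)
    y = x @ theta + np.sqrt(obs_var) * rng.normal()
    mean, cov = kalman_regression_step(mean, cov, x, y, drift_cov, obs_var)

# A Thompson-sampling style draw from the current parameter posterior.
theta_sample = rng.multivariate_normal(mean, cov)
print(np.linalg.norm(mean - theta))
```

Because the filter carries a covariance as well as a mean, a bandit policy can draw `theta_sample` at each round and act greedily with respect to the draw, which is the role Thompson sampling plays in the abstract.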