Researcher profile

Miguel Ballesteros

Miguel Ballesteros contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
6 topics
4 close collaborators

Published work

11 published items

preprint · 2026 · arXiv

GSM-SEM: Benchmark and Framework for Generating Semantically Variant Augmentations

Benchmarks like GSM8K are popular measures of mathematical reasoning, but leaderboard gains can overstate true capability due to memorization of fixed test sets. Most robustness variants apply surface-level perturbations (paraphrases, renamings, number swaps, distractors) that largely preserve the underlying facts, and static releases can themselves become memorization targets over time. We introduce GSM-SEM, a reusable and stochastic framework for generating semantically diverse benchmark variants with substantially higher semantic variance than prior approaches. GSM-SEM perturbs problem statements by modifying entities, attributes, and/or relationships, frequently altering underlying facts and requiring models to recompute solutions under new conditions, while constraining generation to preserve the original calculations/answer and approximate problem difficulty. GSM-SEM generates fresh variants on each run without requiring re-annotation, reducing reliance on static public benchmarks for evaluation and thereby lowering the bias of memorization. We apply GSM-SEM to GSM8K and two existing variation suites (GSM-Symbolic and GSM-Plus), producing GSM8K-SEM, GSM-Symbolic-SEM, and GSM-Plus-SEM. Evaluating 14 SOTA LLMs, we observe consistent performance drops, with larger declines when semantic perturbations are coupled with symbolic/plus variations (average drop rate of 28% in the maximum-strictness configuration of GSM-SEM). We publicly release the three SEM variants as fully human-validated datasets. Finally, to demonstrate applicability beyond GSM-style math problems, we apply GSM-SEM to additional benchmarks including BigBenchHard, LogicBench, and NLR-BIRD.

preprint · 2022 · arXiv

Band edge limit of the scattering matrix for quasi-one-dimensional discrete Schrödinger operators

This paper is about the scattering theory for one-dimensional matrix Schrödinger operators with a matrix potential having a finite first moment. The transmission coefficients are analytically continued and extended to the band edges. An explicit expression is given for these extensions. The limits of the reflection coefficients at the band edges are also calculated.

preprint · 2022 · arXiv

Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks. In contrast, literature on task transferability has established that the choice of intermediate tasks can heavily affect downstream task performance. In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task representation learning. We find that, on average, increasing the scale of multi-task learning, in terms of the number of tasks, indeed results in better learned representations than smaller multi-task setups. However, if the target tasks are known ahead of time, then training on a smaller set of related tasks is competitive with large-scale multi-task training at a reduced computational cost.

preprint · 2022 · arXiv

Label Semantics for Few Shot Named Entity Recognition

We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format. Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder. The label semantics signal is shown to support improved state-of-the-art results in multiple few shot NER benchmarks and on-par performance in standard benchmarks. Our model is especially effective in low resource settings.

preprint · 2021 · arXiv

Analyticity properties of the scattering matrix for matrix Schrödinger operators on the discrete line

Explicit formulas for the analytic extensions of the scattering matrix and the time delay of a quasi-one-dimensional discrete Schrödinger operator with a potential of finite support are derived. This includes a careful analysis of the band edge singularities and allows us to prove a Levinson-type theorem. The main algebraic tools are the plane wave transfer matrices.

preprint · 2021 · arXiv

Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a novel adaptation of the triplet loss into a linear classification objective. We show that the use of a suitable fine-tuning objective and external knowledge in pre-trained transformer models yields significant improvements in the effectiveness of contextual embeddings for clustering. Our model achieves a new state-of-the-art on a standard stream clustering dataset of English documents.

preprint · 2021 · arXiv

On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among other information, it has been shown that entire syntax trees are implicitly embedded in the geometry of such models. As these models are often fine-tuned, it becomes increasingly important to understand how the encoded knowledge evolves along the fine-tuning. In this paper, we analyze the evolution of the embedded syntax trees along the fine-tuning process of BERT for six different tasks, covering all levels of the linguistic structure. Experimental results show that the encoded syntactic information is forgotten (PoS tagging), reinforced (dependency and constituency parsing) or preserved (semantics-related tasks) in different ways along the fine-tuning process depending on the task.

preprint · 2020 · arXiv

One-Boson Scattering Processes in the Massless Spin-Boson Model -- A Non-Perturbative Formula

In scattering experiments, physicists observe so-called resonances as peaks at certain energy values in the measured scattering cross sections per solid angle. These peaks are usually associated with certain scattering processes, e.g., emission, absorption, or excitation of certain particles and systems. On the other hand, mathematicians define resonances as poles of an analytic continuation of the resolvent operator through complex dilations. A major challenge is to relate these scattering and resonance theoretical notions, e.g., to prove that the poles of the resolvent operator induce the above-mentioned peaks in the scattering matrix. In the case of quantum mechanics, this problem was addressed in numerous works that culminated in Simon's seminal paper [33] in which a general solution was presented for a large class of pair potentials. However, in quantum field theory the analogous problem has been open for several decades despite the fact that scattering and resonance theories have been well-developed for many models. In certain regimes these models describe very fundamental phenomena, such as emission and absorption of photons by atoms, from which quantum mechanics originated. In this work we present a first non-perturbative formula that relates the scattering matrix to the resolvent operator in the massless Spin-Boson model. This result can be seen as a major advance compared to our previous works [13] and [12] in which we only managed to derive a perturbative formula.

preprint · 2020 · arXiv

Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on pretrained representations (i.e. RoBERTa, BERT or ELMo), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. Experiments on the MATRES dataset of English documents establish a new state-of-the-art on this task.

preprint · 2020 · arXiv

Transition-Based Dependency Parsing using Perceptron Learner

Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learner, outperforms a baseline arc-standard parser. We beat the UAS of the MALT and LSTM parsers. We also give possible ways to address parsing of non-projective trees.

preprint · 2007 · arXiv

High-Velocity Estimates for the Scattering Operator and Aharonov-Bohm Effect in Three Dimensions

We obtain high-velocity estimates with error bounds for the scattering operator of the Schrödinger equation in three dimensions with electromagnetic potentials in the exterior of bounded obstacles that are handlebodies. A particular case is a finite number of tori. We prove our results with time-dependent methods. We consider high-velocity estimates where the direction of the velocity of the incoming electrons is kept fixed as its absolute value goes to infinity. In the case of one torus our results give a rigorous proof that quantum mechanics predicts the interference patterns observed in the fundamental experiments of Tonomura et al. that gave conclusive evidence of the existence of the Aharonov-Bohm effect using a toroidal magnet. We give a method for the reconstruction of the flux of the magnetic field over a cross-section of the torus modulo $2\pi$. Equivalently, we determine modulo $2\pi$ the difference in phase for two electrons that travel to infinity, when one goes inside the hole and the other outside it. For this purpose we only need the high-velocity limit of the scattering operator for one direction of the velocity of the incoming electrons. When there are several tori (or, more generally, handlebodies), the information that we obtain on the fluxes, and on the difference of phases, depends on the relative position of the tori and on the direction of the velocities when we take the high-velocity limit of the incoming electrons. For some locations of the tori we can determine all the fluxes modulo $2\pi$ by taking the high-velocity limit in only one direction. We also give a method for the unique reconstruction of the electric potential and the magnetic field outside the handlebodies from the high-velocity limit of the scattering operator.