Source author record

Hamed Karimi

Hamed Karimi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Artificial Intelligence, Computation, Computer Vision, cond-mat.mes-hall, cond-mat.str-el, math.OC

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

preprint · 2026 · arXiv

LLMs Uncertainty Quantification via Adaptive Conformal Semantic Entropy

LLMs' overconfidence, particularly when hallucinating, poses a significant challenge for deploying the models in safety-critical settings and makes reliable estimation of uncertainty necessary. Existing approaches to uncertainty quantification typically prioritize lexical or probabilistic measures; however, these techniques often ignore the semantic variance of different responses with similar meaning. In this paper, we propose Adaptive Conformal Semantic Entropy (ACSE), a method for estimating prompt-level uncertainty by adaptively measuring semantic dispersion in LLM outputs. Our uncertainty scoring function is based on clustering semantic entropy of multiple diverse responses to the same prompt. The function adaptively adjusts the uncertainty score based on semantic features of each cluster. To ensure the statistical reliability of our score, we use conformal calibration to apply a decision rule that accepts or abstains on each prompt, providing a finite-sample, distribution-free guarantee that the error rate among the accepted responses remains bounded by a user-specified tolerance. Our extensive experimental evaluations using different LLMs and datasets demonstrate that our approach consistently outperforms state-of-the-art uncertainty quantification baselines in terms of discriminative performance, conformal guarantees, and probabilistic calibration indicators. As a highlight, on the TriviaQA dataset, the AUROC of our approach is 0.88, compared to 0.65 for the token entropy approach.
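As a rough illustration of the semantic dispersion idea in the abstract above, the sketch below computes a plain Shannon entropy over meaning clusters of sampled responses. It is not the ACSE scoring function: the cluster assignments are assumed to be given (for example by some entailment-based clustering step), the function name `semantic_entropy` is hypothetical, and the adaptive per-cluster adjustment and conformal calibration described in the abstract are not reproduced.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Shannon entropy (in nats) of the empirical distribution over meaning clusters.

    cluster_ids holds one cluster label per sampled response to the same prompt.
    How responses are clustered is an assumption left outside this sketch.
    """
    counts = Counter(cluster_ids)
    n = len(cluster_ids)
    return sum(-(c / n) * math.log(c / n) for c in counts.values())


# Four sampled responses that all mean the same thing: no semantic dispersion.
agree = semantic_entropy([0, 0, 0, 0])

# Three responses agree and one differs: some dispersion.
mixed = semantic_entropy([0, 0, 0, 1])

# Every response has a different meaning: maximal dispersion for four samples.
spread = semantic_entropy([0, 1, 2, 3])
```

Under these assumptions, a higher score indicates that the sampled responses disagree in meaning, which is the kind of signal a calibrated accept/abstain rule could then threshold.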

preprint · 2024 · arXiv

Quantifying Deep Learning Model Uncertainty in Conformal Prediction

Precise estimation of predictive uncertainty in deep neural networks is a critical requirement for reliable decision-making in machine learning and statistical modeling, particularly in the context of medical AI. Conformal Prediction (CP) has emerged as a promising framework for representing model uncertainty by providing well-calibrated confidence levels for individual predictions. However, the quantification of model uncertainty in conformal prediction remains an active research area that has yet to be fully addressed. In this paper, we explore state-of-the-art CP methodologies and their theoretical foundations. We propose a probabilistic approach to quantifying the model uncertainty derived from the prediction sets produced in conformal prediction and provide certified boundaries for the computed uncertainty. By doing so, we allow model uncertainty measured by CP to be compared with other uncertainty quantification methods such as Bayesian (e.g., MC-Dropout and DeepEnsemble) and Evidential approaches.
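For readers unfamiliar with the prediction sets mentioned in the abstract above, here is a minimal sketch of standard split conformal prediction for classification on synthetic data. It is a generic textbook construction, not the paper's probabilistic uncertainty measure; the synthetic classifier, the nonconformity score (one minus the probability of the true class), and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_cal, n_test, alpha = 5, 1000, 2000, 0.1  # classes, sample sizes, miscoverage level


def simulate(n):
    """Synthetic softmax outputs with labels drawn from those same probabilities."""
    logits = 2.0 * rng.standard_normal((n, K))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    cum = probs.cumsum(axis=1)
    u = rng.random((n, 1))
    # Inverse-CDF sampling; the minimum guards against round-off in the last column.
    labels = np.minimum((u > cum).sum(axis=1), K - 1)
    return probs, labels


cal_probs, cal_labels = simulate(n_cal)
test_probs, test_labels = simulate(n_test)

# Nonconformity score on the calibration split: 1 - probability of the true class.
cal_scores = 1.0 - cal_probs[np.arange(n_cal), cal_labels]

# Finite-sample corrected quantile: the ceil((n + 1)(1 - alpha))-th smallest score.
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
q_hat = np.sort(cal_scores)[k - 1] if k <= n_cal else 1.0

# Prediction set for each test point: every class whose score does not exceed q_hat.
pred_sets = (1.0 - test_probs) <= q_hat
set_sizes = pred_sets.sum(axis=1)
coverage = pred_sets[np.arange(n_test), test_labels].mean()
```

In this setup `coverage` should land near the target level 1 - alpha, and `set_sizes` gives a crude per-example signal of how uncertain the model is; larger sets mean more classes could not be ruled out.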

preprint · 2020 · arXiv

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the main conditions that have been explored to show linear convergence rates without strong convexity over the last 25 years. We also use the PL inequality to give new analyses of randomized and greedy coordinate descent methods, sign-based gradient descent methods, and stochastic gradient methods in the classic setting (with decreasing or constant step-sizes) as well as the variance-reduced setting. We further propose a generalization that applies to proximal-gradient methods for non-smooth optimization, leading to simple proofs of linear convergence of these methods. Along the way, we give simple convergence results for a wide variety of problems in machine learning: least squares, logistic regression, boosting, resilient backpropagation, L1-regularization, support vector machines, stochastic dual coordinate ascent, and stochastic variance-reduced gradient methods.
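The PL inequality referred to above is usually written as (1/2)‖∇f(x)‖² ≥ μ (f(x) − f*), and for an L-smooth function gradient descent with step size 1/L then satisfies f(x_k) − f* ≤ (1 − μ/L)^k (f(x_0) − f*). The sketch below checks this numerically on an underdetermined least-squares problem, which is convex but not strongly convex. The problem sizes, the random data, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 10))  # more columns than rows: not strongly convex
b = rng.standard_normal(5)


def f(x):
    r = A @ x - b
    return 0.5 * r @ r


def grad(x):
    return A.T @ (A @ x - b)


# The nonzero eigenvalues of A^T A coincide with the eigenvalues of A A^T.
eigs = np.linalg.eigvalsh(A @ A.T)
L, mu = eigs.max(), eigs.min()  # smoothness constant and PL constant
f_star = 0.0  # A has full row rank (almost surely), so A x = b is solvable

# Check the PL inequality at a random point.
x_test = rng.standard_normal(10)
pl_lhs = 0.5 * grad(x_test) @ grad(x_test)
pl_rhs = mu * (f(x_test) - f_star)

# Gradient descent with step size 1/L, compared against the linear-rate bound.
x = np.zeros(10)
gap0 = f(x) - f_star
gaps, bounds = [], []
for k in range(1, 201):
    x = x - grad(x) / L
    gaps.append(f(x) - f_star)
    bounds.append((1 - mu / L) ** k * gap0)
```

Each entry of `gaps` should stay at or below the matching entry of `bounds`, even though the Hessian A^T A is singular here.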

preprint · 2012 · arXiv

Towards a Rigorous Proof of Magnetism on the Edges of Graphene Nano-ribbons

A zigzag edge of a graphene nanoribbon supports localized zero modes, ignoring interactions. Based mainly on mean field arguments and numerical approaches, it has been suggested that interactions can produce a large magnetic moment on the edges. By considering the Hubbard model in the weak coupling limit, U ≪ t, for bearded as well as zigzag edges, we argue for such a magnetic state, based on Lieb's theorem. Projecting the Hubbard interactions onto the flat edge band, we then prove that the resulting one-dimensional model has a fully polarized ferromagnetic ground state. We also study excitons and the effects of second neighbor hopping as well as a potential energy term acting on the edge only, proposing a simple and possibly exact phase diagram with the magnetic moment varying smoothly to zero. Finally, we consider corrections of second order in U arising from integrating out the gapless bulk Dirac excitations.

preprint · 2011 · arXiv

Transverse spectral functions and Dzyaloshinskii-Moriya interactions in XXZ spin chains

Recently, much progress has been made in applying field theory methods, first developed to study X-ray edge singularities, to interacting one-dimensional systems in order to include band curvature effects and study edge singularities at arbitrary momentum. Finding experimental confirmations of this theory remains an open challenge. Here we point out that spin chains with uniform Dzyaloshinskii-Moriya (DM) interactions provide an opportunity to test these theories, since these interactions may be exactly eliminated by a gauge transformation which shifts the momentum. However, this requires an extension of these X-ray edge methods to the transverse spectral function of the XXZ spin chain in a magnetic field, which we provide.
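The gauge transformation mentioned in the abstract above can be sketched as follows. The notation (couplings J, Δ, D, field h, angle α) and the sign convention for the DM term are assumptions chosen for illustration and may differ from the paper's.

```latex
\begin{align}
H &= \sum_j \Big[ J\big(S^x_j S^x_{j+1} + S^y_j S^y_{j+1}\big)
      + \Delta\, S^z_j S^z_{j+1}
      + D\big(S^x_j S^y_{j+1} - S^y_j S^x_{j+1}\big)
      - h\, S^z_j \Big] \\
  &= \sum_j \Big[ \tfrac{1}{2}\big( (J + iD)\, S^+_j S^-_{j+1} + \text{h.c.} \big)
      + \Delta\, S^z_j S^z_{j+1} - h\, S^z_j \Big],
  \qquad J + iD = \tilde{J} e^{i\alpha},\quad
  \tilde{J} = \sqrt{J^2 + D^2},\quad \tan\alpha = D/J . \\
\intertext{Rotating the spins about the $z$ axis by a site-dependent angle,
$S^+_j = e^{i\alpha j}\, \tilde{S}^+_j$ and $S^z_j = \tilde{S}^z_j$, removes the phase:}
H &= \sum_j \Big[ \tfrac{\tilde{J}}{2}\big( \tilde{S}^+_j \tilde{S}^-_{j+1} + \text{h.c.} \big)
      + \Delta\, \tilde{S}^z_j \tilde{S}^z_{j+1} - h\, \tilde{S}^z_j \Big].
\end{align}
```

In this convention the rotated model is an XXZ chain in a field with exchange J̃, and because the transverse operators carry the factor e^{iαj}, transverse correlation functions of the original chain are those of the rotated chain with momentum shifted by α, while longitudinal ones are unchanged.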