Source author record

Juntao Wang

Juntao Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Artificial Intelligence, Computer Science and Game Theory, hep-th, Methodology, Multiagent Systems, Computation, cs.CY, hep-ph, Human-Computer Interaction, Machine Learning, Social and Information Networks

Catalog footprint

What is connected

8 works

11 topics

4 close collaborators

Preprint, 2026, arXiv

OptArgus: A Multi-Agent System to Detect Hallucinations in LLM-based Optimization Modeling

Large language models (LLMs) are increasingly used to translate natural-language optimization problems into mathematical formulations and solver code, but matching the reference objective value is not a reliable test of correctness: an artifact may agree numerically while still changing the underlying optimization semantics. We formulate this issue as optimization-modeling hallucination detection, namely structural consistency auditing over the problem description, symbolic model, and solver implementation. We develop, to our knowledge, the first fine-grained hallucination taxonomy specifically for optimization modeling, spanning objective, variable, constraint, and implementation failures. We use this taxonomy to design OptArgus, a multi-agent detector with conductor routing, specialist auditors, and evidence consolidation. To evaluate this setting, we introduce a three-part benchmark suite with 484 clean artifacts, 1266 controlled injected artifacts, and 6292 natural LLM-generated artifacts. Against a matched single-agent baseline, OptArgus produces fewer false alarms on clean artifacts, more accurate top-ranked localization on controlled single-error cases, and stronger detection on natural model outputs. Together, these contributions turn optimization-modeling hallucination detection into a concrete empirical problem and suggest that modular, taxonomy-grounded auditing is a practical route to more reliable optimization modeling.

Preprint, 2022, arXiv

Forecast Aggregation via Peer Prediction

Crowdsourcing enables the solicitation of forecasts on a variety of prediction tasks from distributed groups of people. How to aggregate the solicited forecasts, which may vary in quality, into an accurate final prediction remains a challenging yet critical question. Studies have found that weighing expert forecasts more in aggregation can improve the accuracy of the aggregated prediction. However, this approach usually requires access to the historical performance data of the forecasters, which are often not available. In this paper, we study the problem of aggregating forecasts without having historical performance data. We propose using peer prediction methods, a family of mechanisms initially designed to truthfully elicit private information in the absence of ground truth verification, to assess the expertise of forecasters, and then using this assessment to improve forecast aggregation. We evaluate our peer-prediction-aided aggregators on a diverse collection of 14 human forecast datasets. Compared with a variety of existing aggregators, our aggregators achieve a significant and consistent improvement on aggregation accuracy measured by the Brier score and the log score. Our results reveal the effectiveness of identifying experts to improve aggregation even without historical data.
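As a rough illustration of the two accuracy measures named in this abstract, the sketch below scores a simple mean and a weighted mean of binary forecasts with the Brier score and the log score. The forecasts, the weights, and the helper names (`brier_score`, `log_score`, `weighted_mean`) are hypothetical and chosen only for illustration; the weights stand in for any expertise assessment and are not derived from the paper's peer prediction methods.

```python
import math

def brier_score(p: float, outcome: int) -> float:
    """One common binary form of the Brier score: squared error (lower is better)."""
    return (p - outcome) ** 2

def log_score(p: float, outcome: int, eps: float = 1e-12) -> float:
    """Negative log-likelihood of the realized outcome (lower is better)."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -math.log(p if outcome == 1 else 1.0 - p)

def weighted_mean(forecasts, weights):
    """Aggregate probability forecasts with non-negative weights."""
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total

# Hypothetical data: three forecasters predict an event that did occur.
forecasts = [0.9, 0.6, 0.2]
outcome = 1

simple = weighted_mean(forecasts, [1, 1, 1])   # unweighted mean
expert = weighted_mean(forecasts, [3, 1, 1])   # assumed weights favoring forecaster 0

for name, agg in [("simple", simple), ("expert-weighted", expert)]:
    print(name, round(agg, 4), round(brier_score(agg, outcome), 4),
          round(log_score(agg, outcome), 4))
```

In this toy case the weighted aggregate scores better only because the up-weighted forecaster happened to be accurate; how to find such weights without historical data is the question the paper addresses.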

Preprint, 2022, arXiv

Randomized Wagering Mechanisms

Wagering mechanisms are one-shot betting mechanisms that elicit agents' predictions of an event. For deterministic wagering mechanisms, an existing impossibility result has shown incompatibility of some desirable theoretical properties. In particular, Pareto optimality (no profitable side bet before allocation) cannot be achieved together with weak incentive compatibility, weak budget balance and individual rationality. In this paper, we expand the design space of wagering mechanisms to allow randomization and ask whether there are randomized wagering mechanisms that can achieve all previously considered desirable properties, including Pareto optimality. We answer this question positively with two classes of randomized wagering mechanisms: i) one simple randomized lottery-type implementation of existing deterministic wagering mechanisms, and ii) another family of simple and randomized wagering mechanisms which we call surrogate wagering mechanisms, which are robust to noisy ground truth. This family of mechanisms builds on the idea of learning with noisy labels (Natarajan et al. 2013) as well as a recent extension of this idea to the information elicitation without verification setting (Liu and Chen 2018). We show that a broad family of randomized wagering mechanisms satisfy all desirable theoretical properties.

Preprint, 2021, arXiv

Free Quotients of Favorable Calabi-Yau Manifolds

Non-simply connected Calabi-Yau threefolds play a central role in the study of string compactifications. Such manifolds are usually described by quotienting a simply connected Calabi-Yau variety by a freely acting discrete symmetry. For the Calabi-Yau threefolds described as complete intersections in products of projective spaces, a classification of such symmetries descending from linear actions on the ambient spaces of the varieties has been given in the literature. However, which symmetries can be described in this manner depends upon the description that is being used to represent the manifold. In recent work new, favorable, descriptions were given of this data set of Calabi-Yau threefolds. In this paper, we perform a classification of cyclic symmetries that descend from linear actions on the ambient spaces of these new favorable descriptions. We present a list of 129 symmetries/non-simply connected Calabi-Yau threefolds. Of these, at least 33, and potentially many more, are topologically new varieties.

Preprint, 2021, arXiv

Sequential Gibbs Sampling Algorithm for Cognitive Diagnosis Models with Many Attributes

Cognitive diagnosis models (CDMs) are useful statistical tools to provide rich information relevant for intervention and learning. As a popular approach to estimate and make inference of CDMs, the Markov chain Monte Carlo (MCMC) algorithm is widely used in practice. However, when the number of attributes, $K$, is large, the existing MCMC algorithm may become time-consuming, due to the fact that $O(2^K)$ calculations are usually needed in the process of MCMC sampling to get the conditional distribution for each attribute profile. To overcome this computational issue, motivated by Culpepper and Hudson (2018), we propose a computationally efficient sequential Gibbs sampling method, which needs $O(K)$ calculations to sample each attribute profile. We use simulation and real data examples to show the good finite-sample performance of the proposed sequential Gibbs sampling, and its advantage over existing methods.
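To make the cost contrast in this abstract concrete, here is a minimal sketch of a generic coordinate-wise Gibbs sweep over a binary attribute profile. It is not the authors' algorithm: `gibbs_sweep`, `log_post`, and the toy target with independent attributes are assumptions introduced for illustration. The point is only that resampling one attribute at a time needs two evaluations of the log-posterior per attribute (2K per profile) rather than an enumeration of all 2^K profiles.

```python
import math
import random

def gibbs_sweep(alpha, log_post, rng):
    """One sweep: resample each binary attribute conditional on all the others."""
    alpha = list(alpha)
    for k in range(len(alpha)):
        alpha[k] = 1
        lp1 = log_post(alpha)
        alpha[k] = 0
        lp0 = log_post(alpha)
        # P(alpha_k = 1 | rest) from the log-odds; clamp to avoid overflow in exp.
        diff = max(min(lp0 - lp1, 700.0), -700.0)
        p1 = 1.0 / (1.0 + math.exp(diff))
        alpha[k] = 1 if rng.random() < p1 else 0
    return alpha

# Toy target (assumed): K independent attributes with known mastery probabilities.
probs = [0.8, 0.3, 0.5]

def log_post(alpha):
    return sum(math.log(p if a == 1 else 1.0 - p) for a, p in zip(alpha, probs))

rng = random.Random(0)
alpha = [0] * len(probs)
n_sweeps = 5000
counts = [0] * len(probs)
for _ in range(n_sweeps):
    alpha = gibbs_sweep(alpha, log_post, rng)
    counts = [c + a for c, a in zip(counts, alpha)]

freq = [c / n_sweeps for c in counts]
print(freq)  # empirical frequencies should be close to probs
```

In a real cognitive diagnosis model, `log_post` would combine the item response likelihood with the prior over profiles, and the per-evaluation cost would depend on the model.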

Preprint, 2020, arXiv

Replication Markets: Results, Lessons, Challenges and Opportunities in AI Replication

The last decade saw the emergence of systematic large-scale replication projects in the social and behavioral sciences (Camerer et al., 2016, 2018; Ebersole et al., 2016; Klein et al., 2014, 2018; Collaboration, 2015). These projects were driven by theoretical and conceptual concerns about a high fraction of "false positives" in the scientific publications (Ioannidis, 2005) and a high prevalence of "questionable research practices" (Simmons, Nelson, and Simonsohn, 2011). Concerns about the credibility of research findings are not unique to the behavioral and social sciences; within Computer Science, Artificial Intelligence (AI) and Machine Learning (ML) are areas of particular concern (Lucic et al., 2018; Freire, Bonnet, and Shasha, 2012; Gundersen and Kjensmo, 2018; Henderson et al., 2018). Given the pioneering role of the behavioral and social sciences in the promotion of novel methodologies to improve the credibility of research, it is a promising approach to analyze the lessons learned from this field and adjust strategies for Computer Science, AI and ML. In this paper, we review approaches used in the behavioral and social sciences and in the DARPA SCORE project. We particularly focus on the role of human forecasting of replication outcomes, and how forecasting can leverage the information gained from relatively labor- and resource-intensive replications. We will discuss opportunities and challenges of using these approaches to monitor and improve the credibility of research areas in Computer Science, AI, and ML.

Preprint, 2020, arXiv

Surrogate Scoring Rules

Strictly proper scoring rules (SPSR) are incentive compatible for eliciting information about random variables from strategic agents when the principal can reward agents after the realization of the random variables. They also quantify the quality of elicited information, with more accurate predictions receiving higher scores in expectation. In this paper, we extend such scoring rules to settings where a principal elicits private probabilistic beliefs but only has access to agents' reports. We name our solution Surrogate Scoring Rules (SSR). SSR build on a bias correction step and an error rate estimation procedure for a reference answer defined using agents' reports. We show that, with a single bit of information about the prior distribution of the random variables, SSR in a multi-task setting recover SPSR in expectation, as if having access to the ground truth. Therefore, a salient feature of SSR is that they quantify the quality of information despite the lack of ground truth, just as SPSR do for the setting with ground truth. As a by-product, SSR induce dominant truthfulness in reporting. Our method is verified both theoretically and empirically using data collected from real human forecasters.
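The bias correction idea mentioned in this abstract can be sketched for a binary event using the standard unbiased-estimator construction from the noisy-label literature. This is an illustration under assumptions, not the paper's full procedure: the error rates `e0` and `e1` of the noisy reference answer are taken as known here (the paper estimates them from reports), and the names `brier_reward`, `surrogate_score`, and `expected_surrogate` are hypothetical.

```python
def brier_reward(p: float, y: int) -> float:
    """A strictly proper score in reward form (higher is better)."""
    return 1.0 - (p - y) ** 2

def surrogate_score(p, z, e0, e1, score=brier_reward):
    """Bias-corrected score of forecast p against a noisy reference answer z.

    e0 = P(z = 1 | y = 0) and e1 = P(z = 0 | y = 1) are the assumed-known
    error rates of the reference; the correction requires e0 + e1 != 1.
    """
    e = (e0, e1)
    denom = 1.0 - e0 - e1
    return ((1.0 - e[1 - z]) * score(p, z) - e[z] * score(p, 1 - z)) / denom

def expected_surrogate(p, y, e0, e1):
    """Exact expectation of the surrogate score over the noise in z, given true y."""
    e = (e0, e1)
    agree = surrogate_score(p, y, e0, e1)        # reference matches the truth
    flipped = surrogate_score(p, 1 - y, e0, e1)  # reference is flipped
    return (1.0 - e[y]) * agree + e[y] * flipped

p, e0, e1 = 0.7, 0.2, 0.3
for y in (0, 1):
    # The two printed values should agree up to floating-point rounding.
    print(y, expected_surrogate(p, y, e0, e1), brier_reward(p, y))
```

Averaged over the reference noise, the corrected score equals the score that would have been computed against the true outcome, which is the sense in which such a rule can recover a strictly proper score in expectation.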

Preprint, 2019, arXiv

Jumping Spectra and Vanishing Couplings in Heterotic Line Bundle Standard Models

We study two aspects of the physics of heterotic Line Bundle Standard Models on smooth Calabi-Yau threefolds. First, we investigate to what degree modern moduli stabilization scenarios can affect the standard model spectrum in such compactifications. Specifically, we look at the case where some of the complex structure moduli are fixed by a choice of hidden sector bundle. In this context, we study the frequency with which the system tends to be forced to a point in moduli space where the cohomology groups determining the spectrum in the standard model sector jump in dimension. Second, we investigate to what degree couplings, that are permitted by all of the obvious symmetries of the theory, actually vanish due to certain topological constraints associated to their higher dimensional origins. We find that both effects are prevalent within the data set of heterotic Line Bundle Standard Models studied.
