Source author record

Xintong Wang

Xintong Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Computer Science and Game Theory; Artificial Intelligence; Multiagent Systems; Computation and Language; Computational Engineering, Finance, and Science; cond-mat.mtrl-sci; cond-mat.other

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint, 2026, arXiv

A Multimodal Dataset for Visually Grounded Ambiguity in Machine Translation

Ambiguity resolution is a key challenge in multimodal machine translation (MMT), where models must genuinely leverage visual input to map an ambiguous expression to its intended meaning. Although prior work has proposed disambiguation-oriented benchmarks that provide supportive evidence for the role of vision, we observe substantial issues in data quality and a mismatch with translation scenarios. Moreover, existing ambiguity-oriented evaluations are not well suited to broader ambiguity types in open-ended translation. To address these limitations, we present VIDA (Visually-Dependent Ambiguity), a dataset of 2,500 carefully curated instances in which resolving an annotated ambiguous source span requires visual evidence. We further propose Disambiguation-Centric Metrics that use an LLM-as-a-judge classifier to verify whether annotated ambiguous expressions are resolved correctly at the span level. Experiments with two state-of-the-art Large Vision Language Models under vanilla inference, supervised fine-tuning (SFT), and our chain-of-thought SFT (CoT-SFT) show that while SFT improves overall translation quality, CoT-SFT yields more consistent gains in disambiguation accuracy, especially on out-of-distribution subsets, indicating stronger generalization for resolving diverse ambiguity types.

Preprint, 2026, arXiv

IndustryBench: Probing the Industrial Knowledge Boundaries of LLMs

In industrial procurement, an LLM answer is useful only if it survives a standards check: the recommended material must match the operating condition, every parameter must respect a regulated threshold, and no procedure may contradict a safety clause. Partial correctness can mask safety-critical contradictions that aggregate LLM benchmarks rarely capture. We introduce IndustryBench, a 2,049-item benchmark for industrial procurement QA in Chinese, grounded in Chinese national standards (GB/T) and structured industrial product records, organized by seven capability dimensions, ten industry categories, and panel-derived difficulty tiers, with item-aligned English, Russian, and Vietnamese renderings. Our construction pipeline rejects 70.3% of LLM-generated candidates at a search-based external-verification stage, calibrating how unreliable industrial QA remains after LLM-only filtering. Our evaluation decouples raw correctness, scored by a Qwen3-Max judge validated at $\kappa_w = 0.798$ against a domain expert, from a separate safety-violation (SV) check against source texts. Across 17 models in Chinese and an 8-model intersection over four languages, we find: (i) the best system reaches only 2.083 on the 0-3 rubric, leaving substantial headroom; (ii) Standards & Terminology is the most persistent capability weakness and survives item-aligned translation; (iii) extended reasoning lowers safety-adjusted scores for 12 of 13 models, primarily by introducing unsupported safety-critical details into longer final answers; and (iv) safety-violation rates reshuffle the leaderboard: GPT-5.4 climbs from rank 6 to rank 3 after SV adjustment, while Kimi-k2.5-1T-A32B drops seven positions. Industrial LLM evaluation therefore requires source-grounded, safety-aware diagnosis rather than aggregate accuracy. We release IndustryBench with all prompts, scoring scripts, and dataset documentation.
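For readers unfamiliar with the judge-validation statistic, the following is a minimal sketch of weighted Cohen's kappa between two raters on an ordinal 0-3 rubric. The abstract does not say whether linear or quadratic weights were used, so the quadratic default here is an assumption, and the function name `weighted_kappa` and the toy ratings are hypothetical and not taken from the IndustryBench release.

```python
def weighted_kappa(rater_a, rater_b, n_levels=4, power=2):
    """Weighted Cohen's kappa for two raters on ordinal labels 0..n_levels-1.

    power=2 gives quadratic weights (an assumption; the abstract does not
    specify the weighting), power=1 gives linear weights.
    """
    n = len(rater_a)
    # Observed confusion counts between the two raters.
    observed = [[0] * n_levels for _ in range(n_levels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Disagreement weight grows with the distance between labels.
            w = abs(i - j) ** power / (n_levels - 1) ** power
            weighted_observed += w * observed[i][j]
            # Expected count under independent raters with the same marginals.
            weighted_expected += w * row[i] * col[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical toy ratings: an LLM judge versus a human expert.
judge = [0, 1, 2, 3, 3, 2, 1, 0]
expert = [0, 1, 2, 3, 2, 2, 1, 1]
print(weighted_kappa(judge, expert))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.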

Preprint, 2023, arXiv

Platform Behavior under Market Shocks: A Simulation Framework and Reinforcement-Learning Based Study

We study the behavior of an economic platform (e.g., Amazon, Uber Eats, Instacart) under shocks, such as COVID-19 lockdowns, and the effect of different regulation considerations imposed on a platform. To this end, we develop a multi-agent Gym environment of a platform economy in a dynamic, multi-period setting, with the possible occurrence of economic shocks. Buyers and sellers are modeled as economically-motivated agents, choosing whether or not to pay corresponding fees to use the platform. We formulate the platform's problem as a partially observable Markov decision process, and use deep reinforcement learning to model its fee setting and matching behavior. We consider two major types of regulation frameworks: (1) taxation policies and (2) platform fee restrictions, and offer extensive simulated experiments to characterize regulatory tradeoffs under optimal platform responses. Our results show that while many interventions are ineffective with a sophisticated platform actor, we identify a particular kind of regulation -- fixing fees to optimal, pre-shock fees while still allowing a platform to choose how to match buyer demands to sellers -- as promoting the efficiency, seller diversity, and resilience of the overall economic system.

Preprint, 2022, arXiv

Differential Liquidity Provision in Uniswap v3 and Implications for Contract Design

Decentralized exchanges (DEXs) provide a means for users to trade pairs of assets on-chain without the need for a trusted third party to effectuate a trade. Among these, constant function market maker DEXs such as Uniswap handle the largest volume of trades between ERC-20 tokens. With the introduction of Uniswap v3, liquidity providers can differentially allocate liquidity to trades that occur within specific price intervals. In this paper, we formalize the profit and loss that liquidity providers can earn when providing specific liquidity allocations to a v3 contract. We give a convex stochastic optimization problem for computing the optimal liquidity allocation for a liquidity provider who holds a belief on how prices will evolve over time, and use this to study the design question of how v3 contracts should partition the price space for permissible liquidity allocations. Our results show that making a greater diversity of price-space partitions available to a contract designer can simultaneously benefit both liquidity providers and traders.
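To make "allocating liquidity to a price interval" concrete, here is a minimal sketch of the token holdings of a single concentrated-liquidity position, using the standard formulas from the Uniswap v3 whitepaper. It is not the paper's profit-and-loss formalization or its optimization problem; it ignores fees and tick discretization, and the function name `position_amounts` and the example numbers are hypothetical.

```python
import math


def position_amounts(liquidity, price_lower, price_upper, price):
    """Token amounts (x, y) held by a position with the given liquidity
    over the price interval [price_lower, price_upper].

    Prices are quoted as units of token y per unit of token x.
    Fees and tick rounding are ignored (simplifying assumptions).
    """
    sqrt_lower = math.sqrt(price_lower)
    sqrt_upper = math.sqrt(price_upper)
    # Outside the interval the position is held entirely in one token,
    # which is equivalent to clamping the current sqrt-price to the range.
    sqrt_price = min(max(math.sqrt(price), sqrt_lower), sqrt_upper)
    amount_x = liquidity * (1.0 / sqrt_price - 1.0 / sqrt_upper)
    amount_y = liquidity * (sqrt_price - sqrt_lower)
    return amount_x, amount_y


# Hypothetical position: liquidity 100 over the price interval [1, 4].
for p in (0.5, 1.0, 2.25, 4.0, 5.0):
    print(p, position_amounts(100.0, 1.0, 4.0, p))
```

Below the interval the position holds only token x, above it only token y, and inside it a mix. This is why a provider's belief about future prices matters when choosing where to place liquidity.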

Preprint, 2021, arXiv

Log-time Prediction Markets for Interval Securities

We design a prediction market to recover a complete and fully general probability distribution over a random variable. Traders buy and sell interval securities that pay $1 if the outcome falls into an interval and $0 otherwise. Our market takes the form of a central automated market maker and allows traders to express interval endpoints of arbitrary precision. We present two designs, in both of which market operations take time logarithmic in the number of intervals (that traders distinguish), providing the first computationally efficient market for a continuous variable. Our first design replicates the popular logarithmic market scoring rule (LMSR), but operates exponentially faster than a standard LMSR by exploiting its modularity properties to construct a balanced binary tree and decompose computations along the tree nodes. The second design consists of two or more parallel LMSR market makers that mediate submarkets of increasingly fine-grained outcome partitions. This design remains computationally efficient for all operations, including arbitrage removal across submarkets. It adds two additional benefits for the market designer: (1) the ability to express utility for information at various resolutions by assigning different liquidity values, and (2) the ability to guarantee a true constant bounded loss by appropriately decreasing the liquidity in each submarket.
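As background for the abstract above, the sketch below implements the standard LMSR market maker over a finite set of outcomes, the baseline that the paper's designs speed up. It is deliberately the plain linear-time version, not the balanced-binary-tree or multi-resolution construction the paper proposes. The function names and the four-bucket example are hypothetical, and an interval security is modeled here, as an assumption, as one share of every outcome bucket inside the interval.

```python
import math


def lmsr_cost(shares, b):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)),
    computed with a max-shift for numerical stability."""
    scaled = [q / b for q in shares]
    m = max(scaled)
    return b * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def lmsr_prices(shares, b):
    """Instantaneous prices p_i = exp(q_i / b) / sum_j exp(q_j / b)."""
    scaled = [q / b for q in shares]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(shares, delta, b):
    """Amount a trader pays to move the outstanding shares from q to q + delta."""
    new_shares = [q + d for q, d in zip(shares, delta)]
    return lmsr_cost(new_shares, b) - lmsr_cost(shares, b)


# Hypothetical market: four outcome buckets, liquidity parameter b = 10.
q = [0.0, 0.0, 0.0, 0.0]
# One share of an interval security covering the two middle buckets.
interval = [0, 1, 1, 0]
print(lmsr_prices(q, 10.0))
print(trade_cost(q, interval, 10.0))
```

Each call here touches every outcome, so the work grows linearly with the number of buckets. The abstract's claim is that the same prices and costs can be obtained in time logarithmic in the number of intervals traders distinguish.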

Preprint, 2020, arXiv

Electronic states and magnetic response of MnBi2Te4 by scanning tunneling microscopy and spectroscopy

Exotic quantum phenomena have been demonstrated in the recently discovered intrinsic magnetic topological insulator MnBi2Te4. At its two-dimensional limit, the quantum anomalous Hall (QAH) effect and the axion insulator state are observed in odd and even layers of MnBi2Te4, respectively. The measured band structures exhibit intriguing and complex properties. Here we employ low-temperature scanning tunneling microscopy to study its surface states and magnetic response. The quasiparticle interference patterns indicate that the electronic structure of the topmost layer of MnBi2Te4 differs from that expected for the out-of-plane A-type antiferromagnetic phase. The topological surface states may be embedded in deeper layers beneath the topmost surface. Such a novel electronic structure is presumably related to the modification of the crystalline structure during sample cleaving and the re-orientation of the magnetic moments of Mn atoms near the surface. Mn dopants substituted at the Bi site on the second atomic layer are observed. The ratio of Mn/Bi substitutions is 5%. The electronic structure fluctuates at the atomic scale on the surface, which can affect the magnetism of MnBi2Te4. Our findings shed new light on the magnetic properties of MnBi2Te4 and thus on the design of magnetic topological insulators.