Source author record

Jiaming Hu

Jiaming Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Computer Vision, cond-mat.mtrl-sci, Machine Learning, Software Engineering

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators

Preprint · 2026 · arXiv

Ab initio study of carrier mobility in Bi$_2$O$_2$Se

Bi$_2$O$_2$Se is an emerging high-performance layered semiconductor with excellent stability. While experimental studies have explored carrier transport across various doping levels for both $n$-type and $p$-type conduction, a comprehensive theoretical understanding remains incomplete. In this work, we present parameter-free first-principles calculations of the electron and hole mobilities in Bi$_2$O$_2$Se, based on an iterative solution of the Boltzmann transport equation that includes electron-phonon scattering and ionized impurity scattering on an equal footing. Intriguingly, we find that Bi$_2$O$_2$Se exhibits high electron mobilities in both the in-plane and out-of-plane directions, whereas the hole mobilities are only significant in the in-plane direction, displaying unique three-dimensional (3D) electron transport and two-dimensional (2D) hole transport behavior. At 300~K, the calculated intrinsic electron and hole mobilities along the in-plane direction are 447~$\mathrm{cm^2\,V^{-1}\,s^{-1}}$ and 29~$\mathrm{cm^2\,V^{-1}\,s^{-1}}$, respectively, which are primarily affected by Fröhlich electron-phonon interactions. Due to its large static dielectric permittivity, Bi$_2$O$_2$Se exhibits exceptionally high low-temperature electron mobilities above $1.0\times10^5~\mathrm{cm^2\,V^{-1}\,s^{-1}}$, and its electron mobility above 50~K is robust against ionized impurity scattering over a wide range of impurity concentrations. By incorporating the Hall effect into our analysis, we predict an in-plane electron Hall mobility of 517~$\mathrm{cm^2\,V^{-1}\,s^{-1}}$ at 300~K, in excellent agreement with experimental data. These results provide valuable insights into the carrier transport mechanisms in Bi$_2$O$_2$Se and offer predictive benchmarks for future theoretical and experimental investigations.

Preprint · 2026 · arXiv

BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning

Image captioning is one of the most fundamental tasks in computer vision. Owing to its open-ended nature, it has received significant attention in the era of multimodal large language models (MLLMs). In pursuit of ever more detailed and accurate captions, recent work has increasingly turned to reinforcement learning (RL). However, existing captioning-RL methods and evaluation metrics often emphasize a narrow notion of caption quality, inducing trade-offs across core dimensions of captioning. For example, utility-oriented objectives can encourage noisy, hallucinated, or overlong captions that improve downstream question answering while harming fluency, whereas arena-style objectives can favor fluent but generic descriptions with limited usefulness. To address this, we propose a more balanced RL framework that jointly optimizes utility-aware correctness, reference coverage, and linguistic quality. In order to effectively optimize the resulting continuous multi-objective reward formulation, we apply GDPO-style reward-decoupled normalization to continuous-valued captioning rewards and show that it improves performance over vanilla GRPO. Additionally, we introduce length-conditional reward masking, yielding a more suitable length penalty for captioning. Across LLaVA-1.5-7B and Qwen2.5-VL 3B and 7B base models, our method consistently improves caption quality, with peak gains of +13.6 DCScore, +9.0 CaptionQA, and +29.0 CapArena across different models.
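The contrast between vanilla GRPO and reward-decoupled normalization mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's implementation: the function names, the two example reward components, their values and the equal weights are all hypothetical, and the sketch assumes that "decoupled" means each reward component is standardized within the sampled group before the components are summed, whereas GRPO standardizes the already-summed reward.

```python
from statistics import mean, pstdev


def zscore(values, eps=1e-6):
    """Standardize a list of rewards within one sampled group."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / (s + eps) for v in values]


def grpo_advantages(components, weights):
    """GRPO-style: sum the weighted components first, then normalize once."""
    n = len(next(iter(components.values())))
    totals = [sum(weights[k] * components[k][i] for k in components) for i in range(n)]
    return zscore(totals)


def decoupled_advantages(components, weights):
    """Assumed decoupled variant: normalize each component, then sum."""
    normed = {k: zscore(v) for k, v in components.items()}
    n = len(next(iter(components.values())))
    return [sum(weights[k] * normed[k][i] for k in normed) for i in range(n)]


# Hypothetical rewards for a group of four sampled captions.
# "correctness" varies widely; "fluency" varies on a much smaller scale.
example_rewards = {
    "correctness": [0.9, 0.1, 0.5, 0.5],
    "fluency": [0.50, 0.52, 0.48, 0.50],
}
example_weights = {"correctness": 1.0, "fluency": 1.0}

grpo = grpo_advantages(example_rewards, example_weights)
decoupled = decoupled_advantages(example_rewards, example_weights)
print("GRPO:     ", [round(a, 2) for a in grpo])
print("Decoupled:", [round(a, 2) for a in decoupled])
```

In this toy group, the second caption is the least correct but the most fluent. Summing first lets the wide-ranging correctness score dominate, so that caption receives a strongly negative advantage. Normalizing per component puts both signals on the same scale, and the two effects roughly cancel.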

Preprint · 2026 · arXiv

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

AI agents may soon become capable of autonomously completing valuable, long-horizon tasks in diverse domains. Current benchmarks either do not measure real-world tasks, or are not sufficiently difficult to meaningfully measure frontier models. To this end, we present Terminal-Bench 2.0: a carefully curated hard benchmark composed of 89 tasks in computer terminal environments inspired by problems from real workflows. Each task features a unique environment, human-written solution, and comprehensive tests for verification. We show that frontier models and agents score less than 65% on the benchmark and conduct an error analysis to identify areas for model and agent improvement. We publish the dataset and evaluation harness to assist developers and researchers in future work at https://www.tbench.ai/.

Preprint · 2026 · arXiv

Towards General Preference Alignment: Diffusion Models at Nash Equilibrium

Reinforcement learning from human feedback (RLHF) has been popular for aligning text-to-image (T2I) diffusion models with human preferences. As a mainstream branch of RLHF, Direct Preference Optimization (DPO) offers a computationally efficient alternative that avoids explicit reward modeling and has been widely adopted in diffusion alignment. However, existing preference-based methods for diffusion alignment still rely on reward-induced preference signals and typically assume that human preferences can be adequately modeled by the Bradley--Terry (BT) model, which may fail to capture the full complexity of human preferences. In this paper, we formulate diffusion alignment from a game-theoretic perspective. We propose Diffusion Nash Preference Optimization (Diff.-NPO), an intuitive general preference framework for diffusion alignment. Diff.-NPO encourages the current policy to play against itself to achieve self-improvement and better alignment. Empirically, we demonstrate the effectiveness of Diff.-NPO on the text-to-image generation task via various metrics. Diff.-NPO consistently outperforms existing preference-based diffusion alignment methods.
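One standard limitation of the Bradley--Terry assumption named in the abstract above is that it cannot represent cyclic (intransitive) preferences. The sketch below illustrates only that point and is not the Diff.-NPO method. The BT formula is the usual logistic function of a reward difference; the item names, reward values and the cyclic preference table are hypothetical.

```python
import math
from itertools import permutations


def bt_preference(reward_a, reward_b):
    """Bradley-Terry probability that item a is preferred over item b."""
    return 1.0 / (1.0 + math.exp(reward_b - reward_a))


def has_cycle(prefers, items):
    """True if some triple has a beating b, b beating c and c beating a."""
    return any(
        prefers(a, b) > 0.5 and prefers(b, c) > 0.5 and prefers(c, a) > 0.5
        for a, b, c in permutations(items, 3)
    )


# Hypothetical scalar rewards: BT preferences derived from them are transitive.
rewards = {"img_a": 1.2, "img_b": 0.3, "img_c": -0.5}


def bt(a, b):
    return bt_preference(rewards[a], rewards[b])


# Hypothetical general preference: a beats b, b beats c, c beats a.
cyclic_table = {
    ("img_a", "img_b"): 0.7,
    ("img_b", "img_c"): 0.7,
    ("img_c", "img_a"): 0.7,
}


def general(a, b):
    if (a, b) in cyclic_table:
        return cyclic_table[(a, b)]
    return 1.0 - cyclic_table[(b, a)]


print("BT preferences contain a cycle:     ", has_cycle(bt, rewards))
print("General preferences contain a cycle:", has_cycle(general, rewards))
```

Because BT preferences depend only on the difference of two scalar rewards, no choice of rewards can reproduce the cyclic table, which is the kind of structure a general preference framework is meant to accommodate.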

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint