Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
39 works
0 followers
35 topics
4 close collaborators

Published work

39 published items

preprint, 2026, arXiv

CLEAR: Revealing How Noise and Ambiguity Degrade Reliability in LLMs for Medicine

Medical large language model (LLM) evaluations rely on simplified, exam-style benchmarks that rarely reflect the ambiguity of real-world medical inquiries. We introduce the CLinical Evaluation of Ambiguity and Reliability (CLEAR) framework, which assesses how decision-space presentation, ambiguity, and uncertainty affect LLMs' reasoning on medical benchmarks. CLEAR systematically perturbs (1) the number of plausible answer options, (2) the presence of a ground truth or abstention option, and (3) the semantic framing of answer options. Applying CLEAR on three benchmarks evaluated across 17 LLMs reveals three notable limitations of existing evaluation methods. First, increasing the number of plausible answers degrades a model's ability to identify the correct answer and abstain against incorrect ones. Second, this lack of caution intensifies as the framing of abstention shifts from assertive rejection like "None of the Above" to uncertainty admission like "I don't know" (IDK). Notably, just including IDK in the answer space increases incorrect answer selections. Lastly, we formalize the performance gap between identifying the correct answer and abstaining from incorrect ones as the humility deficit, which worsens with model scale. Our findings reveal limitations in standard medical benchmarks and underscore that scaling alone does not resolve LLM reliability issues.
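The humility deficit mentioned in the abstract above can be illustrated with a small sketch. It assumes the deficit is the plain difference between accuracy on items where the correct answer is offered and the abstention rate on items where it is not; the paper's formal definition may differ. The `Response` schema and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Response:
    """One model response to a multiple-choice item (hypothetical schema)."""
    answer_present: bool  # True if the correct answer is among the options
    chose_correct: bool   # model picked the correct option (answer-present items)
    abstained: bool       # model picked the abstention option (answer-absent items)


def humility_deficit(responses: List[Response]) -> float:
    """Accuracy on answer-present items minus abstention rate on answer-absent items.

    Assumed definition for illustration only: a positive value means the model
    is better at finding a correct answer than at declining incorrect ones.
    """
    present = [r for r in responses if r.answer_present]
    absent = [r for r in responses if not r.answer_present]
    if not present or not absent:
        raise ValueError("need both answer-present and answer-absent items")
    accuracy = sum(r.chose_correct for r in present) / len(present)
    abstention_rate = sum(r.abstained for r in absent) / len(absent)
    return accuracy - abstention_rate
```

Under this assumed definition, a model that answers every answer-present item correctly but abstains on only half of the answer-absent items has a deficit of 0.5.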

preprint, 2026, arXiv

FutureX-Pro: Extending Future Prediction to High-Value Vertical Domains

Building upon FutureX, which established a live benchmark for general-purpose future prediction, this report introduces FutureX-Pro, including FutureX-Finance, FutureX-Retail, FutureX-PublicHealth, FutureX-NaturalDisaster, and FutureX-Search. These together form a specialized framework extending agentic future prediction to high-value vertical domains. While generalist agents demonstrate proficiency in open-domain search, their reliability in capital-intensive and safety-critical sectors remains under-explored. FutureX-Pro targets four economically and socially pivotal verticals: Finance, Retail, Public Health, and Natural Disaster. We benchmark agentic Large Language Models (LLMs) on entry-level yet foundational prediction tasks -- ranging from forecasting market indicators and supply chain demands to tracking epidemic trends and natural disasters. By adapting the contamination-free, live-evaluation pipeline of FutureX, we assess whether current State-of-the-Art (SOTA) agentic LLMs possess the domain grounding necessary for industrial deployment. Our findings reveal the performance gap between generalist reasoning and the precision required for high-value vertical applications.

preprint, 2026, arXiv

Goal-Conditioned Supervised Learning for LLM Fine-Tuning

Large language models often require fine-tuning to better align their behavior with user intent at deployment. Existing approaches are commonly divided into online and offline paradigms. Online methods, such as RL-based alignment, can directly optimize outcome quality but typically rely on external reward models and iterative rollouts, making them costly and difficult to deploy in many cases. Offline methods are more efficient, but prevailing approaches such as supervised fine-tuning (SFT) and direct preference optimization (DPO) remain limited: SFT typically collapses graded feedback into binary supervision, while DPO depends on paired preference data that is often unavailable or expensive to construct. In this paper, we propose goal-conditioned supervised learning (GCSL) as an offline fine-tuning framework for LLMs. Our core idea is to treat feedback signals directly as an explicit goal and train the model, purely through supervised learning, to generate responses that achieve that goal. To better exploit graded feedback, we further introduce a novel goal formulation that defines learning as consistently pursuing outcomes above a target quality threshold, rather than imitating samples from a selected high-quality subset. This design mitigates the bounded-learning effect of SFT and classic GCSL by explicitly guiding the model to learn the directional progression of quality. We also propose natural-language goal representations to better leverage the semantic understanding and reasoning capabilities of LLMs. We evaluate our method on three tasks: non-toxic generation, code generation, and LLM for recommendation. Results show that our approach consistently outperforms standard offline fine-tuning baselines while retaining the efficiency, scalability, and simple data requirements of supervised learning.
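One way to picture the threshold-style, natural-language goals described in the abstract above is as a data-construction step. The sketch below is an interpretation, not the authors' implementation: the goal wording, the function name, and the `(prompt, response, score)` sample format are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str, float]  # (prompt, response, graded feedback score)


def build_goal_conditioned_examples(
    samples: Sequence[Sample], thresholds: Sequence[float]
) -> List[Dict[str, str]]:
    """Turn graded samples into supervised pairs conditioned on a text goal.

    A response is paired with every threshold goal it satisfies, so that
    higher-scoring responses appear under more (and stricter) goals.
    The goal phrasing is hypothetical.
    """
    examples = []
    for prompt, response, score in samples:
        for threshold in thresholds:
            if score >= threshold:
                goal = f"Goal: produce a response with a quality score of at least {threshold}."
                examples.append({"input": f"{goal}\n{prompt}", "target": response})
    return examples
```

The resulting pairs could then be fed to an ordinary supervised fine-tuning loop, with a strict goal supplied in the prompt at inference time.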

preprint, 2026, arXiv

Inverse Rendering for High-Genus 3D Surface Meshes from Multi-view Images with Persistent Homology Priors

Reconstructing 3D objects from images is inherently an ill-posed problem due to ambiguities in geometry, appearance, and topology. This paper introduces collaborative inverse rendering with persistent homology priors, a novel strategy that leverages topological constraints to resolve these ambiguities. By incorporating priors that capture critical features such as tunnel loops and handle loops, our approach directly addresses the difficulty of reconstructing high-genus surfaces. The collaboration between photometric consistency from multi-view images and homology-based guidance enables recovery of complex high-genus geometry while circumventing catastrophic failures such as collapsing tunnels or losing high-genus structure. Instead of neural networks, our method relies on gradient-based optimization within a mesh-based inverse rendering framework to highlight the role of topological priors. Experimental results show that incorporating persistent homology priors leads to lower Chamfer Distance (CD) and higher Volume IoU compared to state-of-the-art mesh-based methods, demonstrating improved geometric accuracy and robustness against topological failure.
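For readers unfamiliar with the Chamfer Distance metric cited in the abstract above, here is a minimal pure-Python sketch. Several variants exist (squared or unsquared distances, sums or means); this one assumes the symmetric mean of unsquared nearest-neighbour distances, which may not match the variant used in the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets.

    For each point, find the Euclidean distance to its nearest neighbour in
    the other set, average per direction, and add the two averages.
    Brute force, O(len(a) * len(b)); fine for small illustrative inputs.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def mean_nearest(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(a, b) + mean_nearest(b, a)
```

Identical point sets give a distance of zero, and lower values indicate a closer geometric match between the reconstructed and reference surfaces.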

preprint, 2026, arXiv

Learning Domain Agnostic Latent Embeddings of 3D Faces for Zero-shot Animal Expression Transfer

We present a zero-shot framework for transferring human facial expressions to 3D animal face meshes. Our method combines intrinsic geometric descriptors (HKS/WKS) with a mesh-agnostic latent embedding that disentangles facial identity and expression. The ID latent space captures species-independent facial structure, while the expression latent space encodes deformation patterns that generalize across humans and animals. Trained only with human expression pairs, the model learns the embeddings, decoupling, and recoupling of cross-identity expressions, enabling expression transfer without requiring animal expression data. To enforce geometric consistency, we employ Jacobian loss together with vertex-position and Laplacian losses. Experiments show that our approach achieves plausible cross-species expression transfer, effectively narrowing the geometric gap between human and animal facial shapes.

preprint, 2026, arXiv

LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation

Large Language Models (LLMs) perform well on standard reasoning and question-answering benchmarks, yet such evaluations often fail to capture their ability to handle long-tail, expertise-intensive knowledge in real-world professional scenarios. We introduce LPFQA, a long-tail knowledge benchmark derived from authentic professional forum discussions, covering 7 academic and industrial domains with 430 curated tasks grounded in practical expertise. LPFQA evaluates specialized reasoning, domain-specific terminology understanding, and contextual interpretation, and adopts a hierarchical difficulty structure to ensure semantic clarity and uniquely identifiable answers. Experiments on multiple mainstream LLMs reveal substantial performance gaps, particularly on tasks requiring deep domain reasoning, exposing limitations overlooked by existing benchmarks. Overall, LPFQA provides an authentic and discriminative evaluation framework that complements prior benchmarks and informs future LLM development.

preprint, 2026, arXiv

Neuroscience-inspired Staged Representation Learning with Disentangled Coarse- and Fine-Grained Semantics for EEG Visual Decoding

Decoding visual information from electroencephalography (EEG) signals remains a fundamental challenge in brain-computer interfaces and medical rehabilitation. Existing EEG visual decoding methods mainly focus on learning a single global EEG embedding for cross-modal alignment, but they largely overlook the staged and hierarchical characteristics of human visual processing. To address this limitation, we propose a neuroscience-inspired staged representation learning framework that reformulates EEG visual decoding as a stage-specific representation decomposition problem. The proposed framework organizes EEG representation learning into three complementary phases: low-level visual representation learning, high-level semantic representation learning, and integrative information fusion. To strengthen semantic modeling, we further introduce a multimodal dual-level semantic learning mechanism that separates coarse label-level semantics from fine image-level visual-semantic information. In addition, semantic latent channels are introduced as computational representation channels generated from observed visual EEG signals, expanding the channel-level semantic representation space for structured semantic abstraction and cross-modal alignment. Extensive experiments on the THINGS-EEG benchmark demonstrate that the proposed method achieves superior performance under subject-dependent zero-shot evaluation and improved exact retrieval under subject-independent zero-shot evaluation. Additional analyses, including layer-wise retrieval, temporal accumulation, expanded multi-image retrieval, and ablation studies, further support the effectiveness of staged decomposition and structured semantic modeling. These results suggest that explicitly modeling staged perceptual, semantic, and integrative representations provides an effective neuroscience-inspired framework for EEG-based visual decoding.

preprint, 2026, arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Recent advances in coding agents suggest rapid progress toward autonomous software development, yet existing benchmarks fail to rigorously evaluate the long-horizon capabilities required to build complete software systems. Most prior evaluations focus on localized code generation, scaffolded completion, or short-term repair tasks, leaving open the question of whether agents can sustain coherent reasoning, planning, and execution over the extended horizons demanded by real-world repository construction. To address this gap, we present NL2Repo Bench, a benchmark explicitly designed to evaluate the long-horizon repository generation ability of coding agents. Given only a single natural-language requirements document and an empty workspace, agents must autonomously design the architecture, manage dependencies, implement multi-module logic, and produce a fully installable Python library. Our experiments across state-of-the-art open- and closed-source models reveal that long-horizon repository generation remains largely unsolved: even the strongest agents achieve below 40% average test pass rates and rarely complete an entire repository correctly. Detailed analysis uncovers fundamental long-horizon failure modes, including premature termination, loss of global coherence, fragile cross-file dependencies, and inadequate planning over hundreds of interaction steps. NL2Repo Bench establishes a rigorous, verifiable testbed for measuring sustained agentic competence and highlights long-horizon reasoning as a central bottleneck for the next generation of autonomous coding agents.

preprint, 2026, arXiv

RIMRULE: Improving Tool-Using Language Agents via MDL-Guided Rule Learning

Large language models (LLMs) often struggle to use tools reliably in domain-specific settings, where APIs may be idiosyncratic, under-documented, or tailored to private workflows. This highlights the need for effective adaptation to task-specific tools. We propose RIMRULE, a neuro-symbolic approach for LLM adaptation based on dynamic rule injection. Compact, interpretable rules are distilled from failure traces and injected into the prompt during inference to improve task performance. These rules are proposed by the LLM itself and consolidated using a Minimum Description Length (MDL) objective that favors generality and conciseness. Each rule is stored in both natural language and a structured symbolic form, supporting efficient retrieval at inference time. Experiments on tool-use benchmarks show that this approach improves accuracy on both seen and unseen tools without modifying LLM weights. It outperforms prompting-based adaptation methods and complements finetuning. Moreover, rules learned from one LLM can be reused to improve others, including long reasoning LLMs, highlighting the portability of symbolic knowledge across architectures.

preprint, 2026, arXiv

Student Classroom Behavior Recognition Based on Improved YOLOv8s

In classroom teaching, student behavior can reflect their learning state and classroom participation, which is of great significance for teaching quality analysis. To address the problems of dense student targets, numerous small objects, frequent occlusions, and imbalanced class distribution in real classroom scenes, this paper proposes an improved student classroom behavior recognition model named ALC-YOLOv8s based on YOLOv8s. The model introduces SPPF-LSKA to enhance contextual feature extraction, employs CFC-CRB and SFC-G2 to optimize multi-scale feature fusion, and incorporates ATFLoss to improve the learning ability for minority classes and hard samples. Experimental results show that compared with the baseline model, the improved model achieves increases of 1.8% in mAP50 and 2.1% in mAP50-95. Compared with several mainstream detection methods, the proposed model can well meet the requirements of automatic student behavior recognition in complex classroom scenarios.

preprint, 2025, arXiv

Boundary error control for numerical solution of BSDEs by the convolution-FFT method

We first review the convolution fast-Fourier-transform (CFFT) approach for the numerical solution of backward stochastic differential equations (BSDEs) introduced in (Hyndman and Oyono Ngou, 2017). We then propose a method for improving the boundary errors obtained when valuing options using this approach. We modify the damping and shifting schemes used in the original formulation, which transforms the target function into a bounded periodic function so that Fourier transforms can be applied successfully. Time-dependent shifting reduces boundary error significantly. We present numerical results for our implementation and provide a detailed error analysis showing the improved accuracy and convergence of the modified convolution method.

preprint, 2024, arXiv

InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization

The convenience of 3D sensors has led to an increase in the use of 3D point clouds in various applications. However, the differences in acquisition devices or scenarios lead to divergence in the data distribution of point clouds, which requires good generalization of point cloud representation learning methods. While most previous methods rely on domain adaptation, which involves fine-tuning pre-trained models on target domain data, this may not always be feasible in real-world scenarios where target domain data may be unavailable. To address this issue, we propose InvariantOODG, which learns invariability between point clouds with different distributions using a two-branch network to extract local-to-global features from original and augmented point clouds. Specifically, to enhance local feature learning of point clouds, we define a set of learnable anchor points that locate the most useful local regions and two types of transformations to augment the input point clouds. The experimental results demonstrate the effectiveness of the proposed model on 3D domain generalization benchmarks.

preprint, 2023, arXiv

Automated Repair of Programs from Large Language Models

Large language models such as Codex have shown the capability to produce code for many programming tasks. However, the success rate of existing models is low, especially for complex programming tasks. One of the reasons is that language models lack awareness of program semantics, resulting in incorrect programs, or even programs which do not compile. In this paper, we systematically study whether automated program repair (APR) techniques can fix the incorrect solutions produced by language models in LeetCode contests. The goal is to study whether APR techniques can enhance reliability in the code produced by large language models. Our study revealed that: (1) automatically generated code shares common programming mistakes with human-crafted solutions, indicating APR techniques may have potential to fix auto-generated code; (2) given bug location information provided by a statistical fault localization approach, the newly released Codex edit mode, which supports editing code, is similar to or better than existing Java repair tools TBar and Recoder in fixing incorrect solutions. By analyzing the experimental results generated by these tools, we provide several suggestions: (1) enhancing APR tools to surpass limitations in patch space (e.g., introducing more flexible fault localization) is desirable; (2) as large language models can derive more fix patterns by training on more data, future APR tools could shift focus from adding more fix patterns to synthesis/semantics based approaches; (3) combining language models with APR to curate patch ingredients is worth studying.

preprint, 2023, arXiv

Generalized Parton Distributions from Lattice QCD with Asymmetric Momentum Transfer: Unpolarized Quarks

Traditionally, lattice QCD computations of generalized parton distributions (GPDs) have been carried out in a symmetric frame, where the transferred momentum is symmetrically distributed between the incoming and outgoing hadrons. However, such frames are inconvenient since they require a separate calculation for each value of the momentum transfer, significantly increasing the computational cost. In this work, by focusing on the quasi-distribution approach, we lay the foundation for faster and more effective lattice QCD calculations of GPDs exploiting asymmetric frames, with freedom in the transferred momentum distribution. An important ingredient of our approach is the Lorentz covariant parameterization of the matrix elements in terms of Lorentz-invariant amplitudes, which allows one to relate matrix elements in different frames. We also use this amplitude approach to propose a new definition of quasi-GPDs that is frame-independent and, more importantly, may lead to smaller power corrections in the matching relations to the light-cone GPDs. We demonstrate the efficacy of the formalism through numerical calculations using one ensemble of $N_f=2+1+1$ twisted mass fermions with a clover improvement. The values of the light-quark masses lead to a pion mass of about 260 MeV. Concentrating on the proton, and limiting ourselves to a vanishing longitudinal momentum transfer to the target, we extract the invariant amplitudes from matrix element calculations in both the symmetric and asymmetric frame, and obtain results for the twist-2 light-cone GPDs for unpolarized quarks, that is, $H$ and $E$.

preprint, 2023, arXiv

GPDs in asymmetric frames

It is often taken for granted that Generalized Parton Distributions (GPDs) are defined in the "symmetric" frame, where the transferred momentum is symmetrically distributed between the incoming/outgoing hadrons. However, such frames pose computational challenges for the lattice QCD practitioners. In these proceedings, we lay the foundation for lattice QCD calculations of GPDs in "asymmetric" frames, where the transferred momentum is not symmetrically distributed between the incoming/outgoing hadrons. The novelty of our work relies on the parameterization of the matrix elements in terms of Lorentz-invariant amplitudes, which not only helps in establishing relations between the said frames but also helps in isolating higher-twist contaminations. As an example, we focus on the unpolarized GPDs for spin-1/2 particles.

preprint, 2023, arXiv

The study of eleven contact binaries with mass ratios less than 0.1

Multi-band photometric observations of eleven totally eclipsing contact binaries were carried out. Applying the Wilson-Devinney program, photometric solutions were obtained. Two are W-subtype systems, CRTS J133031.1+161202 and CRTS J154254.0+324652, and the remaining systems are A-subtype. CRTS J154254.0+324652 has the highest fill-out factor at 94.3%, and CRTS J155009.2+493639 has the lowest at only 18.9%. The mass ratios of the eleven systems are all less than 0.1, which means that they are extremely low mass ratio binary systems. We investigated the period variations and found that the orbital periods of three systems decrease slowly, which may be caused by angular momentum loss, and those of six systems increase slowly, which indicates that material may be transferring from the secondary component to the primary component. LAMOST low-resolution spectra of four objects were analyzed, and using the spectral subtraction technique, the H$\alpha$ emission line was detected, which means that the four objects exhibit chromospheric activity. In order to understand their evolutionary status, the mass-luminosity and mass-radius diagrams were plotted. The two diagrams indicate that the primary components are in the main sequence evolution stage, and the secondary components lie above the TAMS, indicating that they are over-luminous. To determine whether the eleven systems are in a stable state, the ratio of spin angular momentum to orbital angular momentum ($J_{s}/J_{o}$) and the instability parameters were calculated, and we argue that CRTS J234634.7+222824 is on the verge of a merger.

preprint, 2022, arXiv

A theoretic model for sonogenetic antiarrhythmia

Sonogenetics can be used as a new alternative for treating arrhythmia owing to its noninvasiveness, high safety, and strong penetration. In the treatment of arrhythmias by sonogenetics, cardiac myocytes are deformed by ultrasonic radiation force. We quantitatively calculated the shape variation of cardiomyocytes under ultrasonic radiation force; this deformation changes the membrane tension. Membrane tension consists of two parts: plasma membrane tension and cortical tension between the cell membrane and cytoskeleton. Since existing experiments mainly considered plasma membrane tension, we proposed a quantitative model of the relationship between ultrasonic radiation force and plasma membrane tension. The Boltzmann relationship between plasma membrane tension and ion channel opening probability is presented based on experimental results of ion channel activation by stretching. Finally, a quantitative model was obtained for how ultrasonic radiation force regulates the opening probability of stretch-activated ion channels. Based on this quantitative model, we proposed the regulation mechanism of ultrasonic radiation force under hypercompression and hyperstretching, and verified that this mechanism can eliminate arrhythmias by sonogenetics.
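The Boltzmann relationship between membrane tension and channel opening probability mentioned in the abstract above is often written in a two-state form. The sketch below assumes that common textbook form, P_open = 1 / (1 + exp(-(T - T_half) * dA / (k_B * Temp))); it is not taken from the paper, and the parameter names and example values are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def open_probability(
    tension: float,       # membrane tension in N/m
    half_tension: float,  # tension at which P_open = 0.5, in N/m (hypothetical parameter)
    area_change: float,   # in-plane area change on opening, in m^2 (hypothetical parameter)
    temperature: float = 310.0,  # kelvin; body temperature assumed
) -> float:
    """Two-state Boltzmann opening probability of a stretch-activated channel."""
    x = (tension - half_tension) * area_change / (K_B * temperature)
    # Guard against overflow in exp() for strongly negative arguments.
    if x < -700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))
```

With this form, the probability is exactly 0.5 at the half-activation tension and rises monotonically as tension increases.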

preprint, 2022, arXiv

Dual-domain Attention-based Deep Network for Sparse-view CT Artifact Reduction

Due to the wide applications of X-ray computed tomography (CT) in medical imaging activities, radiation exposure has become a major concern for public health. Sparse-view CT is a promising approach to reduce the radiation dose by down-sampling the total number of acquired projections. However, the CT images reconstructed by this sparse-view imaging approach suffer from severe streaking artifacts and structural information loss. In this work, an end-to-end dual-domain attention-based deep network (DDANet) is proposed to solve such an ill-posed CT image reconstruction problem. The image-domain CT image and the projection-domain sinogram are put into the two parallel sub-networks of the DDANet to independently extract the distinct high-level feature maps. In addition, a specified attention module is introduced to fuse the aforementioned dual-domain feature maps to allow complementary optimizations of removing the streaking artifacts and mitigating the loss of structure. Numerical simulations, anthropomorphic thorax phantom and in vivo pre-clinical experiments are conducted to verify the sparse-view CT imaging performance of the DDANet. Results demonstrate that this newly developed approach is able to robustly remove the streaking artifacts while maintaining the fine structures. As a result, the DDANet provides a promising solution in achieving high quality sparse-view CT imaging.

preprint, 2022, arXiv

Interaction effects of pseudospin-based magnetic monopoles and kinks in a doped dipolar superlattice gas

Magnetic monopoles and kinks are topological excitations extensively investigated in quantum spin systems, but usually they are studied in different setups. We explore the conditions for the coexistence and the interaction effects of these quasiparticles in the pseudospin chain of the atomic dipolar superlattice gas. In this chain, the magnetic kink is the intrinsic quasiparticle, and the particle/hole defect takes over the role of the north/south magnetic monopole, exerting monopolar magnetic fields to neighboring spins. A confinement effect between the monopole and kink is revealed, which renormalizes the dispersion of the kink. The corresponding dynamical deconfinement process is observed and arises due to the kink-antikink annihilation. The rich interaction effects of the two quasiparticles could stimulate corresponding investigations in bulk spin systems.

preprint, 2022, arXiv

Lattice QCD Determination of the Bjorken-$x$ Dependence of Parton Distribution Functions at Next-to-next-to-leading Order

We report the first lattice QCD calculation of pion valence quark distribution with next-to-next-to-leading order perturbative matching correction, which is done using two fine lattices with spacings $a=0.04$ fm and $0.06$ fm and valence pion mass $m_π=300$ MeV, at boost momentum as large as $2.42$ GeV. As a crucial step to control the systematics, we renormalize the pion valence quasi distribution in the recently proposed hybrid scheme, which features a Wilson-line mass subtraction at large distances in coordinate space, and develop a procedure to match it to the $\overline{\rm MS}$ scheme. We demonstrate that the renormalization and the perturbative matching in Bjorken-$x$ space yield a reliable determination of the valence quark distribution for $0.03\lesssim x \lesssim 0.80$ with 5-20\% uncertainties.

preprint, 2022, arXiv

Optimal annuitization post-retirement with labor income

Evidence shows that the labor participation rate of retirement age cohorts is non-negligible, and it is a widespread phenomenon globally. In the United States, the labor force participation rate for workers age 75 and older is projected to be over 10 percent by 2026 as reported by the Bureau of Labor Statistics. The prevalence of post-retirement work changes existing considerations of optimal annuitization, a research question further complicated by novel factors such as post-retirement labor rates, wage rates, and capacity or willingness to work. To our knowledge, this poses a practical and theoretical problem not previously investigated in actuarial literature. In this paper, we study the problem of post-retirement annuitization with extra labor income in the framework of stochastic control, optimal stopping, and expected utility maximization. The utility functions are of the Cobb-Douglas type. The martingale methodology and duality techniques are employed to obtain closed-form solutions for the dual and primal problems. The effect of labor income is investigated by exploiting the explicit solutions and Monte-Carlo simulation. The latter reveals that the optimal annuitization time is strongly linear with respect to the initial wealth, with or without labor income. When it comes to optimal annuitization, we find that the wage and labor rates may play opposite roles. However, their impact is mediated by the leverage ratio.

preprint, 2022, arXiv

Pion form factor and charge radius from Lattice QCD at physical point

We present our results on the electromagnetic form factor of pion over a wide range of $Q^2$ using lattice QCD simulations with Wilson-clover valence quarks and HISQ sea quarks. We study the form factor at the physical point with a lattice spacing $a=0.076$ fm. To study the lattice spacing and quark mass effects, we also present results for 300 MeV pion at two different lattice spacings $a=0.04$ and 0.06 fm. The lattice calculations at the physical quark mass appear to agree with the experimental results. Through fits to the form factor, we estimate the charge radius of pion for physical pion mass to be $\langle r_π^2 \rangle=0.42(2)~{\rm fm}^2$.

preprint, 2022, arXiv

RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling

Recent advances in large-scale pre-training such as GPT-3 allow seemingly high quality text to be generated from a given prompt. However, such generation systems often suffer from problems of hallucinated facts, and are not inherently designed to incorporate useful external information. Grounded generation models appear to offer remedies, but their training typically relies on rarely-available parallel data where information-relevant documents are provided for context. We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal. The model learns to reward retrieval of the documents with the highest utility in generation, and attentively combines them using a Mixture-of-Experts (MoE) ensemble to generate follow-on text. We demonstrate that both generator and retriever can take advantage of this joint training and work synergistically to produce more informative and relevant text in both prose and dialogue generation.

preprint, 2022, arXiv

The Mathematics of the Ensemble Theory

This study shows that the generalized Boltzmann distribution is the only distribution mathematically consistent with thermodynamics when the system is described by an ensemble of a certain mathematical form. This mathematical form is very general, such that the canonical, grand-canonical, or isothermal-isobaric ensemble theories are all special cases of this form. Compared with the standard textbook formalism of statistical mechanics (SM), this approach does not require a prior distribution, does not assume the functional form or maximization of entropy, and employs fewer assumptions. Therefore, this new insight challenges the belief in the requirement of a prior distribution in SM and provides a new way to derive the Boltzmann distribution. This study also reveals the logical and mathematical constraints of SM's fundamental components; therefore, it could potentially benefit researchers on non-Boltzmann-Gibbs SM and philosophers studying the foundations of SM.

preprint2022arXiv

Trust Enhancement Issues in Program Repair

Automated program repair is an emerging technology that seeks to automatically rectify bugs and vulnerabilities using learning, search, and semantic analysis. Trust in automatically generated patches is necessary for achieving greater adoption of program repair. Towards this goal, we survey more than 100 software practitioners to understand the artifacts and setups needed to enhance trust in automatically generated patches. Based on the feedback from the survey on developer preferences, we quantitatively evaluate existing test-suite based program repair tools. We find that they cannot produce high-quality patches within a top-10 ranking and an acceptable time period of 1 hour. The developer feedback from our qualitative study and the observations from our quantitative examination of existing repair tools point to actionable insights to drive program repair research. Specifically, we note that producing repairs within an acceptable time-bound is very much dependent on leveraging an abstract search space representation of a rich enough search space. Moreover, while additional developer inputs are valuable for generating or ranking patches, developers do not seem to be interested in a significant human-in-the-loop interaction.

preprint2021arXiv

High Entropy Oxide Relaxor Ferroelectrics

Relaxor ferroelectrics are important in technological applications due to a strong electromechanical response, energy storage capacity, electrocaloric effect, and pyroelectric energy conversion properties. Current efforts to discover and design new materials in this class generally rely on substitutional doping of known ferroelectrics, as slight changes to local compositional order can significantly affect the Curie temperature, morphotropic phase boundary, and electromechanical responses. In this work, we demonstrate that moving to the strong limit of compositional complexity in an ABO3 perovskite allows stabilization of novel relaxor responses that do not rely on a single narrow phase transition region. Entropy-assisted synthesis approaches are used to create single crystal Ba(Ti0.2Sn0.2Zr0.2Hf0.2Nb0.2)O3 [Ba(5B)O] films. The high levels of configurational disorder present in this system are found to influence dielectric relaxation, phase transitions, nano-polar domain formation, and Curie temperature. Temperature-dependent dielectric, Raman spectroscopy and second-harmonic generation measurements reveal multiple phase transitions, a high Curie temperature of 570 K, and the relaxor ferroelectric nature of Ba(5B)O films. First-principles theory calculations are used to predict possible combinations of cations to quantify the relative feasibility of formation of highly disordered single-phase perovskite systems. The ability to stabilize single-phase perovskites with such a large number of different cations on the B-sites offers new possibilities for designing high-performance materials for piezoelectric, pyroelectric and tunable dielectric applications.

preprint2021arXiv

Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration

We are motivated by the real challenges presented in a human-robot system to develop new designs that are efficient at data level and with performance guarantees such as stability and optimality at systems level. Existing approximate/adaptive dynamic programming (ADP) results that consider system performance theoretically are not readily providing practically useful learning control algorithms for this problem; and reinforcement learning (RL) algorithms that address the issue of data efficiency usually do not have performance guarantees for the controlled system. This study fills these important voids by introducing innovative features to the policy iteration algorithm. We introduce flexible policy iteration (FPI), which can flexibly and organically integrate experience replay and supplemental values from prior experience into the RL controller. We show system level performances including convergence of the approximate value function, (sub)optimality of the solution, and stability of the system. We demonstrate the effectiveness of the FPI via realistic simulations of the human-robot system. It is noted that the problem we face in this study may be difficult to address by design methods based on classical control theory as it is nearly impossible to obtain a customized mathematical model of a human-robot system either online or offline. The results we have obtained also indicate the great potential of RL control to solving realistic and challenging problems with high dimensional control inputs.

preprint2021arXiv

Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations

3D object representation learning is a fundamental challenge in computer vision to infer about the 3D world. Recent advances in deep learning have shown their efficiency in 3D object recognition, among which view-based methods have performed best so far. However, feature learning of multiple views in existing methods is mostly performed in a supervised fashion, which often requires a large amount of data labels with high costs. In contrast, self-supervised learning aims to learn multi-view feature representations without involving labeled data. To this end, we propose a novel self-supervised paradigm to learn Multi-View Transformation Equivariant Representations (MV-TER), exploring the equivariant transformations of a 3D object and its projected multiple views. Specifically, we perform a 3D transformation on a 3D object, and obtain multiple views before and after the transformation via projection. Then, we self-train a representation to capture the intrinsic 3D object representation by decoding 3D transformation parameters from the fused feature representations of multiple views before and after the transformation. Experimental results demonstrate that the proposed MV-TER significantly outperforms the state-of-the-art view-based approaches in 3D object classification and retrieval tasks, and show the generalization to real-world datasets.

preprint2020arXiv

3D Dynamic Point Cloud Denoising via Spatial-Temporal Graph Learning

The prevalence of accessible depth sensing and 3D laser scanning techniques has enabled the convenient acquisition of 3D dynamic point clouds, which provide efficient representation of arbitrarily-shaped objects in motion. Nevertheless, dynamic point clouds are often perturbed by noise due to hardware, software or other causes. While a plethora of methods have been proposed for static point cloud denoising, few efforts are made for the denoising of dynamic point clouds with varying number of irregularly-sampled points in each frame. In this paper, we represent dynamic point clouds naturally on graphs and address the denoising problem by inferring the underlying graph via spatio-temporal graph learning, exploiting both the intra-frame similarity and inter-frame consistency. Firstly, assuming the availability of a relevant feature vector per node, we pose spatial-temporal graph learning as optimizing a Mahalanobis distance metric $\mathbf{M}$, which is formulated as the minimization of graph Laplacian regularizer. Secondly, to ease the optimization of the symmetric and positive definite metric matrix $\mathbf{M}$, we decompose it into $\mathbf{M}=\mathbf{R}^{\top}\mathbf{R}$ and solve $\mathbf{R}$ instead via proximal gradient. Finally, based on the spatial-temporal graph learning, we formulate dynamic point cloud denoising as the joint optimization of the desired point cloud and underlying spatio-temporal graph, which leverages both intra-frame affinities and inter-frame consistency and is solved via alternating minimization. Experimental results show that the proposed method significantly outperforms independent denoising of each frame from state-of-the-art static point cloud denoising approaches.
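
The factorization $\mathbf{M}=\mathbf{R}^{\top}\mathbf{R}$ mentioned in this abstract can be illustrated with a small sketch. For any real $\mathbf{R}$, the quadratic form $\mathbf{d}^{\top}\mathbf{M}\mathbf{d}$ equals $\|\mathbf{R}\mathbf{d}\|^2$, so $\mathbf{M}$ is positive semidefinite by construction (and positive definite when $\mathbf{R}$ has full column rank). The helper names `gram`, `quad_form`, and `squared_norm_Rd` and the example matrix are hypothetical, and the proximal gradient step itself is not shown.

```python
def gram(R):
    """Return M = R^T R for a matrix R given as a list of rows."""
    rows, cols = len(R), len(R[0])
    return [
        [sum(R[k][i] * R[k][j] for k in range(rows)) for j in range(cols)]
        for i in range(cols)
    ]


def quad_form(M, d):
    """Return d^T M d."""
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def squared_norm_Rd(R, d):
    """Return ||R d||^2, which equals d^T (R^T R) d."""
    Rd = [sum(row[j] * d[j] for j in range(len(d))) for row in R]
    return sum(v * v for v in Rd)


R = [[1.0, 2.0], [0.0, 1.0]]  # illustrative values only
M = gram(R)
d = [1.0, -1.0]
print(M, quad_form(M, d), squared_norm_Rd(R, d))
```

Optimizing over an unconstrained $\mathbf{R}$ in this way avoids having to enforce the definiteness constraint on $\mathbf{M}$ explicitly.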

preprint2020arXiv

Bounds on the Jensen Gap, and Implications for Mean-Concentrated Distributions

This paper gives upper and lower bounds on the gap in Jensen's inequality, i.e., the difference between the expected value of a function of a random variable and the value of the function at the expected value of the random variable. The bounds depend only on growth properties of the function and specific moments of the random variable. The bounds are particularly useful for distributions that are concentrated around the mean, a commonly occurring scenario such as the average of i.i.d. samples and in statistical mechanics.
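
To make the quantity concrete, the sketch below estimates the Jensen gap, E[f(X)] - f(E[X]), from a list of samples. It only illustrates the definition and does not reproduce the paper's bounds; the function name `jensen_gap` and the sample values are hypothetical. For f(x) = x^2 the gap is exactly the (population) variance of the samples.

```python
import math


def jensen_gap(f, samples):
    """Empirical Jensen gap: mean of f(x) minus f of the mean."""
    n = len(samples)
    mean_x = sum(samples) / n
    mean_fx = sum(f(x) for x in samples) / n
    return mean_fx - f(mean_x)


samples = [1.0, 2.0, 3.0, 4.0]  # illustrative values only

# For f(x) = x^2 the gap equals the population variance (1.25 here).
print(jensen_gap(lambda x: x * x, samples))

# For any convex f, such as exp, the gap is non-negative.
print(jensen_gap(math.exp, samples))
```

When the samples are tightly concentrated around their mean, both printed values shrink toward zero, which is the regime the abstract highlights.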

preprint2020arXiv

DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation

We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges extracted from Reddit comment chains over a period spanning from 2005 through 2017, DialoGPT extends the Hugging Face PyTorch transformer to attain a performance close to human both in terms of automatic and human evaluation in single-turn dialogue settings. We show that conversational systems that leverage DialoGPT generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained model and training pipeline are publicly released to facilitate research into neural response generation and the development of more intelligent open-domain dialogue systems.

preprint2020arXiv

Dialogue Response Ranking Training with Large-Scale Human Feedback Data

Existing open-domain dialog models are generally trained to minimize the perplexity of target human responses. However, some human replies are more engaging than others, spawning more followup interactions. Current conversational models are increasingly capable of producing turns that are context-relevant, but in order to produce compelling agents, these models need to be able to predict and optimize for turns that are genuinely engaging. We leverage social media feedback data (number of replies and upvotes) to build a large-scale training dataset for feedback prediction. To alleviate possible distortion between the feedback and engagingness, we convert the ranking problem to a comparison of response pairs which involve few confounding factors. We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data and the resulting ranker outperformed several baselines. Particularly, our ranker outperforms the conventional dialog perplexity baseline with a large margin on predicting Reddit feedback. We finally combine the feedback prediction models and a human-like scoring model to rank the machine-generated dialog responses. Crowd-sourced human evaluation shows that our ranking method correlates better with real human preferences than baseline models.

preprint2020arXiv

Feature Graph Learning for 3D Point Cloud Denoising

Identifying an appropriate underlying graph kernel that reflects pairwise similarities is critical in many recent graph spectral signal restoration schemes, including image denoising, dequantization, and contrast enhancement. Existing graph learning algorithms compute the most likely entries of a properly defined graph Laplacian matrix $\mathbf{L}$, but require a large number of signal observations $\mathbf{z}$'s for a stable estimate. In this work, we assume instead the availability of a relevant feature vector $\mathbf{f}_i$ per node $i$, from which we compute an optimal feature graph via optimization of a feature metric. Specifically, we alternately optimize the diagonal and off-diagonal entries of a Mahalanobis distance matrix $\mathbf{M}$ by minimizing the graph Laplacian regularizer (GLR) $\mathbf{z}^{\top} \mathbf{L} \mathbf{z}$, where edge weight is $w_{i,j} = \exp\{-(\mathbf{f}_i - \mathbf{f}_j)^{\top} \mathbf{M} (\mathbf{f}_i - \mathbf{f}_j) \}$, given a single observation $\mathbf{z}$. We optimize diagonal entries via proximal gradient (PG), where we constrain $\mathbf{M}$ to be positive definite (PD) via linear inequalities derived from the Gershgorin circle theorem. To optimize off-diagonal entries, we design a block descent algorithm that iteratively optimizes one row and column of $\mathbf{M}$. To keep $\mathbf{M}$ PD, we constrain the Schur complement of sub-matrix $\mathbf{M}_{2,2}$ of $\mathbf{M}$ to be PD when optimizing via PG. Our algorithm mitigates full eigen-decomposition of $\mathbf{M}$, thus ensuring fast computation speed even when feature vector $\mathbf{f}_i$ has high dimension. To validate its usefulness, we apply our feature graph learning algorithm to the problem of 3D point cloud denoising, resulting in state-of-the-art performance compared to competing schemes in extensive experiments.
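
The edge weight and regularizer named in this abstract can be written out in a few lines. The sketch below assumes a fully connected graph, a combinatorial Laplacian $\mathbf{L} = \mathbf{D} - \mathbf{W}$, and one scalar signal value per node, so that $\mathbf{z}^{\top}\mathbf{L}\mathbf{z} = \sum_{i<j} w_{i,j}(z_i - z_j)^2$. The function names `edge_weight` and `glr` and the toy inputs are hypothetical, and the optimization of $\mathbf{M}$ is not shown.

```python
import math


def edge_weight(f_i, f_j, M):
    """w_ij = exp(-(f_i - f_j)^T M (f_i - f_j))."""
    d = [a - b for a, b in zip(f_i, f_j)]
    n = len(d)
    dist = sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))
    return math.exp(-dist)


def glr(z, features, M):
    """Graph Laplacian regularizer z^T L z with L = D - W.

    Computed through the equivalent pairwise form
    sum over i < j of w_ij * (z_i - z_j)^2.
    """
    total = 0.0
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            w = edge_weight(features[i], features[j], M)
            total += w * (z[i] - z[j]) ** 2
    return total


features = [[0.0], [1.0]]  # one-dimensional feature per node (toy data)
M = [[1.0]]                # 1x1 metric matrix (toy data)
z = [0.0, 2.0]             # single signal observation
print(glr(z, features, M))  # equals 4 * exp(-1) for these inputs
```

A metric that assigns small distances to nodes with similar signal values lowers this quantity only at the cost of large weights elsewhere, which is why the abstract constrains $\mathbf{M}$ during optimization.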

preprint2020arXiv

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

Recent advances in Graph Convolutional Neural Networks (GCNNs) have shown their efficiency for non-Euclidean data on graphs, which often require a large amount of labeled data with high cost. It is thus critical to learn graph feature representations in an unsupervised manner in practice. To this end, we propose a novel unsupervised learning of Graph Transformation Equivariant Representations (GraphTER), aiming to capture intrinsic patterns of graph structure under both global and local transformations. Specifically, we sample different groups of nodes from a graph and then transform them node-wise isotropically or anisotropically. Then, we self-train a representation encoder to capture the graph structures by reconstructing these node-wise transformations from the feature representations of the original and transformed graphs. In experiments, we apply the learned GraphTER to graphs of 3D point cloud data, and results on point cloud segmentation/classification show that GraphTER significantly outperforms state-of-the-art unsupervised approaches and pushes greatly closer towards the upper bound set by the fully supervised counterparts. The code is available at: https://github.com/gyshgx868/graph-ter.

preprint2020arXiv

MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform

We present MixingBoard, a platform for quickly building demos with a focus on knowledge grounded stylized text generation. We unify existing text generation algorithms in a shared codebase and further adapt earlier algorithms for constrained generation. To borrow advantages from different models, we implement strategies for cross-model integration, from the token probability level to the latent space level. An interface to external knowledge is provided via a module that retrieves on-the-fly relevant knowledge from passages on the web or any document collection. A user interface for local development, remote webpage access, and a RESTful API are provided to make it simple for users to build their own demos.

preprint2020arXiv

Pion valence quark PDF from lattice QCD

We present lattice results on the valence-quark structure of the pion using a coordinate space method within the framework of Large Momentum Effective Theory (LaMET). In this method one relies on the matrix elements of a Euclidean correlator in boosted hadronic states, which have an operator product expansion at short distance that allows us to extract the moments of PDFs. We renormalize the Euclidean correlator by forming the reduced Ioffe-time distribution (rITD), and reconstruct the second and fourth moments of the pion PDF by taking into account QCD evolution effects.

preprint2020arXiv

Sliding Over Graphene Grain Boundaries: A Step Towards Macroscale Superlubricity

In light of the race towards macroscale superlubricity of graphitic contacts, the effect of grain boundaries on their frictional properties becomes of central importance. Here, we elucidate the unique frictional mechanisms characterizing topological defects along typical grain boundaries that can vary from being nearly flat to highly corrugated, depending on the boundary misfit angle. We find that frictional energy dissipation over grain boundaries can originate from variations of compressibility along the surface, heat produced during defect (un)buckling events, and elastic energy storage in irreversible buckling processes. These may lead to atypical non-monotonic dependence of the averaged friction on the normal load. The knowledge gained in the present study constitutes an important step towards the realization of superlubricity in macroscopic graphitic contacts.

preprint2020arXiv

Super Resolution Convolutional Neural Network for Feature Extraction in Spectroscopic Data

Two-dimensional (2D) peak finding is a common practice in data analysis for physics experiments, which is typically achieved by computing the local derivatives. However, this method is inherently unstable when the local landscape is complicated, or the signal-to-noise ratio of the data is low. In this work, we propose a new method in which the peak tracking task is formalized as an inverse problem, and thus can be solved with a convolutional neural network (CNN). In addition, we show that the underlying physics principle of the experiments can be used to generate the training data. By generalizing the trained neural network on real experimental data, we show that the CNN method can achieve comparable or better results than traditional derivative-based methods. This approach can be further generalized in different physics experiments when the physical process is known.

preprint2020arXiv

Theory of Subcycle Linear Momentum Transfer in Strong-Field Tunneling Ionization

Interaction of a strong laser pulse with matter transfers not only energy but also linear momentum of the photons. Recent experimental advances have made it possible to detect the small amount of linear momentum delivered to the photoelectrons in strong-field ionization of atoms. We present numerical simulations as well as an analytical description of the subcycle phase (or time) resolved momentum transfer to an atom accessible by an attoclock protocol. We show that the light-field-induced momentum transfer is remarkably sensitive to properties of the ultrashort laser pulse such as its carrier-envelope phase and ellipticity. Moreover, we show that the subcycle resolved linear momentum transfer can provide novel insights into the interplay between nonadiabatic and nondipole effects in strong-field ionization. This work paves the way towards the investigation of the so-far unexplored time-resolved nondipole nonadiabatic tunneling dynamics.