Source author record

Hongyi Liu

Hongyi Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation and Language, Artificial Intelligence, Computer Vision, Machine Learning, Multimedia, cond-mat.mtrl-sci, Graphics, Information Retrieval, math.DG, physics.chem-ph, physics.comp-ph, Software Engineering

Catalog footprint

What is connected

10 works

12 topics

4 close collaborators


preprint, 2026, arXiv

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

Reasoning-capable large language models (LLMs) achieve strong performance on complex tasks but often exhibit overthinking after distillation, generating unnecessarily long chain-of-thought (CoT) reasoning even for simple inputs and incurring high inference cost. However, naively shortening reasoning length can degrade reasoning accuracy, as concise reasoning may be insufficient for certain inputs and lacks explicit supervision. We propose Auto Long-Short Reasoning (AutoL2S), a distillation framework that empowers non-reasoning LLMs to think thoroughly, but only when necessary. AutoL2S first learns a lightweight switching token from verified long and short CoTs to enable instance-wise selection between long and short reasoning. It then uses the long and short reasoning rollouts induced by the switching token in a GRPO-style loss to improve reasoning efficiency while maintaining accuracy. Experiments demonstrate that AutoL2S reduces reasoning length by up to 71% with minimal accuracy loss, yielding a markedly better trade-off in token length and inference time while preserving accuracy.
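As a rough illustration of the group-relative scoring that a GRPO-style loss relies on, the sketch below normalizes rewards within one group of rollouts for the same prompt. The reward shape (correctness minus a small per-token cost), the `length_weight` value and all function names are assumptions made for illustration; the abstract does not specify AutoL2S's actual objective.

```python
from statistics import mean, pstdev


def rollout_reward(correct, num_tokens, length_weight=0.001):
    # Hypothetical reward: 1 for a correct answer, minus a small per-token cost.
    return (1.0 if correct else 0.0) - length_weight * num_tokens


def group_relative_advantages(rewards, eps=1e-8):
    # GRPO-style normalization: center and scale rewards within the group,
    # so each rollout is scored relative to its siblings.
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four made-up rollouts for one prompt: (answer correct?, tokens generated).
rollouts = [(True, 120), (True, 900), (False, 150), (False, 1000)]
rewards = [rollout_reward(c, n) for c, n in rollouts]
advantages = group_relative_advantages(rewards)
```

Under this assumed reward, the short correct rollout receives the largest advantage and the long incorrect one the smallest, which is the direction of pressure the abstract describes.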

preprint, 2026, arXiv

Extrinsic Vector Field Processing

We propose a novel discretization of tangent vector fields for triangle meshes. Starting with a Phong map that continuously assigns normals to all points on the mesh, we define an extrinsic basis for continuous tangent vector fields by using the Rodrigues rotation to transport tangent vectors assigned to vertices to tangent vectors in the interiors of the triangles. As our vector fields are continuous and weakly differentiable, we can use them to define a covariant derivative field that can be evaluated almost everywhere on the mesh. Decomposing the covariant derivative into its diagonal (multiple of the identity), anti-symmetric, and trace-less symmetric components, we can define the standard operators used for vector field processing, including the Hodge Laplacian energy, Connection Laplacian energy, and Killing energy. Additionally, the ability to perform point-wise evaluation of the covariant derivative makes it possible for us to define the Lie bracket.
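The transport step can be pictured with a small sketch: given a tangent vector at a vertex, apply the rotation that takes the vertex normal to the interpolated normal at an interior point, using Rodrigues' rotation formula. This is one plausible reading of the abstract, not the paper's implementation; the helper names and the handling of degenerate cases are assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)


def transport(v, n_from, n_to, tol=1e-12):
    """Rotate v by the minimal rotation taking normal n_from to normal n_to."""
    n_from, n_to = normalize(n_from), normalize(n_to)
    axis = cross(n_from, n_to)
    sin_t = math.sqrt(dot(axis, axis))   # |n_from x n_to| = sin(theta)
    cos_t = dot(n_from, n_to)            # n_from . n_to   = cos(theta)
    if sin_t < tol:
        if cos_t > 0:
            return tuple(v)              # normals coincide: nothing to rotate
        raise ValueError("opposite normals: rotation axis is not unique")
    k = tuple(x / sin_t for x in axis)   # unit rotation axis
    k_cross_v = cross(k, v)
    k_dot_v = dot(k, v)
    # Rodrigues' formula: v cos(t) + (k x v) sin(t) + k (k . v)(1 - cos(t))
    return tuple(v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
                 for i in range(3))


# A tangent vector at a vertex with normal +z, carried to a point whose
# (made-up) interpolated normal is tilted toward +y.
n_vertex = (0.0, 0.0, 1.0)
n_interior = (0.0, 1.0, 1.0)
w = transport((0.0, 1.0, 0.0), n_vertex, n_interior)
```

Because the rotation maps one normal onto the other and preserves angles, the transported vector stays tangent (orthogonal to the new normal) and keeps its length.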

preprint, 2026, arXiv

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

Long-horizon LLM agents leave traces that could become reusable experience, but raw trajectories are noisy and hard to govern. We treat Agent Skills as an experience schema that couples executable scripts with non-executable guidance on procedures. Yet open skill ecosystems contain redundant, uneven, environment-sensitive artifacts, and indiscriminate updates can pollute future context. We present SkillsVote, a lifecycle-governance framework for Agent Skills from collection and recommendation to evolution. SkillsVote profiles a million-scale open-source corpus for environment requirements, quality, and verifiability, then synthesizes tasks for verifiable skills. Before execution, SkillsVote performs agentic library search over a structured skill library to expose instructional skill context. After execution, it decomposes trajectories into skill-linked subtasks, attributes outcomes to skill use, agent exploration, environment, and result signals, and admits only successful reusable discoveries to evidence-gated updates. In our evaluation, offline evolution improves GPT-5.2 on Terminal-Bench 2.0 by up to 7.9 pp, while online evolution improves SWE-Bench Pro by up to 2.6 pp. Overall, governed external skill libraries can improve frozen agents without model updates when systems control exposure, credit, and preservation.

preprint, 2026, arXiv

Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning

Reinforcement fine-tuning (RFT) has become a core paradigm for post-training large language models, yet its training process remains highly fragile. Existing efforts mainly improve reliability at the system level or address specific issues in individual subproblems by modifying RFT algorithms. Despite their effectiveness, they largely overlook the problem of failure management at the training-process level. When training goes wrong, practitioners still rely heavily on expert-driven manual inspection and correction, and automatic failure management for RFT remains largely unexplored. In this paper, we take a first step toward systematic failure management for reinforcement fine-tuning. To understand the empirical structure of RFT failures, we first construct RFT-FaultBench, the first benchmark for fine-grained failures in reinforcement fine-tuning, covering 5 fault families, 16 fault types, 779 training runs, 22,549 train-step records, and 1,457,288 trajectory-level records. Based on this benchmark, we conduct a comprehensive empirical study showing that RFT failures are both observable from training dynamics and distinguishable through their empirical fault fingerprints. Building on these findings, we propose RFT-FM, an automatic failure management framework for reinforcement fine-tuning that unifies anomaly detection, failure diagnosis, and auto remediation in a closed loop. Experimental results show that RFT-FaultBench is neither trivial nor saturated: it exhibits clear anomaly structure while still posing substantial challenges, especially under subtle fault settings. Moreover, RFT-FM shows strong capability in detecting, diagnosing, and mitigating RFT failures.

preprint, 2022, arXiv

A compactness theorem for hyperkaehler 4-manifolds with boundary

In this paper, we study the compactness of a boundary value problem for hyperkaehler 4-manifolds. We show that under certain topological conditions and the positive mean curvature condition on the boundary, a sequence of hyperkaehler triples converges smoothly up to diffeomorphisms if and only if their restrictions to the boundary converge smoothly up to diffeomorphisms. We also generalize this result to torsion-free hypersymplectic triples.

preprint, 2022, arXiv

Efficient Chemical Space Exploration Using Active Learning Based on Marginalized Graph Kernel: an Application for Predicting the Thermodynamic Properties of Alkanes with Molecular Simulation

We introduce an explorative active learning (AL) algorithm based on Gaussian process regression and a marginalized graph kernel (GPR-MGK) to explore chemical space at minimum cost. Using high-throughput molecular dynamics simulation to generate data and a graph neural network (GNN) to predict, we constructed an active learning molecular simulation framework for thermodynamic property prediction. Specifically, targeting 251,728 alkane molecules consisting of 4 to 19 carbon atoms and their liquid physical properties (densities, heat capacities, and vaporization enthalpies), we use the AL algorithm to select the most informative molecules to represent the chemical space. Validation on computational and experimental test sets shows that only 313 molecules (0.124% of the total) were sufficient to train an accurate GNN model with $R^2 > 0.99$ for computational test sets and $R^2 > 0.94$ for experimental test sets. We highlight two advantages of the presented AL algorithm: compatibility with high-throughput data generation and reliable uncertainty quantification.
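A minimal sketch of uncertainty-driven selection with a Gaussian process follows. It repeatedly picks the candidate with the largest posterior variance and then conditions the covariance on that pick. Note the substitutions: an RBF kernel on made-up scalar descriptors stands in for the marginalized graph kernel on molecular graphs, and the function names, `noise` value and pool are hypothetical. Whether the paper's selection rule matches this greedy maximum-variance rule is an assumption.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    # Stand-in similarity; the paper uses a marginalized graph kernel instead.
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def select_most_informative(pool, budget, noise=1e-6):
    """Greedy max-variance selection over a candidate pool."""
    n = len(pool)
    # Prior covariance between every pair of candidates.
    cov = [[rbf_kernel(pool[i], pool[j]) for j in range(n)] for i in range(n)]
    chosen = []
    for _ in range(budget):
        # The most uncertain unselected candidate is the most informative.
        i = max((j for j in range(n) if j not in chosen), key=lambda j: cov[j][j])
        chosen.append(i)
        # Condition the GP on a noisy observation at i (rank-one update).
        # The posterior covariance does not depend on the observed value.
        denom = cov[i][i] + noise
        col = [cov[r][i] for r in range(n)]
        cov = [[cov[r][c] - col[r] * col[c] / denom for c in range(n)]
               for r in range(n)]
    return chosen


# Made-up descriptors forming three well-separated clusters.
pool = [0.0, 0.1, 0.2, 5.0, 5.1, 10.0]
chosen = select_most_informative(pool, budget=3)
picked = sorted(pool[i] for i in chosen)
```

With three picks, the selection lands on one candidate per cluster rather than three near-duplicates, which is the sense in which a small subset can represent a large space.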

preprint, 2016, arXiv

Columbia MVSO Image Sentiment Dataset

The Multilingual Visual Sentiment Ontology (MVSO) consists of 15,600 concepts in 12 different languages that are strongly related to emotions and sentiments expressed in images. These concepts are defined in the form of Adjective-Noun Pairs (ANPs), which are crawled and discovered from the online image forum Flickr. In this work, we used Amazon Mechanical Turk as a crowd-sourcing platform to collect human judgments on sentiments expressed in images that are uniformly sampled over 3,911 English ANPs extracted from a tag-restricted subset of MVSO. Our goal is to use the dataset as a benchmark for the evaluation of systems that automatically predict sentiments in images or ANPs.

preprint, 2016, arXiv

EventNet Version 1.1 Technical Report

EventNet is a large-scale video corpus and event ontology consisting of 500 events associated with event-specific concepts. To improve the quality of the current EventNet, we take the following steps and introduce EventNet version 1.1: (1) manually verify the correctness of event labels for all videos; (2) remove the YouTube user bias by limiting the number of videos in each event from the same YouTube user to at most 3; (3) remove videos that are no longer accessible online; (4) remove videos belonging to multiple event categories. After the above procedure, some events may contain only a small number of videos, so we crawl more videos for those events to ensure that every event contains more than 50 videos. Finally, EventNet version 1.1 contains 67,641 videos, 500 events, and 5,028 event-specific concepts. In addition, we train a Convolutional Neural Network (CNN) model for event classification by fine-tuning AlexNet on EventNet version 1.1. We then use the trained CNN model to extract FC7-layer features and train a binary linear SVM classifier for each event-specific concept. We believe this new version of EventNet will significantly facilitate research in computer vision and multimedia, and we will put it online for public download in the future.
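Cleaning steps (2) to (4) lend themselves to a short filtering sketch. The record layout, the function name and the sample data are hypothetical, and the abstract does not say which of a user's surplus videos are dropped or in what order the filters run. This version removes inaccessible and multi-event videos first and then keeps the first three per user and event, as one possible choice. Step (1), manual label verification, is not automated here.

```python
from collections import Counter, defaultdict


def clean_videos(records, max_per_user=3):
    """Filter (video_id, event, uploader, is_accessible) records."""
    # Collect every event each video is listed under, to detect overlaps.
    events_per_video = defaultdict(set)
    for video_id, event, _, _ in records:
        events_per_video[video_id].add(event)

    kept = []
    per_user_event = Counter()
    for video_id, event, uploader, accessible in records:
        if not accessible:
            continue  # step (3): no longer accessible online
        if len(events_per_video[video_id]) > 1:
            continue  # step (4): belongs to multiple event categories
        if per_user_event[(event, uploader)] >= max_per_user:
            continue  # step (2): cap videos per uploader within an event
        per_user_event[(event, uploader)] += 1
        kept.append((video_id, event, uploader))
    return kept


# Made-up records for illustration only.
records = [
    ("v1", "wedding", "u1", True),
    ("v2", "wedding", "u1", True),
    ("v3", "wedding", "u1", True),
    ("v4", "wedding", "u1", True),   # fourth video from u1 in this event
    ("v5", "wedding", "u2", False),  # inaccessible
    ("v6", "wedding", "u3", True),
    ("v6", "parade", "u3", True),    # same video under two events
    ("v7", "parade", "u1", True),
]
kept = clean_videos(records)
kept_ids = [video_id for video_id, _, _ in kept]
```

The cap is counted per (event, uploader) pair, so the same uploader can still contribute up to three videos to a different event.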

preprint, 2016, arXiv

Multilingual Visual Sentiment Concept Matching

The impact of culture in visual emotion perception has recently captured the attention of multimedia research. In this study, we provide powerful computational linguistics tools to explore, retrieve and browse a dataset of 16K multilingual affective visual concepts and 7.3M Flickr images. First, we design an effective crowdsourcing experiment to collect human judgements of sentiment connected to the visual concepts. We then use word embeddings to represent these concepts in a low dimensional vector space, allowing us to expand the meaning around concepts, and thus enabling insight about commonalities and differences among different languages. We compare a variety of concept representations through a novel evaluation task based on the notion of visual semantic relatedness. Based on these representations, we design clustering schemes to group multilingual visual concepts, and evaluate them with novel metrics based on the crowdsourced sentiment annotations as well as visual semantic relatedness. The proposed clustering framework enables us to analyze the full multilingual dataset in-depth and also show an application on a facial data subset, exploring cultural insights of portrait-related affective visual concepts.

preprint, 2013, arXiv

Molecular dynamics (MD) calculation of the zeta potential of neutral surfaces

Molecular dynamics (MD) simulations of the zeta potential are so poor that it has become common to term their predictions 'apparent'. Here we demonstrate how zeta potentials that agree with measured values can be calculated by: (1) integrating the net average charge in surface-parallel layers from the midpoint of the fluid layer (where the electrostatic potential is zero) to and then into two solid caps, (2) determining the position of the slipping plane with separate Couette flow models, and (3) calculating the charge distribution and electrostatic potential under static conditions. The solids we model are charge-neutral surfaces composed of atoms with zero charge or of charge-balanced monovalent or divalent ions. The calculated zeta potentials are within a few millivolts of measured values, and the measured values fall within the simulation error bars. Insights provided by the improved MD simulations into the complex phenomena that affect surface charge and zeta potential are discussed.
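Step (1) amounts to integrating the one-dimensional Poisson equation outward from the fluid midpoint. The sketch below does this with two cumulative trapezoid sums over layer-averaged charge densities and reads the potential at a given slipping-plane index. Several things here are assumptions rather than details from the abstract: the electric field is taken to be zero at the midpoint (by symmetry), the vacuum permittivity is used as the default, and the function names and the `slip_index` input (which the paper obtains from separate Couette flow models) are hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (assumed default)


def potential_profile(rho, dz, eps=EPS0):
    """Integrate d2(psi)/dz2 = -rho/eps with psi(0) = 0 and field(0) = 0.

    rho[i] is the average charge density (C/m^3) of the layer at z = i * dz,
    with z = 0 at the midpoint of the fluid layer.
    """
    # First integration: field(z) = (1/eps) * integral of rho from 0 to z.
    field = [0.0]
    for i in range(1, len(rho)):
        field.append(field[-1] + 0.5 * (rho[i - 1] + rho[i]) * dz / eps)
    # Second integration: psi(z) = -integral of field from 0 to z.
    psi = [0.0]
    for i in range(1, len(field)):
        psi.append(psi[-1] - 0.5 * (field[i - 1] + field[i]) * dz)
    return psi


def zeta_potential(rho, dz, slip_index, eps=EPS0):
    # Potential at the slipping plane, relative to the fluid midpoint.
    return potential_profile(rho, dz, eps)[slip_index]


# Sanity check in reduced units: a uniform density rho0 should give
# psi(z) = -rho0 * z**2 / (2 * eps).
check = potential_profile([2.0] * 11, dz=0.1, eps=1.0)
```

For a uniform charge density the trapezoid sums are exact up to rounding, so the check reproduces the analytic parabola; real layer profiles from a simulation would be noisier and need averaging over many frames.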
