Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
38 works
0 followers
23 topics
4 close collaborators

Published work

38 published item(s)

preprint, 2026, arXiv

ARC: Active and Reflection-driven Context Management for Long-Horizon Information Seeking Agents

Large language models are increasingly deployed as research agents for deep search and long-horizon information seeking, yet their performance often degrades as interaction histories grow. This degradation, known as context rot, reflects a failure to maintain coherent and task-relevant internal states over extended reasoning horizons. Existing approaches primarily manage context through raw accumulation or passive summarization, treating it as a static artifact and allowing early errors or misplaced emphasis to persist. Motivated by this perspective, we propose ARC, which is the first framework to systematically formulate context management as an active, reflection-driven process that treats context as a dynamic internal reasoning state during execution. ARC operationalizes this view through reflection-driven monitoring and revision, allowing agents to actively reorganize their working context when misalignment or degradation is detected. Experiments on challenging long-horizon information-seeking benchmarks show that ARC consistently outperforms passive context compression methods, achieving up to an 11% absolute improvement in accuracy on BrowseComp-ZH with Qwen2.5-32B-Instruct.

preprint, 2026, arXiv

EntroCoT: Enhancing Chain-of-Thought via Adaptive Entropy-Guided Segmentation

Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. We find existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem, where correct final answers are derived from hallucinated, redundant, or logically invalid intermediate steps. This paper proposes EntroCoT, a unified framework for automatically identifying and refining low-quality CoT supervision traces. EntroCoT first proposes an entropy-based mechanism to segment the reasoning trace into multiple steps at uncertain junctures, and then introduces a Monte Carlo rollout-based mechanism to evaluate the marginal contribution of each step. By accurately filtering deceptive reasoning samples, EntroCoT constructs a high-quality dataset where every intermediate step in each reasoning trace facilitates the final answer. Extensive experiments on mathematical benchmarks demonstrate that fine-tuning on the subset constructed by EntroCoT consistently outperforms the baselines of full-dataset supervision.
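
The entropy-based segmentation idea can be illustrated with a small sketch. This is not EntroCoT's actual procedure: the abstract does not give the segmentation rule, so the fixed `threshold`, the function names, and the toy distributions below are all assumptions made for illustration (the title suggests the real criterion is adaptive rather than fixed).

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def segment_by_entropy(
    tokens: Sequence[str],
    dists: Sequence[Sequence[float]],
    threshold: float,
) -> List[List[str]]:
    """Split a trace into steps, opening a new step at each uncertain token.

    Hypothetical rule: a token whose predictive entropy exceeds `threshold`
    is treated as an "uncertain juncture" and starts a new segment.
    """
    if len(tokens) != len(dists):
        raise ValueError("tokens and dists must have the same length")
    segments: List[List[str]] = []
    current: List[str] = []
    for tok, dist in zip(tokens, dists):
        if current and token_entropy(dist) > threshold:
            segments.append(current)
            current = []
        current.append(tok)
    if current:
        segments.append(current)
    return segments


# Toy trace: entropies are 0, ln 2, 0 and ln 4 respectively.
toy_tokens = ["a", "b", "c", "d"]
toy_dists = [[1.0], [0.5, 0.5], [1.0], [0.25, 0.25, 0.25, 0.25]]
toy_segments = segment_by_entropy(toy_tokens, toy_dists, threshold=0.6)
```

With the toy inputs, the second and fourth tokens exceed the threshold, so the trace is cut before each of them.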

preprint, 2026, arXiv

Formal Skill: Programmable Runtime Skills for Efficient and Accurate LLM Agents

Large Language Model (LLM) agents increasingly act inside real workspaces, where tools and skills determine whether model reasoning becomes reliable action. Existing skills remain largely informal: Markdown skills and instruction packs encode procedures as long natural-language documents, while function calling, Model Context Protocol (MCP) servers, and framework tools structure individual actions but usually leave workflow state, policy enforcement, and completion discipline outside the skill itself. We introduce Formal Skill, a runtime-native abstraction that represents reusable capability with JSON metadata and action schemas, reliable Python executors, hook-governed control logic, Formal Skill routing, and skill-local runtime state. By moving reusable procedure from repeated prompt text into executable state machines and hook policies, Formal Skill gives agents a token-efficient and enforceable control surface. We implement the abstraction in FairyClaw, an open-source event-driven runtime for executable, observable, and composable Formal Skills. On Harness-Bench, FairyClaw obtains highly competitive average scores while using substantially fewer tokens, with especially strong results on tasks that expose the role of Formal Skill.

preprint, 2026, arXiv

From Static Risk to Dynamic Trajectories: Toward World-Model-Inspired Clinical Prediction

Clinical decision-making is a feedback system where risk estimates influence treatment, which in turn changes disease trajectories, and both shape clinicians' measurement practices. Static prediction often fails clinically: models trained on observational care logs conflate disease biology with clinician behavior, particularly under treatment confounder feedback and irregular or informative observation. This Review focuses on intervention-aware disease trajectory modeling in clinical AI--methods estimating patient-specific longitudinal disease evolution and assessing trajectory changes under alternative treatments. We organize the field around six linked components: three decision tasks (factual forecasting, counterfactual estimation, policy evaluation) and three data-generating mechanisms (disease evolution, treatment assignment, observation process) that determine identifiability. We present the first unified framework bridging forecasting, counterfactual trajectories, and policy evaluation across discrete/continuous time, explicitly addressing treatment assignment, time-varying confounding, and observation bias. We synthesize key method families (multistate/joint models, temporal point-process, deep sequence architectures, longitudinal causal inference), map them to relevant components, and align evaluation with claim strength via overlap diagnostics, uncertainty quantification, off-policy robustness, and target-trial validation. This synthesis advances benchmark prediction to decision-grade clinical evidence, enabling treatment-sensitive individualized futures, pre-deployment policy stress-testing, and safer closed-loop learning health systems that adapt/abstain when evidence is insufficient.

preprint, 2026, arXiv

Leveraging Error Diversity in Group Rollouts for Reinforcement Learning

Reinforcement Learning from Verifiable Rewards (RLVR) typically samples multiple responses per prompt and assigns binary rewards based on individual correctness, yet the collective structure of the group output, specifically the distribution of errors, is largely discarded. We identify this as a missed opportunity: empirical analysis reveals that error diversity within a group is a strong predictor of training success, with problems eliciting diverse wrong answers benefiting substantially more from RLVR than those producing homogeneous failures. Motivated by this observation, we propose Error Diversity Advantage Shaping (EDAS), a lightweight, algorithm-agnostic technique that modulates the advantage signal for incorrect rollouts based on intra-group error diversity. EDAS amplifies penalties for dominant, repeated errors and attenuates penalties for rare, exploratory ones, thereby encouraging the model to maintain diverse reasoning paths and discouraging error perseveration. Crucially, EDAS operates as a simple post-hoc adjustment that can be seamlessly integrated into any RLVR algorithm. We validate EDAS on top of several mainstream RLVR methods across a series of models and seven challenging math benchmarks, demonstrating consistent improvements. Notably, EDAS yields an average improvement of 6.29 points over DAPO on Qwen3-8B across seven benchmarks, confirming that exploiting the latent information in group rollouts is a broadly effective strategy for strengthening RLVR.
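
A minimal sketch of diversity-aware advantage shaping follows. The abstract does not give the EDAS formula, so the scaling rule here (penalty scaled by how far an error's share of the wrong answers sits above or below a uniform share), the `alpha` parameter, and the function name are assumptions for illustration only. It also assumes incorrect rollouts carry negative advantages.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def shape_advantages(
    answers: Sequence[Hashable],
    correct: Sequence[bool],
    advantages: Sequence[float],
    alpha: float = 0.5,
) -> List[float]:
    """Rescale advantages of incorrect rollouts by intra-group error frequency.

    Hypothetical rule: scale = 1 + alpha * (share - uniform), where `share`
    is the fraction of wrong rollouts giving this answer and `uniform` is
    1 / (number of distinct wrong answers). Dominant errors get scale > 1
    (larger penalty), rare errors get scale < 1 (smaller penalty).
    Correct rollouts are left unchanged.
    """
    wrong = [a for a, ok in zip(answers, correct) if not ok]
    if not wrong:
        return list(advantages)
    counts = Counter(wrong)
    uniform = 1.0 / len(counts)
    shaped: List[float] = []
    for ans, ok, adv in zip(answers, correct, advantages):
        if ok:
            shaped.append(adv)
        else:
            share = counts[ans] / len(wrong)
            shaped.append(adv * (1.0 + alpha * (share - uniform)))
    return shaped


# One correct rollout, three rollouts repeating the error "7", one rare error "9".
group_shaped = shape_advantages(
    answers=["4", "7", "7", "7", "9"],
    correct=[True, False, False, False, False],
    advantages=[1.0, -1.0, -1.0, -1.0, -1.0],
    alpha=0.5,
)
```

Because the adjustment only rescales existing advantages, a function like this can sit after whatever advantage estimator the underlying algorithm already uses, which matches the post-hoc framing in the abstract.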

preprint, 2026, arXiv

LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation

Large Language Models (LLMs) perform well on standard reasoning and question-answering benchmarks, yet such evaluations often fail to capture their ability to handle long-tail, expertise-intensive knowledge in real-world professional scenarios. We introduce LPFQA, a long-tail knowledge benchmark derived from authentic professional forum discussions, covering 7 academic and industrial domains with 430 curated tasks grounded in practical expertise. LPFQA evaluates specialized reasoning, domain-specific terminology understanding, and contextual interpretation, and adopts a hierarchical difficulty structure to ensure semantic clarity and uniquely identifiable answers. Experiments on multiple mainstream LLMs reveal substantial performance gaps, particularly on tasks requiring deep domain reasoning, exposing limitations overlooked by existing benchmarks. Overall, LPFQA provides an authentic and discriminative evaluation framework that complements prior benchmarks and informs future LLM development.

preprint, 2026, arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Recent advances in coding agents suggest rapid progress toward autonomous software development, yet existing benchmarks fail to rigorously evaluate the long-horizon capabilities required to build complete software systems. Most prior evaluations focus on localized code generation, scaffolded completion, or short-term repair tasks, leaving open the question of whether agents can sustain coherent reasoning, planning, and execution over the extended horizons demanded by real-world repository construction. To address this gap, we present NL2Repo Bench, a benchmark explicitly designed to evaluate the long-horizon repository generation ability of coding agents. Given only a single natural-language requirements document and an empty workspace, agents must autonomously design the architecture, manage dependencies, implement multi-module logic, and produce a fully installable Python library. Our experiments across state-of-the-art open- and closed-source models reveal that long-horizon repository generation remains largely unsolved: even the strongest agents achieve below 40% average test pass rates and rarely complete an entire repository correctly. Detailed analysis uncovers fundamental long-horizon failure modes, including premature termination, loss of global coherence, fragile cross-file dependencies, and inadequate planning over hundreds of interaction steps. NL2Repo Bench establishes a rigorous, verifiable testbed for measuring sustained agentic competence and highlights long-horizon reasoning as a central bottleneck for the next generation of autonomous coding agents.

preprint, 2026, arXiv

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Large language models (LLMs) often fail to learn effective long chain-of-thought (Long CoT) reasoning by imitating humans or non-Long-CoT LLMs. To understand this, we propose that, in a unified view, effective and learnable Long CoT trajectories feature stable molecular-like structures formed by three interaction types: Deep-Reasoning (covalent-like), Self-Reflection (hydrogen-bond-like), and Self-Exploration (van der Waals-like). Analysis of distilled trajectories reveals these structures emerge from Long CoT fine-tuning, not keyword imitation. We introduce Effective Semantic Isomers and show that only bonds promoting fast entropy convergence support stable Long CoT learning, while structural competition impairs training. Drawing on these findings, we present Mole-Syn, a distribution-transfer-graph method that guides synthesis of effective Long CoT structures, boosting performance and RL stability across benchmarks.

preprint, 2024, arXiv

Slot-guided Volumetric Object Radiance Fields

We present a novel framework for 3D object-centric representation learning. Our approach effectively decomposes complex scenes into individual objects from a single image in an unsupervised fashion. This method, called slot-guided Volumetric Object Radiance Fields (sVORF), composes volumetric object radiance fields with object slots as a guidance to implement unsupervised 3D scene decomposition. Specifically, sVORF obtains object slots from a single image via a transformer module, maps these slots to volumetric object radiance fields with a hypernetwork and composes object radiance fields with the guidance of object slots at a 3D location. Moreover, sVORF significantly reduces memory requirement due to small-sized pixel rendering during training. We demonstrate the effectiveness of our approach by showing top results in scene decomposition and generation tasks of complex synthetic datasets (e.g., Room-Diverse). Furthermore, we also confirm the potential of sVORF to segment objects in real-world scenes (e.g., the LLFF dataset). We hope our approach can provide preliminary understanding of the physical world and help ease future research in 3D object-centric representation learning.

preprint, 2022, arXiv

3D hyperbolic Navier-Stokes equations in a thin strip: global well-posedness and hydrostatic limit in Gevrey space

We consider the hyperbolic version of the three-dimensional anisotropic Navier-Stokes equations in a thin strip and its hydrostatic limit, which is a hyperbolic Prandtl-type system. We prove global-in-time existence and uniqueness for the two systems, and justify the hydrostatic limit, when the initial data belong to the Gevrey function space with index 2. The proof is based on a direct energy method that exploits the damping effect in the systems.

preprint, 2022, arXiv

Anchor DETR: Query Design for Transformer-Based Object Detection

In this paper, we propose a novel query design for transformer-based object detection. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, each learned embedding does not have an explicit physical meaning and we cannot explain where it will focus. It is difficult to optimize as the prediction slot of each object query does not have a specific mode. In other words, each object query will not focus on a specific region. To solve these problems, in our query design, object queries are based on anchor points, which are widely used in CNN-based detectors, so each object query focuses on the objects near the anchor point. Moreover, our query design can predict multiple objects at one position to solve the difficulty: "one region, multiple objects". In addition, we design an attention variant, which can reduce the memory cost while achieving similar or better performance than the standard attention in DETR. Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, achieves better performance and runs faster than DETR with 10$\times$ fewer training epochs. For example, it achieves 44.2 AP with 19 FPS on the MSCOCO dataset when using the ResNet50-DC5 feature and training for 50 epochs. Extensive experiments on the MSCOCO benchmark prove the effectiveness of the proposed methods. Code is available at https://github.com/megvii-research/AnchorDETR.

preprint, 2022, arXiv

Cannikin's Law in Tensor Modeling: A Rank Study for Entanglement and Separability in Tensor Complexity and Model Capacity

This study clarifies the proper criteria to assess the modeling capacity of a general tensor model. The work analyzes the problem based on the study of tensor ranks, which is not a well-defined quantity for higher order tensors. To proceed, the author introduces the separability issue to discuss the Cannikin's law of tensor modeling. Interestingly, a connection between entanglement studied in information theory and tensor analysis is established, shedding new light on the theoretical understanding of modeling capacity problems.

preprint, 2022, arXiv

Efficient Distance-Optimal Tethered Path Planning in Planar Environments: The Workspace Convexity

The main contribution of this paper is the proof of the convexity of the omni-directional tethered robot workspace (namely, the set of all tether-length-admissible robot configurations), as well as a set of distance-optimal tethered path planning algorithms that leverage the workspace convexity. The workspace is proven to be topologically a simply-connected subset and geometrically a convex subset of the set of all configurations. As a direct result, the tether-length-admissible optimal path between two configurations is proven exactly the untethered collision-free locally shortest path in the homotopy specified by the concatenation of the tether curve of the given configurations, which can be simply constructed by performing an untethered path shortening process in the 2D environment instead of a path searching process in the pre-calculated workspace. The convexity is an intrinsic property to the tethered robot kinematics, thus has universal impacts on all high-level distance-optimal tethered path planning tasks: The most time-consuming workspace pre-calculation (WP) process is replaced with a goal configuration pre-calculation (GCP) process, and the homotopy-aware path searching process is replaced with untethered path shortening processes. Motivated by the workspace convexity, efficient algorithms to solve the following problems are naturally proposed: (a) The optimal tethered reconfiguration (TR) planning problem is solved by a locally untethered path shortening (UPS) process, (b) The classic optimal tethered path (TP) planning problem (from a starting configuration to a goal location whereby the target tether state is not assigned) is solved by a GCP process and $n$ UPS processes, where $n$ is the number of tether-length-admissible configurations that visit the goal location, (c) The optimal tethered motion to visit a sequence of multiple goal locations, referred to as

preprint, 2022, arXiv

Efficient Search of the k Shortest Non-Homotopic Paths by Eliminating Non-k-Optimal Topologies

An efficient algorithm to solve the $k$ shortest non-homotopic path planning ($k$-SNPP) problem in a 2D environment is proposed in this paper. Motivated by accelerating the inefficient exploration of the homotopy-augmented space of the 2D environment, our fundamental idea is to identify the non-$k$-optimal path topologies as early as possible and terminate the pathfinding along them. This is a non-trivial practice because it has to be done at an intermediate state of the path planning process when locally shortest paths have not been fully constructed. In other words, the paths to be compared have not rendezvoused at the goal location, which makes homotopy theory, which models the spatial relationship among paths having the same endpoint, not applicable. This paper is the first work that develops a systematic distance-based topology simplification mechanism to solve the $k$-SNPP task, whose core contribution is to assert the distance-based order of non-homotopic locally shortest paths before constructing them. If the order can be predicted, then those path topologies having more than $k$ better topologies are proven free of the desired $k$ paths and thus can be safely discarded during the path planning process. To this end, a hierarchical topological tree is proposed as an implementation of the mechanism, whose nodes are proven to expand in non-homotopic directions and whose edges (collision-free path segments) are proven locally shortest. With efficient criteria that observe the order relations between partly constructed locally shortest paths being imparted into the tree, the tree nodes that expand in non-$k$-optimal topologies will not be expanded. As a result, the computational time for solving the $k$-SNPP problem is reduced by nearly two orders of magnitude.

preprint, 2022, arXiv

Global Well-posedness of a Prandtl Model from MHD in Gevrey Function Spaces

We consider a Prandtl model derived from MHD in the Prandtl-Hartmann regime that has a damping term due to the effect of the Hartmann boundary layer. Global-in-time well-posedness is obtained in the Gevrey function space with the optimal index $2$. The proof is based on a cancellation mechanism through some auxiliary functions from the study of the Prandtl equation and an observation about the structure of the loss of one order of tangential derivatives through two applications of the Prandtl operator.

preprint, 2022, arXiv

Knowledgebra: An Algebraic Learning Framework for Knowledge Graph

Knowledge graph (KG) representation learning aims to encode entities and relations into dense continuous vector spaces such that knowledge contained in a dataset could be consistently represented. Dense embeddings trained from KG datasets benefit a variety of downstream tasks such as KG completion and link prediction. However, existing KG embedding methods fall short of providing a systematic solution for the global consistency of knowledge representation. We developed a mathematical language for KG based on an observation of their inherent algebraic structure, which we termed Knowledgebra. By analyzing five distinct algebraic properties, we proved that the semigroup is the most reasonable algebraic structure for the relation embedding of a general knowledge graph. We implemented an instantiation model, SemE, using simple matrix semigroups, which exhibits state-of-the-art performance on standard datasets. Moreover, we proposed a regularization-based method to integrate chain-like logic rules derived from human knowledge into embedding training, which further demonstrates the power of the developed language. As far as we know, by applying abstract algebra in statistical learning, this work develops the first formal language for general knowledge graphs, and also sheds light on the problem of neural-symbolic integration from an algebraic perspective.
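
To make the matrix-semigroup idea concrete, here is a minimal sketch in which relations are square matrices and composing two relations is matrix multiplication. It is not SemE: the abstract does not give the model's parametrization or scoring function, so the 2x2 integer matrices and the helper names `matmul` and `apply` are illustrative assumptions. The sketch only shows the two properties a semigroup of relation embeddings relies on: closure under composition and associativity.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Compose two relation matrices (plain matrix product a @ b)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply(m: Matrix, v: Vector) -> Vector:
    """Apply a relation matrix to an entity vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


# Hypothetical relation embeddings (no inverses or identity are required).
r1: Matrix = [[0, 1], [1, 0]]
r2: Matrix = [[2, 0], [0, 3]]
r3: Matrix = [[1, 1], [0, 1]]
head: Vector = [1, 2]

# Following r1 and then r2 is the same as applying the composed relation r2 @ r1.
two_hops = apply(r2, apply(r1, head))
composed = apply(matmul(r2, r1), head)

# Associativity: chains of relations can be grouped in any order.
left_grouping = matmul(matmul(r1, r2), r3)
right_grouping = matmul(r1, matmul(r2, r3))
```

Chain-like rules of the form "r1 followed by r2 implies r3" then correspond to constraints between a product of matrices and a third matrix, which is one plausible reading of how such rules could enter a regularizer.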

preprint, 2022, arXiv

LGD: Label-guided Self-distillation for Object Detection

In this paper, we propose the first self-distillation framework for general object detection, termed LGD (Label-Guided self-Distillation). Previous studies rely on a strong pretrained teacher to provide instructive knowledge that could be unavailable in real-world scenarios. Instead, we generate instructive knowledge based only on student representations and regular labels. Our framework includes a sparse label-appearance encoder, an inter-object relation adapter and an intra-object knowledge mapper that jointly form an implicit teacher at the training phase, dynamically dependent on labels and evolving student representations. They are trained end-to-end with the detector and discarded at inference. Experimentally, LGD obtains decent results on various detectors, datasets, and extensive tasks like instance segmentation. For example, on the MS-COCO dataset, LGD improves RetinaNet with ResNet-50 under 2x single-scale training from 36.2% to 39.0% mAP (+ 2.8%). It boosts much stronger detectors like FCOS with ResNeXt-101 DCN v2 under 2x multi-scale training from 46.1% to 47.9% (+ 1.8%). Compared with a classical teacher-based method, FGFI, LGD not only performs better without requiring a pretrained teacher but also reduces training cost by 51% beyond inherent student learning. Codes are available at https://github.com/megvii-research/LGD.

preprint, 2022, arXiv

Q-ViT: Fully Differentiable Quantization for Vision Transformer

In this paper, we propose a fully differentiable quantization method for vision transformer (ViT) named as Q-ViT, in which both of the quantization scales and bit-widths are learnable parameters. Specifically, based on our observation that heads in ViT display different quantization robustness, we leverage head-wise bit-width to squeeze the size of Q-ViT while preserving performance. In addition, we propose a novel technique named switchable scale to resolve the convergence problem in the joint training of quantization scales and bit-widths. In this way, Q-ViT pushes the limits of ViT quantization to 3-bit without heavy performance drop. Moreover, we analyze the quantization robustness of every architecture component of ViT and show that the Multi-head Self-Attention (MSA) and the Gaussian Error Linear Units (GELU) are the key aspects for ViT quantization. This study provides some insights for further research about ViT quantization. Extensive experiments on different ViT models, such as DeiT and Swin Transformer show the effectiveness of our quantization method. In particular, our method outperforms the state-of-the-art uniform quantization method by 1.5% on DeiT-Tiny.

preprint, 2022, arXiv

QCluster: Clustering Packets for Flow Scheduling

Flow scheduling is crucial in data centers, as it directly influences user experience of applications. According to different assumptions and design goals, there are four typical flow scheduling problems/solutions: SRPT, LAS, Fair Queueing, and Deadline-Aware scheduling. When implementing these solutions in commodity switches with limited number of queues, they need to set static parameters by measuring traffic in advance, while optimal parameters vary across time and space. This paper proposes a generic framework, namely QCluster, to adapt all scheduling problems for limited number of queues. The key idea of QCluster is to cluster packets with similar weights/properties into the same queue. QCluster is implemented in Tofino switches, and can cluster packets at a speed of 3.2 Tbps. To the best of our knowledge, QCluster is the fastest clustering algorithm. Experimental results in testbed with programmable switches and ns-2 show that QCluster reduces the average flow completion time (FCT) for short flows up to 56.6%, and reduces the overall average FCT up to 21.7% over state-of-the-art. All the source code in ns-2 is available in Github without.
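
The key idea, placing packets with similar weights in the same queue, can be sketched as a one-dimensional online clustering step. This is a software illustration only and makes no claim about how QCluster is realized on Tofino hardware: the class name `QueueClusterer`, the nearest-centroid rule, and the running-mean update are assumptions, and the packet "weight" stands in for whatever quantity the chosen scheduling policy uses.

```python
from typing import List, Sequence


class QueueClusterer:
    """Assign packets to a fixed number of queues by nearest weight centroid."""

    def __init__(self, initial_centroids: Sequence[float]) -> None:
        if not initial_centroids:
            raise ValueError("at least one queue is required")
        # One centroid per queue; initial values only seed the first assignments.
        self.centroids: List[float] = [float(c) for c in initial_centroids]
        self.counts: List[int] = [0] * len(self.centroids)

    def assign(self, weight: float) -> int:
        """Return the queue index for a packet and update that queue's centroid."""
        queue = min(
            range(len(self.centroids)),
            key=lambda i: abs(weight - self.centroids[i]),
        )
        self.counts[queue] += 1
        # Running mean of the weights assigned to this queue so far.
        self.centroids[queue] += (weight - self.centroids[queue]) / self.counts[queue]
        return queue


clusterer = QueueClusterer([10.0, 1000.0])
queue_ids = [clusterer.assign(w) for w in (12.0, 900.0, 20.0)]
```

Because the centroids move with the traffic, the boundaries between queues adapt over time instead of being fixed in advance, which is the contrast with static parameters that the abstract draws.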

preprint, 2022, arXiv

Tensor-networks for High-order Polynomial Approximation: A Many-body Physics Perspective

We analyze the problem of high-order polynomial approximation from a many-body physics perspective, and demonstrate the descriptive power of entanglement entropy in capturing model capacity and task complexity. Instantiated with a high-order nonlinear dynamics modeling problem, tensor-network models are investigated and exhibit promising modeling advantages. This novel perspective establishes a connection between quantum information and functional approximation, which is worth further exploration in future research.

preprint, 2022, arXiv

Unravelling Distance-Dependent Inter-Site Interactions and Magnetic Transition Effects of Heteronuclear Single Atom Catalysts on Electrochemical Oxygen Reduction

Inter-site interactions between single atom catalysts (SACs) in the high loading regime are critical to tuning the catalytic performance. However, the understanding on such interactions and their distance dependent effects remains elusive, especially for the heteronuclear SACs. In this study, we reveal the effects of the distance-dependent inter-site interaction on the catalytic performance of SACs. Using the density functional theory calculations, we systematically investigate the heteronuclear iron and cobalt single atoms co-supported on the nitrogen-doped graphene (FeN4-C and CoN4-C) for oxygen reduction reaction (ORR). We find that as the distance between Fe and Co SACs decreases, FeN4-C exhibits a reduced catalytic activity, which can be mitigated by the presence of an axial hydroxyl ligand, whereas the activity of CoN4-C shows a volcano-like evolution with the optimum reached at the intermediate distance. We further unravel that the transition towards the high-spin state upon adsorption of ORR intermediate adsorbates is responsible for the decreased activity of both FeN4-C and CoN4-C at short inter-site distance. Such high-spin state transition is also found to significantly shift the linear relation between hydroxyl (*OH) and hydroperoxyl (*OOH) adsorbates. These findings not only shed light on the SAC-specific effect of the distance-dependent inter-site interaction between heteronuclear SACs, but also pave a way towards shifting the long-standing linear relations observed in multiple-electron chemical reactions.

preprint, 2022, arXiv

Vector fields of Cancellation for the Prandtl Operators

It has been a fascinating topic in the study of boundary layer theory about the well-posedness of Prandtl equation that was derived in 1904. Recently, new ideas about cancellation to overcome the loss of tangential derivatives were obtained so that Prandtl equation can be shown to be well-posed in Sobolev spaces to avoid the use of Crocco transformation as in the classical work of Oleinik. This short note aims to show that the cancellation mechanism is in fact related to some intrinsic directional derivative that can be used to recover the tangential derivative under some structural assumption on the fluid near the boundary.

preprint, 2021, arXiv

Phase diagram and superlattice structures of monolayer phosphorus carbide (P$_x$C$_{1-x}$)

Phase stability and properties of two-dimensional phosphorus carbide, P$_x$C$_{1-x}$, are investigated using the first-principles method in combination with cluster expansion and Monte Carlo simulation. Monolayer P$_x$C$_{1-x}$ is found to be a phase separating system which indicates difficulty in fabricating monolayer P$_x$C$_{1-x}$ or crystalline P$_x$C$_{1-x}$ thin films. Nevertheless, a bottom-up design approach is used to determine the stable structures of P$_x$C$_{1-x}$ of various compositions which turn out to be superlattices consisting of alternating carbon and phosphorus nanoribbons along the armchair direction. Results of first-principles calculations indicate that once these structures are produced, they are mechanically and thermodynamically stable. All the ordered structures are predicted to be semiconductors, with band gap ranging from 0.2 to 1.2 eV. In addition, the monolayer P$_x$C$_{1-x}$ are predicted to have high carrier mobility, and high optical absorption in the ultraviolet region which shows a red-shift as the P:C ratio increases. These properties make 2D P$_x$C$_{1-x}$ promising materials for applications in electronics and optoelectronics.

preprint, 2020, arXiv

A Deep Learning Approach for COVID-19 Trend Prediction

In this work, we developed a deep learning model-based approach to forecast the spreading trend of SARS-CoV-2 in the United States. We implemented the designed model using United States confirmed-case data and state demographic data and achieved promising trend prediction results. The model incorporates demographic information and epidemic time-series data through a Gated Recurrent Unit structure. The dominating demographic factors are identified at the end.

preprint, 2020, arXiv

A new stability and convergence proof of the Fourier-Galerkin spectral method for the spatially homogeneous Boltzmann equation

Numerical approximation of the Boltzmann equation is a challenging problem due to its high-dimensional, nonlocal, and nonlinear collision integral. Over the past decade, the Fourier-Galerkin spectral method has become a popular deterministic method for solving the Boltzmann equation, owing to its high accuracy and its potential for further acceleration by the fast Fourier transform. Despite its practical success, the stability of the method was only recently proved by Filbet, F. & Mouhot, C. in [Trans. Amer. Math. Soc. 363, no. 4 (2011): 1947-1980] by utilizing the "spreading" property of the collision operator. In this work, we provide a new proof based on a careful $L^2$ estimate of the negative part of the solution. We also discuss the applicability of the result to various initial data, including both continuous and discontinuous functions.

preprint, 2020, arXiv

Cellular Decomposition for Non-repetitive Coverage Task with Minimum Discontinuities

A mechanism to derive non-repetitive coverage path solutions with a proven minimal number of discontinuities is proposed in this work, with the aim of avoiding unnecessary, costly end effector lift-offs for manipulators. The problem is motivated by the automatic polishing of an object. Due to the non-bijective mapping between the workspace and the joint-space, a continuous coverage path in the workspace may easily be truncated in the joint-space, incurring undesirable end effector lift-offs. Conversely, there may be multiple configuration choices to cover the same point of a coverage path through the solution of the Inverse Kinematics. The solution departs from the conventional local optimisation of the coverage path shape in task space, or from choosing appropriate but possibly disconnected configurations, and instead explicitly explores the least number of discontinuous motions through an analysis of the structure of valid configurations in joint-space. The two novel contributions of this paper are a proof that the least number of path discontinuities is determined by the surrounding environment, independent of the choice of the actual coverage path, and thus has a minimum; and an efficient finite cellular decomposition method to optimally divide the workspace into the minimum number of cells, each traversable without discontinuities by any arbitrary coverage path within. Extensive simulation examples and real-world results on a 5 DoF manipulator are presented to demonstrate the validity of the proposed strategy in realistic settings.

preprint2020arXiv

CovidNet: To Bring Data Transparency in the Era of COVID-19

Timely, credible, and fine-grained case information is vital for local communities and individual citizens to make rational and data-driven responses to the COVID-19 pandemic. This paper presents CovidNet, a COVID-19 tracking project associated with a large-scale epidemic dataset, which was initiated by 1Point3Acres. To the best of our knowledge, the project is the only platform providing real-time global case information for more than 4,124 sub-divisions from over 27 countries worldwide with multi-language support. The platform also offers interactive visualization tools to analyze the full historical case curves in each region. Initially launched as a voluntary project to bridge the data transparency gap in North America in January 2020, this project has since become one of the major independent sources worldwide and has been consumed by many other tracking platforms. The accuracy and freshness of the dataset are a result of the painstaking efforts of our voluntary teamwork, crowd-sourcing channels, and automated data pipelines. As of May 18, 2020, the project website has been visited more than 200 million times and the CovidNet dataset has empowered over 522 institutions and organizations worldwide in policy-making and academic research. All datasets are openly accessible for non-commercial purposes at https://coronavirus.1point3acres.com via a formal request through our APIs.

preprint2020arXiv

Learning Human-Object Interaction Detection using Interaction Points

Understanding interactions between humans and objects is one of the fundamental problems in visual classification and an essential step towards detailed scene understanding. Human-object interaction (HOI) detection aims to localize both a human and an object and to identify the complex interactions between them. Most existing HOI detection approaches are instance-centric: interactions between all possible human-object pairs are predicted based on appearance features and coarse spatial information. We argue that appearance features alone are insufficient to capture complex human-object interactions. In this paper, we therefore propose a novel fully-convolutional approach that directly detects the interactions between human-object pairs. Our network predicts interaction points, which directly localize and classify the interaction. Paired with the densely predicted interaction vectors, the interactions are associated with human and object detections to obtain final predictions. To the best of our knowledge, we are the first to propose an approach where HOI detection is posed as a keypoint detection and grouping problem. Experiments are performed on two popular benchmarks: V-COCO and HICO-DET. Our approach sets a new state-of-the-art on both datasets. Code is available at https://github.com/vaesl/IP-Net.

preprint2020arXiv

Magnetic effects on the solvability of 2D MHD boundary layer equations without resistivity in Sobolev spaces

In this paper, we are concerned with the magnetic effect on the Sobolev solvability of boundary layer equations for the 2D incompressible MHD system without resistivity. The MHD boundary layer is described by the Prandtl type equations derived from the incompressible viscous MHD system without resistivity under the no-slip boundary condition on the velocity. Assuming that the initial tangential magnetic field does not degenerate, a local-in-time well-posedness in Sobolev spaces is proved without the monotonicity condition on the velocity field. Moreover, we show that if the tangential magnetic field shear layer is degenerate at one point, then the linearized MHD boundary layer system around the shear layer profile is ill-posed in the Sobolev settings provided that the initial velocity shear flow is non-degenerately critical at the same point.

preprint2020arXiv

NagE: Non-Abelian Group Embedding for Knowledge Graphs

We demonstrate the existence of a group algebraic structure hidden in relational knowledge embedding problems, which suggests that a group-based embedding framework is essential for designing embedding models. Our theoretical analysis explores only the intrinsic properties of the embedding problem itself and is hence model-independent. Motivated by the theoretical analysis, we propose a group theory-based knowledge graph embedding framework, in which relations are embedded as group elements and entities are represented by vectors in group action spaces. We provide a generic recipe for constructing embedding models, together with two instantiating examples, SO3E and SU2E, both of which use a continuous non-Abelian group as the relation embedding. Empirical experiments using these two example models show state-of-the-art results on benchmark datasets.
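To make the idea in the abstract above concrete, here is a minimal sketch of relations as elements of a non-Abelian group acting on entity vectors. It is an illustration only, not the SO3E model itself: the helper names (`rot_x`, `rot_z`, `act`, `score`) and the distance-based scoring rule are assumptions introduced here, and plain 3D rotation matrices stand in for learned relation embeddings.

```python
import math

def rot_x(theta):
    """Rotation about the x-axis, an element of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(theta):
    """Rotation about the z-axis, an element of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Group composition: the 3x3 matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def act(r, v):
    """Group action: apply relation r to entity vector v."""
    return [sum(r[i][k] * v[k] for k in range(3)) for i in range(3)]

def score(r, head, tail):
    """Hypothetical scoring rule: negative distance between r(head) and tail."""
    moved = act(r, head)
    return -math.sqrt(sum((m - t) ** 2 for m, t in zip(moved, tail)))

# Two relations that do not commute: the order of composition matters.
r1 = rot_x(math.pi / 2)
r2 = rot_z(math.pi / 2)
head = [1.0, 0.0, 0.0]
first_r2_then_r1 = act(matmul(r1, r2), head)
first_r1_then_r2 = act(matmul(r2, r1), head)
```

Because the two composed relations send the same head entity to different points, a non-Abelian group can distinguish relation chains that differ only in order, which a commutative embedding cannot.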

preprint2020arXiv

NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution, with a focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor of 16 based on a set of prior examples of low-resolution and corresponding high-resolution images. The goal is to obtain a network design capable of producing high-resolution results with the best perceptual quality and similarity to the ground truth. The track had 280 registered participants, and 19 teams submitted final results. These results gauge the state-of-the-art in single image super-resolution.

preprint2020arXiv

Perceptual Extreme Super Resolution Network with Receptive Field Block

Perceptual extreme super-resolution for a single image is extremely difficult because the texture details of different images vary greatly. To tackle this difficulty, we develop a super-resolution network with a receptive field block based on Enhanced SRGAN. We call our network RFB-ESRGAN. The key contributions are as follows. First, to extract multi-scale information and enhance feature discriminability, we apply the receptive field block (RFB) to super-resolution. RFB has achieved competitive results in object detection and classification. Second, instead of using large convolution kernels in the multi-scale receptive field block, several small kernels are used in RFB, which enables us to extract detailed features and reduce the computational complexity. Third, we alternate between different upsampling methods in the upsampling stage to reduce the high computational complexity while retaining satisfactory performance. Fourth, we use an ensemble of 10 models from different iterations to improve the robustness of the model and reduce the noise introduced by each individual model. Our experimental results show the superior performance of RFB-ESRGAN. According to the preliminary results of the NTIRE 2020 Perceptual Extreme Super-Resolution Challenge, our solution ranks first among all the participants.

preprint2020arXiv

Space- and Computationally-Efficient Set Reconciliation via Parity Bitmap Sketch (PBS)

Set reconciliation is a fundamental algorithmic problem that arises in many networking, system, and database applications. In this problem, two large sets $A$ and $B$ of objects (bitcoins, files, records, etc.) are stored at two different network-connected hosts, which we name Alice and Bob respectively. Alice and Bob communicate with each other to learn $A \Delta B$, the difference between $A$ and $B$, and as a result the reconciled set $A \cup B$. Current set reconciliation schemes are based on either Invertible Bloom Filters (IBF) or Error-Correction Codes (ECC). The former has a low computational complexity of $O(d)$, where $d$ is the cardinality of $A \Delta B$, but has a high communication overhead that is several times larger than the theoretical minimum. The latter has a low communication overhead close to the theoretical minimum, but has a much higher computational complexity of $O(d^2)$. In this work, we propose Parity Bitmap Sketch (PBS), an ECC-based set reconciliation scheme that gets the best of both worlds: PBS has both a low computational complexity of $O(d)$, just like IBF-based solutions, and a low communication overhead of roughly twice the theoretical minimum. A separate contribution of this work is a novel, rigorous analytical framework that can be used for the precise calculation of various performance metrics and for the near-optimal parameter tuning of PBS.
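The parity-bitmap idea named in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the modulo bucket hash, and the per-bucket XOR sums are assumptions made for illustration, and the error-correction coding that makes PBS an ECC-based scheme is omitted entirely. The sketch recovers the difference correctly only when each bucket holds at most one differing element.

```python
def sketch(items, n_buckets):
    """Return (parity bitmap, per-bucket XOR sums) for a set of integers."""
    parity = [0] * n_buckets
    xor_sum = [0] * n_buckets
    for x in items:
        b = x % n_buckets      # toy hash; a real scheme would use a proper hash
        parity[b] ^= 1         # parity of the number of items in the bucket
        xor_sum[b] ^= x        # XOR of all items in the bucket
    return parity, xor_sum

def reconcile_once(a, b, n_buckets):
    """Recover elements of the symmetric difference from one round of sketches.

    A bucket whose parity bits differ holds an odd number of differing
    elements. If it holds exactly one, XORing the two bucket sums cancels
    the shared elements and leaves that element.
    """
    parity_a, xor_a = sketch(a, n_buckets)
    parity_b, xor_b = sketch(b, n_buckets)
    recovered = set()
    for i in range(n_buckets):
        if parity_a[i] != parity_b[i]:
            candidate = xor_a[i] ^ xor_b[i]
            # Sanity check: the candidate must hash back to this bucket.
            if candidate % n_buckets == i:
                recovered.add(candidate)
    return recovered

alice = {1, 2, 3, 4, 5, 10}
bob = {1, 2, 3, 4, 5, 23}
difference = reconcile_once(alice, bob, 16)
```

In this example, 10 and 23 fall into different buckets, so one round recovers the full symmetric difference. The work per element is constant, which matches the intuition behind an $O(d)$-style decoding cost, although the sketch itself scans every element rather than only the differences.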

preprint2020arXiv

Spectral Clustering with Smooth Tiny Clusters

Spectral clustering is one of the most prominent clustering approaches. Distance-based similarity is the most widely used similarity measure for spectral clustering. However, it is known to be unsuitable for multi-scale data, as the distance varies considerably between clusters with different densities. State-of-the-art methods (ROSC and CAST) address this limitation by taking the reachability similarity of objects into account. However, we observe that in real-world scenarios, data in the same cluster tend to be distributed in a smooth manner, and previous algorithms never take this into account. Based on this observation, we propose a novel clustering algorithm, which considers the smoothness of data for the first time. We first divide objects into a great many tiny clusters. Our key idea is to cluster tiny clusters whose centers constitute smooth graphs. Theoretical analysis and experimental results show that our clustering algorithm significantly outperforms the state of the art. Although this paper focuses solely on multi-scale situations, the idea of data smoothness can be extended to other clustering algorithms.

preprint2020arXiv

Variance Regularization for Accelerating Stochastic Optimization

While most gradient-based optimization methods today focus on exploiting high-dimensional geometric features, the random error accumulated in the stochastic version of any algorithm implementation has received little attention. In this work, we propose a universal principle that reduces the random error accumulation by exploiting the statistical information hidden in mini-batch gradients. This is achieved by regularizing the learning rate according to mini-batch variances. Because this perspective is complementary to existing methods, the regularization can provide a further improvement for stochastic implementations of generic first-order approaches. With empirical results, we demonstrate that variance regularization can speed up convergence as well as stabilize stochastic optimization.
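One way to picture "regularizing the learning rate according to mini-batch variances" is the sketch below. The abstract does not state the actual rule, so the scaling formula `base_lr / (1 + strength * variance)`, the function name, and the `strength` parameter are all assumptions chosen for illustration, not the authors' method.

```python
def variance_scaled_lr(per_sample_grads, base_lr, strength=1.0):
    """Return (mean gradient, scaled learning rate) for one mini-batch.

    per_sample_grads: list of gradient vectors, one per sample.
    The learning rate shrinks as the per-sample gradients disagree more.
    The scaling rule is a hypothetical choice, not taken from the paper.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    mean = [sum(g[j] for g in per_sample_grads) / n for j in range(dim)]
    # Total variance: mean squared distance of each gradient from the mean.
    variance = sum(
        sum((g[j] - mean[j]) ** 2 for j in range(dim))
        for g in per_sample_grads
    ) / n
    return mean, base_lr / (1.0 + strength * variance)

# Samples that agree keep the full step size.
_, lr_consistent = variance_scaled_lr([[1.0, 2.0], [1.0, 2.0]], base_lr=0.1)
# Samples that disagree get a smaller step size.
mean_noisy, lr_noisy = variance_scaled_lr([[0.0, 2.0], [2.0, 2.0]], base_lr=0.1)
```

With this rule a noisy mini-batch takes a more cautious step, which is one plausible reading of how variance information could limit the accumulation of random error.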

preprint2020arXiv

Well-posedness in Gevrey function space for 3D Prandtl equations without Structural Assumption

We establish the well-posedness in Gevrey function space with optimal class of regularity 2 for the three dimensional Prandtl system without any structural assumption. The proof combines in a novel way a new cancellation in the system with some of the old ideas to overcome the difficulty of the loss of derivatives in the system. This shows that the three dimensional instabilities in the system leading to ill-posedness are not worse than the two dimensional ones.

preprint2020arXiv

Well-posedness of the MHD boundary layer system in Gevrey function space without Structural Assumption

We establish the well-posedness of the MHD boundary layer system in Gevrey function space without any structural assumption. Compared to the classical Prandtl equation, the loss of tangential derivatives comes from both the velocity and magnetic fields, which are coupled with each other. By observing a new type of cancellation mechanism in the system that overcomes the degeneracy caused by the loss of derivatives, we show that the MHD boundary layer system is well-posed with Gevrey index up to $3/2$ in both two- and three-dimensional spaces.

preprint2019arXiv

DetNAS: Backbone Search for Object Detection

Object detectors are usually equipped with backbone networks designed for image classification. This might be sub-optimal because of the gap between the tasks of image classification and object detection. In this work, we present DetNAS, which uses Neural Architecture Search (NAS) to design better backbones for object detection. This is non-trivial because detection training typically needs ImageNet pre-training, while NAS systems require accuracies on the target detection task as supervisory signals. Based on the technique of the one-shot supernet, which contains all possible networks in the search space, we propose a framework for backbone search on object detection. We train the supernet under the typical detector training schedule: ImageNet pre-training and detection fine-tuning. Then, the architecture search is performed on the trained supernet, using the detection task as guidance. This framework makes NAS on backbones very efficient. In experiments, we show the effectiveness of DetNAS on various detectors, for instance, the one-stage RetinaNet and the two-stage FPN. We empirically find that networks searched on object detection show consistent superiority compared to those searched on ImageNet classification. The resulting architecture achieves better performance than hand-crafted networks on COCO with much lower FLOPs complexity.