Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
55 works
0 followers
33 topics
4 close collaborators

Published work

55 published item(s)

preprint, 2026, arXiv

MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models

Vision-Language Models (VLMs) frequently suffer from visual perception errors and hallucinations that compromise answer accuracy in complex reasoning tasks. Reinforcement Learning with Verifiable Rewards (RLVR) offers a promising solution by optimizing policies using answer correctness signals. Despite their effectiveness, prevailing RLVR methods face two critical limitations. First, much of the sampling budget is wasted on trajectories doomed to fail due to early visual description errors. Second, sparse rewards cannot distinguish whether failures stem from visual perception or reasoning stages. We introduce MIRL, a decoupled framework that addresses both limitations by leveraging mutual information (MI) between generated descriptions and visual inputs as a cheap pre-screening signal. This enables intelligent budget allocation toward high-potential trajectories via forking, while decoupled training provides independent MI-based rewards for visual perception optimization, resolving reward blindness. Experiments on six vision-language reasoning benchmarks demonstrate that MIRL achieves 70.22% average accuracy and successfully surpasses the performance of sampling 16 complete trajectories using only 10 pre-samples with top-6 selection (25% fewer complete trajectories). Our code is available at: https://anonymous.4open.science/r/mirl-main/.
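The pre-screening step described above (10 pre-samples, top-6 selection) can be illustrated with a minimal sketch. It assumes each pre-sampled visual description already carries a scalar mutual-information score; the function name `select_top_k` and the score values are hypothetical and are not taken from the MIRL code.

```python
def select_top_k(scores, k):
    """Return the indices of the k highest-scoring pre-samples, best first."""
    if k > len(scores):
        raise ValueError("k cannot exceed the number of pre-samples")
    # Rank indices by score in descending order and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical MI scores for 10 pre-sampled visual descriptions.
mi_scores = [0.12, 0.87, 0.45, 0.91, 0.33, 0.78, 0.05, 0.66, 0.59, 0.71]

# Only the 6 best descriptions would be continued into full trajectories.
kept = select_top_k(mi_scores, k=6)
```

In this toy setting only the kept indices would be forked into complete reasoning trajectories; how MIRL computes the MI score itself is not shown here.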

preprint, 2026, arXiv

RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models

Existing robot video world models are typically trained with low-level objectives such as reconstruction and perceptual similarity, which are poorly aligned with the capabilities that matter most for robot decision making, including instruction following, manipulation success, and physical plausibility. They also suffer from error accumulation in long-horizon autoregressive prediction. We present RoboAlign-R1, a framework that combines reward-aligned post-training with stabilized long-horizon inference for robot video world models. We construct RobotWorldBench, a benchmark of 10,000 annotated video-instruction pairs collected from four robot data sources, and train a multimodal teacher judge, RoboAlign-Judge, to provide fine-grained six-dimensional evaluation of generated videos. We then distill the teacher into a lightweight student reward model for efficient reinforcement-learning-based post-training. To reduce long-horizon rollout drift, we further introduce Sliding Window Re-encoding (SWR), a training-free inference strategy that periodically refreshes the generation context. Under our in-domain evaluation protocol, RoboAlign-R1 improves the aggregate six-dimension score by 10.1% over the strongest baseline, including gains of 7.5% on Manipulation Accuracy and 4.6% on Instruction Following; these ranking improvements are further supported by an external VLM-based cross-check and a blinded human study. Meanwhile, SWR improves long-horizon prediction quality with only about 1% additional latency, yielding a 2.8% gain in SSIM and a 9.8% reduction in LPIPS. Together, these results show that reward-aligned post-training and stabilized long-horizon decoding improve task consistency, physical realism, and long-horizon prediction quality in robot video world models.

preprint, 2023, arXiv

Cross-Platform Comparison of Arbitrary Quantum Processes

In this work, we present a protocol for comparing the performance of arbitrary quantum processes executed on spatially or temporally disparate quantum platforms using Local Operations and Classical Communication (LOCC). The protocol involves sampling local unitary operators, which are then communicated to each platform via classical communication to construct quantum state preparation and measurement circuits. Subsequently, the local unitary operators are implemented on each platform, resulting in the generation of probability distributions of measurement outcomes. The max process fidelity is estimated from the probability distributions, which ultimately quantifies the relative performance of the quantum processes. Furthermore, we demonstrate that this protocol can be adapted for quantum process tomography. We apply the protocol to compare the performance of five quantum devices from IBM and the "Qianshi" quantum computer from Baidu via the cloud. Remarkably, the experimental results reveal that the protocol can accurately compare the performance of the quantum processes implemented on different quantum computers, requiring significantly fewer measurements than those needed for full quantum process tomography. We view our work as a catalyst for collaborative efforts in cross-platform comparison of quantum computers.

preprint, 2023, arXiv

Light dark matter confronted with the 95 GeV diphoton excess

The correlation between Higgs-like scalars and light dark matter is an interesting topic, especially now that a $125~\mathrm{GeV}$ Higgs has been discovered and dark matter (DM) searches have returned negative results. The $95~\mathrm{GeV}$ excess recently reported by the CMS collaboration with $132~\mathrm{fb}^{-1}$ of data, together with the DM search results from the XENONnT and LZ collaborations, motivates us to revisit this correlation. In this work, we study it in the GUT-scale constrained (GUTc) Next-to-Minimal Supersymmetric Model (NMSSM), where most parameters are input at the GUT scale, but with scalar and gaugino masses not unified there. In the calculation we also consider other recent experimental constraints, such as Higgs data, Supersymmetry (SUSY) searches, DM relic density, etc. After detailed analysis and discussion, we find that: (i) The light DM can be bino- or singlino-dominated, but can be mixed with minor components of Higgsino. (ii) Both cases can get the right relic density and sizable Higgs invisible decay, by adjusting the dimensionless parameters $λ, κ$, or suitably mixing with Higgsino. (iii) Both cases can have four funnel annihilation mechanisms, i.e., annihilating through $Z, a_1, h_2, h_1$. (iv) Samples with the right relic density usually give a weak signal of Higgs invisible decay at a future lepton collider, but the $95~\mathrm{GeV}$ scalar can have a sizable $b\bar{b}$ signal.

preprint, 2023, arXiv

Smuon contribution to muon g-2 in Grand Unified supersymmetric theories

In GUT-scale constrained (GUTc) supersymmetric (SUSY) models, the mass of smuon $\tilde{\mu}_1$ is typically heavier than that of stau $\tilde{\tau}_1$, and stau co-annihilation is a typical annihilation mechanism of dark matter. However, light smuon is more favored by the muon $g-2$ anomaly, thus smuon-neutralino loop contribution to muon $g-2$ is usually smaller than that of sneutrino-chargino. Inspired by the latest muon $g-2$ results, we take the GUTc Next-to-Minimal Supersymmetric Model (NMSSM) as an example, where the gaugino (Higgs) masses are not unified to the usual parameter $M_{1/2}$ ($M_0$), exploring its possibility of light smuon and its contribution to muon $g-2$. After complicated calculations and discussions, we conclude that in GUTc-NMSSM the smuon can be lighter than stau. In this light-smuon scenario, the contribution of smuon-neutralino loop to the muon $g-2$ can be larger than that of the sneutrino-chargino loop. The annihilation mechanisms of dark matter are dominated by multiple slepton or chargino co-annihilation. In our calculations, we consider also other latest related constraints like Higgs data, SUSY searches, dark matter relic density and direct detections, etc.

preprint, 2022, arXiv

A Blind All-sky Search for Star Clusters in Gaia EDR3: 886 Clusters within 1.2 kpc of the Sun

Although previous searches for star clusters have been very successful, many clusters are likely still omitted, especially at high Galactic latitude regions. In this work, based on the astrometry of Gaia EDR3, we searched nearby (parallax > 0.8 mas) all-sky regions, obtaining 886 star clusters, of which 270 candidates have not been cataloged before. At the same time, we have presented the physical parameters of the clusters by fitting theoretical isochrones to their optical magnitudes. More halo members and expanding structures in many star clusters were also found. Most of the new objects are young clusters that are less than 100 million years old. Our work greatly increased the sample size and physical parameters of star clusters in the solar neighborhood; in particular, 46 clusters are newly found with |b| > 20 deg, which represents a nearly threefold increase in the number of clusters at high Galactic latitude regions. The cluster parameters and member stars are available at CDS via https://cdsarc.u-strasbg.fr/ftp/vizier.submit//hezh22b/, and the cluster figure sets are available via https://doi.org/10.12149/101133.
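The link between the parallax cut (> 0.8 mas) and the distance limit in the title can be checked with a short sketch. It uses the simple inverse-parallax relation and ignores parallax uncertainties and zero-point corrections, which a real analysis would need to consider; the function name is hypothetical.

```python
def parallax_to_distance_pc(parallax_mas):
    """Naive distance estimate in parsecs from a parallax in milliarcseconds."""
    if parallax_mas <= 0:
        raise ValueError("inverse-parallax distance needs a positive parallax")
    # d [pc] = 1 / parallax [arcsec] = 1000 / parallax [mas]
    return 1000.0 / parallax_mas


# Distance corresponding to the survey's parallax threshold.
limit_pc = parallax_to_distance_pc(0.8)
```

With this naive relation the 0.8 mas threshold corresponds to about 1250 pc, in line with the roughly 1.2 kpc quoted in the title.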

preprint, 2022, arXiv

A mm-Wave Patch Antenna with Broad Bandwidth and a Wide Angular Range

A novel mm-wave microstrip-fed patch antenna with broad bandwidth and wide angular coverage suitable for integration in planar arrays is designed, analyzed and verified by measurements. The antenna provides a bandwidth of 13.1% between 34.1 GHz and 38.9 GHz, which is achieved by a slotted multiple resonances microstrip patch and a matching circuit in microstrip technology. The antenna is built on RO3003 substrate with top and ground layers, which is low cost compared to other techniques. For simple integration with microstrip and frontend circuits, the feeding happens in the top layer with a microstrip coupling gap feed. The wide half power beamwidth is achieved by suitably designed parasitic patches for the first resonant mode. The second resonant mode has a wide half power beamwidth by default. The half power beamwidth is between 100° and 125° within the matched bandwidth, which is a very good value for a microstrip patch antenna radiating over a ground plane. The measured input impedance and radiation characteristic show very good agreement with simulation results.
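The quoted relative bandwidth can be reproduced from the band edges with a short calculation. This sketch assumes the common definition of fractional bandwidth relative to the arithmetic-mean center frequency; the abstract does not say which definition the authors used, and the function name is hypothetical.

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Bandwidth divided by the arithmetic-mean center frequency."""
    if not 0 < f_low_ghz < f_high_ghz:
        raise ValueError("expected 0 < f_low < f_high")
    center = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / center


# Band edges stated in the abstract, in GHz.
fbw = fractional_bandwidth(34.1, 38.9)
fbw_percent = 100.0 * fbw
```

Under this assumption the result is about 13.15%, close to the stated 13.1%; the small difference presumably comes from rounding of the band edges.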

preprint, 2022, arXiv

A Recurrent Differentiable Engine for Modeling Tensegrity Robots Trainable with Low-Frequency Data

Tensegrity robots, composed of rigid rods and flexible cables, are difficult to accurately model and control given the presence of complex dynamics and the high number of DoFs. Differentiable physics engines have been recently proposed as a data-driven approach for model identification of such complex robotic systems. These engines are often executed at a high frequency to achieve accurate simulation. Ground truth trajectories for training differentiable engines, however, are not typically available at such high frequencies due to limitations of real-world sensors. The present work focuses on this frequency mismatch, which impacts the modeling accuracy. We propose a recurrent structure for a differentiable physics engine of tensegrity robots, which can be trained effectively even with low-frequency trajectories. To train this new recurrent engine in a robust way, this work introduces, relative to prior work: (i) a new implicit integration scheme, (ii) a progressive training pipeline, and (iii) a differentiable collision checker. A model of NASA's icosahedron SUPERballBot on MuJoCo is used as the ground truth system to collect training data. Simulated experiments show that once the recurrent differentiable engine has been trained given the low-frequency trajectories from MuJoCo, it is able to match the behavior of MuJoCo's system. The criterion for success is whether a locomotion strategy learned using the differentiable engine can be transferred back to the ground-truth system and result in a similar motion. Notably, the amount of ground truth data needed to train the differentiable engine, such that the policy is transferable to the ground truth system, is 1% of the data needed to train the policy directly on the ground-truth system.

preprint, 2022, arXiv

Angular momentum and parity projected multidimensionally constrained relativistic Hartree-Bogoliubov model

The nuclear deformations are of fundamental importance in nuclear physics. Recently we developed a multi-dimensionally constrained relativistic Hartree-Bogoliubov (MDCRHB) model, in which all multipole deformations respecting the $V_4$ symmetry can be considered self-consistently. In this work we extend this model by incorporating the angular momentum projection (AMP) and parity projection (PP) to restore the rotational and parity symmetries broken in the mean-field level. This projected-MDCRHB (p-MDCRHB) model enables us to connect certain nuclear spectra to exotic intrinsic shapes such as triangle or tetrahedron. We present the details of the method and an exemplary calculation for $^{12}$C. We develop a triangular moment constraint to generate the triangular configurations consisting of three $α$ clusters arranged as an equilateral triangle. The resulting $^{12}$C spectra are consistent with that from a triangular rigid rotor for large separations between the $α$ clusters. We also calculate the $B(E2)$ and $B(E3)$ values for low-lying states and find good agreement with the experiments.

preprint, 2022, arXiv

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy

Large-scale datasets play a vital role in computer vision. But current datasets are annotated blindly without differentiating between samples, making the data collection inefficient and unscalable. The open question is how to build a mega-scale dataset actively. Although advanced active learning algorithms might be the answer, we experimentally found that they perform poorly in the realistic annotation scenario where out-of-distribution data is extensive. This work thus proposes a novel active learning framework for realistic dataset annotation. Equipped with this framework, we build a high-quality vision dataset, Bamboo, which consists of 69M image classification annotations with 119K categories and 28M object bounding box annotations with 809 categories. We organize these categories by a hierarchical taxonomy integrated from several knowledge bases. The classification annotations are four times larger than ImageNet22K, and those of detection are three times larger than Objects365. Compared to ImageNet22K and Objects365, models pre-trained on Bamboo achieve superior performance across various downstream tasks (6.2% gains on classification and 2.1% gains on detection). We believe our active learning framework and Bamboo are essential for future work.

preprint, 2022, arXiv

Development of the modified quasichemical model in the distinguishable-pair approximation for multiple compositions of short-range ordering

A binary solution with short-range ordering (SRO) exhibits characteristic solution thermodynamics. The Modified Quasichemical Model in the Pair Approximation (MQMPA) can effectively capture the thermodynamic features of a binary solution with a onefold SRO. However, once the SRO occurs at multiple compositions in a binary solution, the MQMPA is in principle inconvenient to treat the solution thermodynamics. It usually requires a large number of model parameters to fit the thermodynamic data of such a solution. The present work proposes the Modified Quasichemical Model in the Distinguishable-Pair Approximation (MQMDPA), which is a further improvement of the MQMPA. The MQMDPA can more realistically describe the solution thermodynamics with manifold SROs using fewer model parameters. It benefits from grouping the ordered pairs, which were assumed to be indistinguishable within the MQMPA, into several distinguishable types. Each kind of ordered pair has unique coordination numbers and pair energy, which is responsible for describing one of the SROs at the desired composition and strength. Interestingly, the MQMDPA can be completely transformed into the MQMPA when all kinds of ordered pairs are assigned the same pair energy and coordination numbers. The distinguishable pairs have thus become indistinguishable. Several real liquids with at least two observed SROs were successfully treated by the MQMDPA to demonstrate its effectiveness and reliability.

preprint, 2022, arXiv

ERGO: Event Relational Graph Transformer for Document-level Event Causality Identification

Document-level Event Causality Identification (DECI) aims to identify causal relations between event pairs in a document. It poses the great challenge of cross-sentence reasoning without clear causal indicators. In this paper, we propose a novel Event Relational Graph TransfOrmer (ERGO) framework for DECI, which improves on existing state-of-the-art (SOTA) methods in two aspects. First, we formulate DECI as a node classification problem by constructing an event relational graph, without the need for prior knowledge or tools. Second, ERGO seamlessly integrates event-pair relation classification and global inference, which leverages a Relational Graph Transformer (RGT) to capture the potential causal chain. Besides, we introduce edge-building strategies and adaptive focal loss to deal with the massive false positives caused by common spurious correlation. Extensive experiments on two benchmark datasets show that ERGO significantly outperforms previous SOTA methods (13.1% F1 gains on average). We have conducted extensive quantitative analysis and case studies to provide insights for future research directions (Section 4.8).

preprint, 2022, arXiv

Experimental optimal verification of three-dimensional entanglement on a silicon chip

High-dimensional entanglement is significant for the fundamental studies of quantum physics and offers unique advantages in various quantum information processing (QIP) tasks. Integrated quantum devices have recently emerged as a promising platform for creating, processing, and detecting complex high-dimensional entangled states. A crucial step towards practical quantum technologies is to verify that these devices work reliably with an optimal strategy. In this work, we experimentally implement an optimal quantum verification strategy on a three-dimensional maximally entangled state using local projective measurements on a silicon photonic chip. A 95% confidence is achieved from 1190 copies to verify the target quantum state. The obtained scaling of infidelity as a function of the number of copies is $-0.5497 \pm 0.0002$, exceeding the standard quantum limit of $-0.5$ by 248 standard deviations. Our results indicate that quantum state verification could serve as an efficient tool for complex quantum measurement tasks.
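The significance figure in the abstract follows from the stated scaling exponent and its uncertainty. The sketch below simply divides the distance from the standard quantum limit by the quoted standard deviation; the function and variable names are hypothetical.

```python
def sigma_distance(measured, reference, std_dev):
    """Number of standard deviations separating a measurement from a reference."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return abs(measured - reference) / std_dev


# Values quoted in the abstract: measured scaling exponent, standard
# quantum limit, and the one-sigma uncertainty of the measurement.
n_sigma = sigma_distance(measured=-0.5497, reference=-0.5, std_dev=0.0002)
```

The ratio comes out at roughly 248.5, matching the 248 standard deviations reported.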

preprint, 2022, arXiv

Extremely low-mass white dwarf stars observed in Gaia DR2 and LAMOST DR8

We present the first results from our ongoing project to study extremely low mass (ELM) white dwarfs (WDs) ($M \leq 0.3\,M_{\odot}$) with the Large Sky Area Multi-Object Fibre Spectroscopic Telescope (LAMOST) spectra. Based on the LAMOST DR8 spectral database, we analyzed 136 ELM WD candidates selected from $Gaia$ DR2 data and 12 known objects previously identified by the ELM Survey. The atmospheric parameters and radial velocities of these stars were obtained by fitting the LAMOST low-resolution spectra. After comparing the atmospheric parameters of the 12 known objects from this work to the results reported by the ELM Survey, we demonstrated the potential of LAMOST spectra in probing into the nature of ELM WDs. Based on the atmospheric parameters and $Gaia$ EDR3 data, we identified 21 new high-probability ELM WDs with masses $M \leq 0.3\,M_{\odot}$ and parallax estimates that agree to within a factor of 3. Two of them, J0338+4134 and J1129+4715, show significant radial velocity variability and are very likely to be binary systems containing at least one ELM WD.

preprint, 2022, arXiv

Fiber-optic multimode interference sensing: comprehensive characterization and its potential for strain-insensitive temperature sensing

A strain-insensitive temperature sensor based on multimode interference using standard multimode fibers (MMFs) is proposed according to the comprehensive study of the characteristics of the MMFs. The temperature and strain dependences on the core diameter, numerical aperture (NA), and the length of the MMF section in the single-mode-multimode-single-mode (SMS) fiber structure are investigated experimentally. The results indicate that the larger core diameter of the MMF leads to higher temperature sensitivity but lower strain sensitivity (absolute values); the higher NA does not influence the temperature sensitivity but results in higher absolute value of strain sensitivity; the longer MMF section brings lower temperature sensitivity but does not have an impact on strain sensitivity. These findings also contribute to the theoretical analysis of the length dependence in the SMS fiber sensors. Besides, the results of the characterization study show that the strain sensitivity is relatively low, which brings a possibility to develop a strain-insensitive temperature sensor. The proposed sensor is used for temperature sensing while the strain is constantly applied from 0 to 1100 με with steps of 100 με. The measured results are consistent with the comprehensive study. The mean temperature sensitivity is 6.14 pm/°C with a standard deviation of 0.39 pm/°C, which proves that the proposed temperature sensor exhibits good stability and is insensitive to strain. We expect that these results will provide a profound guideline to fiber sensors based on multimode interference.

preprint, 2022, arXiv

Full-Waveform Modeling for Time-of-Flight Measurements based on Arrival Time of Photons

Modern LiDAR sensors find increasing use in safety-critical applications. Therefore, highly accurate modeling of the system's behavior under demanding environmental conditions is necessary. In this paper, we present a modular structure to accurately simulate the amplified raw detector signal of a direct time-of-flight LiDAR system for coaxial transmitter-receiver optics. Our model describes a measurement system based on standard optical components and a detector capable of converting single photons to an electrical signal. To verify the model's predictions, single-point measurements for targets of different reflectivity at defined distances were performed. Statistical analysis shows an R-squared value greater than 0.990 for simulated and measured signal amplitude levels. Noise modeling shows good accordance with the performed measurements for different target irradiance levels. The presented results have a guiding significance in the modeling of the complex signal processing chain of LiDAR systems, as the model enables the prediction of key parameters of the system early in the development process. Hence, unnecessary costs caused by design flaws can be mitigated. The modular structure allows easy adaptation to arbitrary LiDAR systems.

preprint, 2022, arXiv

INTERN: A New Learning Paradigm Towards General Vision

Enormous waves of technological innovations over the past several years, marked by the advances in AI technologies, are profoundly reshaping the industry and the society. However, down the road, a key challenge awaits us, that is, our capability of meeting rapidly-growing scenario-specific demands is severely limited by the cost of acquiring a commensurate amount of training data. This difficult situation is in essence due to limitations of the mainstream learning paradigm: we need to train a new model for each new scenario, based on a large quantity of well-annotated data and commonly from scratch. In tackling this fundamental problem, we move beyond and develop a new learning paradigm named INTERN. By learning with supervisory signals from multiple sources in multiple stages, the model being trained will develop strong generalizability. We evaluate our model on 26 well-known datasets that cover four categories of tasks in computer vision. In most cases, our models, adapted with only 10% of the training data in the target domain, outperform the counterparts trained with the full set of data, often by a significant margin. This is an important step towards a promising prospect where such a model with general vision capability can dramatically reduce our reliance on data, thus expediting the adoption of AI technologies. Furthermore, revolving around our new paradigm, we also introduce a new data system, a new architecture, and a new benchmark, which, together, form a general vision ecosystem to support its future development in an open and inclusive manner. See project website at https://opengvlab.shlab.org.cn .

preprint, 2022, arXiv

LDoS attack detection method based on traffic time-frequency characteristics

Traditional denial-of-service attack detection methods rely on complex algorithms with high computational overhead, which makes it difficult for them to meet the demands of online detection; moreover, they are mostly evaluated on simulation platforms and are difficult to deploy in real network environments. We therefore propose an LDoS attack detection method, oriented toward real network environments, based on the time-frequency characteristics of traffic data. All the traffic flowing through the Web server is captured by an acquisition and storage system, and the detection data set is constructed through pre-processing. Simple features of the flow fragments are used as input, a deep neural network learns the time-frequency domain features of normal traffic and generates reconstructed sequences, and LDoS attacks are identified from the differences between the reconstructed sequences and the input data in the time-frequency domain. The experimental results show that the proposed method can accurately detect the attack features in the flow fragments in a very short time and achieves high detection accuracy for complex and diverse LDoS attacks. Since only the statistical features of the packets are used, there is no need to parse the packet data, so the method can be adapted to different network environments.

preprint, 2022, arXiv

Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion

In this paper, we formulate a potentially valuable panoramic depth completion (PDC) task as panoramic 3D cameras often produce 360° depth with missing data in complex scenes. Its goal is to recover dense panoramic depths from raw sparse ones and panoramic RGB images. To deal with the PDC task, we train a deep network that takes both depth and image as inputs for the dense panoramic depth recovery. However, it needs to face a challenging optimization problem of the network parameters due to its non-convex objective function. To address this problem, we propose a simple yet effective approach termed M$^{3}$PT: multi-modal masked pre-training. Specifically, during pre-training, we simultaneously cover up patches of the panoramic RGB image and sparse depth with a shared random mask, then reconstruct the sparse depth in the masked regions. To the best of our knowledge, this is the first time that the effectiveness of masked pre-training has been shown in a multi-modal vision task, instead of the single-modal task resolved by masked autoencoders (MAE). Different from MAE, where fine-tuning completely discards the decoder part of pre-training, there is no architectural difference between the pre-training and fine-tuning stages in our M$^{3}$PT as they only differ in the prediction density, which potentially makes the transfer learning more convenient and effective. Extensive experiments verify the effectiveness of M$^{3}$PT on three panoramic datasets. Notably, we improve the state-of-the-art baselines by an average of 26.2% in RMSE, 51.7% in MRE, 49.7% in MAE, and 37.5% in RMSElog on three benchmark datasets.
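The shared-mask idea in the pre-training step can be illustrated with a toy sketch: one random mask is drawn over the patch grid and applied to both modalities. Patches are represented by single floats here, and the 75% mask ratio, the seed, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import random


def shared_patch_mask(num_patches, mask_ratio, seed=None):
    """Draw one boolean mask (True = masked) to be reused for every modality."""
    rng = random.Random(seed)
    num_masked = int(round(num_patches * mask_ratio))
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def apply_mask(patches, mask, fill=0.0):
    """Replace masked patches with a fill value, keep the rest unchanged."""
    return [fill if is_masked else p for p, is_masked in zip(patches, mask)]


# Toy stand-ins: one float per patch for the RGB image and the sparse depth.
rgb_patches = [1.0 + i for i in range(16)]
depth_patches = [10.0 + i for i in range(16)]

# The same mask hides the same spatial locations in both inputs.
mask = shared_patch_mask(16, mask_ratio=0.75, seed=0)
masked_rgb = apply_mask(rgb_patches, mask)
masked_depth = apply_mask(depth_patches, mask)
```

Because both inputs share the mask, the network never sees either modality at a masked location and has to reconstruct the sparse depth there from the surrounding context.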

preprint, 2022, arXiv

Network Traffic Anomaly Detection Method Based on Multi scale Residual Feature

To address the problem that traditional network traffic anomaly detection algorithms do not sufficiently mine potential features in the long time domain, an anomaly detection method based on multi-scale residual features of network traffic is proposed. The original traffic is divided into subsequences of different time spans using sliding windows, and each subsequence is decomposed and reconstructed into data sequences of different levels using the wavelet transform technique; the stacked autoencoder (SAE) constructs a similar feature space using normal network traffic, and generates a reconstructed error vector using the difference between reconstructed samples and input samples in the similar feature space; the multi-path residual group is used to learn the reconstructed error. The traffic classification is completed by a lightweight classifier. The experimental results show that the detection performance of the proposed method for anomalous network traffic is significantly improved compared with traditional methods; it confirms that the longer time span and more S transformation scales have positive effects on discovering potential diversity information in the original network traffic.
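The first stage of the pipeline, splitting traffic into subsequences of different time spans, can be sketched as follows. Only the sliding-window step is shown; the wavelet decomposition, autoencoder, and classifier are omitted. The window sizes, the half-window step, and the function name are hypothetical choices for illustration.

```python
def sliding_windows(series, window, step):
    """Split a sequence into fixed-length windows taken every `step` samples."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(series) - window
    return [series[s:s + window] for s in range(0, last_start + 1, step)]


# Toy traffic measurements (for example, packets per time slot).
traffic = list(range(10))

# One set of subsequences per time span; each span uses a half-window step.
multi_scale = {w: sliding_windows(traffic, w, step=w // 2) for w in (4, 8)}
```

Each entry of `multi_scale` would then feed the later decomposition and reconstruction stages at its own time scale.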

preprint, 2022, arXiv

Nonlinear Optimal Guidance for Fixed-Time Impact on a Stationary Target

This paper is concerned with devising the nonlinear optimal guidance for intercepting a stationary target with a fixed impact time. According to Pontryagin's Maximum Principle (PMP), some optimality conditions for the solutions of the nonlinear optimal interception problem are established, and the structure of the corresponding optimal control is presented. By employing the optimality conditions, we formulate a parameterized system so that its solution space is the same as that of the nonlinear optimal interception problem. As a consequence, a simple propagation of the parameterized system, without using any optimization method, is sufficient to generate enough sampled data for the mapping from current state and time-to-go to the optimal guidance command. By virtue of the universal approximation theorem, a feedforward neural network, trained by the generated data, is able to represent the mapping from current state and time-to-go to the optimal guidance command. Therefore, the trained network eventually can generate fixed-impact-time nonlinear optimal guidance within a constant time. Finally, the developed nonlinear optimal guidance is exemplified and studied through simulations, showing that the nonlinear optimal guidance law performs better than existing interception guidance laws.

preprint, 2022, arXiv

Physical Properties of 29 sdB+dM Eclipsing Binaries in Zwicky Transient Facility

The development of large-scale time-domain surveys provides an opportunity to study the physical properties as well as the evolutionary scenario of B-type subdwarfs (sdB) and M-type dwarfs (dM). Here, we obtained 33 sdB+dM eclipsing binaries based on the Zwicky Transient Facility (ZTF) light curves and $Gaia$ early data release 3 (EDR3) parallaxes. By using the PHOEBE code for light curve analysis, we obtain probability distributions for parameters of 29 sdB+dM. $R_1$, $R_2$, and $i$ are well determined, and the average uncertainty of mass ratio $q$ is 0.08. Our parameters are in good agreement with previous works if a typical mass of sdB is assumed. Based on parameters of 29 sdB+dM, we find that both the mass ratio $q$ and the companion's radius $R_2$ decrease with the shortening of the orbital period. For the three sdB+dMs with orbital periods less than 0.075 days, their companions are all brown dwarfs. The masses and radii of the companions satisfy the mass-radius relation for low-mass stars and brown dwarfs. Companions with radii between $0.12R_\odot$ and $0.15R_\odot$ seem to be missing in the observations. As more short-period sdB+dM eclipsing binaries are discovered and classified in the future with ZTF and $Gaia$, we will have more information to constrain the evolutionary ending of sdB+dM.

preprint2022arXiv

Prompt for Extraction? PAIE: Prompting Argument Interaction for Event Argument Extraction

In this paper, we propose an effective yet efficient model PAIE for both sentence-level and document-level Event Argument Extraction (EAE), which also generalizes well when there is a lack of training data. On the one hand, PAIE utilizes prompt tuning for extractive objectives to take full advantage of Pre-trained Language Models (PLMs). It introduces two span selectors based on the prompt to select start/end tokens among input texts for each role. On the other hand, it captures argument interactions via multi-role prompts and conducts joint optimization with optimal span assignments via a bipartite matching loss. Also, with a flexible prompt design, PAIE can extract multiple arguments with the same role instead of conventional heuristic threshold tuning. We have conducted extensive experiments on three benchmarks, including both sentence- and document-level EAE. The results present promising improvements from PAIE (3.5% and 2.3% F1 gains on average on three benchmarks, for PAIE-base and PAIE-large respectively). Further analysis demonstrates the efficiency, generalization to few-shot settings, and effectiveness of different extractive prompt tuning strategies. Our code is available at https://github.com/mayubo2333/PAIE.

preprint2022arXiv

RigNet: Repetitive Image Guided Network for Depth Completion

Depth completion deals with the problem of recovering dense depth maps from sparse ones, where color images are often used to facilitate this task. Recent approaches mainly focus on image guided learning frameworks to predict dense depth. However, blurry guidance in the image and unclear structure in the depth still impede the performance of the image guided frameworks. To tackle these problems, we explore a repetitive design in our image guided network to gradually and sufficiently recover depth values. Specifically, the repetition is embodied in both the image guidance branch and depth generation branch. In the former branch, we design a repetitive hourglass network to extract discriminative image features of complex environments, which can provide powerful contextual instruction for depth prediction. In the latter branch, we introduce a repetitive guidance module based on dynamic convolution, in which an efficient convolution factorization is proposed to simultaneously reduce its complexity and progressively model high-frequency structures. Extensive experiments show that our method achieves superior or competitive results on KITTI benchmark and NYUv2 dataset.

preprint2022arXiv

The modified quasichemical model in the Distinguishable-Pair Approximation for multicomponent solutions

The Modified Quasichemical Model in the Distinguishable-Pair Approximation (MQMDPA) for manifold short-range orders in liquids has been successfully extended to multicomponent solutions. The extension is conducted by means of the geometrical interpolation method. Three types of interpolation models, namely Kohler, Toop and Chou, are introduced to initially formulate the pair interaction energies in ternary solutions by employing those in their constituent binary solutions. The pair energies can be expanded in terms of the pair fractions (configuration-dependent) or in terms of the coordination-equivalent fractions (composition-dependent). These methods are subsequently extended for use in multicomponent solutions. A general formalism for the combined Kohler-Toop model is employed to permit complete freedom of choice to treat any ternary subsystems with a symmetric or asymmetric model. Meanwhile, a general Chou model is also used to treat all ternary subsystems without any human interference to select symmetric or asymmetric components but only dependent upon the similarity and difference in properties from each two binary solutions with a co-member. Advantages and shortcomings are critically discussed regarding the utilization of different interpolation models.

preprint2022arXiv

The temporal limits of predicting fault failure

Machine learning models using seismic emissions can predict instantaneous fault characteristics such as displacement in laboratory experiments and slow slip in Earth. Here, we address whether the acoustic emission (AE) from laboratory experiments contains information about near-future frictional behavior. The approach uses a convolutional encoder-decoder containing a transformer layer. We use as input progressively larger AE input time windows and progressively larger output friction time windows. The attention map from the transformer is used to interpret which regions of the AE contain hidden information corresponding to future frictional behavior. We find that very near-term predictive information is indeed contained in the AE signal, but farther into the future the predictions are progressively worse. Notably, information for predicting near-future frictional failure and recovery is found to be contained in the AE signal. This first effort predicting future fault frictional behavior with machine learning will guide efforts for applications in Earth.

preprint2022arXiv

Towards Robust 2D Convolution for Reliable Visual Recognition

2D convolution (Conv2d), which is responsible for extracting features from the input image, is one of the key modules of a convolutional neural network (CNN). However, Conv2d is vulnerable to image corruptions and adversarial samples. Whether we can design a more robust alternative to Conv2d for more reliable feature extraction is an important yet rarely investigated problem. In this paper, inspired by the recently developed learnable sparse transform that learns to convert the CNN features into a compact and sparse latent space, we design a novel building block, denoted by RConv-MK, to strengthen the robustness of extracted convolutional features. Our method leverages a set of learnable kernels of different sizes to extract features at different frequencies and employs a normalized soft thresholding operator to adaptively remove noise and trivial features at different corruption levels. Extensive experiments on clean images, corrupted images as well as adversarial samples validate the effectiveness of the proposed robust module for reliable visual recognition. The source codes are enclosed in the submission.
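As a rough illustration of the thresholding idea mentioned in this abstract, the sketch below implements the standard (plain) soft-thresholding operator in Python. It is not the RConv-MK block: the normalization step and the learnable threshold are not specified in the abstract, so the fixed `threshold` argument and the function name `soft_threshold` are assumptions made for illustration only.

```python
def soft_threshold(values, threshold):
    """Standard soft thresholding: shrink each value toward zero by `threshold`.

    Values whose magnitude is at most `threshold` become 0.0. This is the
    textbook operator, used here as a hypothetical stand-in for the paper's
    normalized, learnable variant.
    """
    shrunk = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)  # small responses are treated as noise
        else:
            shrunk.append(magnitude if v > 0 else -magnitude)  # keep the sign
    return shrunk


# Illustrative feature responses (made-up numbers, not data from the paper).
features = [0.05, -0.8, 1.5, -0.1]
print(soft_threshold(features, 0.2))
```

Small responses are zeroed while larger ones are kept with reduced magnitude, which is why such an operator can suppress low-amplitude noise.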

preprint2021arXiv

Atlas-aware ConvNet for Accurate yet Robust Anatomical Segmentation

Convolutional networks (ConvNets) have achieved promising accuracy for various anatomical segmentation tasks. Despite the success, these methods can be sensitive to data appearance variations. Considering the large variability of scans caused by artifacts, pathologies, and scanning setups, robust ConvNets are vital for clinical applications, yet have not been fully explored. In this paper, we propose to mitigate the challenge by enabling ConvNets' awareness of the underlying anatomical invariances among imaging scans. Specifically, we introduce a fully convolutional Constraint Adoption Module (CAM) that incorporates probabilistic atlas priors as explicit constraints for predictions over a locally connected Conditional Random Field (CRF), which effectively reinforces the anatomical consistency of the labeling outputs. We design the CAM to be flexible for boosting various ConvNets, and compact for co-optimizing with ConvNets for fusion parameters that lead to the optimal performance. We show that the advantage of such atlas prior fusion is two-fold with two brain parcellation tasks. First, our models achieve state-of-the-art accuracy among ConvNet-based methods on both datasets, by significantly reducing structural abnormalities of predictions. Second, we can largely boost the robustness of existing ConvNets, as shown by: (i) testing on scans with synthetic pathologies, and (ii) training and evaluation on scans of different scanning setups across datasets. Our method can be easily adopted by existing ConvNets through fine-tuning with CAM plugged in for accuracy and robustness boosts.

preprint2021arXiv

Elongation of Curvature-Bounded Path

The paper is concerned with elongating the shortest curvature-bounded path between two oriented points to an expected length. The elongation of curvature-bounded paths to an expected length is fundamentally important to plan missions for nonholonomic-constrained vehicles in many practical applications, such as coordinating multiple nonholonomic-constrained vehicles to reach a destination simultaneously or performing a mission with a strict time window. In the paper, the explicit conditions for the existence of curvature-bounded paths joining two oriented points with an expected length are established by applying the properties of the reachability set of curvature-bounded paths. These existence conditions are numerically verifiable, making it easy to check the existence of curvature-bounded paths between two prescribed oriented points with a desired length. In addition, once the existence conditions are met, elongation strategies are provided in the paper to get curvature-bounded paths with expected lengths. Finally, some examples of minimum-time path planning for multiple fixed-wing aerial vehicles to cooperatively achieve a triangle-shaped flight formation are presented, illustrating and verifying the developments of the paper.

preprint2021arXiv

Exploring Instance-Level Uncertainty for Medical Detection

The ability of deep learning to predict with uncertainty is recognized as key for its adoption in clinical routines. Moreover, performance gain has been enabled by modelling uncertainty according to empirical evidence. While previous work has widely discussed uncertainty estimation in segmentation and classification tasks, its application to bounding-box-based detection has been limited, mainly due to the challenge of bounding box alignment. In this work, we explore augmenting a 2.5D detection CNN with two different bounding-box-level (or instance-level) uncertainty estimates, i.e., predictive variance and Monte Carlo (MC) sample variance. Experiments are conducted for lung nodule detection on the LUNA16 dataset, a task where significant semantic ambiguities can exist between nodules and non-nodules. Results show that our method improves the evaluation score from 84.57% to 88.86% by utilizing a combination of both types of variances. Moreover, we show the generated uncertainty enables superior operating points compared to using the probability threshold only, and can further boost the performance to 89.52%. Example nodule detections are visualized to further illustrate the advantages of our method.
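To make the notion of Monte Carlo (MC) sample variance concrete, here is a minimal Python sketch that runs a stochastic predictor several times on the same input and reports the mean and variance of the outputs. It assumes the randomness comes from repeated stochastic forward passes (for example, with dropout kept active); the abstract does not state this. `toy_predict` is a hypothetical stand-in for a detector's confidence output, not the authors' network.

```python
import random
import statistics


def mc_sample_variance(stochastic_predict, x, num_samples=20):
    """Run a stochastic predictor `num_samples` times on the same input.

    Returns (mean, variance) of the sampled outputs; the variance serves as
    an instance-level uncertainty estimate.
    """
    samples = [stochastic_predict(x) for _ in range(num_samples)]
    return statistics.mean(samples), statistics.pvariance(samples)


# Hypothetical stand-in for a network with active dropout: the "score"
# is the input plus seeded Gaussian noise.
_rng = random.Random(0)


def toy_predict(x):
    return x + _rng.gauss(0.0, 0.1)


mean_score, variance = mc_sample_variance(toy_predict, 0.7)
print(mean_score, variance)
```

A deterministic predictor yields zero variance under this procedure, so any non-zero value reflects disagreement between the sampled passes.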

preprint2021arXiv

GaitSet: Cross-view Gait Recognition through Utilizing Gait as a Deep Set

Gait is a unique biometric feature that can be recognized at a distance; thus, it has broad applications in crime prevention, forensic identification, and social security. To portray a gait, existing gait recognition methods utilize either a gait template, which makes it difficult to preserve temporal information, or a gait sequence, which maintains unnecessary sequential constraints and thus loses the flexibility of gait recognition. In this paper, we present a novel perspective that utilizes gait as a deep set, which means that a set of gait frames is integrated by a global-local fused deep network, inspired by the way our left and right hemispheres process information, to learn information that can be used in identification. Based on this deep set perspective, our method is immune to frame permutations, and can naturally integrate frames from different videos that have been acquired under different scenarios, such as diverse viewing angles, different clothes, or different item-carrying conditions. Experiments show that under normal walking conditions, our single-model method achieves an average rank-1 accuracy of 96.1% on the CASIA-B gait dataset and an accuracy of 87.9% on the OU-MVLP gait dataset. Under various complex scenarios, our model also exhibits a high level of robustness. It achieves accuracies of 90.8% and 70.3% on CASIA-B under bag-carrying and coat-wearing walking conditions respectively, significantly outperforming the best existing methods. Moreover, the proposed method maintains a satisfactory accuracy even when only small numbers of frames are available in the test samples; for example, it achieves 85.0% on CASIA-B even when using only 7 frames. The source code has been released at https://github.com/AbnerHqC/GaitSet.

preprint2021arXiv

Oral-3D: Reconstructing the 3D Bone Structure of Oral Cavity from 2D Panoramic X-ray

Panoramic X-ray (PX) provides a 2D picture of the patient's mouth in a panoramic view to help dentists observe the invisible disease inside the gum. However, it provides limited 2D information compared with cone-beam computed tomography (CBCT), another dental imaging method that generates a 3D picture of the oral cavity but with a higher radiation dose and a higher price. Consequently, it is of great interest to reconstruct the 3D structure from a 2D X-ray image, which can greatly extend the application of X-ray imaging in dental surgeries. In this paper, we propose a framework, named Oral-3D, to reconstruct the 3D oral cavity from a single PX image and prior information of the dental arch. Specifically, we first train a generative model to learn the cross-dimension transformation from 2D to 3D. Then we restore the shape of the oral cavity with a deformation module with the dental arch curve, which can be obtained simply by taking a photo of the patient's mouth. Notably, Oral-3D can restore both the density of bony tissues and the curved mandible surface. Experimental results show that Oral-3D can efficiently and effectively reconstruct the 3D oral structure and show critical information in clinical applications, e.g., tooth pulling and dental implants. To the best of our knowledge, we are the first to explore this domain transformation problem between these two imaging methods.

preprint2021arXiv

Predicting Fault Slip via Transfer Learning

Data-driven machine learning for predicting instantaneous and future fault slip in laboratory experiments has recently progressed markedly due to large training data sets. In Earth, however, earthquake interevent times range from tens to hundreds of years, and geophysical data typically exist for only a portion of an earthquake cycle. Sparse data presents a serious challenge to training machine learning models. Here we describe a transfer learning approach using numerical simulations to train a convolutional encoder-decoder that predicts fault-slip behavior in laboratory experiments. The model learns a mapping between acoustic emission histories and fault slip from numerical simulations, and generalizes to produce accurate results using laboratory data. Notably, slip predictions markedly improve when using the model trained on simulation data and training the latent space using a portion of a single laboratory earthquake cycle. The transfer learning results elucidate the potential of using models trained on numerical simulations and fine-tuned with small geophysical data sets for potential applications to faults in Earth.

preprint2021arXiv

T-Net: Learning Feature Representation with Task-specific Supervision for Biomedical Image Analysis

The encoder-decoder network is widely used to learn deep feature representations from pixel-wise annotations in biomedical image analysis. Under this structure, the performance profoundly relies on the effectiveness of feature extraction achieved by the encoding network. However, few models have considered adapting the attention of the feature extractor even in different kinds of tasks. In this paper, we propose a novel training strategy by adapting the attention of the feature extractor according to different tasks for effective representation learning. Specifically, the framework, named T-Net, consists of an encoding network supervised by task-specific attention maps and a posterior network that takes in the learned features to predict the corresponding results. The attention map is obtained by the transformation from pixel-wise annotations according to the specific task, which is used as the supervision to regularize the feature extractor to focus on different locations of the recognition object. To show the effectiveness of our method, we evaluate T-Net on two different tasks, i.e., segmentation and localization. Extensive results on three public datasets (BraTS-17, MoNuSeg and IDRiD) have indicated the effectiveness and efficiency of our proposed supervision method, especially over the conventional encoding-decoding network.

preprint2021arXiv

Towards the standardization of quantum state verification using optimal strategies

Quantum devices for generating entangled states have been extensively studied and widely used. As such, it becomes necessary to verify that these devices truly work reliably and efficiently as specified. Here, we experimentally realize the recently proposed two-qubit entangled state verification strategies using both local measurements (nonadaptive) and active feed-forward operations (adaptive) with a photonic platform. About 3283/536 copies ($N$) are required to achieve 99% confidence in verifying the target quantum state for the nonadaptive/adaptive strategies. These optimal strategies provide the Heisenberg scaling of the infidelity $\epsilon$ as a function of $N$ ($\epsilon \sim N^r$) with the parameter $r=-1$, exceeding the standard quantum limit with $r=-0.5$. We experimentally obtain scaling parameters of $r=-0.88\pm0.03$ and $r=-0.78\pm0.07$ for the nonadaptive and adaptive strategies, respectively. Our experimental work could serve as a standardized procedure for the verification of quantum states.
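The scaling parameter $r$ in a power law of the form infidelity ~ $N^r$ can be estimated as the slope of a straight-line fit in log-log space. The Python sketch below shows that generic procedure on synthetic data; it is not the authors' analysis pipeline, and the function name `fit_scaling_exponent` and all numbers are assumptions chosen only to reproduce the two ideal cases named in the abstract ($r=-1$ and $r=-0.5$).

```python
import math


def fit_scaling_exponent(copies, infidelities):
    """Least-squares slope of log(infidelity) versus log(N).

    For data following infidelity = c * N**r, the returned slope equals r.
    """
    xs = [math.log(n) for n in copies]
    ys = [math.log(e) for e in infidelities]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic, noise-free data (not measurements from the paper).
copies = [10, 100, 1000, 10000]
heisenberg_like = [1.0 / n for n in copies]            # ideal r = -1
standard_limit = [1.0 / math.sqrt(n) for n in copies]  # ideal r = -0.5

print(fit_scaling_exponent(copies, heisenberg_like))
print(fit_scaling_exponent(copies, standard_limit))
```

With real measurements the fitted slope would also carry an uncertainty, which this sketch does not estimate.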

preprint2020arXiv

A First Principles Approach for Data-Efficient System Identification of Spring-Rod Systems via Differentiable Physics Engines

We propose a novel differentiable physics engine for system identification of complex spring-rod assemblies. Unlike black-box data-driven methods for learning the evolution of a dynamical system and its parameters, we modularize the design of our engine using a discrete form of the governing equations of motion, similar to a traditional physics engine. We further reduce the dimension from 3D to 1D for each module, which allows efficient learning of system parameters using linear regression. As a side benefit, the regression parameters correspond to physical quantities, such as spring stiffness or the mass of the rod, making the pipeline explainable. The approach significantly reduces the amount of training data required, and also avoids iterative identification of data sampling and model training. We compare the performance of the proposed engine with previous solutions, and demonstrate its efficacy on tensegrity systems, such as NASA's icosahedron.
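As a toy illustration of why reducing each module to 1D lets a physical parameter be recovered by linear regression, the Python sketch below estimates a spring stiffness from displacement and force samples under Hooke's law, $F = -kx$. This is not the paper's engine: the single-spring setup, the function name `estimate_stiffness`, and the sample values are all assumptions made for illustration.

```python
def estimate_stiffness(displacements, forces):
    """Least-squares estimate of k in Hooke's law F = -k * x.

    Fits a line through the origin, so the closed form is
    k = -sum(x * F) / sum(x * x).
    """
    numerator = sum(x * f for x, f in zip(displacements, forces))
    denominator = sum(x * x for x in displacements)
    if denominator == 0:
        raise ValueError("need at least one non-zero displacement")
    return -numerator / denominator


# Synthetic samples generated with a made-up stiffness of 3.0 (illustrative).
true_k = 3.0
displacements = [0.1, 0.2, -0.3, 0.05]
forces = [-true_k * x for x in displacements]

print(estimate_stiffness(displacements, forces))
```

Because the fitted coefficient is the stiffness itself, the regression result has a direct physical reading, which is the explainability point the abstract makes.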

preprint2020arXiv

A Novel Scenario in the Semi-constrained NMSSM

In this work, we develop a novel efficient scan method, combining the Heuristically Search (HS) and the Generative Adversarial Network (GAN), where the HS can shift marginal samples to perfect samples, and the GAN can generate a huge amount of recommended samples from noise in a short time. With this efficient method, we find a new scenario in the semi-constrained Next-to Minimal Supersymmetric Standard Model (scNMSSM), or NMSSM with non-universal Higgs masses. In this scenario, (i) Both muon g-2 and right relic density can be satisfied, along with the high mass bound of gluino, etc. As far as we know, that had not been realized in the scNMSSM before this work. (ii) With the right relic density, the lightest neutralinos are singlino-dominated, and can be as light as 0-12 GeV. (iii) The future direct detections XENONnT and LUX-ZEPLIN (LZ-7 2T) can give strong constraints to this scenario. (iv) The current indirect constraints to Higgs invisible decay $h_2\to \tilde{\chi}^0_1 \tilde{\chi}^0_1$ are weak, but the direct detection of Higgs invisible decay at the future HL-LHC may cover half of the samples, and that of the CEPC may cover most. (v) The branching ratio of Higgs exotic decay $h_2\to h_1h_1, a_1a_1$ can be over 20 percent, while their contributions to the invisible decay $h_2\to 4\chi_1^0$ are very small.

preprint2020arXiv

Accurate Anchor Free Tracking

Visual object tracking is an important application of computer vision. Recently, Siamese-based trackers have achieved good accuracy. However, most Siamese-based trackers are not efficient, as they exhaustively search potential object locations to define anchors and then classify each anchor (i.e., a bounding box). This paper develops the first Anchor Free Siamese Network (AFSN). Specifically, a target object is defined by a bounding box center, tracking offset, and object size. All three are regressed by the Siamese network with no additional classification or region proposal, and the regression is performed once for each frame. We also tune the stride and receptive field for the Siamese network, and further perform ablation experiments to quantitatively illustrate the effectiveness of our AFSN. We evaluate AFSN using the five most commonly used benchmarks and compare it to the best anchor-based trackers with source code available for each benchmark. AFSN is 3-425 times faster than these best anchor-based trackers. AFSN is also 5.97% to 12.4% more accurate in terms of all metrics for benchmark sets OTB2015, VOT2015, VOT2016, VOT2018 and TrackingNet, except that SiamRPN++ is 4% better than AFSN in terms of Expected Average Overlap (EAO) on VOT2018 (but SiamRPN++ is 3.9 times slower).

preprint2020arXiv

Adapting Object Detectors with Conditional Domain Normalization

Real-world object detectors are often challenged by the domain gaps between different datasets. In this work, we present the Conditional Domain Normalization (CDN) to bridge the domain gap. CDN is designed to encode different domain inputs into a shared latent space, where the features from different domains carry the same domain attribute. To achieve this, we first disentangle the domain-specific attribute out of the semantic features from one domain via a domain embedding module, which learns a domain-vector to characterize the corresponding domain attribute information. Then this domain-vector is used to encode the features from another domain through a conditional normalization, resulting in different domains' features carrying the same domain attribute. We incorporate CDN into various convolution stages of an object detector to adaptively address the domain shifts of representations at different levels. In contrast to existing adaptation works that conduct domain confusion learning on semantic features to remove domain-specific factors, CDN aligns different domain distributions by modulating the semantic features of one domain conditioned on the learned domain-vector of another domain. Extensive experiments show that CDN outperforms existing methods remarkably on both real-to-real and synthetic-to-real adaptation benchmarks, including 2D image detection and 3D point cloud detection.

preprint2020arXiv

Application of the Resource Theory of Channels to Communication Scenarios

We introduce a resource theory of channels relevant to communication via quantum channels, in which the set of constant channels --- useless channels for communication tasks --- is considered as the free resource. We find that our theory with such a simple structure is useful to address central problems in quantum Shannon theory --- in particular, we provide a converse bound for the one-shot non-signalling assisted classical capacity that naturally leads to its strong converse property, as well as obtain the one-shot channel simulation cost with non-signalling assistance. We clarify an intimate connection between the non-signalling assistance and our formalism by identifying the non-signalling assisted channel coding with the channel transformation under the maximal set of resource non-generating superchannels, providing a physical characterization of the latter. Our results provide new perspectives and concise arguments to those problems, connecting the recently developed fields of resource theories to `classic' settings in quantum information theory and shedding light on the validity of resource theories of channels as effective tools to address practical problems.

preprint2020arXiv

Effective Scaling of Blockchain Beyond Consensus Innovations and Moore's Law

As an emerging technology, blockchain has achieved great success in numerous application scenarios, from intelligent healthcare to smart cities. However, a long-standing bottleneck hindering its further development is the massive resource consumption attributed to the distributed storage and computation methods. This makes blockchain suffer from insufficient performance and poor scalability. Here, we analyze the recent blockchain techniques and demonstrate that the potential of widely-adopted consensus-based scaling is seriously limited, especially in the current era when Moore's law-based hardware scaling is about to end. We achieve this by developing an open-source benchmarking tool, called Prism, for investigating the key factors causing low resource efficiency and then discuss various topology and hardware innovations which could help to scale up blockchain. To the best of our knowledge, this is the first in-depth study that explores the next-generation scaling strategies by conducting large-scale and comprehensive benchmarking.

preprint2020arXiv

Funnel annihilations of light dark matter and the invisible decay of the Higgs boson

The semi-constrained NMSSM (scNMSSM), or NMSSM with non-universal Higgs masses, can naturally predict a light dark matter under current constraints including Higgs data, sparticle-mass bounds, dark matter searches, and muon g-2, etc. In this work, we take this scenario of scNMSSM as an example to study the funnel-annihilation mechanisms of light dark matter ($1\!\sim\!62$ GeV) and the invisible Higgs decay. In this scenario we found that: (i) There can be four funnel-annihilation mechanisms for the LSP $\tilde{\chi}^0_1$, which are the $h_2$, $Z$, $h_1$ and $a_1$ funnel. (ii) For the $h_1$ and $a_1$ funnel with right relic density, the $\tilde{\chi}^0_1$ mass is lighter than 12 GeV, and the invisible Higgs decay can be $2\%$ at most. (iii) For the $h_2$ and $Z$ funnel with right relic density, the invisible Higgs decay can be about $0.4\%$ and $1\%$ respectively at most. (iv) If the invisible Higgs decay was discovered at the HL-LHC, the four funnel-annihilation mechanisms of light dark matter may be all excluded with $\tilde{\chi}^0_1$ as the only dark matter source. Four benchmark points, one for each mechanism, are proposed for future checking with updated experimental results.

preprint2020arXiv

Higgs decay to light scalars in the semi-constrained NMSSM

The next-to minimal supersymmetric standard model (NMSSM) with non-universal Higgs masses, or the semi-constrained NMSSM (scNMSSM), extends the minimal supersymmetric standard model (MSSM) by a singlet superfield and assumes universal conditions except for the Higgs sector. It can not only keep the simplicity and grace of the fully constrained MSSM and NMSSM, and relax the tension that they face after the discovery of the 125-GeV Higgs boson, but also predict an exotic phenomenon in which the Higgs decays to a pair of light singlet-dominated scalars ($10\!\sim\! 60\;{\rm GeV}$). This condition can be classified into three scenarios according to the identities of the SM-like Higgs and the light scalar: (i) the light scalar is CP-odd, and the SM-like Higgs is $h_2$; (ii) the light scalar is CP-odd, and the SM-like Higgs is $h_1$; (iii) the light scalar is CP-even, and the SM-like Higgs is $h_2$. In this work, we compare the three scenarios, checking the interesting parameter schemes that lead to the scenarios, the mixing levels of the doublets and singlets, the tri-scalar coupling between the SM-like Higgs and a pair of light scalars, the branching ratio of Higgs decay to the light scalars, and sensitivities in hunting for the exotic decay at the HL-LHC and the future lepton colliders such as CEPC, FCC-ee, and ILC.

preprint2020arXiv

Low Precision Floating-point Arithmetic for High Performance FPGA-based CNN Acceleration

Low precision data representation is important to reduce storage size and memory access for convolutional neural networks (CNNs). Yet, existing methods have two major limitations: (1) requiring re-training to maintain accuracy for deep CNNs, and (2) needing 16-bit floating-point or 8-bit fixed-point for a good accuracy. In this paper, we propose a low precision (8-bit) floating-point (LPFP) quantization method for FPGA-based acceleration to overcome the above limitations. Without any re-training, LPFP finds an optimal 8-bit data representation with negligible top-1/top-5 accuracy loss (within 0.5%/0.3% in our experiments, respectively, and significantly better than existing methods for deep CNNs). Furthermore, we implement one 8-bit LPFP multiplication by one 4-bit multiply-adder (MAC) and one 3-bit adder, and therefore implement four 8-bit LPFP multiplications using one DSP slice of Xilinx Kintex 7 family (KC705 in this paper) while one DSP can implement only two 8-bit fixed-point multiplications. Experiments on six typical CNNs for inference show that on average, we improve throughput by 64.5x over Intel i9 CPU and by 1.5x over existing FPGA accelerators. Particularly for VGG16 and YOLO, compared to six recent FPGA accelerators, we improve average throughput by 3.5x and 27.5x and improve average throughput per DSP by 4.1x and 5x, respectively. To the best of our knowledge, this is the first in-depth study to simplify one multiplication for CNN inference to one 4-bit MAC and implement four multiplications within one DSP while maintaining comparable accuracy without any re-training.

preprint2020arXiv

Multifunctional Lateral Transition-Metal Disulfides Heterojunctions

The intrinsic spin-dependent transport properties of two types of lateral VS2|MoS2 heterojunctions are systematically investigated using first-principles calculations, and their various nanodevices with novel properties are designed. The lateral VS2|MoS2 heterojunction diodes show a perfect rectifying effect and are promising for the applications of Schottky diodes. A large spin-polarization ratio is observed for the A-type device and pure spin-mediated current is then realized. The gate voltage significantly tunes the current and rectification ratio of their field-effect transistors (FETs). In addition, they all have a sensitive photoresponse to blue light, and could be used as photodetectors and photovoltaic devices. Moreover, they generate an effective thermally driven current when a temperature gradient appears between the two terminals, suggesting them as potential thermoelectric materials. Hence, the lateral VS2|MoS2 heterojunctions show a multifunctional nature and have various potential applications in spintronics, optoelectronics, and spin caloritronics.

preprint2020arXiv

On $U(n)$-invariant strongly convex complex Finsler metrics

In this paper, we obtain a necessary and sufficient condition for a $U(n)$-invariant complex Finsler metric $F$ on domains in $\mathbb{C}^n$ to be strongly convex, which also makes it possible to investigate the relationship between real and complex Finsler geometry via concrete and computable examples. We prove a rigidity theorem which states that a $U(n)$-invariant strongly convex complex Finsler metric $F$ is a real Berwald metric if and only if $F$ comes from a $U(n)$-invariant Hermitian metric. We give a characterization of $U(n)$-invariant weakly complex Berwald metrics with vanishing holomorphic sectional curvature and obtain an explicit formula for the holomorphic curvature of a $U(n)$-invariant strongly pseudoconvex complex Finsler metric. Finally, we prove that the real geodesics of some $U(n)$-invariant complex Finsler metrics restricted to the unit sphere $\pmb{S}^{2n-1}\subset\mathbb{C}^n$ share a specific property with those of the complex Wrona metric on $\mathbb{C}^n$.

preprint2020arXiv

OralCam: Enabling Self-Examination and Awareness of Oral Health Using a Smartphone Camera

Due to a lack of medical resources or oral health awareness, oral diseases are often left unexamined and untreated, affecting a large population worldwide. With the advent of low-cost, sensor-equipped smartphones, mobile apps offer a promising possibility for promoting oral health. However, to the best of our knowledge, no mobile health (mHealth) solutions can directly support a user in self-examining their oral health condition. This paper presents OralCam, the first interactive app that enables end-users' self-examination of five common oral conditions (diseases or early disease signals) by taking smartphone photos of one's oral cavity. OralCam allows a user to annotate additional information (e.g. living habits, pain, and bleeding) to augment the input image, and presents the output hierarchically, probabilistically, and with visual explanations to help a lay user understand examination results. Developed on our in-house dataset that consists of 3,182 oral photos annotated by dental experts, our deep-learning-based framework achieved an average detection sensitivity of 0.787 over five conditions with high localization accuracy. In a week-long in-the-wild user study (N=18), most participants had no trouble using OralCam and interpreting the examination results. Two expert interviews further validate the feasibility of OralCam for promoting users' awareness of oral health.
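
As a reading aid for the "average detection sensitivity" figure above, the following is a minimal sketch of how a per-condition sensitivity (true positives divided by all actual positives) and its unweighted mean could be computed. The function names, condition labels, and counts are hypothetical illustrations, not values from the paper, and the abstract does not say whether the reported average is weighted or unweighted.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of actual positive cases that the detector flags."""
    positives = true_positives + false_negatives
    if positives == 0:
        raise ValueError("sensitivity is undefined without positive cases")
    return true_positives / positives


def mean_sensitivity(counts: dict[str, tuple[int, int]]) -> float:
    """Unweighted mean of the per-condition sensitivities (an assumption)."""
    values = [sensitivity(tp, fn) for tp, fn in counts.values()]
    return sum(values) / len(values)


# Hypothetical (true positive, false negative) counts, not from the paper.
example_counts = {
    "condition_a": (8, 2),
    "condition_b": (3, 1),
}
print(mean_sensitivity(example_counts))
```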

preprint2020arXiv

OralViewer: 3D Demonstration of Dental Surgeries for Patient Education with Oral Cavity Reconstruction from a 2D Panoramic X-ray

Patients' understanding of forthcoming dental surgeries is required by patient-centered care and helps reduce fear and anxiety. Due to the gap in expertise between patients and dentists, conventional techniques of patient education are usually not effective for explaining surgical steps. In this paper, we present OralViewer, the first interactive application that enables dentists to demonstrate dental surgeries in 3D to promote patients' understanding. OralViewer takes a single 2D panoramic dental X-ray to reconstruct patient-specific 3D teeth structures, which are then assembled with registered gum and jaw bone models for complete oral cavity modeling. During the demonstration, OralViewer enables dentists to show surgery steps with virtual dental instruments that can animate effects on a 3D model in real time. A technical evaluation shows that our deep-learning-based model achieves a mean Intersection over Union (IoU) of 0.771 for 3D teeth reconstruction. A patient study with 12 participants shows that OralViewer can improve patients' understanding of surgeries. An expert study with 3 board-certified dentists further verifies the clinical validity of our system.
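
The Intersection over Union (IoU) metric mentioned above can be illustrated with a minimal sketch. It assumes that both the reconstructed and the reference shapes are given as sets of occupied voxel coordinates; that representation, the function name, and the sample voxels are assumptions made for illustration only and are not taken from the paper.

```python
Voxel = tuple[int, int, int]


def voxel_iou(predicted: set[Voxel], reference: set[Voxel]) -> float:
    """Intersection over Union of two sets of occupied voxels."""
    union = predicted | reference
    if not union:
        # Convention chosen here: two empty shapes count as a perfect match.
        return 1.0
    return len(predicted & reference) / len(union)


# Hypothetical voxel sets: 2 shared voxels, 4 voxels in the union.
predicted = {(0, 0, 0), (1, 0, 0), (1, 1, 0)}
reference = {(1, 0, 0), (1, 1, 0), (2, 1, 0)}
print(voxel_iou(predicted, reference))
```

A mean IoU would then be the average of this value over all evaluated samples.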

preprint2020arXiv

Permutation Enhances Classical Communication Assisted by Entangled States

We give a capacity formula for classical communication over a noisy quantum channel when local operations and global permutations are allowed in the encoding and bipartite states are preshared between the sender and the receiver. The two endpoints of this formula are the Holevo capacity (without entanglement assistance) and the entanglement-assisted capacity (with unlimited entanglement assistance). Moreover, we show that the capacity satisfies the strong converse property, and thus the formula serves as a sharp dividing line between achievable and unachievable rates of communication. We prove that the difference between the assisted capacity and the Holevo capacity is upper bounded by the discord of formation of the preshared state. As examples, we analytically derive the classical capacity of various quantum channels of interest. Our result witnesses the power of random permutation in classical communication whenever entanglement assistance is available.

preprint2020arXiv

Phoenix: A Low-Precision Floating-Point Quantization Oriented Architecture for Convolutional Neural Networks

Convolutional neural networks (CNNs) achieve state-of-the-art performance at the cost of becoming deeper and larger. Although quantization (both fixed-point and floating-point) has proven effective for reducing storage and memory access, two challenges -- 1) accuracy loss caused by quantization without calibration, fine-tuning or re-training for deep CNNs and 2) hardware inefficiency caused by floating-point quantization -- prevent processors from completely leveraging the benefits. In this paper, we propose a low-precision floating-point quantization oriented processor, named Phoenix, to address the above challenges. We make three key observations: 1) 8-bit floating-point quantization incurs less error than 8-bit fixed-point quantization; 2) without using any calibration, fine-tuning or re-training techniques, normalization before quantization further reduces accuracy degradation; 3) an 8-bit floating-point multiplier achieves higher hardware efficiency than an 8-bit fixed-point multiplier if the full-precision product is applied. Based on these key observations, we propose a normalization-oriented 8-bit floating-point quantization method to reduce storage and memory access with negligible accuracy loss (within 0.5%/0.3% for top-1/top-5 accuracy, respectively). We further design a hardware processor to address the hardware inefficiency caused by the floating-point multiplier. Compared with a state-of-the-art accelerator, Phoenix is 3.32x and 7.45x better in performance with the same core area for AlexNet and VGG16, respectively.

preprint2020arXiv

Potassium Isotope Compositions of Carbonaceous and Ordinary Chondrites: Implications on the Origin of Volatile Depletion in the Early Solar System

Solar system materials are variably depleted in moderately volatile elements (MVEs) relative to the proto-solar composition. To address the origin of this MVE depletion, we conducted a systematic study of high-precision K isotopic composition on 16 carbonaceous chondrites (CCs) of types CM1-2, CO3, CV3, CR2, CK4-5 and CH3 and 28 ordinary chondrites (OCs) covering petrological types 3 to 6 and chemical groups H, L, and LL. We observed significant overall K isotope (delta41K) variations (-1.54 to 0.70 permil). The K isotope compositions of CCs are largely higher than the Bulk Silicate Earth (BSE) value, whereas OCs show typically lower values than BSE. Neither CCs nor OCs show resolvable correlations between K isotopes and chemical groups, petrological types, shock levels, exposure ages, fall or find occurrence, or terrestrial weathering. The lack of a clear trend between K isotopes and K content indicates that the K isotope fractionations were decoupled from the relative elemental K depletions. The range of K isotope variations in the CCs is consistent with a four-component (chondrule, refractory inclusion, matrix and water) mixing model that is able to explain the bulk elemental and isotopic compositions of the main CC groups, but requires a fractionation in K isotopic compositions in chondrules. We propose that the major control of the isotopic compositions of group averages is condensation or vaporization in nebular environments that is preserved in the compositional variation of chondrules. Parent-body processes (aqueous alteration, thermal metamorphism, and metasomatism) can mobilize K and affect the K isotopes in individual samples. In the case of the OCs, the full range of K isotopic variations can only be explained by the combined effects of the size and relative abundances of chondrules, parent-body aqueous and thermal alteration.
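
For readers unfamiliar with the delta41K values quoted above, the following sketch shows the conventional delta notation: the per-mil deviation of a sample's 41K/39K ratio from that of a reference standard. The function name and the numeric ratios are hypothetical placeholders, and the abstract does not name the reference standard used.

```python
def delta_per_mil(ratio_sample: float, ratio_standard: float) -> float:
    """Per-mil deviation of an isotope ratio from a reference standard.

    For delta41K, both arguments are 41K/39K ratios.
    """
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


# Hypothetical ratios: a sample whose 41K/39K is 0.1% above the standard.
print(delta_per_mil(1.001, 1.000))
```

A negative value means the sample is enriched in the lighter isotope (39K) relative to the standard, and a positive value means it is enriched in 41K.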

preprint2020arXiv

Potassium Isotopic Compositions of Enstatite Meteorites

Enstatite chondrites and aubrites are meteorites that show the closest similarities to the Earth in many isotope systems that undergo mass-independent and mass-dependent isotope fractionations. Due to the analytical challenges of obtaining high-precision K isotopic compositions in the past, potential differences in K isotopic compositions between enstatite meteorites and the Earth remained uncertain. We report the first high-precision K isotopic compositions of eight enstatite chondrites and four aubrites and find that there is a significant variation of K isotopic compositions among enstatite meteorites (from -2.34 permil to -0.18 permil). However, K isotopic compositions of nearly all enstatite meteorites scatter around the Bulk Silicate Earth (BSE) value. The average K isotopic composition of the eight enstatite chondrites (-0.47 +/- 0.57 permil) is indistinguishable from the BSE value (-0.48 +/- 0.03 permil), thus further corroborating the isotopic similarity between Earth's building blocks and enstatite meteorite precursors. We found no correlation of K isotopic compositions with the chemical groups, petrological types, shock degrees, and terrestrial weathering conditions; however, the variation of K isotopes among enstatite meteorites can be attributed to parent-body processing. Our sample of the main-group aubrite MIL 13004 is exceptional and has an extremely light K isotopic composition (delta41K = -2.34 +/- 0.12 permil). We attribute this unique K isotopic feature to the presence of abundant djerfisherite inclusions in our sample because this K-bearing sulfide mineral is predicted to be enriched in 39K during equilibrium exchange with silicates.

preprint2020arXiv

The Light Higgsino-dominated NLSPs in the Semi-constrained NMSSM

In the semi-constrained NMSSM (scNMSSM, or NMSSM with non-universal Higgs mass) under current constraints, we consider a scenario where $h_2$ is the SM-like Higgs, $\tilde{\chi}^0_1$ is the singlino-dominated LSP, and $\tilde{\chi}^{\pm}_1$ and $\tilde{\chi}^0_{2,3}$ are mass-degenerate, light, higgsino-dominated NLSPs (next-to-lightest supersymmetric particles). We investigate the constraints on these NLSPs from searches for SUSY particles at the LHC Run I and Run II, discuss the possibility of discovering these NLSPs in the future, and come to the following conclusions: (i) With all data of Run I and up to 36 $\rm fb^{-1}$ of data from Run II at the LHC, the search results by ATLAS and CMS still cannot exclude the higgsino-dominated NLSPs of $100\sim200$ GeV. (ii) When the mass difference with $\tilde{\chi}^0_{1}$ is smaller than $m_{h_2}$, $\tilde{\chi}^0_{2}$ and $\tilde{\chi}^0_{3}$ have opposite preferences for decaying to $Z/Z^*$ or $h_1$. (iii) When the mass difference between the NLSP and the LSP is larger than $m_Z$, most of the samples can be checked at the $5\sigma$ level with future 300 $\rm fb^{-1}$ of data at the LHC. With 3000 $\rm fb^{-1}$ of data at the High-Luminosity LHC (HL-LHC), nearly all of the samples can be checked at the $5\sigma$ level even if the mass difference is insufficient. (iv) The $a_1$ funnel and the $h_2/Z$ funnel mechanisms for the annihilation of the singlino-dominated LSP cannot be distinguished by searching for NLSPs.

preprint2019arXiv

A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation

We introduce a multi-agent meta-modeling game to generate data, knowledge, and models that make predictions on constitutive responses of elasto-plastic materials. We introduce a new concept from graph theory where a modeler agent is tasked with evaluating all the modeling options recast as a directed multigraph and finding the optimal path that links the source of the directed graph (e.g. strain history) to the target (e.g. stress), as measured by an objective function. Meanwhile, the data agent, which is tasked with generating data from real or virtual experiments (e.g. molecular dynamics, discrete element simulations), interacts with the modeling agent sequentially and uses reinforcement learning to design new experiments to optimize the prediction capacity. Consequently, this treatment enables us to emulate an idealized scientific collaboration as selections of the optimal choices in a decision tree search done automatically via deep reinforcement learning.

preprint2016arXiv

The Lackadaisical Quantum Walker is NOT Lazy at all

In this paper, we study the properties of lackadaisical quantum walks on a line. This model was first proposed in \cite{wong2015grover} as a quantum analogue of lazy random walks, in which each vertex has $\tau$ self-loops attached. We derive an analytic expression for the localization probability of the walker at the origin after infinitely many steps, and obtain the peak velocities of the walker. We also rigorously calculate the wave function of the walker starting from the origin and obtain a long-time approximation for the entire probability density function. As an application of the density function, we prove that lackadaisical quantum walks spread ballistically for arbitrary $\tau$, and give an analytic solution for the variance of the walker's probability distribution.
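
To make the model concrete, the following is a minimal numerical sketch of a discrete-time quantum walk on a line in which every vertex carries `tau` self-loops. It rests on assumptions the abstract does not confirm: the coin is taken to be the Grover diffusion operator on the `tau + 2` directions (left, right, and each self-loop), and the walker starts at the origin in a uniform superposition of coin states. The function name is hypothetical.

```python
import math


def lackadaisical_walk(steps: int, tau: int) -> list[float]:
    """Position probabilities on sites -steps..steps after `steps` steps."""
    d = tau + 2                      # coin states: 0 = left, 1 = right, rest = loops
    size = 2 * steps + 1
    origin = steps
    amp = [[0.0] * d for _ in range(size)]
    for c in range(d):               # assumed initial state: uniform coin at origin
        amp[origin][c] = 1.0 / math.sqrt(d)

    for _ in range(steps):
        new = [[0.0] * d for _ in range(size)]
        for x in range(size):
            if not any(amp[x]):
                continue             # empty site; also keeps indices in range
            total = sum(amp[x])
            for c in range(d):
                # Grover coin: reflect each amplitude about the site's mean.
                out = 2.0 * total / d - amp[x][c]
                # Shift: left and right components hop, loop components stay.
                target = x - 1 if c == 0 else x + 1 if c == 1 else x
                new[target][c] += out
        amp = new

    return [sum(a * a for a in site) for site in amp]


probs = lackadaisical_walk(steps=20, tau=1)
print("probability at origin:", probs[20])
```

Because the initial state is real and the assumed coin is a real matrix, real-valued amplitudes suffice here. A complex initial state would require complex arithmetic.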