Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
31 works
0 followers
20 topics
4 close collaborators

Published work

31 published item(s)

preprint, 2026, arXiv

A multitask framework for automated interpretation of multi-frame right upper quadrant ultrasound in clinical decision support

Ultrasound is a cornerstone of emergency and hepatobiliary imaging, yet its interpretation remains highly operator-dependent and time-sensitive. Here, we present a multitask vision-language agent (VLM) developed to assist with comprehensive right upper quadrant (RUQ) ultrasound interpretation across the full diagnostic workflow. The system was trained on a large, multi-center dataset comprising a primary cohort from Johns Hopkins Medical Institutions (9,189 cases, 594,099 images) and externally validated on cohorts from Stanford University (108 cases, 3,240 images) and a major Chinese medical center (257 cases, 3,178 images). Built on the Qwen2.5-VL-7B architecture, the agent integrates frame-level visual understanding with report-grounded language reasoning to perform three tasks: (i) classification of 18 hepatobiliary and gallbladder conditions, (ii) generation of clinically coherent diagnostic reports, and (iii) surgical decision support based on ultrasound findings and clinical data. The model achieved high diagnostic accuracy across all tasks, generated reports that were indistinguishable from expert-written versions in blinded evaluations, and demonstrated superior factual accuracy and information density on content-based metrics. The agent further identified patients requiring cholecystectomy with high precision, supporting real-time decision-making. These results highlight the potential of generalist vision-language models to improve diagnostic consistency, reporting efficiency, and surgical triage in real-world ultrasound practice.

preprint, 2026, arXiv

Internal Representations as Indicators of Hallucinations in Agent Tool Selection

Large Language Models (LLMs) have shown remarkable capabilities in tool calling and tool usage, but suffer from hallucinations where they choose incorrect tools, provide malformed parameters, and exhibit 'tool bypass' behavior by performing simulations and generating outputs instead of invoking specialized tools or external systems. This undermines the reliability of LLM-based agents in production systems, as it leads to inconsistent results and bypasses security and audit controls. Such hallucinations in agent tool selection require early detection and error handling. Unlike existing hallucination detection methods that require multiple forward passes or external validation, we present a computationally efficient framework that detects tool-calling hallucinations in real time by leveraging LLMs' internal representations during the same forward pass used for generation. We evaluate this approach on reasoning tasks across multiple domains, demonstrating strong detection performance (up to 86.4% accuracy) while maintaining real-time inference capabilities with minimal computational overhead, particularly excelling at detecting parameter-level hallucinations and inappropriate tool selections, which is critical for reliable agent deployment.

preprint, 2026, arXiv

JURY-RL: Votes Propose, Proofs Dispose for Label-Free RLVR

Reinforcement learning with verifiable rewards (RLVR) enhances the reasoning of large language models (LLMs), but standard RLVR often depends on human-annotated answers or carefully curated reward specifications. In machine-checkable domains, label-free alternatives such as majority voting or LLM-as-a-judge remove annotation cost but can introduce false positives that destabilize training. We introduce JURY-RL, a label-free RLVR framework that decouples answer proposal from reward disposal: votes from model rollouts propose a candidate answer, and a formal verifier determines whether that candidate can receive positive reward. Concretely, only rollouts matching the plurality-voted answer are rewarded when that answer is successfully verified in Lean. When verification is inconclusive, we invoke ResZero (Residual-Zero), a fallback reward that discards the unverified plurality proposal and redistributes a zero-mean, variance-preserving signal over the residual answers. This design maintains a stable optimization gradient without reinforcing unverifiable consensus. Across three backbone models trained on mathematical data, JURY-RL consistently outperforms other label-free baselines on mathematical reasoning benchmarks and transfers competitively to code generation and general benchmarks. It attains pass@1 performance comparable to supervised ground-truth training, with superior generalization demonstrated by higher pass@k and response diversity.
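
The vote-then-verify gating described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `jury_rewards`, the `verify` callback (a hypothetical stand-in for a Lean check), and the 1.0/0.0 reward values are all assumptions. The ResZero fallback is not reproduced because the abstract does not give its formula; the sketch simply withholds reward when verification is inconclusive.

```python
from collections import Counter
from typing import Callable, List, Optional


def jury_rewards(
    answers: List[str],
    verify: Callable[[str], Optional[bool]],
) -> List[float]:
    """Reward rollouts matching the plurality answer, but only if it verifies.

    `verify` is a hypothetical stand-in for a formal checker: it returns True
    when the candidate is proven, and False or None otherwise.
    """
    if not answers:
        return []
    # Votes propose: the most common answer is the single candidate.
    candidate, _ = Counter(answers).most_common(1)[0]
    # Proofs dispose: positive reward only when verification succeeds.
    if verify(candidate) is True:
        return [1.0 if a == candidate else 0.0 for a in answers]
    # Inconclusive case: the paper's ResZero fallback is NOT reproduced here;
    # this sketch just withholds all reward (an assumption).
    return [0.0] * len(answers)
```

Calling `jury_rewards(["4", "4", "5"], lambda a: a == "4")` rewards only the two rollouts that agree with the verified plurality answer.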

preprint, 2026, arXiv

Reflect3r: Single-View 3D Stereo Reconstruction Aided by Mirror Reflections

Mirror reflections are common in everyday environments and can provide stereo information within a single capture, as the real and reflected virtual views are visible simultaneously. We exploit this property by treating the reflection as an auxiliary view and designing a transformation that constructs a physically valid virtual camera, allowing direct pixel-domain generation of the virtual view while adhering to the real-world imaging process. This enables a multi-view stereo setup from a single image, simplifying the imaging process, making it compatible with powerful feed-forward reconstruction models for generalizable and robust 3D reconstruction. To further exploit the geometric symmetry introduced by mirrors, we propose a symmetric-aware loss to refine pose estimation. Our framework also naturally extends to dynamic scenes, where each frame contains a mirror reflection, enabling efficient per-frame geometry recovery. For quantitative evaluation, we provide a fully customizable synthetic dataset of 16 Blender scenes, each with ground-truth point clouds and camera poses. Extensive experiments on real-world data and synthetic data are conducted to illustrate the effectiveness of our method.
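
As a geometric illustration of the virtual-view idea in this abstract, the sketch below mirrors a 3D point (for example, a camera center) across a planar mirror. It is standard plane-reflection geometry under stated assumptions, not the paper's transformation: the function name `reflect_point` and the plane parameterization `normal . x = d` are chosen here for illustration.

```python
import math
from typing import Sequence, Tuple


def reflect_point(
    p: Sequence[float],
    normal: Sequence[float],
    d: float,
) -> Tuple[float, ...]:
    """Mirror point `p` across the plane {x : normal . x = d}.

    The plane parameterization is an assumption made for this sketch.
    """
    norm = math.sqrt(sum(c * c for c in normal))
    if norm == 0.0:
        raise ValueError("plane normal must be non-zero")
    # Rescale so the normal has unit length and the offset matches it.
    n = [c / norm for c in normal]
    offset = d / norm
    # Signed distance from the point to the mirror plane.
    dist = sum(pi * ni for pi, ni in zip(p, n)) - offset
    # Move the point twice that distance along the normal.
    return tuple(pi - 2.0 * dist * ni for pi, ni in zip(p, n))
```

Reflecting a point twice across the same plane returns the original point, which is a quick sanity check for any mirror parameters.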

preprint, 2026, arXiv

Training-Free Video Editing via Optical Flow-Enhanced Score Distillation

The rapid advancement in visual generation, particularly the emergence of pre-trained text-to-image and text-to-video models, has catalyzed growing interest in training-free video editing research. Mirroring training-free image editing techniques, current approaches preserve original video information through video input inversion and manipulating intermediate features and attention during the inference process to achieve content editing. Although they have demonstrated promising results, the lossy nature of the inversion process poses significant challenges in maintaining unedited regions of the video. Furthermore, feature and attention manipulation during inference can lead to unintended over-editing and face challenges in both local temporal continuity and global content consistency. To address these challenges, this study proposes a score distillation paradigm based on pre-trained text-to-video models, where the original video is iteratively optimized through multiple steps guided by editing gradients provided by score distillation to ultimately obtain the target video. The iterative optimization starting from the original video, combined with content preservation loss, ensures the maintenance of unedited regions in the original video and suppresses over-editing. To further guarantee video content consistency and temporal continuity, we additionally introduce a global consistency auxiliary loss and optical flow prediction-based local editing gradient smoothing. Experiments demonstrate that these strategies effectively address the aforementioned challenges, achieving comparable or superior performance across multiple dimensions including preservation of unedited regions, local temporal continuity, and global content consistency of editing results, compared to state-of-the-art methods.

preprint, 2024, arXiv

SwitchTab: Switched Autoencoders Are Effective Tabular Learners

Self-supervised representation learning methods have achieved significant success in computer vision and natural language processing, where data samples exhibit explicit spatial or semantic dependencies. However, applying these methods to tabular data is challenging due to the less pronounced dependencies among data samples. In this paper, we address this limitation by introducing SwitchTab, a novel self-supervised method specifically designed to capture latent dependencies in tabular data. SwitchTab leverages an asymmetric encoder-decoder framework to decouple mutual and salient features among data pairs, resulting in more representative embeddings. These embeddings, in turn, contribute to better decision boundaries and lead to improved results in downstream tasks. To validate the effectiveness of SwitchTab, we conduct extensive experiments across various domains involving tabular data. The results showcase superior performance in end-to-end prediction tasks with fine-tuning. Moreover, we demonstrate that pre-trained salient embeddings can be utilized as plug-and-play features to enhance the performance of various traditional classification methods (e.g., Logistic Regression, XGBoost, etc.). Lastly, we highlight the capability of SwitchTab to create explainable representations through visualization of decoupled mutual and salient features in the latent space.

preprint, 2022, arXiv

A Survey On Universal Adversarial Attack

The intriguing phenomenon of adversarial examples has attracted significant attention in machine learning, and what might be more surprising to the community is the existence of universal adversarial perturbations (UAPs), i.e., a single perturbation that fools the target DNN for most images. With the focus on UAP against deep classifiers, this survey summarizes the recent progress on universal adversarial attacks, discussing the challenges from both the attack and defense sides, as well as the reason for the existence of UAP. We aim to extend this work as a dynamic survey that will regularly update its content to follow new works regarding UAP or universal attack in a wide range of domains, such as image, audio, video, text, etc. Relevant updates will be discussed at: https://bit.ly/2SbQlLG. We welcome authors of future works in this field to contact us so that their new findings can be included.

preprint, 2022, arXiv

A Unified Understanding of Deep NLP Models for Text Classification

The rapid development of deep natural language processing (NLP) models for text classification has led to an urgent need for a unified understanding of these models proposed individually. Existing methods cannot meet the need for understanding different models in one framework due to the lack of a unified measure for explaining both low-level (e.g., words) and high-level (e.g., phrases) features. We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification. The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample. We model the intra- and inter-word information at each layer measuring the importance of a word to the final prediction as well as the relationships between words, such as the formation of phrases. A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples. Two case studies on classification tasks and comparison between models demonstrate that DeepNLPVis can help users effectively identify potential problems caused by samples and model architectures and then make informed improvements.

preprint, 2022, arXiv

Accessing negative Poisson's ratio of graphene by machine learning interatomic potentials

The negative Poisson's ratio (NPR) is a novel property of materials, which enhances mechanical performance and creates a wide range of application prospects in many fields, such as aerospace, electronics, medicine, etc. A fundamental understanding of the mechanism underlying NPR plays an important role in designing advanced mechanical functional materials. However, different methods yield different and conflicting explanations for the origin of NPR, for instance in the representative case of graphene. In this study, based on machine learning techniques, we constructed a moment tensor potential (MTP) for molecular dynamics (MD) simulations of graphene. By analyzing the evolution of key geometries, the increase of bond angle, rather than bond length, is found to be responsible for the NPR of graphene. The results on the origin of NPR are consistent with state-of-the-art first-principles calculations and amend the results from MD simulations using classic empirical potentials. Our study facilitates the understanding of the origin of NPR in graphene and paves the way to improve the accuracy of MD simulations to a level comparable to first-principles calculations. Our study would also promote the applications of machine learning interatomic potentials in multiscale simulations of functional materials.

preprint, 2022, arXiv

Electrically driven robust tuning of lattice thermal conductivity

Two-dimensional (2D) materials represented by graphene stand out in the future electrical industry and have been widely studied. As a commonly existing factor in electronic devices, the electric field has been extensively utilized to modulate performance. However, how the electric field regulates thermal transport is rarely studied. Herein, we investigate the modulation of thermal transport properties by applying an external electric field ranging from 0 to 0.4 V/Å, with bilayer graphene, monolayer silicene, and germanene as study cases. A monotonically decreasing trend in the thermal conductivity of all three materials is revealed. The significant effect on the scattering rate is found to be responsible for the decrease in thermal conductivity under the electric field. Further evidence shows that the reconstruction of the internal electric field and the generation of induced charges lead to the increased scattering rate through strong phonon anharmonicity. Thus, ultra-low thermal conductivity emerges when an external electric field is applied. Applying an external electric field to regulate thermal conductivity offers a constructive idea for highly efficient thermal management.

preprint, 2022, arXiv

Network Hawkes Process Models for Exploring Latent Hierarchy in Social Animal Interactions

Group-based social dominance hierarchies are of essential interest in animal behavior research. Studies often record aggressive interactions observed over time, and models that can capture such dynamic hierarchy are therefore crucial. Traditional ranking methods summarize interactions across time, using only aggregate counts. Instead, we take advantage of the interaction timestamps, proposing a series of network point process models with latent ranks. We carefully design these models to incorporate important characteristics of animal interaction data, including the winner effect, bursting and pair-flip phenomena. Through iteratively constructing and evaluating these models we arrive at the final cohort Markov-Modulated Hawkes process (C-MMHP), which best characterizes all aforementioned patterns observed in interaction data. We compare all models using simulated and real data. Using statistically developed diagnostic perspectives, we demonstrate that the C-MMHP model outperforms other methods, capturing relevant latent ranking structures that lead to meaningful predictions for real data.

preprint, 2022, arXiv

Optimizing Nitrogen Management with Deep Reinforcement Learning and Crop Simulations

Nitrogen (N) management is critical to sustain soil fertility and crop production while minimizing the negative environmental impact, but is challenging to optimize. This paper proposes an intelligent N management system using deep reinforcement learning (RL) and crop simulations with the Decision Support System for Agrotechnology Transfer (DSSAT). We first formulate the N management problem as an RL problem. We then train management policies with deep Q-network and soft actor-critic algorithms, using the Gym-DSSAT interface, which allows for daily interactions between the simulated crop environment and RL agents. According to experiments on the maize crop in both Iowa and Florida in the US, our RL-trained policies outperform previous empirical methods by achieving higher or similar yield while using less fertilizer.

preprint, 2022, arXiv

Playing Lottery Tickets in Style Transfer Models

Style transfer has achieved great success and attracted a wide range of attention from both academic and industrial communities due to its flexible application scenarios. However, the dependence on a pretty large VGG-based autoencoder leads to existing style transfer models having high parameter complexities, which limits their applications on resource-constrained devices. Compared with many other tasks, the compression of style transfer models has been less explored. Recently, the lottery ticket hypothesis (LTH) has shown great potential in finding extremely sparse matching subnetworks which can achieve on par or even better performance than the original full networks when trained in isolation. In this work, we for the first time perform an empirical study to verify whether such trainable matching subnetworks also exist in style transfer models. Specifically, we take two most popular style transfer models, i.e., AdaIN and SANet, as the main testbeds, which represent global and local transformation based style transfer methods respectively. We carry out extensive experiments and comprehensive analysis, and draw the following conclusions. (1) Compared with fixing the VGG encoder, style transfer models can benefit more from training the whole network together. (2) Using iterative magnitude pruning, we find the matching subnetworks at 89.2% sparsity in AdaIN and 73.7% sparsity in SANet, which demonstrates that style transfer models can play lottery tickets too. (3) The feature transformation module should also be pruned to obtain a much sparser model without affecting the existence and quality of the matching subnetworks. (4) Besides AdaIN and SANet, other models such as LST, MANet, AdaAttN and MCCNet can also play lottery tickets, which shows that LTH can be generalized to various style transfer models.
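
To make the iterative magnitude pruning mentioned in this abstract concrete, here is a minimal sketch of one pruning round over a flat list of weights. It is an illustration under stated assumptions rather than the paper's procedure: `magnitude_mask`, `sparsity`, and the per-round `prune_fraction` are hypothetical names, and the retraining or weight-rewinding that lottery-ticket experiments perform between rounds is omitted.

```python
from typing import List


def magnitude_mask(
    weights: List[float],
    mask: List[int],
    prune_fraction: float,
) -> List[int]:
    """Prune a fraction of the surviving weights, smallest magnitude first.

    `mask[i] == 1` means weight i is still active. Retraining between
    rounds, as lottery-ticket studies do, is deliberately left out.
    """
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * prune_fraction)
    # Rank surviving indices by |weight| and drop the smallest ones.
    to_prune = sorted(alive, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask


def sparsity(mask: List[int]) -> float:
    """Fraction of weights that have been pruned."""
    return 1.0 - sum(mask) / len(mask)
```

Applying `magnitude_mask` repeatedly, feeding each round's mask into the next, raises the sparsity step by step, which is the "iterative" part of the scheme.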

preprint, 2022, arXiv

The synergistic modulation of electronic and geometry structures leads to ultra-low thermal conductivity of graphene-like borides (g-B3X5, X=N, P, As)

The design of novel devices with specific technical interests through modulating structural properties and bonding characteristics promotes the vigorous development of materials informatics. Herein, we propose a synergy strategy of component reconstruction by combining geometric configuration and bonding characteristics. With the synergy strategy, we designed novel two-dimensional (2D) graphene-like borides, e.g., g-B3N5, which possesses a counter-intuitive ultra-low thermal conductivity of 21.08 W/mK despite the small atomic mass. The ultra-low thermal conductivity is attributed to the synergy effect of electronics and geometry on thermal transport due to the combined reconstruction of g-BN and nitrogene. With the synergy effect, the dominant acoustic branches are strongly softened, and the scattering absorption and Umklapp process are simultaneously suppressed. Thus, the thermal conductivity is significantly lowered. To verify the component reconstruction strategy, we further constructed g-B3P5 and g-B3As5, and uncovered ultra-low thermal conductivities of 2.50 and 1.85 W/mK, respectively. The synergy effect and the designed ultra-low thermal conductivity materials with lightweight atomic mass cater to the demand for light development of momentum machinery and heat protection, such as aerospace vehicles, high-speed rail, and automobiles.

preprint, 2021, arXiv

Control cost and quantum speed limit time in controlled almost exact state transmission in open systems

We investigate the influence of environment noise on the control cost and the quantum speed limit time (QSLT) in the process of almost exact state transmission (AEST) through a spin chain under pulse control. The chain is immersed in its surrounding non-Markovian, finite-temperature heat baths. We find that AEST can be realized in weak system-bath coupling, low temperature, and strong non-Markovian baths under effective external control. Correspondingly, the control cost and QSLT increase with increasing bath temperature and coupling strength. It is noticeable that non-Markovianity from the baths can help to reduce the control cost and shorten the QSLT. Furthermore, we find that there exists a trade-off between the control cost and transmission fidelity, and higher fidelity requires higher cost. In addition, the minimum control cost required to reach a given transmission fidelity has been found.

preprint, 2021, arXiv

Decision-based Universal Adversarial Attack

A single perturbation can cause most natural images to be misclassified by classifiers. In the black-box setting, current universal adversarial attack methods utilize substitute models to generate the perturbation, then apply the perturbation to the attacked model. However, this transfer often produces inferior results. In this study, we work directly in the black-box setting to generate the universal adversarial perturbation. In addition, we aim to design an adversary that generates a single perturbation with a stripe-like texture based on an orthogonal matrix, as the top convolutional layers are sensitive to stripes. To this end, we propose an efficient Decision-based Universal Attack (DUAttack). With little data, the proposed adversary computes the perturbation based solely on the final inferred labels, yet good transferability is achieved not only across models but also across different vision tasks. The effectiveness of DUAttack is validated through comparisons with other state-of-the-art attacks. The efficiency of DUAttack is also demonstrated in real-world settings, including Microsoft Azure. In addition, several representative defense methods struggle against DUAttack, indicating the practicability of the proposed method.

preprint, 2021, arXiv

Deep reinforcement learning for universal quantum state preparation via dynamic pulse control

Accurate and efficient preparation of quantum state is a core issue in building a quantum computer. In this paper, we investigate how to prepare a certain single- or two-qubit target state from arbitrary initial states in semiconductor double quantum dots with the aid of deep reinforcement learning. Our method is based on the training of the network over numerous preparing tasks. The results show that once the network is well trained, it works for any initial states in the continuous Hilbert space. Thus repeated training for new preparation tasks is avoided. Our scheme outperforms the traditional optimization approaches based on gradient with both the higher designing efficiency and the preparation quality in discrete control space. Moreover, we find that the control trajectories designed by our scheme are robust against static and dynamic fluctuations, such as charge and nuclear noises.

preprint, 2021, arXiv

Tunable Doping of Rhenium and Vanadium into Transition Metal Dichalcogenides for Two-Dimensional Electronics

Two-dimensional (2D) transition metal dichalcogenides (TMDCs) with unique electrical properties are fascinating materials used for future electronics. However, the strong Fermi level pinning effect at the interface of TMDCs and metal electrodes always leads to high contact resistance, which seriously hinders their application in 2D electronics. One effective way to overcome this is to use metallic TMDCs or transferred metal electrodes as van der Waals (vdW) contacts. Alternatively, using highly conductive doped TMDCs will have a profound impact on the contact engineering of 2D electronics. Here, a novel chemical vapor deposition using mixed molten salts is established for vapor-liquid-solid growth of high-quality rhenium (Re) and vanadium (V)-doped TMDC monolayers with high controllability and reproducibility. A tunable semiconductor to metal transition is observed in the Re and V-doped TMDCs. Electrical conductivity increases up to a factor of $10^{8}$ in the degenerate V-doped WS$_2$ and WSe$_2$. Using V-doped WSe$_2$ as vdW contact, the on-state current and on/off ratio of WSe$_2$-based field-effect transistors have been substantially improved (from ~$10^{-8}$ to $10^{-5}$ A; ~$10^{4}$ to $10^{8}$), compared to metal contacts. Future studies on lateral contacts and interconnects using doped TMDCs will pave the way for 2D integrated circuits and flexible electronics.

preprint, 2021, arXiv

Universal quantum state preparation via revised greedy algorithm

Preparation of quantum states lies at the heart of quantum information processing. The greedy algorithm provides a potential method to effectively prepare quantum states. However, the standard greedy algorithm, in general, cannot reach the global maximum and instead becomes stuck at a local maximum. Based on the standard greedy algorithm, in this paper we propose a revised version to design dynamic pulses to realize universal quantum state preparation, i.e., preparing any arbitrary state from another arbitrary one. As applications, we apply this scheme to the universal preparation of single- and two-qubit states in the context of semiconductor quantum dots and superconducting circuits. Evaluation results show that our scheme outperforms the alternative numerical optimizations with higher preparation quality while retaining comparably high efficiency. Compared with emerging machine learning approaches, it is more accessible and does not require any training. Moreover, the numerical results show that the pulse sequences generated by our scheme are robust against various errors and noises. Our scheme opens a new avenue of optimization in few-level-system and limited-action-space quantum control problems.

preprint, 2020, arXiv

$E^3$: Visual Exploration of Spatiotemporal Energy Demand

Understanding demand-side energy behaviour is critical for making efficient responses in energy demand management. We worked closely with energy experts and identified the key elements of the energy demand problem, including temporal and spatial demand and shifts in spatiotemporal demand. To our knowledge, no previous research has investigated the shifts in spatiotemporal demand. To fill this research gap, we propose a unified visual analytics approach to support exploratory demand analysis; we developed $E^3$, a highly interactive tool that supports users in making and verifying hypotheses through human-client-server interactions. A novel potential-flow-based approach was formalized to model shifts in energy demand and integrated into a server-side engine. Experts then evaluated and affirmed the usefulness of this approach through case studies of real-world electricity data. In the future, we will improve the modelling algorithm, enhance visualisation, and expand the process to support more forms of energy data.

preprint, 2020, arXiv

Adversarial Imitation Attack

Deep learning models are known to be vulnerable to adversarial examples. A practical adversarial attack should require as little knowledge of the attacked models as possible. Current substitute attacks need pre-trained models to generate adversarial examples, and their attack success rates heavily rely on the transferability of adversarial examples. Current score-based and decision-based attacks require many queries to the attacked models. In this study, we propose a novel adversarial imitation attack. First, it produces a replica of the attacked model through a two-player game, as in generative adversarial networks (GANs). The objective of the generative model is to generate examples that lead the imitation model to return outputs different from those of the attacked model. The objective of the imitation model is to output the same labels as the attacked model under the same inputs. Then, the adversarial examples generated by the imitation model are utilized to fool the attacked model. Compared with the current substitute attacks, imitation attacks can use less training data to produce a replica of the attacked model and improve the transferability of adversarial examples. Experiments demonstrate that our imitation attack requires less training data than the black-box substitute attacks, but achieves an attack success rate close to the white-box attack on unseen data with no query.

preprint, 2020, arXiv

Analyzing the Noise Robustness of Deep Neural Networks

Adversarial examples, generated by adding small but intentionally imperceptible perturbations to normal examples, can mislead deep neural networks (DNNs) to make incorrect predictions. Although much work has been done on both adversarial attack and defense, a fine-grained understanding of adversarial examples is still lacking. To address this issue, we present a visual analysis method to explain why adversarial examples are misclassified. The key is to compare and analyze the datapaths of both the adversarial and normal examples. A datapath is a group of critical neurons along with their connections. We formulate the datapath extraction as a subset selection problem and solve it by constructing and training a neural network. A multi-level visualization consisting of a network-level visualization of data flows, a layer-level visualization of feature maps, and a neuron-level visualization of learned features, has been designed to help investigate how datapaths of adversarial and normal examples diverge and merge in the prediction process. A quantitative evaluation and a case study were conducted to demonstrate the promise of our method to explain the misclassification of adversarial examples.

preprint, 2020, arXiv

DaST: Data-free Substitute Training for Adversarial Attacks

Machine learning models are vulnerable to adversarial examples. For the black-box setting, current substitute attacks need pre-trained models to generate adversarial examples. However, pre-trained models are hard to obtain in real-world tasks. In this paper, we propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks without the requirement of any real data. To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models. In particular, we design a multi-branch architecture and label-control loss for the generative model to deal with the uneven distribution of synthetic samples. The substitute model is then trained by the synthetic samples generated by the generative model, which are labeled by the attacked model subsequently. The experiments demonstrate the substitute models produced by DaST can achieve competitive performance compared with the baseline models which are trained by the same train set with attacked models. Additionally, to evaluate the practicability of the proposed method on the real-world task, we attack an online machine learning model on the Microsoft Azure platform. The remote model misclassifies 98.35% of the adversarial examples crafted by our method. To the best of our knowledge, we are the first to train a substitute model for adversarial attacks without any real data.

preprint 2020 arXiv

Deep Learning of Accurate Force Field of Ferroelectric HfO$_2$

The discovery of ferroelectricity in HfO$_2$-based thin films opens up new opportunities for using this silicon-compatible ferroelectric to realize low-power logic circuits and high-density non-volatile memories. The functional performance of ferroelectrics is intimately related to their dynamic responses to external stimuli such as electric fields at finite temperatures. Molecular dynamics is an ideal technique for investigating dynamical processes on large length and time scales, though its application to new materials is often hindered by the limited availability and accuracy of classical force fields. Here we present a deep neural network-based interatomic force field of HfO$_2$ learned from ab initio data using a concurrent learning procedure. The model potential is able to predict structural properties such as elastic constants, equations of state, phonon dispersion relations, and phase transition barriers of various hafnia polymorphs with accuracy comparable to that of density functional theory calculations. The validity of this model potential is further confirmed by the reproduction of experimental sequences of temperature-driven ferroelectric-paraelectric phase transitions of HfO$_2$ with isobaric-isothermal ensemble molecular dynamics simulations. We suggest a general approach to extend the model potential of HfO$_2$ to related material systems including dopants and defects.

preprint 2020 arXiv

Diagnostics and Visualization of Point Process Models for Event Times on a Social Network

Point process models have been used to analyze interaction event times on a social network, in the hope of providing valuable insights for social science research. However, the diagnostics and visualization of the modeling results from such an analysis have received limited discussion in the literature. In this paper, we develop a systematic set of diagnostic tools and visualizations for point process models fitted to data from a network setting. We analyze the residual process and Pearson residual on the network by inspecting their structure and clustering structure. Equipped with these tools, we can validate whether a model adequately captures the temporal and/or network structures in the observed data. The utility of our approach is demonstrated using simulation studies and point process models applied to a study of animal social interactions.

preprint 2020 arXiv

MLCVNet: Multi-Level Context VoteNet for 3D Object Detection

In this paper, we address the 3D object detection task by capturing multi-level contextual information with the self-attention mechanism and multi-scale feature fusion. Most existing 3D object detection methods recognize objects individually, without giving any consideration to contextual information between these objects. In contrast, we propose Multi-Level Context VoteNet (MLCVNet) to recognize 3D objects correlatively, building on the state-of-the-art VoteNet. We introduce three context modules into the voting and classifying stages of VoteNet to encode contextual information at different levels. Specifically, a Patch-to-Patch Context (PPC) module is employed to capture contextual information between the point patches, before voting for their corresponding object centroid points. Subsequently, an Object-to-Object Context (OOC) module is incorporated before the proposal and classification stage, to capture the contextual information between object candidates. Finally, a Global Scene Context (GSC) module is designed to learn the global scene context. Together, these modules capture contextual information at the patch, object, and scene levels. Our method is an effective way to improve detection accuracy, achieving new state-of-the-art detection performance on challenging 3D object detection datasets, i.e., SUN RGBD and ScanNet. We also release our code at https://github.com/NUAAXQ/MLCVNet.

preprint 2020 arXiv

Observation of topological polaritons and photonic magic angles in twisted van der Waals bi-layers

Twisted two-dimensional bi-layers offer exquisite control over the electronic bandstructure through the interlayer rotation and coupling, enabling magic-angle flat-band superconductivity and moiré excitons. Here, we demonstrate how analogous principles, combined with large anisotropy, enable extreme control and manipulation of the photonic dispersion of phonon polaritons (PhPs) in van der Waals (vdW) bi-layers. We experimentally observe tunable topological transitions from open (hyperbolic) to closed (elliptic) dispersion contours in twisted bi-layered α-MoO3 at photonic magic angles, induced by polariton hybridization and robustly controlled by a topological quantity. At these transitions the bilayer dispersion flattens, exhibiting low-loss tunable polariton canalization and diffractionless propagation with resolution below λ0/40. Our findings extend twistronics and moiré physics to nanophotonics and polaritonics, with great potential for nano-imaging, nanoscale light propagation, energy transfer and quantum applications.

preprint 2020 arXiv

Open Set Modulation Recognition Based on Dual-Channel LSTM Model

Deep neural networks have achieved great success in computer vision, speech recognition and many other areas. This letter investigates the potential of recurrent neural networks, especially the Long Short-Term Memory (LSTM) network, for open set communication signal modulation recognition. Time-domain sampled signals are first converted to two normalized matrices, which are fed into a four-layer Dual-Channel LSTM network tailored for open set modulation recognition. With two cascaded Dual-Channel LSTM layers, the designed network can automatically learn sequence-correlated features from the raw data. With center loss and the Weibull distribution, the proposed algorithm can recognize partial open set modulations. Experiments on the public RadioML dataset indicate that different analog and digital modulations can be effectively classified by the proposed model, while partial open set modulations can be recognized. Quantitative analysis on the dataset shows that the proposed method achieves an average accuracy of 90.2% at SNRs ranging from 0dB to 18dB in classifying the considered 11 classes, while the accuracy of the open set experiment improves by 14.2%.

preprint 2020 arXiv

ProbaNet: Proposal-balanced Network for Object Detection

Candidate object proposals generated by object detectors based on convolutional neural networks (CNNs) suffer from an imbalance between easy and hard samples, which can affect overall performance. In this study, we propose a Proposal-balanced Network (ProbaNet) for alleviating the imbalance problem. Firstly, ProbaNet increases the probability of choosing hard samples for training by discarding easy samples through threshold truncation. Secondly, ProbaNet emphasizes foreground proposals by increasing their weights. To evaluate the effectiveness of ProbaNet, we train models based on different benchmarks. The mean Average Precision (mAP) of the model using ProbaNet is 1.2% higher than the baseline on PASCAL VOC 2007. Furthermore, it is compatible with existing two-stage detectors and adds only a very small computational cost.

preprint 2020 arXiv

Shonan Rotation Averaging: Global Optimality by Surfing $SO(p)^n$

Shonan Rotation Averaging is a fast, simple, and elegant rotation averaging algorithm that is guaranteed to recover globally optimal solutions under mild assumptions on the measurement noise. Our method employs semidefinite relaxation in order to recover provably globally optimal solutions of the rotation averaging problem. In contrast to prior work, we show how to solve large-scale instances of these relaxations using manifold minimization on (only slightly) higher-dimensional rotation manifolds, re-using existing high-performance (but local) structure-from-motion pipelines. Our method thus preserves the speed and scalability of current SFM methods, while recovering globally optimal solutions.

preprint 2020 arXiv

T-square resistivity without Umklapp scattering in dilute metallic Bi$_2$O$_2$Se

The electrical resistivity of Fermi liquids (FLs) displays a quadratic temperature ($T$) dependence because of electron-electron (e-e) scattering. For such collisions to decay the charge current, there are two known mechanisms: inter-band scattering (identified by Baber) and Umklapp events. However, dilute metallic strontium titanate (STO) was found to display $T^2$ resistivity in the absence of either of these two mechanisms. The presence of soft phonons and their possible role as scattering centers raised the suspicion that $T$-square resistivity in STO is not due to e-e scattering. Here, we present the case of Bi$_2$O$_2$Se, a layered semiconductor with hard phonons, which becomes a dilute metal with a small single-component Fermi surface upon doping. It displays $T$-square resistivity well below the degeneracy temperature, where neither Umklapp nor inter-band scattering is conceivable. We observe a universal scaling between the prefactor of $T^2$ resistivity and the Fermi energy, which is an extension of the Kadowaki-Woods plot to dilute metals. Our results imply the absence of a satisfactory theoretical basis for the ubiquity of e-e driven $T$-square resistivity in Fermi liquids.