Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
43works
0followers
21topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

43 published item(s)

preprint2026arXiv

GIFT: Guided Fine-Tuning and Transfer for Enhancing Instruction-Tuned Language Models

A promising paradigm for adapting instruction-tuned language models is to learn task-specific updates on a pretrained base model and subsequently merge them into the instruction-tuned model. However, existing approaches typically treat the instruction-tuned model as a passive target that is only involved at the final merging stage, without guiding the training process. We propose GIFT (Guided Fine-Tuning and Transfer), a simple and efficient framework that incorporates guidance from the instruction model into task adaptation. GIFT fine-tunes a low-rank adapter on the pretrained base model using confidence signals derived from the instruction-tuned model. The learned adapter is then merged into the instruction-tuned model, yielding task-specialized models that preserve general instruction-following behavior. We evaluate GIFT on mathematical and knowledge-intensive benchmarks across multiple model families and scales. Results show that GIFT consistently outperforms direct fine-tuning and representative transfer-based baselines, while maintaining robust generalization and favorable test-time scaling behavior.

preprint2025arXiv

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference. However, a growing concern lies in their tendency to produce excessively long reasoning traces, which are often filled with redundant content (e.g., repeated definitions), over-analysis of simple problems, and superficial exploration of multiple reasoning paths for harder tasks. This inefficiency introduces significant challenges for training, inference, and real-world deployment (e.g., in agent-based systems), where token economy is critical. In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm. We identify common patterns of inefficiency, examine methods proposed across the LRM lifecycle, i.e., from pretraining to inference, and discuss promising future directions for research. To support ongoing development, we also maintain a real-time GitHub repository tracking recent progress in the field. We hope this survey serves as a foundation for further exploration and inspires innovation in this rapidly evolving area.

preprint2024arXiv

Pulse shape discrimination based on the Tempotron: a powerful classifier on GPU

This study utilized the Tempotron, a robust classifier based on a third-generation neural network model, for pulse shape discrimination. By eliminating the need for manual feature extraction, the Tempotron model can process pulse signals directly, generating discrimination results based on prior knowledge. The study performed experiments using GPU acceleration, resulting in over 500 times faster compared to the CPU-based model, and investigated the impact of noise augmentation on the Tempotron performance. Experimental results substantiated that Tempotron serves as a formidable classifier, adept at accomplishing high discrimination accuracy on both AmBe and time-of-flight PuBe datasets. Furthermore, analyzing the neural activity of Tempotron during training shed light on its learning characteristics and aided in selecting its hyperparameters. Moreover, the study addressed the constraints and potential avenues for future development in utilizing the Tempotron for pulse shape discrimination. The dataset used in this study and the GPU-based Tempotron are publicly available on GitHub at https://github.com/HaoranLiu507/TempotronGPU.

preprint2024arXiv

Random-coupled Neural Network

Improving the efficiency of current neural networks and modeling them in biological neural systems have become popular research directions in recent years. Pulse-coupled neural network (PCNN) is a well applicated model for imitating the computation characteristics of the human brain in computer vision and neural network fields. However, differences between the PCNN and biological neural systems remain: limited neural connection, high computational cost, and lack of stochastic property. In this study, random-coupled neural network (RCNN) is proposed. It overcomes these difficulties in PCNN's neuromorphic computing via a random inactivation process. This process randomly closes some neural connections in the RCNN model, realized by the random inactivation weight matrix of link input. This releases the computational burden of PCNN, making it affordable to achieve vast neural connections. Furthermore, the image and video processing mechanisms of RCNN are researched. It encodes constant stimuli as periodic spike trains and periodic stimuli as chaotic spike trains, the same as biological neural information encoding characteristics. Finally, the RCNN is applicated to image segmentation, fusion, and pulse shape discrimination subtasks. It is demonstrated to be robust, efficient, and highly anti-noised, with outstanding performance in all applications mentioned above.

preprint2023arXiv

Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Diverse video captioning aims to generate a set of sentences to describe the given video in various aspects. Mainstream methods are trained with independent pairs of a video and a caption from its ground-truth set without exploiting the intra-set relationship, resulting in low diversity of generated captions. Different from them, we formulate diverse captioning into a semantic-concept-guided set prediction (SCG-SP) problem by fitting the predicted caption set to the ground-truth set, where the set-level relationship is fully captured. Specifically, our set prediction consists of two synergistic tasks, i.e., caption generation and an auxiliary task of concept combination prediction providing extra semantic supervision. Each caption in the set is attached to a concept combination indicating the primary semantic content of the caption and facilitating element alignment in set prediction. Furthermore, we apply a diversity regularization term on concepts to encourage the model to generate semantically diverse captions with various concept combinations. These two tasks share multiple semantics-specific encodings as input, which are obtained by iterative interaction between visual features and conceptual queries. The correspondence between the generated captions and specific concept combinations further guarantees the interpretability of our model. Extensive experiments on benchmark datasets show that the proposed SCG-SP achieves state-of-the-art (SOTA) performance under both relevance and diversity metrics.

preprint2022arXiv

A Computational Framework of Cortical Microcircuits Approximates Sign-concordant Random Backpropagation

Several recent studies attempt to address the biological implausibility of the well-known backpropagation (BP) method. While promising methods such as feedback alignment, direct feedback alignment, and their variants like sign-concordant feedback alignment tackle BP's weight transport problem, their validity remains controversial owing to a set of other unsolved issues. In this work, we answer the question of whether it is possible to realize random backpropagation solely based on mechanisms observed in neuroscience. We propose a hypothetical framework consisting of a new microcircuit architecture and its supporting Hebbian learning rules. Comprising three types of cells and two types of synaptic connectivity, the proposed microcircuit architecture computes and propagates error signals through local feedback connections and supports the training of multi-layered spiking neural networks with a globally defined spiking error function. We employ the Hebbian rule operating in local compartments to update synaptic weights and achieve supervised learning in a biologically plausible manner. Finally, we interpret the proposed framework from an optimization point of view and show its equivalence to sign-concordant feedback alignment. The proposed framework is benchmarked on several datasets including MNIST and CIFAR10, demonstrating promising BP-comparable accuracy.

preprint2022arXiv

A Roadmap for Big Model

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we summarize clearly the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.

preprint2022arXiv

A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models

Pre-trained language models (PLMs) cannot well recall rich factual knowledge of entities exhibited in large-scale corpora, especially those rare entities. In this paper, we propose to build a simple but effective Pluggable Entity Lookup Table (PELT) on demand by aggregating the entity's output representations of multiple occurrences in the corpora. PELT can be compatibly plugged as inputs to infuse supplemental entity knowledge into PLMs. Compared to previous knowledge-enhanced PLMs, PELT only requires 0.2%-5% pre-computation with capability of acquiring knowledge from out-of-domain corpora for domain adaptation scenario. The experiments on knowledge-related tasks demonstrate that our method, PELT, can flexibly and effectively transfer entity knowledge from related corpora into PLMs with different architectures.

preprint2022arXiv

Anisotropic Dzyaloshinskii-Moriya interaction and topological magnetism in two-dimensional magnets protected by P4-m2 crystal symmetry

As a fundamental magnetic parameter, Dzyaloshinskii-Moriya interaction (DMI), has gained a great deal of attention in the last two decades due to its critical role in formation of magnetic skyrmions. Recent discoveries of two-dimensional (2D) van der Waals (vdW) magnets has also gained a great deal of attention due to appealing physical properties, such as gate tunability, flexibility and miniaturization. Intensive studies have shown that isotropic DMI stabilizes ferromagnetic (FM) topological spin textures in 2D magnets or their corresponding heterostructures. However, the investigation of anisotropic DMI and antiferromagnetic (AFM) topological spin configurations remains elusive. Here, we propose and demonstrate that a family of 2D magnets with P4-m2 symmetry-protected anisotropic DMI. More interestingly, various topological spin configurations, including FM/AFM antiskyrmion and AFM vortex-antivortex pair, emerge in this family. These results give a general method to design anisotropic DMI and pave the way towards topological magnetism in 2D materials using crystal symmetry.

preprint2022arXiv

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a superior speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks. To this end, we propose a BNN framework comprising 1) a minimalistic inference scheme for hardware-friendliness, 2) an over-parameterized training scheme for high accuracy, and 3) a simple procedure to adapt to different vision tasks. The resultant framework overtakes 8-bit quantization in the speed-vs-accuracy tradeoff for classification, detection, segmentation, super-resolution and matching: our BNNs not only retain the accuracy levels of their 8-bit baselines but also showcase 1.3-2.4$\times$ faster FPS on mobile CPUs. Similar conclusions can be drawn for prototypical systolic-array-based AI accelerators, where our BNNs promise 2.8-7$\times$ fewer execution cycles than 8-bit and 2.1-2.7$\times$ fewer cycles than alternative BNN designs. These results suggest that the time for large-scale BNN adoption could be upon us.

preprint2022arXiv

CF-YOLO: Cross Fusion YOLO for Object Detection in Adverse Weather with a High-quality Real Snow Dataset

Snow is one of the toughest adverse weather conditions for object detection (OD). Currently, not only there is a lack of snowy OD datasets to train cutting-edge detectors, but also these detectors have difficulties learning latent information beneficial for detection in snow. To alleviate the two above problems, we first establish a real-world snowy OD dataset, named RSOD. Besides, we develop an unsupervised training strategy with a distinctive activation function, called $Peak \ Act$, to quantitatively evaluate the effect of snow on each object. Peak Act helps grading the images in RSOD into four-difficulty levels. To our knowledge, RSOD is the first quantitatively evaluated and graded snowy OD dataset. Then, we propose a novel Cross Fusion (CF) block to construct a lightweight OD network based on YOLOv5s (call CF-YOLO). CF is a plug-and-play feature aggregation module, which integrates the advantages of Feature Pyramid Network and Path Aggregation Network in a simpler yet more flexible form. Both RSOD and CF lead our CF-YOLO to possess an optimization ability for OD in real-world snow. That is, CF-YOLO can handle unfavorable detection problems of vagueness, distortion and covering of snow. Experiments show that our CF-YOLO achieves better detection results on RSOD, compared to SOTAs. The code and dataset are available at https://github.com/qqding77/CF-YOLO-and-RSOD.

preprint2022arXiv

Deterministic relation between optical polarization and lattice symmetry revealed in ion-doped single microcrystals

Rare-earth ions doped crystals are of great significance for micro-sensing and quantum information, whilst the ions in the crystals emit light with spontaneous partial polarization, which is, though believed to be originated from the crystal lattice structure, still lacking a deterministic explanation that can be tested with quantitative accuracy. We report the experimental evidence showing the profound physical relation between the polarization degree of light emitted by the doped ion and the lattice symmetry, by demonstrating, with unprecedented precision, that the lattice constant ratio c/a directly quantifies the macroscopic effective polar angle of the electric and magnetic dipoles, which essentially determines the linear polarization degree of the emission. Based on this discovery, we further propose a pure optical technology to identify the three-dimensional orientation of a rod-shaped single microcrystal using the polarization-resolved micro-spectroscopy. Our results, revealing the physical origin of light polarization in ion-doped crystals, open the way towards on-demand polarization control with crystallography, and provide a versatile platform for polarization-based microscale sensing in dynamical systems.

preprint2022arXiv

ELLE: Efficient Lifelong Pre-training for Emerging Data

Current pre-trained language models (PLM) are typically trained with static data, ignoring that in real-world scenarios, streaming data of various sources may continuously grow. This requires PLMs to integrate the information from all the sources in a lifelong manner. Although this goal could be achieved by exhaustive pre-training on all the existing data, such a process is known to be computationally expensive. To this end, we propose ELLE, aiming at efficient lifelong pre-training for emerging data. Specifically, ELLE consists of (1) function preserved model expansion, which flexibly expands an existing PLM's width and depth to improve the efficiency of knowledge acquisition; and (2) pre-trained domain prompts, which disentangle the versatile knowledge learned during pre-training and stimulate the proper knowledge for downstream tasks. We experiment ELLE with streaming data from 5 domains on BERT and GPT. The results show the superiority of ELLE over various lifelong learning baselines in both pre-training efficiency and downstream performances. The codes are publicly available at https://github.com/thunlp/ELLE.

preprint2022arXiv

Evidence of Magnon-Mediated Orbital Magnetism in a Quasi-2D Topological Magnon Insulator

We explore spin dynamics in Cu(1,3-bdc), a quasi-2D topological magnon insulator. The results show that the thermal evolution of Landé $g$-factor ($g$) is anisotropic: $g_\textrm{in-plane}$ reduces while $g_\textrm{out-plane}$ increases with increasing temperature $T$. Moreover, the anisotropy of the $g$-factor ($Δg$) and the anisotropy of saturation magnetization ($ΔM_\textrm{s}$) are correlated below 4 K, but they diverge above 4 K. We show that the electronic orbital moment contributes to the $g$ anisotropy at lower $T$, while the topological orbital moment induced by thermally excited spin chirality dictates the $g$ anisotropy at higher $T$. Our work suggests an interplay among topology, spin chirality, and orbital magnetism in Cu(1,3-bdc).

preprint2022arXiv

Fully Hyperbolic Neural Networks

Hyperbolic neural networks have shown great potential for modeling complex data. However, existing hyperbolic networks are not completely hyperbolic, as they encode features in a hyperbolic space yet formalize most of their operations in the tangent space (a Euclidean subspace) at the origin of the hyperbolic space. This hybrid method greatly limits the modeling ability of networks. In this paper, we propose a fully hyperbolic framework to build hyperbolic networks based on the Lorentz model by adapting the Lorentz transformations (including boost and rotation) to formalize essential operations of neural networks. Moreover, we also prove that linear transformation in tangent spaces used by existing hyperbolic networks is a relaxation of the Lorentz rotation and does not include the boost, implicitly limiting the capabilities of existing hyperbolic networks. The experimental results on four NLP tasks show that our method has better performance for building both shallow and deep networks. Our code will be released to facilitate follow-up research.

preprint2022arXiv

GoG: Relation-aware Graph-over-Graph Network for Visual Dialog

Visual dialog, which aims to hold a meaningful conversation with humans about a given image, is a challenging task that requires models to reason the complex dependencies among visual content, dialog history, and current questions. Graph neural networks are recently applied to model the implicit relations between objects in an image or dialog. However, they neglect the importance of 1) coreference relations among dialog history and dependency relations between words for the question representation; and 2) the representation of the image based on the fully represented question. Therefore, we propose a novel relation-aware graph-over-graph network (GoG) for visual dialog. Specifically, GoG consists of three sequential graphs: 1) H-Graph, which aims to capture coreference relations among dialog history; 2) History-aware Q-Graph, which aims to fully understand the question through capturing dependency relations between words based on coreference resolution on the dialog history; and 3) Question-aware I-Graph, which aims to capture the relations between objects in an image based on fully question representation. As an additional feature representation module, we add GoG to the existing visual dialogue model. Experimental results show that our model outperforms the strong baseline in both generative and discriminative settings by a significant margin.

preprint2022arXiv

Growth, Electronic Structure and Superconductivity of Ultrathin Epitaxial CoSi2 Films

We report growth, electronic structure and superconductivity of ultrathin epitaxial CoSi2 films on Si(111). At low coverages, preferred islands with 2, 5 and 6 monolayers height develop, which agrees well with the surface energy calculation. We observe clear quantum well states as a result of electronic confinement and their dispersion agrees well with density functional theory calculations, indicating weak correlation effect despite strong contributions from Co 3d electrons. Ex-situ transport measurements show that superconductivity persists down to at least 10 monolayers, with reduced Tc but largely enhanced upper critical field. Our study opens up the opportunity to study the interplay between quantum confinement, interfacial symmetry breaking and superconductivity in an epitaxial silicide film, which is technologically relevant in microelectronics.

preprint2022arXiv

Knowledge Inheritance for Pre-trained Language Models

Recent explorations of large-scale pre-trained language models (PLMs) have revealed the power of PLMs with huge amounts of parameters, setting off a wave of training ever-larger PLMs. However, it requires tremendous computational resources to train a large-scale PLM, which may be practically unaffordable. In addition, existing large-scale PLMs are mainly trained from scratch individually, ignoring that many well-trained PLMs are available. To this end, we explore the question how could existing PLMs benefit training large-scale PLMs in future. Specifically, we introduce a pre-training framework named "knowledge inheritance" (KI) and explore how could knowledge distillation serve as auxiliary supervision during pre-training to efficiently learn larger PLMs. Experimental results demonstrate the superiority of KI in training efficiency. We also conduct empirical analyses to explore the effects of teacher PLMs' pre-training settings, including model architecture, pre-training data, etc. Finally, we show that KI could be applied to domain adaptation and knowledge transfer.

preprint2022arXiv

LF-VIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras with Negative Plane

Visual-inertial-odometry has attracted extensive attention in the field of autonomous driving and robotics. The size of Field of View (FoV) plays an important role in Visual-Odometry (VO) and Visual-Inertial-Odometry (VIO), as a large FoV enables to perceive a wide range of surrounding scene elements and features. However, when the field of the camera reaches the negative half plane, one cannot simply use [u,v,1]^T to represent the image feature points anymore. To tackle this issue, we propose LF-VIO, a real-time VIO framework for cameras with extremely large FoV. We leverage a three-dimensional vector with unit length to represent feature points, and design a series of algorithms to overcome this challenge. To address the scarcity of panoramic visual odometry datasets with ground-truth location and pose, we present the PALVIO dataset, collected with a Panoramic Annular Lens (PAL) system with an entire FoV of 360°x(40°-120°) and an IMU sensor. With a comprehensive variety of experiments, the proposed LF-VIO is verified on both the established PALVIO benchmark and a public fisheye camera dataset with a FoV of 360°x(0°-93.5°). LF-VIO outperforms state-of-the-art visual-inertial-odometry methods. Our dataset and code are made publicly available at https://github.com/flysoaryun/LF-VIO

preprint2022arXiv

Manual-Guided Dialogue for Flexible Conversational Agents

How to build and use dialogue data efficiently, and how to deploy models in different domains at scale can be two critical issues in building a task-oriented dialogue system. In this paper, we propose a novel manual-guided dialogue scheme to alleviate these problems, where the agent learns the tasks from both dialogue and manuals. The manual is an unstructured textual document that guides the agent in interacting with users and the database during the conversation. Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains. We then contribute a fully-annotated multi-domain dataset MagDial to support our scheme. It introduces three dialogue modeling subtasks: instruction matching, argument filling, and response generation. Modeling these subtasks is consistent with the human agent's behavior patterns. Experiments demonstrate that the manual-guided dialogue scheme improves data efficiency and domain scalability in building dialogue systems. The dataset and benchmark will be publicly available for promoting future research.

preprint2022arXiv

MoEfication: Transformer Feed-forward Layers are Mixtures of Experts

Recent work has shown that feed-forward networks (FFNs) in pre-trained Transformers are a key component, storing various linguistic and factual knowledge. However, the computational patterns of FFNs are still unclear. In this work, we study the computational patterns of FFNs and observe that most inputs only activate a tiny ratio of neurons of FFNs. This phenomenon is similar to the sparsity of the human brain, which drives research on functional partitions of the human brain. To verify whether functional partitions also emerge in FFNs, we propose to convert a model into its MoE version with the same parameters, namely MoEfication. Specifically, MoEfication consists of two phases: (1) splitting the parameters of FFNs into multiple functional partitions as experts, and (2) building expert routers to decide which experts will be used for each input. Experimental results show that MoEfication can conditionally use 10% to 30% of FFN parameters while maintaining over 95% original performance for different models on various downstream tasks. Besides, MoEfication brings two advantages: (1) it significantly reduces the FLOPS of inference, i.e., 2x speedup with 25% of FFN parameters, and (2) it provides a fine-grained perspective to study the inner mechanism of FFNs. The source code of this paper can be obtained from https://github.com/thunlp/MoEfication.

preprint2022arXiv

OPAL: Occlusion Pattern Aware Loss for Unsupervised Light Field Disparity Estimation

Light field disparity estimation is an essential task in computer vision with various applications. Although supervised learning-based methods have achieved both higher accuracy and efficiency than traditional optimization-based methods, the dependency on ground-truth disparity for training limits the overall generalization performance not to say for real-world scenarios where the ground-truth disparity is hard to capture. In this paper, we argue that unsupervised methods can achieve comparable accuracy, but, more importantly, much higher generalization capacity and efficiency than supervised methods. Specifically, we present the Occlusion Pattern Aware Loss, named OPAL, which successfully extracts and encodes the general occlusion patterns inherent in the light field for loss calculation. OPAL enables: i) accurate and robust estimation by effectively handling occlusions without using any ground-truth information for training and ii) much efficient performance by significantly reducing the network parameters required for accurate inference. Besides, a transformer-based network and a refinement module are proposed for achieving even more accurate results. Extensive experiments demonstrate our method not only significantly improves the accuracy compared with the SOTA unsupervised methods, but also possesses strong generalization capacity, even for real-world data, compared with supervised methods. Our code will be made publicly available.

preprint2022arXiv

Packed Levitated Marker for Entity and Relation Extraction

Recent entity and relation extraction works focus on investigating how to obtain a better span representation from the pre-trained encoder. However, a major limitation of existing works is that they ignore the interrelation between spans (pairs). In this work, we propose a novel span representation approach, named Packed Levitated Markers (PL-Marker), to consider the interrelation between the spans (pairs) by strategically packing the markers in the encoder. In particular, we propose a neighborhood-oriented packing strategy, which considers the neighbor spans integrally to better model the entity boundary information. Furthermore, for those more complicated span pair classification tasks, we design a subject-oriented packing strategy, which packs each subject and all its objects to model the interrelation between the same-subject span pairs. The experimental results show that, with the enhanced marker feature, our model advances baselines on six NER benchmarks, and obtains a 4.1%-4.3% strict relation F1 improvement with higher speed over previous state-of-the-art models on ACE04 and ACE05.

preprint2022arXiv

Quantum Interference Transport in two-dimensional Semi-Dirac Semimetals

Semi-Dirac semimetals have received enthusiastic research both theoretically and experimentally in the recent years. Due to the anisotropic dispersion, its physical properties are highly direction-dependent. In this work we employ the Feynman diagrammatic perturbation theory to study the transport properties in quantum diffusive regime. The magneto-conductivity with quantum interference corrections is derived, which demonstrate the weak localization effect in the semi-Dirac semimetal. Furthermore, the origin of anomalous Hall conductivity is also clarified, where both the intrinsic and side-jump contributions vanish and only the skew-scattering gives rise to non-zero transverse conductivity. The conductance fluctuations in both mesoscopic and quantum diffusive regimes are investigated in detail. Our work provides theoretical predictions for transport experiments, which can be examined by conductivity measurements at sufficiently low temperature.

preprint2022arXiv

Rethinking the Promotion Brought by Contrastive Learning to Semi-Supervised Node Classification

Graph Contrastive Learning (GCL) has proven highly effective in promoting the performance of Semi-Supervised Node Classification (SSNC). However, existing GCL methods are generally transferred from other fields like CV or NLP, whose underlying working mechanism remains under-explored. In this work, we first deeply probe the working mechanism of GCL in SSNC, and find that the promotion brought by GCL is severely unevenly distributed: the improvement mainly comes from subgraphs with less annotated information, which is fundamentally different from contrastive learning in other fields. However, existing GCL methods generally ignore this uneven distribution of annotated information and apply GCL evenly to the whole graph. To remedy this issue and further improve GCL in SSNC, we propose the Topology InFormation gain-Aware Graph Contrastive Learning (TIFA-GCL) framework that considers the annotated information distribution across graph in GCL. Extensive experiments on six benchmark graph datasets, including the enormous OGB-Products graph, show that TIFA-GCL can bring a larger improvement than existing GCL methods in both transductive and inductive settings. Further experiments demonstrate the generalizability and interpretability of TIFA-GCL.

preprint2022arXiv

Tunable magnetically induced transparency spectra in magnon-magnon coupled Y3Fe5O12/permalloy bilayers

Hybrid magnonic systems host a variety of characteristic quantum phenomena such as the magnetically-induced transparency (MIT) and Purcell effect, which are considered useful for future coherent quantum information processing. In this work, we experimentally demonstrate a tunable MIT effect in the Y3Fe5O12(YIG)/Permalloy(Py) magnon-magnon coupled system via changing the magnetic field orientations. By probing the magneto-optic effects of Py and YIG, we identify clear features of MIT spectra induced by the mode hybridization between the uniform mode of Py and the perpendicular standing spin-wave modes of YIG. By changing the external magnetic field orientations, we observe a tunable coupling strength between the YIG's spin-wave modes and the Py's uniform mode, upon the application of an out-of-plane magnetic field. This observation is theoretically interpreted by a geometrical consideration of the Py and YIG magnetization under the oblique magnetic field even at a constant interfacial exchange coupling. Our findings show high promise for investigating tunable coherent phenomena with hybrid magnonic platforms.

preprint2021arXiv

Anisotropic Magnon Spin Transport in Ultra-thin Spinel Ferrite Thin Films -- Evidence for Anisotropy in Exchange Stiffness

We report measurements of magnon spin transport in a spinel ferrite, magnesium aluminum ferrite $\mathrm{MgAl_{0.5}Fe_{1.5}O_4}$ (MAFO), which has a substantial in-plane four-fold magnetic anisotropy. We observe spin diffusion lengths $> 0.8$ $\mathrm{μm}$ at room temperature in 6 nm films, with spin diffusion length 30% longer along the easy axes compared to the hard axes. The sign of this difference is opposite to the effects just of anisotropy in the magnetic energy for a uniform magnetic state. We suggest instead that accounting for anisotropy in exchange stiffness is necessary to explain these results.

preprint2021arXiv

Auto-FuzzyJoin: Auto-Program Fuzzy Similarity Joins Without Labeled Examples

Fuzzy similarity join is an important database operator widely used in practice. So far the research community has focused exclusively on optimizing fuzzy join \textit{scalability}. However, practitioners today also struggle to optimize fuzzy-join \textit{quality}, because they face a daunting space of parameters (e.g., distance-functions, distance-thresholds, tokenization-options, etc.), and often have to resort to a manual trial-and-error approach to program these parameters in order to optimize fuzzy-join quality. This key challenge of automatically generating high-quality fuzzy-join programs has received surprisingly little attention thus far. In this work, we study the problem of "auto-program" fuzzy-joins. Leveraging a geometric interpretation of distance-functions, we develop an unsupervised \textsc{Auto-FuzzyJoin} framework that can infer suitable fuzzy-join programs on given input tables, without requiring explicit human input such as labeled training data. Using \textsc{Auto-FuzzyJoin}, users only need to provide two input tables $L$ and $R$, and a desired precision target $τ$ (say 0.9). \textsc{Auto-FuzzyJoin} leverages the fact that one of the input is a reference table to automatically program fuzzy-joins that meet the precision target $τ$ in expectation, while maximizing fuzzy-join recall (defined as the number of correctly joined records). Experiments on both existing benchmarks and a new benchmark with 50 fuzzy-join tasks created from Wikipedia data suggest that the proposed \textsc{Auto-FuzzyJoin} significantly outperforms existing unsupervised approaches, and is surprisingly competitive even against supervised approaches (e.g., Magellan and DeepMatcher) when 50\% of ground-truth labels are used as training data.

preprint2021arXiv

Giant Topological Hall Effect in van der Waals Heterostructures of CrTe2/Bi2Te3

Discoveries of interfacial topological Hall effect (THE) provide an ideal platform for exploring physics arising from the interplay between topology and magnetism. The interfacial topological Hall effect is closely related to the Dzyaloshinskii-Moriya interaction (DMI) at interface and topological spin textures. However, it is difficult to achieve a sizable THE in heterostructures due to the stringent constraints on the constituents of THE heterostructures such as strong spin-orbit coupling (SOC). Here we report the observation of a giant THE signal of 1.39 $μΩ\cdot$cm in the van der Waals heterostructures of CrTe2/Bi2Te3 fabricated by molecular beam epitaxy, a prototype of two-dimensional (2D) ferromagnet (FM)/topological insulator (TI). This large magnitude of THE is attributed to an optimized combination of 2D ferromagnetism in CrTe2, strong SOC in Bi2Te3, and an atomically sharp interface. Our work reveals CrTe2/Bi2Te3 as a convenient platform for achieving large interfacial THE in hybrid systems, which could be utilized to develop quantum science and high-density information storage.

preprint2020arXiv

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

As well known, the huge memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their deployment on edge devices with high efficiency. Model compression has been proposed as a promising technique to improve the running efficiency via parameter and operation reduction. Whereas, this technique is mainly practiced in ANNs rather than SNNs. It is interesting to answer how much an SNN model can be compressed without compromising its functionality, where two challenges should be addressed: i) the accuracy of SNNs is usually sensitive to model compression, which requires an accurate compression methodology; ii) the computation of SNNs is event-driven rather than static, which produces an extra compression dimension on dynamic spikes. To this end, we realize a comprehensive SNN compression through three steps. First, we formulate the connection pruning and weight quantization as a constrained optimization problem. Second, we combine spatio-temporal backpropagation (STBP) and alternating direction method of multipliers (ADMM) to solve the problem with minimum accuracy loss. Third, we further propose activity regularization to reduce the spike events for fewer active operations. These methods can be applied in either a single way for moderate compression or a joint way for aggressive compression. We define several quantitative metrics to evaluation the compression performance for SNNs. Our methodology is validated in pattern recognition tasks over MNIST, N-MNIST, CIFAR10, and CIFAR100 datasets, where extensive comparisons, analyses, and insights are provided. To our best knowledge, this is the first work that studies SNN compression in a comprehensive manner by exploiting all compressible components and achieves better results.

preprint2020arXiv

Evolving the pulmonary nodules diagnosis from classical approaches to deep learning aided decision support: three decades development course and future prospect

Lung cancer is the commonest cause of cancer deaths worldwide, and its mortality can be reduced significantly by performing early diagnosis and screening. Since the 1960s, driven by the pressing needs to accurately and effectively interpret the massive volume of chest images generated daily, computer-assisted diagnosis of pulmonary nodule has opened up new opportunities to relax the limitation from physicians' subjectivity, experiences and fatigue. And the fair access to the reliable and affordable computer-assisted diagnosis will fight the inequalities in incidence and mortality between populations. It has been witnessed that significant and remarkable advances have been achieved since the 1980s, and consistent endeavors have been exerted to deal with the grand challenges on how to accurately detect the pulmonary nodules with high sensitivity at low false-positives rate as well as on how to precisely differentiate between benign and malignant nodules. There is a lack of comprehensive examination of the techniques' development which is evolving the pulmonary nodules diagnosis from classical approaches to machine learning-assisted decision support. The main goal of this investigation is to provide a comprehensive state-of-the-art review of the computer-assisted nodules detection and benign-malignant classification techniques developed over 3 decades, which have evolved from the complicated ad hoc analysis pipeline of conventional approaches to the simplified seamlessly integrated deep learning techniques. This review also identifies challenges and highlights opportunities for future work in learning models, learning algorithms and enhancement schemes for bridging current state to future prospect and satisfying future demand.

preprint2020arXiv

Experimental parameters, combined dynamics, and nonlinearity of a Magnonic-Opto-Electronic Oscillator (MOEO)

We report the construction and characterization of a comprehensive magnonic-opto-electronic oscillator (MOEO) system based on 1550-nm photonics and yttirum iron garnet (YIG) magnonics. The system exhibits a rich and synergistic parameter space because of the ability to control individual photonic, electronic, and magnonic components. Taking advantage of the spin wave dispersion of YIG, the frequency self-generation as well as the related nonlinear processes become sensitive to the external magnetic field. Besides being known as a narrowband filter and a delay element, the YIG delayline possesses spin wave modes that can be controlled to mix with the optoelectronic modes to generate higher-order harmonic beating modes. With the high sensitivity and external tunability, the MOEO system may find usefulness in sensing applications in magnetism and spintronics beyond optoelectronics and photonics.

preprint2020arXiv

FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval

In this paper, we address the text and image matching in cross-modal retrieval of the fashion industry. Different from the matching in the general domain, the fashion matching is required to pay much more attention to the fine-grained information in the fashion images and texts. Pioneer approaches detect the region of interests (i.e., RoIs) from images and use the RoI embeddings as image representations. In general, RoIs tend to represent the "object-level" information in the fashion images, while fashion texts are prone to describe more detailed information, e.g. styles, attributes. RoIs are thus not fine-grained enough for fashion text and image matching. To this end, we propose FashionBERT, which leverages patches as image features. With the pre-trained BERT model as the backbone network, FashionBERT learns high level representations of texts and images. Meanwhile, we propose an adaptive loss to trade off multitask learning in the FashionBERT modeling. Two tasks (i.e., text and image matching and cross-modal retrieval) are incorporated to evaluate FashionBERT. On the public dataset, experiments demonstrate FashionBERT achieves significant improvements in performances than the baseline and state-of-the-art approaches. In practice, FashionBERT is applied in a concrete cross-modal retrieval application. We provide the detailed matching performance and inference efficiency analysis.

preprint2020arXiv

HighwayGraph: Modelling Long-distance Node Relations for Improving General Graph Neural Network

Graph Neural Networks (GNNs) are efficient approaches to process graph-structured data. Modelling long-distance node relations is essential for GNN training and applications. However, conventional GNNs suffer from bad performance in modelling long-distance node relations due to limited-layer information propagation. Existing studies focus on building deep GNN architectures, which face the over-smoothing issue and cannot model node relations in particularly long distance. To address this issue, we propose to model long-distance node relations by simply relying on shallow GNN architectures with two solutions: (1) Implicitly modelling by learning to predict node pair relations (2) Explicitly modelling by adding edges between nodes that potentially have the same label. To combine our two solutions, we propose a model-agnostic training framework named HighwayGraph, which overcomes the challenge of insufficient labeled nodes by sampling node pairs from the training set and adopting the self-training method. Extensive experimental results show that our HighwayGraph achieves consistent and significant improvements over four representative GNNs on three benchmark datasets.

preprint2020arXiv

Hybrid vector beams with non-uniform orbital angular momentum density induced by designed azimuthal polarization gradient

Based on angular amplitude modulation of orthogonal base vectors in common-path interference method, we propose an interesting type of hybrid vector beams with unprecedented azimuthal polarization gradient and demonstrate in experiment. Distinct to previously reported types, the synthetic hybrid vector beams exhibit geometrically intriguing projection tracks of angular polarization state on Poincare sphere, more than just conventional circles. More noteworthily, the designed azimuthal polarization gradients are found to be able to induce azimuthally non-uniform orbital angular momentum density, while generally uniform for circle-track cases, immersing in homogenous intensity background whatever base states are. Moreover, via tailoring relevant parameters, more special polarization mapping tracks can be handily achieved. These peculiar features may open alternative routes for new optical effects and applications.

preprint2020arXiv

Implement Liquid Democracy on Ethereum: A Fast Algorithm for Realtime Self-tally Voting System

We study the liquid democracy problem, where each voter can either directly vote to a candidate or delegate his voting power to a proxy. We consider the implementation of liquid democracy on the blockchain through Ethereum smart contract and to be compatible with the realtime self-tallying property, where the contract itself can record ballots and update voting status upon receiving each voting massage. A challenge comes due to the gas fee limitation of Ethereum mainnet, that the number of instruction for processing a voting massage can not exceed a certain amount, which restrict the application scenario with respect to algorithms whose time complexity is linear to the number of voters. We propose a fast algorithm to overcome the challenge, such that i) shifts the on-chain initialization to off-chain and ii) the on-chain complexity for processing each voting massage is O(\log n), where n is the number of voters.

preprint2020arXiv

Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation

Non-autoregressive neural machine translation (NAT) predicts the entire target sequence simultaneously and significantly accelerates inference process. However, NAT discards the dependency information in a sentence, and thus inevitably suffers from the multi-modality problem: the target tokens may be provided by different possible translations, often causing token repetitions or missing. To alleviate this problem, we propose a novel semi-autoregressive model RecoverSAT in this work, which generates a translation as a sequence of segments. The segments are generated simultaneously while each segment is predicted token-by-token. By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors. Experimental results on three widely-used benchmark datasets show that our proposed model achieves more than 4$\times$ speedup while maintaining comparable performance compared with the corresponding autoregressive model.

preprint2020arXiv

Making Robots Draw A Vivid Portrait In Two Minutes

Significant progress has been made with artistic robots. However, existing robots fail to produce high-quality portraits in a short time. In this work, we present a drawing robot, which can automatically transfer a facial picture to a vivid portrait, and then draw it on paper within two minutes averagely. At the heart of our system is a novel portrait synthesis algorithm based on deep learning. Innovatively, we employ a self-consistency loss, which makes the algorithm capable of generating continuous and smooth brush-strokes. Besides, we propose a componential sparsity constraint to reduce the number of brush-strokes over insignificant areas. We also implement a local sketch synthesis algorithm, and several pre- and post-processing techniques to deal with the background and details. The portrait produced by our algorithm successfully captures individual characteristics by using a sparse set of continuous brush-strokes. Finally, the portrait is converted to a sequence of trajectories and reproduced by a 3-degree-of-freedom robotic arm. The whole portrait drawing robotic system is named AiSketcher. Extensive experiments show that AiSketcher can produce considerably high-quality sketches for a wide range of pictures, including faces in-the-wild and universal images of arbitrary content. To our best knowledge, AiSketcher is the first portrait drawing robot that uses neural style transfer techniques. AiSketcher has attended a quite number of exhibitions and shown remarkable performance under diverse circumstances.

preprint2020arXiv

Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions

Machine learning (ML) applications have been thriving recently, largely attributed to the increasing availability of data. However, inconsistency and incomplete information are ubiquitous in real-world datasets, and their impact on ML applications remains elusive. In this paper, we present a formal study of this impact by extending the notion of Certain Answers for Codd tables, which has been explored by the database research community for decades, into the field of machine learning. Specifically, we focus on classification problems and propose the notion of "Certain Predictions" (CP) -- a test data example can be certainly predicted (CP'ed) if all possible classifiers trained on top of all possible worlds induced by the incompleteness of data would yield the same prediction. We study two fundamental CP queries: (Q1) checking query that determines whether a data example can be CP'ed; and (Q2) counting query that computes the number of classifiers that support a particular prediction (i.e., label). Given that general solutions to CP queries are, not surprisingly, hard without assumption over the type of classifier, we further present a case study in the context of nearest neighbor (NN) classifiers, where efficient solutions to CP queries can be developed -- we show that it is possible to answer both queries in linear or polynomial time over exponentially many possible worlds. We demonstrate one example use case of CP in the important application of "data cleaning for machine learning (DC for ML)." We show that our proposed CPClean approach built based on CP can often significantly outperform existing techniques in terms of classification accuracy with mild manual cleaning effort.

preprint2020arXiv

Ring-Frustrated Non-Hermitian $XY$ Model

We study a non-Hermitian version of XY closed chain with odd number of lattice sites. We consider both anti-ferromagnetic coupling and also a symmetric non-collinear spin coupling. It is found that the energy spectrum is real in certain region of the parameter space. In contrast to previous non-Hermitian models, the ground state is a state with one mode occupied inside this real energy spectrum region, instead of the artificially identified vacuum state. At the same time, there appears a gapless excitation, which is made by kink like spin configurations. It is also found that this kink phase has non-trivial topological invariant.

preprint2020arXiv

SpecuSym: Speculative Symbolic Execution for Cache Timing Leak Detection

CPU cache is a limited but crucial storage component in modern processors, whereas the cache timing side-channel may inadvertently leak information through the physically measurable timing variance. Speculative execution, an essential processor optimization, and a source of such variances, can cause severe detriment on deliberate branch mispredictions. Despite static analysis could qualitatively verify the timing-leakage-free property under speculative execution, it is incapable of producing endorsements including inputs and speculated flows to diagnose leaks in depth. This work proposes a new symbolic execution based method, SpecuSym, for precisely detecting cache timing leaks introduced by speculative execution. Given a program (leakage-free in non-speculative execution), SpecuSymsystematically explores the program state space, models speculative behavior at conditional branches, and accumulates the cache side effects along with subsequent path explorations. During the dynamic execution, SpecuSymconstructs leak predicates for memory visits according to the specified cache model and conducts a constraint-solving based cache behavior analysis to inspect the new cache behaviors. We have implementedSpecuSymatop KLEE and evaluated it against 15 open-source benchmarks. Experimental results show thatSpecuSymsuccessfully detected from 2 to 61 leaks in 6 programs under 3 different cache settings and identified false positives in 2 programs reported by recent work.

preprint2019arXiv

Boosting Throughput and Efficiency of Hardware Spiking Neural Accelerators using Time Compression Supporting Multiple Spike Codes

Spiking neural networks (SNNs) are the third generation of neural networks and can explore both rate and temporal coding for energy-efficient event-driven computation. However, the decision accuracy of existing SNN designs is contingent upon processing a large number of spikes over a long period. Nevertheless, the switching power of SNN hardware accelerators is proportional to the number of spikes processed while the length of spike trains limits throughput and static power efficiency. This paper presents the first study on developing temporal compression to significantly boost throughput and reduce energy dissipation of digital hardware SNN accelerators while being applicable to multiple spike codes. The proposed compression architectures consist of low-cost input spike compression units, novel input-and-output-weighted spiking neurons, and reconfigurable time constant scaling to support large and flexible time compression ratios. Our compression architectures can be transparently applied to any given pre-designed SNNs employing either rate or temporal codes while incurring minimal modification of the neural models, learning algorithms, and hardware design. Using spiking speech and image recognition datasets, we demonstrate the feasibility of supporting large time compression ratios of up to 16x, delivering up to 15.93x, 13.88x, and 86.21x improvements in throughput, energy dissipation, the tradeoffs between hardware area, runtime, energy, and classification accuracy, respectively based on different spike codes on a Xilinx Zynq-7000 FPGA. These results are achieved while incurring little extra hardware overhead.

preprint2019arXiv

Spike-Train Level Backpropagation for Training Deep Recurrent Spiking Neural Networks

Spiking neural networks (SNNs) well support spatiotemporal learning and energy-efficient event-driven hardware neuromorphic processors. As an important class of SNNs, recurrent spiking neural networks (RSNNs) possess great computational power. However, the practical application of RSNNs is severely limited by challenges in training. Biologically-inspired unsupervised learning has limited capability in boosting the performance of RSNNs. On the other hand, existing backpropagation (BP) methods suffer from high complexity of unrolling in time, vanishing and exploding gradients, and approximate differentiation of discontinuous spiking activities when applied to RSNNs. To enable supervised training of RSNNs under a well-defined loss function, we present a novel Spike-Train level RSNNs Backpropagation (ST-RSBP) algorithm for training deep RSNNs. The proposed ST-RSBP directly computes the gradient of a rated-coded loss function defined at the output layer of the network w.r.t tunable parameters. The scalability of ST-RSBP is achieved by the proposed spike-train level computation during which temporal effects of the SNN is captured in both the forward and backward pass of BP. Our ST-RSBP algorithm can be broadly applied to RSNNs with a single recurrent layer or deep RSNNs with multiple feed-forward and recurrent layers. Based upon challenging speech and image datasets including TI46, N-TIDIGITS, Fashion-MNIST and MNIST, ST-RSBP is able to train RSNNs with an accuracy surpassing that of the current state-of-art SNN BP algorithms and conventional non-spiking deep learning models.