Source author record

Bo Yuan

Bo Yuan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

48 works

28 topics

4 close collaborators

preprint 2026 arXiv

TableVista: Benchmarking Multimodal Table Reasoning under Visual and Structural Complexity

We introduce TableVista, a comprehensive benchmark for evaluating foundation models in multimodal table reasoning under visual and structural complexity. TableVista consists of 3,000 high-quality table reasoning problems, where each instance is expanded into 10 distinct visual variants through our multi-style rendering and transformation pipeline. This process encompasses diverse scenario styles, robustness perturbations, and vision-only configurations, culminating in 30,000 multimodal samples for a multi-dimensional evaluation. We conduct an extensive evaluation of 29 state-of-the-art open-source and proprietary foundation models on TableVista. Through comprehensive quantitative and qualitative analysis, we find that while evaluated models remain largely stable across diverse rendering styles, they exhibit pronounced performance degradation on complex structural layouts and vision-only settings, revealing that current models struggle to maintain reasoning consistency when structural complexity combines with visually integrated presentations. These findings highlight critical gaps in current multimodal capabilities, providing insights for advancing more robust and reliable table understanding models.

preprint 2023 arXiv

GUAP: Graph Universal Attack Through Adversarial Patching

Graph neural networks (GNNs) are a class of effective deep learning models for node classification tasks; yet their predictive capability may be severely compromised under adversarially designed unnoticeable perturbations to the graph structure and/or node data. Most of the current work on graph adversarial attacks aims at lowering the overall prediction accuracy, but we argue that the resulting abnormal model performance may easily attract attention and invite a quick counterattack. Moreover, attacks through modification of existing graph data may be hard to conduct if good security protocols are implemented. In this work, we consider an attack that is easier to conduct and harder to notice: adversarially patching the graph with new nodes and edges. The attack is universal: it targets a single node each time and flips its connection to the same set of patch nodes. The attack is unnoticeable: it does not modify the predictions of nodes other than the target. We develop an algorithm, named GUAP, that achieves a high attack success rate while preserving the prediction accuracy. GUAP is fast to train by employing a sampling strategy. We demonstrate that a 5% sampling in each epoch yields a 20x speedup in training, with only a slight degradation in attack performance. Additionally, we show that the adversarial patch trained with the graph convolutional network transfers well to other GNNs, such as the graph attention network.

preprint 2023 arXiv

TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition

Tucker decomposition is one of the SOTA CNN model compression techniques. However, unlike the FLOPs reduction, we observe very limited inference time reduction with Tucker-compressed models using existing GPU software such as cuDNN. To this end, we propose an efficient end-to-end framework that can generate highly accurate and compact CNN models via Tucker decomposition and optimized inference code on GPUs. Specifically, we propose an ADMM-based training algorithm that can achieve highly accurate Tucker-format models. We also develop a high-performance kernel for Tucker-format convolutions and analytical performance models to guide the selection of execution parameters. We further propose a co-design framework to determine the proper Tucker ranks driven by practical inference time (rather than FLOPs). Our evaluation on five modern CNNs with A100 demonstrates that our compressed models with our optimized code achieve up to 2.21X speedup over cuDNN, 1.12X speedup over TVM, and 3.27X over the original models using cuDNN with at most 0.05% accuracy loss.

preprint 2022 arXiv

Birds of A Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation

Unsupervised domain adaptation (UDA) aims to enhance the generalization capability of a model from a source domain to a target domain. Present UDA models focus on alleviating the domain shift by minimizing the feature discrepancy between the source domain and the target domain but usually ignore the class confusion problem. In this work, we propose an Inter-class Separation and Intra-class Aggregation (ISIA) mechanism. It encourages cross-domain representation consistency within the same category and differentiation among different categories. In this way, the features belonging to the same categories are aligned together and the confusable categories are separated. By measuring the alignment complexity of each category, we design an Adaptive-weighted Instance Matching (AIM) strategy to further optimize the instance-level adaptation. Based on our proposed methods, we also propose a hierarchical unsupervised domain adaptation framework for the cross-domain semantic segmentation task. By performing image-level, feature-level, category-level and instance-level alignment, our method achieves stronger generalization of the model from the source domain to the target domain. In two typical cross-domain semantic segmentation tasks, i.e., GTA5 to Cityscapes and SYNTHIA to Cityscapes, our method achieves state-of-the-art segmentation accuracy. We also build two cross-domain semantic segmentation datasets based on publicly available data, i.e., remote sensing building segmentation and road segmentation, for domain adaptive segmentation.

preprint 2022 arXiv

Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation

The powerful learning ability of deep neural networks enables reinforcement learning agents to learn competent control policies directly from continuous environments. In theory, to achieve stable performance, neural networks assume i.i.d. inputs, which unfortunately does not hold in the general reinforcement learning paradigm where the training data is temporally correlated and non-stationary. This issue may lead to the phenomenon of "catastrophic interference" and a collapse in performance. In this paper, we present IQ, i.e., interference-aware deep Q-learning, to mitigate catastrophic interference in single-task deep reinforcement learning. Specifically, we resort to online clustering to achieve on-the-fly context division, together with a multi-head network and a knowledge distillation regularization term for preserving the policy of learned contexts. Built upon deep Q networks, IQ consistently boosts the stability and performance when compared to existing methods, verified with extensive experiments on classic control and Atari tasks. The code is publicly available at: https://github.com/Sweety-dm/Interference-aware-Deep-Q-learning.

preprint 2022 arXiv

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

Filter pruning has been widely used for neural network compression because it enables practical acceleration. To date, most of the existing filter pruning works explore the importance of filters using intra-channel information. In this paper, starting from an inter-channel perspective, we propose to perform efficient filter pruning using Channel Independence, a metric that measures the correlations among different feature maps. A less independent feature map is interpreted as containing less useful information/knowledge, and hence its corresponding filter can be pruned without affecting model capacity. We systematically investigate the quantification metric, measuring scheme and sensitiveness/reliability of channel independence in the context of filter pruning. Our evaluation results for different models on various datasets show the superior performance of our approach. Notably, on the CIFAR-10 dataset our solution can bring 0.90% and 0.94% accuracy increases over the baseline ResNet-56 and ResNet-110 models, respectively, while the model size and FLOPs are reduced by 42.8% and 47.4% (for ResNet-56) and 48.3% and 52.1% (for ResNet-110), respectively. On the ImageNet dataset, our approach can achieve 40.8% and 44.8% storage and computation reductions, respectively, with a 0.15% accuracy increase over the baseline ResNet-50 model. The code is available at https://github.com/Eclipsess/CHIP_NeurIPS2021.

preprint 2022 arXiv

Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning

One of the key challenges in visual Reinforcement Learning (RL) is to learn policies that can generalize to unseen environments. Recently, data augmentation techniques aiming at enhancing data diversity have proven effective in improving the generalization ability of learned policies. However, due to the sensitivity of RL training, naively applying data augmentation, which transforms each pixel in a task-agnostic manner, may suffer from instability and damage the sample efficiency, thus further degrading generalization performance. At the heart of this phenomenon is the diverged action distribution and high-variance value estimation in the face of augmented images. To alleviate this issue, we propose Task-aware Lipschitz Data Augmentation (TLDA) for visual RL, which explicitly identifies the task-correlated pixels with large Lipschitz constants, and only augments the task-irrelevant pixels. To verify the effectiveness of TLDA, we conduct extensive experiments on DeepMind Control suite, CARLA and DeepMind Manipulation tasks, showing that TLDA improves both sample efficiency in training time and generalization in test time. It outperforms previous state-of-the-art methods across the 3 different visual control benchmarks.

preprint 2022 arXiv

Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism

The dynamic job-shop scheduling problem (DJSP) is a class of scheduling tasks that specifically consider the inherent uncertainties such as changing order requirements and possible machine breakdowns in realistic smart manufacturing settings. Since traditional methods cannot dynamically generate effective scheduling strategies in the face of environmental disturbances, we formulate the DJSP as a Markov decision process (MDP) to be tackled by reinforcement learning (RL). For this purpose, we propose a flexible hybrid framework that takes disjunctive graphs as states and a set of general dispatching rules as the action space with minimum prior domain knowledge. The attention mechanism is used as the graph representation learning (GRL) module for the feature extraction of states, and the double dueling deep Q-network with prioritized replay and noisy networks (D3QPN) is employed to map each state to the most appropriate dispatching rule. Furthermore, we present Gymjsp, a public benchmark based on the well-known OR-Library, to provide a standardized off-the-shelf facility for RL and DJSP research communities. Comprehensive experiments on various DJSP instances confirm that our proposed framework is superior to baseline algorithms with smaller makespan across all instances and provide empirical justification for the validity of the various components in the hybrid framework.

preprint 2022 arXiv

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

Safe reinforcement learning aims to learn the optimal policy while satisfying safety constraints, which is essential in real-world applications. However, current algorithms still struggle for efficient policy updates with hard constraint satisfaction. In this paper, we propose Penalized Proximal Policy Optimization (P3O), which solves the cumbersome constrained policy iteration via a single minimization of an equivalent unconstrained problem. Specifically, P3O utilizes a simple-yet-effective penalty function to eliminate cost constraints and removes the trust-region constraint by the clipped surrogate objective. We theoretically prove the exactness of the proposed method with a finite penalty factor and provide a worst-case analysis for approximate error when evaluated on sample trajectories. Moreover, we extend P3O to more challenging multi-constraint and multi-agent scenarios which are less studied in previous work. Extensive experiments show that P3O outperforms state-of-the-art algorithms with respect to both reward improvement and constraint satisfaction on a set of constrained locomotive tasks.

preprint 2022 arXiv

RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN

Recently, backdoor attacks have become an emerging threat to the security of deep neural network (DNN) models. To date, most of the existing studies focus on backdoor attacks against the uncompressed model, while the vulnerability of compressed DNNs, which are widely used in practical applications, has been little explored. In this paper, we propose to study and develop Robust and Imperceptible Backdoor Attack against Compact DNN models (RIBAC). By performing systematic analysis and exploration on the important design knobs, we propose a framework that can learn the proper trigger patterns, model parameters and pruning masks in an efficient way, thereby achieving high trigger stealthiness, high attack success rate and high model efficiency simultaneously. Extensive evaluations across different datasets, including tests against state-of-the-art defense mechanisms, demonstrate the high robustness, stealthiness and model efficiency of RIBAC. Code is available at https://github.com/huyvnphan/ECCV2022-RIBAC.

preprint 2022 arXiv

Robot Motion Planning as Video Prediction: A Spatio-Temporal Neural Network-based Motion Planner

Neural network (NN)-based methods have emerged as an attractive approach for robot motion planning due to the strong learning capabilities of NN models and their inherently high parallelism. Despite the current development in this direction, the efficient capture and processing of important sequential and spatial information, in a direct and simultaneous way, is still relatively under-explored. To overcome the challenge and unlock the potential of neural networks for motion planning tasks, in this paper, we propose STP-Net, an end-to-end learning framework that can fully extract and leverage important spatio-temporal information to form an efficient neural motion planner. By interpreting the movement of the robot as a video clip, robot motion planning is transformed to a video prediction task that can be performed by STP-Net in both spatially and temporally efficient ways. Empirical evaluations across different seen and unseen environments show that, with nearly 100% accuracy (i.e., success rate), STP-Net demonstrates very promising performance with respect to both planning speed and path cost. Compared with existing NN-based motion planners, STP-Net achieves at least 5x, 2.6x and 1.8x faster speed with lower path cost on 2D Random Forest, 2D Maze and 3D Random Forest environments, respectively. Furthermore, STP-Net can quickly and simultaneously compute multiple near-optimal paths in multi-robot motion planning tasks.

preprint 2022 arXiv

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

Safe reinforcement learning (RL) has achieved significant success on risk-sensitive tasks and shown promise in autonomous driving (AD) as well. Considering the distinctiveness of this community, efficient and reproducible baselines are still lacking for safe AD. In this paper, we release SafeRL-Kit to benchmark safe RL methods for AD-oriented tasks. Concretely, SafeRL-Kit contains several recent algorithms specific to zero-constraint-violation tasks, including Safety Layer, Recovery RL, off-policy Lagrangian method, and Feasible Actor-Critic. In addition to existing approaches, we propose a novel first-order method named Exact Penalty Optimization (EPO) and sufficiently demonstrate its capability in safe AD. All algorithms in SafeRL-Kit are implemented (i) under the off-policy setting, which improves sample efficiency and can better leverage past logs; (ii) with a unified learning framework, providing off-the-shelf interfaces for researchers to incorporate their domain-specific knowledge into fundamental safe RL methods. Finally, we conduct a comparative evaluation of the above algorithms in SafeRL-Kit and shed light on their efficacy for safe autonomous driving. The source code is available at https://github.com/zlr20/saferl_kit.

preprint 2022 arXiv

The least-used key selection method for information retrieval in large-scale Cloud-based service repositories

The significant increase in the number of devices connected to the Internet of Things (IoT) leads to exponential growth in the number of services that need to be processed and stored in large-scale Cloud-based service repositories. An efficient service indexing model is critical for service retrieval and management of large-scale Cloud-based service repositories. The multilevel index model has been the state-of-the-art service indexing model in recent years for improving service discovery and combination. This paper aims to optimize the model to consider the impact of the unequal appearing probability of service retrieval request parameters and service input parameters on service retrieval and service addition operations. The least-used key selection method is proposed to narrow the search scope of service retrieval and reduce its time. The experimental results show that the proposed least-used key selection method improves the service retrieval efficiency significantly compared with the designated key selection method in the case of the unequal appearing probability of parameters in service retrieval requests under three indexing models.

preprint 2021 arXiv

Clarinet: A One-step Approach Towards Budget-friendly Unsupervised Domain Adaptation

In unsupervised domain adaptation (UDA), classifiers for the target domain are trained with massive true-label data from the source domain and unlabeled data from the target domain. However, it may be difficult to collect fully-true-label data in a source domain given a limited budget. To mitigate this problem, we consider a novel problem setting, named budget-friendly UDA (BFUDA), where the classifier for the target domain has to be trained with complementary-label data from the source domain and unlabeled data from the target domain. The key benefit is that it is much less costly to collect complementary-label source data (required by BFUDA) than to collect the true-label source data (required by ordinary UDA). To this end, the complementary label adversarial network (CLARINET) is proposed to solve the BFUDA problem. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of the source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines.

preprint 2021 arXiv

Doubly Residual Neural Decoder: Towards Low-Complexity High-Performance Channel Decoding

Recently, deep neural networks have been successfully applied in channel coding to improve the decoding performance. However, the state-of-the-art neural channel decoders cannot achieve high decoding performance and low complexity simultaneously. To overcome this challenge, in this paper we propose the doubly residual neural (DRN) decoder. By integrating both the residual input and residual learning into the design of the neural channel decoder, DRN enables significant decoding performance improvement while maintaining low complexity. Extensive experimental results show that on different types of channel codes, our DRN decoder consistently outperforms the state-of-the-art decoders in terms of decoding performance, model size and computational cost.

preprint 2021 arXiv

Enabling Fast and Universal Audio Adversarial Attack Using Generative Model

Recently, the vulnerability of DNN-based audio systems to adversarial attacks has attracted increasing attention. However, the existing audio adversarial attacks assume that the adversary possesses the user's entire audio input and has a sufficient time budget to generate the adversarial perturbations. These idealized assumptions, however, make the existing audio adversarial attacks mostly impossible to launch in a timely fashion in practice (e.g., playing unnoticeable adversarial perturbations along with the user's streaming input). To overcome these limitations, in this paper we propose the fast audio adversarial perturbation generator (FAPG), which uses a generative model to generate adversarial perturbations for the audio input in a single forward pass, thereby drastically improving the perturbation generation speed. Built on top of FAPG, we further propose the universal audio adversarial perturbation generator (UAPG), a scheme for crafting a universal adversarial perturbation that can be imposed on arbitrary benign audio input to cause misclassification. Extensive experiments show that our proposed FAPG can achieve up to 167X speedup over the state-of-the-art audio adversarial attack methods. Also, our proposed UAPG can generate a universal adversarial perturbation that achieves much better attack performance than the state-of-the-art solutions.

preprint 2021 arXiv

NVAE-GAN Based Approach for Unsupervised Time Series Anomaly Detection

In recent studies, a lot of work has been done to solve time series anomaly detection by applying Variational Auto-Encoders (VAEs). Time series anomaly detection is a very common but challenging task in many industries, and it plays an important role in network monitoring, facility maintenance, information security, and so on. However, it is very difficult to detect anomalies in time series with high accuracy, due to noisy data collected from the real world and complicated abnormal patterns. Inspired by Nouveau VAE (NVAE), we propose our anomaly detection model: Time series to Image VAE (T2IVAE), an unsupervised model based on NVAE for univariate series, which transforms a 1D time series into a 2D image as input and adopts the reconstruction error to detect anomalies. In addition, we apply Generative Adversarial Network-based techniques to the T2IVAE training strategy, aiming to reduce overfitting. We evaluate our model's performance on three datasets and compare it with several other popular models using the F1 score. T2IVAE achieves 0.639 on the Numenta Anomaly Benchmark, 0.651 on a public dataset from NASA, and 0.504 on our dataset collected from a real-world scenario, outperforming the other compared models.

preprint 2020 arXiv

Bridging the Theoretical Bound and Deep Algorithms for Open Set Domain Adaptation

In the unsupervised open set domain adaptation (UOSDA), the target domain contains unknown classes that are not observed in the source domain. Researchers in this area aim to train a classifier to accurately: 1) recognize unknown target data (data with unknown classes) and, 2) classify other target data. To achieve this aim, a previous study has proven an upper bound of the target-domain risk, and the open set difference, as an important term in the upper bound, is used to measure the risk on unknown target data. By minimizing the upper bound, a shallow classifier can be trained to achieve the aim. However, if the classifier is very flexible (e.g., deep neural networks (DNNs)), the open set difference will converge to a negative value when minimizing the upper bound, which causes an issue where most target data are recognized as unknown data. To address this issue, we propose a new upper bound of target-domain risk for UOSDA, which includes four terms: source-domain risk, ε-open set difference (Δ_ε), a distributional discrepancy between domains, and a constant. Compared to the open set difference, Δ_ε is more robust against the issue when it is being minimized, and thus we are able to use very flexible classifiers (i.e., DNNs). Then, we propose a new principle-guided deep UOSDA method that trains DNNs via minimizing the new upper bound. Specifically, source-domain risk and Δ_ε are minimized by gradient descent, and the distributional discrepancy is minimized via a novel open-set conditional adversarial training strategy. Finally, compared to existing shallow and deep UOSDA methods, our method shows the state-of-the-art performance on several benchmark datasets, including digit recognition (MNIST, SVHN, USPS), object recognition (Office-31, Office-Home), and face recognition (PIE).

preprint 2020 arXiv

Classical Simulation of Quantum Supremacy Circuits

It is believed that random quantum circuits are difficult to simulate classically. These have been used to demonstrate quantum supremacy: the execution of a computational task on a quantum computer that is infeasible for any classical computer. The task underlying the assertion of quantum supremacy by Arute et al. (Nature, 574, 505--510 (2019)) was initially estimated to require Summit, the world's most powerful supercomputer today, approximately 10,000 years. The same task was performed on the Sycamore quantum processor in only 200 seconds. In this work, we present a tensor network-based classical simulation algorithm. Using a Summit-comparable cluster, we estimate that our simulator can perform this task in less than 20 days. On moderately-sized instances, we reduce the runtime from years to minutes, running several times faster than Sycamore itself. These estimates are based on explicit simulations of parallel subtasks, and leave no room for hidden costs. The simulator's key ingredient is identifying and optimizing the "stem" of the computation: a sequence of pairwise tensor contractions that dominates the computational cost. This orders-of-magnitude reduction in classical simulation time, together with proposals for further significant improvements, indicates that achieving quantum supremacy may require a period of continuing quantum hardware developments without an unequivocal first demonstration.

preprint 2020 arXiv

Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition

Recurrent Neural Networks (RNNs) have been widely used in sequence analysis and modeling. However, when processing high-dimensional data, RNNs typically require very large model sizes, thereby bringing a series of deployment challenges. Although the state-of-the-art tensor decomposition approaches can provide good model compression performance, these existing methods still suffer from some inherent limitations, such as restricted representation capability and insufficient model complexity reduction. To overcome these limitations, in this paper we propose to develop compact RNN models using Hierarchical Tucker (HT) decomposition. HT decomposition brings a strong hierarchical structure to the decomposed RNN models, which is very useful and important for enhancing the representation capability. Meanwhile, HT decomposition provides higher storage and computational cost reduction than the existing tensor decomposition approaches for RNN compression. Our experimental results show that, compared with the state-of-the-art compressed RNN models, such as TT-LSTM, TR-LSTM and BT-LSTM, our proposed HT-based LSTM (HT-LSTM) consistently achieves simultaneous and significant increases in both compression ratio and test accuracy on different datasets.

preprint 2020 arXiv

Embedding Compression with Isotropic Iterative Quantization

Continuous representation of words is a standard component in deep learning-based NLP models. However, representing a large vocabulary requires significant memory, which can cause problems, particularly on resource-constrained platforms. Therefore, in this paper we propose an isotropic iterative quantization (IIQ) approach for compressing embedding vectors into binary ones, leveraging the iterative quantization technique well established for image retrieval, while satisfying the desired isotropic property of PMI based models. Experiments with pre-trained embeddings (i.e., GloVe and HDC) demonstrate a more than thirty-fold compression ratio with comparable and sometimes even improved performance over the original real-valued embedding vectors.
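
The abstract builds on the iterative quantization technique from image retrieval. As a rough illustration only, the following is a minimal sketch of plain iterative quantization (not the isotropic variant proposed in the paper, whose details the abstract does not give). It alternates between taking the sign of the rotated embeddings and solving an orthogonal Procrustes problem for the rotation. The function name `itq_binarize`, the parameters, and the choice to keep the full dimensionality (no PCA step) are assumptions made for this example.

```python
import numpy as np

def itq_binarize(X, n_iter=20, seed=0):
    """Binarize the rows of X with plain iterative quantization (sketch).

    Assumption: the dimensionality is kept as-is, with no PCA reduction.
    Returns boolean codes (one bit per dimension) and the learned rotation R.
    """
    rng = np.random.default_rng(seed)
    V = X - X.mean(axis=0)                      # center the embeddings
    d = V.shape[1]
    # Random orthogonal initialization of the rotation.
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))
    for _ in range(n_iter):
        # Step 1: fix R, set the codes to the signs of the rotated data.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Step 2: fix B, solve the orthogonal Procrustes problem
        # min_R ||B - V R||_F via the SVD of V^T B.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    codes = (V @ R) >= 0
    return codes, R

# Usage on random stand-in data (hypothetical, not real word embeddings).
X = np.random.default_rng(1).standard_normal((100, 8))
codes, R = itq_binarize(X)
```

Under these assumptions each dimension is stored as one bit; if the original vectors used 32-bit floats and the dimensionality were unchanged, that alone would correspond to a 32x reduction in storage.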

preprint 2020 arXiv

How does the Combined Risk Affect the Performance of Unsupervised Domain Adaptation Approaches?

Unsupervised domain adaptation (UDA) aims to train a target classifier with labeled samples from the source domain and unlabeled samples from the target domain. Classical UDA learning bounds show that target risk is upper bounded by three terms: source risk, distribution discrepancy, and combined risk. Based on the assumption that the combined risk is a small fixed value, methods based on this bound train a target classifier by only minimizing estimators of the source risk and the distribution discrepancy. However, the combined risk may increase when minimizing both estimators, which makes the target risk uncontrollable. Hence the target classifier cannot achieve ideal performance if we fail to control the combined risk. To control the combined risk, the key challenge takes root in the unavailability of the labeled samples in the target domain. To address this key challenge, we propose a method named E-MixNet. E-MixNet employs enhanced mixup, a generic vicinal distribution, on the labeled source samples and pseudo-labeled target samples to calculate a proxy of the combined risk. Experiments show that the proxy can effectively curb the increase of the combined risk when minimizing the source risk and distribution discrepancy. Furthermore, we show that if the proxy of the combined risk is added into loss functions of four representative UDA methods, their performance is also improved.
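
The abstract does not specify how enhanced mixup differs from standard mixup. As a rough illustration of the underlying idea only, here is a minimal sketch of standard mixup applied to one labeled source sample and one pseudo-labeled target sample. The function name `mixup`, the Beta parameter `alpha`, and the toy inputs are assumptions made for this example, not details from the paper.

```python
import numpy as np

def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Standard mixup (sketch): convex combination of two samples and labels.

    y_a and y_b are one-hot or soft label vectors; lam ~ Beta(alpha, alpha).
    """
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix, lam

# Hypothetical inputs: a labeled source sample and a pseudo-labeled target sample.
x_src, y_src = np.array([1.0, 0.0, 2.0]), np.array([1.0, 0.0])
x_tgt, y_tgt = np.array([0.0, 4.0, 2.0]), np.array([0.0, 1.0])
x_mix, y_mix, lam = mixup(x_src, y_src, x_tgt, y_tgt, rng=np.random.default_rng(0))
```

A loss evaluated on such mixed pairs is one way a proxy term could be computed; the exact proxy used by E-MixNet is described in the paper, not here.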

preprint 2020 arXiv

Learning from a Complementary-label Source Domain: Theory and Algorithms

In unsupervised domain adaptation (UDA), a classifier for the target domain is trained with massive true-label data from the source domain and unlabeled data from the target domain. However, collecting fully-true-label data in the source domain is high-cost and sometimes impossible. Compared to the true labels, a complementary label specifies a class that a pattern does not belong to, hence collecting complementary labels would be less laborious than collecting true labels. Thus, in this paper, we propose a novel setting in which the source domain is composed of complementary-label data, and a theoretical bound for it is first proved. We consider two cases of this setting: in one, the source domain only contains complementary-label data (completely complementary unsupervised domain adaptation, CC-UDA), and in the other, the source domain has plenty of complementary-label data and a small amount of true-label data (partly complementary unsupervised domain adaptation, PC-UDA). To this end, a complementary label adversarial network (CLARINET) is proposed to solve CC-UDA and PC-UDA problems. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines on handwritten-digits-recognition and objects-recognition tasks.

preprint2020arXiv

Local Causal Structure Learning and its Discovery Between Type 2 Diabetes and Bone Mineral Density

Type 2 diabetes (T2DM), one of the most prevalent chronic diseases, affects the glucose metabolism of the human body, which decreases the quality of life and places a heavy burden on social medical care. Patients with T2DM are more likely to suffer bone fragility fractures, as diabetes affects bone mineral density (BMD). However, discovering the determinant factors of BMD by medical means is expensive and time-consuming. In this paper, we propose a novel algorithm, Prior-Knowledge-driven local Causal structure Learning (PKCL), to discover the underlying causal mechanism between BMD and its factors from clinical data. Since medical data are limited but prior knowledge is abundant, PKCL makes full use of the prior knowledge to mine the local causal structure for the target relationship. By combining the medical prior knowledge with the discovered causal relationships, PKCL can achieve more reliable results without long-running medical statistical experiments. Extensive experiments are conducted on a newly provided clinical data set. The results of PKCL on these data are shown to correspond closely with existing medical knowledge, which demonstrates the superiority and effectiveness of PKCL. To illustrate the importance of prior knowledge, the result of the algorithm without prior knowledge is also investigated.

preprint2020arXiv

PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices

The deep neural network (DNN) has emerged as the most important and popular artificial intelligence (AI) technique. The growth of model size poses a key energy efficiency challenge for the underlying computing platform. Thus, model compression becomes a crucial problem. However, the current approaches are limited by various drawbacks. Specifically, the network sparsification approach suffers from irregularity, heuristic nature and large indexing overhead. On the other hand, the recent structured matrix-based approach (i.e., CirCNN) is limited by relatively complex arithmetic computation (i.e., FFT), a less flexible compression ratio, and its inability to fully utilize input sparsity. To address these drawbacks, this paper proposes PermDNN, a novel approach to generate and execute hardware-friendly structured sparse DNN models using permuted diagonal matrices. Compared with the unstructured sparsification approach, PermDNN avoids the drawbacks of indexing overhead, heuristic compression effects and time-consuming retraining. Compared with the circulant structure-imposing approach, PermDNN enjoys the benefits of higher reduction in computational complexity, flexible compression ratio, simple arithmetic computation and full utilization of input sparsity. We propose the PermDNN architecture, a multi-processing element (PE) fully-connected (FC) layer-targeted computing engine. The entire architecture is highly scalable and flexible, and hence it can support the needs of different applications with different model configurations. We implement a 32-PE design using CMOS 28nm technology. Compared with EIE, PermDNN achieves 3.3x~4.8x higher throughput, 5.9x~8.5x better area efficiency and 2.8x~4.0x better energy efficiency on different workloads. Compared with CirCNN, PermDNN achieves 11.51x higher throughput and 3.89x better energy efficiency.
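
The indexing argument can be illustrated with a small matrix-vector product over a single permuted diagonal block. The sketch assumes a p x p block with exactly one nonzero per row, placed at column (i + shift) mod p; the abstract does not spell out this layout, so it and the names `permdiag_matvec` and `to_dense` are assumptions made for illustration.

```python
from typing import List


def permdiag_matvec(values: List[float], shift: int, x: List[float]) -> List[float]:
    """Compute y = W x for a p x p permuted diagonal block W.

    Assumed layout: W[i][(i + shift) % p] = values[i], all other entries zero.
    Only p values and one shift are stored; no per-element index is needed.
    """
    p = len(values)
    if len(x) != p:
        raise ValueError("x must have the same length as values")
    return [values[i] * x[(i + shift) % p] for i in range(p)]


def to_dense(values: List[float], shift: int) -> List[List[float]]:
    """Expand the compact form into a dense matrix, for checking only."""
    p = len(values)
    dense = [[0.0] * p for _ in range(p)]
    for i in range(p):
        dense[i][(i + shift) % p] = values[i]
    return dense


values = [2.0, -1.0, 0.5, 3.0]
x = [1.0, 2.0, 3.0, 4.0]
y = permdiag_matvec(values, shift=1, x=x)
```

The column of each nonzero follows from the row number and the shift, so the product needs p multiplications and no index lookup, in contrast to unstructured sparse formats.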

preprint2020arXiv

Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

As the popularity of voice user interfaces (VUIs) has exploded in recent years, the speaker recognition system has emerged as an important means of identifying a speaker in many security-sensitive applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against state-of-the-art deep neural network (DNN) based speaker recognition systems. By adding an audio-agnostic universal perturbation to an arbitrary enrolled speaker's voice input, the DNN-based speaker recognition system can be made to identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of our attack by modeling the sound distortions caused by physical over-the-air propagation through estimating the room impulse response (RIR). Experiments using a public dataset of 109 English speakers demonstrate the effectiveness and robustness of our proposed attack, with a high attack success rate of over 90%. The attack launching time also achieves a 100X speedup over contemporary non-universal attacks.

preprint2020arXiv

Robust Long Range Magnetic Correlation across Anti-phase Domain Boundaries in Sr$_2$CrReO$_6$

Anti-site disorder is one of the most important issues that arises in synthesis of double perovskite for spintronic applications. Although it is known that anti-site disorder leads to a proliferation of structural defects, known as the anti-phase boundaries that separate ordered anti-phase domains in the sample, little is known about the magnetic correlation across these anti-phase boundaries on a microscopic level. Motivated by this, we report resonant elastic X-ray scattering study of room temperature magnetic and structural correlation in a thin-film sample of Sr$_2$CrReO$_6$, which has one of the highest $\mathrm{T_C}$ among double perovskites. Structurally, we discovered existence of anti-phase nanodomains of $\sim$15~nm in the sample. Magnetically, the ordered moments are shown to lie perpendicular to the $c$ direction. Most remarkably, we found that the magnetic correlation length far exceeds the size of individual anti-phase nanodomains. Our results therefore provide conclusive proof for existence of robust magnetic correlation across the anti-phase boundaries in Sr$_2$CrReO$_6$.

preprint2020arXiv

Towards Playing Full MOBA Games with Deep Reinforcement Learning

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges to AI systems, such as multi-agent coordination, an enormous state-action space, and complex action control. Developing AI for playing MOBA games has attracted much attention accordingly. However, existing work falls short in handling the raw game complexity caused by the explosion of agent combinations, i.e., lineups, when the hero pool is expanded; OpenAI's Dota AI, for example, limits play to a pool of only 17 heroes. As a result, full MOBA games without restrictions are far from being mastered by any existing AI system. In this paper, we propose a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning. Specifically, we develop a combination of novel and existing learning techniques, including curriculum self-play learning, policy distillation, off-policy adaption, multi-head value estimation, and Monte-Carlo tree-search, for training and playing a large pool of heroes while addressing the scalability issue. Tested on Honor of Kings, a popular MOBA game, we show how to build superhuman AI agents that can defeat top esports players. The superiority of our AI is demonstrated by the first large-scale performance test of a MOBA AI agent in the literature.

preprint2019arXiv

Dirac magnons in a honeycomb lattice quantum XY magnet CoTiO3

The discovery of massless Dirac electrons in graphene and topological Dirac-Weyl materials has prompted a broad search for bosonic analogues of such Dirac particles. Recent experiments have found evidence for Dirac magnons above an Ising-like ferromagnetic ground state in a two-dimensional (2D) kagome lattice magnet and in the van der Waals layered honeycomb crystal CrI$_3$, and in a 3D Heisenberg magnet Cu$_3$TeO$_6$. Here we report on our inelastic neutron scattering investigation on large single crystals of a stacked honeycomb lattice magnet CoTiO$_3$, which is part of a broad family of ilmenite materials. The magnetically ordered ground state of CoTiO$_3$ features ferromagnetic layers of Co$^{2+}$, stacked antiferromagnetically along the $c$-axis. We discover that the magnon dispersion relation exhibits strong easy-plane exchange anisotropy and hosts a clear gapless Dirac cone along the edge of the 3D Brillouin zone. Our results establish CoTiO$_3$ as a model pseudospin-$1/2$ material to study interacting Dirac bosons in a 3D quantum XY magnet.

preprint2016arXiv

LLR-based Successive-Cancellation List Decoder for Polar Codes with Multi-bit Decision

Due to their capacity-achieving property, polar codes have become one of the most attractive channel codes. To date, the successive cancellation list (SCL) decoding algorithm is the primary approach that can guarantee outstanding error-correcting performance of polar codes. However, the hardware designs of the original SCL decoder have large silicon area and long decoding latency. Although some recent efforts can reduce either the area or latency of SCL decoders, these two metrics still cannot be optimized at the same time. This paper, for the first time, proposes a general log-likelihood-ratio (LLR)-based SCL decoding algorithm with multi-bit decision. This new algorithm, referred to as LLR-$2^K$b-SCL, can determine $2^K$ bits simultaneously for arbitrary $K$ with the use of LLR messages. In addition, a reduced-data-width scheme is presented to reduce the critical path of the sorting block. Then, based on the proposed algorithm, a VLSI architecture of the new SCL decoder is developed. Synthesis results show that for an example (1024, 512) polar code with list size 4, the proposed LLR-$2^K$b-SCL decoders achieve significant reduction in both area and latency as compared to prior works. As a result, the hardware efficiency of the proposed designs with $K=2$ and $K=3$ is 2.33 times and 3.32 times that of the state-of-the-art works, respectively.

preprint2015arXiv

Automatic exploration of structural regularities in networks

Complex networks provide a powerful mathematical representation of complex systems in nature and society. To understand complex networks, it is crucial to explore their internal structures, also called structural regularities. The task of network structure exploration is to determine how many groups a complex network contains and how to group its nodes. Most existing structure exploration methods need either a group number or a certain type of structure to be specified when they are applied to a network. In the real world, however, neither the group number nor the type of structure that a network has is usually known in advance. To automatically explore structural regularities in complex networks without any prior knowledge of the group number or the type of structure, we take a probabilistic mixture model that can handle networks with any type of structure but requires a group number to be specified, extend it using Bayesian nonparametric theory, and propose a novel model called the Bayesian nonparametric mixture (BNPM) model. Experiments conducted on a large number of networks with different structures show that the BNPM model is able to automatically explore structural regularities in networks with stable, state-of-the-art performance.

preprint2015arXiv

Successive Cancellation Decoding of Polar Codes using Stochastic Computing

Polar codes have emerged as the most favorable channel codes for their unique capacity-achieving property. To date, numerous works have been reported on the efficient design of polar code decoders. However, these prior efforts focused on the design of polar decoders via deterministic computation, while the behavior of the stochastic polar decoder, which can have potential advantages such as low complexity and strong error-resilience, has not been studied in the existing literature. This paper, for the first time, investigates polar decoding using stochastic logic. Specifically, the commonly used successive cancellation (SC) algorithm is reformulated into the stochastic form. Several methods that can potentially improve decoding performance are discussed and analyzed. Simulation results show that a stochastic SC decoder can achieve error-correcting performance similar to that of its deterministic counterpart. This work can pave the way for future hardware design of stochastic polar code decoders.
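
For readers unfamiliar with stochastic logic, the sketch below shows its basic primitive rather than the paper's reformulated SC decoder: a value in [0, 1] is encoded as the fraction of 1s in a random bitstream, and a bitwise AND of two independent streams estimates the product. The function names, stream length and seed are assumptions chosen for illustration.

```python
import random
from typing import List


def to_bitstream(p: float, length: int, rng: random.Random) -> List[int]:
    """Encode p in [0, 1] as a bitstream whose fraction of 1s approximates p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_bitstream(bits: List[int]) -> float:
    """Decode a bitstream back to a value: the fraction of 1s."""
    return sum(bits) / len(bits)


def stochastic_multiply(a_bits: List[int], b_bits: List[int]) -> List[int]:
    """Bitwise AND of two independent unipolar streams estimates the product."""
    return [a & b for a, b in zip(a_bits, b_bits)]


rng = random.Random(42)  # fixed seed so the example is repeatable
a_bits = to_bitstream(0.5, 20000, rng)
b_bits = to_bitstream(0.4, 20000, rng)
estimate = from_bitstream(stochastic_multiply(a_bits, b_bits))  # close to 0.2
```

A multiplier reduces to a single AND gate, which is the source of the low hardware complexity, and a flipped bit changes the decoded value only slightly, which is the source of the error resilience. The price is that accuracy depends on the stream length.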

preprint2014arXiv

Algorithm and Architecture for Hybrid Decoding of Polar Codes

Polar codes are the first provably capacity-achieving forward error correction (FEC) codes. In general, polar codes can be decoded via either the successive cancellation (SC) or the belief propagation (BP) decoding algorithm. However, to date, practical applications of polar codes have been hindered by long decoding latency and limited error-correcting performance. In this paper, based on our recently proposed early stopping criteria for the BP algorithm, we propose a hybrid BP-SC decoding scheme to improve the decoding performance of polar codes with relatively short latency. Simulation results show that, for (1024, 512) polar codes, the proposed approach leads to at least 0.2dB gain over the BP algorithm with the same maximum number of iterations for the entire SNR region, and also achieves 0.2dB decoding gain over the BP algorithm with the same worst-case latency in the high SNR region. Besides, compared to the SC algorithm, the proposed scheme leads to 0.2dB gain in the medium SNR region with much lower average decoding latency. In addition, we propose a low-complexity unified hardware architecture for the hybrid decoding scheme, which is able to implement the SC and BP algorithms using the same hardware.

preprint2014arXiv

Low-Latency Successive-Cancellation List Decoders for Polar Codes with Multi-bit Decision

Polar codes, as the first provably capacity-achieving error-correcting codes, have received much attention in recent years. However, the decoding performance of polar codes with the traditional successive-cancellation (SC) algorithm cannot match that of low-density parity-check (LDPC) or turbo codes. Because the SC list (SCL) decoding algorithm can significantly improve the error-correcting performance of polar codes, the design of SCL decoders is important for polar codes to be deployed in practical applications. However, because the prior latency reduction approaches for SC decoders are not applicable to SCL decoders, these list decoders suffer from a long-latency bottleneck. In this paper, we propose a multi-bit-decision approach that can significantly reduce the latency of SCL decoders. First, we present a reformulated SCL algorithm that can perform intermediate decoding of 2 bits together. The proposed approach, referred to as the 2-bit reformulated SCL (2b-rSCL) algorithm, can reduce the latency of the SCL decoder from $3n-2$ to $2n-2$ clock cycles without any performance loss. Then, we extend the idea of 2-bit decision to the general case, and propose a general decoding scheme that can perform intermediate decoding of any $2^K$ bits simultaneously. This general approach, referred to as the $2^K$-bit reformulated SCL ($2^K$b-rSCL) algorithm, can reduce the overall decoding latency to as short as $n/2^{K-2}-2$ cycles. Furthermore, based on the proposed algorithms, VLSI architectures for 2b-rSCL and 4b-rSCL decoders are synthesized. Compared with a prior SCL decoder, the proposed (1024, 512) 2b-rSCL and 4b-rSCL decoders can achieve 21% and 60% reduction in latency, 1.66 times and 2.77 times increase in coded throughput with list size 2, and 2.11 times and 3.23 times increase in coded throughput with list size 4, respectively.
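
The cycle counts quoted above can be tabulated with a short sketch. It takes the general latency to be n/2^(K-2) - 2 cycles, an interpretation that reproduces the 2n-2 figure at K = 1; the function names are hypothetical.

```python
def scl_latency_cycles(n: int) -> int:
    """Clock cycles of the baseline SCL decoder quoted in the abstract: 3n - 2."""
    return 3 * n - 2


def rscl_latency_cycles(n: int, k: int) -> int:
    """Clock cycles of the 2^k-bit reformulated SCL decoder.

    Assumed reading of the abstract's general formula: n / 2^(k-2) - 2.
    For k = 1 this gives 2n - 2, matching the 2-bit figure in the abstract.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # n / 2^(k-2) equals 4n / 2^k; multiplying first keeps k = 1 in integers.
    cycles, remainder = divmod(4 * n, 2 ** k)
    if remainder:
        raise ValueError("4n must be divisible by 2^k")
    return cycles - 2


n = 1024
baseline = scl_latency_cycles(n)      # 3070
two_bit = rscl_latency_cycles(n, 1)   # 2046
four_bit = rscl_latency_cycles(n, 2)  # 1022
```

These are cycle counts only. The 21% and 60% latency reductions reported for the synthesized decoders are measured results, and a cycle count does not capture the clock period, so the two sets of numbers are not directly comparable.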

preprint2014arXiv

Overlapping community detection in signed networks

Complex networks with both positive and negative links have gained considerable attention during the past several years. Community detection is one of the main challenges in complex network analysis. Most of the existing algorithms for community detection in a signed network aim at providing a hard partition of the network, in which each node either belongs to a community or does not. However, they cannot detect overlapping communities, in which a node is allowed to belong to multiple communities. Overlapping communities exist in many real-world networks. In this paper, we propose a signed probabilistic mixture (SPM) model for overlapping community detection in signed networks. Compared with the existing models, the advantages of our methodology are (i) providing soft-partition solutions for signed networks; (ii) providing soft memberships of nodes. Experiments on a number of signed networks show that our SPM model: (i) can identify assortative or disassortative structures as well as other state-of-the-art models; (ii) can detect overlapping communities; (iii) outperforms other state-of-the-art models at shedding light on community detection in synthetic signed networks.

preprint2014arXiv

Successive Cancellation List Polar Decoder using Log-likelihood Ratios

The successive cancellation list (SCL) decoding algorithm is a powerful method that can help polar codes achieve excellent error-correcting performance. However, the current SCL algorithm and decoders are based on likelihood or log-likelihood forms, which lead to high hardware complexity. In this paper, we propose a log-likelihood-ratio (LLR)-based SCL (LLR-SCL) decoding algorithm, which needs only half the computation and storage complexity of the conventional one. Then, based on the proposed algorithm, we develop low-complexity VLSI architectures for LLR-SCL decoders. Analysis results show that the proposed LLR-SCL decoder achieves 50% reduction in hardware and 98% improvement in hardware efficiency.
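
One way to see why the LLR form is cheaper is that a single ratio replaces a pair of likelihoods per bit. The sketch below uses the textbook LLR update rules for plain successive cancellation on a length-2 polar code, with the min-sum approximation for the check-node step. It is not the paper's list decoder or its path-metric computation, and all function names are assumptions.

```python
import math


def llr(p0: float, p1: float) -> float:
    """Log-likelihood ratio ln(P(y|0) / P(y|1)): one number instead of two."""
    return math.log(p0 / p1)


def f_minsum(a: float, b: float) -> float:
    """Check-node update, min-sum approximation: sign(a) sign(b) min(|a|, |b|)."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))


def g(a: float, b: float, u: int) -> float:
    """Variable-node update given the previously decided bit u."""
    return b + (1 - 2 * u) * a


def hard_decision(value: float) -> int:
    """A non-negative LLR favors bit 0, a negative LLR favors bit 1."""
    return 0 if value >= 0 else 1


# Length-2 example with hypothetical channel LLRs a and b.
a, b = 2.0, -3.0
u1_llr = f_minsum(a, b)   # LLR of the first bit
u1 = hard_decision(u1_llr)
u2_llr = g(a, b, u1)      # LLR of the second bit, given u1
u2 = hard_decision(u2_llr)
```

A list decoder additionally keeps several candidate paths and a metric for each; that part is omitted here.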

preprint2013arXiv

Lyapunov Functions in Piecewise Linear Systems: From Fixed Point to Limit Cycle

This paper provides a first example of constructing Lyapunov functions in a class of piecewise linear systems with limit cycles. The method of construction helps analyze and control complex oscillating systems through novel geometric means. Special attention is paid to a problem not formerly solved: imposing consistent boundary conditions on the Lyapunov function in each linear region. By successfully solving the problem, the authors construct continuous Lyapunov functions in the whole state space. It is further demonstrated that the constructed Lyapunov functions explain the different bifurcations leading to the emergence of limit cycle oscillation.

preprint2012arXiv

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Computational inference of causal relationships underlying complex networks, such as gene-regulatory pathways, is NP-complete due to its combinatorial nature when permuting all possible interactions. Markov chain Monte Carlo (MCMC) has been introduced to sample only part of the combinations while still guaranteeing convergence and traversability, and has therefore become widely used. However, MCMC cannot perform efficiently enough for networks that have more than 15 to 20 nodes because of the computational complexity. In this paper, we use a general purpose processor (GPP) and a general purpose graphics processing unit (GPGPU) to implement and accelerate a novel Bayesian network learning algorithm. With a hash-table-based memory-saving strategy and a novel task assignment strategy, we achieve a 10-fold acceleration per iteration compared with a serial GPP. Specifically, we use a greedy method to search for the best graph from a given order. We incorporate a prior component in the current scoring function, which further facilitates the search. Overall, we are able to apply this system to networks with more than 60 nodes, allowing inference and modeling of bigger and more complex networks than current methods.

preprint2012arXiv

Exact Mapping Noisy van der Pol Type Oscillator onto Quasi-symplectic Dynamics

We find exact mappings for a class of limit cycle systems with noise onto quasi-symplectic dynamics, including a van der Pol type oscillator. A dual role potential function is obtained as a component of the quasi-symplectic dynamics. Based on a stochastic interpretation different from the traditional Ito's and Stratonovich's, we show the corresponding steady state distribution is the familiar Boltzmann-Gibbs type for arbitrary noise strength. The result provides a new angle for understanding processes without detailed balance and can be verified by experiments.

preprint2012arXiv

LAGE: A Java Framework to reconstruct Gene Regulatory Networks from Large-Scale Continues Expression Data

LAGE is a systematic framework developed in Java. The motivation of LAGE is to provide a scalable and parallel solution for reconstructing Gene Regulatory Networks (GRNs) from continuous gene expression data for a very large number of genes. The basic idea of our framework is motivated by the philosophy of divide-and-conquer. Specifically, LAGE recursively partitions genes into multiple overlapping communities of much smaller size and learns the intra-community GRNs separately before merging them together. Besides, the complete information on overlapping communities is obtained as a byproduct, which could be used to mine meaningful functional modules in biological networks.

preprint2012arXiv

LSBN: A Large-Scale Bayesian Structure Learning Framework for Model Averaging

The motivation for this paper is to apply Bayesian structure learning using Model Averaging in large-scale networks. Currently, the Bayesian model averaging algorithm is applicable only to networks with tens of variables, restrained by its super-exponential complexity. We present a novel framework, called LSBN (Large-Scale Bayesian Network), which makes it possible to handle networks with infinite size by following the principle of divide-and-conquer. The method of LSBN comprises three steps. First, LSBN performs the partition using a second-order partition strategy, which achieves more robust results. Second, LSBN conducts sampling and structure learning within each overlapping community after the community is isolated from other variables by its Markov Blanket. Finally, LSBN employs an efficient algorithm to merge the structures of the overlapping communities into a whole. In comparison with four other state-of-the-art large-scale network structure learning algorithms (ARACNE, PC, Greedy Search and MMHC), LSBN shows comparable results on five common benchmark datasets, evaluated by precision, recall and F-score. Moreover, LSBN makes it possible to learn large-scale Bayesian structure by Model Averaging, which used to be intractable. In summary, LSBN provides a scalable and parallel framework for the reconstruction of network structures. Besides, the complete information on overlapping communities is obtained as a byproduct, which could be used to mine meaningful clusters in biological networks, such as protein-protein interaction networks or gene regulatory networks, as well as in social networks.

preprint2012arXiv

Non-fixation in infinite potential

Under the effects of strong genetic drift, it is highly probable to observe gene fixation or loss in a population, shown by divergent probability density functions, or infinite adaptive peaks on a landscape. It is then interesting to ask what such infinite peaks imply, with or without combining other biological factors (e.g. mutation and selection). We study the stochastic escape time from the generated infinite adaptive peaks, and show that Kramers' classical escape formula can be extended to cases with non-Gaussian distributions. The constructed landscape provides a global description of the system's medium- and long-term behavior, breaking the constraints of previous methods.

preprint2012arXiv

Potential Function in a Continuous Dissipative Chaotic System: Decomposition Scheme and Role of Strange Attractor

In this paper, we demonstrate, for the first time in the literature known to us, that potential functions can be constructed in continuous dissipative chaotic systems and can be used to reveal their dynamical properties. To this end, a Lorenz-like system is proposed as an example for analysis and is rigorously proved to be chaotic. We explicitly construct a potential function that decreases monotonically along the system's dynamics, revealing the structure of the chaotic strange attractor. The potential function can be constructed in different forms. We also decompose the dynamical system to explain the different origins of the chaotic attractor and the strange attractor. Consequently, reasons for the existence of both chaotic nonstrange attractors and nonchaotic strange attractors are clearly discussed within the current decomposition framework.

preprint2012arXiv

Relation of a New Interpretation of Stochastic Differential Equations to Ito Process

Stochastic differential equations (SDEs) are widely used in modeling stochastic dynamics in the literature. However, an SDE alone is not enough to determine a unique process. A specified interpretation for stochastic integration is needed. Different interpretations specify different dynamics. Recently, a new interpretation of SDEs was put forward by one of us. This interpretation has a built-in Boltzmann-Gibbs distribution and shows the existence of a potential function for general processes, which reveals both local and global dynamics. Despite this powerful property, its relation with classical interpretations in arbitrary dimension remains obscure. In this paper, we clarify this connection and derive a concise relation between the new interpretation and the Ito process. We point out that the derived relation is experimentally testable.

preprint2011arXiv

Kinetics of Muller's Ratchet from Adaptive Landscape Viewpoint

Background: The accumulation of deleterious mutations in a population directly affects how long the population will exist. Muller's ratchet provides a quantitative framework to study the effect of this accumulation. The adaptive landscape, a powerful concept in systems biology, provides a handle to describe complex and rare biological events. In this article we study the evolutionary process of a population exposed to Muller's ratchet from the new viewpoint of the adaptive landscape, which allows us to estimate the single click of the ratchet starting with an intuitive understanding. Methods: We describe how the Wright-Fisher process maps to Muller's ratchet. We analytically construct the adaptive landscape from a general diffusion equation. It shows that the construction is dynamical and the adaptive landscape is independent of the existence and normalization of the stationary distribution. We generalize the application of the diffusion model from the adaptive landscape viewpoint. Results: We develop a novel method to describe the dynamical behavior of the population exposed to Muller's ratchet, and analytically derive the decay time of the fittest class of the population as a mean first passage time. Most importantly, we describe the absorption phenomenon by the adaptive landscape, where the stationary distribution is non-normalizable. These results suggest the method may be used to understand the mechanism of population evolution and describe biological processes quantitatively.

preprint2011arXiv

Low-Latency SC Decoder Architectures for Polar Codes

Nowadays polar codes are becoming one of the most favorable capacity-achieving error correction codes for their low encoding and decoding complexity. However, due to the large code length required by practical applications, the few existing successive cancellation (SC) decoder implementations still suffer from not only high hardware cost but also long decoding latency. This paper presents several novel approaches to designing low-latency decoders for polar codes based on look-ahead techniques. Look-ahead techniques can be employed to reschedule the decoding process of a polar decoder in numerous ways. However, among those approaches, only well-arranged ones can achieve good performance in terms of both latency and hardware complexity. By revealing the recurrence property of the SC decoding chart, the authors succeed in reducing the decoding latency by 50% with look-ahead techniques. With the help of VLSI-DSP design techniques such as pipelining, folding, unfolding, and parallel processing, methodologies for four different polar decoder architectures have been proposed to meet various application demands. A sub-structure sharing scheme has been adopted to design the merged processing element (PE) for further hardware reduction. In addition, systematic methods for constructing the refined pipelining decoder (2nd design) and the input generating circuits (ICG) block have been given. Detailed gate-level analysis has demonstrated that the proposed designs show latency advantages over conventional ones with similar hardware cost.

preprint2011arXiv

Reduced-Latency SC Polar Decoder Architectures

Polar codes have become one of the most favorable capacity-achieving error correction codes (ECC), owing in part to their simple encoding method. However, among the very few prior successive cancellation (SC) polar decoder designs, the required long code length makes the decoding latency high. In this paper, the conventional decoding algorithm is transformed with look-ahead techniques. This reduces the decoding latency by 50%. With pipelining and parallel processing schemes, a parallel SC polar decoder is proposed. A sub-structure sharing approach is employed to design the merged processing element (PE). Moreover, inspired by the real FFT architecture, this paper presents a novel input generating circuit (ICG) block that can generate additional input signals for merged PEs on-the-fly. Gate-level analysis has demonstrated that the proposed design achieves 50% lower decoding latency and twice the throughput of the conventional one with similar hardware cost.

preprint2010arXiv

Constructive Proof of Global Lyapunov Function as Potential Function

We provide a constructive proof of the equivalence of two fundamental concepts: the global Lyapunov function in engineering and the potential function in physics, establishing a bridge between these distinct fields. This result suggests new approaches to a significant unsolved problem, namely the construction of Lyapunov functions for general nonlinear systems, through analogy with existing methods for potential functions. In addition, we show another connection: the Lyapunov equation is a reduced form of the generalized Einstein relation for linear systems.

48 published item(s)

TableVista: Benchmarking Multimodal Table Reasoning under Visual and Structural Complexity

GUAP: Graph Universal Attack Through Adversarial Patching

TDC: Towards Extremely Efficient CNNs on GPUs via Hardware-Aware Tucker Decomposition

Birds of A Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation

Catastrophic Interference in Reinforcement Learning: A Solution Based on Context Division and Knowledge Distillation

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

Don't Touch What Matters: Task-Aware Lipschitz Data Augmentation for Visual Reinforcement Learning

Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism

Penalized Proximal Policy Optimization for Safe Reinforcement Learning

RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN

Robot Motion Planning as Video Prediction: A Spatio-Temporal Neural Network-based Motion Planner

SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving

The least-used key selection method for information retrieval in large-scale Cloud-based service repositories

Clarinet: A One-step Approach Towards Budget-friendly Unsupervised Domain Adaptation

Doubly Residual Neural Decoder: Towards Low-Complexity High-Performance Channel Decoding

Enabling Fast and Universal Audio Adversarial Attack Using Generative Model

NVAE-GAN Based Approach for Unsupervised Time Series Anomaly Detection

Bridging the Theoretical Bound and Deep Algorithms for Open Set Domain Adaptation

Classical Simulation of Quantum Supremacy Circuits

Compressing Recurrent Neural Networks Using Hierarchical Tucker Tensor Decomposition

Embedding Compression with Isotropic Iterative Quantization

How does the Combined Risk Affect the Performance of Unsupervised Domain Adaptation Approaches?

Learning from a Complementary-label Source Domain: Theory and Algorithms

Local Causal Structure Learning and its Discovery Between Type 2 Diabetes and Bone Mineral Density

PERMDNN: Efficient Compressed DNN Architecture with Permuted Diagonal Matrices

Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

Robust Long Range Magnetic Correlation across Anti-phase Domain Boundaries in Sr$_2$CrReO$_6$

Towards Playing Full MOBA Games with Deep Reinforcement Learning

Dirac magnons in a honeycomb lattice quantum XY magnet CoTiO$_3$

LLR-based Successive-Cancellation List Decoder for Polar Codes with Multi-bit Decision

Automatic exploration of structural regularities in networks

Successive Cancellation Decoding of Polar Codes using Stochastic Computing

Algorithm and Architecture for Hybrid Decoding of Polar Codes

Low-Latency Successive-Cancellation List Decoders for Polar Codes with Multi-bit Decision

Overlapping community detection in signed networks

Successive Cancellation List Polar Decoder using Log-likelihood Ratios

Lyapunov Functions in Piecewise Linear Systems: From Fixed Point to Limit Cycle

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Exact Mapping Noisy van der Pol Type Oscillator onto Quasi-symplectic Dynamics

LAGE: A Java Framework to reconstruct Gene Regulatory Networks from Large-Scale Continues Expression Data

LSBN: A Large-Scale Bayesian Structure Learning Framework for Model Averaging

Non-fixation in infinite potential

Potential Function in a Continuous Dissipative Chaotic System: Decomposition Scheme and Role of Strange Attractor

Relation of a New Interpretation of Stochastic Differential Equations to Ito Process

Kinetics of Muller's Ratchet from Adaptive Landscape Viewpoint

Low-Latency SC Decoder Architectures for Polar Codes

Reduced-Latency SC Polar Decoder Architectures

Constructive Proof of Global Lyapunov Function as Potential Function