Source author record

Bin Li

Bin Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

124 works

52 topics

4 close collaborators

preprint · 2026 · arXiv

Reward-Decomposed Reinforcement Learning for Immersive Video Role-Playing

Text-based role-playing models can imitate character styles, yet they often fail to reflect a scene's atmosphere and evolving tension, both essential for immersive applications such as Virtual Reality (VR) games and interactive narratives. We study video-grounded role-playing dialogue and introduce EBM-RL (Eye-Brain-Mouth Reinforcement Learning), a decoupled GRPO-based framework that explicitly separates observation ([perception]), reasoning ([think]), and utterance ([answer]). This structure promotes human-like sensory grounding by compelling the model to first attend to visual cues, then form internal interpretations, and finally generate context-appropriate dialogue. EBM-RL integrates four complementary rewards: (i) CLIP-based scene-text alignment to improve ambiance and emotion; (ii) a Perceptual-Cognitive reward that encourages [perception] and [think] processes that increase the likelihood of the reference response; (iii) answer accuracy to ensure faithfulness; and (iv) a dense format reward to enforce the desired structured output. Extensive experiments demonstrate that EBM-RL substantially outperforms text-only role-playing baselines and larger-scale vision-language models on our immersive role-playing benchmark, delivering simultaneous gains in visual-atmosphere consistency and character authenticity. Beyond the role-playing domain, EBM-RL also exhibits strong zero-shot generalization: without any additional fine-tuning, it consistently improves performance on out-of-domain VideoQA benchmarks. We additionally release an open-source dataset for video-grounded role-playing dialogue.
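As a rough illustration of what a dense format reward could look like, the sketch below gives partial credit for each structural check an output passes. The literal marker strings, the choice of checks, and the equal weighting are all assumptions made for this example; the abstract does not specify how EBM-RL scores format.

```python
# Assumed literal section markers; the exact tag syntax is not given in the abstract.
SECTIONS = ["[perception]", "[think]", "[answer]"]

def format_reward(output: str) -> float:
    """Return a score in [0, 1]: one equal share per satisfied structural check."""
    positions = [output.find(tag) for tag in SECTIONS]
    # Checks 1-3: each marker appears exactly once.
    checks = [output.count(tag) == 1 for tag in SECTIONS]
    # Check 4: all markers are present and appear in the expected order.
    ordered = all(p >= 0 for p in positions) and positions == sorted(positions)
    checks.append(ordered)
    # Check 5: every section has non-empty text after its marker.
    if ordered:
        bounds = positions + [len(output)]
        bodies = [
            output[bounds[i] + len(SECTIONS[i]):bounds[i + 1]].strip()
            for i in range(len(SECTIONS))
        ]
        checks.append(all(bodies))
    else:
        checks.append(False)
    return sum(checks) / len(checks)
```

Because the score rises with each satisfied check, a partially well-formed output still receives some signal, which is the usual motivation for a dense reward over an all-or-nothing one.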

preprint · 2024 · arXiv

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

Visual scenes are extremely diverse, not only because there are infinite possible combinations of objects and backgrounds but also because the observations of the same scene may vary greatly with the change of viewpoints. When observing a multi-object visual scene from multiple viewpoints, humans can perceive the scene compositionally from each viewpoint while achieving the so-called "object constancy" across different viewpoints, even though the exact viewpoints are untold. This ability is essential for humans to identify the same object while moving and to learn from vision efficiently. It is intriguing to design models that have a similar ability. In this paper, we consider a novel problem of learning compositional scene representations from multiple unspecified (i.e., unknown and unrelated) viewpoints without using any supervision and propose a deep generative model which separates latent representations into a viewpoint-independent part and a viewpoint-dependent part to solve this problem. During the inference, latent representations are randomly initialized and iteratively updated by integrating the information in different viewpoints with neural networks. Experiments on several specifically designed synthetic datasets have shown that the proposed method can effectively learn from multiple unspecified viewpoints.

preprint · 2023 · arXiv

Towards Optimal Tradeoff Between Data Freshness and Update Cost in Information-update Systems

In this paper, we consider a discrete-time information-update system, where a service provider can proactively retrieve information from the information source to update its data and users query the data at the service provider. One example is crowdsensing-based applications. In order to keep users satisfied, the application desires to provide users with fresh data, where the freshness is measured by the Age-of-Information (AoI). However, maintaining fresh data requires the application to update its database frequently, which incurs an update cost (e.g., incentive payment). Hence, there exists a natural tradeoff between the AoI and the update cost at the service provider who needs to make update decisions. To capture this tradeoff, we formulate an optimization problem with the objective of minimizing the total cost, which is the sum of the staleness cost (which is a function of the AoI) and the update cost. Then, we provide two useful guidelines for the design of efficient update policies. Following these guidelines and assuming that the aggregated request arrival process is Bernoulli, we prove that there exists a threshold-based policy that is optimal among all online policies and thus focus on the class of threshold-based policies. Furthermore, we derive the closed-form formula for computing the long-term average cost under any threshold-based policy and obtain the optimal threshold. Finally, we perform extensive simulations using both synthetic data and real traces to verify our theoretical results and demonstrate the superior performance of the optimal threshold-based policy compared with several baseline policies.
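The tradeoff between staleness cost and update cost can be explored with a small simulation. The sketch below is a toy model under stated assumptions, not the paper's formulation: requests arrive as a Bernoulli process with probability `p` per slot, the staleness cost is taken to be linear in the AoI, and the policy refreshes the data only when a request arrives and the age has reached the threshold. The parameter values are illustrative.

```python
import random

def average_cost(threshold, p=0.3, update_cost=5.0, slots=50_000, seed=0):
    """Long-run average cost per slot under a threshold rule (toy model)."""
    rng = random.Random(seed)
    age = 1            # AoI of the stored data, in slots
    total = 0.0
    for _ in range(slots):
        if rng.random() < p:              # Bernoulli request arrival
            if age >= threshold:          # stale enough: pay to refresh first
                total += update_cost
                age = 0
            total += age                  # assumed linear staleness cost
        age += 1                          # data ages by one slot
    return total / slots

def best_threshold(candidates=range(1, 30), **kwargs):
    """Pick the candidate threshold with the lowest simulated average cost."""
    return min(candidates, key=lambda t: average_cost(t, **kwargs))
```

Calling `best_threshold()` scans the candidate thresholds by brute force. The paper instead derives a closed-form average cost and the optimal threshold analytically, which this sketch does not attempt.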

preprint · 2022 · arXiv

3D Perception based Imitation Learning under Limited Demonstration for Laparoscope Control in Robotic Surgery

Automatic laparoscope motion control is fundamentally important for surgeons to efficiently perform operations. However, its traditional control methods based on tool tracking without considering information hidden in surgical scenes are not intelligent enough, while the latest supervised imitation learning (IL)-based methods require expensive sensor data and suffer from distribution mismatch issues caused by limited demonstrations. In this paper, we propose a novel Imitation Learning framework for Laparoscope Control (ILLC) with reinforcement learning (RL), which can efficiently learn the control policy from limited surgical video clips. Specifically, we first extract surgical laparoscope trajectories from unlabeled videos as the demonstrations and reconstruct the corresponding surgical scenes. To fully learn from limited motion trajectory demonstrations, we propose Shape Preserving Trajectory Augmentation (SPTA) to augment these data, and build a simulation environment that supports parallel RGB-D rendering to reinforce the RL policy for interacting with the environment efficiently. With adversarial training for IL, we obtain the laparoscope control policy based on the generated rollouts and surgical demonstrations. Extensive experiments are conducted in unseen reconstructed surgical scenes, and our method outperforms the previous IL methods, which proves the feasibility of our unified learning-based framework for laparoscope control.

preprint · 2022 · arXiv

A Higher-Order Semantic Dependency Parser

Higher-order features bring significant accuracy gains in semantic dependency parsing. However, modeling higher-order features with exact inference is NP-hard. Graph neural networks (GNNs) have been demonstrated to be an effective tool for solving NP-hard problems with approximate inference in many graph learning tasks. Inspired by the success of GNNs, we investigate building a higher-order semantic dependency parser by applying GNNs. Instead of explicitly extracting higher-order features from intermediate parsing graphs, GNNs aggregate higher-order information concisely by stacking multiple GNN layers. Experimental results show that our model outperforms the previous state-of-the-art parser on the SemEval 2015 Task 18 English datasets.

preprint · 2022 · arXiv

A New Perspective on Stabilizing GANs training: Direct Adversarial Training

Generative Adversarial Networks (GANs) are the most popular image generation models that have achieved remarkable progress on various computer vision tasks. However, training instability is still one of the open problems for all GAN-based algorithms. Quite a number of methods have been proposed to stabilize the training of GANs, the focuses of which were respectively put on the loss functions, regularization and normalization technologies, training algorithms, and model architectures. Different from the above methods, in this paper, a new perspective on stabilizing GANs training is presented. It is found that sometimes the images produced by the generator act like adversarial examples of the discriminator during the training process, which may be part of the reason causing the unstable training of GANs. With this finding, we propose the Direct Adversarial Training (DAT) method to stabilize the training process of GANs. Furthermore, we prove that the DAT method is able to minimize the Lipschitz constant of the discriminator adaptively. The advanced performance of DAT is verified on multiple loss functions, network architectures, hyper-parameters, and datasets. Specifically, DAT achieves significant improvements of 11.5% FID on CIFAR-100 unconditional generation based on SSGAN, 10.5% FID on STL-10 unconditional generation based on SSGAN, and 13.2% FID on LSUN-Bedroom unconditional generation based on SSGAN. Code will be available at https://github.com/iceli1007/DAT-GAN

preprint · 2022 · arXiv

ADBCMM : Acronym Disambiguation by Building Counterfactuals and Multilingual Mixing

Scientific documents often contain a large number of acronyms. Disambiguation of these acronyms will help researchers better understand the meaning of vocabulary in the documents. In the past, thanks to large amounts of data from English literature, acronym disambiguation was mainly applied to English literature. However, for low-resource languages, it is difficult to obtain good performance on this task, which also receives less attention owing to the lack of large amounts of annotated data. To address this issue, this paper proposes a new method for acronym disambiguation, named ADBCMM, which can significantly improve performance on low-resource languages by building counterfactuals and multilingual mixing. Specifically, by balancing data bias in the low-resource language, ADBCMM is able to improve test performance outside the data set. In SDU@AAAI-22 - Shared Task 2: Acronym Disambiguation, the proposed method won first place in French and Spanish. The results can be reproduced at https://github.com/WENGSYX/ADBCMM.

preprint · 2022 · arXiv

Carbide: Highly Reliable Networks Through Real-Time Multiple Control Plane Composition

Achieving highly reliable networks is essential for network operators to ensure proper packet delivery in the event of software errors or hardware failures. Networks must ensure reachability and routing correctness, such as subnet isolation and waypoint traversal. Existing work in network verification relies on centralized computation at the cost of fault tolerance, while other approaches either build an over-engineered, complex control plane, or compose multiple control planes without providing any guarantee on correctness. This paper presents Carbide, a novel system to achieve high reliability in networks through distributed verification and multiple control plane composition. The core of Carbide is a simple, generic, efficient distributed verification framework that transforms a generic network verification problem to a reachability verification problem on a directed acyclic graph (DAG), and solves the latter via an efficient distributed verification protocol (DV-protocol). Equipped with verification results, Carbide allows the systematic composition of multiple control planes and realization of operator-specified consistency. Carbide is fully implemented. Extensive experiments show that (1) Carbide reduces downtime by 43% over the most reliable individual underlying control plane, while enforcing correctness requirements on all traffic; and (2) by systematically decomposing computation to devices and pruning unnecessary messaging between devices during verification, Carbide scales to a production data center network.

preprint · 2022 · arXiv

Charactering instrumental noises and stochastic gravitational wave signals from combined time-delay interferometry

LISA will detect gravitational waves (GWs) in the milli-Hz frequency band in space. Time-delay interferometry (TDI) is developed to suppress laser frequency noise beneath the acceleration noise and optical metrology noise. To identify stochastic GW signals, it is necessary to characterize these noise components entangled in TDI data streams. In this work, we investigate noise characterization by combining the first-generation TDI channels from Michelson and Relay configurations. The Michelson channels are helpful to characterize acceleration noises in the lower frequency band, and the Relay configuration could effectively resolve optical path noises in the higher frequencies. Synergy could be achieved from their combination to determine these instrumental noises. Based on the characterized noises, we further reconstruct the power spectrum of noise in the selected TDI channel. Two cases are examined to characterize the spectrum shape of a stochastic GW signal. For a modeled signal, its parameter(s) could be directly estimated from the TDI data, and its spectrum could be recovered from the inferred values. For an unexpected signal, its spectrum may be recognized and retrieved from the noise-subtracted residual in which its power spectral density surpasses the noise level.

preprint · 2022 · arXiv

Combinatorial Procurement Auction in Social Networks

This paper studies an emerging procurement auction scenario in which the market is constructed over a social network. In a social network composed of many agents, smartphones or computers, one requester releases her requirement for goods or tasks to suppliers; suppliers who have entered the market are then encouraged to invite other suppliers to join, and all the suppliers in the network can compete for the business. The key problem for this networked auction is how to incentivize each node that has entered the market not only to truthfully use her full ability, but also to forward the task to her neighbours. Auctions conducted over social networks have attracted considerable interest in recent years. However, most existing works focus on classic forward auctions. Moreover, there is no existing valid networked auction that considers multiple goods/tasks. This work is the first to explore procurement auctions for both homogeneous and heterogeneous goods or tasks in social networks. Through both theoretical proof and experimental simulation, we show that the proposed mechanisms are individually rational and incentive-compatible, and that the costs of both the system and the requester decrease.

preprint · 2022 · arXiv

Data-Efficient Backdoor Attacks

Recent studies have proven that deep neural networks are vulnerable to backdoor attacks. Specifically, by mixing a small number of poisoned samples into the training set, the behavior of the trained model can be maliciously controlled. Existing attack methods construct such adversaries by randomly selecting some clean data from the benign set and then embedding a trigger into them. However, this selection strategy ignores the fact that each poisoned sample contributes unequally to the backdoor injection, which reduces the efficiency of poisoning. In this paper, we formulate improving the poisoned data efficiency by the selection as an optimization problem and propose a Filtering-and-Updating Strategy (FUS) to solve it. The experimental results on CIFAR-10 and ImageNet-10 indicate that the proposed method is effective: the same attack success rate can be achieved with only 47% to 75% of the poisoned sample volume compared to the random selection strategy. More importantly, the adversaries selected according to one setting can generalize well to other settings, exhibiting strong transferability. The prototype code of our method is now available at https://github.com/xpf/Data-Efficient-Backdoor-Attacks.

preprint · 2022 · arXiv

Digital Twin Assisted Task Offloading for Aerial Edge Computing and Networks

Considering the user mobility and unpredictable mobile edge computing (MEC) environments, this paper studies the intelligent task offloading problem in unmanned aerial vehicle (UAV)-enabled MEC with the assistance of digital twin (DT). We aim at minimizing the energy consumption of the entire MEC system by jointly optimizing mobile terminal users (MTUs) association, UAV trajectory, transmission power distribution and computation capacity allocation while respecting the constraints of mission maximum processing delays. Specifically, double deep Q-network (DDQN) algorithm stemming from deep reinforcement learning is first proposed to effectively solve the problem of MTUs association and UAV trajectory. Then, the closed-form expression is employed to handle the problem of transmission power distribution and the computation capacity allocation problem is further addressed via an iterative algorithm. Numerical results show that our proposed scheme is able to converge and significantly reduce the total energy consumption of the MEC system compared to the benchmark schemes.

preprint · 2022 · arXiv

Dog nose print matching with dual global descriptor based on Contrastive Learning

Recent studies in biometric-based identification tasks have shown that deep learning methods can achieve better performance. These methods generally extract global features as a descriptor to represent the original image. Nonetheless, this approach does not perform well for biometric identification in fine-grained tasks. The main reason is that a single image descriptor contains insufficient information to represent the image. In this paper, we present a dual global descriptor model, which combines multiple global descriptors to exploit multi-level image features. Moreover, we utilize a contrastive loss to enlarge the distance between image representations of confusing classes. The proposed framework ranks in the top 2 on the CVPR2022 Biometrics Workshop Pet Biometric Challenge. The source code and trained models are publicly available at: https://github.com/flyingsheepbin/pet-biometrics

preprint · 2022 · arXiv

Enhancing Backdoor Attacks with Multi-Level MMD Regularization

While Deep Neural Networks (DNNs) excel in many tasks, the huge training resources they require become an obstacle for practitioners to develop their own models. It has become common to collect data from the Internet or hire a third party to train models. Unfortunately, recent studies have shown that these operations provide a viable pathway for maliciously injecting hidden backdoors into DNNs. Several defense methods have been developed to detect malicious samples, with the common assumption that the latent representations of benign and malicious samples extracted by the infected model exhibit different distributions. However, a comprehensive study on the distributional differences is missing. In this paper, we investigate such differences thoroughly via answering three questions: 1) What are the characteristics of the distributional differences? 2) How can they be effectively reduced? 3) What impact does this reduction have on difference-based defense methods? First, the distributional differences of multi-level representations on the regularly trained backdoored models are verified to be significant by introducing Maximum Mean Discrepancy (MMD), Energy Distance (ED), and Sliced Wasserstein Distance (SWD) as the metrics. Then, ML-MMDR, a difference reduction method that adds multi-level MMD regularization into the loss, is proposed, and its effectiveness is tested on three typical difference-based defense methods. Across all the experimental settings, the F1 scores of these methods drop from 90%-100% on the regularly trained backdoored models to 60%-70% on the models trained with ML-MMDR. These results indicate that the proposed MMD regularization can enhance the stealthiness of existing backdoor attack methods. The prototype code of our method is now available at https://github.com/xpf/Multi-Level-MMD-Regularization.
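For readers unfamiliar with the metric, the following sketch computes the standard biased estimate of squared MMD between two samples using a Gaussian (RBF) kernel. It illustrates only the distance itself; the kernel choice, the bandwidth, and the synthetic "benign" and "shifted" samples are assumptions for this example, and the multi-level regularization used in ML-MMDR is not reproduced here.

```python
import math
import random

def rbf(x, y, bandwidth=1.0):
    """Gaussian kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd2(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""
    kxx = sum(rbf(a, b, bandwidth) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, bandwidth) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

# Synthetic stand-ins for latent representations (illustrative data only).
rng = random.Random(0)
benign = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
shifted = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(100)]
gap = mmd2(benign, shifted)   # large when the two distributions differ
```

A regularizer built on this quantity would add it to the training loss so that the two sets of representations are pulled closer together.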

preprint · 2022 · arXiv

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

Follow-the-Regularized-Leader (FTRL) and Online Mirror Descent (OMD) are regret minimization algorithms for Online Convex Optimization (OCO); they are mathematically elegant but less practical for solving Extensive-Form Games (EFGs). Counterfactual Regret Minimization (CFR) is a technique for approximating Nash equilibria in EFGs. CFR and its variants have a fast convergence rate in practice, but their theoretical results are not satisfactory. In recent years, researchers have been trying to link CFRs with OCO algorithms, which may provide new theoretical results and inspire new algorithms. However, existing analysis is restricted to local decision points. In this paper, we show that CFRs with Regret Matching and Regret Matching+ are equivalent to special cases of FTRL and OMD, respectively. According to these equivalences, a new FTRL and a new OMD algorithm, which can be considered as extensions of vanilla CFR and CFR+, are derived. The experimental results show that the two variants converge faster than conventional FTRL and OMD, even faster than vanilla CFR and CFR+ in some EFGs.
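The two local update rules named in the abstract can be written in a few lines. The sketch below shows textbook Regret Matching and Regret Matching+ at a single decision point; the function names are chosen for this example, and neither the full CFR tree traversal nor the FTRL/OMD equivalence from the paper is implemented.

```python
def regret_matching(cum_regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positives = [max(r, 0.0) for r in cum_regrets]
    total = sum(positives)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / len(cum_regrets)] * len(cum_regrets)
    return [r / total for r in positives]

def update_regrets(cum_regrets, action_values, strategy, plus=False):
    """Accumulate instantaneous regrets; plus=True clips at zero (Regret Matching+)."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    updated = [r + (v - expected) for r, v in zip(cum_regrets, action_values)]
    if plus:
        updated = [max(r, 0.0) for r in updated]
    return updated
```

The only difference between the two variants is the clipping step: Regret Matching+ never lets a cumulative regret go negative, so an action that was poor in the past can regain probability quickly.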

preprint · 2022 · arXiv

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

Data-Efficient GANs (DE-GANs), which aim to learn generative models with a limited amount of training data, encounter several challenges for generating high-quality samples. Since data augmentation strategies have largely alleviated the training instability, how to further improve the generative performance of DE-GANs becomes a hotspot. Recently, contrastive learning has shown great potential for increasing the synthesis quality of DE-GANs, yet related principles are not well explored. In this paper, we revisit and compare different contrastive learning strategies in DE-GANs, and identify (i) the current bottleneck of generative performance is the discontinuity of latent space; (ii) compared to other contrastive learning strategies, Instance-perturbation works towards latent space continuity, which brings the major improvement to DE-GANs. Based on these observations, we propose FakeCLR, which only applies contrastive learning on perturbed fake samples, and devises three related training techniques: Noise-related Latent Augmentation, Diversity-aware Queue, and Forgetting Factor of Queue. Our experimental results establish a new state of the art on both few-shot generation and limited-data generation. On multiple datasets, FakeCLR acquires more than 15% FID improvement compared to existing DE-GANs. Code is available at https://github.com/iceli1007/FakeCLR.

preprint · 2022 · arXiv

Graph Layer Security: Encrypting Information via Common Networked Physics

The proliferation of low-cost Internet of Things (IoT) devices has led to a race between wireless security and channel attacks. Traditional cryptography requires high computational power and is not suitable for low-power IoT scenarios. Whilst recently developed physical layer security (PLS) can exploit common wireless channel state information (CSI), its sensitivity to channel estimation makes it vulnerable to attacks. In this work, we exploit an alternative common physics shared between IoT transceivers: the monitored channel-irrelevant physical networked dynamics (e.g., water/oil/gas/electrical signal-flows). Leveraging this, we propose, for the first time, graph layer security (GLS), by exploiting the dependency in physical dynamics among network nodes for information encryption and decryption. A graph Fourier transform (GFT) operator is used to characterize such dependency into a graph-bandlimited subspace, which allows the generation of channel-irrelevant cipher keys by maximizing the secrecy rate. We evaluate our GLS against designed active and passive attackers, using the IEEE 39-Bus system. Results demonstrate that GLS is not reliant on wireless CSI, and can combat attackers that have partial networked dynamic knowledge (realistic access to full dynamic and critical nodes remains challenging). We believe this novel GLS has widespread applicability in secure health monitoring and for Digital Twins in adversarial radio environments.

preprint · 2022 · arXiv

Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression

For neural video codec, it is critical, yet challenging, to design an efficient entropy model which can accurately predict the probability distribution of the quantized latent representation. However, most existing video codecs directly use the ready-made entropy model from image codec to encode the residual or motion, and do not fully leverage the spatial-temporal characteristics in video. To this end, this paper proposes a powerful entropy model which efficiently captures both spatial and temporal dependencies. In particular, we introduce the latent prior which exploits the correlation among the latent representation to squeeze the temporal redundancy. Meanwhile, the dual spatial prior is proposed to reduce the spatial redundancy in a parallel-friendly manner. In addition, our entropy model is also versatile. Besides estimating the probability distribution, our entropy model also generates the quantization step at spatial-channel-wise. This content-adaptive quantization mechanism not only helps our codec achieve the smooth rate adjustment in single model but also improves the final rate-distortion performance by dynamic bit allocation. Experimental results show that, powered by the proposed entropy model, our neural codec can achieve 18.2% bitrate saving on UVG dataset when compared with H.266 (VTM) using the highest compression ratio configuration. It makes a new milestone in the development of neural video codec. The codes are at https://github.com/microsoft/DCVC.

preprint · 2022 · arXiv

Improve Radar Sensing Performance of Multiple Roadside Units Cooperation via Space Registration

Roadside units (RSUs) can help vehicles sense the traffic environment, so as to improve traffic safety. Since the sensing capability of single RSU is limited, we propose a multiple RSUs cooperative radar sensing network (RSU-CRSN) with signal-level fusion technique. Spatial registration is an essential prerequisite and foundation for RSU-CRSN with signal-level fusion. In this paper, we present an adjustable beam enabled spatial registration algorithm (AB-SRA) that makes the sensing area of each RSU coincide by adjusting the sensing beam width of RSU. To adjust the width of sensing beam flexibly, a beamwidth adjustable beamforming algorithm (BABA) is proposed in this paper. Simulation results show that the performance of AB-SRA is close to perfect spatial registration.

preprint · 2022 · arXiv

LAMOST MRS-N Observations of the W80 Region

The spectral observations and analysis for the W80 Region are presented by using the data of Medium-Resolution Spectroscopic Survey of Nebulae (MRS-N) with the Large Sky Area Multi-Object Fiber Spectroscopy Telescope (LAMOST). A total of 2982 high-quality nebular spectra have been obtained in the 20 square degree field of view (FoV) which covers the W80 complex, and the largest sample of spectral data has been established for the first time. The relative intensities, radial velocities (RVs), and Full Widths at Half Maximum (FWHMs) are measured with the high spectral resolution of LAMOST MRS, for the Hα λ6563 Å, [N II] λλ6548, 6584 Å, and [S II] λλ6716, 6731 Å emission lines. In the field of view of the whole W80 Region, the strongest line emissions are found to be consistent with the bright nebulae, NGC 7000, IC 5070, and LBN 391, and weak line emissions also truly exist in the Middle Region, where no bright nebulae are detected by the wide-band optical observations. The large-scale spectral observations of the W80 Region reveal the systematic spatial variations of RVs and FWHMs, and several unique structural features. A 'curved feature' to the east of NGC 7000 and a 'jet feature' to the west of LBN 391 are detected with larger radial velocities. A 'wider FWHM region' is identified in the eastern part of NGC 7000. The variations of [S II]/Hα ratios display a gradient from southwest to northeast in the NGC 7000 region, and manifest a ring shape around the 'W80 bubble' ionized by an O-type star in L935. Further spectral and multi-band observations are warranted to investigate the structural features in detail.

preprint · 2022 · arXiv

LDoS attack detection method based on traffic time-frequency characteristics

Traditional denial-of-service attack detection methods rely on complex algorithms with high computational overhead, which makes it difficult for them to meet the demands of online detection; moreover, their experimental environments are mostly simulation platforms, which makes them difficult to deploy in real network environments. To address this, we propose an LDoS attack detection method oriented to real network environments and based on the time-frequency characteristics of traffic data. All the traffic data flowing through the Web server is obtained through the acquisition and storage system, and the detection data set is constructed through pre-processing. Simple features of the flow fragments are used as input, a deep neural network learns the time-frequency domain features of normal traffic and generates reconstructed sequences, and LDoS attacks are identified from the differences between the reconstructed sequences and the input data in the time-frequency domain. The experimental results show that the proposed method can accurately detect the attack features in the flow fragments in a very short time and achieve high detection accuracy for complex and diverse LDoS attacks; since only the statistical features of the packets are used, there is no need to parse the packet data, so the method can be adapted to different network environments.

preprint · 2022 · arXiv

Learning Task-relevant Representations for Generalization via Characteristic Functions of Reward Sequence Distributions

Generalization across different environments with the same tasks is critical for successful applications of visual reinforcement learning (RL) in real scenarios. However, visual distractions -- which are common in real scenes -- from high-dimensional observations can be harmful to the learned representations in visual RL, thus degrading the performance of generalization. To tackle this problem, we propose a novel approach, namely Characteristic Reward Sequence Prediction (CRESP), to extract the task-relevant information by learning reward sequence distributions (RSDs), as the reward signals are task-relevant in RL and invariant to visual distractions. Specifically, to effectively capture the task-relevant information via RSDs, CRESP introduces an auxiliary task -- that is, predicting the characteristic functions of RSDs -- to learn task-relevant representations, because we can well approximate the high-dimensional distributions by leveraging the corresponding characteristic functions. Experiments demonstrate that CRESP significantly improves the performance of generalization on unseen environments, outperforming several state-of-the-art methods on DeepMind Control tasks with different visual distractions.

preprint · 2022 · arXiv

Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning

Counterfactual Regret Minimization (CFR) has achieved many fascinating results in solving large-scale Imperfect Information Games (IIGs). Neural network approximation CFR (neural CFR) is one of the promising techniques that can reduce computation and memory consumption by generalizing decision information between similar states. Current neural CFR algorithms have to approximate cumulative regrets. However, efficient and accurate approximation in a large-scale IIG is still a tough challenge. In this paper, a new CFR variant, Recursive CFR (ReCFR), is proposed. In ReCFR, Recursive Substitute Values (RSVs) are learned and used to replace cumulative regrets. It is proven that ReCFR can converge to a Nash equilibrium at a rate of $O({1}/{\sqrt{T}})$. Based on ReCFR, a new model-free neural CFR with bootstrap learning, Neural ReCFR-B, is proposed. Due to the recursive and non-cumulative nature of RSVs, Neural ReCFR-B has lower-variance training targets than other neural CFRs. Experimental results show that Neural ReCFR-B is competitive with the state-of-the-art neural CFR algorithms at a much lower training cost.

preprint2022arXiv

Multi-Unit Diffusion Auctions with Intermediaries

This paper studies multi-unit auctions powered by intermediaries, where each intermediary owns a private set of unit-demand buyers and all intermediaries are networked with each other. Our goal is to incentivize the intermediaries to diffuse the auction information to individuals they can reach, including their private buyers and neighboring intermediaries, so that more potential buyers are able to participate in the auction. To this end, we build a diffusion-based auction framework which incorporates the strategic interaction of intermediaries. It is shown that the classic Vickrey-Clarke-Groves (VCG) mechanism within the framework can achieve the maximum social welfare, but it may decrease the seller's revenue or even lead to a deficit. To overcome the revenue issue, we propose a novel auction, called critical neighborhood auction, which not only maximizes the social welfare, but also improves the seller's revenue compared to the VCG mechanism with/without intermediaries.

preprint2022arXiv

MVD: Memory-Related Vulnerability Detection Based on Flow-Sensitive Graph Neural Networks

Memory-related vulnerabilities constitute severe threats to the security of modern software. Despite the success of deep learning-based approaches to generic vulnerability detection, they are still limited by the underutilization of flow information when applied to detecting memory-related vulnerabilities, leading to high false positives. In this paper, we propose MVD, a statement-level Memory-related Vulnerability Detection approach based on flow-sensitive graph neural networks (FS-GNN). FS-GNN is employed to jointly embed both unstructured information (i.e., source code) and structured information (i.e., control- and data-flow) to capture implicit memory-related vulnerability patterns. We evaluate MVD on a dataset containing 4,353 real-world memory-related vulnerabilities, and compare our approach with three state-of-the-art deep learning-based approaches as well as five popular static analysis-based memory detectors. The experiment results show that MVD achieves better detection accuracy, outperforming both state-of-the-art DL-based and static analysis-based approaches. Furthermore, MVD makes a great trade-off between accuracy and efficiency.

preprint2022arXiv

Neural Compression-Based Feature Learning for Video Restoration

Efficiently utilizing temporal features is crucial, yet challenging, for video restoration. The temporal features usually contain various noisy and uncorrelated information, and they may interfere with the restoration of the current frame. This paper proposes learning noise-robust feature representations to help video restoration. We are inspired by the observation that a neural codec is a natural denoiser. In a neural codec, noisy and uncorrelated contents, which are hard to predict but cost many bits, are more likely to be discarded to save bitrate. Therefore, we design a neural compression module to filter the noise and keep the most useful information in features for video restoration. To achieve robustness to noise, our compression module adopts a spatial channel-wise quantization mechanism to adaptively determine the quantization step size for each position in the latent. Experiments show that our method can significantly boost the performance on video denoising, where we obtain 0.13 dB improvement over BasicVSR++ with only 0.23x FLOPs. Meanwhile, our method also obtains SOTA results on video deraining and dehazing.

preprint2022arXiv

New Massive Contact Twin Binary in a Radio-quiet HII Region Associated with the M17 Complex

Early-B stars may create an HII region that appears radio-quiet. We report the identification of new early-B stars associated with the radio-quiet HII region G014.645--00.606 in the M17 complex. The radio-quiet HII region G014.645--00.606 is adjacent to three radio-quiet WISE HII region candidates. The ionizing sources of the radio-quiet HII regions are expected to be later than B1V, given the sensitivity of about 1-2 mJy of the MAGPIS 20 cm survey. The stars were first selected if their Gaia EDR3 parallaxes match that of the 22 GHz H$_2$O maser source within the same region. We used the color-magnitude diagram made from the ZTF photometric catalog to select the candidates for massive stars because the intrinsic $g-r$ colors of massive stars change little from B-type to O-type stars. Five stars lie in the areas of the color-magnitude diagram where either reddened massive stars or evolved post-main sequence stars of lower masses are commonly found. Three of the five stars, sources 1, 2, and 3, are located in the cavities of the three IR bubbles, and extended H$\alpha$ emission is detected around the three IR bubbles. We suggest that sources 1, 2, and 3 are candidates for early-B stars associated with the radio-quiet region G014.645--00.606. In particular, source 1 is an EW type eclipsing binary with a short period of 0.825 day, while source 2 is an EA type eclipsing binary with a short period of 0.919 day. The physical parameters of the two binary systems have been derived through the PHOEBE model. Source 1 is a twin binary of two stars with T~23,500 K, and source 2 contains a hotter component (T~20,100 K) and a cooler one (T~15,500 K). The $O-C$ values of source 1 show a declining trend, implying that the period of the source is decreasing. Source 1 is likely a contact early-B twin binary, for which mass transfer might cause its orbit to shrink.

preprint2022arXiv

Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction

Previous works on human motion prediction follow the pattern of building a mapping relation between the sequence observed and the one to be predicted. However, due to the inherent complexity of multivariate time series data, it still remains a challenge to find the extrapolation relation between motion sequences. In this paper, we present a new prediction pattern, which introduces previously overlooked human poses, to implement the prediction task from the view of interpolation. These poses exist after the predicted sequence, and form the privileged sequence. To be specific, we first propose an InTerPolation learning Network (ITP-Network) that encodes both the observed sequence and the privileged sequence to interpolate the in-between predicted sequence, wherein the embedded Privileged-sequence-Encoder (Priv-Encoder) learns the privileged knowledge (PK) simultaneously. Then, we propose a Final Prediction Network (FP-Network) for which the privileged sequence is not observable, but is equipped with a novel PK-Simulator that distills PK learned from the previous network. This simulator takes as input the observed sequence, but approximates the behavior of Priv-Encoder, enabling FP-Network to imitate the interpolation process. Extensive experimental results demonstrate that our prediction pattern achieves state-of-the-art performance on benchmarked H3.6M, CMU-Mocap and 3DPW datasets in both short-term and long-term predictions.

preprint2022arXiv

Phase Transitions and Superconductivity in Ternary Hydride Li$_2$SiH$_6$ at High Pressures

We predicted a new ternary hydride Li$_2$SiH$_6$ at high pressures. A systematic structure search in the Li$_2$SiH$_6$ compound reveals novel stable phases with intriguing electronic and phonon properties. It is found that Li$_2$SiH$_6$ is dynamically stable from ambient pressure up to 400 GPa with three novel phases: $P312$, $P\bar{3}$, and $P\bar{6}2m$. The calculation of electron-phonon coupling combined with Bardeen-Cooper-Schrieffer's argument indicates that this compound may be a candidate for high $T_c$ superconductors under high pressures. In particular, the maximum $T_c$ of $P\bar{6}2m$-Li$_2$SiH$_6$ at 400 GPa reaches 56 K. These findings may pave the way for obtaining room temperature superconductors in dense hydrogen-rich compounds.

preprint2022arXiv

Prompt-based System for Personality and Interpersonal Reactivity Prediction

This paper describes our proposed method for the Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA) 2022 shared task on Personality Prediction (PER) and Reactivity Index Prediction (IRI). We adopt a prompt-based learning method with a pre-trained language model to accomplish these tasks. Specifically, the prompt is designed to provide extra personalized information that enhances the pre-trained model. Data augmentation and model ensembling are adopted to obtain better results. Moreover, we also provide an online software demonstration and the code of the software for further research.

preprint2022arXiv

Remote blood pressure measurement via spatiotemporal mapping of a short-time facial video

Blood pressure (BP) monitoring is vital in daily healthcare, especially for cardiovascular diseases. However, BP values are mainly acquired through contact sensing methods, which are inconvenient and ill-suited to continuous BP measurement. Hence, we propose an efficient end-to-end network to estimate the BP values from a facial video to achieve remote BP measurement in daily life. In this study, we first derived a spatial-temporal map of a short-time (~15s) facial video. From the spatial-temporal map, we then determined the BP range with a designed blood pressure classifier and simultaneously calculated the specific value within each BP range with a blood pressure calculator. In addition, we also developed an innovative oversampling training strategy to handle the unbalanced data distribution problem. Finally, we trained the proposed network on a private dataset ASPD and tested it on the popular dataset MMSE-HR. As a result, the proposed network achieved state-of-the-art MAEs of 12.35 mmHg and 9.5 mmHg on systolic and diastolic BP measurements, respectively, better than recent works. We conclude that the proposed method has excellent potential for camera-based BP monitoring in real-world scenarios.

preprint2022arXiv

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

This paper introduces the schemes of Team LingJing's experiments in NLPCC-2022-Shared-Task-4 Multi-modal Dialogue Understanding and Generation (MDUG). The MDUG task can be divided into two phases: multi-modal context understanding and response generation. To fully leverage the visual information for both scene understanding and dialogue generation, we propose the scene-aware prompt for the MDUG task. Specifically, we utilize a multi-tasking strategy for jointly modelling scene- and session-level multi-modal understanding. The visual captions are adopted to capture the scene information, while a fixed-type templated prompt based on the scene- and session-aware labels is used to further improve the dialogue generation performance. Extensive experimental results show that the proposed method achieves state-of-the-art (SOTA) performance compared with other competitive methods, and we rank first in all three subtasks of this MDUG competition.

preprint2022arXiv

Secure UAV-to-Ground MIMO Communications: Joint Transceiver and Location Optimization

Unmanned aerial vehicles (UAVs) are foreseen to constitute promising airborne communication devices as a benefit of their superior channel quality. But UAV-to-ground (U2G) communications are vulnerable to eavesdropping. Hence, we conceive a sophisticated physical layer security solution for improving the secrecy rate of multi-antenna aided U2G systems. Explicitly, the secrecy rate of the U2G MIMO wiretap channels is derived by using random matrix theory. The resultant explicit expression is then applied in the joint optimization of the MIMO transceiver and the UAV location relying on an alternating optimization technique. Our numerical results show that the joint transceiver and location optimization conceived facilitates secure communications even in the challenging scenario, where the legitimate channel of confidential information is inferior to the eavesdropping channel.

preprint2022arXiv

Self-Adversarial Training incorporating Forgery Attention for Image Forgery Localization

Image editing techniques enable people to modify the content of an image without leaving visual traces and thus may cause serious security risks. Hence the detection and localization of these forgeries become quite necessary and challenging. Furthermore, unlike other tasks with extensive data, there is usually a lack of annotated forged images for training due to annotation difficulties. In this paper, we propose a self-adversarial training strategy and a reliable coarse-to-fine network that utilizes a self-attention mechanism to localize forged regions in forgery images. The self-attention module is based on a Channel-Wise High Pass Filter block (CW-HPF). CW-HPF leverages inter-channel relationships of features and extracts noise features by high pass filters. Based on the CW-HPF, a self-attention mechanism, called forgery attention, is proposed to capture rich contextual dependencies of intrinsic inconsistency extracted from tampered regions. Specifically, we append two types of attention modules on top of CW-HPF respectively to model internal interdependencies in spatial dimension and external dependencies among channels. We exploit a coarse-to-fine network to enhance the noise inconsistency between original and tampered regions. More importantly, to address the issue of insufficient training data, we design a self-adversarial training strategy that expands training data dynamically to achieve more robust performance. Specifically, in each training iteration, we perform adversarial attacks against our network to generate adversarial examples and train our model on them. Extensive experimental results demonstrate that our proposed algorithm steadily outperforms state-of-the-art methods by a clear margin in different benchmark datasets.

preprint2022arXiv

Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis

Universal style transfer (UST) infuses styles from arbitrary reference images into content images. Existing methods, while enjoying many practical successes, are unable to explain experimental observations, including the different performances of UST algorithms in preserving the spatial structure of content images. In addition, methods are limited to cumbersome global controls on stylization, so that they require additional spatial masks for desired stylization. In this work, we provide a systematic Fourier analysis on a general framework for UST. We present an equivalent form of the framework in the frequency domain. The form implies that existing algorithms treat all frequency components and pixels of feature maps equally, except for the zero-frequency component. We connect Fourier amplitude and phase with Gram matrices and a content reconstruction loss in style transfer, respectively. Based on this equivalence and these connections, we can interpret the different structure preservation behaviors of algorithms in terms of Fourier phase. Given these interpretations, we propose two practical manipulations for structure preservation and desired stylization. Both qualitative and quantitative experiments demonstrate the competitive performance of our method against the state-of-the-art methods. We also conduct experiments to demonstrate (1) the abovementioned equivalence, (2) the interpretability based on Fourier amplitude and phase and (3) the controllability associated with frequency components.

preprint2022arXiv

The Data Processing of the LAMOST Medium-Resolution Spectral Survey of Galactic Nebulae (LAMOST MRS-N Pipeline)

The Large sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) medium-resolution spectral survey of Galactic Nebulae (MRS-N) has been conducted for three years since Sep. 2018 and has observed more than 190 thousand nebular spectra and 20 thousand stellar spectra. However, there is not yet a data processing pipeline for nebular data. To significantly improve the accuracy of nebulae classification and their physical parameters, we developed the MRS-N Pipeline. This article presents in detail each data processing step of the MRS-N Pipeline, such as removing cosmic rays, merging single exposures, fitting sky light emission lines, subtracting skylight, wavelength recalibration, measuring nebular parameters, creating catalogs and packing spectra. Finally, a description of the data products, including nebular spectra files and parameter catalogs, is provided.

preprint2022arXiv

Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

Despite achieving great success, Deep Neural Networks (DNNs) are vulnerable to adversarial examples. How to accurately evaluate the adversarial robustness of DNNs is critical for their deployment in real-world applications. An ideal indicator of robustness is adversarial risk. Unfortunately, since it involves maximizing the 0-1 loss, calculating the true risk is technically intractable. The most common solution for this is to compute an approximate risk by replacing the 0-1 loss with a surrogate one. Some functions have been used, such as Cross-Entropy (CE) loss and Difference of Logits Ratio (DLR) loss. However, these functions are all manually designed and may not be well suited for adversarial robustness evaluation. In this paper, we leverage AutoML to tighten the error (gap) between the true and approximate risks. Our main contributions are as follows. First, AutoLoss-AR, the first method to search for surrogate losses for adversarial risk, with an elaborate search space, is proposed. The experimental results on 10 adversarially trained models demonstrate the effectiveness of the proposed method: the risks evaluated using the best-discovered losses are 0.2% to 1.6% better than those evaluated using the handcrafted baselines. Second, 5 surrogate losses with clean and readable formulas are distilled out and tested on 7 unseen adversarially trained models. These losses outperform the baselines by 0.8% to 2.4%, indicating that they can be used individually as some kind of new knowledge. Besides, the possible reasons for the better performance of these losses are explored.
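To make the gap between the 0-1 loss and its surrogates concrete, the sketch below evaluates the 0-1 loss, the Cross-Entropy loss, and the Difference of Logits Ratio loss on one logit vector. The DLR expression is the commonly cited form, $-(z_y - \max_{i \neq y} z_i)/(z_{\pi_1} - z_{\pi_3})$ with $\pi$ sorting logits in decreasing order; it is written from that definition and should be checked against the original source. The function names and toy logits are assumptions, and none of this is AutoLoss-AR or one of its discovered losses.

```python
import math

def zero_one_loss(logits, label):
    """1.0 if the arg-max class differs from the label, else 0.0."""
    predicted = max(range(len(logits)), key=lambda i: logits[i])
    return 0.0 if predicted == label else 1.0

def cross_entropy_loss(logits, label):
    """Numerically stable CE: log-sum-exp of the logits minus the true logit."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def dlr_loss(logits, label):
    """DLR surrogate; needs at least 3 classes and distinct top-1/top-3 logits."""
    ordered = sorted(logits, reverse=True)
    best_other = max(z for i, z in enumerate(logits) if i != label)
    return -(logits[label] - best_other) / (ordered[0] - ordered[2])

logits = [3.0, 1.0, 0.0]  # toy logits, illustrative only
correct = (zero_one_loss(logits, 0), cross_entropy_loss(logits, 0), dlr_loss(logits, 0))
wrong = (zero_one_loss(logits, 1), cross_entropy_loss(logits, 1), dlr_loss(logits, 1))
```

An attack maximizes the surrogate in place of the 0-1 loss; the DLR value turns positive exactly when the example is misclassified, whereas the CE value has no such threshold.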

preprint2022arXiv

Universal Polar Coding for Parallel Gaussian Channels with Non-Binary Inputs and Its Applications to HARQ and MIMO

In this paper, we first propose a universal polar coding scheme for parallel Gaussian channels with non-binary inputs. It is assumed that the encoder knows only the sum capacity of $M$ parallel channels instead of the capacity of any single channel. By decomposing each parallel channel into $T = \lceil \log_2 r \rceil$ sub-channels, we obtain $MT$ binary sub-channels. A super polar coding scheme that spans all sub-channels is then proposed. This scheme can achieve the sum capacity when the block length is sufficiently large. We also discuss the applications of parallel polar coding design to both HARQ and MIMO systems. It is shown that a capacity-achieving HARQ scheme can be obtained for the block fading channel, and that a capacity-achieving MIMO design that requires only the feedback of the sum rate of all MIMO layers can also be attained.
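The sub-channel count can be checked with a few lines of arithmetic. The sketch below assumes that $r$ denotes the input alphabet size and that $T$ is the ceiling of $\log_2 r$ (the number of bits needed to label one $r$-ary symbol); both readings are assumptions, since the abstract does not define them. The function names are hypothetical.

```python
def bits_per_symbol(r):
    """Ceiling of log2(r) for an integer alphabet size r >= 2, without floats."""
    return (r - 1).bit_length()

def binary_subchannel_count(num_parallel_channels, r):
    """M parallel r-ary channels -> M * T binary sub-channels."""
    return num_parallel_channels * bits_per_symbol(r)

def symbol_to_bits(symbol, r):
    """Binary label of one r-ary symbol, most significant bit first."""
    width = bits_per_symbol(r)
    return [(symbol >> shift) & 1 for shift in range(width - 1, -1, -1)]

count_16ary = binary_subchannel_count(4, 16)  # M = 4, r = 16
count_5ary = binary_subchannel_count(3, 5)    # r not a power of two
label = symbol_to_bits(5, 8)
```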

preprint2022arXiv

Waiting but not Aging: Optimizing Information Freshness Under the Pull Model

The Age-of-Information (AoI) is an important metric for investigating the timeliness performance in information-update systems. In this paper, we study the AoI minimization problem under a new Pull model with replication schemes, where a user proactively sends a replicated request to multiple servers to "pull" the information of interest. Interestingly, we find that under this new Pull model, replication schemes capture a novel tradeoff between different values of the AoI across the servers (due to the random updating processes) and different response times across the servers, which can be exploited to minimize the expected AoI at the user's side. Specifically, assuming Poisson updating process for the servers and exponentially distributed response time, we derive a closed-form formula for computing the expected AoI and obtain the optimal number of responses to wait for to minimize the expected AoI. Then, we extend our analysis to the setting where the user aims to maximize the AoI-based utility, which represents the user's satisfaction level with respect to the freshness of the received information. Furthermore, we consider a more realistic scenario where the user has no prior knowledge of the system. In this case, we reformulate the utility maximization problem as a stochastic Multi-Armed Bandit problem with side observations and leverage a special linear structure of side observations to design learning algorithms with improved performance guarantees. Finally, we conduct extensive simulations to elucidate our theoretical results and compare the performance of different algorithms. Our findings reveal that under the Pull model, waiting does not necessarily lead to aging; waiting for more than one response can often significantly reduce the AoI and improve the AoI-based utility in most scenarios.
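The tradeoff between fresher information and longer waiting can be illustrated with a simplified model, which is an assumption of this sketch and not necessarily the paper's exact formulation: each of $n$ servers holds information whose age at request time is exponential with rate $\lambda$, response times are i.i.d. exponential with rate $\mu$, and the user waits for the first $k$ responses and keeps the freshest. Under these assumptions the expected AoI is the expected $k$-th order statistic of the response times plus the expected minimum of $k$ ages, $\sum_{j=0}^{k-1} \frac{1}{(n-j)\mu} + \frac{1}{k\lambda}$.

```python
def expected_aoi(n, k, update_rate, response_rate):
    """Expected AoI when waiting for k of n responses (simplified model).

    update_rate (lambda) and response_rate (mu) are illustrative parameters.
    """
    # Expected time until the k-th fastest of n exponential responses arrives.
    wait = sum(1.0 / ((n - j) * response_rate) for j in range(k))
    # Expected age of the freshest of k independent exponential ages.
    freshest_age = 1.0 / (k * update_rate)
    return wait + freshest_age

def best_num_responses(n, update_rate, response_rate):
    """Enumerate k = 1..n and return the k with the smallest expected AoI."""
    return min(range(1, n + 1),
               key=lambda k: expected_aoi(n, k, update_rate, response_rate))

aoi_first_only = expected_aoi(10, 1, 1.0, 1.0)
best_k = best_num_responses(10, 1.0, 1.0)
aoi_best = expected_aoi(10, best_k, 1.0, 1.0)
```

With these toy parameters, waiting for several responses gives a lower expected AoI than taking only the first one, which matches the qualitative claim that waiting does not necessarily lead to aging.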

preprint2021arXiv

A Generic Object Re-identification System for Short Videos

Short video applications like TikTok and Kwai have been a great hit recently. In order to meet the increasing demands and take full advantage of visual information in short videos, objects in each short video need to be located and analyzed as an upstream task. A question is thus raised -- how to improve the accuracy and robustness of object detection, tracking, and re-identification across vast numbers of short videos with hundreds of categories and complicated visual effects (VFX). To this end, a system composed of a detection module, a tracking module and a generic object re-identification module is proposed in this paper, which captures features of major objects from short videos. In particular, to meet the high efficiency demands of practical short video applications, a Temporal Information Fusion Network (TIFN) is proposed in the object detection module, which shows accuracy comparable to, and time efficiency better than, the state-of-the-art video object detector. Furthermore, in order to mitigate the fragmentation of tracklets in short videos, a Cross-Layer Pointwise Siamese Network (CPSN) is proposed in the tracking module to enhance the robustness of the appearance model. Moreover, in order to evaluate the proposed system, two challenging datasets containing real-world short videos are built for video object trajectory extraction and generic object re-identification respectively. Overall, extensive experiments for each module and the whole system demonstrate the effectiveness and efficiency of our system.

preprint2021arXiv

Bayesian Nonparametric Space Partitions: A Survey

Bayesian nonparametric space partition (BNSP) models provide a variety of strategies for partitioning a $D$-dimensional space into a set of blocks, such that data points lying in the same block share certain kinds of homogeneity. BNSP models can be applied to various areas, such as regression/classification trees, random feature construction, relational modeling, etc. In this survey, we investigate the current progress of BNSP research through the following three perspectives: models, which review various strategies for generating the partitions in the space and discuss their theoretical foundation `self-consistency'; applications, which cover the current mainstream usages of BNSP models and their potential future practices; and challenges, which identify the current unsolved problems and valuable future research topics. As there has been no comprehensive review of the BNSP literature so far, we hope that this survey can induce further exploration and exploitation on this topic.

preprint2021arXiv

Efficient Learning-based Scheduling for Information Freshness in Wireless Networks

Motivated by the recent trend of integrating artificial intelligence into the Internet-of-Things (IoT), we consider the problem of scheduling packets from multiple sensing sources to a central controller over a wireless network. Here, packets from different sensing sources have different values or degrees of importance to the central controller for intelligent decision making. In such a setup, it is critical to provide timely and valuable information for the central controller. In this paper, we develop a parameterized maximum-weight type scheduling policy that combines both the Age-of-Information (AoI) metrics and Upper Confidence Bound (UCB) estimates in its weight measure with parameter $\eta$. Here, UCB estimates balance the tradeoff between exploration and exploitation in learning and are critical for yielding a small cumulative regret. We show that our proposed algorithm yields a running average total age of at most $O(N^2\eta)$. We also prove that our proposed algorithm achieves a cumulative regret over time horizon $T$ of at most $O(NT/\eta+\sqrt{NT\log T})$. This reveals a tradeoff between the cumulative regret and the running average total age: as $\eta$ increases, the cumulative regret becomes smaller, but at the cost of a larger running average total age. Simulation results are provided to evaluate the efficiency of our proposed algorithm.
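The abstract does not give the exact weight measure, so the sketch below uses an assumed form purely for illustration: each source's weight is its current age plus $\eta$ times a standard UCB1 index of its estimated packet value, and the source with the largest weight is scheduled. The additive combination, the UCB1 bonus, and all names and numbers are assumptions of this example, not the paper's policy.

```python
import math

def ucb1_index(mean_value, pulls, t):
    """Standard UCB1 index; unexplored sources get an infinite index."""
    if pulls == 0:
        return math.inf
    return mean_value + math.sqrt(2.0 * math.log(t) / pulls)

def schedule(ages, mean_values, pulls, t, eta):
    """Pick the source maximizing age + eta * UCB (assumed weight form)."""
    weights = [
        age + eta * ucb1_index(mean, n, t)
        for age, mean, n in zip(ages, mean_values, pulls)
    ]
    return max(range(len(weights)), key=lambda i: weights[i])

# Toy state: source 0 is stale but low-value, source 1 is fresh but valuable.
ages, means, pulls, t = [5.0, 1.0], [0.2, 0.9], [10, 10], 20
age_driven = schedule(ages, means, pulls, t, eta=0.0)
value_driven = schedule(ages, means, pulls, t, eta=100.0)
```

In this toy setting a small $\eta$ favors the stale source and a large $\eta$ favors the valuable one, mirroring the stated tradeoff between running average age and cumulative regret.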

preprint2021arXiv

Image Steganography based on Iteratively Adversarial Samples of A Synchronized-directions Sub-image

Nowadays steganography has to face the challenges of both feature-based steganalysis and convolutional neural network (CNN) based steganalysis. In this paper, we present a novel steganography scheme denoted as ITE-SYN (based on ITEratively adversarial perturbations onto a SYNchronized-directions sub-image), by which secret data is embedded with synchronized modification directions to enhance security, and then iteratively increased perturbations are added onto a sub-image to reduce the loss with respect to the cover class label of the target CNN classifier. First, an existing steganographic function is employed to compute initial costs. Then the cover image is decomposed into non-overlapping sub-images. After each sub-image is embedded, costs are adjusted following the clustering modification directions profile. The next sub-image is then embedded with the adjusted costs, until all secret data has been embedded. If the target CNN classifier does not classify the stego image as a cover image, then, starting from the adjusted costs, we change the costs adversarially according to the signs of the gradients back-propagated from the CNN classifier. A sub-image is then chosen to be re-embedded with the changed costs. The adversarial intensity is iteratively increased until the adversarial stego image can fool the target CNN classifier. Experiments demonstrate that the proposed method effectively enhances security against both conventional feature-based classifiers and CNN classifiers, even non-target CNN classifiers.

preprint2021arXiv

Infant Cry Classification with Graph Convolutional Networks

We propose a graph convolutional network approach for robust infant cry classification. We construct non-fully connected graphs based on the similarities among the relevant nodes in both supervised and semi-supervised node classification with convolutional neural networks to consider the short-term and long-term effects of infant cry signals related to inner-class and inter-class messages. The approach captures the diversity of variations within infant cries, especially for limited training samples. The effectiveness of this approach is evaluated on the Baby Chillanto database and the Baby2020 database. With as little as 20% of the training data labeled, our model outperforms a CNN model trained with 80% labeled training data, and the accuracy improves steadily as the number of labeled training samples increases. The best results give significant improvements of 7.36% and 3.59% over the results of the CNN models on the Baby Chillanto database and the Baby2020 database, respectively.

preprint2021arXiv

More but Correct: Generating Diversified and Entity-revised Medical Response

Medical Dialogue Generation (MDG) is intended to build a medical dialogue system for intelligent consultation, which can communicate with patients in real-time, thereby improving the efficiency of clinical diagnosis with broad application prospects. This paper presents our proposed framework for the Chinese MDG organized by the 2021 China conference on knowledge graph and semantic computing (CCKS) competition, which requires generating context-consistent and medically meaningful responses conditioned on the dialogue history. In our framework, we propose a pipeline system composed of entity prediction and entity-aware dialogue generation, by adding predicted entities to the dialogue model with a fusion mechanism, thereby utilizing information from different sources. At the decoding stage, we propose a new decoding mechanism named Entity-revised Diverse Beam Search (EDBS) to improve entity correctness and promote the length and quality of the final response. The proposed method wins both the CCKS and the International Conference on Learning Representations (ICLR) 2021 Workshop Machine Learning for Preventing and Combating Pandemics (MLPCP) Track 1 Entity-aware MED competitions, which demonstrates the practicality and effectiveness of our method.

preprint2021arXiv

Protonation-induced discrete superconducting phases in bulk FeSe single crystals

The superconducting transition temperature, $T_{\rm{c}}$, of FeSe can be enhanced several-fold by applying pressure, electron doping, intercalating spacing layers, and reducing dimensionality. Various ordered electronic phases, such as nematicity and spin density waves, have also been observed accompanying high-$T_{\rm{c}}$ superconductivity. Investigating the evolution of the electronic structure with $T_{\rm{c}}$ is essential to understanding electronic behavior and high-$T_{\rm{c}}$ superconductivity in FeSe and its derived superconductors. In this report, we have found a series of discrete superconducting phases, with a maximum $T_{\rm{c}}$ up to 44 K, in H$^+$-intercalated FeSe single crystals using an ionic liquid gating method. Accompanying the increase of $T_{\rm{c}}$, suppression of the nematic phase and evolution from non-Fermi-liquid to Fermi-liquid behavior were observed. An abrupt change in the Fermi surface topology was proposed to explain the discrete superconducting phases. A band structure that favors the high-$T_{\rm{c}}$ superconducting phase was also revealed.

preprint2021arXiv

Quantum versus Classical Regime in Circuit Quantum Acoustodynamics

We experimentally study a circuit quantum acoustodynamics system, which consists of a superconducting artificial atom, coupled to both a two-dimensional surface acoustic wave resonator and a one-dimensional microwave transmission line. The strong coupling between the artificial atom and the acoustic wave resonator is confirmed by the observation of the vacuum Rabi splitting at the base temperature of a dilution refrigerator. We show that the propagation of microwave photons in the microwave transmission line can be controlled by a few phonons in the acoustic wave resonator. Furthermore, we demonstrate the temperature effect on the measurements of the Rabi splitting and temperature-induced transitions from highly excited dressed states. We find that the two-peak spectral structure of the Rabi splitting evolves into a multi-peak structure and gradually disappears as the environmental temperature $T$ increases. The quantum-to-classical transition is observed around the crossover temperature $T_{c}$, which is determined via the thermal fluctuation energy $k_{B}T$ and the characteristic energy level spacing of the coupled system. Experimental results agree well with the theoretical simulations via the master equation of the coupled system at different effective temperatures.

preprint2021arXiv

Self-supervised Visual-LiDAR Odometry with Flip Consistency

Most learning-based methods estimate ego-motion by utilizing visual sensors, which suffer from dramatic lighting variations and textureless scenarios. In this paper, we incorporate sparse but accurate depth measurements obtained from lidars to overcome the limitations of visual methods. To this end, we design a self-supervised visual-lidar odometry (Self-VLO) framework. It takes both monocular images and sparse depth maps projected from 3D lidar points as input, and produces pose and depth estimations in an end-to-end learning manner, without using any ground truth labels. To effectively fuse the two modalities, we design a two-pathway encoder to extract features from visual and depth images and fuse the encoded features with those in the decoders at multiple scales using our fusion module. We also adopt a Siamese architecture and design an adaptively weighted flip consistency loss to facilitate the self-supervised learning of our VLO. Experiments on the KITTI odometry benchmark show that the proposed approach outperforms all self-supervised visual or lidar odometries. It also performs better than fully supervised VOs, demonstrating the power of fusion.

preprint2021arXiv

Serial-parallel Multi-Scale Feature Fusion for Anatomy-Oriented Hand Joint Detection

Accurate hand joint detection from images is a fundamental topic that is essential for many applications in computer vision and human-computer interaction. This paper presents a two-stage network for hand joint detection from a single unmarked image using serial-parallel multi-scale feature fusion. In stage I, the hand regions are located by a pre-trained network, and the features of each detected hand region are extracted by a shallow spatial hand feature representation module. The extracted hand features are then fed into stage II, which consists of serially connected feature extraction modules with similar structures, called "multi-scale feature fusion" (MSFF). An MSFF contains parallel multi-scale feature extraction branches, which generate initial hand joint heatmaps. The initial heatmaps are then mutually reinforced by the anatomic relationship between hand joints. The experimental results on five hand joint datasets show that the proposed network outperforms the state-of-the-art methods.

preprint2021arXiv

The intrinsic structure of Sagittarius A* at 1.3 cm and 7 mm

Sagittarius A* (Sgr A*), the Galactic Center supermassive black hole (SMBH), is one of the best targets for resolving the innermost region of an SMBH with very long baseline interferometry (VLBI). In this study, we have carried out observations toward Sgr A* at 1.349 cm (22.223 GHz) and 6.950 mm (43.135 GHz) with the East Asian VLBI Network, as a part of the multi-wavelength campaign of the Event Horizon Telescope (EHT) in 2017 April. To mitigate scattering effects, the physically motivated scattering kernel model from Psaltis et al. (2018) and the scattering parameters from Johnson et al. (2018) have been applied. As a result, a single, symmetric Gaussian model describes the intrinsic structure of Sgr A* well at both wavelengths. From closure amplitudes, the major-axis sizes are $\sim$704$\pm$102 $\mu$as (axial ratio $\sim$1.19$^{+0.24}_{-0.19}$) and $\sim$300$\pm$25 $\mu$as (axial ratio $\sim$1.28$\pm$0.2) at 1.349 cm and 6.950 mm, respectively. Together with a quasi-simultaneous observation at 3.5 mm (86 GHz) by Issaoun et al. (2019), we show that the intrinsic size scales with observing wavelength as a power law, with an index of $\sim$1.2$\pm$0.2. Our results also provide estimates of the size and compact flux density at 1.3 mm, which can be incorporated into the analysis of the EHT observations. In terms of the origin of the radio emission, we have compared the intrinsic structures with the accretion flow scenario, especially the radiatively inefficient accretion flow based on the Keplerian shell model. With this, we show that a nonthermal electron population is necessary to reproduce the source sizes.
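As a rough illustration of the power-law scaling quoted in this abstract, the index can be estimated from only the two major-axis sizes reported here. This is a two-point sketch under simplifying assumptions, not the authors' fit: it ignores the measurement uncertainties and the 3.5 mm point, and the function name `two_point_index` is hypothetical.

```python
import math

def two_point_index(size_a, lam_a, size_b, lam_b):
    """Index beta in size proportional to wavelength**beta, from two (size, wavelength) pairs."""
    return math.log(size_a / size_b) / math.log(lam_a / lam_b)

# Major-axis sizes (microarcseconds) and wavelengths (cm) quoted in the abstract.
beta = two_point_index(704.0, 1.349, 300.0, 0.6950)
print(f"two-point index ~ {beta:.2f}")
```

The two-point value comes out near 1.3, which lies inside the quoted range of 1.2 ± 0.2.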

preprint2021arXiv

Understanding the Error in Evaluating Adversarial Robustness

Deep neural networks are easily misled by adversarial examples. Although many defense methods have been proposed, many of them have been shown to lose effectiveness against properly performed adaptive attacks. How to evaluate adversarial robustness effectively is important for the realistic deployment of deep models, yet it remains unclear. To provide a reasonable solution, one of the primary tasks is to understand the error (or gap) between the true adversarial robustness and the evaluated one: what it is and why it exists. This paper takes several steps to clarify it. Firstly, we introduce an interesting phenomenon named gradient traps, which lead to incompetent adversaries and are demonstrated to be a manifestation of evaluation error. Then, we analyze the error and identify three components, each of which is caused by a specific compromise. Moreover, based on the above analysis, we present our evaluation suggestions. Experiments on adversarial training and its variations indicate that: (1) the error does exist empirically, and (2) these defenses are still vulnerable. We hope these analyses and results will help the community to develop more powerful defenses.

preprint2020arXiv

A Fast Recursive Algorithm for G-STBC

This paper proposes a fast recursive algorithm for Group-wise Space-Time Block Code (G-STBC), which takes full advantage of the Alamouti structure in the equivalent channel matrix to reduce the computational complexity. Compared with the existing efficient algorithms for G-STBC, the proposed algorithm achieves better performance and usually has lower computational complexity.

preprint2020arXiv

An Improved Square-root Algorithm for V-BLAST Based on Efficient Inverse Cholesky Factorization

A fast algorithm for inverse Cholesky factorization is proposed, to compute a triangular square-root of the estimation error covariance matrix for the Vertical Bell Laboratories Layered Space-Time architecture (V-BLAST). It is then applied to propose an improved square-root algorithm for V-BLAST, which speeds up several steps of the previous one and can offer further computational savings in MIMO Orthogonal Frequency Division Multiplexing (OFDM) systems. Compared to the conventional inverse Cholesky factorization, the proposed one avoids the back substitution (of the Cholesky factor), and thus requires only half as many divisions. The proposed V-BLAST algorithm is faster than the existing efficient V-BLAST algorithms. The expected speedups of the proposed square-root V-BLAST algorithm over the previous one and the fastest known recursive V-BLAST algorithm are 3.9 to 5.2 and 1.05 to 1.4, respectively.

preprint2020arXiv

Bulk Superconductivity in the Dirac Semimetal TlSb

A feasible strategy to realize Majorana fermions is to search for a simple compound with both bulk superconductivity and Dirac surface states. In this paper, we perform calculations of the electronic band structure, the Fermi surface and the surface states, and measure the resistivity, magnetization and specific heat of the TlSb compound with a CsCl-type structure. The band structure calculations show that TlSb is a Dirac semimetal when spin-orbit coupling is taken into account. Meanwhile, we find for the first time that TlSb is a type-II superconductor with $T_c$ = 4.38 K, $H_{c1}$(0) = 148 Oe, $H_{c2}$(0) = 1.12 T and $\kappa_{GL}$ = 10.6, and confirm it to be a moderately coupled s-wave superconductor. Although we cannot determine which bands near the Fermi level $E_F$ are responsible for superconductivity, its coexistence with the topological surface states implies that the TlSb compound may be a simple material platform for realizing fault-tolerant quantum computation.

preprint2020arXiv

CALPA-NET: Channel-pruning-assisted Deep Residual Network for Steganalysis of Digital Images

Over the past few years, detection performance improvements of deep-learning based steganalyzers have usually been achieved through structure expansion. However, an excessively expanded structure results in huge computational cost and storage overheads, and consequently in difficulty in training and deployment. In this paper we propose CALPA-NET, a ChAnneL-Pruning-Assisted deep residual network architecture search approach to shrink the network structure of existing vast, over-parameterized deep-learning based steganalyzers. We observe that the broad inverted-pyramid structure of existing deep-learning based steganalyzers might contradict the well-established model-diversity-oriented philosophy, and is therefore not suitable for steganalysis. A hybrid criterion combined with two network pruning schemes is then introduced to adaptively shrink every involved convolutional layer in a data-driven manner. The resulting network architecture presents a slender bottleneck-like structure. We have conducted extensive experiments on the BOSSBase+BOWS2 dataset, the more diverse ALASKA dataset, and even a large-scale subset extracted from the ImageNet CLS-LOC dataset. The experimental results show that the model structure generated by our proposed CALPA-NET can achieve comparable performance with less than two percent of the parameters and about one third of the FLOPs of the original steganalytic model. The new model possesses even better adaptivity, transferability, and scalability.

preprint2020arXiv

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning

Portfolio selection is an important real-world financial task and has attracted extensive attention in artificial intelligence communities. This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representations very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs. Most existing methods adopt handcrafted features and/or impose no constraints on the costs, which may make them perform unsatisfactorily and fail to control both costs in practice. In this paper, we propose a cost-sensitive portfolio selection method with deep reinforcement learning. Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations, while a new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning. We theoretically analyze the near-optimality of the proposed reward, which shows that the growth rate of the policy regarding this reward function can approach the theoretical optimum. We also empirically evaluate the proposed method on real-world datasets. Promising results demonstrate the effectiveness and superiority of the proposed method in terms of profitability, cost-sensitivity and representation abilities.

preprint2020arXiv

Crop Water Status Monitoring by Terahertz Imaging

We demonstrate the reliability and applicability of THz imaging for destructive and non-destructive water status monitoring of winter wheat leaves. Based on the measured THz transmission amplitude, we find that the water loss in the distal region is less than that in the basal region during the natural dehydration process. A high correlation is shown between the transmitted THz signal and the water content level measured by the gravimetric weighing method during dehydration and after rehydration. The obtained results show that the water content in winter wheat leaves can be measured destructively and non-destructively with high accuracy, using terahertz waves in a transmission geometry.

preprint2020arXiv

DR 21 South Filament: a Parsec-sized Dense Gas Accretion Flow onto the DR 21 Massive Young Cluster

The DR21 south filament (DR21SF) is a unique component of the giant network of filamentary molecular clouds in the northern region of the Cygnus X complex. Unlike the highly fragmented and actively star-forming environment in which it resides, DR21SF exhibits a coherent profile in the column density map with very few star formation signposts, even though the previously reported linear density of the filament is an order of magnitude higher than the threshold for thermal stability. We derive the size (3.6~pc by 0.13~pc), temperature (10 to 15~K), and mass (1048~\textit{M$_\odot$}) of DR21SF from Shanghai 65 m TianMa Radio Telescope (TMRT) observations of NH$_3$ (1, 1) and (2, 2) inversion lines in conjunction with the column density map from our previous work. Star-forming sites are identified along the filament where the gas temperature shows an excess. We find clear gradients in radial velocity and intrinsic line-width along the spine of the filament. The gradients can be well interpreted with a scenario of an accretion flow feeding DR 21 at a mass transfer rate of $1.1 \times 10^{-3}$~\textit{M$_\odot$} yr$^{-1}$. Based on the analysis of its kinematic temperature, intrinsic line-width and mass distribution, we conclude that DR21SF is in an overall trans-critical status, which indicates an early evolutionary stage.

preprint2020arXiv

Dual-stream Maximum Self-attention Multi-instance Learning

Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available. Training classifiers to accurately determine the bag label and instance labels is a challenging but critical task in many practical scenarios, such as computational histopathology. Recently, MIL models fully parameterized by neural networks have become popular due to their high flexibility and superior performance. Most of these models rely on attention mechanisms that assign attention scores across the instance embeddings in a bag and produce the bag embedding using an aggregation operator. In this paper, we propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks. The first stream deploys a simple MIL max-pooling, while the top-activated instance embedding is determined and used to obtain self-attention scores across instance embeddings in the second stream. Different from most of the previous methods, the proposed model jointly learns an instance classifier and a bag classifier based on the same instance embeddings. The experimental results show that our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
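The two streams described in this abstract can be sketched on a toy bag as follows. This is a minimal illustration, not the authors' implementation: it assumes a fixed linear instance scorer and uses raw dot products between embeddings as attention logits, whereas the actual model learns its scoring and attention functions with neural networks. The names `dual_stream_bag` and `instance_weights` are hypothetical.

```python
import math

def dual_stream_bag(embeddings, instance_weights):
    """Toy dual-stream MIL aggregation over one bag (no learned projections)."""
    # Stream 1: score every instance with a linear scorer, then max-pool.
    scores = [sum(w * x for w, x in zip(instance_weights, h)) for h in embeddings]
    top = max(range(len(scores)), key=scores.__getitem__)
    # Stream 2: softmax attention of each instance to the top-activated instance.
    sims = [sum(a * b for a, b in zip(h, embeddings[top])) for h in embeddings]
    shift = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in sims]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Bag embedding: attention-weighted sum of the instance embeddings.
    dim = len(embeddings[0])
    bag = [sum(a * h[d] for a, h in zip(attn, embeddings)) for d in range(dim)]
    return scores[top], attn, bag

# Made-up bag of three 2-dimensional instance embeddings.
bag_of_instances = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
max_score, attn, bag = dual_stream_bag(bag_of_instances, instance_weights=[1.0, 1.0])
```

Both streams operate on the same instance embeddings, which is the property the abstract highlights: the max-pooled score acts as the instance-level output, and the attention-weighted sum feeds a bag-level classifier.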

preprint2020arXiv

Estimation of Regional Economic Development Indicator from Transportation Network Analytics

With the booming economy in China, many studies have pointed out that the improvement of regional transportation infrastructure, among other factors, has had an important effect on economic growth. Utilizing a large-scale dataset which includes 3.5 billion entry and exit records of vehicles along highways generated from toll collection systems, we attempt to establish the relevance of mid-distance land transport patterns to regional economic status through transportation network analyses. We apply standard measurements of complex networks to analyze the highway transportation networks. A set of traffic flow features is computed and correlated to the regional economic development indicator. The multi-linear regression models explain about 89% to 96% of the variation of cities' GDP across three provinces in China. We then fit gravity models using annual traffic volumes of cars, buses, and freight trucks between pairs of cities for each province separately as well as for the whole dataset. We find temporal changes of distance-decay effects on spatial interactions between cities in transportation networks, which link to the economic development patterns of each province. We conclude that transportation big data reveal the status of regional economic development and contain valuable information on human mobility, production linkages, and logistics for regional management and planning. Our research offers insights into the investigation of regional economic development status using highway transportation big data.
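To make the gravity-model step concrete, the sketch below fits a distance-decay exponent by ordinary least squares in log space. The functional form T = k * M_i * M_j / d^beta is the textbook gravity model and is an assumption here, since the abstract does not state the exact specification; the city "masses", distances and flows are synthetic, and `fit_distance_decay` is a hypothetical name.

```python
import math

def fit_distance_decay(flows, masses_i, masses_j, distances):
    """Fit log(T / (Mi * Mj)) = log(k) - beta * log(d) by ordinary least squares."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(t / (mi * mj)) for t, mi, mj in zip(flows, masses_i, masses_j)]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope  # (k, beta)

# Synthetic city pairs generated with k = 0.5 and beta = 2 (made-up numbers).
masses_i = [10.0, 20.0, 5.0, 8.0]
masses_j = [4.0, 6.0, 9.0, 3.0]
distances = [50.0, 120.0, 80.0, 200.0]
flows = [0.5 * mi * mj / d ** 2.0 for mi, mj, d in zip(masses_i, masses_j, distances)]
k, beta = fit_distance_decay(flows, masses_i, masses_j, distances)
```

A larger fitted `beta` means traffic falls off faster with distance; repeating the fit per year would be one way to track the temporal changes in distance decay that the abstract mentions.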

preprint2020arXiv

Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset

Robotic grasping plays an important role in the field of robotics. Current state-of-the-art robotic grasping detection systems are usually built on conventional vision sensors, such as RGB-D cameras. Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community. Currently, event-based datasets are limited because annotating the asynchronous event stream is troublesome. Annotating a large-scale vision dataset often takes substantial computational resources, especially for video-level annotation. In this work, we consider the problem of detecting robotic grasps in a moving camera view of a scene containing objects. To obtain more agile robotic perception, a neuromorphic vision sensor (DAVIS) attached to the robot gripper is introduced to explore its potential use in grasping detection. We construct a robotic grasping dataset named the Event-Stream Dataset with 91 objects. A spatio-temporal mixed particle filter (SMP Filter) is proposed to track the LED-based grasp rectangles, which enables video-level annotation of a single grasp rectangle per object. As the LEDs blink at high frequency, the Event-Stream dataset is annotated at a high frequency of 1 kHz. Based on the Event-Stream dataset, we develop a deep neural network for grasping detection which treats the angle learning problem as classification instead of regression. The method achieves high detection accuracy on our Event-Stream dataset, with 93% precision at the object-wise level. This work provides a large-scale and well-annotated dataset, and promotes neuromorphic vision applications in agile robots.

preprint2020arXiv

How Molecular Chiralities of Bis(mandelato)borate Anions affect Their Binding Structures with Alkali Metal Ions and Microstructural Properties in Tetraalkylphosphonium Ionic Liquids

Spiroborate anion-based inorganic electrolytes and ionic liquids (ILs) have fascinating electrochemical and tribological properties, and have received widespread attention in industrial applications. Molecular chiralities of spiroborate anions have a significant effect on microstructures and macroscopic functionalities of these ionic materials in applications, and thus deserve a fundamental understanding. In the current work, we performed quantum chemistry calculations to address binding strength and coordination structures of chiral bis(mandelato)borate ([BMB]) anions with representative alkali metal ions, as well as electronic properties of alkali metal ion-[BMB] ion pair complexes. The optimized [BMB] conformers are categorized into V-shaped, bent, and twisted structures with varied electrostatic potential contours, conformational energies, and distinct alkali metal ion-[BMB] binding structures. Alkali metal ions have additional associations with phenyl groups in V-shaped [BMB] conformers owing to preferential cation-$\pi$ interactions. Furthermore, effects of molecular chiralities of [BMB] anions on thermodynamics and microstructural properties of tetraalkylphosphonium [BMB] ILs were studied by performing extensive atomistic simulations. Oxygen atoms in [BMB] anions have competitive hydrogen bonding interactions with hydrogen atoms in cations depending on molecular chiralities and steric hindrance effects of [BMB] anions. However, molecular chiralities of [BMB] anions have a negligible effect on liquid densities of tetraalkylphosphonium [BMB] ILs and spatial distributions of boron atoms in anions around phosphorus atoms in cations. Enlarging tetraalkylphosphonium cation sizes leads to enhanced cation-anion hydrogen bonding and Coulombic interactions due to enhanced segregation of polar groups in apolar networks in heterogeneous IL matrices.

preprint2020arXiv

Identification of Deep Network Generated Images Using Disparities in Color Components

With powerful deep network architectures, such as generative adversarial networks, one can easily generate photorealistic images. Although the generated images are not intended to fool humans or deceive biometric authentication systems, research communities and public media have shown great concern about the security issues caused by these images. This paper addresses the problem of identifying deep network generated (DNG) images. Taking the differences between camera imaging and DNG image generation into consideration, we analyze the disparities between DNG images and real images in different color components. We observe that the DNG images are more distinguishable from real ones in the chrominance components, especially in the residual domain. Based on these observations, we propose a feature set to capture color image statistics for identifying DNG images. Additionally, we evaluate several detection situations, including cases where the training and testing data are matched or mismatched in image sources or generative models, and detection with only real images. Extensive experimental results show that the proposed method can accurately identify DNG images and outperforms existing methods when the training and testing data are mismatched. Moreover, when the GAN model is unknown, our method also achieves good performance with one-class classification by using only real images for training.

preprint2020arXiv

Incentive-Compatible Diffusion Auctions

The diffusion auction is a new model in auction design. It can incentivize the buyers who have already joined the auction to further diffuse the sale information to others via social relations, whereby both the seller's revenue and the social welfare can be improved. Diffusion auctions are essentially non-typical multidimensional mechanism design problems, and agents' social relations are intricately entangled with their bids. In such auctions, incentive-compatibility (IC) means it is best for every agent to honestly report her valuation and fully diffuse the sale information to all her neighbors. Existing work has identified some specific mechanisms for diffusion auctions, while a general theory characterizing all incentive-compatible diffusion auctions is still missing. In this work, we identify a sufficient and necessary condition for all dominant-strategy incentive-compatible (DSIC) diffusion auctions. We formulate the monotonic allocation policies in such multidimensional problems and show that any monotonic allocation policy can be implemented in a DSIC diffusion auction mechanism. Moreover, given any monotonic allocation policy, we obtain the optimal payment policy to maximize the seller's revenue.

preprint2020arXiv

Interpreting the Latent Space of GANs via Correlation Analysis for Controllable Concept Manipulation

Although generative adversarial nets (GANs) have been successfully applied in many fields such as image generation, inpainting, super-resolution and drug discovery, the inner workings of GANs are still far from being understood. To gain deeper insight into the intrinsic mechanism of GANs, this paper proposes a method for interpreting the latent space of GANs by analyzing the correlation between latent variables and the corresponding semantic contents in generated images. Unlike previous methods that focus on dissecting models via feature visualization, the emphasis of this work is put on the variables in latent space, i.e. how the latent variables affect the quantitative analysis of generated results. Given a pretrained GAN model with fixed weights, the latent variables are intervened on to analyze their effect on the semantic content in generated images. A set of controlling latent variables can be derived for specific content generation, and controllable semantic content manipulation can be achieved. The proposed method is tested on the Fashion-MNIST and UT Zappos50K datasets, and the experimental results show its effectiveness.

preprint2020arXiv

Multitask Non-Autoregressive Model for Human Motion Prediction

Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem. Therefore, extensive effort has gone into exploring different RNN-based encoder-decoder architectures. However, by generating target poses conditioned on the previously generated ones, these models are prone to issues such as error accumulation. In this paper, we argue that this issue is mainly caused by adopting an autoregressive decoding manner. Hence, a novel Non-auToregressive Model (NAT) is proposed with a complete non-autoregressive decoding scheme, as well as a context encoder and a positional encoding module. More specifically, the context encoder embeds the given poses from temporal and spatial perspectives. The frame decoder is responsible for predicting each future pose independently. The positional encoding module injects a positional signal into the model to indicate temporal order. Moreover, a multitask training paradigm is presented for both low-level human skeleton prediction and high-level human action recognition, resulting in a convincing improvement on the prediction task. Our approach is evaluated on the Human3.6M and CMU-Mocap benchmarks and outperforms state-of-the-art autoregressive methods.

preprint2020arXiv

Online Binary Space Partitioning Forests

The Binary Space Partitioning-Tree (BSP-Tree) process was recently proposed as an efficient strategy for space partitioning tasks. Because it uses more than one dimension to partition the space, the BSP-Tree process is more efficient and flexible than conventional axis-aligned cutting strategies. However, due to its batch learning setting, it is not well suited to large-scale classification and regression problems. In this paper, we develop an online BSP-Forest framework to address this limitation. With the arrival of new data, the resulting online algorithm can simultaneously expand the space coverage and refine the partition structure, with guaranteed universal consistency for both classification and regression problems. The effectiveness and competitive performance of the online BSP-Forest are verified via simulations on real-world datasets.

preprint2020arXiv

Outlier Detection Ensemble with Embedded Feature Selection

Feature selection plays an important role in improving the performance of outlier detection, especially for noisy data. Existing methods usually perform feature selection and outlier scoring separately, which may select feature subsets that do not optimally serve outlier detection, leading to unsatisfactory performance. In this paper, we propose an outlier detection ensemble framework with embedded feature selection (ODEFS) to address this issue. Specifically, for each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation to learn feature subsets that are tailored to the outlier detection method. Moreover, we adopt thresholded self-paced learning to simultaneously optimize feature selection and example selection, which helps to improve the reliability of the training set. After that, we design an alternating algorithm with proven convergence to solve the resultant optimization problem. In addition, we analyze the generalization error bound of the proposed framework, which provides a theoretical guarantee for the method and insightful practical guidance. Comprehensive experimental results on 12 real-world datasets from diverse domains validate the superiority of the proposed ODEFS.

preprint2020arXiv

Pressure Engineering of the Dirac Fermions in Quasi-One-Dimensional Tl$_2$Mo$_6$Se$_6$

Topological band dispersions other than the standard Dirac or Weyl fermions have garnered increasing interest in materials science. Among them, cubic Dirac fermions were recently proposed in the family of quasi-one-dimensional conductors A$_2$Mo$_6$X$_6$ (A= Na, K, In, Tl; X= S, Se, Te), where the band crossing is characterized by a linear dispersion in one $k$-space direction but a cubic dispersion in the plane perpendicular to it. It is not yet clear, however, how external perturbations can alter these nontrivial carriers and ultimately induce a new distinct quantum phase. Here we study the evolution of Dirac fermions, in particular the cubic Dirac crossing, under external pressure in the representative quasi-one-dimensional Tl$_2$Mo$_6$Se$_6$ via first-principles calculations. Specifically, we find that the topological properties, including the bulk Dirac crossings and the topological surface states, change progressively under pressure up to 50 GPa, where the system undergoes a structural transition from the hexagonal phase to a body-centered tetragonal phase. Above 50 GPa, the system is more likely to be topologically trivial. Further, we investigate its phonon spectra, which reveal a gradual depletion of the negative phonon modes with pressure, consistent with the more three-dimensional Fermi surface in the high-pressure phase. Our work may provide a useful guideline for further experimental searches and for the band engineering of topologically nontrivial fermions in this intriguing state of matter.

preprint2020arXiv

Question Guided Modular Routing Networks for Visual Question Answering

This paper studies the task of Visual Question Answering (VQA), which has recently become topical in the multimedia community. In particular, we explore two critical research problems in VQA: (1) efficiently fusing the visual and textual modalities; (2) enabling the visual reasoning ability of VQA models in answering complex questions. To address these challenging problems, we propose novel Question Guided Modular Routing Networks (QGMRN). The QGMRN is composed of visual, textual and routing networks. The visual and textual networks serve as the backbones for the generic feature extractors of the visual and textual modalities. QGMRN can fuse the visual and textual modalities at multiple semantic levels. Visual reasoning is facilitated by the routing network in a discrete and stochastic way, using the Gumbel-Softmax trick for module selection. When the input reaches a certain modular layer, the routing network newly proposed in this paper dynamically selects a portion of the modules from that layer to process the input, depending on the question features generated by the textual network. It can also learn to reason by routing between the generic modules without additional supervision information or expert knowledge. Benefiting from the dynamic routing mechanism, QGMRN outperforms previous classical VQA methods by a large margin and achieves competitive results against state-of-the-art methods. Furthermore, an attention mechanism is integrated into our QGMRN model, which further boosts model performance. Extensive experiments on the CLEVR and CLEVR-Humans datasets validate the effectiveness of our proposed model, and state-of-the-art performance is achieved.
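The Gumbel-Softmax trick mentioned in this abstract can be sketched in a few lines. This is a generic illustration of the sampling step only, not the QGMRN routing network: the logits would in practice come from question features, the example values below are made up, and this sketch picks a single module whereas the abstract describes selecting a portion of the modules in a layer. The function name `gumbel_softmax` is a local helper, not a library call.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw a relaxed one-hot sample over modules via the Gumbel-Softmax trick."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
        # clamp U away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    shift = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical routing logits for three candidate modules.
rng = random.Random(0)
probs = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
chosen = probs.index(max(probs))  # index of the module picked for this sample
```

Lower temperatures push `probs` toward a one-hot vector, which makes the selection closer to discrete while keeping the sampling step differentiable in an autodiff framework.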

preprint2020arXiv

Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data Modelling

The Dirichlet Belief Network (DirBN) has been recently proposed as a promising approach for learning interpretable deep latent representations for objects. In this work, we leverage its interpretable modelling architecture and propose a deep dynamic probabilistic framework -- the Recurrent Dirichlet Belief Network (Recurrent-DBN) -- to study interpretable hidden structures from dynamic relational data. The proposed Recurrent-DBN has the following merits: (1) it infers interpretable and organised hierarchical latent structures for objects within and across time steps; (2) it enables recurrent long-term temporal dependence modelling, which outperforms the first-order Markov descriptions in most dynamic probabilistic frameworks. In addition, we develop a new inference strategy, which first upward-and-backward propagates latent counts and then downward-and-forward samples variables, to enable efficient Gibbs sampling for the Recurrent-DBN. We apply the Recurrent-DBN to dynamic relational data problems. Extensive experimental results on real-world data validate the advantages of the Recurrent-DBN over the state-of-the-art models in interpretable latent structure discovery and improved link prediction performance.

preprint2020arXiv

Smoothing Graphons for Modelling Exchangeable Relational Data

Modelling exchangeable relational data can be described by \textit{graphon theory}. Most Bayesian methods for modelling exchangeable relational data can be attributed to this framework by exploiting different forms of graphons. However, the graphons adopted by existing Bayesian methods are either piecewise-constant functions, which are insufficiently flexible for accurate modelling of the relational data, or are complicated continuous functions, which incur heavy computational costs for inference. In this work, we introduce a smoothing procedure to piecewise-constant graphons to form {\em smoothing graphons}, which permit continuous intensity values for describing relations, but without impractically increasing computational costs. In particular, we focus on the Bayesian Stochastic Block Model (SBM) and demonstrate how to adapt the piecewise-constant SBM graphon to the smoothed version. We initially propose the Integrated Smoothing Graphon (ISG) which introduces one smoothing parameter to the SBM graphon to generate continuous relational intensity values. We then develop the Latent Feature Smoothing Graphon (LFSG), which improves on the ISG by introducing auxiliary hidden labels to decompose the calculation of the ISG intensity and enable efficient inference. Experimental results on real-world data sets validate the advantages of applying smoothing strategies to the Stochastic Block Model, demonstrating that smoothing graphons can greatly improve AUC and precision for link prediction without increasing computational complexity.

preprint2020arXiv

The ENUF Method -- Ewald Summation based on Non-Uniform Fast Fourier Transform: Implementation, Parallelization, and Application

Computer simulations of model systems are widely used to explore striking phenomena in promising applications spanning physics, chemistry, biology, materials science and engineering. The long-range electrostatic interactions between charged particles constitute a prominent factor in determining structures and states of model systems. Efficiently calculating electrostatic interactions in model systems subjected to partial or full periodic boundary conditions has been a grand challenge. In the past decades, a large variety of computational schemes have been proposed, among which the Ewald summation method is the most reliable route to accurately deal with electrostatic interactions in model systems. In addition, extensive effort has been devoted to improving the computational efficiency of Ewald summation based methods. Representative examples are approaches based on cutoffs, reaction fields, multi-poles, multi-grids, and particle-mesh schemes. We sketch the ENUF method, an abbreviation for the Ewald summation method based on the Non-Uniform fast Fourier transform technique, and have implemented this method in particle-based simulation packages to calculate electrostatic energies and forces at micro- and mesoscopic levels. Extensive computational studies of conformational properties of polyelectrolytes, dendrimer-membrane complexes, and ionic fluids demonstrated that the ENUF method and its derivatives conserve both energy and momentum to floating point accuracy, and exhibit a computational complexity of $\mathcal{O}(N\log N)$ with optimal physical parameters. These ENUF based methods are attractive alternatives in molecular simulations where high accuracy and efficiency of simulation methods are needed to accelerate calculations of electrostatic interactions at extended spatiotemporal scales.

preprint2020arXiv

Topological Dirac states in a layered telluride TaPdTe$_5$ with quasi-one-dimensional PdTe$_2$ chains

We report the synthesis and systematic studies of a new layered ternary telluride TaPdTe$_5$ with quasi-one-dimensional PdTe$_2$ chains. This compound crystallizes in a layered orthorhombic structure with space group Cmcm. Analysis of its curved field-dependent Hall resistivity, using the two-band model, indicates hole-dominated transport with a high mobility $\mu_h$ = 2.38 $\times$ 10$^3$ cm$^2$ V$^{-1}$ s$^{-1}$ at low temperatures. The in-plane magnetoresistance (MR) displays significant anisotropy with field applied along the crystallographic $b$ axis. The MR with the current applied along the $c$-axis is also measured in high magnetic fields up to 51.7 T. Remarkably, it follows a power-law dependence and reaches (9.5 $\times$ 10$^3$)% at 2.1 K without any signature of saturation. The De Haas-van Alphen oscillations show a small Fermi-surface pocket with a nontrivial Berry phase. The Shubnikov-de Haas (SdH) oscillations are detected at low temperatures and under magnetic fields above 28.5 T. Two effective masses $m^*$ (0.26$m_e$ and 0.41$m_e$) are extracted from the oscillatory SdH data. Our first-principles calculations unveil a topological Dirac cone in its surface states, and, in particular, the topological index indicates that TaPdTe$_5$ is a topologically nontrivial material.

preprint2020arXiv

VL-BERT: Pre-training of Generic Visual-Linguistic Representations

We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short). VL-BERT adopts the simple yet powerful Transformer model as the backbone, and extends it to take both visual and linguistic embedded features as input. In it, each element of the input is either a word from the input sentence or a region-of-interest (RoI) from the input image. It is designed to fit most visual-linguistic downstream tasks. To better exploit the generic representation, we pre-train VL-BERT on the massive-scale Conceptual Captions dataset, together with a text-only corpus. Extensive empirical analysis demonstrates that the pre-training procedure can better align the visual-linguistic clues and benefit the downstream tasks, such as visual commonsense reasoning, visual question answering and referring expression comprehension. It is worth noting that VL-BERT achieved first place among single models on the leaderboard of the VCR benchmark. Code is released at \url{https://github.com/jackroos/VL-BERT}.

preprint2020arXiv

W-net: Simultaneous segmentation of multi-anatomical retinal structures using a multi-task deep neural network

Segmentation of multiple anatomical structures is of great importance in medical image analysis. In this study, we proposed a $\mathcal{W}$-net to simultaneously segment both the optic disc (OD) and the exudates in retinal images based on the multi-task learning (MTL) scheme. We introduced a class-balanced loss and a multi-task weighted loss to alleviate the imbalanced problem and to improve the robustness and generalization property of the $\mathcal{W}$-net. We demonstrated the effectiveness of our approach by applying five-fold cross-validation experiments on two public datasets e\_ophtha\_EX and DiaRetDb1. We achieved F1-score of 94.76\% and 95.73\% for OD segmentation, and 92.80\% and 94.14\% for exudates segmentation. To further prove the generalization property of the proposed method, we applied the trained model on the DRIONS-DB dataset for OD segmentation and on the MESSIDOR dataset for exudate segmentation. Our results demonstrated that by choosing the optimal weights of each task, the MTL based $\mathcal{W}$-net outperformed separate models trained individually on each task. Code and pre-trained models will be available at: \url{https://github.com/FundusResearch/MTL_for_OD_and_exudates.git}.

preprint2019arXiv

Bulk Fermi surface of the layered superconductor TaSe3 with three-dimensional strong topological insulator state

High magnetic field transport measurements and ab initio calculations on the layered superconductor TaSe3 have provided compelling evidence for the existence of a three-dimensional strong topological insulator state. Longitudinal magnetotransport measurements up to ~ 33 T unveiled striking Shubnikov-de Haas oscillations with two fundamental frequencies at 100 T and 175 T, corresponding respectively to a nontrivial electron Fermi pocket at the B point and a nontrivial hole Fermi pocket at the Γ point in the Brillouin zone. However, calculations revealed one more electron pocket at the B point, which was not detected by the magnetotransport measurements, presumably due to the limited carrier momentum relaxation time. Angle-dependent quantum oscillations, measured by rotating the sample with respect to the magnetic field, revealed clear changes in the two fundamental frequencies, indicating anisotropic electronic Fermi pockets. The ab initio calculations gave the topological Z2 invariants of (1; 100) and revealed a single Dirac cone on the (1 0 -1) surface at the X point with helical spin texture at a constant-energy contour, suggesting a strong topological insulator state. The results demonstrate that TaSe3 is an excellent platform to study the interplay between topological phase and superconductivity and a promising system for the exploration of topological superconductivity.
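As a worked illustration of how the two quoted oscillation frequencies relate to Fermi-pocket size, the sketch below applies the standard Onsager relation, A = 2πeF/ħ, which links a quantum-oscillation frequency F to the extremal Fermi-surface cross-section A. The circular-pocket assumption used for the wavevector and the function names are illustrative choices, not taken from the abstract.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
HBAR = 1.054571817e-34      # reduced Planck constant in joule seconds


def extremal_area(freq_tesla):
    """Onsager relation: extremal cross-section in m^-2 for frequency F."""
    return 2.0 * math.pi * E_CHARGE * freq_tesla / HBAR


def fermi_wavevector(freq_tesla):
    """Fermi wavevector in m^-1, assuming a circular cross-section."""
    return math.sqrt(extremal_area(freq_tesla) / math.pi)


if __name__ == "__main__":
    # The two fundamental frequencies quoted in the abstract.
    for freq in (100.0, 175.0):
        area_per_sq_angstrom = extremal_area(freq) * 1e-20
        k_f_per_angstrom = fermi_wavevector(freq) * 1e-10
        print(f"F = {freq:.0f} T: A = {area_per_sq_angstrom:.4e} A^-2, "
              f"k_F = {k_f_per_angstrom:.4f} A^-1")
```

Because the area scales linearly with frequency, the 175 T pocket has 1.75 times the cross-section of the 100 T pocket for the same field orientation.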

preprint2019arXiv

High Dynamic Range Externally Time-gated Photon Counting Optical Time-domain Reflectometry

A single photon detector (SPD) has a maximum count rate due to its dead time, so the dynamic range of photon counting optical time-domain reflectometry (PC-OTDR) decreases with the length of the monitored fiber. To further improve the dynamic range of PC-OTDR, we propose and demonstrate an externally time-gated scheme. The externally time-gated scheme is realized by using a high-speed optical switch, i.e. a Mach-Zehnder interferometer, to modulate the back-propagating optical signal so that only a certain segment of the fiber is monitored by the SPD. The feasibility of the proposed scheme is first examined with theoretical analysis and simulation; then we experimentally demonstrate it with our experimental PC-OTDR testbed operating in the 800 nm wavelength band. In our studies, a dynamic range of 30.0 dB is achieved in a 70-meter-long PC-OTDR system with 50 ns external gates, corresponding to an improvement of 11.0 dB in dynamic range compared with operation without gating. Furthermore, with the improved dynamic range, a 0.37 dB loss event is successfully identified with 30 seconds of accumulation, which could not be identified without gating. Our scheme paves the way for developing PC-OTDR systems with high dynamic range.

preprint2018arXiv

Critical behavior of order parameter at the nonequilibrium phase transition of the Ising model

After a quench of transverse field, the asymptotic long-time state of Ising model displays a transition from a ferromagnetic phase to a paramagnetic phase as the post-quench field strength increases, which is revealed by the vanishing of the order parameter defined as the averaged magnetization over time. We estimate the critical behavior of the magnetization at this nonequilibrium phase transition by using mean-field approximation. In the vicinity of the critical field, the magnetization vanishes as the inverse of a logarithmic function, which is significantly distinguished from the critical behavior of order parameter at the corresponding equilibrium phase transition, i.e. a power-law function.

preprint2018arXiv

Predicting Lung Nodule Malignancies by Combining Deep Convolutional Neural Network and Handcrafted Features

To predict lung nodule malignancy with high sensitivity and specificity, we propose a fusion algorithm that combines handcrafted features (HF) with the features learned at the output layer of a 3D deep convolutional neural network (CNN). First, we extracted twenty-nine handcrafted features, including nine intensity features, eight geometric features, and twelve texture features based on the grey-level co-occurrence matrix (GLCM) averaged from thirteen directions. We then trained 3D CNNs modified from three state-of-the-art 2D CNN architectures (AlexNet, VGG-16 Net and Multi-crop Net) to extract the CNN features learned at the output layer. For each 3D CNN, the CNN features combined with the 29 handcrafted features were used as the input for a support vector machine (SVM) coupled with the sequential forward feature selection (SFS) method to select the optimal feature subset and construct the classifiers. The fusion algorithm takes full advantage of the handcrafted features and the highest-level CNN features learned at the output layer. By incorporating the intrinsic CNN features, it can overcome the disadvantage that handcrafted features may not fully reflect the unique characteristics of a particular lesion. Meanwhile, because the handcrafted features are complementary, it also alleviates the CNNs' requirement for a large-scale annotated dataset. The patient cohort includes 431 malignant nodules and 795 benign nodules extracted from the LIDC/IDRI database. For each investigated CNN architecture, the proposed fusion algorithm achieved the highest AUC, accuracy, sensitivity, and specificity scores among all competitive classification models.

preprint2016arXiv

Chromatic Effect for THz Generation in a Novel Wave-front Tilt Scheme

Generating single- or few-cycle terahertz (THz) pulses with an intense femtosecond laser through cascaded optical rectification in electro-optic crystals is a crucial technique in cutting-edge time-resolved spectroscopy for characterizing micro-scale structures and ultrafast dynamics. In the past decade, the lithium niobate (LN) crystal implementation of the wave-front tilt scheme has been widely used, and painstaking efforts have been invested in achieving higher THz conversion efficiency. In this research we developed a new type of LN crystal with a dual-face cut and Brewster coupling, and conducted systematic experimental and simulation investigations to optimize the multi-dimensionally entangled parameters in THz generation, predicting that an extreme conversion efficiency of 10% is potentially achievable at a THz absorption coefficient of 0.5 cm$^{-1}$. More remarkably, we discovered for the first time that the chirp of the driving laser pulse plays a decisive role in the wave-front tilt scheme, and that the THz generation efficiency could be enhanced tremendously by applying an appropriate chirp.

preprint2016arXiv

Efficient Multiple Line-Based Intra Prediction for HEVC

Traditional intra prediction usually utilizes the nearest reference line to generate the predicted block, given the strong spatial correlation. However, this kind of single line-based method does not always work well due to at least two issues. One is the incoherence caused by signal noise or the texture of another object, where this texture deviates from the inherent texture of the current block. The other is that the nearest reference line usually has worse reconstruction quality in block-based video coding. To address these two issues, this paper proposes an efficient multiple line-based intra prediction scheme to improve coding efficiency. Besides the nearest reference line, further reference lines are also utilized. The further reference lines, with relatively higher quality, can provide potentially better prediction. At the same time, residue compensation is introduced to calibrate the prediction of boundary regions in a block when further reference lines are utilized. To speed up the encoding process, this paper designs several fast algorithms. Experimental results show that, compared with HM-16.9, the proposed fast search method achieves 2.0% bit saving on average and up to 3.7%, while increasing the encoding time by 112%.

preprint2016arXiv

Hierarchy, dimension, attractor and self-organization -- dynamics of mode-locked fiber lasers

Mode-locked fiber lasers are one of the most important sources of ultra-short pulses. However, a unified description of the rich variety of states and of the driving forces behind the complex and diverse nonlinear behavior of mode-locked fiber lasers has yet to be developed. Here we present a comprehensive theoretical framework based upon complexity science, thereby offering a fundamentally new way of thinking about the behavior of mode-locked fiber lasers. This hierarchically structured framework provides a model with variable dimensionality, resulting in a simple and elegant view with which numerous complex states can be described systematically. The existence of a set of new mode-locked fiber laser states is proposed for the first time. Moreover, research into the attractors' basins reveals the origin of stochasticity, hysteresis and multistability in these systems. These findings pave the way for dynamics analysis and new system designs of mode-locked fiber lasers. The paradigm will have a wide range of potential applications in diverse research fields.

preprint2016arXiv

Induced robust topological order on an ordinary insulator hetero-structured with a strong topological insulator

Topological states of matter originate from distinct topological electronic structures of materials. In strong topological insulators (STIs), the topological surface (interface) state is a direct consequence of the electronic structure transition between materials belonging to different topological classes. Therefore, it is fundamentally interesting to ask whether such topological character can be manipulated. Besides tuning the crystal field and the strength of spin-orbit coupling (e.g., by external strain or chemical doping), there are currently few reports of topological states induced in ordinary insulators (OIs) by an OI/STI heterostructure. Here we report the observation of a Dirac cone topological surface state (TSS) induced on the Sb2Se3 layer up to 15 nm thick in the OI/STI heterostructure, in sharp contrast with the OI/OI heterostructure where no sign of a TSS can be observed. This is evidence for a topological state induced in an OI by the heterostructure.

preprint2016arXiv

Insulator-metal transition in deep Sr-vacant spin-orbit Mott insulator Sr2IrO4

Sr2IrO4 exhibits a novel insulating state assisted by spin-orbit interactions. A series of polycrystalline samples of Sr2-xIrO4 have been synthesized. It is found that deep Sr-vacancies in Sr2-xIrO4 greatly reduce the rotation of the IrO6 octahedra, and more importantly, a significant structural change occurs around x = 0.48 in both the lattice constants and the Ir-O2 bond length. An insulator-metal transition (IMT) appears and a non-Fermi-liquid metallic electronic state is demonstrated at x>0.48 in Sr2-xIrO4. Furthermore, a sudden drop of the localization temperature T0 and the antiferromagnetic (AFM) transition temperature TN emerges in Sr1.5IrO4, together with a sign reversal of the Curie-Weiss temperature. These abrupt changes are closely related to the reduced octahedral rotation in the crystal structure.

preprint2016arXiv

Low-Delay Distributed Source Coding for Time-Varying Sources with Unknown Statistics

We consider a system in which two nodes take correlated measurements of a random source with time-varying and unknown statistics. The observations of the source at the first node are to be losslessly replicated with a given probability of outage at the second node, which receives data from the first node over a constant-rate errorless channel. We develop a system and associated strategies for joint distributed source coding (encoding and decoding) and transmission control in order to achieve low end-to-end delay. Slepian-Wolf coding in its traditional form cannot be applied in our scenario, since the encoder requires the joint statistics of the observations and the associated decoding delay is very high. We analytically evaluate the performance of our strategies and show that the delay they achieve is order optimal as the conditional entropy of the source approaches the channel rate. We also evaluate the performance of our algorithms based on real-world experiments using two cameras recording videos of a scene at different angles. Having realized our schemes, we demonstrated that, even with a very low-complexity quantizer, a compression ratio of approximately 50% is achievable for lossless replication at the decoder, at an average delay of a few seconds.

preprint2016arXiv

On magnitude, asymptotics and duration of drawdowns for Lévy models

This paper considers magnitude, asymptotics and duration of drawdowns for some Lévy processes. First, we revisit some existing results on the magnitude of drawdowns for spectrally negative Lévy processes using an approximation approach. For any spectrally negative Lévy process whose scale functions are well-behaved at $0+$, we then study the asymptotics of drawdown quantities when the threshold of drawdown magnitude approaches zero. We also show that such asymptotics is robust to perturbations of additional positive compound Poisson jumps. Finally, thanks to the asymptotic results and some recent works on the running maximum of Lévy processes, we derive the law of duration of drawdowns for a large class of Lévy processes (with a general spectrally negative part plus a positive compound Poisson structure). The duration of drawdowns is also known as the "Time to Recover" (TTR) the historical maximum, which is a widely used performance measure in the fund management industry. We find that the law of duration of drawdowns qualitatively depends on the path type of the spectrally negative component of the underlying Lévy process.
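The two drawdown quantities named in this abstract, magnitude and duration, can be made concrete on a discrete sample path. The sketch below is only an illustration under that discrete-time assumption; the paper itself works with continuous-time Lévy processes, and the function name is hypothetical.

```python
def drawdown_stats(path):
    """Return (largest drawdown magnitude, longest drawdown duration in steps).

    Magnitude at step i is the running maximum minus the current value.
    Duration at step i is the number of steps since the running maximum
    was last attained, i.e. the time spent below the historical maximum.
    """
    running_max = path[0]
    last_max_index = 0
    max_magnitude = 0.0
    longest_duration = 0
    for i, value in enumerate(path):
        if value >= running_max:
            # A new (or equal) historical maximum ends the current drawdown.
            running_max = value
            last_max_index = i
        max_magnitude = max(max_magnitude, running_max - value)
        longest_duration = max(longest_duration, i - last_max_index)
    return max_magnitude, longest_duration


if __name__ == "__main__":
    sample_path = [0.0, 1.0, 0.5, 0.2, 1.2, 1.0]
    magnitude, duration = drawdown_stats(sample_path)
    print(magnitude, duration)
```

In this sample path the process falls from 1.0 to 0.2 before recovering, so the deepest drawdown and the longest time below the historical maximum both come from that episode.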

preprint2016arXiv

Origin of the superconductivity of WTe2 under pressure

Tungsten ditelluride (WTe2) has attracted significant attention due to its interesting electronic properties, such as unsaturated magnetoresistance and superconductivity. Recently, it has been proposed to be a new type of Weyl semimetal, which distinguishes it from other transition metal dichalcogenides (TMDs) from a topological perspective. Here, we study the structure of WTe2 under pressure with crystal structure prediction and ab initio calculations combined with high-pressure synchrotron X-ray diffraction and Raman spectroscopy measurements. We find that the ambient orthorhombic structure (Td) transforms into a monoclinic structure (1T') at around 4-5 GPa. As the transition pressure is very close to the critical point in recent high-pressure electrical transport measurements, the emergence of superconductivity in WTe2 under pressure is attributed to the Td-1T' structural phase transition, which is associated with a sliding mechanism of the TMD layers and results in a shorter Te-Te interlayer distance compared to the intralayer ones. These results highlight the critical role of the interlayer stacking and chalcogen interactions in the electronic and superconducting properties of multilayered TMDs under hydrostatic strain environments.

preprint2016arXiv

TMRT Observations of Carbon-chain molecules in Serpens South 1A

We report Shanghai Tian Ma Radio Telescope detections of several long carbon-chain molecules at C and Ku band, including HC3N, HC5N, HC7N, HC9N, C3S, C6H and C8H, toward the starless cloud Serpens South 1a. We detected some transitions (HC9N J=13-12 F=12-11 and F=14-13, H13CCCN J=2-1 F=1-0 and F=1-1, HC13CCN J=2-1 F=2-2, F=1-0 and F=1-1, HCC13CN J=2-1 F=1-0 and F=1-1) and resolved some hyperfine components (HC5N J=6-5 F=5-4, H13CCCN J=2-1 F=2-1) for the first time in the interstellar medium. The column densities of these carbon-chain molecules, in a range of 10^{12}-10^{13} cm^{-2}, are comparable to those of two carbon-chain molecule rich sources, TMC-1 and Lupus-1A. The abundance ratios are 1.00:(1.11\pm0.15):(1.47\pm0.18) for [H13CCCN]:[HC13CCN]:[HCC13CN]. This result implies that the 13C isotope is also concentrated in the carbon atom adjacent to the nitrogen atom in HC3N in Serpens South 1a, which is similar to TMC-1. The [HC3N]/[H13CCCN] ratio of 78\pm9, the [HC3N]/[HC13CCN] ratio of 70\pm8, and the [HC3N]/[HCC13CN] ratio of 53\pm4 are also comparable to those in TMC-1. In any case, Serpens South 1a provides a testing ground for understanding carbon-chain chemistry.

preprint2015arXiv

A Decision-Aided Parallel SC-List Decoder for Polar Codes

In this paper, we propose a decision-aided scheme for parallel SC-List decoding of polar codes. In the parallel SC-List decoder, each surviving path is extended based on multiple information bits; as a result, the number of split paths becomes very large and the sorting needed to find the top L paths becomes very complex. We propose a decision-aided scheme to reduce the number of split paths and thus reduce the sorting complexity.

preprint2015arXiv

Capacity-Achieving Rateless Polar Codes

A rateless coding scheme transmits incrementally more and more coded bits over an unknown channel until all the information bits are decoded reliably by the receiver. We propose a new rateless coding scheme based on polar codes, and we show that this scheme is capacity-achieving, i.e. its information rate is as good as the best code specifically designed for the unknown channel. Previous rateless coding schemes are designed for specific classes of channels such as AWGN channels, binary erasure channels, etc. but the proposed rateless coding scheme is capacity-achieving for broad classes of channels as long as they are ordered via degradation. Moreover, it inherits the conceptual and computational simplicity of polar codes.

preprint2015arXiv

Low-complexity Non-coherent Signal Detection for Nano-Scale Molecular Communications

Nano-scale molecular communication is a viable way of exchanging information between nano-machines. In this letter, a low-complexity and non-coherent signal detection technique is proposed to mitigate the inter-symbol-interference (ISI) and additive noise. In contrast to existing coherent detection methods of high complexity, the proposed non-coherent signal detector is more practical when the channel conditions are hard to acquire accurately or hidden from the receiver. The proposed scheme employs the concentration difference to detect the ISI corrupted signals and we demonstrate that it can suppress the ISI effectively. The concentration difference is a stable characteristic, irrespective of the diffusion channel conditions. In terms of complexity, by excluding matrix operations or likelihood calculations, the new detection scheme is particularly suitable for nano-scale molecular communication systems with a small energy budget or limited computation resource.

preprint2015arXiv

Low-latency List Decoding Of Polar Codes With Double Thresholding

For polar codes with short-to-medium code length, list successive cancellation decoding is used to achieve a good error-correcting performance. However, list pruning in the current list decoding is based on the sorting strategy and its timing complexity is high. This results in a long decoding latency for large list size. In this work, aiming at a low-latency list decoding implementation, a double thresholding algorithm is proposed for a fast list pruning. As a result, with a negligible performance degradation, the list pruning delay is greatly reduced. Based on the double thresholding, a low-latency list decoding architecture is proposed and implemented using a UMC 90nm CMOS technology. Synthesis results show that, even for a large list size of 16, the proposed low-latency architecture achieves a decoding throughput of 220 Mbps at a frequency of 641 MHz.

preprint2015arXiv

Molecular Communications with Longitudinal Carrier Waves: Baseband to Passband Modulation

Traditional molecular communications via diffusion (MCvD) systems have used baseband modulation techniques by varying properties of molecular pulses such as the amplitude, the frequency of the transversal wave of the pulse, and the time delay between subsequent pulses. In this letter, we propose and implement passband modulation with molecules that exhibit longitudinal carrier wave properties. This is achieved through the oscillation of the transmitter. Frequency division multiplexing is employed to allow different molecular information streams to co-exist in the same space and time channel, creating an effective bandwidth for MCvD.

preprint2015arXiv

Molecular Communications: Channel Model and Physical Layer Techniques

This article examines recent research in molecular communications from a telecommunications system design perspective. In particular, it focuses on channel models and state-of-the-art physical layer techniques. The goal is to provide a foundation for higher layer research and motivation for research and development of functional prototypes. In the first part of the article, we focus on the channel and noise model, comparing molecular and radio-wave pathloss formulae. In the second part, the article examines, equipped with the appropriate channel knowledge, the design of appropriate modulation and error correction coding schemes. The third reviews transmitter and receiver side signal processing methods that suppress inter-symbol-interference. Taken together, the three parts present a series of physical layer techniques that are necessary to producing reliable and practical molecular communications.

preprint2015arXiv

Multilayer C2N: Effect of Stacking Order and Number of Layers on Bandgap and Its Controlled Electronic Properties by External Electric Field

The successful synthesis of the nitrogenated holey two-dimensional structure C2N (Nat. Commun. 2015, 6, 1-7) using a simple wet-chemical reaction offers a cost-effective way to generate other 2D materials with novel optical and electronic properties. Using few-layer C2N as a model, we have performed an ab initio study of the electronic properties of layered C2N. The band gap of this system decreases monotonically as the number of layers increases, with a direct-gap to indirect-gap transition at bulk C2N. In addition, when an out-of-plane electric field is applied to few-layer C2N, the band gap of multilayer C2N decreases as the electric field increases, and a semiconductor-semimetal transition occurs for five-layer C2N under an appropriate electric field, whereas the band gap of monolayer C2N is unchanged under an electric field. Owing to its widely tunable band gap, layered C2N has tremendous opportunities for application in nanoscale electronic and optoelectronic devices.

preprint2015arXiv

Reduce the Complexity of List Decoding of Polar Codes by Tree-Pruning

Polar codes under cyclic redundancy check aided successive cancellation list (CA-SCL) decoding can outperform turbo codes and LDPC codes when code lengths are configured to be several kilobits. In order to reduce the decoding complexity, a novel tree-pruning scheme for the \mbox{SCL/CA-SCL} decoding algorithms is proposed in this paper. In each step of the decoding procedure, the candidate paths with metrics less than a threshold are dropped directly to avoid unnecessary computations for the path search on their descendant branches. Given a candidate path, an upper bound on the path metric of its descendants is proposed to determine whether pruning this candidate path would affect frame error rate (FER) performance. By utilizing this upper bounding technique and introducing a dynamic threshold, the proposed scheme deletes as many redundant candidate paths as possible while keeping the performance deterioration within a tolerable range, and is thus much more efficient than the existing pruning scheme. With only a negligible loss of FER performance, the computational complexity of the proposed pruned decoding scheme is only about $40\%$ of that of the standard algorithm in the low signal-to-noise ratio (SNR) region (where the FER under CA-SCL decoding is about $0.1 \sim 0.001$), and it can be very close to that of the successive cancellation (SC) decoder in the moderate and high SNR regions.
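The pruning step described in this abstract, dropping candidate paths whose metric falls below a threshold before keeping at most L survivors, can be sketched as follows. The abstract does not give the dynamic threshold or the upper bound on descendant metrics, so the margin-below-best rule, the data layout, and the function name here are assumptions for illustration only.

```python
def prune_paths(candidates, list_size, margin):
    """Threshold pruning followed by the usual top-L selection.

    candidates: list of (bits, metric) pairs, where a larger metric
                (for example a log-probability) means a more likely path.
    list_size:  maximum number of surviving paths (L).
    margin:     hypothetical rule, not from the paper: drop any path whose
                metric is more than `margin` below the best metric.
    """
    best_metric = max(metric for _, metric in candidates)
    threshold = best_metric - margin
    # Paths below the threshold are discarded before any sorting happens,
    # so their descendants are never explored.
    survivors = [(bits, metric) for bits, metric in candidates
                 if metric >= threshold]
    survivors.sort(key=lambda item: item[1], reverse=True)
    return survivors[:list_size]


if __name__ == "__main__":
    paths = [((0,), -0.1), ((1,), -5.0), ((0, 1), -0.7), ((1, 1), -2.0)]
    for bits, metric in prune_paths(paths, list_size=4, margin=3.0):
        print(bits, metric)
```

With a tight margin the list often shrinks well below L, which is where the complexity saving comes from; with a very large margin the function reduces to plain top-L selection.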

preprint2015arXiv

Single-pulse radio observations of the Galactic Center magnetar PSR J1745-2900

In this paper, we report radio observations of the Galactic Center magnetar PSR J1745-2900 at six epochs between June and October, 2014. These observations were carried out using the new Shanghai Tian Ma Radio Telescope at a frequency of 8.6 GHz. Both the flux density and integrated profile of PSR J1745-2900 show dramatic changes from epoch to epoch, showing that the pulsar was in its "erratic" phase. On MJD 56836, the flux density of this magnetar was about 8.7 mJy, which was ten times larger than that reported at the time of discovery, enabling a single-pulse analysis. The emission is dominated by narrow "spiky" pulses which follow a log-normal distribution in peak flux density. From 1913 pulses, we detected 53 pulses whose peak flux density is ten times greater than that of the integrated profile. They are concentrated in pulse phase at the peaks of the integrated profile. The pulse widths at the 50% level of these bright pulses were between 0.2 and 0.9 deg, much narrower than that of the integrated profile (~12 deg). The observed pulse widths may be limited by interstellar scattering. No clear correlation was found between the widths and peak flux density of these pulses and no evidence was found for subpulse drifting. Relatively strong spiky pulses are also detected in the other five epochs of observation, showing the same properties as those detected on MJD 56836. These strong spiky pulses cannot be classified as "giant" pulses but are more closely related to normal pulse emission.

preprint2015arXiv

Towards Data-Driven Hierarchical Surgical Skill Analysis

This paper evaluates the application of hierarchical skill analysis methods developed in aerospace to the problem of surgical skill assessment and modeling. The analysis employs tool motion data of Fundamental of Laparoscopic Skills (FLS) tasks collected from clinicians of various skill levels at three different clinical teaching hospitals in the United States. Outcomes are evaluated based on their ability to provide relevant information about the underlying processes across the entire system hierarchy, including control, guidance and planning.

preprint2014arXiv

A Quasi-Classical Mapping Approach to Vibrationally Coupled Electron Transport in Molecular Junctions

We develop a classical mapping approach suitable to describe vibrationally coupled charge transport in molecular junctions based on the Cartesian mapping for many-electron systems [J. Chem. Phys. 137, 154107 (2012)]. To properly describe vibrational quantum effects in the transport characteristics, we introduce a simple transformation rewriting the Hamiltonian in terms of occupation numbers and use a binning function to facilitate quantization. The approach provides accurate results for the nonequilibrium Holstein model for a range of bias voltages, vibrational frequencies and temperatures. It also captures the hallmarks of vibrational quantum effects apparent in step-like structure in the current-voltage characteristics at low temperatures as well as the phenomenon of Franck-Condon blockade.

preprint2014arXiv

A RM-Polar Codes

In this letter we propose a new hybrid code called the "RM-Polar" code. These new codes are constructed by combining the constructions of the Reed-Muller (RM) code and the Polar code. They have a much larger minimum Hamming distance than Polar codes, and therefore much better error performance than Polar codes.

preprint2014arXiv

Experimental demonstration of longitudinal beam phase space linearizer in a free-electron laser facility by corrugated structures

Removal of residual linear energy chirp and intrinsic nonlinear energy curvature in the relativistic electron beam from a radiofrequency linear accelerator is of paramount importance for efficient lasing of a high-gain free-electron laser. Recently, it was theoretically and experimentally demonstrated that the longitudinal wakefield excited by the electrons themselves in the corrugated structure allows for precise control of the electron beam phase space. In this Letter, we report the first utilization of a corrugated structure as a beam linearizer in the operation of a seeded free-electron laser driven by a 140 MeV linear accelerator, where a gain of ~10,000 over spontaneous emission was achieved at the second harmonic of the 1047 nm seed laser, and a free-electron laser bandwidth narrowing by about 50% was observed, in good agreement with the theoretical expectations.

preprint2014arXiv

First-principles Study of the Interactions of Electron Donor and Acceptor Molecules with Phosphorene

Density functional theory calculations have been carried out to investigate single-layer phosphorene functionalized with two kinds of organic molecules, i.e. the electrophilic molecule tetracyano-p-quinodimethane (TCNQ) as electron acceptor and the nucleophilic molecule tetrathiafulvalene (TTF) as electron donor. The TCNQ molecule introduces shallow acceptor states in the gap of phosphorene close to the valence band edge (VBE), which makes the doped system a p-type semiconductor. However, when the TTF molecule is adsorbed on the phosphorene, the occupied molecular states introduced into the gap are deep donor states, so that effective n-doping for transport cannot be realized. This disadvantageous situation can be amended by applying an external electric field perpendicular to the phosphorene surface, directed from the phosphorene to the TTF molecule, under which the TTF-introduced donor states move closer to the conduction band edge (CBE) of the phosphorene and the TTF-doped phosphorene system becomes an n-type semiconductor. The effective bipolar doping of single-layer phosphorene via molecular adsorption predicted above, especially n-doping against its native p-doping propensity, would broaden the way to the application of this new type of two-dimensional material in nanoelectronic and optoelectronic devices.

preprint2014arXiv

JPEG Noises beyond the First Compression Cycle

This paper focuses on the JPEG noises, which include the quantization noise and the rounding noise, during a JPEG compression cycle. The JPEG noises in the first compression cycle have been well studied; however, so far less attention has been paid to the JPEG noises in higher compression cycles. In this work, we present a statistical analysis of JPEG noises beyond the first compression cycle. To our knowledge, this is the first work on this topic. We find that the noise distributions in higher compression cycles are different from those in the first compression cycle, and they are dependent on the quantization parameters used between two successive cycles. To demonstrate the benefits of the statistical analysis, we provide two applications that can employ the derived noise distributions to uncover JPEG compression history with state-of-the-art performance.

preprint2014arXiv

On the Frequency of Drawdowns for Brownian Motion Processes

Drawdowns, measuring the decline in value from the historical running maxima over a given period of time, are considered as extremal events from the standpoint of risk management. To date, research on the topic has mainly focused on the side of severity by studying the first drawdown over a certain pre-specified size. In this paper, we extend the discussion by investigating the frequency of drawdowns, and some of their inherent characteristics. We consider two types of drawdown time sequences depending on whether a historical running maximum is reset or not. For each type, we study the frequency rate of drawdowns, the Laplace transform of the $n$-th drawdown time, the distribution of the running maximum and the value process at the $n$-th drawdown time, as well as some other quantities of interest. Interesting relationships between these two drawdown time sequences are also established. Finally, insurance policies protecting against the risk of frequent drawdowns are also proposed and priced.
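
The distinction between the two drawdown time sequences can be illustrated on a discretely sampled path. The sketch below is only a discrete-time reading of the abstract, not the paper's continuous-time construction: the function name `drawdown_times`, the `reset_max` flag, and the rule that the non-reset variant waits for the path to regain its running maximum before counting again are all assumptions made for illustration.

```python
from typing import List, Sequence


def drawdown_times(path: Sequence[float], size: float, reset_max: bool) -> List[int]:
    """Return the indices at which a drawdown of at least `size` is recorded.

    reset_max=True : after each recorded drawdown the running maximum is
                     reset to the current value (assumed reading of "reset").
    reset_max=False: the historical maximum is kept, and the next drawdown is
                     only counted after the path has regained that maximum
                     (assumed reading of "not reset").
    """
    times: List[int] = []
    running_max = path[0]
    waiting_for_recovery = False
    for t, value in enumerate(path):
        if waiting_for_recovery:
            # Ignore further declines until the old maximum is reached again.
            if value >= running_max:
                waiting_for_recovery = False
                running_max = value
            continue
        running_max = max(running_max, value)
        if running_max - value >= size:
            times.append(t)
            if reset_max:
                running_max = value
            else:
                waiting_for_recovery = True
    return times


toy_path = [10, 8, 6, 4, 10, 8]
times_with_reset = drawdown_times(toy_path, size=2, reset_max=True)
times_without_reset = drawdown_times(toy_path, size=2, reset_max=False)
```

On this toy path the reset variant records a drawdown on every further decline of two units, while the non-reset variant records only the first decline and the one that follows the recovery to 10, so the two sequences give different frequency counts for the same path.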

preprint2014arXiv

Throughput-Optimal Scheduling Design with Regular Service Guarantees in Wireless Networks

Motivated by the regular service requirements of video applications for improving Quality-of-Experience (QoE) of users, we consider the design of scheduling strategies in multi-hop wireless networks that not only maximize system throughput but also provide regular inter-service times for all links. Since the service regularity of links is related to the higher-order statistics of the arrival process and the policy operation, it is highly challenging to characterize and analyze directly. We overcome this obstacle by introducing a new quantity, namely the time-since-last-service (TSLS), which tracks the time since the last service. By combining it with the queue-length in the weight, we propose a novel maximum-weight type scheduling policy, called Regular Service Guarantee (RSG) Algorithm. The unique evolution of the TSLS counter poses significant challenges for the analysis of the RSG Algorithm. To tackle these challenges, we first propose a novel Lyapunov function to show the throughput optimality of the RSG Algorithm. Then, we prove that the RSG Algorithm can provide service regularity guarantees by using the Lyapunov-drift based analysis of the steady-state behavior of the stochastic processes. In particular, our algorithm can achieve a degree of service regularity within a factor of a fundamental lower bound we derive. This factor is a function of the system statistics and design parameters and can be as low as two in some special networks. Our results, both analytical and numerical, exhibit significant service regularity improvements over the traditional throughput-optimal policies, which reveals the importance of incorporating the metric of time-since-last-service into the scheduling policy for providing regulated service.
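
A toy version of the weight described above can make the idea concrete. This is a minimal sketch under strong assumptions that are not stated in the abstract: a single-hop setting where exactly one link is served per slot, a linear weight `queue + alpha * tsls`, unit service, and a TSLS counter that resets to zero on service and otherwise grows by one. The name `rsg_step` and the parameter `alpha` are hypothetical.

```python
from typing import List, Sequence, Tuple


def rsg_step(
    queues: Sequence[int],
    tsls: Sequence[int],
    arrivals: Sequence[int],
    alpha: float = 1.0,
) -> Tuple[int, List[int], List[int]]:
    """One slot of a max-weight style rule using queue length and TSLS.

    Assumed weight: w_l = q_l + alpha * T_l, with one link served per slot.
    """
    weights = [q + alpha * t for q, t in zip(queues, tsls)]
    # Serve the link with the largest weight (ties go to the lowest index).
    served = max(range(len(weights)), key=weights.__getitem__)
    new_queues = [
        max(q - (1 if i == served else 0), 0) + a
        for i, (q, a) in enumerate(zip(queues, arrivals))
    ]
    # TSLS resets for the served link and ages by one slot for the others.
    new_tsls = [0 if i == served else t + 1 for i, t in enumerate(tsls)]
    return served, new_queues, new_tsls


# Link 0 has the backlog, but with a large alpha link 1 is served as soon as
# it has waited one slot, which is what keeps inter-service times regular.
first = rsg_step([5, 0], [0, 0], [0, 0], alpha=10.0)
second = rsg_step(first[1], first[2], [0, 0], alpha=10.0)
```

With `alpha=0` the rule reduces to plain longest-queue-first in this toy setting, which gives a simple way to compare the two behaviours.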

preprint2013arXiv

CORN: Correlation-Driven Nonparametric Learning Approach for Portfolio Selection -- an Online Appendix

This appendix proves CORN's universal consistency. One of Bin's PhD thesis examiners (special thanks to Vladimir Vovk from Royal Holloway, University of London) suggested that CORN is universal and provided a sketch proof of Lemma 1.6, which is the key to this proof. Based on the proof in Györfi et al. [2006], we thus prove CORN's universal consistency. Note that the notation in this appendix follows Györfi et al. [2006].

preprint2013arXiv

Crystal Structure on the Category of Modules over Colored Planar Rook Algebra

Colored planar rook algebra is a semigroup algebra in which the basis element has a diagrammatic description. The category of finite dimensional modules over this algebra is completely reducible and suitable functors are defined on this category so that it admits a crystal structure in the sense of Kashiwara. We show that the category and functors categorify the crystal bases for the polynomial representations of quantized enveloping algebra $U_q(gl_{n+1})$.

preprint2013arXiv

Geometry of Quantum Computation with Qutrits

Determining the quantum circuit complexity of a unitary operation is an important problem in quantum computation. By using the mathematical techniques of Riemannian geometry, we investigate the efficient quantum circuits in quantum computation with $n$ qutrits. We show that the optimal quantum circuits are essentially equivalent to the shortest path between two points in a certain curved geometry of $SU(3^n)$. As an example, three-qutrit systems are investigated in detail.

preprint2013arXiv

Observations of 6.7 GHz Methanol Masers with EAVN I: VLBI Images of the first Epoch of Observations

Very long baseline interferometry (VLBI) monitoring of the 6.7 GHz methanol maser allows us to measure the internal proper motions of the maser spots and therefore study the gas motion around high-mass young stellar objects. To this end, we have begun monitoring observations with the East-Asian VLBI Network. In this paper we present the results of the first epoch observation for 36 sources, including 35 VLBI images of the methanol maser. Since two independent sources were found in each of three images, images of 38 sources were obtained. In 34 sources, 10 or more spots were detected. The observed spatial scale of the maser distribution was from 9 to 4900 astronomical units, and the following morphological categories were observed: elliptical, arched, linear, paired, and complex. The position of the maser spot was determined to an accuracy of approximately 0.1 mas, sufficiently high to measure the internal proper motion from two years of monitoring observations. The VLBI observation, however, detected only approximately 20% of all maser emission, suggesting that the remaining 80% of the total flux was spread into an undetectable extended distribution. Therefore, in addition to high-resolution observations, it is important to observe the whole structure of the maser emission, including extended low-brightness structures, to reveal the associated site of the maser and gas motion.

preprint2013arXiv

Online Portfolio Selection: A Survey

Online portfolio selection is a fundamental problem in computational finance, which has been extensively studied across several research communities, including finance, statistics, artificial intelligence, machine learning, and data mining. This article aims to provide a comprehensive survey and a structural understanding of published online portfolio selection techniques. From an online machine learning perspective, we first formulate online portfolio selection as a sequential decision problem, and then survey a variety of state-of-the-art approaches, which are grouped into several major categories, including benchmarks, "Follow-the-Winner" approaches, "Follow-the-Loser" approaches, "Pattern-Matching" based approaches, and "Meta-Learning Algorithms". In addition to the problem formulation and related algorithms, we also discuss the relationship of these algorithms with the Capital Growth theory in order to better understand the similarities and differences of their underlying trading ideas. This article aims to provide a timely and comprehensive survey for both machine learning and data mining researchers in academia and quantitative portfolio managers in the financial industry to help them understand the state-of-the-art and facilitate their research and practical applications. We also discuss some open issues and evaluate some emerging new trends for future research directions.

preprint2013arXiv

Parallel Decoders of Polar Codes

In this letter, we propose a parallel SC (Successive Cancellation) decoder and a parallel SC-List decoder for polar codes. The parallel decoder is composed of $M=2^m$ ($m \ge 1$) component decoders working in parallel, and each component decoder decodes a Polar code with a block size of $1/M$ of the original Polar code. Therefore the parallel decoder has $M$ times faster decoding speed. Our simulation results show that the parallel decoder has almost the same error-rate performance as the conventional non-parallel decoder.

preprint2013arXiv

Time optimal quantum control of two-qubit systems

We study the optimal quantum control of heteronuclear two-qubit systems described by a Hamiltonian containing both nonlocal internal drift and local control terms. We derive an explicit formula to compute the minimum time required to steer the system from an initial state to a specified final state. As applications the minimal time to implement Controlled-NOT gate, SWAP gate and Controlled-U gate is calculated in detail. The experimental realizations of these quantum gates are explicitly presented.

preprint2012arXiv

A Fast-CSMA Algorithm for Deadline-Constrained Scheduling over Wireless Fading Channels

Recently, low-complexity and distributed Carrier Sense Multiple Access (CSMA)-based scheduling algorithms have attracted extensive interest due to their throughput-optimal characteristics in general network topologies. However, these algorithms are not well-suited for serving real-time traffic under time-varying channel conditions for two reasons: (1) the mixing time of the underlying CSMA Markov Chain grows with the size of the network, which, for large networks, generates unacceptable delay for deadline-constrained traffic; (2) since the dynamic CSMA parameters are influenced by the arrival and channel state processes, the underlying CSMA Markov Chain may not converge to a steady-state under strict deadline constraints and fading channel conditions. In this paper, we attack the problem of distributed scheduling for serving real-time traffic over time-varying channels. Specifically, we consider fully-connected topologies with independently fading channels (which can model cellular networks) in which flows with short-term deadline constraints and long-term drop rate requirements are served. To that end, we first characterize the maximal set of satisfiable arrival processes for this system and, then, propose a Fast-CSMA (FCSMA) policy that is shown to be optimal in supporting any real-time traffic that is within the maximal satisfiable set. These theoretical results are further validated through simulations to demonstrate the relative efficiency of the FCSMA policy compared to some of the existing CSMA-based algorithms.

preprint2012arXiv

An Adaptive Successive Cancellation List Decoder for Polar Codes with Cyclic Redundancy Check

In this letter, we propose an adaptive SC (Successive Cancellation)-List decoder for polar codes with CRC. This adaptive SC-List decoder iteratively increases the list size until the decoder outputs contain at least one survival path which can pass CRC. Simulation shows that the adaptive SC-List decoder provides significant complexity reduction. We also demonstrate that polar code (2048, 1024) with 24-bit CRC decoded by our proposed adaptive SC-List decoder with very large list size can achieve a frame error rate FER=0.001 at Eb/No=1.1dB, which is about 0.2dB from the information theoretic limit at this block length.
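
The control loop described above can be sketched independently of any particular polar decoder. In the sketch below the list decoder and the CRC check are passed in as callables, so nothing about the actual SC-List internals is implied. The doubling schedule, the `max_list_size` cap, and all names (`adaptive_list_decode`, `toy_list_decode`, `crc_ok`) are illustrative assumptions, and `zlib.crc32` stands in for the 24-bit CRC mentioned in the abstract.

```python
import zlib
from typing import Callable, List, Optional, Tuple


def adaptive_list_decode(
    received: bytes,
    list_decode: Callable[[bytes, int], List[bytes]],
    passes_crc: Callable[[bytes], bool],
    max_list_size: int = 32,
) -> Tuple[Optional[bytes], int]:
    """Retry list decoding with a growing list until a candidate passes the CRC."""
    list_size = 1
    while list_size <= max_list_size:
        for candidate in list_decode(received, list_size):
            if passes_crc(candidate):
                return candidate, list_size
        list_size *= 2  # assumed schedule: double the list size on failure
    return None, max_list_size


def with_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (stand-in for the real CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(word: bytes) -> bool:
    return zlib.crc32(word[:-4]).to_bytes(4, "big") == word[-4:]


def corrupt(word: bytes) -> bytes:
    """Flip one bit of the stored CRC so the check is guaranteed to fail."""
    return word[:-1] + bytes([word[-1] ^ 1])


# Toy "decoder": a fixed ranking of candidates where only the fourth is valid.
good_word = with_crc(b"good")
ranked = [corrupt(with_crc(b"bad%d" % i)) for i in range(3)] + [good_word]


def toy_list_decode(received: bytes, list_size: int) -> List[bytes]:
    return ranked[:list_size]


result = adaptive_list_decode(b"", toy_list_decode, crc_ok)
capped = adaptive_list_decode(b"", toy_list_decode, crc_ok, max_list_size=2)
```

In the toy run the valid candidate only appears once the list size reaches 4, so list sizes 1 and 2 fail before the call succeeds. Frames that decode correctly at list size 1 never pay for a larger list, which is the source of the complexity saving the abstract reports.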

preprint2012arXiv

Carrier dependent ferromagnetism in chromium doped topological insulator $Cr_{0.2}Bi_xSb_{1.8-x}Te_3$

Carrier-independent ferromagnetism of chromium doped topological insulator $Bi_xSb_{2-x}Te_3$ thin films, which cannot be explained by the current theory of dilute magnetic semiconductors, has been reported recently. To study whether it is related to the distinctive surface state of the topological insulator, we studied the structural, magnetic and transport characteristics of $Cr_{0.2}Bi_xSb_{1.8-x}Te_3$ single crystals. The Curie temperature $T_c$, which is determined from magnetization and anomalous Hall effect measurements by Arrott plots, is found to be proportional to $p^{1/3}$, where $p$ is the hole density. This fact supports a scenario of RKKY interaction with mean-field approximation. This carrier-density-dependent nature enables tuning and controlling of the magnetic properties by applying a gate voltage in future scientific research and spintronics applications.

preprint2012arXiv

Controllable spin singlet - spin triplet transition in three concentric quantum rings through magnetic field and confinement potential

We present a theoretical study of the spectrum of electrons confined in triple concentric rings. An unusual ordering and rich variety of angular momentum transitions are found that depend on the coupling between the rings and the confinement potential of the rings. Using the Configuration Interaction (CI) method, we calculated the two electron energy spectrum. Spin singlet to spin triplet transitions of the electron ground state are predicted and a fractional Aharonov-Bohm effect is found. We show that both the period and amplitude of the spin singlet - triplet energy gap depend strongly on the confinement potential and the external magnetic field. The spin singlet - triplet transition is found to depend on the spin Zeeman energy, especially for rings with weak confinement and in the presence of large magnetic field. The amplitude of the spin singlet - triplet energy gap depends on the Landé $g$-factor but the period of the transitions is independent of $g$.

preprint2012arXiv

Decomposition of the Symmetric Powers

A decomposition of any symmetric power of $\Bbb C^2\otimes\Bbb C^2\otimes\Bbb C^2$ into irreducible $sl_2(\Bbb C)\oplus sl_2(\Bbb C)\oplus sl_2(\Bbb C)$-submodules is presented. Namely, the multiplicities of irreducible summands in the symmetric power are determined.

preprint2012arXiv

Groupwise Constrained Reconstruction for Subspace Clustering

Reconstruction based subspace clustering methods compute a self reconstruction matrix over the samples and use it for spectral clustering to obtain the final clustering result. Their success largely relies on the assumption that the underlying subspaces are independent, which, however, does not always hold in the applications with increasing number of subspaces. In this paper, we propose a novel reconstruction based subspace clustering model without making the subspace independence assumption. In our model, certain properties of the reconstruction matrix are explicitly characterized using the latent cluster indicators, and the affinity matrix used for spectral clustering can be directly built from the posterior of the latent cluster indicators instead of the reconstruction matrix. Experimental results on both synthetic and real-world datasets show that the proposed model can outperform the state-of-the-art methods.

preprint2012arXiv

On-Line Portfolio Selection with Moving Average Reversion

On-line portfolio selection has attracted increasing interest in the machine learning and AI communities recently. Empirical evidence shows that a stock's high and low prices are temporary and stock price relatives are likely to follow the mean reversion phenomenon. While the existing mean reversion strategies are shown to achieve good empirical performance on many real datasets, they often make the single-period mean reversion assumption, which is not always satisfied in some real datasets, leading to poor performance when the assumption does not hold. To overcome the limitation, this article proposes a multiple-period mean reversion, or so-called Moving Average Reversion (MAR), and a new on-line portfolio selection strategy named "On-Line Moving Average Reversion" (OLMAR), which exploits MAR by applying powerful online learning techniques. From our empirical results, we found that OLMAR can overcome the drawback of existing mean reversion algorithms and achieve significantly better results, especially on the datasets where the existing mean reversion algorithms failed. In addition to superior trading performance, OLMAR also runs extremely fast, further supporting its practical applicability to a wide range of applications.

preprint2012arXiv

Tunable optical Aharonov-Bohm effect in a semiconductor quantum ring

By applying an electric field perpendicular to a semiconductor quantum ring we show that it is possible to modify the single particle wave function from quantum dot (QD)-like to ring-like. The constraints on the geometrical parameters of the quantum ring to realize such a transition are derived. With such a perpendicular electric field we are able to tune the Aharonov-Bohm (AB) effect for both single particles and excitons. The tunability is in both the strength of the AB effect and its periodicity. We also investigate the strain-induced potential inside the self-assembled quantum ring and the effect of the strain on the AB effect.

preprint2010arXiv

Canonical bases and quantum coordinate ring

Some filtrations of the tensor product of a highest weight module and a lowest weight module over quantum group $U_q(\mathfrak g)$ are constructed in \cite{LZ:2009} and one can use them to define some ideals of the modified quantized enveloping algebra. It is shown that the quotient algebras inherit canonical bases from the modified quantized enveloping algebra and are dual to the quantum coordinate ring defined by Kashiwara for symmetrizable Kac-Moody algebra $\mathfrak g$.

preprint2010arXiv

Composition Series of Tensor Product

Given a quantized enveloping algebra $U_q(\mathfrak g)$ and a pair of dominant weights ($λ$, $μ$), we extend a conjecture raised by Lusztig in \cite{Lusztig:1992} to a more general form and then prove this extended Lusztig conjecture. Namely, we prove that for any symmetrizable Kac-Moody algebra $\mathfrak g$, there is a composition series of the $U_q(\mathfrak g)$-module $V(λ)\otimes V(μ)$ compatible with the canonical basis. As a byproduct, the celebrated Littlewood-Richardson rule is derived and we also construct, in the same manner, a composition series of $V(λ)\otimes V(-μ)$ compatible with the canonical basis when $\mathfrak g$ is of affine type and the level of $λ-μ$ is nonzero.

preprint2010arXiv

Optimization Framework and Graph-Based Approach for Relay-Assisted Bidirectional OFDMA Cellular Networks

This paper considers a relay-assisted bidirectional cellular network where the base station (BS) communicates with each mobile station (MS) using OFDMA for both uplink and downlink. The goal is to improve the overall system performance by exploring the full potential of the network in various dimensions including user, subcarrier, relay, and bidirectional traffic. In this work, we first introduce a novel three-time-slot time-division duplexing (TDD) transmission protocol. This protocol unifies direct transmission, one-way relaying and network-coded two-way relaying between the BS and each MS. Using the proposed three-time-slot TDD protocol, we then propose an optimization framework for resource allocation to achieve the following gains: cooperative diversity (via relay selection), network coding gain (via bidirectional transmission mode selection), and multiuser diversity (via subcarrier assignment). We formulate the problem as a combinatorial optimization problem, which is NP-complete. To make it more tractable, we adopt a graph-based approach. We first establish the equivalence between the original problem and a maximum weighted clique problem in graph theory. A metaheuristic algorithm based on ant colony optimization (ACO) is then employed to find the solution in polynomial time. Simulation results demonstrate that the proposed protocol together with the ACO algorithm significantly enhances the total system throughput.


124 published item(s)

Reward-Decomposed Reinforcement Learning for Immersive Video Role-Playing

Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints

Towards Optimal Tradeoff Between Data Freshness and Update Cost in Information-update Systems

3D Perception based Imitation Learning under Limited Demonstration for Laparoscope Control in Robotic Surgery

A Higher-Order Semantic Dependency Parser

A New Perspective on Stabilizing GANs training: Direct Adversarial Training

ADBCMM : Acronym Disambiguation by Building Counterfactuals and Multilingual Mixing

Carbide: Highly Reliable Networks Through Real-Time Multiple Control Plane Composition

Charactering instrumental noises and stochastic gravitational wave signals from combined time-delay interferometry

Combinatorial Procurement Auction in Social Networks

Data-Efficient Backdoor Attacks

Digital Twin Assisted Task Offloading for Aerial Edge Computing and Networks

Dog nose print matching with dual global descriptor based on Contrastive Learning

Enhancing Backdoor Attacks with Multi-Level MMD Regularization

Equivalence Analysis between Counterfactual Regret Minimization and Online Mirror Descent

FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs

Graph Layer Security: Encrypting Information via Common Networked Physics

Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression

Improve Radar Sensing Performance of Multiple Roadside Units Cooperation via Space Registration

LAMOST MRS-N Observations of the W80 Region

LDoS attack detection method based on traffic time-frequency characteristics

Learning Task-relevant Representations for Generalization via Characteristic Functions of Reward Sequence Distributions

Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning

Multi-Unit Diffusion Auctions with Intermediaries

MVD: Memory-Related Vulnerability Detection Based on Flow-Sensitive Graph Neural Networks

Neural Compression-Based Feature Learning for Video Restoration

New Massive Contact Twin Binary in a Radio-quiet HII Region Associated with the M17 Complex

Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction

Phase Transitions and Superconductivity in Ternary Hydride Li$_2$SiH$_6$ at High Pressures

Prompt-based System for Personality and Interpersonal Reactivity Prediction

Remote blood pressure measurement via spatiotemporal mapping of a short-time facial video

Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation

Secure UAV-to-Ground MIMO Communications: Joint Transceiver and Location Optimization

Self-Adversarial Training incorporating Forgery Attention for Image Forgery Localization

Style Spectroscope: Improve Interpretability and Controllability through Fourier Analysis

The Data Processing of the LAMOST Medium-Resolution Spectral Survey of Galactic Nebulae (LAMOST MRS-N Pipeline)

Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

Universal Polar Coding for Parallel Gaussian Channels with Non-Binary Inputs and Its Applications to HARQ and MIMO

Waiting but not Aging: Optimizing Information Freshness Under the Pull Model

A Generic Object Re-identification System for Short Videos

Bayesian Nonparametric Space Partitions: A Survey

Efficient Learning-based Scheduling for Information Freshness in Wireless Networks

Image Steganography based on Iteratively Adversarial Samples of A Synchronized-directions Sub-image

Infant Cry Classification with Graph Convolutional Networks

More but Correct: Generating Diversified and Entity-revised Medical Response

Protonation-induced discrete superconducting phases in bulk FeSe single crystals

Quantum versus Classical Regime in Circuit Quantum Acoustodynamics

Self-supervised Visual-LiDAR Odometry with Flip Consistency

Serial-parallel Multi-Scale Feature Fusion for Anatomy-Oriented Hand Joint Detection

The intrinsic structure of Sagittarius A* at 1.3 cm and 7 mm

Understanding the Error in Evaluating Adversarial Robustness

A Fast Recursive Algorithm for G-STBC

An Improved Square-root Algorithm for V-BLAST Based on Efficient Inverse Cholesky Factorization

Bulk Superconductivity in the Dirac Semimetal TlSb

CALPA-NET: Channel-pruning-assisted Deep Residual Network for Steganalysis of Digital Images

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning

Crop Water Status Monitoring by Terahertz Imaging

DR 21 South Filament: a Parsec-sized Dense Gas Accretion Flow onto the DR 21 Massive Young Cluster

Dual-stream Maximum Self-attention Multi-instance Learning

Estimation of Regional Economic Development Indicator from Transportation Network Analytics

Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset

How Molecular Chiralities of Bis(mandelato)borate Anions affect Their Binding Structures with Alkali Metal Ions and Microstructural Properties in Tetraalkylphosphonium Ionic Liquids

Identification of Deep Network Generated Images Using Disparities in Color Components

Incentive-Compatible Diffusion Auctions

Interpreting the Latent Space of GANs via Correlation Analysis for Controllable Concept Manipulation

Multitask Non-Autoregressive Model for Human Motion Prediction

Online Binary Space Partitioning Forests

Outlier Detection Ensemble with Embedded Feature Selection

Pressure Engineering of the Dirac Fermions in Quasi-One-Dimensional Tl$_2$Mo$_6$Se$_6$

Question Guided Modular Routing Networks for Visual Question Answering

Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data Modelling

Smoothing Graphons for Modelling Exchangeable Relational Data

The ENUF Method -- Ewald Summation based on Non-Uniform Fast Fourier Transform: Implementation, Parallelization, and Application

Topological Dirac states in a layered telluride TaPdTe$_5$ with quasi-one-dimensional PdTe$_2$ chains