Source author record

Wei Yang

Wei Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

94 works

36 topics

4 close collaborators

preprint, 2026, arXiv

CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

Recent advances in self-evolution video understanding frameworks have demonstrated the potential of autonomous learning without human annotations. However, existing methods often suffer from weakly controlled optimization and uncontrolled difficulty progression, as they lack structured guidance throughout the iterative learning process. To address these limitations, we propose CurEvo, a curriculum-guided self-evolution framework that introduces curriculum learning into self-evolution to achieve more structured and progressive model improvement. CurEvo dynamically regulates task difficulty, refines evaluation criteria, and balances data diversity according to model competence, forming a curriculum-guided feedback loop that aligns learning complexity with model capability. Built upon this principle, we develop a multi-dimensional adaptive QA framework that jointly evolves question generation and answer evaluation across perception, recognition, and understanding dimensions, ensuring coherent and measurable curriculum progression. Through this integration, CurEvo transforms weakly controlled self-evolution into a more structured learning process for autonomous video understanding. Across seven backbones, CurEvo consistently improves both benchmark accuracy and evaluator-based semantic score on four VideoQA benchmarks, validating the effectiveness of curriculum-guided self-evolution for video understanding.

preprint, 2023, arXiv

Artificial intelligence for diagnosing and predicting survival of patients with renal cell carcinoma: Retrospective multi-center study

Background: Clear cell renal cell carcinoma (ccRCC) is the most common renal-related tumor with high heterogeneity. There is still an urgent need for novel diagnostic and prognostic biomarkers for ccRCC. Methods: We proposed a weakly-supervised deep learning strategy using conventional histology of 1752 whole slide images from multiple centers. The deep learning-based models were evaluated through internal cross-validation and external validation. Results: Automatic diagnosis of ccRCC through intelligent subtyping of renal cell carcinoma was demonstrated in this study. Our graderisk achieved an area under the curve (AUC) of 0.840 (95% confidence interval: 0.805-0.871) in the TCGA cohort, 0.840 (0.805-0.871) in the General cohort, and 0.840 (0.805-0.871) in the CPTAC cohort for the recognition of high-grade tumor. The OSrisk for the prediction of 5-year survival status achieved an AUC of 0.784 (0.746-0.819) in the TCGA cohort, which was further verified in the independent General cohort and the CPTAC cohort, with AUCs of 0.774 (0.723-0.820) and 0.702 (0.632-0.765), respectively. Cox regression analysis indicated that graderisk, OSrisk, tumor grade, and tumor stage were independent prognostic factors, which were further incorporated into the competing-risk nomogram (CRN). Kaplan-Meier survival analyses further illustrated that our CRN could significantly distinguish patients with high survival risk, with hazard ratios of 5.664 (3.893-8.239, p < 0.0001) in the TCGA cohort, 35.740 (5.889-216.900, p < 0.0001) in the General cohort, and 6.107 (1.815-20.540, p < 0.0001) in the CPTAC cohort. Comparison analyses confirmed that our CRN outperformed current prognosis indicators in the prediction of survival status, with a higher concordance index for clinical prognosis.

preprint, 2023, arXiv

MLMSA: Multi-Label Multi-Side-Channel-Information enabled Deep Learning Attacks on APUF Variants

To improve the modeling resilience of silicon strong physical unclonable functions (PUFs), in particular the APUFs, which yield a very large number of challenge-response pairs (CRPs), a number of composited APUF variants such as XOR-APUF, interpose-PUF (iPUF), feed-forward APUF (FF-APUF), and OAX-APUF have been devised. When examining their security in terms of modeling resilience, the use of multiple information sources for a given challenge, such as power side channel information (SCI) and/or reliability SCI, is under-explored, which poses a challenge to their supposed modeling resilience in practice. Building upon a multi-label/multi-head deep learning model architecture, this work proposes Multi-Label Multi-Side-channel-information enabled deep learning Attacks (MLMSA) to thoroughly evaluate the modeling resilience of the aforementioned APUF variants. Despite its simplicity, MLMSA can successfully break large-scale APUF variants, which has not previously been achieved. More precisely, MLMSA breaks the 128-stage 30-XOR-APUF, the (9, 9)- and (2, 18)-iPUFs, and the (2, 2, 30)-OAX-APUF when CRPs, power SCI, and reliability SCI are used concurrently. It breaks the 128-stage 12-XOR-APUF and the (2, 2, 9)-OAX-APUF even when only the easy-to-obtain reliability SCI and CRPs are exploited. The 128-stage six-loop FF-APUF and one-loop 20-XOR-FF-APUF can be broken by simultaneously using reliability SCI and CRPs. All these attacks are normally completed within an hour on a standard personal computer. Therefore, MLMSA is a useful technique for evaluating other existing or emerging strong PUF designs.

preprint, 2022, arXiv

Analysis Facilities for HL-LHC

The HL-LHC presents significant challenges for the HEP analysis community. The number of events in each analysis is expected to increase by an order of magnitude and new techniques are expected to be required; both challenges necessitate new services and approaches for analysis facilities. These services are expected to provide new capabilities, a larger scale, and different access modalities (complementing -- but distinct from -- traditional batch-oriented approaches). To facilitate this transition, the US-LHC community is actively investing in analysis facilities to provide a testbed for those developing new analysis systems and to demonstrate new techniques for service delivery. This whitepaper outlines the existing activities within the US LHC community in this R&D area, the short- to medium-term goals, and the outline of common goals and milestones.

preprint, 2022, arXiv

CorrGAN: Input Transformation Technique Against Natural Corruptions

Because of the increasing accuracy of Deep Neural Networks (DNNs) on different tasks, many real-time systems are utilizing DNNs. These DNNs are vulnerable to adversarial perturbations and corruptions. Specifically, natural corruptions like fog, blur, contrast, etc. can affect the prediction of a DNN in an autonomous vehicle. In real time, these corruptions need to be detected, and the corrupted inputs need to be de-noised so that they are predicted correctly. In this work, we propose the CorrGAN approach, which can generate a benign input when a corrupted input is provided. In this framework, we train a Generative Adversarial Network (GAN) with a novel intermediate output-based loss function. The GAN can denoise the corrupted input and generate a benign input. Through experimentation, we show that up to 75.2% of the corrupted misclassified inputs can be classified correctly by the DNN using CorrGAN.

preprint, 2022, arXiv

Design and Evaluate Recomposited OR-AND-XOR-PUF

Physical Unclonable Function (PUF) is a hardware security primitive with the desirable feature of low cost. Based on the space of challenge-response pairs (CRPs), PUFs fall into two categories: weak PUFs and strong PUFs. Though designing a reliable and secure lightweight strong PUF is challenging, there are continuing efforts to fill this gap because of the wide range of applications enabled by strong PUFs. It has been suggested that combining the MAX and MIN bit-wise operations is promising for improving modeling resilience when they are employed in PUF recomposition. The main rationale is that each bit-wise operation might be mainly vulnerable to one specific type of modeling attack, so combining them can improve overall resilience. This work first evaluates the main PUF performance metrics, in particular uniformity and reliability, of the OR-AND-XOR-PUF (OAX-PUF), denoted (x, y, z)-OAX-PUF. Compared with the most used l-XOR-PUF, the (x, y, z)-OAX-PUF exhibits better reliability given l=x+y+z, without degrading the uniformity, which remains at 50%. We further examine the modeling resilience of the (x, y, z)-OAX-PUF against four powerful attack strategies to date: the Logistic Regression (LR) attack, the reliability-assisted CMA-ES attack, the multilayer perceptron (MLP) attack, and the most recent hybrid LR-reliability attack. In comparison with the XOR-APUF, the OAX-APUF successfully defeats the CMA-ES attack. However, it shows no notable drop in modeling accuracy against the other three attacks, though the attack times of the LR and hybrid LR-reliability attacks are greatly prolonged. Overall, the OAX recomposition could be an alternative lightweight recomposition method to XOR for constructing strong PUFs if the underlying PUF, e.g., FF-APUF, has exhibited improved resilience to modeling attacks, because OAX incurs smaller reliability degradation than XOR.
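
As a rough illustration of the (x, y, z)-OAX recomposition described in this abstract, the Python sketch below combines the responses of x + y + z simulated arbiter PUFs with OR, AND, and XOR. The linear additive delay model with Gaussian weights, the final XOR of the three group outputs, and every function name here are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random
from functools import reduce


def make_apuf(n_stages, rng):
    # Hypothetical APUF instance: one Gaussian weight per stage plus a bias weight.
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def features(challenge):
    # Parity transform commonly used for arbiter PUF models:
    # phi_i = product over j >= i of (1 - 2 * c_j), followed by a constant 1.
    phi = []
    prod = 1
    for bit in reversed(challenge):
        prod *= 1 - 2 * bit
        phi.append(prod)
    phi.reverse()
    phi.append(1)
    return phi


def apuf_response(weights, challenge):
    # The response is the sign of the modeled delay difference.
    delay = sum(w * f for w, f in zip(weights, features(challenge)))
    return 1 if delay > 0 else 0


def oax_response(or_pufs, and_pufs, xor_pufs, challenge):
    # Assumed combination: OR group, AND group, XOR group, then XOR of the three bits.
    r_or = int(any(apuf_response(p, challenge) for p in or_pufs))
    r_and = int(all(apuf_response(p, challenge) for p in and_pufs))
    r_xor = reduce(lambda a, b: a ^ b,
                   (apuf_response(p, challenge) for p in xor_pufs), 0)
    return r_or ^ r_and ^ r_xor


def uniformity(x, y, z, n_stages=64, n_challenges=2000, seed=0):
    # Fraction of 1-responses over random challenges (ideal value: 0.5).
    rng = random.Random(seed)
    or_pufs = [make_apuf(n_stages, rng) for _ in range(x)]
    and_pufs = [make_apuf(n_stages, rng) for _ in range(y)]
    xor_pufs = [make_apuf(n_stages, rng) for _ in range(z)]
    ones = 0
    for _ in range(n_challenges):
        challenge = [rng.randint(0, 1) for _ in range(n_stages)]
        ones += oax_response(or_pufs, and_pufs, xor_pufs, challenge)
    return ones / n_challenges
```

Calling `uniformity(2, 2, 2)` estimates how balanced the simulated responses are under these assumptions; it says nothing about reliability or modeling resilience, which need noise and attack models that are outside this sketch.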

preprint, 2022, arXiv

Detecting Topology Attacks against Graph Neural Networks

Graph neural networks (GNNs) have been widely used in many real applications, and recent studies have revealed their vulnerabilities against topology attacks. To address this issue, existing efforts have mainly been dedicated to improving the robustness of GNNs, while little attention has been paid to the detection of such attacks. In this work, we study the victim node detection problem under topology attacks against GNNs. Our approach is built upon the key observation rooted in the intrinsic message passing nature of GNNs. That is, the neighborhood of a victim node tends to have two competing group forces, pushing the node classification results towards the original label and the targeted label, respectively. Based on this observation, we propose to detect victim nodes by deliberately designing an effective measurement of the neighborhood variance for each node. Extensive experimental results on four real-world datasets and five existing topology attacks show the effectiveness and efficiency of the proposed detection approach.
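
The abstract does not define its neighborhood variance measurement, so the following Python sketch shows only one plausible reading: for each node, take the predicted class distributions of its neighbors and sum the per-class variances. A node whose neighbors split between two labels then scores higher than a node with a consistent neighborhood. The data layout (`adj`, `probs`) and both function names are hypothetical.

```python
def neighborhood_variance(adj, probs, node):
    # adj: dict mapping node -> list of neighbor nodes (hypothetical layout).
    # probs: dict mapping node -> list of predicted class probabilities.
    neighbors = adj.get(node, [])
    if len(neighbors) < 2:
        return 0.0  # variance is not informative with fewer than two neighbors
    n_classes = len(probs[neighbors[0]])
    total = 0.0
    for k in range(n_classes):
        values = [probs[v][k] for v in neighbors]
        mean = sum(values) / len(values)
        total += sum((x - mean) ** 2 for x in values) / len(values)
    return total


def rank_suspects(adj, probs):
    # Nodes sorted from most to least suspicious under this assumed score.
    return sorted(adj,
                  key=lambda u: neighborhood_variance(adj, probs, u),
                  reverse=True)
```

In practice a detection threshold would have to be chosen on held-out data; this sketch only ranks nodes.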

preprint, 2022, arXiv

EREBA: Black-box Energy Testing of Adaptive Neural Networks

Recently, various Deep Neural Network (DNN) models have been proposed for environments like embedded systems with stringent energy constraints. The fundamental problem of determining the robustness of a DNN with respect to its energy consumption (energy robustness) is relatively unexplored compared to accuracy-based robustness. This work investigates the energy robustness of Adaptive Neural Networks (AdNNs), a type of energy-saving DNN that has been proposed for many energy-sensitive domains and has recently gained traction. We propose EREBA, the first black-box testing method for determining the energy robustness of an AdNN. EREBA explores and infers the relationship between inputs and the energy consumption of AdNNs to generate energy-surging samples. Extensive implementation and evaluation using three state-of-the-art AdNNs demonstrate that test inputs generated by EREBA could degrade the performance of the system substantially. The test inputs generated by EREBA can increase the energy consumption of AdNNs by 2,000% compared to the original inputs. Our results also show that test inputs generated via EREBA are valuable in detecting energy-surging inputs.

preprint, 2022, arXiv

Fast localization and single-pixel imaging of the moving object using time-division multiplexing

When imaging moving objects, single-pixel imaging produces motion blur. This paper proposes a new single-pixel imaging method that achieves anti-motion-blur imaging of a fast-moving object. Geometric moment patterns and Hadamard patterns are used alternately, with time-division multiplexing, to encode the position information and the image information of the object. In the reconstruction process, the object position information is extracted independently and combined with a motion-compensation reconstruction algorithm to decouple the object motion from the image information. As a result, an anti-motion-blur image and high-frame-rate object positions are obtained. Experimental results show that for a moving object with an angular velocity of up to 0.5 rad/s relative to the imaging system, the proposed method achieves a localization frequency of 5.55 kHz and gradually reconstructs a clear image of the fast-moving object with a pseudo resolution of 512×512. The method has application prospects in single-pixel imaging of fast-moving objects.
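
To make the localization step more concrete, here is a minimal Python sketch of how geometric moment patterns can yield an object position from only three single-pixel measurements: a constant pattern gives the zeroth-order moment m00, and linear ramps in x and y give the first-order moments m10 and m01, so the centroid is (m10/m00, m01/m00). The specific pattern set, the noise-free scene, and the function names are assumptions for illustration; the Hadamard imaging and motion compensation from the abstract are not modeled.

```python
def single_pixel_measure(scene, pattern):
    # A single-pixel detector records the total light after pattern modulation.
    return sum(s * p
               for scene_row, pattern_row in zip(scene, pattern)
               for s, p in zip(scene_row, pattern_row))


def moment_patterns(height, width):
    # Assumed geometric moment patterns: constant, x-ramp, and y-ramp.
    ones = [[1] * width for _ in range(height)]
    x_ramp = [list(range(width)) for _ in range(height)]
    y_ramp = [[y] * width for y in range(height)]
    return ones, x_ramp, y_ramp


def localize(scene):
    # Returns the intensity centroid (x, y), or None for an empty scene.
    height, width = len(scene), len(scene[0])
    p00, p10, p01 = moment_patterns(height, width)
    m00 = single_pixel_measure(scene, p00)
    if m00 == 0:
        return None
    m10 = single_pixel_measure(scene, p10)
    m01 = single_pixel_measure(scene, p01)
    return (m10 / m00, m01 / m00)
```

Because each position estimate needs so few patterns, localization can run far faster than full image acquisition, which is consistent with the high localization frequency reported above.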

preprint, 2022, arXiv

HandoverSim: A Simulation Framework and Benchmark for Human-to-Robot Object Handovers

We introduce a new simulation benchmark "HandoverSim" for human-to-robot object handovers. To simulate the giver's motion, we leverage a recent motion capture dataset of hand grasping of objects. We create training and evaluation environments for the receiver with standardized protocols and metrics. We analyze the performance of a set of baselines and show a correlation with a real-world evaluation. Code is open sourced at https://handover-sim.github.io.

preprint, 2022, arXiv

HumanNeRF: Efficiently Generated Human Radiance Field from Sparse Inputs

Recent neural human representations can produce high-quality multi-view rendering but require using dense multi-view inputs and costly training. They are hence largely limited to static models as training each frame is infeasible. We present HumanNeRF - a generalizable neural representation - for high-fidelity free-view synthesis of dynamic humans. Analogous to how IBRNet assists NeRF by avoiding per-scene training, HumanNeRF employs an aggregated pixel-alignment feature across multi-view inputs along with a pose embedded non-rigid deformation field for tackling dynamic motions. The raw HumanNeRF can already produce reasonable rendering on sparse video inputs of unseen subjects and camera settings. To further improve the rendering quality, we augment our solution with an appearance blending module for combining the benefits of both neural volumetric rendering and neural texture blending. Extensive experiments on various multi-view dynamic human datasets demonstrate the generalizability and effectiveness of our approach in synthesizing photo-realistic free-view humans under challenging motions and with very sparse camera view inputs.

preprint, 2022, arXiv

Learning Free Gait Transition for Quadruped Robots via Phase-Guided Controller

Gaits and transitions are key components in legged locomotion. For legged robots, describing and reproducing gaits as well as transitions remain longstanding challenges. Reinforcement learning has become a powerful tool to formulate controllers for legged robots. Learning multiple gaits and transitions, nevertheless, is a multi-task learning problem. In this work, we present a novel framework for training a simple control policy for a quadruped robot to locomote in various gaits. Four independent phases, which characterize the movement of the four feet, are used as the interface between the gait generator and the control policy. Guided by the phases, the quadruped robot is able to locomote according to the generated gaits, such as walking, trotting, pacing, and bounding, and to make transitions among those gaits. More general phases can be used to generate complex gaits, such as mixed rhythmic dancing. With the control policy, the Black Panther robot, a medium-dog-sized quadruped robot, can perform all learned motor skills while following the velocity commands smoothly and robustly in natural environments.
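
The idea of four per-foot phases as the interface to a gait generator can be sketched as follows. This minimal Python example assigns each foot a phase in [0, 1) from a shared clock plus a gait-specific offset, and treats a foot as in stance while its phase is below a duty factor. The offset table uses conventional textbook values, and the foot ordering, period, duty factor, and function names are all assumptions rather than the paper's actual generator.

```python
# Phase offsets per foot in the order (front-left, front-right, rear-left, rear-right).
# These are conventional illustrative values, not taken from the paper.
GAIT_OFFSETS = {
    "walk": (0.25, 0.75, 0.0, 0.5),
    "trot": (0.0, 0.5, 0.5, 0.0),
    "pace": (0.0, 0.5, 0.0, 0.5),
    "bound": (0.0, 0.0, 0.5, 0.5),
}


def foot_phases(gait, t, period=0.5):
    # Each foot's phase is the shared clock phase shifted by its gait offset.
    base = (t / period) % 1.0
    return tuple((base + offset) % 1.0 for offset in GAIT_OFFSETS[gait])


def contacts(phases, duty=0.5):
    # A foot is assumed to be in stance while its phase is below the duty factor.
    return tuple(phase < duty for phase in phases)
```

For example, `contacts(foot_phases("trot", 0.0))` puts the front-left and rear-right feet in stance together, the diagonal pairing that characterizes a trot. A learned policy would consume the phases as inputs; this sketch covers only the generator side.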

preprint, 2022, arXiv

Learning Perceptual Concepts by Bootstrapping from Human Queries

When robots operate in human environments, it's critical that humans can quickly teach them new concepts: object-centric properties of the environment that they care about (e.g. objects near, upright, etc). However, teaching a new perceptual concept from high-dimensional robot sensor data (e.g. point clouds) is demanding, requiring an unrealistic amount of human labels. To address this, we propose a framework called Perceptual Concept Bootstrapping (PCB). First, we leverage the inherently lower-dimensional privileged information, e.g., object poses and bounding boxes, available from a simulator only at training time to rapidly learn a low-dimensional, geometric concept from minimal human input. Second, we treat this low-dimensional concept as an automatic labeler to synthesize a large-scale high-dimensional data set with the simulator. With these two key ideas, PCB alleviates human label burden while still learning perceptual concepts that work with real sensor input where no privileged information is available. We evaluate PCB for learning spatial concepts that describe object state or multi-object relationships, and show it achieves superior performance compared to baseline methods. We also demonstrate the utility of the learned concepts in motion planning tasks on a 7-DoF Franka Panda robot.

preprint, 2022, arXiv

Model Predictive Control for Fluid Human-to-Robot Handovers

Human-robot handover is a fundamental yet challenging task in human-robot interaction and collaboration. Recently, remarkable progress has been made in human-to-robot handovers of unknown objects by using learning-based grasp generators. However, how to responsively generate smooth motions to take an object from a human is still an open question. Specifically, planning motions that take human comfort into account is not a part of the human-robot handover process in most prior works. In this paper, we propose to generate smooth motions via an efficient model-predictive control (MPC) framework that integrates perception and complex domain-specific constraints into the optimization problem. We introduce a learning-based grasp reachability model to select candidate grasps which maximize the robot's manipulability, giving it more freedom to satisfy these constraints. Finally, we integrate a neural net force/torque classifier that detects contact events from noisy data. We conducted human-to-robot handover experiments on a diverse set of objects with several users (N=4) and performed a systematic evaluation of each module. The study shows that the users preferred our MPC approach over the baseline system by a large margin. More results and videos are available at https://sites.google.com/nvidia.com/mpc-for-handover.

preprint, 2022, arXiv

Multi-Modal Fusion in Contact-Rich Precise Tasks via Hierarchical Policy Learning

Combined visual and force feedback plays an essential role in contact-rich robotic manipulation tasks. Current methods focus on developing the feedback control around a single modality while underrating the synergy of the sensors. Fusing different sensor modalities is necessary but remains challenging. A key challenge is to achieve an effective multi-modal control scheme that generalizes to novel objects with precision. This paper proposes a practical multi-modal sensor fusion mechanism using hierarchical policy learning. To begin with, we use a self-supervised encoder that extracts multi-view visual features and a hybrid motion/force controller that regulates force behaviors. Next, the multi-modality fusion is simplified by hierarchical integration of the vision, force, and proprioceptive data in the reinforcement learning (RL) algorithm. Moreover, with hierarchical policy learning, the control scheme can exploit the visual feedback limits and explore the contribution of each individual modality in precise tasks. Experiments indicate that robots with the control scheme could assemble objects with 0.25 mm clearance in simulation. The system could be generalized to widely varied initial configurations and new shapes. Experiments validate that the simulated system can be robustly transferred to reality without fine-tuning.

preprint, 2022, arXiv

NeReF: Neural Refractive Field for Fluid Surface Reconstruction and Implicit Representation

Existing neural reconstruction schemes such as Neural Radiance Field (NeRF) are largely focused on modeling opaque objects. We present a novel neural refractive field (NeReF) to recover the wavefront of transparent fluids by simultaneously estimating the surface position and normal of the fluid front. Unlike prior art that treats the reconstruction target as a single layer of the surface, NeReF is specifically formulated to recover a volumetric normal field with its corresponding density field. A query ray will be refracted by NeReF according to its accumulated refractive point and normal, and we employ the correspondences and uniqueness of the refracted ray for NeReF optimization. We show that NeReF, as a global optimization scheme, can more robustly tackle refraction distortions detrimental to traditional methods for correspondence matching. Furthermore, the continuous NeReF representation of the wavefront enables view synthesis as well as normal integration. We validate our approach on both synthetic and real data and show it is particularly suitable for sparse multi-view acquisition. We hence build a small light field array and experiment on various surface shapes to demonstrate high-fidelity NeReF reconstruction.

preprint, 2022, arXiv

Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions

The pipeline of current robotic pick-and-place methods typically consists of several stages: grasp pose detection, finding inverse kinematic solutions for the detected poses, planning a collision-free trajectory, and then executing the open-loop trajectory to the grasp pose with a low-level tracking controller. While these grasping methods have shown good performance on grasping static objects on a table-top, grasping dynamic objects in constrained environments remains an open problem. We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network. This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.

preprint, 2022, arXiv

NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing

Some of the most exciting experiences that the Metaverse promises to offer, for instance, live interactions with virtual characters in virtual environments, require real-time photo-realistic rendering. 3D reconstruction approaches to rendering, active or passive, still require extensive cleanup work to fix the meshes or point clouds. In this paper, we present a neural volumography technique called neural volumetric video or NeuVV to support immersive, interactive, and spatial-temporal rendering of volumetric video contents with photo-realism and in real-time. The core of NeuVV is to efficiently encode a dynamic neural radiance field (NeRF) into renderable and editable primitives. We introduce two types of factorization schemes: a hyper-spherical harmonics (HH) decomposition for modeling smooth color variations over space and time and a learnable basis representation for modeling abrupt density and color changes caused by motion. NeuVV factorization can be integrated into a Video Octree (VOctree) analogous to PlenOctree to significantly accelerate training while reducing memory overhead. Real-time NeuVV rendering further enables a class of immersive content editing tools. Specifically, NeuVV treats each VOctree as a primitive and implements volume-based depth ordering and alpha blending to realize spatial-temporal compositions for content re-purposing. For example, we demonstrate positioning varied manifestations of the same performance at different 3D locations with different timing, adjusting color/texture of the performer's clothing, casting spotlight shadows and synthesizing distance falloff lighting, etc., all at an interactive speed. We further develop a hybrid neural-rasterization rendering framework to support consumer-level VR headsets so that the aforementioned volumetric video viewing and editing, for the first time, can be conducted immersively in virtual 3D space.

preprint, 2022, arXiv

NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models

Neural image caption generation (NICG) models have received massive attention from the research community due to their excellent performance in visual understanding. Existing work focuses on improving NICG model accuracy while efficiency is less explored. However, many real-world applications require real-time feedback, which relies heavily on the efficiency of NICG models. Recent research observed that the efficiency of NICG models could vary for different inputs. This observation exposes a new attack surface of NICG models, i.e., an adversary might be able to slightly change inputs to cause the NICG models to consume more computational resources. To further understand such efficiency-oriented threats, we propose a new attack approach, NICGSlowDown, to evaluate the efficiency robustness of NICG models. Our experimental results show that NICGSlowDown can generate images with human-unnoticeable perturbations that will increase the NICG model latency by up to 483.86%. We hope this research could raise the community's concern about the efficiency robustness of NICG models.

preprint, 2022, arXiv

Spin Manipulation by Giant Valley-Zeeman Spin-Orbit Field in Atom-Thick WSe2

The phenomenon originating from spin-orbit coupling (SOC) provides energy-efficient strategies for spin manipulation and device applications. The broken inversion symmetry interface and resulting electric field induce a Rashba-type spin-orbit field (SOF), which has been demonstrated to generate spin-orbit torque for data storage applications. In this study, we found that spin flipping can be achieved by the valley-Zeeman SOF in monolayer WSe2 at room temperature, which manifests as a negative magnetoresistance in the vertical spin valve. Quantum transmission calculations based on an effective model near the K valley of WSe2 confirm the precessional spin transport of carriers under the giant SOF, which is estimated to be 650 T. In particular, the valley-Zeeman SOF-induced spin dynamics was demonstrated to be tunable with the layer number and stacking phase of WSe2 as well as the gate voltage, which provides a novel strategy for spin manipulation and can benefit the development of ultralow-power spintronic devices.

preprint, 2022, arXiv

Temperature-linear Resistivity in Twisted Double Bilayer Graphene

We report an experimental study of the carrier density (n), displacement field (D), and twist angle (θ) dependence of temperature (T)-linear resistivity in twisted double bilayer graphene (TDBG). For a large twist angle (θ > 1.5°), where correlated insulating states are absent, we observe a T-linear resistivity (with a slope of the order of ~10 Ω/K) over a wide range of carrier density, and its slope decreases with increasing n, in semi-quantitative agreement with the acoustic phonon scattering model. The slope of the T-linear resistivity depends non-monotonically on the displacement field, with a single-peak structure. For a device with θ ~ 1.23°, at which correlated states emerge, the slope of the T-linear resistivity is found to be maximal (~100 Ω/K) at the boundary of the halo structure where the phase transition occurs, with signatures of a continuous phase transition, Planckian dissipation, and a diverging effective mass; these observations are in line with quantum critical behaviors, which might be due to the symmetry-breaking instability at the critical points. Our results shed new light on correlated physics in TDBG and other twisted moiré systems.

preprint, 2022, arXiv

Transformer Tracking with Cyclic Shifting Window Attention

Transformer architecture has been showing its great strength in visual object tracking, for its effective attention mechanism. Existing transformer-based approaches adopt the pixel-to-pixel attention strategy on flattened image features and unavoidably ignore the integrity of objects. In this paper, we propose a new transformer architecture with multi-scale cyclic shifting window attention for visual object tracking, elevating the attention from pixel to window level. The cross-window multi-scale attention has the advantage of aggregating attention at different scales and generates the best fine-scale match for the target object. Furthermore, the cyclic shifting strategy brings greater accuracy by expanding the window samples with positional information, and at the same time saves huge amounts of computational power by removing redundant calculations. Extensive experiments demonstrate the superior performance of our method, which also sets the new state-of-the-art records on five challenging datasets, along with the VOT2020, UAV123, LaSOT, TrackingNet, and GOT-10k benchmarks.

preprint, 2021, arXiv

An Empirical Analysis of UI-based Flaky Tests

Flaky tests have gained attention from the research community in recent years and with good reason. These tests lead to wasted time and resources, and they reduce the reliability of the test suites and build systems they affect. However, most of the existing work on flaky tests focuses exclusively on traditional unit tests. This work ignores UI tests that have larger input spaces and more diverse running conditions than traditional unit tests. In addition, UI tests tend to be more complex and resource-heavy, making them unsuited for detection techniques involving rerunning test suites multiple times. In this paper, we perform a study on flaky UI tests. We analyze 235 flaky UI test samples found in 62 projects from both web and Android environments. We identify the common underlying root causes of flakiness in the UI tests, the strategies used to manifest the flaky behavior, and the fixing strategies used to remedy flaky UI tests. The findings made in this work can provide a foundation for the development of detection and prevention techniques for flakiness arising in UI tests.

preprint, 2021, arXiv

F3SNet: A Four-Step Strategy for QIM Steganalysis of Compressed Speech Based on Hierarchical Attention Network

Traditional machine learning-based steganalysis methods on compressed speech have achieved great success in the field of communication security. However, previous studies lacked a mathematical description and modeling of the correlation between codewords, and there is still room for improvement in steganalysis for small-sized and low-embedding-rate samples. To deal with this challenge, we use Bayesian networks to measure different types of correlations between codewords in linear prediction code and present F3SNet -- a four-step strategy: Embedding, Encoding, Attention and Classification for quantization index modulation steganalysis of compressed speech based on Hierarchical Attention Network. Among them, Embedding converts codewords into high-density numerical vectors, Encoding uses the memory characteristics of LSTM to retain more information by distributing it among all its vectors, and Attention further determines which vectors have a greater impact on the final classification result. To evaluate the performance of F3SNet, we make a comprehensive comparison of F3SNet with existing steganography methods. Experimental results show that F3SNet surpasses the state-of-the-art methods, particularly for small-sized and low-embedding-rate samples.

preprint2021arXiv

Isospin competitions and valley polarized correlated insulators in twisted double bilayer graphene

A new phase of matter usually emerges when a given symmetry breaks spontaneously, which can involve charge, spin, and valley degrees of freedom. Here, we report an observation of new correlated insulators evolved from spin polarized states to valley polarized states in AB-BA stacked twisted double bilayer graphene (TDBG). The transition of the isospin polarization is a result of the competition between spin and valley, driven by the displacement field (D). At a high field |D| > 0.7 V/nm, we observe valley polarized correlated insulators with a large Zeeman g factor of ~10, both at v = 2 in the moiré conduction band and more surprisingly at v = -2 in the moiré valence band. At a medium field |D| < 0.6 V/nm, by contrast, it is a conventional spin polarized correlated insulator at v = 2 and a featureless metal at v = -2. Moreover, we observe a valley polarized Chern insulator with C = 2 emanating at v = 2 in the electron side and a valley polarized Fermi surface around v = -2 in the hole side. The valley Chern insulator with C = 2 is evident from a well quantized Hall conductance plateau at 2e^2/h and correspondingly a vanishing longitudinal component. The valley polarized Fermi surface is topologically trivial with C = 0, and it shows a series of quantized Landau levels with v_LL = 0, 1, 2, 3, 4 and others. These observations are in good agreement with our band and topology calculations. Our results demonstrate a feasible way to realize isospin control and to obtain new phases of matter in TDBG by the displacement field, and might benefit other twisted or non-twisted multilayer systems.

preprint2021arXiv

Spatially indirect intervalley excitons in bilayer WSe2

Spatially indirect excitons with displaced wavefunctions of electrons and holes play a pivotal role in a large portfolio of fascinating physical phenomena and emerging optoelectronic applications, such as valleytronics, exciton spin Hall effect, excitonic integrated circuit and high-temperature superfluidity. Here, we uncover three types of spatially indirect excitons (including their phonon replicas) and their quantum-confined Stark effects in hexagonal boron nitride encapsulated bilayer WSe2, by performing electric field-tunable photoluminescence measurements. Because of different out-of-plane electric dipole moments, the energy order between the three types of spatially indirect excitons can be switched by a vertical electric field. Remarkably, we demonstrate, assisted by first-principles calculations, that the observed spatially indirect excitons in bilayer WSe2 are also momentum-indirect, involving electrons and holes from Q and K/Γ valleys in the Brillouin zone, respectively. This is in contrast to the previously reported spatially indirect excitons with electrons and holes localized in the same valley. Furthermore, we find that the spatially indirect intervalley excitons in bilayer WSe2 can exhibit considerable, doping-sensitive circular polarization. The spatially indirect excitons with momentum-dark nature and highly tunable circular polarization open new avenues for exotic valley physics and technological innovations in photonics and optoelectronics.

preprint2020arXiv

Are We Ready for Service Robots? The OpenLORIS-Scene Datasets for Lifelong SLAM

Service robots should be able to operate autonomously in dynamic and daily changing environments over an extended period of time. While Simultaneous Localization And Mapping (SLAM) is one of the most fundamental problems for robotic autonomy, most existing SLAM works are evaluated with data sequences that are recorded in a short period of time. In real-world deployment, there can be out-of-sight scene changes caused by both natural factors and human activities. For example, in home scenarios, most objects may be movable, replaceable or deformable, and the visual features of the same place may be significantly different in some successive days. Such out-of-sight dynamics pose great challenges to the robustness of pose estimation, and hence a robot's long-term deployment and operation. To differentiate the aforementioned problem from the conventional works which are usually evaluated in a static setting in a single run, the term lifelong SLAM is used here to address SLAM problems in an ever-changing environment over a long period of time. To accelerate lifelong SLAM research, we release the OpenLORIS-Scene datasets. The data are collected in real-world indoor scenes, multiple times in each place to include scene changes in real life. We also design benchmarking metrics for lifelong SLAM, with which the robustness and accuracy of pose estimation are evaluated separately. The datasets and benchmark are available online at https://lifelong-robotic-vision.github.io/dataset/scene.

preprint2020arXiv

Collaborative Behavior Models for Optimized Human-Robot Teamwork

Effective human-robot collaboration requires informed anticipation. The robot must anticipate the human's actions, but also react quickly and intuitively when its predictions are wrong. The robot must plan its actions to account for the human's own plan, with the knowledge that the human's behavior will change based on what the robot actually does. This cyclical game of predicting a human's future actions and generating a corresponding motion plan is extremely difficult to model using standard techniques. In this work, we describe a novel Model Predictive Control (MPC)-based framework for finding optimal trajectories in a collaborative, multi-agent setting, in which we simultaneously plan for the robot while predicting the actions of its external collaborators. We use human-robot handovers to demonstrate that with a strong model of the collaborator, our framework produces fluid, reactive human-robot interactions in novel, cluttered environments. Our method efficiently generates coordinated trajectories, and achieves a high success rate in handover, even in the presence of significant sensor noise.

preprint2020arXiv

Direct Observation of Room-Temperature Dislocation Plasticity in Diamond

It is well known that diamond does not deform plastically at room temperature and usually fails in catastrophic brittle fracture. Here we demonstrate room-temperature dislocation plasticity in sub-micrometer sized diamond pillars by in-situ mechanical testing in the transmission electron microscope. We document in unprecedented detail the spatio-temporal features of the dislocations introduced by the confinement-free compression, including dislocation generation and propagation. Atom-resolved observations with tomographic reconstructions show unequivocally that mixed-type dislocations with Burgers vectors of 1/2<110> are activated in the non-close-packed {001} planes of diamond under uniaxial compression of <111> and <110> directions, respectively, while being activated in the {111} planes under the <100> directional loading, indicating orientation-dependent dislocation plasticity. These results provide new insights into the mechanical behavior of diamond and stimulate reconsideration of the basic deformation mechanism in diamond as well as in other brittle covalent crystals at low temperatures.

preprint2020arXiv

DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features

A robust and efficient Simultaneous Localization and Mapping (SLAM) system is essential for robot autonomy. For visual SLAM algorithms, though the theoretical framework has been well established for most aspects, feature extraction and association is still empirically designed in most cases, and can be vulnerable in complex environments. This paper shows that feature extraction with deep convolutional neural networks (CNNs) can be seamlessly incorporated into a modern SLAM framework. The proposed SLAM system utilizes a state-of-the-art CNN to detect keypoints in each image frame, and to give not only keypoint descriptors, but also a global descriptor of the whole image. These local and global features are then used by different SLAM modules, resulting in much more robustness against environmental changes and viewpoint changes compared with using hand-crafted features. We also train a visual vocabulary of local features with a Bag of Words (BoW) method. Based on the local features, global features, and the vocabulary, a highly reliable loop closure detection method is built. Experimental results show that all the proposed modules significantly outperform the baseline, and the full system achieves much lower trajectory errors and much higher correct rates on all evaluated data. Furthermore, by optimizing the CNN with Intel OpenVINO toolkit and utilizing the Fast BoW library, the system benefits greatly from the SIMD (single-instruction-multiple-data) techniques in modern CPUs. The full system can run in real-time without any GPU or other accelerators. The code is public at https://github.com/ivipsourcecode/dxslam.

preprint2020arXiv

High-order minibands and interband Landau level reconstruction in graphene moire superlattice

The propagation of Dirac fermions in graphene through a long-period periodic potential would result in a band folding together with the emergence of a series of cloned Dirac points (DPs). In highly aligned graphene/hexagonal boron nitride (G/hBN) heterostructures, the lattice mismatch between the two atomic crystals generates a unique kind of periodic structure known as a moiré superlattice. Of particular interest are the emergent phenomena related to the reconstructed band structure of graphene, such as the Hofstadter butterfly, topological currents, gate dependent pseudospin mixing, and ballistic miniband conduction. However, most studies so far have been limited to the lower-order minibands, e.g. the 1st and 2nd minibands counted from charge neutrality, and consequently the fundamental nature of the reconstructed higher-order miniband spectra still remains largely unknown. Here we report on probing the higher-order minibands of precisely aligned graphene moiré superlattices by transport spectroscopy. Using dual electrostatic gating, the edges of these high-order minibands, i.e. the 3rd and 4th minibands, can be reached. Interestingly, we have observed interband Landau level (LL) crossing inducing gap closures in a multiband magneto-transport regime, which originates from band overlap between the 2nd and 3rd minibands. The observed high-order minibands and LL reconstruction qualitatively match our simulated results. Our findings highlight the synergistic effect of minibands in transport, thus presenting a new opportunity for graphene electronic devices.

preprint2020arXiv

Human Grasp Classification for Reactive Human-to-Robot Handovers

Transfer of objects between humans and robots is a critical capability for collaborative robots. Although there has been a recent surge of interest in human-robot handovers, most prior research focuses on robot-to-human handovers. Further, work on the equally critical human-to-robot handovers often assumes humans can place the object in the robot's gripper. In this paper, we propose an approach for human-to-robot handovers in which the robot meets the human halfway, by classifying the human's grasp of the object and quickly planning a trajectory to take the object from the human's hand according to their intent. To do this, we collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses, and learn a deep model on this dataset to classify the hand grasps into one of these categories. We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position, and replans as necessary when the handover is interrupted. Through a systematic evaluation, we demonstrate that our system results in more fluent handovers versus two baselines. We also present findings from a user study (N = 9) demonstrating the effectiveness and usability of our approach with naive users in different scenarios. More results and videos can be found at http://wyang.me/handovers.

preprint2020arXiv

Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models

It has long been recognized that discrete features (n-gram features) and neural network based features have complementary strengths for language models (LMs). Improved performance can be obtained by model interpolation, which is, however, a suboptimal two-step integration of discrete and neural features. The trans-dimensional random field (TRF) framework has the potential advantage of being able to flexibly integrate a richer set of features. However, either discrete or neural features are used alone in previous TRF LMs. This paper develops a mixed-feature TRF LM and demonstrates its advantage in integrating discrete and neural features. Various LMs are trained over PTB and Google one-billion-word datasets, and evaluated in N-best list rescoring experiments for speech recognition. Among all single LMs (i.e. without model interpolation), the mixed-feature TRF LMs perform the best, improving over both discrete TRF LMs and neural TRF LMs alone, and also being significantly better than LSTM LMs. Compared to interpolating two separately trained models with discrete and neural features respectively, the performance of mixed-feature TRF LMs matches the best interpolated model, with a simplified one-step training process and reduced training time.

preprint2020arXiv

Massive Access for Future Wireless Communication Systems

Multiple access technology played an important role in wireless communication in the last decades: it increases the capacity of the channel and allows different users to access the system simultaneously. However, the conventional multiple access technology, as originally designed for current human-centric wireless networks, is not scalable for future machine-centric wireless networks. Massive access (studied in the literature under such names as massive-device multiple access, unsourced massive random access, massive connectivity, massive machine-type communication, and many-access channels) exhibits a clean break with current networks by potentially supporting millions of devices in each cellular network. The tremendous growth in the number of connected devices requires a fundamental rethinking of the conventional multiple access technologies in favor of new schemes suited for massive random access. Among the many new challenges arising in this setting, the most relevant are: the fundamental limits of communication from a massive number of bursty devices transmitting simultaneously with short packets, the design of low complexity and energy-efficient massive access coding and communication schemes, efficient methods for the detection of a relatively small number of active users among a large number of potential user devices with sporadic transmission pattern, and the integration of massive access with massive MIMO and other important wireless communication technologies. This paper presents an overview of the concept of massive access wireless communication and of the contemporary research on this important topic.

preprint2020arXiv

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

Multiple-object tracking and segmentation (MOTS) is a novel computer vision task that aims to jointly perform multiple object tracking (MOT) and instance segmentation. In this work, we present PointTrack++, an effective online framework for MOTS, which remarkably extends our recently proposed PointTrack framework. To begin with, PointTrack adopts an efficient one-stage framework for instance segmentation, and learns instance embeddings by converting compact image representations to an unordered 2D point cloud. Compared with PointTrack, our proposed PointTrack++ offers three major improvements. Firstly, in the instance segmentation stage, we adopt a semantic segmentation decoder trained with focal loss to improve the instance selection quality. Secondly, to further boost the segmentation performance, we propose a data augmentation strategy that copies and pastes instances into training images. Finally, we introduce a better training strategy in the instance association stage to improve the distinguishability of learned instance embeddings. The resulting framework achieves state-of-the-art performance on the 5th BMTT MOTChallenge.

preprint2020arXiv

Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents

Techniques for automatically extracting important content elements from business documents such as contracts, statements, and filings have the potential to make business operations more efficient. This problem can be formulated as a sequence labeling task, and we demonstrate the adaptation of BERT to two types of business documents: regulatory filings and property lease agreements. There are aspects of this problem that make it easier than "standard" information extraction tasks and other aspects that make it more difficult, but on balance we find that modest amounts of annotated data (less than 100 documents) are sufficient to achieve reasonable accuracy. We integrate our models into an end-to-end cloud platform that provides both an easy-to-use annotation interface and an inference interface that allows users to upload documents and inspect model outputs.

preprint2020arXiv

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Current multi-object tracking and segmentation (MOTS) methods follow the tracking-by-detection paradigm and adopt convolutions for feature extraction. However, as affected by the inherent receptive field, convolution based feature extraction inevitably mixes up the foreground features and the background features, resulting in ambiguities in the subsequent instance association. In this paper, we propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to an unordered 2D point cloud representation. Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images. Furthermore, multiple informative data modalities are converted into point-wise representations to enrich point-wise features. The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods including 3D tracking methods by large margins (5.4% higher MOTSA and 18 times faster over MOTSFusion) at near real-time speed (22 FPS). Evaluations across three datasets demonstrate both the effectiveness and efficiency of our method. Moreover, based on the observation that current MOTS datasets lack crowded scenes, we build a more challenging MOTS dataset named APOLLO MOTS with higher instance density. Both APOLLO MOTS and our code are publicly available at https://github.com/detectRecog/PointTrack.

preprint2020arXiv

Towards Playing Full MOBA Games with Deep Reinforcement Learning

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges to AI systems such as multi-agent, enormous state-action space, complex action control, etc. Developing AI for playing MOBA games has attracted much attention accordingly. However, existing work falls short in handling the raw game complexity caused by the explosion of agent combinations, i.e., lineups, when expanding the hero pool; for example, OpenAI's Dota AI limits play to a pool of only 17 heroes. As a result, full MOBA games without restrictions are far from being mastered by any existing AI system. In this paper, we propose a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning. Specifically, we develop a combination of novel and existing learning techniques, including curriculum self-play learning, policy distillation, off-policy adaption, multi-head value estimation, and Monte-Carlo tree-search, in training and playing a large pool of heroes, meanwhile addressing the scalability issue skillfully. Tested on Honor of Kings, a popular MOBA game, we show how to build superhuman AI agents that can defeat top esports players. The superiority of our AI is demonstrated by the first large-scale performance test of MOBA AI agent in the literature.

preprint2020arXiv

TREVERSE: Trial-and-Error Lightweight Secure Reverse Authentication with Simulatable PUFs

A physical unclonable function (PUF) generates hardware intrinsic volatile secrets by exploiting uncontrollable manufacturing randomness. Although PUFs provide the potential for lightweight and secure authentication for increasing numbers of low-end Internet of Things devices, practical and secure mechanisms remain elusive. We aim to explore simulatable PUFs (SimPUFs) that are physically unclonable but efficiently modeled mathematically through privileged one-time PUF access to address the above problem. Given a challenge, a securely stored SimPUF in possession of a trusted server computes the corresponding response and its bit-specific reliability. Consequently, naturally noisy PUF responses generated by a resource limited prover can be immediately processed by a one-way function (OWF) and transmitted to the server, because the resourceful server can exploit the SimPUF to perform a trial-and-error search over likely error patterns to recover the noisy response to authenticate the prover. Security of trial-and-error reverse (TREVERSE) authentication under the random oracle model is guaranteed by the hardness of inverting the OWF. We formally evaluate the TREVERSE authentication capability with two SimPUFs experimentally derived from popular silicon PUFs.

preprint2020arXiv

Unsupervised Deformable Medical Image Registration via Pyramidal Residual Deformation Fields Estimation

Deformation field estimation is an important and challenging issue in many medical image registration applications. In recent years, deep learning has become a promising approach for simplifying registration problems, and has been gradually applied to medical image registration. However, most existing deep learning registration methods do not consider that, when the receptive field cannot cover the corresponding features in the moving image and the fixed image, the network cannot output accurate displacement values. In fact, due to the limitation of the receptive field, the 3 x 3 kernel has difficulty in covering the corresponding features at high/original resolution. Multi-resolution and multi-convolution techniques can mitigate but not avoid this problem. In this study, we constructed pyramidal feature sets on moving and fixed images and used the warped moving and fixed features to estimate their "residual" deformation field at each scale, called the Pyramidal Residual Deformation Field Estimation module (PRDFE-Module). The "total" deformation field at each scale was computed by upsampling and weighted summing all the "residual" deformation fields at all its previous scales, which can effectively and accurately transfer the deformation fields from low resolution to high resolution and is used for warping the moving features at each scale. Simulation and real brain data results show that our method improves the accuracy of the registration and the rationality of the deformation field.

preprint2020arXiv

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

3D object detection is an essential task in autonomous driving and robotics. Though great progress has been made, challenges remain in estimating 3D pose for distant and occluded objects. In this paper, we present a novel framework named ZoomNet for stereo imagery-based 3D detection. The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes. To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming, which simultaneously resizes 2D instance bounding boxes to a unified resolution and adjusts the camera intrinsic parameters accordingly. In this way, we are able to estimate higher-quality disparity maps from the resized box images and then construct dense point clouds for both nearby and distant objects. Moreover, we propose learning part locations as complementary features to improve the resistance against occlusion and put forward the 3D fitting score to better estimate the 3D detection quality. Extensive experiments on the popular KITTI 3D detection dataset indicate ZoomNet surpasses all previous state-of-the-art methods by large margins (improved by 9.4% on APbv (IoU=0.7) over pseudo-LiDAR). Ablation study also demonstrates that our adaptive zooming strategy brings an improvement of over 10% on AP3d (IoU=0.7). In addition, since the official KITTI benchmark lacks fine-grained annotations like pixel-wise part locations, we also present our KFG dataset by augmenting KITTI with detailed instance-wise annotations including pixel-wise part location, pixel-wise disparity, etc. Both the KFG dataset and our code will be publicly available at https://github.com/detectRecog/ZoomNet.
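
The adaptive zooming step described in the abstract, resizing a box crop to a unified resolution and adjusting the camera intrinsics to match, can be illustrated with the standard pinhole relation for a crop followed by a resize. This is a minimal sketch and not ZoomNet's actual code: the `Intrinsics` class, the `zoom_intrinsics` function, and the box and output-size conventions are all hypothetical names chosen for illustration, and half-pixel centre offsets are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical container)."""
    fx: float
    fy: float
    cx: float
    cy: float


def zoom_intrinsics(K, box, out_size):
    """Intrinsics of the image obtained by cropping `box` and resizing it.

    box is (x0, y0, x1, y1) in the original image; out_size is (width, height)
    of the resized crop. A pixel u in the original maps to (u - x0) * sx in the
    crop, so focal lengths scale by the zoom factor and the principal point is
    shifted by the crop origin and then scaled.
    """
    x0, y0, x1, y1 = box
    out_w, out_h = out_size
    if x1 <= x0 or y1 <= y0:
        raise ValueError("box must have positive width and height")
    sx = out_w / (x1 - x0)
    sy = out_h / (y1 - y0)
    return Intrinsics(
        fx=K.fx * sx,
        fy=K.fy * sy,
        cx=(K.cx - x0) * sx,
        cy=(K.cy - y0) * sy,
    )


if __name__ == "__main__":
    K = Intrinsics(fx=700.0, fy=700.0, cx=600.0, cy=180.0)
    # A 100 x 50 box zoomed to 200 x 100 doubles the effective focal length.
    print(zoom_intrinsics(K, box=(500.0, 100.0, 600.0, 150.0), out_size=(200, 100)))
```

Under this relation, a distant object whose box is small gets a large zoom factor, which is one way to read the abstract's claim that resized boxes yield higher-quality disparity for distant objects.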

preprint2019arXiv

Universal transfer and stacking technique of van der Waals heterostructures for spintronics

The key to achieving high-quality van der Waals heterostructure devices made from various two-dimensional (2D) materials lies in the control over clean and flexible interfaces. However, existing transfer methods based on different mediators possess insufficiencies including the presence of residues, the unavailability of flexible interface engineering, and the selectivity towards materials and substrates since their adhesions differ considerably with the various preparation conditions, from chemical vapor deposition (CVD) growth to mechanical exfoliation. In this paper, we introduce a more universal method using a prefabricated polyvinyl alcohol (PVA) film to transfer and stack 2D materials, whether they are prepared by CVD or exfoliation. This peel-off and drop-off technique promises an ideal interface of the materials without introducing contamination. In addition, the method exhibits a micron-scale spatial transfer accuracy and meets special experimental conditions such as the preparation of twisted graphene and the 2D/metal heterostructure construction. We illustrate the superiority of this method with a WSe2 vertical spin valve device, whose performance verifies the applicability and advantages of such a method for spintronics. Our PVA-assisted transfer process will promote the development of high-performance 2D-material-based devices.

preprint2018arXiv

A Roadmap for HEP Software and Computing R&D for the 2020s

Particle physics has an ambitious and broad experimental programme for the coming decades. This programme requires large investments in detector hardware, either to build new facilities and experiments, or to upgrade existing ones. Similarly, it requires commensurate investment in the R&D of software to acquire, manage, process, and analyse the sheer amounts of data to be recorded. In planning for the HL-LHC in particular, it is critical that all of the collaborating stakeholders agree on the software goals and priorities, and that the efforts complement each other. In this spirit, this white paper describes the R&D activities required to prepare for this software upgrade.

preprint2016arXiv

$D \rightarrow a_1, f_1$ transition form factors and semileptonic decays via 3-point QCD sum rules

By using the 3-point QCD sum rules, we calculate the transition form factors of $D$ decays into the spin triplet axial vector mesons $a_1(1260)$, $f_1(1285)$, $f_1(1420)$. In the calculations, we consider the quark contents of each meson in detail. In view of the fact that the isospin of $a_1(1260)$ is one, we calculate the $D^+ \rightarrow a_1^0 (1260)$ and $D^0 \rightarrow a_1^- (1260)$ transition form factors separately. In the case of $f_1(1285)$ and $f_1(1420)$, the mixing between light flavor $SU(3)$ singlet and octet is taken into account. Based on the form factors obtained here, we give predictions for the branching ratios of relevant semileptonic decays, which can be tested in future experiments.

preprint2016arXiv

A Beta-Beta Achievability Bound with Applications

A channel coding achievability bound expressed in terms of the ratio between two Neyman-Pearson $β$ functions is proposed. This bound is the dual of a converse bound established earlier by Polyanskiy and Verdú (2014). The new bound turns out to simplify considerably the analysis in situations where the channel output distribution is not a product distribution, for example due to a cost constraint or a structural constraint (such as orthogonality or constant composition) on the channel inputs. Connections to existing bounds in the literature are discussed. The bound is then used to derive 1) an achievability bound on the channel dispersion of additive non-Gaussian noise channels with random Gaussian codebooks, 2) the channel dispersion of the exponential-noise channel, 3) a second-order expansion for the minimum energy per bit of an AWGN channel, and 4) a lower bound on the maximum coding rate of a multiple-input multiple-output Rayleigh-fading channel with perfect channel state information at the receiver, which is the tightest known achievability result.

preprint2016arXiv

Dislocation Activities at the Martensite Phase Transformation Interface in Metastable Austenitic Stainless Steel: An In-situ TEM Study

Understanding the mechanism of martensitic transformation is of great importance in developing advanced high strength steels, especially TRansformation-Induced Plasticity (TRIP) steels. The TRIP effect leads to enhanced work-hardening rate, postponed onset of necking and excellent formability. In-situ transmission electron microscopy has been performed to systematically investigate the dynamic interactions between dislocations and alpha martensite at microscale. Local stress concentrations, e.g. from notches or dislocation pile-ups, render free edges and grain boundaries favorable nucleation sites for alpha martensite. Its growth leads to partial dislocation emission on two independent slip planes from the hetero-interface when the austenite matrix is initially free of dislocations. The kinematic analysis reveals that activating slip systems on two independent {111} planes of austenite is necessary in accommodating the interfacial mismatch strain. Full dislocation emission is generally observed inside austenite regions that contain a high density of dislocations. In both situations, phase boundary propagation generates large amounts of dislocations entering into the matrix, which renders the total deformation compatible and provides substantial strain hardening of the host phase. These moving dislocation sources enable plastic relaxation and prevent local damage accumulation by intense slipping on the softer side of the interfacial region. Thus, finely dispersed martensite distribution renders plastic deformation more uniform throughout the austenitic matrix, which explains the exceptional combination of strength and ductility of TRIP steels.

preprint2016arXiv

Finite-Blocklength Bounds for Wiretap Channels

This paper investigates the maximal secrecy rate over a wiretap channel subject to reliability and secrecy constraints at a given blocklength. New achievability and converse bounds are derived, which are shown to be tighter than existing bounds. The bounds also lead to the tightest second-order coding rate for discrete memoryless and Gaussian wiretap channels.

preprint2016arXiv

Minimum Energy to Send $k$ Bits Over Multiple-Antenna Fading Channels

This paper investigates the minimum energy required to transmit $k$ information bits with a given reliability over a multiple-antenna Rayleigh block-fading channel, with and without channel state information (CSI) at the receiver. No feedback is assumed. It is well known that the ratio between the minimum energy per bit and the noise level converges to $-1.59$ dB as $k$ goes to infinity, regardless of whether CSI is available at the receiver or not. This paper shows that lack of CSI at the receiver causes a slowdown in the speed of convergence to $-1.59$ dB as $k\to\infty$ compared to the case of perfect receiver CSI. Specifically, we show that, in the no-CSI case, the gap to $-1.59$ dB is proportional to $((\log k) /k)^{1/3}$, whereas when perfect CSI is available at the receiver, this gap is proportional to $1/\sqrt{k}$. In both cases, the gap to $-1.59$ dB is independent of the number of transmit antennas and of the channel's coherence time. Numerically, we observe that, when the receiver is equipped with a single antenna, to achieve an energy per bit of $ - 1.5$ dB in the no-CSI case, one needs to transmit at least $7\times 10^7$ information bits, whereas $6\times 10^4$ bits suffice for the case of perfect CSI at the receiver.
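The two convergence rates quoted in this abstract can be compared numerically. The following is a minimal sketch only: the proportionality constants are not given in the abstract, so the parameter `c` (set to 1 here) is a placeholder assumption, and the function names are hypothetical. It shows only how the two scalings differ as $k$ grows, not the paper's actual energy values.

```python
import math


def gap_no_csi(k, c=1.0):
    """Scaling of the gap without receiver CSI: c * ((log k) / k)^(1/3).

    The constant c is an assumption; the abstract states proportionality only.
    """
    return c * (math.log(k) / k) ** (1.0 / 3.0)


def gap_perfect_csi(k, c=1.0):
    """Scaling of the gap with perfect receiver CSI: c / sqrt(k).

    The constant c is an assumption; the abstract states proportionality only.
    """
    return c / math.sqrt(k)


if __name__ == "__main__":
    for k in (10**4, 10**6, 10**8):
        ratio = gap_no_csi(k) / gap_perfect_csi(k)
        print(f"k={k:>9}: no-CSI {gap_no_csi(k):.5f}, "
              f"CSI {gap_perfect_csi(k):.5f}, ratio {ratio:.1f}")
```

With equal constants, the no-CSI scaling decays more slowly, which is consistent with the abstract's observation that far more information bits are needed without CSI to reach the same energy per bit.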

preprint2016arXiv

Mutual Information Optimally Local Private Discrete Distribution Estimation

Considering statistical learning (e.g. discrete distribution estimation) with local $ε$-differential privacy, which preserves each data provider's privacy locally, we aim to optimize statistical data utility under the privacy constraints. Specifically, we study maximizing the mutual information between a provider's data and its private view, and give the exact mutual information bound along with a mechanism that attains it: the $k$-subset mechanism. The mutual information optimal mechanism randomly outputs a size $k$ subset of the original data domain with delicate probability assignment, where $k$ varies with the privacy level $ε$ and the data domain size $d$. After analysing the limitations of existing locally private mechanisms from a mutual information perspective, we propose an efficient implementation of the $k$-subset mechanism for discrete distribution estimation, and show its optimality guarantees over existing approaches.

preprint2016arXiv

New Insights on Stacking Fault Behavior in Twin Induced Plasticity from Meta-Atom Molecular Dynamics Simulations

There is growing interest in promoting deformation twinning for plasticity in advanced materials, as highly organized twin boundaries are beneficial to a better strength-ductility combination in contrast to disordered grain boundaries. Twinning deformation typically involves the kinetics of stacking faults, their interaction with dislocations, and dislocation-twin boundary interactions. While the latter have been intensively investigated, the dynamics of stacking faults are less well understood. In this work, we report several new insights on the stacking fault behavior in twin induced plasticity from our meta-atom molecular dynamics simulation: the stacking fault interactions are dominated by dislocation reactions taking place spontaneously, different from the mechanism proposed in the literature; the competition among generating a single stacking fault, a twinning partial and a trailing partial dislocation is dependent on a unique parameter, i.e. stacking fault energy, which in turn determines deformation twinning behaviors. The complex twin-slip and twin-dislocation interactions demonstrate the dual role of deformation twins as both dislocation barrier and storage, potentially contributing to the high strength and ductility of advanced materials like TWIP steels, where deformation-twinning-dominated plasticity accounts for the superb strength-ductility combination.

preprint2016arXiv

Nonasymptotic coding-rate bounds for binary erasure channels with feedback

We present nonasymptotic achievability and converse bounds on the maximum coding rate (for a fixed average error probability and a fixed average blocklength) of variable-length full-feedback (VLF) and variable-length stop-feedback (VLSF) codes operating over a binary erasure channel (BEC). For the VLF setup, the achievability bound relies on a scheme that maps each message onto a variable-length Huffman codeword and then repeats each bit of the codeword until it is received correctly. The converse bound is inspired by the meta-converse framework by Polyanskiy, Poor, and Verdú (2010) and relies on binary sequential hypothesis testing. For the case of zero error probability, our achievability and converse bounds match. For the VLSF case, we provide achievability bounds that exploit the following feature of the BEC: the decoder can assess the correctness of its estimate by verifying whether the chosen codeword is the only one that is compatible with the erasure pattern. One of these bounds is obtained by analyzing the performance of a variable-length extension of random linear fountain codes. The gap between the VLSF achievability and the VLF converse bound, when the number of messages is small, is significant: $23\%$ for 8 messages on a BEC with erasure probability $0.5$. The absence of a tight VLSF converse bound does not allow us to assess whether this gap is fundamental.
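The repetition step of the VLF scheme described in this abstract (resend each codeword bit until it is not erased) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's construction or analysis: the function names are hypothetical, the Huffman mapping is replaced by a fixed codeword length (for 8 equiprobable messages a Huffman code assigns 3 bits to each), and the code only estimates the average number of channel uses.

```python
import random


def channel_uses_with_repetition(num_bits, erasure_prob, rng):
    """Count BEC uses when each bit is resent until it arrives unerased.

    Full feedback is assumed, so the sender knows after every channel use
    whether the bit was erased.
    """
    uses = 0
    for _ in range(num_bits):
        while True:
            uses += 1
            # The bit is erased with probability erasure_prob.
            if rng.random() >= erasure_prob:
                break
    return uses


def average_channel_uses(num_bits, erasure_prob, trials=20000, seed=0):
    """Monte Carlo estimate of the mean number of channel uses per codeword."""
    rng = random.Random(seed)
    total = sum(
        channel_uses_with_repetition(num_bits, erasure_prob, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    # 3-bit codewords (8 equiprobable messages), erasure probability 0.5.
    print(average_channel_uses(num_bits=3, erasure_prob=0.5))
```

Each bit needs a geometrically distributed number of uses with mean $1/(1-δ)$ for erasure probability $δ$, so the estimate for 3 bits at $δ = 0.5$ should be close to 6 channel uses.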

preprint2015arXiv

Broadcasting a Common Message with Variable-Length Stop-Feedback Codes

We investigate the maximum coding rate achievable over a two-user broadcast channel for the scenario where a common message is transmitted using variable-length stop-feedback codes. Specifically, upon decoding the common message, each decoder sends a stop signal to the encoder, which transmits continuously until it receives both stop signals. For the point-to-point case, Polyanskiy, Poor, and Verdú (2011) recently demonstrated that variable-length coding combined with stop feedback significantly increases the speed at which the maximum coding rate converges to capacity. This speed-up manifests itself in the absence of a square-root penalty in the asymptotic expansion of the maximum coding rate for large blocklengths, a result a.k.a. zero dispersion. In this paper, we show that this speed-up does not necessarily occur for the broadcast channel with common message. Specifically, there exist scenarios for which variable-length stop-feedback codes yield a positive dispersion.

preprint2015arXiv

Clothing Co-Parsing by Joint Image Segmentation and Labeling

This paper aims at developing an integrated system of clothing co-parsing, in order to jointly parse a set of clothing images (unsegmented but annotated with tags) into semantic configurations. We propose a data-driven framework consisting of two phases of inference. The first phase, referred to as "image co-segmentation", iterates to extract consistent regions on images and jointly refines the regions over all images by employing the exemplar-SVM (E-SVM) technique [23]. In the second phase (i.e. "region co-labeling"), we construct a multi-image graphical model by taking the segmented regions as vertices, and incorporate several contexts of clothing configuration (e.g., item location and mutual interactions). The joint label assignment can be solved using the efficient Graph Cuts algorithm. In addition to evaluating our framework on the Fashionista dataset [30], we construct a dataset called CCP consisting of 2098 high-resolution street fashion photos to demonstrate the performance of our system. We achieve 90.29% / 88.23% segmentation accuracy and 65.52% / 63.89% recognition rate on the Fashionista and the CCP datasets, respectively, which are superior to state-of-the-art methods.

preprint2015arXiv

Data-Driven Scene Understanding with Adaptively Retrieved Exemplars

This article investigates a data-driven approach for semantic scene understanding, without pixelwise annotation and classifier training. Our framework parses a target image in two steps: (i) retrieving its exemplars (i.e. references) from an image database, where all images are unsegmented but annotated with tags; (ii) recovering its pixel labels by propagating semantics from the references. We present a novel framework making the two steps mutually conditional and bootstrapped under the probabilistic Expectation-Maximization (EM) formulation. In the first step, the references are selected by jointly matching their appearances with the target as well as the semantics (i.e. the assigned labels of the target and the references). We process the second step via a combinatorial graphical representation, in which the vertices are superpixels extracted from the target and its selected references. Then we derive the potentials of assigning labels to one vertex of the target, which depend upon the graph edges that connect the vertex to its spatial neighbors of the target and to its similar vertices of the references. Besides, the proposed framework can be naturally applied to perform image annotation on new test images. In the experiments, we validate our approach on two public databases, and demonstrate superior performance over the state-of-the-art methods in both semantic segmentation and image annotation tasks.

preprint2015arXiv

Discriminatively Trained And-Or Graph Models for Object Shape Detection

In this paper, we investigate a novel reconfigurable part-based model, namely the And-Or graph model, to recognize object shapes in images. Our proposed model consists of four layers: leaf-nodes at the bottom are local classifiers for detecting contour fragments; or-nodes above the leaf-nodes function as the switches to activate their child leaf-nodes, making the model reconfigurable during inference; and-nodes in a higher layer capture holistic shape deformations; one root-node on the top, which is also an or-node, activates one of its child and-nodes to deal with large global variations (e.g. different poses and views). We propose a novel structural optimization algorithm to discriminatively train the And-Or model from weakly annotated data. This algorithm iteratively determines the model structures (e.g. the nodes and their layouts) along with the parameter learning. On several challenging datasets, our model performs robust shape-based object detection against background clutter and outperforms the other state-of-the-art approaches. We also release a new shape database with annotations, which includes more than 1500 challenging shape instances, for recognition and detection.

preprint2015arXiv

Fair Packet Scheduling in Network on Chip

Interconnection networks of parallel systems are used for servicing traffic generated by different applications, often belonging to different users. When multiple traffic flows contend for channel bandwidth, the scheduling algorithm regulating the access to that channel plays a key role in ensuring that each flow obtains the required quality of service. Fairness is a highly desirable property for a scheduling algorithm. We show that using the Relative Fairness Bound as a fairness measure may lead to a decrease in throughput and an increase in latency. We propose an alternative metric to evaluate the fairness and avoid the drawback of Relative Fairness Bound.

preprint2015arXiv

Finite-SNR Bounds on the Sum-Rate Capacity of Rayleigh Block-Fading Multiple-Access Channels with no a Priori CSI

We provide nonasymptotic upper and lower bounds on the sum-rate capacity of Rayleigh block-fading multiple-access channels for the setup where a priori channel state information is not available. The upper bound relies on a dual formula for channel capacity and on the assumption that the users can cooperate perfectly. The lower bound is derived assuming a noncooperative scenario, where each user employs unitary space-time modulation (independently from the other users). Numerical results show that the gap between the upper and the lower bound is small already at moderate SNR values. This suggests that the sum-rate capacity gains obtainable through user cooperation are minimal.

preprint2015arXiv

Inspection games in a mean field setting

In this paper, we present a new development of inspection games in a mean field setting. In our dynamic version of an inspection game, there is one inspector and a large number N of interacting inspectees with a finite state space. By applying the mean field game methodology, we present a solution as an epsilon-equilibrium to this type of inspection game, where epsilon goes to 0 as N tends to infinity. In order to facilitate numerical analysis of this new type of inspection game, we conduct an approximation analysis, that is, we approximate the optimal Lipschitz continuous switching strategies by smooth switching strategies. We show that any approximating smooth switching strategy is also an epsilon-equilibrium solution to the inspection game with a large and finite number N of inspectees, with epsilon being of order 1/N.

preprint2015arXiv

Learning Contour-Fragment-based Shape Model with And-Or Tree Representation

This paper proposes a simple yet effective method to learn the hierarchical object shape model consisting of local contour fragments, which represents a category of shapes in the form of an And-Or tree. This model extends the traditional hierarchical tree structures by introducing the "switch" variables (i.e. the or-nodes) that explicitly specify production rules to capture shape variations. We thus define the model with three layers: the leaf-nodes for detecting local contour fragments, the or-nodes specifying selection of leaf-nodes, and the root-node encoding the holistic distortion. In the training stage, for optimization of the And-Or tree learning, we extend the concave-convex procedure (CCCP) by embedding the structural clustering during the iterative learning steps. The inference of shape detection is consistent with the model optimization, which integrates the local tests via the leaf-nodes and or-nodes with the global verification via the root-node. The advantages of our approach are validated on the challenging shape databases (i.e., ETHZ and INRIA Horse) and summarized as follows. (1) The proposed method is able to accurately localize shape contours against unreliable edge detection and edge tracing. (2) The And-Or tree model enables us to capture the intraclass variance well.

preprint2015arXiv

Optimum Power Control at Finite Blocklength

This paper investigates the maximal channel coding rate achievable at a given blocklength $n$ and error probability $ε$, when the codewords are subject to a long-term (i.e., averaged-over-all-codeword) power constraint. The second-order term in the large-$n$ expansion of the maximal channel coding rate is characterized both for additive white Gaussian noise (AWGN) channels and for quasi-static fading channels with perfect channel state information available at both the transmitter and the receiver. It is shown that in both cases the second-order term is proportional to $\sqrt{n^{-1}\ln n}$. For the quasi-static fading case, this second-order term is achieved by truncated channel inversion, namely, by concatenating a dispersion-optimal code for an AWGN channel subject to a short-term power constraint, with a power controller that inverts the channel whenever the fading gain is above a certain threshold. Easy-to-evaluate approximations of the maximal channel coding rate are developed for both the AWGN and the quasi-static fading case.

preprint2015arXiv

Resolving Scale Ambiguity Via XSlit Aspect Ratio Analysis

In perspective cameras, images of a frontal-parallel 3D object preserve its aspect ratio invariant to its depth. Such an invariance is useful in photography but is unique to perspective projection. In this paper, we show that alternative non-perspective cameras such as the crossed-slit or XSlit cameras exhibit a different depth-dependent aspect ratio (DDAR) property that can be used for 3D recovery. We first conduct a comprehensive analysis to characterize DDAR, infer object depth from its AR, and model recoverable depth range, sensitivity, and error. We show that repeated shape patterns in real Manhattan World scenes can be used for 3D reconstruction using a single XSlit image. We also extend our analysis to model slopes of lines. Specifically, parallel 3D lines exhibit depth-dependent slopes (DDS) on their images which can also be used to infer their depths. We validate our analyses using real XSlit cameras, XSlit panoramas, and catadioptric mirrors. Experiments show that DDAR and DDS provide important depth cues and enable effective single-image scene reconstruction.

preprint2015arXiv

Sensitivity analysis for HJB equations with an application to coupled backward-forward systems

In this paper, we analyse Lipschitz continuous dependence of the solution to Hamilton-Jacobi-Bellman equations on a functional parameter. This sensitivity analysis is not only of interest in its own right, but is also important for the mean field games methodology, namely for solving a coupled system of backward-forward equations. We show that the unique solution to a Hamilton-Jacobi-Bellman equation and its spatial gradient are Lipschitz continuous uniformly with respect to the functional parameter. In particular, we provide verifiable criteria for the so-called feedback regularity condition. Finally, as an application, we show how the sensitivity results are used to solve the coupled system of backward-forward equations.

preprint2015arXiv

Short-Packet Communications over Multiple-Antenna Rayleigh-Fading Channels

Motivated by the current interest in ultra-reliable, low-latency, machine-type communication systems, we investigate the tradeoff between reliability, throughput, and latency in the transmission of information over multiple-antenna Rayleigh block-fading channels. Specifically, we obtain finite-blocklength, finite-SNR upper and lower bounds on the maximum coding rate achievable over such channels for a given constraint on the packet error probability. Numerical evidence suggests that our bounds delimit tightly the maximum coding rate already for short blocklengths (packets of about 100 symbols). Furthermore, our bounds reveal the existence of a tradeoff between the rate gain obtainable by spreading each codeword over all available time-frequency-spatial degrees of freedom, and the rate loss caused by the need of estimating the fading coefficients over these degrees of freedom. In particular, our bounds allow us to determine the optimal number of transmit antennas and the optimal number of time-frequency diversity branches that maximize the rate. Finally, we show that infinite-blocklength performance metrics such as the ergodic capacity and the outage capacity yield inaccurate throughput estimates.

preprint2014arXiv

Diversity versus Multiplexing at Finite Blocklength

A finite blocklength analysis of the diversity-multiplexing tradeoff is presented, based on nonasymptotic bounds on the maximum channel coding rate of multiple-antenna block-memoryless Rayleigh-fading channels. The bounds in this paper allow one to numerically assess for which packet size, number of antennas, and degree of channel selectivity, diversity-exploiting schemes are close to optimal, and when instead the available spatial degrees of freedom should be used to provide spatial multiplexing. This finite blocklength view on the diversity-multiplexing tradeoff provides insights on the design of delay-sensitive ultra-reliable communication links.

preprint2014arXiv

Gate-dependent Pseudospin Mixing in Graphene/Boron Nitride Moire Superlattices

Electrons in graphene are described by relativistic Dirac-Weyl spinors with a two-component pseudospin [1-12]. The unique pseudospin structure of Dirac electrons leads to emerging phenomena such as the massless Dirac cone [2], anomalous quantum Hall effect [2, 3], and Klein tunneling [4, 5] in graphene. The capability to manipulate electron pseudospin is highly desirable for novel graphene electronics, and it requires precise control to differentiate the two graphene sub-lattices at atomic level. Graphene/boron nitride (graphene/BN) Moire superlattice, where a fast sub-lattice oscillation due to B-N atoms is superimposed on the slow Moire period, provides an attractive approach to engineer the electron pseudospin in graphene [13-18]. This unusual Moire superlattice leads to a spinor potential with unusual hybridization of electron pseudospins, which can be probed directly through infrared spectroscopy because optical transitions are very sensitive to excited state wavefunctions. Here, we perform micro-infrared spectroscopy on graphene/BN heterostructure and demonstrate that the Moire superlattice potential is dominated by a pseudospin-mixing component analogous to a spatially varying pseudomagnetic field. In addition, we show that the spinor potential depends sensitively on the gate-induced carrier concentration in graphene, indicating a strong renormalization of the spinor potential from electron-electron interactions. Our study offers deeper understanding of graphene pseudospin structure under spinor Moire potential, as well as exciting opportunities to control pseudospin in two-dimensional heterostructures for novel electronic and photonic nanodevices.

preprint2014arXiv

Observation of an intrinsic bandgap and Landau level renormalization in graphene/boron-nitride heterostructures

Van der Waals heterostructures formed by assembling different two-dimensional atomic crystals into stacks can lead to many new phenomena and device functionalities. In particular, graphene/boron-nitride heterostructures have emerged as a very promising system for band engineering of graphene. However, the intrinsic value and origin of the bandgap in such heterostructures remain unresolved. Here we report the observation of an intrinsic bandgap in epitaxial graphene/boron-nitride heterostructures with zero crystallographic alignment angle. Magneto-optical spectroscopy provides a direct probe of the Landau level transitions in this system and reveals a bandgap of ~ 38 meV (440 K). Moreover, the Landau level transitions are characterized by effective Fermi velocities with a critical dependence on specific transitions and magnetic field. These findings highlight the important role of many body interactions in determining the fundamental properties of graphene heterostructures.

preprint2014arXiv

Quasi-Static Multiple-Antenna Fading Channels at Finite Blocklength

This paper investigates the maximal achievable rate for a given blocklength and error probability over quasi-static multiple-input multiple-output (MIMO) fading channels, with and without channel state information (CSI) at the transmitter and/or the receiver. The principal finding is that outage capacity, despite being an asymptotic quantity, is a sharp proxy for the finite-blocklength fundamental limits of slow-fading channels. Specifically, the channel dispersion is shown to be zero regardless of whether the fading realizations are available at both transmitter and receiver, at only one of them, or at neither of them. These results follow from analytically tractable converse and achievability bounds. Numerical evaluation of these bounds verifies that zero dispersion may indeed imply fast convergence to the outage capacity as the blocklength increases. In the example of a particular $1 \times 2$ single-input multiple-output (SIMO) Rician fading channel, the blocklength required to achieve $90\%$ of capacity is about an order of magnitude smaller compared to the blocklength required for an AWGN channel with the same capacity. For this specific scenario, the coding/decoding schemes adopted in the LTE-Advanced standard are benchmarked against the finite-blocklength achievability and converse bounds.

preprint2013arXiv

A Galerkin approximation scheme for the mean correction in a mean-reversion stochastic differential equation

This paper is concerned with the following Markovian stochastic differential equation of mean-reversion type \[ dR_t= (θ+σα(R_t, t))R_t dt +σR_t dB_t \] with an initial value $R_0=r_0\in\mathbb{R}$, where $θ\in\mathbb{R}$ and $σ>0$ are constants, and the mean correction function $α:\mathbb{R}\times[0,\infty)\ni(x,t)\mapsto α(x,t)\in\mathbb{R}$ is twice continuously differentiable in $x$ and continuously differentiable in $t$. We first derive that under the assumption of path independence of the density process of Girsanov transformation for the above stochastic differential equation, the mean correction function $α$ satisfies a non-linear partial differential equation which is known as the viscous Burgers equation. We then develop a Galerkin type approximation scheme for the function $α$ by utilizing truncation of discretised Fourier transformation to the viscous Burgers equation.

preprint2013arXiv

Capacity Pre-Log of Noncoherent SIMO Channels via Hironaka's Theorem

We find the capacity pre-log of a temporally correlated Rayleigh block-fading SIMO channel in the noncoherent setting. It is well known that for block-length L and rank of the channel covariance matrix equal to Q, the capacity pre-log in the SISO case is given by 1-Q/L. Here, Q/L can be interpreted as the pre-log penalty incurred by channel uncertainty. Our main result reveals that, by adding only one receive antenna, this penalty can be reduced to 1/L and can, hence, be made to vanish in the large-L limit, even if Q/L remains constant as L goes to infinity. Intuitively, even though the SISO channels between the transmit antenna and the two receive antennas are statistically independent, the transmit signal induces enough statistical dependence between the corresponding receive signals for the second receive antenna to be able to resolve the uncertainty associated with the first receive antenna's channel and thereby make the overall system appear coherent. The proof of our main theorem is based on a deep result from algebraic geometry known as Hironaka's Theorem on the Resolution of Singularities.
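The pre-log values stated in this abstract can be worked through for concrete numbers. The sketch below takes the abstract's two expressions at face value (pre-log $1-Q/L$ for SISO, penalty reduced to $1/L$ with a second receive antenna) and does not model any technical conditions the paper may impose; the function names and the example values of $Q$ and $L$ are illustrative assumptions.

```python
from fractions import Fraction


def prelog_siso(Q, L):
    """SISO capacity pre-log 1 - Q/L, as stated in the abstract."""
    return 1 - Fraction(Q, L)


def prelog_two_receive_antennas(L):
    """Pre-log with one added receive antenna: the penalty drops to 1/L."""
    return 1 - Fraction(1, L)


if __name__ == "__main__":
    # Illustrative values: Q/L is held at 1/2 while L grows.
    for L in (10, 100, 1000):
        Q = L // 2
        print(L, prelog_siso(Q, L), prelog_two_receive_antennas(L))
```

With $Q/L$ fixed at $1/2$, the SISO pre-log stays at $1/2$ while the two-antenna pre-log approaches 1 as $L$ grows, which is the vanishing-penalty behavior the abstract describes.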

preprint2013arXiv

Existence of solutions to path-dependent kinetic equations and related forward - backward systems

This paper is devoted to path-dependent kinetic equations arising, in particular, from the analysis of the coupled backward - forward systems of equations of mean field games. We present local well-posedness, global existence and some regularity results for these equations.

preprint2013arXiv

Inspection and crime prevention: an evolutionary perspective

In this paper, we analyse inspection games from an evolutionary perspective. In our evolutionary inspection game with a large population, each individual is not a rational payoff maximiser, but periodically updates his strategy if he perceives that other individuals' strategies are more successful than his own, namely strategies are subject to evolutionary pressure. We develop this game in a few directions. Firstly, social norms are incorporated into the game and we analyse how social norms may influence individuals' propensity to engage in criminal behaviour. Secondly, a forward-looking inspector is considered, namely, the inspector chooses the level of law enforcement whilst taking into account the effect that this choice will have on future crime rates. Finally, the game is extended to one with continuous strategy spaces.

preprint2013arXiv

Near-Optimal Truthful Auction Mechanisms in Secondary Spectrum Markets

In this work, we study the spectrum auction problem where each request from secondary users has spatial, temporal, and spectral features. With the requests of secondary users and the reserve price of the primary user, our goal is to design truthful mechanisms that will either maximize the social efficiency or maximize the revenue of the primary user. As the optimal conflict-free spectrum allocation problem is NP-hard, in this work, we design near optimal spectrum allocation mechanisms separately based on the following techniques: derandomized allocation from integer programming formulation, its linear programming (LP) relaxation, and the dual of the LP. We theoretically prove that 1) our near optimal allocation methods are bid monotone, which implies truthful auction mechanisms; and 2) our near optimal allocation methods can achieve a social efficiency or a revenue that is at least $1-\frac{1}{e}$ times the optimal, respectively. Finally, we conduct extensive simulations to study the performance (social efficiency, revenue) of the proposed methods, and the simulation results corroborate our theoretical analysis.

preprint2013arXiv

PS-TRUST: Provably Secure Solution for Truthful Double Spectrum Auctions

Truthful spectrum auctions have been extensively studied in recent years. Truthfulness makes bidders bid their true valuations, simplifying greatly the analysis of auctions. However, revealing one's true valuation causes severe privacy disclosure to the auctioneer and other bidders. To make things worse, previous work on secure spectrum auctions does not provide adequate security. In this paper, based on TRUST, we propose PS-TRUST, a provably secure solution for truthful double spectrum auctions. Besides maintaining the properties of truthfulness and special spectrum reuse of TRUST, PS-TRUST achieves provable security against semi-honest adversaries in the sense of cryptography. Specifically, PS-TRUST reveals nothing about the bids to anyone in the auction, except the auction result. To the best of our knowledge, PS-TRUST is the first provably secure solution for spectrum auctions. Furthermore, experimental results show that the computation and communication overhead of PS-TRUST is modest, and its practical applications are feasible.

preprint2013arXiv

Quasi-Static SIMO Fading Channels at Finite Blocklength

We investigate the maximal achievable rate for a given blocklength and error probability over quasi-static single-input multiple-output (SIMO) fading channels. Under mild conditions on the channel gains, it is shown that the channel dispersion is zero regardless of whether the fading realizations are available at the transmitter and/or the receiver. The result follows from computationally and analytically tractable converse and achievability bounds. Through numerical evaluation, we verify that, in some scenarios, zero dispersion indeed entails fast convergence to outage capacity as the blocklength increases. In the example of a particular $1 \times 2$ SIMO Rician channel, the blocklength required to achieve $90\%$ of capacity is about an order of magnitude smaller compared to the blocklength required for an AWGN channel with the same capacity.

preprint2013arXiv

SL(n,R)-Toda Black Holes

We consider D-dimensional Einstein gravity coupled to (n-1) U(1) vector fields and (n-2) dilatonic scalars. We find that for some appropriate exponential dilaton couplings of the field strengths, the equations of motion for the static charged ansatz can be reduced to a set of one-dimensional SL(n,R) Toda equations. This allows us to obtain a general class of explicit black holes with mass and (n-1) independent charges. The near-horizon geometry in the extremal limit is AdS_2 x S^{D-2}. The n=2 case gives the Reissner-Nordstrom solution, and the n=3 example includes the Kaluza-Klein dyon. We study the global structure and the black hole thermodynamics and obtain the universal entropy product formula. We also discuss the characteristics of extremal multi-charge black holes that have positive, zero or negative binding energies.

preprint2013arXiv

The Implications from Benchmarking Three Big Data Systems

Along with today's data explosion and application diversification, a variety of hardware platforms for big data are emerging, attracting interest from both industry and academia. The existing hardware platforms represent a wide range of implementation approaches, and different hardware platforms have different strengths. In this paper, we conduct comprehensive evaluations on three representative big data systems: Intel Xeon, Atom (low power processors), and many-core Tilera using BigDataBench - a big data benchmark suite. Then we explore the relative performance of the three implementation approaches by running BigDataBench, and provide strong guidance for the construction of big data systems. Through our experiments, we have inferred that a big data system based on specific hardware has different performance in the context of different applications and data volumes. When we construct a system, we should take into account not only the performance or energy consumption of the pure hardware, but also the characteristics of applications running on it. Data scale, application type and complexity should be considered comprehensively when researchers or architects plan to choose fundamental components for their big data systems.

preprint2012arXiv

Diversity versus Channel Knowledge at Finite Block-Length

We study the maximal achievable rate R*(n, ε) for a given block-length n and block error probability ε over Rayleigh block-fading channels in the noncoherent setting and in the finite block-length regime. Our results show that for a given block-length and error probability, R*(n, ε) is not monotonic in the channel's coherence time, but there exists a rate-maximizing coherence time that optimally trades between diversity and cost of estimating the channel.

preprint2012arXiv

Electron correlation and spin-orbit coupling effects in US3 and USe3

A systematic density functional theory (DFT)+U study is conducted to investigate the electron correlation and spin-orbit coupling (SOC) effects in US3 and USe3. Our calculations reveal that inclusion of the U term is essential to obtain energy band gaps for them, indicating strong correlation effects for the uranium 5f electrons. Taking the SOC effect into account results in a small reduction of the electronic band gaps of US3 and USe3, but largely changes the energy band shapes around the Fermi energy. As a result, US3 has a direct band gap while USe3 has an indirect one. Our calculations predict that both US3 and USe3 are antiferromagnetic insulators, in agreement with corresponding experimental results. Based on our DFT+U calculations, we systematically present the ground-state electronic, mechanical, and Raman properties of US3 and USe3.

preprint2012arXiv

Energy-Efficient Transmission Schemes in Cooperative Cellular Systems

Energy-efficient communication is an important requirement for mobile devices, as the battery technology has not kept up with the growing requirements stemming from ubiquitous multimedia applications. This paper considers energy-efficient transmission schemes in cooperative cellular systems with unbalanced traffic between uplink and downlink. Theoretically, we derive the optimal transmission data rate, which minimizes the total energy consumption of battery-powered terminals per information bit. The energy-efficient cooperation regions are then investigated to illustrate the effects of relay locations on the energy-efficiency of the systems, and the optimal relay location is found for maximum energy-efficiency. Finally, numerical results are provided to demonstrate the tradeoff between energy-efficiency and spectral efficiency.

preprint2012arXiv

Mean Field Games and Nonlinear Markov Processes

In this paper, we investigate mean field games with $K$ classes of agents who are weakly coupled via the empirical measure. The underlying dynamics of the representative agents is assumed to be a controlled nonlinear Markov process associated with rather general integro-differential generators of Lévy-Khintchine type (with variable coefficients). We show that the nonlinear measure-valued kinetic equations describing the dynamic law of large numbers limit for systems with a large number N of agents are solvable and that their solutions represent 1/N-Nash equilibria for approximating systems of N agents.

preprint2012arXiv

More than Word Frequencies: Authorship Attribution via Natural Frequency Zoned Word Distribution Analysis

With the increasing popularity and availability of digital text data, the authorship of digital texts cannot be taken for granted due to the ease of copying and parsing. This paper presents a new text style analysis called natural frequency zoned word distribution analysis (NFZ-WDA), and then a basic authorship attribution scheme and an open authorship attribution scheme for digital texts based on the analysis. NFZ-WDA is based on the observation that all authors leave distinct intrinsic word usage traces on texts written by them and these intrinsic styles can be identified and employed to analyze the authorship. The intrinsic word usage styles can be estimated through the analysis of word distribution within a text, which goes beyond normal word frequency analysis and can be expressed as three questions: which groups of words are used in the text; how frequently each group of words occurs; and how the occurrences of each group of words are distributed in the text. Next, the basic authorship attribution scheme and the open authorship attribution scheme provide solutions for both closed and open authorship attribution problems. Through analysis and extensive experimental studies, this paper demonstrates the efficiency of the proposed method for authorship attribution.

preprint2012arXiv

On the Capacity of Large-MIMO Block-Fading Channels

We characterize the capacity of Rayleigh block-fading multiple-input multiple-output (MIMO) channels in the noncoherent setting where transmitter and receiver have no a priori knowledge of the realizations of the fading channel. We prove that unitary space-time modulation (USTM) is not capacity-achieving in the high signal-to-noise ratio (SNR) regime when the total number of antennas exceeds the coherence time of the fading channel (expressed in multiples of the symbol duration), a situation that is relevant for MIMO systems with large antenna arrays (large-MIMO systems). This result settles a conjecture by Zheng & Tse (2002) in the affirmative. The capacity-achieving input signal, which we refer to as Beta-variate space-time modulation (BSTM), turns out to be the product of a unitary isotropically distributed random matrix, and a diagonal matrix whose nonzero entries are distributed as the square-root of the eigenvalues of a Beta-distributed random matrix of appropriate size. Numerical results illustrate that using BSTM instead of USTM in large-MIMO systems yields a rate gain as large as 13% for SNR values of practical interest.

preprint2012arXiv

On the security of an enhanced short signature scheme

Currently, short signatures are receiving significant attention since they are particularly useful in low-bandwidth communication environments. However, most short signature schemes are based on only one intractability assumption. Recently, Su presented an identity-based short signature scheme based on knapsack and bilinear pairing. He claimed that the signature scheme is secure in the random oracle model. Unfortunately, in this paper, we show that his scheme is insecure. Concretely, an adversary can forge a valid signature on any message with respect to any identity in Su's scheme.

preprint2012arXiv

Reliable Multicasting for Device-to-Device Radio Underlaying Cellular Networks

This paper proposes Leader in Charge (LiC), a reliable multicast architecture for device-to-device (D2D) radio underlaying cellular networks. The multicast-requesting user equipments (UEs) in close proximity form a D2D cluster to receive the multicast packets through cooperation. In addition to receiving the multicast packets from the eNB, UEs share what they received from the multicast on short-range links among UEs, namely the D2D links, to exploit the wireless resources in a more efficient way. We show by means of simulation that utilizing the D2D links in cellular networks increases the throughput of a multicast session. We also discuss some practical issues facing the integration of LiC into current cellular networks. In particular, we propose an efficient delay control mechanism to reduce the average and maximum delay experienced by LiC users, which is further confirmed by the simulation results.

preprint2012arXiv

Security of a biometric identity-based encryption scheme

Biometric identity-based encryption (Bio-IBE) is a kind of fuzzy identity-based encryption (fuzzy IBE) where a ciphertext encrypted under an identity w' can be decrypted using a secret key corresponding to an identity w which is close to w' as measured by some metric. Recently, Yang et al. proposed a constant-size Bio-IBE scheme and proved that it is secure against adaptive chosen-ciphertext attack (CCA2) in the random oracle model. Unfortunately, in this paper, we show that their Bio-IBE scheme is not even chosen-plaintext secure. Specifically, a user w using his secret key is able to decrypt any ciphertext encrypted under an identity w' even though w is not close to w'.

preprint2012arXiv

The turnpike theorems for Markov games

This paper has a twofold purpose. First, we attempt to outline the development of the turnpike theorems in the last several decades. Second, we study turnpike theorems in finite-horizon two-person zero-sum Markov games on a general Borel state space. Utilising the Bellman (or Shapley) operator defined for this game, we prove the stochastic versions of the early turnpike theorem on the set of optimal strategies and the middle turnpike theorem on the distribution of the state space.

preprint2011arXiv

Capacity Pre-Log of SIMO Correlated Block-Fading Channels

We establish an upper bound on the noncoherent capacity pre-log of temporally correlated block-fading single-input multiple-output (SIMO) channels. The upper bound matches the lower bound recently reported in Riegler et al. (2011), and, hence, yields a complete characterization of the SIMO noncoherent capacity pre-log, provided that the channel covariance matrix satisfies a mild technical condition. This result allows one to determine the optimal number of receive antennas to be used to maximize the capacity pre-log for a given block-length and a given rank of the channel covariance matrix.

preprint2011arXiv

Quantum discord of two-qubit rank-two states

Among various definitions of quantum correlations, quantum discord has attracted considerable attention. Finding an analytical expression for quantum discord is an intractable task. In this paper, we discuss thoroughly the case of two-qubit rank-two states. An analytical expression for the quantum discord is obtained by means of the Koashi-Winter relation. A geometric picture is demonstrated by means of the quantum steering ellipsoid. We point out that in this case the optimal measurement is indeed the von Neumann measurement, which is usually used in the study of quantum discord. However, for some two-qubit states with rank larger than two, we find that a three-element POVM measurement does better. This means that more careful attention should be paid in the discussion of quantum discord.

preprint2010arXiv

Energy-Efficient Relay Selection and Optimal Relay Location in Cooperative Cellular Networks with Asymmetric Traffic

Energy-efficient communication is an important requirement for mobile relay networks due to the limited battery power of user terminals. This paper considers energy-efficient relaying schemes through selection of mobile relays in cooperative cellular systems with asymmetric traffic. The total energy consumption per information bit of the battery-powered terminals, i.e., the mobile station (MS) and the relay, is derived in theory. In the Joint Uplink and Downlink Relay Selection (JUDRS) scheme we propose, the relay which minimizes the total energy consumption is selected. Additionally, the energy-efficient cooperation regions are investigated, and the optimal relay location is found for cooperative cellular systems with asymmetric traffic. The results reveal that the MS-relay and the relay-base station (BS) channels influence relay selection decisions differently for optimal energy-efficiency. Information theoretic analysis of the diversity-multiplexing tradeoff (DMT) demonstrates that the proposed scheme achieves full spatial diversity in the quantity of cooperating terminals in this network. Finally, numerical results further confirm a significant energy efficiency gain of the proposed algorithm compared to the previous best worse channel selection and best harmonic mean selection algorithms.

preprint2010arXiv

Joint Relay Selection and Link Adaptation for Distributed Beamforming in Regenerative Cooperative Networks

Relay selection enhances the performance of cooperative networks by selecting the links with higher capacity. Meanwhile, link adaptation improves the spectral efficiency of wireless data-centric networks by adapting the modulation and coding schemes (MCS) to the current link condition. In this paper, relay selection is combined with link adaptation for distributed beamforming in a two-hop regenerative cooperative system. A novel signaling mechanism and related optimal algorithms are proposed for joint relay selection and link adaptation. In the proposed scheme, there is no need to feed back the relay selection results to each relay. Instead, by broadcasting the link adaptation results from the destination, each relay will automatically understand whether it is selected or not. The lower and upper bounds of the throughput of the proposed scheme are derived. The analysis and simulation results indicate that the proposed scheme provides synergistic gains compared to the pure relay selection and link adaptation schemes.

preprint2010arXiv

Joint Uplink and Downlink Relay Selection in Cooperative Cellular Networks

We consider a relay selection technique in a cooperative cellular network where user terminals act as mobile relays to help the communications between base station (BS) and mobile station (MS). A novel relay selection scheme, called Joint Uplink and Downlink Relay Selection (JUDRS), is proposed in this paper. Specifically, we generalize JUDRS in two key aspects: (i) the relay is selected jointly for uplink and downlink, so that the relay selection overhead can be reduced, and (ii) we minimize the weighted total energy consumption of MS, relay and BS by taking into account the channel quality and traffic load condition of uplink and downlink. Information theoretic analysis of the diversity-multiplexing tradeoff demonstrates that the proposed scheme achieves full spatial diversity in the quantity of cooperating terminals in this network. Numerical results are provided to further confirm a significant energy efficiency gain of the proposed algorithm compared to the previous best worse channel selection and best harmonic mean selection algorithms.

preprint2009arXiv

Magnetism in Cr-doped ZnS: Density-functional theory studies

We investigated the magnetism and aggregation trends in cubic Zn1-xCrxS using density-functional theory calculations. We demonstrate that all studied configurations show ground state half-metallic ferromagnetism (HMF), and that it is energetically favorable for Cr impurities to cluster in planes, forming delta-doping structures. The single-layer delta-doping structures of Zn0.75Cr0.25S and Zn0.875Cr0.125S show ferromagnetic stabilization energies (ΔE_AF) of 0.551 and 0.561 eV/Cr-Cr pair, respectively. The half-layer delta-doping structure of Zn0.875Cr0.125S and the double-layer delta-doping structure of Zn0.75Cr0.25S show ΔE_AF of 0.394 and 0.166 eV/Cr-Cr pair, respectively. Furthermore, our studies indicate that the cubic ZnS/CrS heterostructure, one extreme situation of the delta-doping structure, also shows ground state HMF. The origin of HMF is discussed using a simple crystal field model. Finally, we anticipate the potential spintronics application of Zn1-xCrxS.

preprint2009arXiv

Structure and Magnetism in Mn Doped Zirconia: Density-functional Theory Studies

Using the first-principles density-functional theory plane-wave pseudopotential method, we systematically investigate the structure and magnetism in 25% Mn substitutive and interstitial doped monoclinic, tetragonal and cubic ZrO2. Our studies show that the introduction of Mn impurities into ZrO2 not only stabilizes the high temperature phase, but also endows ZrO2 with magnetism. Based on the simple crystal field theory (CFT), we discuss the origin of magnetism in Mn doped ZrO2. Moreover, we discuss the effect of electron donors on magnetic semiconductors, and their possible use as electronic structure modulators.

preprint2006arXiv

Dual-Topology Hamiltonian-Replica-Exchange Overlap Histogramming Method to Calculate Relative Free Energy Difference in Rough Energy Landscape

A novel overlap histogramming method based on the Dual-Topology Hamiltonian-Replica-Exchange simulation technique is presented to efficiently calculate relative free energy differences in a rough energy landscape, in which multiple conformers coexist and are separated by large energy barriers. The proposed method is based on the realization that both DT-HERM exchange efficiency and confidence of free energy determination in the overlap histogramming method depend on the same criterion: the overlap of neighboring states' energy derivative distributions. In this paper, we demonstrate this new methodology by calculating the free energy difference between the amino acids Leucine and Asparagine, a system identified as challenging for free energy simulations.

94 published item(s)

CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

Artificial intelligence for diagnosing and predicting survival of patients with renal cell carcinoma: Retrospective multi-center study

MLMSA: Multi-Label Multi-Side-Channel-Information enabled Deep Learning Attacks on APUF Variants

Analysis Facilities for HL-LHC

CorrGAN: Input Transformation Technique Against Natural Corruptions

Design and Evaluate Recomposited OR-AND-XOR-PUF

Detecting Topology Attacks against Graph Neural Networks

EREBA: Black-box Energy Testing of Adaptive Neural Networks

Fast localization and single-pixel imaging of the moving object using time-division multiplexing

HandoverSim: A Simulation Framework and Benchmark for Human-to-Robot Object Handovers

HumanNeRF: Efficiently Generated Human Radiance Field from Sparse Inputs

Learning Free Gait Transition for Quadruped Robots via Phase-Guided Controller

Learning Perceptual Concepts by Bootstrapping from Human Queries

Model Predictive Control for Fluid Human-to-Robot Handovers

Multi-Modal Fusion in Contact-Rich Precise Tasks via Hierarchical Policy Learning

NeReF: Neural Refractive Field for Fluid Surface Reconstruction and Implicit Representation

Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions

NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing

NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models

Spin Manipulation by Giant Valley-Zeeman Spin-Orbit Field in Atom-Thick WSe2

Temperature-linear Resistivity in Twisted Double Bilayer Graphene

Transformer Tracking with Cyclic Shifting Window Attention

An Empirical Analysis of UI-based Flaky Tests

F3SNet: A Four-Step Strategy for QIM Steganalysis of Compressed Speech Based on Hierarchical Attention Network

Isospin competitions and valley polarized correlated insulators in twisted double bilayer graphene

Spatially indirect intervalley excitons in bilayer WSe2

Are We Ready for Service Robots? The OpenLORIS-Scene Datasets for Lifelong SLAM

Collaborative Behavior Models for Optimized Human-Robot Teamwork

Direct Observation of Room-Temperature Dislocation Plasticity in Diamond

DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features

High-order minibands and interband Landau level reconstruction in graphene moire superlattice

Human Grasp Classification for Reactive Human-to-Robot Handovers

Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models

Massive Access for Future Wireless Communication Systems

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation

Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation

Towards Playing Full MOBA Games with Deep Reinforcement Learning

TREVERSE: Trial-and-Error Lightweight Secure Reverse Authentication with Simulatable PUFs

Unsupervised Deformable Medical Image Registration via Pyramidal Residual Deformation Fields Estimation

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

Universal transfer and stacking technique of van der Waals heterostructures for spintronics

A Roadmap for HEP Software and Computing R&D for the 2020s

$D \rightarrow a_1, f_1$ transition form factors and semileptonic decays via 3-point QCD sum rules

A Beta-Beta Achievability Bound with Applications

Dislocation Activities at the Martensite Phase Transformation Interface in Metastable Austenitic Stainless Steel: An In-situ TEM Study

Finite-Blocklength Bounds for Wiretap Channels

Minimum Energy to Send $k$ Bits Over Multiple-Antenna Fading Channels

Mutual Information Optimally Local Private Discrete Distribution Estimation

New Insights on Stacking Fault Behavior in Twin Induced Plasticity from Meta-Atom Molecular Dynamics Simulations

Nonasymptotic coding-rate bounds for binary erasure channels with feedback

Broadcasting a Common Message with Variable-Length Stop-Feedback Codes

Clothing Co-Parsing by Joint Image Segmentation and Labeling

Data-Driven Scene Understanding with Adaptively Retrieved Exemplars

Discriminatively Trained And-Or Graph Models for Object Shape Detection

Fair Packet Scheduling in Network on Chip

Finite-SNR Bounds on the Sum-Rate Capacity of Rayleigh Block-Fading Multiple-Access Channels with no a Priori CSI

Inspection games in a mean field setting

Learning Contour-Fragment-based Shape Model with And-Or Tree Representation

Optimum Power Control at Finite Blocklength

Resolving Scale Ambiguity Via XSlit Aspect Ratio Analysis

Sensitivity analysis for HJB equations with an application to coupled backward-forward systems

Short-Packet Communications over Multiple-Antenna Rayleigh-Fading Channels

Diversity versus Multiplexing at Finite Blocklength

Gate-dependent Pseudospin Mixing in Graphene/Boron Nitride Moire Superlattices

Observation of an intrinsic bandgap and Landau level renormalization in graphene/boron-nitride heterostructures

Quasi-Static Multiple-Antenna Fading Channels at Finite Blocklength

A Galerkin approximation scheme for the mean correction in a mean-reversion stochastic differential equation

Capacity Pre-Log of Noncoherent SIMO Channels via Hironaka's Theorem

Existence of solutions to path-dependent kinetic equations and related forward - backward systems

Inspection and crime prevention: an evolutionary perspective

Near-Optimal Truthful Auction Mechanisms in Secondary Spectrum Markets

PS-TRUST: Provably Secure Solution for Truthful Double Spectrum Auctions

Quasi-Static SIMO Fading Channels at Finite Blocklength