Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
54works
0followers
28topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

54 published item(s)

preprint2026arXiv

High-Fidelity Mobile Avatars with Pruned Local Blendshapes

We propose a method to reconstruct high-fidelity human avatars from multi-view video that can run on mobile devices. Many works can model high-quality Gaussian-based full-body avatars from multi-view video. However, these methods require heavy computation to obtain pose-dependent appearance, making deployment on mobile devices very difficult. Recent methods distill from pretrained models and model pose-dependent nonlinear Gaussian attributes by linearly combining global pose features with blendshapes. Although they can run on mobile devices, they suffer some loss of detail. We observe that nearby Gaussians are often highly correlated within a local region of the body, and can be linearly modeled with less error. Therefore, we use local linear blendshapes in small body parts to capture global nonlinear changes of Gaussian attributes. To further reduce computation and model size, we propose to remove blendshapes for Gaussians whose attributes change little, yielding a minimal blendshape representation. Our method is an end-to-end training method without a pretrained model. To make it run on multiple devices, we implement our method using WebGPU. Experiments show that our method can render high-quality human avatars with better details, and can reach 120 FPS at 2K resolution on mobile devices.

preprint2024arXiv

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction

We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction. HOI4D consists of 2.4M RGB-D egocentric video frames over 4000 sequences collected by 4 participants interacting with 800 different object instances from 16 categories over 610 different indoor rooms. Frame-wise annotations for panoptic segmentation, motion segmentation, 3D hand pose, category-level object pose and hand action have also been provided, together with reconstructed object meshes and scene point clouds. With HOI4D, we establish three benchmarking tasks to promote category-level HOI from 4D visual signals including semantic segmentation of 4D dynamic point cloud sequences, category-level object pose tracking, and egocentric action segmentation with diverse interaction targets. In-depth analysis shows HOI4D poses great challenges to existing methods and produces great research opportunities.

preprint2023arXiv

Entanglement and work statistics in the driven open system

We study the entanglement and work statistics in a driven two-qubit system. The regulation of periodic driving has much more versatility and universality in contrast to reservoir engineering in static systems. We found the quasi-steady state entanglement can be amplified effectively by the external drive in certain parameter regimes. The drive extends the range of temperatures or temperature differences at which entanglement can emerge. From the view of the effective Hamiltonian, the addition of the driving alters the inter-qubit coupling and system-bath coupling, which are crucial in determining the quasi-steady state. The work statistics are also investigated. The driven system, as a continuous quantum thermal machine, output work continuously and steadily at the quasi-steady state. There is a distinct operation of modes and corresponding performance by changing driving. It can also be understood that the drive changes the effective Hamiltonian, and further the modes of energy exchanges between the system and the baths as well as the work reservoir.

preprint2022arXiv

8D Parameters Estimation for Bistatic EMVS-MIMO Radar via the nested PARAFAC

In this letter, a novel nested PARAFAC algorithm was proposed to improve the 8D parameters estimation performance for the bistatic EMVS-MIMO radar. Firstly, the outer part PARAFAC algorithm was carried out to estimate the receive spatial response matrix and its first way factor matrix. For the estimated first way factor matrix, a theory is given to rearrange its data into an new matrix, which is the mode-1 unfolding matrix of a three-way tensor. Then, the inner part PARAFAC algorithm was used to estimate the transmit steering vector matrix, the transmit spatial response matrix and the receive steering vector matrix. Thus, the transmit 4D parameters and receive 4D parameters can be accurately located via the abovementioned process. Compared with the original PARAFAC algorithm, the proposed nested PARAFAC algorithm can avoid additional reconstruction process when estimating the transmit/receive spatial response matrix. Moreover, the proposed algorithm can offer a highly-accurate 8D parameters estimaiton than that of the original PARAFAC algorithm. Simulated results verify the effectiveness of the proposed algorithm.

preprint2022arXiv

A Practical Two-stage Ranking Framework for Cross-market Recommendation

Cross-market recommendation aims to recommend products to users in a resource-scarce target market by leveraging user behaviors from similar rich-resource markets, which is crucial for E-commerce companies but receives less research attention. In this paper, we present our detailed solution adopted in the cross-market recommendation contest, i.e., WSDM CUP 2022. To better utilize collaborative signals and similarities between target and source markets, we carefully consider multiple features as well as stacking learning models consisting of deep graph recommendation models (Graph Neural Network, DeepWalk, etc.) and traditional recommendation models (ItemCF, UserCF, Swing, etc.). Furthermore, We adopt tree-based ensembling methods, e.g., LightGBM, which show superior performance in prediction task to generate final results. We conduct comprehensive experiments on the XMRec dataset, verifying the effectiveness of our model. The proposed solution of our team WSDM_Coggle_ is selected as the second place submission.

preprint2022arXiv

Attention U-Net as a surrogate model for groundwater prediction

Numerical simulations of groundwater flow are used to analyze and predict the response of an aquifer system to its change in state by approximating the solution of the fundamental groundwater physical equations. The most used and classical methodologies, such as Finite Difference (FD) and Finite Element (FE) Methods, use iterative solvers which are associated with high computational cost. This study proposes a physics-based convolutional encoder-decoder neural network as a surrogate model to quickly calculate the response of the groundwater system. Holding strong promise in cross-domain mappings, encoder-decoder networks are applicable for learning complex input-output mappings of physical systems. This manuscript presents an Attention U-Net model that attempts to capture the fundamental input-output relations of the groundwater system and generates solutions of hydraulic head in the whole domain given a set of physical parameters and boundary conditions. The model accurately predicts the steady state response of a highly heterogeneous groundwater system given the locations and piezometric head of up to 3 wells as input. The network learns to pay attention only in the relevant parts of the domain and the generated hydraulic head field corresponds to the target samples in great detail. Even relative to coarse finite difference approximations the proposed model is shown to be significantly faster than a comparative state-of-the-art numerical solver, thus providing a base for further development of the presented networks as surrogate models for groundwater prediction.

preprint2022arXiv

CodedVTR: Codebook-based Sparse Voxel Transformer with Geometric Guidance

Transformers have gained much attention by outperforming convolutional neural networks in many 2D vision tasks. However, they are known to have generalization problems and rely on massive-scale pre-training and sophisticated training techniques. When applying to 3D tasks, the irregular data structure and limited data scale add to the difficulty of transformer's application. We propose CodedVTR (Codebook-based Voxel TRansformer), which improves data efficiency and generalization ability for 3D sparse voxel transformers. On the one hand, we propose the codebook-based attention that projects an attention space into its subspace represented by the combination of "prototypes" in a learnable codebook. It regularizes attention learning and improves generalization. On the other hand, we propose geometry-aware self-attention that utilizes geometric information (geometric pattern, density) to guide attention learning. CodedVTR could be embedded into existing sparse convolution-based methods, and bring consistent performance improvements for indoor and outdoor 3D semantic segmentation tasks

preprint2022arXiv

Deep Reinforcement Learning Coordinated Receiver Beamforming for Millimeter-Wave Train-ground Communications

As more and more people choose high-speed rail (HSR) as a means of transportation for short trips, there is ever growing demand of high quality of multimedia services. With its rich spectrum resources, millimeter wave (mm-wave) communications can satisfy the high network capacity requirements for HSR. Also, it is possible for receivers (RXs) to be equipped with antenna arrays in mm-wave communication systems due to its short wavelength. However, as HSRs run with high speed, the received signal power (RSP) varies rapidly over a cell and it is the lowest at the edge of the cell compared to other locations. Consequently, it is necessary to conduct research on RX beamforming for HSR in mm-wave band to improve the quality of the received signal. In this paper, we focus on RX beamforming for a mm-wave train-ground communication system. To improve the RSP, we propose an effective RX beamforming scheme based on deep reinforcement learning (DRL), and develop a deep Q-network (DQN) algorithm to train and determine the optimal RX beam direction with the purpose of maximizing average RSP. Through extensive simulations, we demonstrate that the proposed scheme has better performance than the four baseline schemes in terms of average RSP at most positions on the railway.

preprint2022arXiv

Design of Switchable Frequency-Selective Rasorber with A-R-A-T or A-T-A-R Operating Modes

This work presents a switchable frequency-selective rasorber (SFSR) with two operating modes. Switching from transmission to reflection can be achieved by appropriately adding feeders and PIN diodes based on cascaded two-dimensional lossy and lossless frequency-selective surface (FSS) screens. The proposed SFSR can realize out-of-band absorption. Analysis of the equivalent circuit model (ECM) can be useful for achieving a switchable rasorber. As the state of the PIN diodes changes, the working state of the SFSR can be switched from low-frequency reflection and high-frequency transmission to low-frequency transmission and high-frequency reflection, respectively. In the final-revision simulation of the working band of the SFSR, in the ON state, reflection and transmission peaks of -0.55 dB at 4.06 GHz and -0.52 dB at 5.97 GHz, respectively, are achieved; in the OFF state, the transmission and reflection peaks of -0.31 dB at 4.04GHz and -0.26 dB at 6.01 GHz, respectively, are obtained. A prototype sample is developed and validated. The results are in good agreement with those of the full-wave simulation. The proposed design can be used in intelligent anti-jamming communication.

preprint2022arXiv

Efficient characterization of quantum nondemolition qubit readout

We study the quantitative characterization of the performance of qubit measurements in this paper. In particular, the back-action evading nature of quantum nondemolition (QND) readout of qubits is fully quantified by quantum trace distance. Only computational basis states are necessary to be taken into consideration. Most importantly, we propose an experimentally efficient method to evaluate the QND fidelity based on the classical trace distance, which uses an experimental scheme with two consecutive measurements. The three key quantifiers of a measurement, i.e., QND fidelity, readout fidelity, and projectivity, can be derived directly from the same experimental scheme. Besides, we present the relationships among these three factors. Theoretical simulation results for the dispersive readout of superconducting qubits show the validity of the proposed QND fidelity. Efficient quantification of measurement performance is of practical significance for the diagnosis, performance improvement, and design of measurement apparatuses.

preprint2022arXiv

Ensemble of Deep Convolutional Neural Networks for real-time gravitational wave signal recognition

With the rapid development of deep learning technology, more and more researchers apply it to gravitational wave (GW) data analysis. Previous studies focused on a single deep learning model. In this paper we design an ensemble algorithm combining a set of convolutional neural networks (CNN) for GW signal recognition. The whole ensemble model consists of two sub-ensemble models. Each sub-ensemble model is also an ensemble model of deep learning. The two sub-ensemble models treat data of Hanford and Livinston detectors respectively. Proper voting scheme is adopted to combine the two sub-ensemble models to form the whole ensemble model. We apply this ensemble model to all reported GW events in the first observation and second observation runs (O1/O2) by LIGO-VIRGO Scientific Collaboration. We find that the ensemble algorithm can clearly identify all binary black hole merger events except GW170818. We also apply the ensemble model to one month (August 2017) data of O2. There is no false trigger happens although only O1 data are used for training. Our test results indicate that the ensemble learning algorithms can be used in real-time GW data analysis.

preprint2022arXiv

Fine-grained differentiable physics: a yarn-level model for fabrics

Differentiable physics modeling combines physics models with gradient-based learning to provide model explicability and data efficiency. It has been used to learn dynamics, solve inverse problems and facilitate design, and is at its inception of impact. Current successes have concentrated on general physics models such as rigid bodies, deformable sheets, etc., assuming relatively simple structures and forces. Their granularity is intrinsically coarse and therefore incapable of modelling complex physical phenomena. Fine-grained models are still to be developed to incorporate sophisticated material structures and force interactions with gradient-based learning. Following this motivation, we propose a new differentiable fabrics model for composite materials such as cloths, where we dive into the granularity of yarns and model individual yarn physics and yarn-to-yarn interactions. To this end, we propose several differentiable forces, whose counterparts in empirical physics are indifferentiable, to facilitate gradient-based learning. These forces, albeit applied to cloths, are ubiquitous in various physical systems. Through comprehensive evaluation and comparison, we demonstrate our model's explicability in learning meaningful physical parameters, versatility in incorporating complex physical structures and heterogeneous materials, data-efficiency in learning, and high-fidelity in capturing subtle dynamics.

preprint2022arXiv

FisherMatch: Semi-Supervised Rotation Regression via Entropy-based Filtering

Estimating the 3DoF rotation from a single RGB image is an important yet challenging problem. Recent works achieve good performance relying on a large amount of expensive-to-obtain labeled data. To reduce the amount of supervision, we for the first time propose a general framework, FisherMatch, for semi-supervised rotation regression, without assuming any domain-specific knowledge or paired data. Inspired by the popular semi-supervised approach, FixMatch, we propose to leverage pseudo label filtering to facilitate the information flow from labeled data to unlabeled data in a teacher-student mutual learning framework. However, incorporating the pseudo label filtering mechanism into semi-supervised rotation regression is highly non-trivial, mainly due to the lack of a reliable confidence measure for rotation prediction. In this work, we propose to leverage matrix Fisher distribution to build a probabilistic model of rotation and devise a matrix Fisher-based regressor for jointly predicting rotation along with its prediction uncertainty. We then propose to use the entropy of the predicted distribution as a confidence measure, which enables us to perform pseudo label filtering for rotation regression. For supervising such distribution-like pseudo labels, we further investigate the problem of how to enforce loss between two matrix Fisher distributions. Our extensive experiments show that our method can work well even under very low labeled data ratios on different benchmarks, achieving significant and consistent performance improvement over supervised learning and other semi-supervised learning baselines. Our project page is at https://yd-yin.github.io/FisherMatch.

preprint2022arXiv

FishFuzz: Throwing Larger Nets to Catch Deeper Bugs

Greybox fuzzing is the de-facto standard to discover bugs during development. Fuzzers execute many inputs to maximize the amount of reached code. Recently, Directed Greybox Fuzzers (DGFs) propose an alternative strategy that goes beyond "just" coverage: driving testing toward specific code targets by selecting "closer" seeds. DGFs go through different phases: exploration (i.e., reaching interesting locations) and exploitation (i.e., triggering bugs). In practice, DGFs leverage coverage to directly measure exploration, while exploitation is, at best, measured indirectly by alternating between different targets. Specifically, we observe two limitations in existing DGFs: (i) they lack precision in their distance metric, i.e., averaging multiple paths and targets into a single score (to decide which seeds to prioritize), and (ii) they assign energy to seeds in a round-robin fashion without adjusting the priority of the targets (exhaustively explored targets should be dropped). We propose FishFuzz, which draws inspiration from trawl fishing: first casting a wide net, scraping for high coverage, then slowly pulling it in to maximize the harvest. The core of our fuzzer is a novel seed selection strategy that builds on two concepts: (i) a novel multi-distance metric whose precision is independent of the number of targets, and (ii) a dynamic target ranking to automatically discard exhausted targets. This strategy allows FishFuzz to seamlessly scale to tens of thousands of targets and dynamically alternate between exploration and exploitation phases. We evaluate FishFuzz by leveraging all sanitizer labels as targets. Extensively comparing FishFuzz against modern DGFs and coverage-guided fuzzers shows that FishFuzz reached higher coverage compared to the direct competitors, reproduces existing bugs (70.2% faster), and finally discovers 25 new bugs (18 CVEs) in 44 programs.

preprint2022arXiv

Formation Control for UAVs Using a Flux Guided Approach

Existing studies on formation control for unmanned aerial vehicles (UAV) have not considered encircling targets where an optimum coverage of the target is required at all times. Such coverage plays a critical role in many real-world applications such as tracking hostile UAVs. This paper proposes a new path planning approach called the Flux Guided (FG) method, which generates collision-free trajectories for multiple UAVs while maximising the coverage of target(s). Our method enables UAVs to track directly toward a target whilst maintaining maximum coverage. Furthermore, multiple scattered targets can be tracked by scaling the formation during flight. FG is highly scalable since it only requires communication between sub-set of UAVs on the open boundary of the formation's surface. Experimental results further validate that FG generates UAV trajectories $1.5 \times$ shorter than previous work and that trajectory planning for 9 leader/follower UAVs to surround a target in two different scenarios only requires 0.52 seconds and 0.88 seconds, respectively. The resulting trajectories are suitable for robotic controls after time-optimal parameterisation; we demonstrate this using a 3d dynamic particle system that tracks the desired trajectories using a PID controller.

preprint2022arXiv

Functionally Reconfigurable Integrated Structure of Fabry-Perot Antenna and Wideband Liquid Absorber

This paper proposes a functionally reconfigurable integrated structure of a Fabry-Perot (FP) antenna and wideband liquid absorber. The partial reflecting surface (PRS) is elaborately tailored to serve as both a component of the FP antenna and the metal ground of the broadband liquid absorber. Then the integrated structure is realized by combining the FP antenna with the liquid absorber. The PRS is composed of patches on the top layer of the substrate and the square loop on the bottom and the microstrip patch antenna is used as a source. The liquid absorber is composed of a 3-D printed container, 45% ethanol layer, and a metal ground. By this design, the structure can serve as the FP antenna when the ethanol is extracted or the liquid absorber when the ethanol is injected. This can be used to switch between detection and stealth mode which can better adapt to the complex electromagnetic environment. The simulation results show that the gain of the antenna reaches to 19.7 dBi and the operating band is 100 MHz when the ethanol is extracted. When the ethanol is injected, the absorption band of the absorber ranges from 4 GHz to 20 GHz and the absorption bandwidth is over 133%. The monostatic RCS reduction bands of the structure with ethanol range from 4 GHz to 20GHz and the average RCS reduction is 28.4 dB. The measured and simulated results agree well.

preprint2022arXiv

iPLAN: Interactive and Procedural Layout Planning

Layout design is ubiquitous in many applications, e.g. architecture/urban planning, etc, which involves a lengthy iterative design process. Recently, deep learning has been leveraged to automatically generate layouts via image generation, showing a huge potential to free designers from laborious routines. While automatic generation can greatly boost productivity, designer input is undoubtedly crucial. An ideal AI-aided design tool should automate repetitive routines, and meanwhile accept human guidance and provide smart/proactive suggestions. However, the capability of involving humans into the loop has been largely ignored in existing methods which are mostly end-to-end approaches. To this end, we propose a new human-in-the-loop generative model, iPLAN, which is capable of automatically generating layouts, but also interacting with designers throughout the whole procedure, enabling humans and AI to co-evolve a sketchy idea gradually into the final design. iPLAN is evaluated on diverse datasets and compared with existing methods. The results show that iPLAN has high fidelity in producing similar layouts to those from human designers, great flexibility in accepting designer inputs and providing design suggestions accordingly, and strong generalizability when facing unseen design tasks and limited training data.

preprint2022arXiv

Learning Category-Level Generalizable Object Manipulation Policy via Generative Adversarial Self-Imitation Learning from Demonstrations

Generalizable object manipulation skills are critical for intelligent and multi-functional robots to work in real-world complex scenes. Despite the recent progress in reinforcement learning, it is still very challenging to learn a generalizable manipulation policy that can handle a category of geometrically diverse articulated objects. In this work, we tackle this category-level object manipulation policy learning problem via imitation learning in a task-agnostic manner, where we assume no handcrafted dense rewards but only a terminal reward. Given this novel and challenging generalizable policy learning problem, we identify several key issues that can fail the previous imitation learning algorithms and hinder the generalization to unseen instances. We then propose several general but critical techniques, including generative adversarial self-imitation learning from demonstrations, progressive growing of discriminator, and instance-balancing for expert buffer, that accurately pinpoints and tackles these issues and can benefit category-level manipulation policy learning regardless of the tasks. Our experiments on ManiSkill benchmarks demonstrate a remarkable improvement on all tasks and our ablation studies further validate the contribution of each proposed technique.

preprint2022arXiv

Multi-Robot Active Mapping via Neural Bipartite Graph Matching

We study the problem of multi-robot active mapping, which aims for complete scene map construction in minimum time steps. The key to this problem lies in the goal position estimation to enable more efficient robot movements. Previous approaches either choose the frontier as the goal position via a myopic solution that hinders the time efficiency, or maximize the long-term value via reinforcement learning to directly regress the goal position, but does not guarantee the complete map construction. In this paper, we propose a novel algorithm, namely NeuralCoMapping, which takes advantage of both approaches. We reduce the problem to bipartite graph matching, which establishes the node correspondences between two graphs, denoting robots and frontiers. We introduce a multiplex graph neural network (mGNN) that learns the neural distance to fill the affinity matrix for more effective graph matching. We optimize the mGNN with a differentiable linear assignment layer by maximizing the long-term values that favor time efficiency and map completeness via reinforcement learning. We compare our algorithm with several state-of-the-art multi-robot active mapping approaches and adapted reinforcement-learning baselines. Experimental results demonstrate the superior performance and exceptional generalization ability of our algorithm on various indoor scenes and unseen number of robots, when only trained with 9 indoor scenes.

preprint2022arXiv

PartAfford: Part-level Affordance Discovery from 3D Objects

Understanding what objects could furnish for humans-namely, learning object affordance-is the crux to bridge perception and action. In the vision community, prior work primarily focuses on learning object affordance with dense (e.g., at a per-pixel level) supervision. In stark contrast, we humans learn the object affordance without dense labels. As such, the fundamental question to devise a computational model is: What is the natural way to learn the object affordance from visual appearance and geometry with humanlike sparse supervision? In this work, we present a new task of part-level affordance discovery (PartAfford): Given only the affordance labels per object, the machine is tasked to (i) decompose 3D shapes into parts and (ii) discover how each part of the object corresponds to a certain affordance category. We propose a novel learning framework for PartAfford, which discovers part-level representations by leveraging only the affordance set supervision and geometric primitive regularization, without dense supervision. The proposed approach consists of two main components: (i) an abstraction encoder with slot attention for unsupervised clustering and abstraction, and (ii) an affordance decoder with branches for part reconstruction, affordance prediction, and cuboidal primitive regularization. To learn and evaluate PartAfford, we construct a part-level, cross-category 3D object affordance dataset, annotated with 24 affordance categories shared among >25, 000 objects. We demonstrate that our method enables both the abstraction of 3D objects and part-level affordance discovery, with generalizability to difficult and cross-category examples. Further ablations reveal the contribution of each component.

preprint2022arXiv

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

Generating new images with desired properties (e.g. new view/poses) from source images has been enthusiastically pursued recently, due to its wide range of potential applications. One way to ensure high-quality generation is to use multiple sources with complementary information such as different views of the same object. However, as source images are often misaligned due to the large disparities among the camera settings, strong assumptions have been made in the past with respect to the camera(s) or/and the object in interest, limiting the application of such techniques. Therefore, we propose a new general approach which models multiple types of variations among sources, such as view angles, poses, facial expressions, in a unified framework, so that it can be employed on datasets of vastly different nature. We verify our approach on a variety of data including humans bodies, faces, city scenes and 3D objects. Both the qualitative and quantitative results demonstrate the better performance of our method than the state of the art.

preprint2022arXiv

Projective Manifold Gradient Layer for Deep Rotation Regression

Regressing rotations on SO(3) manifold using deep neural networks is an important yet unsolved problem. The gap between the Euclidean network output space and the non-Euclidean SO(3) manifold imposes a severe challenge for neural network learning in both forward and backward passes. While several works have proposed different regression-friendly rotation representations, very few works have been devoted to improving the gradient backpropagating in the backward pass. In this paper, we propose a manifold-aware gradient that directly backpropagates into deep network weights. Leveraging Riemannian optimization to construct a novel projective gradient, our proposed regularized projective manifold gradient (RPMG) method helps networks achieve new state-of-the-art performance in a variety of rotation estimation tasks. Our proposed gradient layer can also be applied to other smooth manifolds such as the unit sphere. Our project page is at https://jychen18.github.io/RPMG.

preprint2022arXiv

Real-time Controllable Motion Transition for Characters

Real-time in-between motion generation is universally required in games and highly desirable in existing animation pipelines. Its core challenge lies in the need to satisfy three critical conditions simultaneously: quality, controllability and speed, which renders any methods that need offline computation (or post-processing) or cannot incorporate (often unpredictable) user control undesirable. To this end, we propose a new real-time transition method to address the aforementioned challenges. Our approach consists of two key components: motion manifold and conditional transitioning. The former learns the important low-level motion features and their dynamics; while the latter synthesizes transitions conditioned on a target frame and the desired transition duration. We first learn a motion manifold that explicitly models the intrinsic transition stochasticity in human motions via a multi-modal mapping mechanism. Then, during generation, we design a transition model which is essentially a sampling strategy to sample from the learned manifold, based on the target frame and the aimed transition duration. We validate our method on different datasets in tasks where no post-processing or offline computation is allowed. Through exhaustive evaluation and comparison, we show that our method is able to generate high-quality motions measured under multiple metrics. Our method is also robust under various target frames (with extreme cases).

preprint2022arXiv

Stable and Efficient Shapley Value-Based Reward Reallocation for Multi-Agent Reinforcement Learning of Autonomous Vehicles

With the development of sensing and communication technologies in networked cyber-physical systems (CPSs), multi-agent reinforcement learning (MARL)-based methodologies are integrated into the control process of physical systems and demonstrate prominent performance in a wide array of CPS domains, such as connected autonomous vehicles (CAVs). However, it remains challenging to mathematically characterize the improvement of the performance of CAVs with communication and cooperation capability. When each individual autonomous vehicle is originally self-interest, we can not assume that all agents would cooperate naturally during the training process. In this work, we propose to reallocate the system's total reward efficiently to motivate stable cooperation among autonomous vehicles. We formally define and quantify how to reallocate the system's total reward to each agent under the proposed transferable utility game, such that communication-based cooperation among multi-agents increases the system's total reward. We prove that Shapley value-based reward reallocation of MARL locates in the core if the transferable utility game is a convex game. Hence, the cooperation is stable and efficient and the agents should stay in the coalition or the cooperating group. We then propose a cooperative policy learning algorithm with Shapley value reward reallocation. In experiments, compared with several literature algorithms, we show the improvement of the mean episode system reward of CAV systems using our proposed algorithm.

preprint2022arXiv

Talking Head from Speech Audio using a Pre-trained Image Generator

We propose a novel method for generating high-resolution videos of talking-heads from speech audio and a single 'identity' image. Our method is based on a convolutional neural network model that incorporates a pre-trained StyleGAN generator. We model each frame as a point in the latent space of StyleGAN so that a video corresponds to a trajectory through the latent space. Training the network is in two stages. The first stage is to model trajectories in the latent space conditioned on speech utterances. To do this, we use an existing encoder to invert the generator, mapping from each video frame into the latent space. We train a recurrent neural network to map from speech utterances to displacements in the latent space of the image generator. These displacements are relative to the back-projection into the latent space of an identity image chosen from the individuals depicted in the training dataset. In the second stage, we improve the visual quality of the generated videos by tuning the image generator on a single image or a short video of any chosen identity. We evaluate our model on standard measures (PSNR, SSIM, FID and LMD) and show that it significantly outperforms recent state-of-the-art methods on one of two commonly used datasets and gives comparable performance on the other. Finally, we report on ablation experiments that validate the components of the model. The code and videos from experiments can be found at https://mohammedalghamdi.github.io/talking-heads-acm-mm

preprint2022arXiv

Underdetermined 2D-DOD and 2D-DOA Estimation for Bistatic Coprime EMVS-MIMO Radar: From the Difference Coarray Perspective

In this paper, the underdetermined 2D-DOD and 2D-DOA estimation for bistatic coprime EMVS-MIMO radar is considered. Firstly, a 5-D tensor model was constructed by using the multi-dimensional space-time characteristics of the received data. Then, an 8-D tensor has been obtained by using the auto-correlation calculation. To obtain the difference coarrays of transmit and receive EMVS, the de-coupling process between the spatial response of EMVS and the steering vector is inevitable. Thus, a new 6-D tensor can be constructed via the tensor permutation and the generalized tensorization of the canonical polyadic decomposition. {According} to the theory of the Tensor-Matrix Product operation, the duplicated elements in the difference coarrays can be removed by the utilization of two designed selection matrices. Due to the centrosymmetric geometry of the difference coarrays, two DFT beamspace matrices were subsequently designed to convert the complex steering matrices into the real-valued ones, whose advantage is to improve the estimation accuracy of the 2D-DODs and 2D-DOAs. Afterwards, a third-order tensor with the third-way fixed at 36 was constructed and the Parallel Factor algorithm was deployed, which can yield the closed-form automatically paired 2D-DOD and 2D-DOA estimation. The simulation results show that the proposed algorithm can exhibit superior estimation performance for the underdetermined 2D-DOD and 2D-DOA estimation.

preprint2021arXiv

A Slot Antenna Array with Reconfigurable RCS Using Liquid Absorber

This paper presents a slot antenna array with a reconfigurable radar cross section (RCS). The antenna system is formed by combining a liquid absorber with a 2*2 slot antenna array. The liquid absorber consists of a polymethyl methacrylate (PMMA) container, a 45% ethanol layer, and a metal ground,which is attached to the surface of the slot antenna array. The incident wave can be absorbed by the absorber rather than reflected in other directions when the PMMA container is filled with ethanol, which reduces the monostatic and bistatic RCS. Thus the RCS of the antenna can be changed by injecting and extracting ethanol while the antenna's radiation performance in terms of bandwidth, radiation patterns and gain is well sustained. In a complex communication system, this can be used to switch between detection and stealth mode. The mechanism of the absorber is investigated. The simulated results show that the antenna with this absorber has monostatic and bistatic RCS reduction bands from 2.0 GHz to 18.0 GHz, a maximum RCS reduction of 35 dB with an average RCS reduction of 13.28 dB. The antenna's operating band is 100 MHz. Without ethanol, the antenna has a realized gain of 12.1 dBi, and it drops by 2 dB when the lossy ethanol is injected. The measured results agree well with the simulated ones.

preprint2021arXiv

Band-Notched Frequency-Selective Absorber with Polarization Rotation Function

This paper presents the theoretical analysis, design, simulations, and experimental verification of a novel band-notched frequency-selective absorber (BFSA) with polarization rotation function and absorption out-of-band. The BFSA consists of multiple layers including one lossy layer, one polarization-rotary layer and an air layer. The lossy layer is loaded with lumped resistances for obtaining a wide absorption band. A new four-port equivalent circuit model (ECM) with lossy layer is developed for providing theoretical analysis. The BFSA realizes -0.41dB cross-polarization reflection at 4.5GHz. In the upper and lower bands, the BFSA realizes a broad absorption function from 2.2GHz to 4.1GHz and 5.03GHz to 6.5GHz. The fractional bandwidth of co-polarization reflection is 98.8% with 10dB reflection reduction. The full-wave simulation, ECM, and experimental measurements are conducted to validate the polarization conversion of the band-notched absorbers, and good agreement between theory and measurement results is observed.

preprint2021arXiv

Distributed Optimization with Coupling Constraints

In this paper, we develop a novel distributed algorithm for addressing convex optimization with both nonlinear inequality and linear equality constraints, where the objective function can be a general nonsmooth convex function and all the constraints can be fully coupled. Specifically, we first separate the constraints into three groups, and design two primal-dual methods and utilize a virtual-queue-based method to handle each group of the constraints independently. Then, we integrate these three methods in a strategic way, leading to an integrated primal-dual proximal (IPLUX) algorithm, and enable the distributed implementation of IPLUX. We show that IPLUX achieves an $O(1/k)$ rate of convergence in terms of optimality and feasibility, which is stronger than the convergence results of the state-of-the-art distributed algorithms for convex optimization with coupling nonlinear constraints. Finally, IPLUX exhibits competitive practical performance in the simulations.

preprint2021arXiv

Enhanced 3D Human Pose Estimation from Videos by using Attention-Based Neural Network with Dilated Convolutions

The attention mechanism provides a sequential prediction framework for learning spatial models with enhanced implicit temporal consistency. In this work, we show a systematic design (from 2D to 3D) for how conventional networks and other forms of constraints can be incorporated into the attention framework for learning long-range dependencies for the task of pose estimation. The contribution of this paper is to provide a systematic approach for designing and training of attention-based models for the end-to-end pose estimation, with the flexibility and scalability of arbitrary video sequences as input. We achieve this by adapting temporal receptive field via a multi-scale structure of dilated convolutions. Besides, the proposed architecture can be easily adapted to a causal model enabling real-time performance. Any off-the-shelf 2D pose estimation systems, e.g. Mocap libraries, can be easily integrated in an ad-hoc fashion. Our method achieves the state-of-the-art performance and outperforms existing methods by reducing the mean per joint position error to 33.4 mm on Human3.6M dataset.

preprint2021arXiv

Equilibrium and nonequilibrium quantum correlations between two detectors in curved space time

We investigate the equilibrium and nonequilibrium quantum information correlations encoded in two-qubit system (near the horizon of a Kerr black hole). We study the impact of mass and the angular momentum, and further the local curvature or accelerations on the behaviors of the quantum correlations between two qubits. We show the quantum information of two qubits is encoded in the space time structure. In nonequilibrium case, the nonequilibrium can also contribute to the correlations.

preprint2021arXiv

High-order Differentiable Autoencoder for Nonlinear Model Reduction

This paper provides a new avenue for exploiting deep neural networks to improve physics-based simulation. Specifically, we integrate the classic Lagrangian mechanics with a deep autoencoder to accelerate elastic simulation of deformable solids. Due to the inertia effect, the dynamic equilibrium cannot be established without evaluating the second-order derivatives of the deep autoencoder network. This is beyond the capability of off-the-shelf automatic differentiation packages and algorithms, which mainly focus on the gradient evaluation. Solving the nonlinear force equilibrium is even more challenging if the standard Newton's method is to be used. This is because we need to compute a third-order derivative of the network to obtain the variational Hessian. We attack those difficulties by exploiting complex-step finite difference, coupled with reverse automatic differentiation. This strategy allows us to enjoy the convenience and accuracy of complex-step finite difference and in the meantime, to deploy complex-value perturbations as collectively as possible to save excessive network passes. With a GPU-based implementation, we are able to wield deep autoencoders (e.g., $10+$ layers) with a relatively high-dimension latent space in real-time. Along this pipeline, we also design a sampling network and a weighting network to enable \emph{weight-varying} Cubature integration in order to incorporate nonlinearity in the model reduction. We believe this work will inspire and benefit future research efforts in nonlinearly reduced physical simulation problems.

preprint2021arXiv

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location of a building unit for in-game customization of residential home complex. Given a construction layout, we propose a visual context-aware graph generation network that learns the implicit global relations among the scene components and infers the location of a new building unit. The proposed network takes as input the scene graph and the corresponding top-view depth image. It provides the location recommendations for a newly-added building units by learning an auto-regressive edge distribution conditioned on existing scenes. We also introduce a global graph-image matching loss to enhance the awareness of essential geometry semantics of the site. Qualitative and quantitative experiments demonstrate that the recommended location well reflects the implicit spatial rules of components in the residential estates, and it is instructive and practical to locate the building units in the 3D scene of the complex construction.

preprint2021arXiv

Systematic electrochemical etching of various metal tips for tunneling spectroscopy and scanning probe microscopy

Hard point-contact spectroscopy and scanning probe microscopy/spectroscopy are powerful techniques for investigating materials with strong expandability. To support these studies, tips with various physical and chemical properties are required. To ensure the reproducibility of experimental results, the fabrication of tips should be standardized, and a controllable and convenient system should be set up. Here a systematic methodology to fabricate various tips is proposed, involving electrochemical etching reactions. The reaction parameters fall into four categories: solution, power supply, immersion depth, and interruption. An etching system was designed and built so that these parameters could be accurately controlled. With this system, etching parameters for copper, silver, gold, platinum/iridium alloy, tungsten, lead, niobium, iron, nickel, cobalt, and permalloy were explored and standardized. Among these tips, silver and niobium's new recipes were explored and standardized. Optical and scanning electron microscopies were performed to characterize the sharp needles. Relevant point-contact experiments were carried out with an etched silver tip to confirm the suitability of the fabricated tips.

preprint2020arXiv

Category-Level Articulated Object Pose Estimation

This project addresses the task of category-level pose estimation for articulated objects from a single depth image. We present a novel category-level approach that correctly accommodates object instances previously unseen during training. We introduce Articulation-aware Normalized Coordinate Space Hierarchy (ANCSH) - a canonical representation for different articulated objects in a given category. As the key to achieve intra-category generalization, the representation constructs a canonical object space as well as a set of canonical part spaces. The canonical object space normalizes the object orientation,scales and articulations (e.g. joint parameters and states) while each canonical part space further normalizes its part pose and scale. We develop a deep network based on PointNet++ that predicts ANCSH from a single depth point cloud, including part segmentation, normalized coordinates, and joint parameters in the canonical object space. By leveraging the canonicalized joints, we demonstrate: 1) improved performance in part pose and scale estimations using the induced kinematic constraints from joints; 2) high accuracy for joint parameter estimation in camera space.

preprint2020arXiv

Constant Regret Re-solving Heuristics for Price-based Revenue Management

Price-based revenue management is an important problem in operations management with many practical applications. The problem considers a retailer who sells a product (or multiple products) over $T$ consecutive time periods and is subject to constraints on the initial inventory levels. While the optimal pricing policy could be obtained via dynamic programming, such an approach is sometimes undesirable because of high computational costs. Approximate policies, such as the re-solving heuristics, are often applied as computationally tractable alternatives. In this paper, we show the following two results. First, we prove that a natural re-solving heuristic attains $O(1)$ regret compared to the value of the optimal policy. This improves the $O(\ln T)$ regret upper bound established in the prior work of \cite{jasin2014reoptimization}. Second, we prove that there is an $Ω(\ln T)$ gap between the value of the optimal policy and that of the fluid model. This complements our upper bound result by showing that the fluid is not an adequate information-relaxed benchmark when analyzing price-based revenue management algorithms.

preprint2020arXiv

Curriculum DeepSDF

When learning to sketch, beginners start with simple and flexible shapes, and then gradually strive for more complex and accurate ones in the subsequent training sessions. In this paper, we design a "shape curriculum" for learning continuous Signed Distance Function (SDF) on shapes, namely Curriculum DeepSDF. Inspired by how humans learn, Curriculum DeepSDF organizes the learning task in ascending order of difficulty according to the following two criteria: surface accuracy and sample difficulty. The former considers stringency in supervising with ground truth, while the latter regards the weights of hard training samples near complex geometry and fine structure. More specifically, Curriculum DeepSDF learns to reconstruct coarse shapes at first, and then gradually increases the accuracy and focuses more on complex local details. Experimental results show that a carefully-designed curriculum leads to significantly better shape reconstructions with the same training data, training epochs and network architecture as DeepSDF. We believe that the application of shape curricula can benefit the training process of a wide variety of 3D shape representation learning methods.

preprint2020arXiv

Dynamic Future Net: Diversified Human Motion Generation

Human motion modelling is crucial in many areas such as computer graphics, vision and virtual reality. Acquiring high-quality skeletal motions is difficult due to the need for specialized equipment and laborious manual post-posting, which necessitates maximizing the use of existing data to synthesize new data. However, it is a challenge due to the intrinsic motion stochasticity of human motion dynamics, manifested in the short and long terms. In the short term, there is strong randomness within a couple frames, e.g. one frame followed by multiple possible frames leading to different motion styles; while in the long term, there are non-deterministic action transitions. In this paper, we present Dynamic Future Net, a new deep learning model where we explicitly focuses on the aforementioned motion stochasticity by constructing a generative model with non-trivial modelling capacity in temporal stochasticity. Given limited amounts of data, our model can generate a large number of high-quality motions with arbitrary duration, and visually-convincing variations in both space and time. We evaluate our model on a wide range of motions and compare it with the state-of-the-art methods. Both qualitative and quantitative results show the superiority of our method, for its robustness, versatility and high-quality.

preprint2020arXiv

FASTSWARM: A Data-driven FrAmework for Real-time Flying InSecT SWARM Simulation

Insect swarms are common phenomena in nature and therefore have been actively pursued in computer animation. Realistic insect swarm simulation is difficult due to two challenges: high-fidelity behaviors and large scales, which make the simulation practice subject to laborious manual work and excessive trial-and-error processes. To address both challenges, we present a novel data-driven framework, FASTSWARM, to model complex behaviors of flying insects based on real-world data and simulate plausible animations of flying insect swarms. FASTSWARM has a linear time complexity and achieves real-time performance for large swarms. The high-fidelity behavior model of FASTSWARM explicitly takes into consideration the most common behaviors of flying insects, including the interactions among insects such as repulsion and attraction, the self-propelled behaviors such as target following and obstacle avoidance, and other characteristics such as the random movements. To achieve scalability, an energy minimization problem is formed with different behaviors modelled as energy terms, where the minimizer is the desired behavior. The minimizer is computed from the real-world data, which ensures the plausibility of the simulation results. Extensive simulation results and evaluations show that FASTSWARM is versatile in simulating various swarm behaviors, high fidelity measured by various metrics, easily controllable in inducing user controls and highly scalable.

preprint2020arXiv

Human-like Planning for Reaching in Cluttered Environments

Humans, in comparison to robots, are remarkably adept at reaching for objects in cluttered environments. The best existing robot planners are based on random sampling of configuration space -- which becomes excessively high-dimensional with large number of objects. Consequently, most planners often fail to efficiently find object manipulation plans in such environments. We addressed this problem by identifying high-level manipulation plans in humans, and transferring these skills to robot planners. We used virtual reality to capture human participants reaching for a target object on a tabletop cluttered with obstacles. From this, we devised a qualitative representation of the task space to abstract the decision making, irrespective of the number of obstacles. Based on this representation, human demonstrations were segmented and used to train decision classifiers. Using these classifiers, our planner produced a list of waypoints in task space. These waypoints provided a high-level plan, which could be transferred to an arbitrary robot model and used to initialise a local trajectory optimiser. We evaluated this approach through testing on unseen human VR data, a physics-based robot simulation, and a real robot (dataset and code are publicly available). We found that the human-like planner outperformed a state-of-the-art standard trajectory optimisation algorithm, and was able to generate effective strategies for rapid planning -- irrespective of the number of obstacles in the environment.

preprint2020arXiv

Informative Scene Decomposition for Crowd Analysis, Comparison and Simulation Guidance

Crowd simulation is a central topic in several fields including graphics. To achieve high-fidelity simulations, data has been increasingly relied upon for analysis and simulation guidance. However, the information in real-world data is often noisy, mixed and unstructured, making it difficult for effective analysis, therefore has not been fully utilized. With the fast-growing volume of crowd data, such a bottleneck needs to be addressed. In this paper, we propose a new framework which comprehensively tackles this problem. It centers at an unsupervised method for analysis. The method takes as input raw and noisy data with highly mixed multi-dimensional (space, time and dynamics) information, and automatically structure it by learning the correlations among these dimensions. The dimensions together with their correlations fully describe the scene semantics which consists of recurring activity patterns in a scene, manifested as space flows with temporal and dynamics profiles. The effectiveness and robustness of the analysis have been tested on datasets with great variations in volume, duration, environment and crowd dynamics. Based on the analysis, new methods for data visualization, simulation evaluation and simulation guidance are also proposed. Together, our framework establishes a highly automated pipeline from raw data to crowd analysis, comparison and simulation guidance. Extensive experiments and evaluations have been conducted to show the flexibility, versatility and intuitiveness of our framework.

preprint2020arXiv

MeshingNet: A New Mesh Generation Method based on Deep Learning

We introduce a novel approach to automatic unstructured mesh generation using machine learning to predict an optimal finite element mesh for a previously unseen problem. The framework that we have developed is based around training an artificial neural network (ANN) to guide standard mesh generation software, based upon a prediction of the required local mesh density throughout the domain. We describe the training regime that is proposed, based upon the use of \emph{a posteriori} error estimation, and discuss the topologies of the ANNs that we have considered. We then illustrate performance using two standard test problems, a single elliptic partial differential equation (PDE) and a system of PDEs associated with linear elasticity. We demonstrate the effective generation of high quality meshes for arbitrary polygonal geometries and a range of material parameters, using a variety of user-selected error norms.

preprint2020arXiv

Object-Centric Multi-View Aggregation

We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid. Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation, and then combined -- in a manner that can accommodate a variable number of views and is view order independent. We show that computing a symmetry-aware mapping from pixels to the canonical coordinate system allows us to better propagate information to unseen regions, as well as to robustly overcome pose ambiguities during inference. Our aggregate representation enables us to perform 3D inference tasks like volumetric reconstruction and novel view synthesis, and we use these tasks to demonstrate the benefits of our aggregation approach as compared to implicit or camera-centric alternatives.

preprint2020arXiv

Predicting the Physical Dynamics of Unseen 3D Objects

Machines that can predict the effect of physical interactions on the dynamics of previously unseen object instances are important for creating better robots and interactive virtual worlds. In this work, we focus on predicting the dynamics of 3D objects on a plane that have just been subjected to an impulsive force. In particular, we predict the changes in state - 3D position, rotation, velocities, and stability. Different from previous work, our approach can generalize dynamics predictions to object shapes and initial conditions that were unseen during training. Our method takes the 3D object's shape as a point cloud and its initial linear and angular velocities as input. We extract shape features and use a recurrent neural network to predict the full change in state at each time step. Our model can support training with data from both a physics engine or the real world. Experiments show that we can accurately predict the changes in state for unseen object geometries and initial conditions.

preprint2020arXiv

PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions

3D generative shape modeling is a fundamental research area in computer vision and interactive computer graphics, with many real-world applications. This paper investigates the novel problem of generating 3D shape point cloud geometry from a symbolic part tree representation. In order to learn such a conditional shape generation procedure in an end-to-end fashion, we propose a conditional GAN "part tree"-to-"point cloud" model (PT2PC) that disentangles the structural and geometric factors. The proposed model incorporates the part tree condition into the architecture design by passing messages top-down and bottom-up along the part tree hierarchy. Experimental results and user study demonstrate the strengths of our method in generating perceptually plausible and diverse 3D point clouds, given the part tree condition. We also propose a novel structural measure for evaluating if the generated shape point clouds satisfy the part tree conditions.

preprint2020arXiv

Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks

In this paper, we examine the long-neglected yet important effects of point sampling patterns in point cloud GANs. Through extensive experiments, we show that sampling-insensitive discriminators (e.g.PointNet-Max) produce shape point clouds with point clustering artifacts while sampling-oversensitive discriminators (e.g.PointNet++, DGCNN) fail to guide valid shape generation. We propose the concept of sampling spectrum to depict the different sampling sensitivities of discriminators. We further study how different evaluation metrics weigh the sampling pattern against the geometry and propose several perceptual metrics forming a sampling spectrum of metrics. Guided by the proposed sampling spectrum, we discover a middle-point sampling-aware baseline discriminator, PointNet-Mix, which improves all existing point cloud generators by a large margin on sampling-related metrics. We point out that, though recent research has been focused on the generator design, the main bottleneck of point cloud GAN actually lies in the discriminator design. Our work provides both suggestions and tools for building future discriminators. We will release the code to facilitate future research.

preprint2020arXiv

SAPIEN: A SimulAted Part-based Interactive ENvironment

Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements for robotics simulation with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training robot learning algorithm. Our work, SAPIEN, is a realistic and physics-rich simulated environment that hosts a large-scale set for articulated objects. Our SAPIEN enables various robotic vision and interaction tasks that require detailed part-level understanding.We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition as well as demonstrate robotic interaction tasks using heuristic approaches and reinforcement learning algorithms. We hope that our SAPIEN can open a lot of research directions yet to be explored, including learning cognition through interaction, part motion discovery, and construction of robotics-ready simulated game environment.

preprint2020arXiv

SMART: Skeletal Motion Action Recognition aTtack

Adversarial attack has inspired great interest in computer vision, by showing that classification-based solutions are prone to imperceptible attack in many tasks. In this paper, we propose a method, SMART, to attack action recognizers which rely on 3D skeletal motions. Our method involves an innovative perceptual loss which ensures the imperceptibility of the attack. Empirical studies demonstrate that SMART is effective in both white-box and black-box scenarios. Its generalizability is evidenced on a variety of action recognizers and datasets. Its versatility is shown in different attacking strategies. Its deceitfulness is proven in extensive perceptual studies. Finally, SMART shows that adversarial attack on 3D skeletal motion, one type of time-series data, is significantly different from traditional adversarial attack problems.

preprint2020arXiv

The nonequilibrium back-reaction of Hawking radiation to a Schwarzschild black hole

We investigate the nonequilibrium back reaction on the Schwarzschild black hole from the radiation field. The back reactions are characterized by the membrane close to the black hole. When the membrane is thin, we found that larger temperature difference can lead to more significant negative surface tension, larger thermodynamic dissipation cost and back reaction in energy and entropy as well as larger black hole area. This may be relevant to the primordial black holes in early universe. Moreover, our nonequilibrium model can resolve the inconsistency issue of the black hole back reaction under zero mass limit in the equilibrium case. In the thick membrane case, the nonequilibrium back reaction is found to be more significant than that in the thin membrane case. The nonequilibrium temperature difference can increase the energy and entropy loss as well as the thermodynamic dissipation of the black hole and the membrane back reactions. The nonequilibrium dissipation cost characterized by the entropy production rate appears to be significant compared to the entropy rate radiated by the black hole under finite temperature difference. This may shed light on the black hole information paradox due to the information loss from the entropy production rate in the nonequilibirum cases. The nonequilibrium thermodynamic fluctuations can also reflect the effects of the back-reactions of the Hawking radiation on the evolution of a black hole.

preprint2020arXiv

The Online Saddle Point Problem and Online Convex Optimization with Knapsacks

We study the online saddle point problem, an online learning problem where at each iteration a pair of actions need to be chosen without knowledge of the current and future (convex-concave) payoff functions. The objective is to minimize the gap between the cumulative payoffs and the saddle point value of the aggregate payoff function, which we measure using a metric called "SP-Regret". The problem generalizes the online convex optimization framework but here we must ensure both players incur cumulative payoffs close to that of the Nash equilibrium of the sum of the games. We propose an algorithm that achieves SP-Regret proportional to $\sqrt{\ln(T)T}$ in the general case, and $\log(T)$ SP-Regret for the strongly convex-concave case. We also consider the special case where the payoff functions are bilinear and the decision sets are the probability simplex. In this setting we are able to design algorithms that reduce the bounds on SP-Regret from a linear dependence in the dimension of the problem to a \textit{logarithmic} one. We also study the problem under bandit feedback and provide an algorithm that achieves sublinear SP-Regret. We then consider an online convex optimization with knapsacks problem motivated by a wide variety of applications such as: dynamic pricing, auctions, and crowdsourcing. We relate this problem to the online saddle point problem and establish $O(\sqrt{T})$ regret using a primal-dual algorithm.

preprint2019arXiv

Competing Against Equilibria in Zero-Sum Games with Evolving Payoffs

We study the problem of repeated play in a zero-sum game in which the payoff matrix may change, in a possibly adversarial fashion, on each round; we call these Online Matrix Games. Finding the Nash Equilibrium (NE) of a two player zero-sum game is core to many problems in statistics, optimization, and economics, and for a fixed game matrix this can be easily reduced to solving a linear program. But when the payoff matrix evolves over time our goal is to find a sequential algorithm that can compete with, in a certain sense, the NE of the long-term-averaged payoff matrix. We design an algorithm with small NE regret--that is, we ensure that the long-term payoff of both players is close to minimax optimum in hindsight. Our algorithm achieves near-optimal dependence with respect to the number of rounds and depends poly-logarithmically on the number of available actions of the players. Additionally, we show that the naive reduction, where each player simply minimizes its own regret, fails to achieve the stated objective regardless of which algorithm is used. We also consider the so-called bandit setting, where the feedback is significantly limited, and we provide an algorithm with small NE regret using one-point estimates of each payoff matrix.

preprint2019arXiv

Gravitational wave signal recognition of O1 data by deep learning

Deep learning method develops very fast as a tool for data analysis these years. Such a technique is quite promising to treat gravitational wave detection data. There are many works already in the literature which used deep learning technique to process simulated gravitational wave data. In this paper we apply deep learning to LIGO O1 data. In order to improve the weak signal recognition we adjust the convolutional neural network (CNN) a little bit. Our adjusted convolutional neural network admits comparable accuracy and efficiency of signal recognition as other deep learning works published in the literature. Based on our adjusted CNN, we can clearly recognize the eleven confirmed gravitational wave events included in O1 and O2. And more we find about 2000 gravitational wave triggers in O1 data.

preprint2018arXiv

Novelty Detection Meets Collider Physics

Novelty detection is the machine learning task to recognize data, which belong to an unknown pattern. Complementary to supervised learning, it allows to analyze data model-independently. We demonstrate the potential role of novelty detection in collider physics, using autoencoder-based deep neural network. Explicitly, we develop a set of density-based novelty evaluators, which are sensitive to the clustering of unknown-pattern testing data or new-physics signal events, for the design of detection algorithms. We also explore the influence of the known-pattern data fluctuations, arising from non-signal regions, on detection sensitivity. Strategies to address it are proposed. The algorithms are applied to detecting fermionic di-top partner and resonant di-top productions at LHC, and exotic Higgs decays of two specific modes at a $e^+e^-$ future collider. With parton-level analysis, we conclude that potentially the new-physics benchmarks can be recognized with high efficiency.