Source author record

Peng Wang

Peng Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

What is connected

250 works

54 topics

4 close collaborators



preprint, 2026, arXiv

Ability Transfer and Recovery via Modularized Parameters Localization

Large language models can be continually pre-trained or fine-tuned to improve performance in specific domains, languages, or skills, but this specialization often degrades other capabilities and may cause catastrophic forgetting. We investigate how abilities are distributed within LLM parameters by analyzing module activations under domain- and language-specific inputs for closely related models. Across layers and modules, we find that ability-related activations are highly concentrated in a small set of channels (typically <5%), and these channels are largely disentangled with good sufficiency and stability. Building on these observations, we propose ACT (Activation-Guided Channel-wise Ability Transfer), which localizes ability-relevant channels via activation differences and selectively transfers only the corresponding parameters, followed by lightweight fine-tuning for compatibility. Experiments on multilingual mathematical and scientific reasoning show that ACT can recover forgotten abilities while preserving retained skills. It can also merge multiple specialized models to integrate several abilities into a single model with minimal interference. Our code and data will be publicly released.
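The localization step described in this abstract (rank channels by activation difference, keep a small fraction, copy only the matching parameters) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `localize_channels` and `transfer_channels`, the use of mean absolute activation difference as the ranking score, and the row-per-channel weight layout are all assumptions made for the example.

```python
def localize_channels(base_acts, expert_acts, fraction=0.05):
    """Return indices of the channels whose mean activation differs most.

    base_acts / expert_acts: lists of per-input activation vectors of equal width,
    recorded from two closely related models on the same inputs (assumed setup).
    """
    n_channels = len(base_acts[0])

    def channel_means(acts):
        return [sum(row[c] for row in acts) / len(acts) for c in range(n_channels)]

    base_mean = channel_means(base_acts)
    expert_mean = channel_means(expert_acts)
    diffs = [abs(e - b) for b, e in zip(base_mean, expert_mean)]

    # Keep only the top `fraction` of channels (at least one).
    k = max(1, int(n_channels * fraction))
    ranked = sorted(range(n_channels), key=lambda c: diffs[c], reverse=True)
    return sorted(ranked[:k])


def transfer_channels(target_weight, source_weight, channels):
    """Copy the selected channel rows from source into a copy of target."""
    merged = [row[:] for row in target_weight]
    for c in channels:
        merged[c] = source_weight[c][:]
    return merged
```

The abstract also mentions a lightweight fine-tuning stage after the transfer; that stage is not covered by this sketch.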

preprint, 2026, arXiv

BOLT: Online Lightweight Adaptation for Preparation-Free Heterogeneous Cooperative Perception

Most existing heterogeneous cooperative perception methods depend on prior preparation like offline joint training or tailored collaborator-model adaptation. Such preprocessing is, however, generally impractical in real scenarios, as agents are usually independently trained by different developers and meet occasionally online. This work investigates preparation-free heterogeneous cooperative perception, where agents use independently trained single-agent detectors without any pre-deployment coordination. We find direct cross-agent fusion under this setting greatly underperforms ego-only perception. We present BOLT, a lightweight plug-and-play module that adapts neighboring features online via ego-as-teacher distillation, requiring only ego predictions without ground-truth labels. BOLT leverages high-confidence ego perception features to guide cross-agent feature-domain alignment, while enabling neighbors to contribute features in the ego's low-confidence regions. With only 0.9M trainable parameters, BOLT improves AP@50 by up to 32.3 points over vanilla unadapted fusion in the preparation-free setting. It consistently outperforms ego-only results on DAIR-V2X and OPV2V, across different encoder pairs and fusion strategies. Code: https://github.com/sidiangongyuan/BOLT.

preprint, 2026, arXiv

CA-GCL: Cross-Anatomy Global-Local Contrastive Learning for Robust 3D Medical Image Understanding

Fine-grained Vision-Language Pre-training (FVLP) demonstrates significant potential in 3D medical image understanding by aligning anatomy-level visual representations with corresponding textual descriptions. However, existing FVLP paradigms often suffer from severe representation collapse in the textual embedding space, where text embeddings of distinct anatomical structures become highly clustered and indistinguishable. This distributional degeneracy renders the model hypersensitive to prompt variations, hindering reliable clinical deployment. To address these challenges, we propose a novel Cross-Anatomy Global-Local Contrastive Learning framework (CA-GCL). CA-GCL introduces a global contrastive objective that enforces separation between anatomical categories in the latent space, effectively counteracting the aggregation tendency induced by local alignment. Furthermore, we incorporate a clinical-aware text augmentation strategy based on permutation invariance and partial completeness to enhance robustness against descriptive incompleteness. Extensive evaluations on the CT-RATE and Rad-ChestCT datasets demonstrate that CA-GCL consistently outperforms existing VLP paradigms in zero-shot abnormality detection, achieving superior performance while exhibiting strong cross-dataset generalization. Crucially, CA-GCL reduces performance variance across diverse prompt templates, transforming the collapsed textual similarity distribution into a bell-shaped distribution. These results validate CA-GCL as an effective framework for robust 3D medical image understanding.

preprint, 2026, arXiv

Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei

Self-destructive behaviors are linked to complex psychological states and can be challenging to diagnose. These behaviors may be even harder to identify within subcultural groups due to their unique expressions. As large language models (LLMs) are applied across various fields, some researchers have begun exploring their application for detecting self-destructive behaviors. Motivated by this, we investigate self-destructive behavior detection within subcultures using current LLM-based methods. However, these methods face two main challenges: (1) Knowledge Lag: subcultural slang evolves rapidly, faster than LLMs' training cycles; and (2) Semantic Misalignment: it is challenging to grasp the specific and nuanced expressions unique to subcultures. To address these issues, we propose the Subcultural Alignment Solver (SAS), a multi-agent framework that incorporates automatic retrieval and subculture alignment, significantly enhancing the performance of LLMs in detecting self-destructive behavior. Our experimental results show that SAS outperforms the current advanced multi-agent framework OWL. Notably, it competes well with fine-tuned LLMs. We hope that SAS will advance the field of self-destructive behavior detection in subcultural contexts and serve as a valuable resource for future researchers.

preprint, 2026, arXiv

DecoRec: Decomposed 3D Scene Reconstruction from Single-View Images via Object-Level Diffusion

In this paper, we introduce DecoRec, a novel system designed to elevate single-view 2D images to a decomposed 3D scene mesh. Current methods for single-view scene reconstruction typically rely on object retrieval or the regression of coarse 3D voxels or surfaces, leading to inaccuracies in capturing the appearance and geometry of the input image. The lack of high-quality large-scale scene-level datasets further complicates direct 3D scene generation from single-view images. To achieve high-quality 3D scene generation from a single-view image, DecoRec takes advantage of recent diffusion-based single-view object reconstruction methods to reconstruct individual objects separately. Subsequently, a refinement pipeline is proposed to effectively merge these reconstructed objects, enhancing appearance and geometry through a differentiable rendering technique and diffusion-guided refinement. Our results demonstrate that DecoRec facilitates high-quality single-view scene reconstruction in both geometry and novel synthesis, offering significant benefits for downstream applications like room interior design.

preprint, 2026, arXiv

Generative 3D Gaussians with Learned Density Control

We present Density-Sampled Gaussians (DeG), a novel 3D representation designed to bridge the gap between adaptive rendering primitives and scalable generative modeling. Unlike existing approaches that constrain 3D Gaussians to fixed voxel grids or arrays, DeG models Gaussian centers as samples from a learnable probability density function defined over an octree. This formulation provides a rigorous mathematical framework for adaptive density control: by jointly optimizing the spatial density and Gaussian attributes under rendering supervision, our model naturally concentrates primitives in regions of high geometric complexity. We achieve this via a new render loss contribution gradient that serves as a fully differentiable analogue to the discrete densification and pruning heuristics used in standard Gaussian Splatting. The resulting representation is highly flexible, supporting variable-resolution decoding from a single latent code by simply adjusting the sampling budget. To enable generative synthesis, we train a latent diffusion model on DeG. We identify a critical challenge in applying diffusion to unordered set-structured latents, which can significantly slow convergence, and propose VecSeq, a canonical re-indexing mechanism that anchors latent tokens to a deterministic 3D Sobol sequence. This transforms the ambiguous set-generation problem into a robust sequence modeling task. Extensive experiments demonstrate that our pipeline achieves state-of-the-art quality in single-image-to-3D generation, combining the structural adaptivity of unstructured primitives with the training stability of grid-based methods.

preprint, 2026, arXiv

GeoTopoDiff: Learning Geometry--Topology Graph Priors through Boundary-Constrained Mixed Diffusion for Sparse-Slice 3D Porous Reconstruction

Diffusion-based voxel prior modelling is challenging for the reconstruction of large-scale 3D porous microstructures. Because both the continuous pore morphology and the discrete pore-throat topology must be modelled simultaneously, diffusion models require fully observed CT scans to provide topology-faithful priors, which results in an inherent trade-off among throughput, topological fidelity, and field of view in practical industrial applications. We propose GeoTopoDiff, a graph diffusion-based framework for reconstructing 3D porous microstructures from sparse CT slices. GeoTopoDiff transfers the learning of diffusion priors from a voxel-based space to a mixed graph state space, which simultaneously encompasses continuous pore geometry and discrete pore-throat topology. A topology-aware partial graph prior from sparsely observed CT slices is introduced to constrain the reverse denoising process. Experiments on anisotropic PTFE and Fontainebleau sandstone show that GeoTopoDiff reduces morphology-related errors by 19.8% and topology-sensitive transport errors by 36.5% on average. Our findings suggest that the mixed graph state space helps the diffusion denoising process reduce posterior uncertainty under sparse observations. All models and code have been made publicly available to facilitate the exploration of diffusion models in the field of 3D porous microstructure simulation.

preprint, 2026, arXiv

GP-GS: Gaussian Processes Densification for 3D Gaussian Splatting

3D Gaussian Splatting (3DGS) enables photorealistic rendering but suffers from artefacts due to sparse Structure-from-Motion (SfM) initialisation. To address this limitation, we propose GP-GS, a Gaussian Process (GP) based densification framework for 3DGS optimisation. GP-GS formulates point cloud densification as a continuous regression problem, where a GP learns a local mapping from 2D pixel coordinates to 3D position and colour attributes. An adaptive neighbourhood-based sampling strategy generates candidate pixels for inference, while GP-predicted uncertainty is used to filter unreliable predictions, reducing noise and preserving geometric structure. Extensive experiments on synthetic and real-world benchmarks demonstrate that GP-GS consistently improves reconstruction quality and rendering fidelity, achieving up to 1.12 dB PSNR improvement over strong baselines.

preprint, 2026, arXiv

GSMap: 2D Gaussians for Online HD Mapping

Accurate High-Definition (HD) map construction is critical for autonomous driving, yet existing methods face a fundamental trade-off: vectorization-based approaches preserve topology but struggle with geometric fidelity, while rasterization-based approaches enable precise geometric supervision but produce unstructured outputs. To bridge this gap, we propose GSMap, a novel framework that unifies both paradigms via a learnable 2D Gaussian representation. Each map element is modeled as an ordered sequence of 2D Gaussians, whose centers correspond to the vertices of the vectorized polyline/polygon. This formulation enables simultaneous optimization through: (1) Differentiable rasterization that enforces pixel-level geometric constraints, and (2) Topology-aware vectorization that maintains structural regularity. Experiments on both nuScenes and Argoverse2 demonstrate that our Gaussian-based representation effectively unifies geometric and topological learning, achieving significant performance improvements and demonstrating strong compatibility with existing HD mapping architectures. Code will be available at https://github.com/peakpang/GSMap

preprint, 2026, arXiv

HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection

The rapid evolution of generative models has enabled the creation of highly realistic and diverse synthetic images, posing significant challenges to reliable and generalizable Synthetic Image Detection (SID). However, existing detectors are typically trained on limited and biased datasets, resulting in poor generalization to unseen generators. To address this issue, we propose HiMix, a unified framework that enhances generalization by expanding the training distribution and promoting artifact-aware representations. Specifically, the Mixup-driven Distributional Augmentation (MDA) module constructs continuous transitional samples between real and fake images, improving coverage of low-confidence regions and exposing the model to more challenging samples, while the pixel-wise mixup operation smoothly perturbs semantics to enhance sensitivity to low-level artifacts. Moreover, the Hierarchical Artifact-aware Representation (HAR) module aggregates artifact information from both global and local levels through cross-layer integration and coarse-to-fine feature fusion, enabling the extraction of discriminative forgery representations under diverse distributions. Extensive experiments across multiple benchmarks demonstrate that HiMix achieves state-of-the-art performance, establishing well-separated logits for improved generalization to unseen forgeries.
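The pixel-wise mixup operation mentioned in this abstract can be illustrated with the generic mixup formula, mixed = lam * real + (1 - lam) * fake. The sketch below assumes images are nested lists of floats and assigns the soft label 1 - lam as the "fake" probability. Both choices, and the name `pixelwise_mixup`, are assumptions for illustration rather than details taken from the HiMix paper.

```python
def pixelwise_mixup(real, fake, lam):
    """Blend two equally sized images pixel by pixel.

    real, fake: 2D lists of pixel intensities (hypothetical input format).
    lam: mixing coefficient in [0, 1]; lam = 1 returns the real image.
    Returns the mixed image and a soft "fake" label (an assumed labelling rule).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mixed = [
        [lam * r + (1.0 - lam) * f for r, f in zip(real_row, fake_row)]
        for real_row, fake_row in zip(real, fake)
    ]
    soft_label = 1.0 - lam  # share of the fake image in the blend
    return mixed, soft_label
```

Sweeping lam between 0 and 1 yields the continuous transitional samples between real and fake images that the abstract describes.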

preprint, 2026, arXiv

Is One Score Enough? Rethinking the Evaluation of Sequentially Evolving LLM Memory

Memory plays a central role in enabling large language models (LLMs) to operate over sequential tasks by accumulating and reusing experience over time. However, existing evaluations of LLM memory mostly rely on aggregate metrics such as final hold-out accuracy or cumulative online performance, which can obscure critical failure modes such as forgetting and negative transfer. In this paper, we introduce SeqMem-Eval, a diagnostic evaluation framework for sequentially evolving LLM memory. Drawing inspiration from continual learning, it targets a test-time setting in which memory is external, prompt-mediated, and updated without modifying model parameters. Rather than focusing only on final performance, SeqMem-Eval evaluates how memory states evolve, generalize, consolidate experience, and retain useful information during sequential inference. Specifically, it measures online utility, hold-out generalization, backward transfer, and forgetting, providing a finer-grained view of memory quality. Through extensive experiments across diverse tasks and memory methods, we show that higher final or cumulative accuracy does not necessarily imply better memory quality: many methods exhibit strong performance gains while suffering from substantial forgetting or negative transfer. Moreover, different memory designs exhibit distinct trade-offs between adaptability and stability that remain invisible under standard evaluation metrics.
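To make the backward-transfer and forgetting measurements concrete, the sketch below uses the standard continual-learning definitions over an accuracy matrix. It assumes `acc[i][j]` is the accuracy on task j measured with the memory state after step i. Whether SeqMem-Eval uses exactly these formulas is an assumption, and the function names are hypothetical.

```python
def final_average_accuracy(acc):
    """Mean accuracy over all tasks using the final memory state."""
    last = acc[-1]
    return sum(last) / len(last)


def backward_transfer(acc):
    """Average change on earlier tasks between when they were first seen and the end.

    Negative values indicate that later updates hurt earlier tasks.
    """
    t = len(acc)
    return sum(acc[t - 1][j] - acc[j][j] for j in range(t - 1)) / (t - 1)


def forgetting(acc):
    """Average drop from the best earlier accuracy on each task to its final accuracy."""
    t = len(acc)
    total = 0.0
    for j in range(t - 1):
        best_earlier = max(acc[i][j] for i in range(j, t - 1))
        total += best_earlier - acc[t - 1][j]
    return total / (t - 1)


# Hypothetical 3-task run: rows are memory states, columns are tasks.
acc = [
    [0.8, 0.0, 0.0],
    [0.6, 0.9, 0.0],
    [0.7, 0.5, 0.9],
]
summary = (final_average_accuracy(acc), backward_transfer(acc), forgetting(acc))
```

In this made-up matrix the final average accuracy looks reasonable while backward transfer is negative and forgetting is positive, which is the kind of failure a single aggregate score can hide.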

preprint, 2026, arXiv

LLMs Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

Previous research has shown that LLMs finetuned on malicious or incorrect completions within narrow domains (e.g., insecure code or incorrect medical advice) can become broadly misaligned and exhibit harmful behaviors, a phenomenon called emergent misalignment. In this work, we investigate whether this phenomenon can extend beyond safety behaviors to a broader spectrum of dishonesty and deception under high-stakes scenarios (e.g., lying under pressure and deceptive behavior). To explore this, we finetune open-sourced LLMs on misaligned completions across diverse domains. Experimental results demonstrate that LLMs show broadly misaligned behavior in dishonesty. We further explore this phenomenon in a downstream combined finetuning setting, and find that introducing as little as 1% of misalignment data into a standard downstream task is sufficient to decrease honest behavior by over 20%. Furthermore, we consider a more practical human-AI interaction environment where we simulate both benign and biased users to interact with the assistant LLM. Notably, we find that the assistant can be misaligned unintentionally to exacerbate its dishonesty with only a 10% biased user population. In summary, we extend the study of emergent misalignment to the domain of dishonesty and deception under high-stakes scenarios, and demonstrate that this risk arises not only through direct finetuning, but also in downstream mixture tasks and practical human-AI interactions. Refer to https://github.com/hxhcreate/LLM_Deceive_Unintentionally for experimental resources.

preprint, 2026, arXiv

PDR: A Plug-and-Play Positional Decay Framework for LLM Pre-training Data Detection

Detecting pre-training data in Large Language Models (LLMs) is crucial for auditing data privacy and copyright compliance, yet it remains challenging in black-box, zero-shot settings where computational resources and training data are scarce. While existing likelihood-based methods have shown promise, they typically aggregate token-level scores using uniform weights, thereby neglecting the inherent information-theoretic dynamics of autoregressive generation. In this paper, we hypothesize and empirically validate that memorization signals are heavily skewed towards the high-entropy initial tokens, where model uncertainty is highest, and decay as context accumulates. To leverage this linguistic property, we introduce Positional Decay Reweighting (PDR), a training-free and plug-and-play framework. PDR explicitly reweights token-level scores to amplify distinct signals from early positions while suppressing noise from later ones. Extensive experiments show that PDR acts as a robust prior and can usually enhance a wide range of advanced methods across multiple benchmarks.
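The reweighting idea can be sketched by comparing a uniformly averaged token score with a position-weighted one. The exponential decay `exp(-alpha * t)`, the parameter `alpha`, and the function names are assumptions chosen for illustration. The abstract does not specify the decay schedule, and PDR is described as plugging into existing likelihood-based scores rather than replacing them.

```python
import math


def uniform_score(token_log_probs):
    """Baseline: average the token-level log-probabilities with equal weights."""
    return sum(token_log_probs) / len(token_log_probs)


def positional_decay_score(token_log_probs, alpha=0.1):
    """Weighted average that emphasizes early positions.

    alpha: hypothetical decay rate; alpha = 0 recovers the uniform average.
    """
    weights = [math.exp(-alpha * t) for t in range(len(token_log_probs))]
    total = sum(weights)
    return sum(w * lp for w, lp in zip(weights, token_log_probs)) / total
```

With a sequence whose early tokens are unusually well predicted, the decayed score moves toward those early log-probabilities, so a downstream membership threshold would see a stronger signal than under uniform averaging.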

preprint, 2026, arXiv

Softmax-GS: Generalized Gaussians Learning When to Blend or Bound

3D Gaussian Splatting (3D GS) is widely adopted for novel view synthesis due to its high training and rendering efficiency. However, its efficiency relies on the key assumption that Gaussians do not overlap in the 3D space, which leads to noticeable artifacts and view inconsistencies. In addition, the inherently diffuse boundaries of Gaussians hinder accurate reconstruction of sharp object edges. We propose Softmax-GS, a unified solution that addresses both the view-inconsistency and the diffuse-boundary problem by enforcing a softmax-based competition in overlapping regions between two Gaussians. With learnable parameters controlling the strength of the competition, it enables a continuous spectrum from smooth color blending to crisp, well-defined boundaries. Our formulation explicitly preserves order invariance for any two overlapping Gaussians and ensures that the output transmittance remains unchanged irrespective of the extent of overlapping, preventing undesirable discontinuities in the rendered output. Ablation experiments on simple geometries demonstrate the effectiveness of each component of Softmax-GS, and evaluations on real-world benchmarks show that it achieves state-of-the-art performance, improving both reconstruction quality and parameter efficiency.

preprint, 2026, arXiv

Towards Robust Sequential Decomposition for Complex Image Editing

Recent advances in visual generative models have enabled high-fidelity image editing guided by human instructions. However, these models often struggle with complex instructions involving combinatorial editing operations or inter-step dependencies. This difficulty stems from the limitations of two canonical paradigms: (1) single-turn editing, which attempts to apply all instructed edits in one pass, often fails to parse the complex instruction accurately and causes undesired edits; and (2) sequential editing can decompose the task into simpler steps but suffers from compounding errors introduced by the sequential execution, leading to low-fidelity results. To derive a robust solution for complex image editing, we examine editing behaviors of different paradigms under a unified in-context editing framework, and study how the benefits of sequential decomposition can be balanced against its error-accumulation drawbacks. We further develop a synthetic data pipeline that constructs editing tasks of varying instruction complexity, allowing us to curate a large-scale editing dataset with high-quality decomposed sequences. By finetuning on synthetic data, we discovered that with properly designed editing paradigms, sequential decomposition yields robust improvements even as task complexity increases. Furthermore, the decomposition skills learned from synthetic tasks can transfer to real images by co-training with real-world editing data, demonstrating the promise of sim-to-real generalization for tackling complex image editing across broader domains.

preprint, 2025, arXiv

Quaternion Approximation Networks for Enhanced Image Classification and Oriented Object Detection

This paper introduces Quaternion Approximate Networks (QUAN), a novel deep learning framework that leverages quaternion algebra for rotation equivariant image classification and object detection. Unlike conventional quaternion neural networks attempting to operate entirely in the quaternion domain, QUAN approximates quaternion convolution through Hamilton product decomposition using real-valued operations. This approach preserves geometric properties while enabling efficient implementation with custom CUDA kernels. We introduce Independent Quaternion Batch Normalization (IQBN) for training stability and extend quaternion operations to spatial attention mechanisms. QUAN is evaluated on image classification (CIFAR-10/100, ImageNet), object detection (COCO, DOTA), and robotic perception tasks. In classification tasks, QUAN achieves higher accuracy with fewer parameters and faster convergence compared to existing convolution and quaternion-based models. For object detection, QUAN demonstrates improved parameter efficiency and rotation handling over standard Convolutional Neural Networks (CNNs) while establishing the SOTA for quaternion CNNs in this downstream task. These results highlight its potential for deployment in resource-constrained robotic systems requiring rotation-aware perception and application in other domains.
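The Hamilton product that underlies this decomposition can be written with real-valued operations only. The sketch below shows the scalar case for two quaternions stored as `(real, i, j, k)` tuples. A quaternion convolution would apply the same sign pattern to four real weight tensors and four real feature maps, but that extension, and the function name `hamilton_product`, are illustrative assumptions rather than QUAN's actual kernels.

```python
def hamilton_product(p, q):
    """Multiply two quaternions given as (real, i, j, k) tuples of floats."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# The basis identities i * j = k and j * i = -k show that the product is not commutative.
i_unit = (0.0, 1.0, 0.0, 0.0)
j_unit = (0.0, 0.0, 1.0, 0.0)
ij = hamilton_product(i_unit, j_unit)
ji = hamilton_product(j_unit, i_unit)
```

Each output component is a signed sum of four real products, so the full product costs sixteen real multiplications.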

preprint, 2024, arXiv

Interferometric Signatures of Black Holes with Multiple Photon Spheres

It has been reported that the photon ring structure in black hole images produces strong and universal interferometric signatures on long interferometric baselines, holding promise for measuring black hole parameters and testing general relativity. This paper investigates the interferometric signatures of black holes with one or two photon spheres, specifically within the framework of Einstein-Maxwell-Scalar models. Notably, for black holes possessing two photon spheres, interference between light rays orbiting the inner and outer photon spheres manifests as beat signals in the visibility amplitude, deviating from the universal signatures observed in the single-photon sphere case.

preprint, 2024, arXiv

Timelike entanglement entropy and $T\bar{T}$ deformation

In a previous work (arXiv:1811.07758) on the $T\bar{T}$ deformed CFT$_2$, we found from the consistency requirement of the entanglement entropy theory that, in addition to the usual spacelike entanglement entropy, a timelike entanglement entropy must be introduced and treated equally. Inspired by the recent explicit constructions of the timelike entanglement entropy and its bulk dual, we provide a comprehensive analysis of the timelike and spacelike entanglement entropies in the $T\bar{T}$ deformed finite size system and finite temperature system. The results confirm our prediction that in the finite size system only the timelike entanglement entropy receives a correction, while in the finite temperature system only the usual spacelike entanglement entropy gets a correction. These findings affirm the necessity of a complete measure including both spacelike and timelike entanglement entropies.

preprint, 2024, arXiv

Timelike entanglement entropy in dS$_3$/CFT$_2$

In the context of dS$_3$/CFT$_2$, we propose a timelike entanglement entropy defined by the renormalization group flow. This timelike entanglement entropy is calculated in CFT by using the Callan-Symanzik equation. We find an exact match between this entanglement entropy and the length of a timelike geodesic connecting two different spacelike surfaces in dS$_3$. The counterpart of this entanglement entropy in AdS$_3$ is a spacelike one, which is also induced by RG flow and extends all the way into the bulk of AdS$_3$. As a result, in both AdS$_3$/CFT$_2$ and dS$_3$/CFT$_2$, there exist exactly three entanglement entropies, providing precisely sufficient information to reconstruct the three-dimensional bulk geometry.

preprint, 2023, arXiv

Classification of Minimal Immersions of Conformally Flat $3$-Tori and $4$-Tori in Spheres by The First Eigenfunctions

This paper is devoted to the study of minimal immersions of flat $n$-tori into spheres, especially those immersed by the first eigenfunctions (such an immersion is called a $λ_1$-minimal immersion), which also play important roles in spectral geometry. It is known that there are only two non-congruent $λ_1$-minimal $2$-tori in spheres, which are both flat. In the higher-dimensional case, the Clifford $n$-torus in $\mathbb{S}^{2n-1}$ might be the only known example in the literature. In this paper, by discussing the general construction of homogeneous minimal flat $n$-tori in spheres, we construct many new examples of $λ_1$-minimal flat $3$-tori and $4$-tori. In contrast to the rigidity in the case of $2$-tori, we show that there exists a $2$-parameter family of non-congruent $λ_1$-minimal flat $4$-tori. It turns out that the examples we constructed exhaust all $λ_1$-minimal immersions of conformally flat $3$-tori and $4$-tori in spheres. The classification involves some detailed investigations of shortest vectors in lattices, which can also be used to solve Berger's problem on flat $3$-tori and $4$-tori. The dilation-invariant functional $λ_1(g)V(g)^{\frac{2}{n}}$ of the first eigenvalue is proved to have a maximal value among all flat $3$-tori and $4$-tori.

preprint, 2022, arXiv

A Wearable ECG Monitor for Deep Learning Based Real-Time Cardiovascular Disease Detection

Cardiovascular disease has become one of the most significant threats endangering human life and health. Recently, Electrocardiogram (ECG) monitoring has been transformed into remote cardiac monitoring by Holter surveillance. However, the widely used Holter monitors can bring a great deal of discomfort and inconvenience to the individuals who carry them. In this work, we developed a new wireless ECG patch and applied a deep learning framework based on the Convolutional Neural Network (CNN) and Long Short-term Memory (LSTM) models. However, we find that models using existing techniques are not able to differentiate two main heartbeat types (supraventricular premature beat and atrial fibrillation) in our newly obtained dataset, resulting in a low accuracy of 58.0 %. We proposed a semi-supervised method that processes badly labelled data samples using confidence-level-based training. The experimental results show that the proposed method can reach an average accuracy of 90.2 %, i.e., 5.4 % higher than the accuracy of conventional ECG classification methods.

preprint, 2022, arXiv

Anisotropic satellite accretion onto the Local Group with HESTIA

How the cosmic web feeds halos and fuels galaxy formation is an open question with wide implications. This study explores the mass assembly in the Local Group within the context of the local cosmography by employing simulations whose initial conditions have been constrained to reproduce the local environment. The goal of this study is to inspect whether the direction of accretion of satellites on to the Milky Way and Andromeda galaxies is related to the cosmic web. The analysis considers the three high-resolution simulations available in the HESTIA simulation suite, as well as the derived velocity shear and tidal tensors. We notice two eras in the Local Group accretion history, delimited by an epoch around $z \approx 0.7$. We also find that satellites can travel up to $\sim 4$ Mpc, relative to their parent halo, before crossing its virial radius $R_{200}$. Finally, we observe a strong alignment of the infall direction with the axis of slowest collapse $\vec{e_3}$ of both tidal and shear tensors, implying satellites of the Local Group originated from one particular region of the cosmic web and were channeled towards us via the process of accretion. This alignment is dominated by the satellites that enter during the early infall era, i.e., $z>0.7$.

preprint, 2022, arXiv

Appearance of an Infalling Star in Black Holes with Multiple Photon Spheres

Photon spheres play a pivotal role in the imaging of luminous objects near black holes. In this paper, we examine observational appearances of a star freely falling in hairy black holes, which can possess one or two photon spheres outside the event horizon. When there exists a single photon sphere, the total luminosity measured by distant observers decreases exponentially with time at late times. Due to successive arrivals of photons orbiting around the photon sphere different numbers of times, a specific observer would see a series of light flashes with decreasing intensity, which share a similar frequency content. In the case with two photon spheres, by contrast, photons temporarily trapped between the photon spheres can cause a peak of the total luminosity, which is followed by a slow exponential decay at late times. In addition, these photons lead to one more series of light flashes seen by the specific observer.

preprint, 2022, arXiv

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

Recently, online shopping has gradually become a common way of shopping for people all over the world. Wonderful merchandise advertisements often attract more people to buy. These advertisements properly integrate multimodal multi-structured information of commodities, such as visual spatial information and fine-grained structure information. However, traditional multimodal text generation focuses on the conventional description of what existed and happened, which does not match the requirements of advertisement copywriting in the real world, because advertisement copywriting has a vivid language style and higher requirements of faithfulness. Unfortunately, there is a lack of reusable evaluation frameworks and a scarcity of datasets. Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires and supports much more detailed information in text generation. Notably, it is one of the largest video captioning datasets in this field. Accordingly, we propose a baseline method and a faithfulness evaluation metric based on structured information reasoning to meet the real-world demand on this dataset. The method surpasses the previous methods by a large margin on all metrics. The dataset and method are coming soon at https://e-mmad.github.io/e-mmad.net/index.html.

preprint2022arXiv

Balanced control between performance and saturation for constrained nonlinear systems

This paper addresses the balanced control between performance and saturation for a class of constrained nonlinear systems, comprising two branches: balanced command filtered backstepping (BCFB) and balanced performance control (BPC). To balance the interconnection and conflict between performance and saturation constraints, we define a performance safety evaluation (PSE) function, which evaluates the system safety under destabilizing effect variables (DEVs) such as saturation quantity and filter errors; the cumulative effects of the DEVs are then fully utilized and compensated for performance recovery. Specifically, there exists some degree of tolerance for the DEVs in the safety region, and the compensation operation works when the evaluation indicates that the system is becoming unsafe. The advantages of the proposed methodology are illustrated in a numerical simulation.

preprint2022arXiv

CapOnImage: Context-driven Dense-Captioning on Image

Existing image captioning systems are dedicated to generating narrative captions for images, which are spatially detached from the image in presentation. However, texts can also be used as decorations on the image to highlight the key points and increase the attractiveness of images. In this work, we introduce a new task called captioning on image (CapOnImage), which aims to generate dense captions at different locations of the image based on contextual information. To fully exploit the surrounding visual context to generate the most suitable caption for each location, we propose a multi-modal pre-training model with multi-level pre-training tasks that progressively learn the correspondence between texts and image locations from easy to difficult. Since the model may generate redundant captions for nearby locations, we further enhance the location embedding with neighbor locations as context. For this new task, we also introduce a large-scale benchmark called CapOnImage2M, which contains 2.1 million product images, each with an average of 4.8 spatially localized captions. Compared with other image captioning model variants, our model achieves the best results in both captioning accuracy and diversity aspects. We will make code and datasets public to facilitate future research.

preprint2022arXiv

Chaos bound and its violation in charged Kiselev black hole

The chaos bound in the near-horizon regions has been studied through the expansions of the metric functions on the horizon. In this paper, we investigate the chaos bound in the near-horizon region and at a certain distance from the horizon of a charged Kiselev black hole. The value of the Lyapunov exponent is accurately calculated by a Jacobian matrix. The angular momentum of a charged particle around the black hole affects not only the exponent, but also the position of the equilibrium orbit. This position gradually moves away from the horizon with the increase of the angular momentum. We find that the bound is violated at a certain distance from the horizon and there is no violation in the near-horizon region when the charge-to-mass ratio of the particle is fixed. A small value of the normalization factor is more likely to cause the violation.

preprint2022arXiv

Cold and Hot gas distribution around the Milky-Way-M31 system in the HESTIA simulations

Recent observations have revealed remarkable insights into the gas reservoir in the circumgalactic medium (CGM) of galaxy haloes. In this paper, we characterise the gas in the vicinity of Milky Way and Andromeda analogues in the HESTIA (High resolution Environmental Simulations of The Immediate Area) suite of constrained Local Group (LG) simulations. The HESTIA suite comprises a set of three high-resolution AREPO-based simulations of the LG, run using the Auriga galaxy formation model. For this paper, we focus only on the $z = 0$ simulation datasets and generate mock skymaps along with a power spectrum analysis to show that the distributions of ions tracing low-temperature gas (HI and SiIII) are more clumpy in comparison to warmer gas tracers (OVI, OVII and OVIII). We compare to the spectroscopic CGM observations of M31 and low-redshift galaxies. HESTIA under-produces the column densities of the M31 observations, but the simulations are consistent with the observations of low-redshift galaxies. A possible explanation for these findings is that the spectroscopic observations of M31 are contaminated by gas residing in the CGM of the Milky Way.

preprint2022arXiv

Connections between reflected entropies and hyperbolic string vertices

In this paper, we establish connections between reflected entropies of multipartite mixed states in CFT$_{2}$ and hyperbolic string vertices of closed string field theory (CSFT). We show that the reflected surfaces, which are bulk duals of the reflected entropies, share the same Riemann surfaces with the hyperbolic string vertices. This observation enables us to build quantitative relations between the reflected entropies and hyperbolic string vertices. We illustrate the connections with several examples. Consequently, we propose that spacetime structure could be directly generated from the hyperbolic string vertices. The advantage of the hyperbolic string vertices approach is that we have a dynamical equation, the Batalin-Vilkovisky master equation, to control the generating process.

preprint2022arXiv

Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

The K-subspaces (KSS) method is a generalization of the K-means method for subspace clustering. In this work, we present local convergence analysis and a recovery guarantee for KSS, assuming data are generated by the semi-random union of subspaces model, where $N$ points are randomly sampled from $K \ge 2$ overlapping subspaces. We show that if the initial assignment of the KSS method lies within a neighborhood of a true clustering, it converges at a superlinear rate and finds the correct clustering within $\Theta(\log\log N)$ iterations with high probability. Moreover, we propose a thresholding inner-product based spectral method for initialization and prove that it produces a point in this neighborhood. We also present numerical results of the studied method to support our theoretical developments.
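As a rough illustration of the alternating scheme that KSS generalizes from K-means, the following minimal sketch handles only the simplest case: one-dimensional subspaces (lines through the origin) in the plane. The function names `fit_line`, `residual`, and `kss` are hypothetical, the initial directions are assumed to be supplied by the caller, and the spectral initialization described in the abstract is not reproduced.

```python
import math

def fit_line(points):
    """Unit vector spanning the least-squares line through the origin.

    This is the top principal direction of the 2x2 scatter matrix,
    obtained in closed form from its entries.
    """
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(angle), math.sin(angle))

def residual(point, direction):
    """Squared distance from a point to the line spanned by a unit direction."""
    x, y = point
    ux, uy = direction
    proj = x * ux + y * uy
    return x * x + y * y - proj * proj

def kss(points, init_directions, max_iter=50):
    """Alternate between assigning points to the nearest line and refitting lines."""
    directions = list(init_directions)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the line with the smallest residual.
        new_labels = [
            min(range(len(directions)), key=lambda k: residual(p, directions[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: refit each line to the points currently assigned to it.
        for k in range(len(directions)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                directions[k] = fit_line(members)
    return labels, directions
```

For example, calling `kss` on points lying on the two coordinate axes, with initial directions at 0.3 and 1.2 radians, separates the two groups after a single update. Higher-dimensional subspaces would need a full eigendecomposition in place of the closed-form angle.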

preprint2022arXiv

Diffraction properties of lights with transverse orbital angular momentum

A spatiotemporal optical vortex (STOV) is a unique optical vortex with a phase singularity in the space-time domain, and the photons in a STOV can carry transverse orbital angular momentum (OAM). The STOV shows many fascinating properties that are worth exploring. Here, we theoretically and experimentally study the diffraction of STOVs, which is a fundamental wave phenomenon. The diffraction behaviors of STOVs are clearly affected by the transverse OAM. The diffraction patterns of STOV pulses diffracted by a grating show a multi-lobe structure, with each gap corresponding to 1 topological charge. The diffraction properties of light with transverse OAM are demonstrated clearly and help us understand the physical properties of STOVs. They also suggest applications such as the fast detection of STOVs with different topological charges, which may pave the way for STOV-based optical communication.

preprint2022arXiv

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization

Recent knowledge distillation (KD) studies show that different manually designed schemes impact the learned results significantly. Yet, in KD, automatically searching for an optimal distillation scheme has not yet been well explored. In this paper, we propose DistPro, a novel framework which searches for an optimal KD process via differentiable meta-learning. Specifically, given a pair of student and teacher networks, DistPro first sets up a rich set of KD connections from the transmitting layers of the teacher to the receiving layers of the student; meanwhile, various transforms are also proposed for comparing feature maps along each pathway for the distillation. Then, each combination of a connection and a transform choice (pathway) is associated with a stochastic weighting process which indicates its importance at every step during the distillation. In the searching stage, the process can be effectively learned through our proposed bi-level meta-optimization strategy. In the distillation stage, DistPro adopts the learned processes for knowledge distillation, which significantly improves the student accuracy, especially when faster training is required. Lastly, we find the learned processes can be generalized between similar tasks and networks. In our experiments, DistPro produces state-of-the-art (SoTA) accuracy under varying numbers of learning epochs on popular datasets, i.e., CIFAR100 and ImageNet, which demonstrates the effectiveness of our framework.

preprint2022arXiv

Dual-Level Decoupled Transformer for Video Captioning

Video captioning aims to understand the spatio-temporal semantic concept of the video and generate descriptive sentences. The de-facto approach to this task dictates a text generator to learn from offline-extracted motion or appearance features from pre-trained vision models. However, these methods may suffer from the so-called "couple" drawbacks on both video spatio-temporal representation and sentence generation. For the former, "couple" means learning spatio-temporal representation in a single model (3DCNN), resulting in the problems named disconnection in task/pre-train domain and hard for end-to-end training. As for the latter, "couple" means treating the generation of visual semantic and syntax-related words equally. To this end, we present $\mathcal{D}^{2}$, a dual-level decoupled transformer pipeline to solve the above drawbacks: (i) for video spatio-temporal representation, we decouple the process into a "first-spatial-then-temporal" paradigm, releasing the potential of using a dedicated model (e.g., image-text pre-training) to connect the pre-training and downstream tasks, and making the entire model end-to-end trainable; (ii) for sentence generation, we propose a Syntax-Aware Decoder to dynamically measure the contribution of visual semantic and syntax-related words. Extensive experiments on three widely used benchmarks (MSVD, MSR-VTT and VATEX) show the great potential of the proposed $\mathcal{D}^{2}$, which surpasses the previous methods by a large margin in the task of video captioning.

preprint2022arXiv

Echoes from Hairy Black Holes

We study the waveforms of time signals produced by scalar perturbations in static hairy black holes, in which the perturbations can be governed by a double-peak effective potential. The inner potential peak would give rise to echoes, which provide a powerful tool to test the Kerr hypothesis. The waveforms are constructed in the time and frequency domains, and we find that the late-time waveforms are determined by the long-lived and sub-long-lived quasinormal modes, which are trapped in the potential valley and near the smaller peak, respectively. When the distance between the peaks is significantly larger than the width of the peaks, a train of decaying echo pulses is produced by the superposition of the long-lived and sub-long-lived modes. In certain cases, the echoes can vanish and then reappear. When the peaks are close enough, one detects far fewer echo signals and a following sinusoid tail, which is controlled by the long-lived or sub-long-lived mode and hence decays very slowly.

preprint2022arXiv

Effects of Born-Infeld electrodynamics on black hole shadows

In this work, we study the shadow of Born-Infeld (BI) black holes with magnetic monopoles and Schwarzschild black holes immersed in the BI uniform magnetic field. Illuminated by a celestial sphere, black hole images are obtained by using the backward ray-tracing method. For magnetically charged BI black holes, we find that the shadow radius increases with the increase of nonlinear electromagnetic effects. For Schwarzschild black holes immersed in the BI uniform magnetic field, photons tend to move towards the axis of symmetry, resulting in stretched shadows along the equatorial plane.

preprint2022arXiv

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

Natural language spatial video grounding aims to detect the relevant objects in video frames with descriptive sentences as the query. In spite of the great advances, most existing methods rely on dense video frame annotations, which require a tremendous amount of human effort. To achieve effective grounding under a limited annotation budget, we investigate one-shot video grounding, and learn to ground natural language in all video frames with solely one frame labeled, in an end-to-end manner. One major challenge of end-to-end one-shot video grounding is the existence of video frames that are irrelevant to either the language query or the labeled frame. Another challenge relates to the limited supervision, which might result in ineffective representation learning. To address these challenges, we designed an end-to-end model via Information Tree for One-Shot video grounding (IT-OS). Its key module, the information tree, can eliminate the interference of irrelevant frames based on branch search and branch cropping techniques. In addition, several self-supervised tasks are proposed based on the information tree to improve the representation learning under insufficient labeling. Experiments on the benchmark dataset demonstrate the effectiveness of our model.

preprint2022arXiv

Exact Community Recovery over Signed Graphs

Signed graphs encode similarity and dissimilarity relationships among different entities with positive and negative edges. In this paper, we study the problem of community recovery over signed graphs generated by the signed stochastic block model (SSBM) with two equal-sized communities. Our approach is based on the maximum likelihood estimation (MLE) of the SSBM. Unlike many existing approaches, our formulation reveals that the positive and negative edges of a signed graph should be treated unequally. We then propose a simple two-stage iterative algorithm for solving the regularized MLE. It is shown that in the logarithmic degree regime, the proposed algorithm can exactly recover the underlying communities in nearly-linear time at the information-theoretic limit. Numerical results on both synthetic and real data are reported to validate and complement our theoretical developments and demonstrate the efficacy of the proposed method.

preprint2022arXiv

Fast-Spanning Ant Colony Optimisation (FaSACO) for Mobile Robot Coverage Path Planning

Coverage Path Planning (CPP) aims at finding an optimal path that covers the whole given space. Due to its NP-hard nature, CPP remains a challenging problem. Bio-inspired algorithms such as Ant Colony Optimisation (ACO) have been exploited to solve the problem because they can utilise heuristic information to mitigate the path planning complexity. This paper proposes the Fast-Spanning Ant Colony Optimisation (FaSACO), where ants can explore the environment with various velocities. By doing so, ants with higher velocities can find destinations or obstacles faster and keep lower-velocity ants informed by communicating such information via pheromone trails on the path. This mechanism ensures that the (sub-)optimal path is found while reducing the overall path planning time. Experimental results show that FaSACO is 19.3% to 32.3% more efficient than ACO in terms of CPU time, and re-covers 6.9% to 12.5% fewer cells than ACO. This makes FaSACO appealing in real-time and energy-limited applications.

preprint2022arXiv

FastRE: Towards Fast Relation Extraction with Convolutional Encoder and Improved Cascade Binary Tagging Framework

Recent work on extracting relations from texts has achieved excellent performance. However, most existing methods pay less attention to efficiency, making it still challenging to quickly extract relations from massive or streaming text data in realistic scenarios. The main efficiency bottleneck is that these methods use a Transformer-based pre-trained language model for encoding, which heavily affects the training speed and inference speed. To address this issue, we propose a fast relation extraction model (FastRE) based on a convolutional encoder and an improved cascade binary tagging framework. Compared to previous work, FastRE employs several innovations to improve efficiency while also keeping promising performance. Concretely, FastRE adopts a novel convolutional encoder architecture combined with dilated convolution, gated unit and residual connection, which significantly reduces the computation cost of training and inference while maintaining satisfactory performance. Moreover, to improve the cascade binary tagging framework, FastRE first introduces a type-relation mapping mechanism to accelerate tagging efficiency and alleviate relation redundancy, and then utilizes a position-dependent adaptive thresholding strategy to obtain higher tagging accuracy and better model generalization. Experimental results demonstrate that FastRE is well balanced between efficiency and performance, and achieves 3-10x faster training, 7-15x faster inference, and 1/100 of the parameters compared to the state-of-the-art models, while the performance is still competitive.

preprint2022arXiv

Gravitational Lensing by Black Holes with Multiple Photon Spheres

We study gravitational lensing of light by hairy black holes, which, in a certain parameter regime, can possess two photon spheres of different sizes outside the event horizon. In particular, we focus on higher-order images of a point-like light source and a luminous celestial sphere produced by strong gravitational lensing near photon spheres. Two photon spheres usually triple the number of higher-order images of a point-like light source. When a hairy black hole is illuminated by a celestial sphere, two photon spheres would give rise to two critical curves in the black hole image, and the smaller critical curve coincides with the shadow edge. In addition to a set of higher-order images of the celestial sphere outside the shadow edge, two more sets of higher-order images are observed inside and outside the larger critical curve, respectively.

preprint2022arXiv

Identical Photons from Multiple Tin-Vacancy Centers in Diamond

We report the narrow inhomogeneous distribution of the zero-phonon line from tin-vacancy (SnV) centers in diamond and the overlap of spectra from multiple separated SnV centers. Photoluminescence excitation spectroscopy measurements at a cryogenic temperature showed that SnV centers exhibit stable fluorescence and linewidths close to the Fourier transform-limited linewidth. The inhomogeneous distribution was as low as ~4 GHz, which enabled the observation of Sn isotope-dependent resonant frequencies. Owing to the narrow inhomogeneous distribution, we observed multiple SnV centers showing identical photons with almost the same wavelength and linewidth. Identical SnV centers were also observed even in different diamond samples, confirming the reliable fabrication of the high-quality SnV centers.

preprint2022arXiv

Instance Image Retrieval by Learning Purely From Within the Dataset

Quality feature representation is key to instance image retrieval. To attain it, existing methods usually resort to a deep model pre-trained on benchmark datasets or even fine-tune the model with a task-dependent labelled auxiliary dataset. Although achieving promising results, this approach is restricted by two issues: 1) the domain gap between benchmark datasets and the dataset of a given retrieval task; 2) the required auxiliary dataset cannot be readily obtained. In light of this situation, this work looks into a different approach which has not been well investigated for instance image retrieval previously: can we learn feature representation specific to a given retrieval task in order to achieve excellent retrieval? Our finding is encouraging. By adding an object proposal generator to generate image regions for self-supervised learning, the investigated approach can successfully learn feature representation specific to a given dataset for retrieval. This representation can be made even more effective by boosting it with image similarity information mined from the dataset. As experimentally validated, such a simple "self-supervised learning + self-boosting" approach can well compete with the relevant state-of-the-art retrieval methods. Ablation study is conducted to show the appealing properties of this approach and its limitation on generalisation across datasets.

preprint2022arXiv

Investigation of Bare-bones Algorithms from Quantum Perspective: A Quantum Dynamical Global Optimizer

In recent decades, the emergence of numerous novel algorithms has made it a gimmick to propose an intelligent optimization system based on metaphor, and has hindered researchers from exploring the essence of search behavior in algorithms. However, it is difficult to directly discuss the search behavior of an intelligent optimization algorithm, since there are so many kinds of intelligent schemes. To address this problem, an intelligent optimization system is regarded as a simulated physical optimization system in this paper. The dynamic search behavior of such a simplified physical optimization system is investigated with quantum theory. To achieve this goal, the Schrödinger equation is employed as the dynamics equation of the optimization algorithm, which is used to describe dynamic search behaviors in the evolution process with quantum theory. Moreover, to explore the basic behavior of the optimization system, the optimization problem is assumed to be decomposed and approximated. Correspondingly, the basic search behavior is derived, which constitutes the basic iterative process of a simple optimization system. The basic iterative process is compared with some classical bare-bones schemes to verify the similarity of search behavior under different metaphors. The search strategies of these bare-bones algorithms are analyzed through experiments.

preprint2022arXiv

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

Hierarchical multi-granularity classification (HMC) assigns hierarchical multi-granularity labels to each object and focuses on encoding the label hierarchy, e.g., ["Albatross", "Laysan Albatross"] from coarse-to-fine levels. However, the definition of what is fine-grained is subjective, and the image quality may affect the identification. Thus, samples could be observed at any level of the hierarchy, e.g., ["Albatross"] or ["Albatross", "Laysan Albatross"], and examples discerned at coarse categories are often neglected in the conventional setting of HMC. In this paper, we study the HMC problem in which objects are labeled at any level of the hierarchy. The essential designs of the proposed method are derived from two motivations: (1) learning with objects labeled at various levels should transfer hierarchical knowledge between levels; (2) lower-level classes should inherit attributes related to upper-level superclasses. The proposed combinatorial loss maximizes the marginal probability of the observed ground truth label by aggregating information from related labels defined in the tree hierarchy. If the observed label is at the leaf level, the combinatorial loss further imposes the multi-class cross-entropy loss to increase the weight of fine-grained classification loss. Considering the hierarchical feature interaction, we propose a hierarchical residual network (HRN), in which granularity-specific features from parent levels acting as residual connections are added to features of children levels. Experiments on three commonly used datasets demonstrate the effectiveness of our approach compared to the state-of-the-art HMC approaches and fine-grained visual classification (FGVC) methods exploiting the label hierarchy.

preprint2022arXiv

Leveraging Structural Information to Improve Point Line Visual-Inertial Odometry

Leveraging line features can help to improve the localization accuracy of point-based monocular Visual-Inertial Odometry (VIO) systems, as lines provide additional constraints. Moreover, in an artificial environment, some straight lines are parallel to each other. In this paper, we designed a VIO system based on points and straight lines, which divides straight lines into structural straight lines (that is, straight lines parallel to each other) and non-structural straight lines. In addition, unlike the orthogonal representation using four parameters to represent the 3D straight line, we only used two parameters to minimize the representation of the structural straight line and the non-structural straight line. Furthermore, we designed a straight line matching strategy based on sampling points to improve the efficiency and success rate of straight line matching. The effectiveness of our method is verified on the public EuRoC and TUM VI benchmark datasets and compared with other state-of-the-art algorithms.

preprint2022arXiv

LoS-Map Construction for Proactive Relay of Opportunity Selection in 6G V2X Systems

Recent advances in Vehicle-to-Everything (V2X) technology and the upcoming sixth-generation (6G) network will usher in a new era for vehicular services with enhanced communication capabilities. Connected and Autonomous Vehicles (CAVs) are expected to deliver a new transportation experience, increasing the safety and efficiency of road networks. The use of millimeter-wave (mmW) frequencies guarantees a huge amount of bandwidth (> 1 GHz) and a high data rate (> 10 Gbit/s), which are required for CAV applications. However, high frequency is impaired by severe path loss, and line of sight (LoS) propagation can be easily blocked by static and dynamic obstacles. Several solutions are being investigated, and the most promising one exploits relays. However, traditional relay schemes react to link failure and leverage instantaneous information, which impedes efficient relay selection in highly mobile and complex networks, such as vehicular scenarios. In this context, we propose a novel proactive relaying strategy that exploits the cooperation between CAVs and environment information to predict the dynamic LoS-map, which describes the links' evolution in time. The proactive relaying schemes exploit the dynamic LoS-map to maximize the network connectivity. A novel framework integrating realistic mobility patterns and geometric channel propagation models is proposed to analyze the performance in different scenarios. Numerical simulations suggest that the proactive relaying schemes mitigate beam blockage and maximize the average probability of connecting CAVs with reliable links.

preprint2022arXiv

Multi-Domain Joint Training for Person Re-Identification

Deep learning-based person Re-IDentification (ReID) often requires a large amount of training data to achieve good performance. Thus it appears that collecting more training data from diverse environments tends to improve the ReID performance. This paper re-examines this common belief and makes a somewhat surprising observation: using more samples, i.e., training with samples from multiple datasets, does not necessarily lead to better performance with popular ReID models. In some cases, training with more samples may even hurt the performance if the evaluation is carried out in one of those datasets. We postulate that this phenomenon is due to the incapability of the standard network in adapting to diverse environments. To overcome this issue, we propose an approach called Domain-Camera-Sample Dynamic network (DCSD) whose parameters can be adaptive to various factors. Specifically, we consider the internal domain-related factor that can be identified from the input features, and external domain-related factors, such as domain information or camera information. Our discovery is that training with such an adaptive model can better benefit from more training samples. Experimental results show that our DCSD can greatly boost the performance (up to 12.3%) when jointly training on multiple datasets.

preprint2022arXiv

Multistage smoothing based multistep pulse compressor for ultrahigh peak power lasers

Ultrahigh peak power lasers are important scientific tools for frontier laser-physics research, in which both peak power improvement and operating safety are very important but are currently limited by the damage threshold and size of compression gratings. Based on a recently reported method, the "multistep pulse compressor (MPC)", a multistage smoothing based MPC (MS-MPC) is proposed here to further improve the running safety and operating convenience and to simplify the whole setup of the MPC. In this optimized design, the beam smoothing is not simply executed in the pre-compressor or main compressor, but separated into multiple stages. It can then protect important optics in every stage directly and reduce the implementation difficulty of a typical MPC at the same time. The prism pair based pre-compressor induces suitable spatial dispersion, which is easier to achieve and sufficient to protect the first grating directly. At the same time, the asymmetric four-grating compressor (AFGC) also induces spatial dispersion to further smooth the laser beam, which helps to protect the last grating directly. In this way, 10s-100s PW lasers can be compressed using currently available optics with improved operating safety owing to the removal of random spatial intensity modulations. Furthermore, an additional beam smoothing stage can be added before the main amplifier to protect the biggest amplification crystal from damage. This MS-MPC optical design can be easily extended to all existing PW laser facilities to improve their potential compressed pulse energy and running safety.

preprint2022arXiv

Negative-ResNet: Noisy Ambulatory Electrocardiogram Signal Classification Scheme

With recently successful applications of deep learning in computer vision and general signal processing, deep learning has shown many unique advantages in medical signal processing. However, data labelling quality has become one of the most significant issues for AI applications, especially when it requires domain knowledge (e.g. medical image labelling). In addition, there might be noisy labels in practical datasets, which might impair the training process of neural networks. In this work, we propose a semi-supervised algorithm for training data samples with noisy labels by performing selected Positive Learning (PL) and Negative Learning (NL). To verify the effectiveness of the proposed scheme, we designed a portable ECG patch -- iRealCare -- and applied the algorithm on a real-life dataset. Our experimental results show that we can achieve an accuracy of 91.0 %, which is 6.2 % higher than a normal training process with ResNet. There are 65 patients in our dataset and we randomly picked 2 patients to perform validation.

preprint2022arXiv

Neural Rays for Occlusion-aware Image-based Rendering

We present a new neural representation, called Neural Ray (NeuRay), for the novel view synthesis task. Recent works construct radiance fields from image features of input views to render novel view images, which enables the generalization to new scenes. However, due to occlusions, a 3D point may be invisible to some input views. On such a 3D point, these generalization methods will include inconsistent image features from invisible views, which interfere with the radiance field construction. To solve this problem, we predict the visibility of 3D points to input views within our NeuRay representation. This visibility enables the radiance field construction to focus on visible image features, which significantly improves its rendering quality. Meanwhile, a novel consistency loss is proposed to refine the visibility in NeuRay when finetuning on a specific scene. Experiments demonstrate that our approach achieves state-of-the-art performance on the novel view synthesis task when generalizing to unseen scenes and outperforms per-scene optimization methods after finetuning.

preprint2022arXiv

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

The semantic segmentation of nighttime scenes is a challenging problem that is key to impactful applications like self-driving cars. Yet, it has received little attention compared to its daytime counterpart. In this paper, we propose NightLab, a novel nighttime segmentation framework that leverages multiple deep learning models imbued with night-aware features to yield State-of-The-Art (SoTA) performance on multiple night segmentation benchmarks. Notably, NightLab contains models at two levels of granularity, i.e. image and regional, and each level is composed of light adaptation and segmentation modules. Given a nighttime image, the image level model provides an initial segmentation estimate while, in parallel, a hardness detection module identifies regions and their surrounding context that need further analysis. A regional level model focuses on these difficult regions to provide a significantly improved segmentation. All the models in NightLab are trained end-to-end using a set of proposed night-aware losses without handcrafted heuristics. Extensive experiments on the NightCity and BDD100K datasets show NightLab achieves SoTA performance compared to concurrent methods.

preprint2022arXiv

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

The key to learning informative node representations in graphs lies in how to gain contextual information from the neighbourhood. In this work, we present a simple-yet-effective self-supervised node representation learning strategy via directly maximizing the mutual information between the hidden representations of nodes and their neighbourhood, which can be theoretically justified by its link to graph smoothing. Following InfoNCE, our framework is optimized via a surrogate contrastive loss, where the positive selection underpins the quality and efficiency of representation learning. To this end, we propose a topology-aware positive sampling strategy, which samples positives from the neighbourhood by considering the structural dependencies between nodes and thus enables positive selection upfront. In the extreme case when only one positive is sampled, we fully avoid expensive neighbourhood aggregation. Our methods achieve promising performance on various node classification datasets. It is also worth mentioning that, by applying our loss function to MLP-based node encoders, our methods can be orders of magnitude faster than existing solutions. Our code and supplementary materials are available at https://github.com/dongwei156/n2n.
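As a rough illustration of the InfoNCE-style contrastive loss mentioned in this abstract, the sketch below scores one anchor node embedding against a single sampled positive (the "extreme case" the abstract describes) and a few negatives. It is a generic textbook form, not the authors' implementation: the function name `info_nce`, the cosine-similarity scoring, and the `temperature` default are all assumptions made for illustration.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE loss for one anchor (hypothetical helper, not from the paper).

    anchor, positive: lists of floats (embeddings of equal length).
    negatives: list of embeddings treated as negative samples.
    """

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # Logit 0 belongs to the positive pair; the rest belong to negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))

    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]


if __name__ == "__main__":
    node = [1.0, 0.0]
    neighbour = [0.9, 0.1]            # assumed positive sampled from the neighbourhood
    others = [[0.0, 1.0], [-1.0, 0.0]]  # assumed negatives
    print(info_nce(node, neighbour, others))
```

The loss shrinks as the anchor moves closer to its positive and further from the negatives, which is the sense in which minimizing it acts as a surrogate for maximizing mutual information between a node and its neighbourhood.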

preprint2022arXiv

Nonlinear Thouless pumping: solitons and transport breakdown

One-dimensional topological pumping of matter waves in two overlaid optical lattices moving with respect to each other is considered in the presence of attractive nonlinearity. It is shown that there exists a threshold nonlinearity level above which the matter transfer is completely arrested. Below this threshold the transfer of both dispersive wavepackets and solitons occurs in accordance with the predictions of the linear theory, i.e. it is quantized and determined by the dynamical Chern numbers of the lowest band. The breakdown of the transport is also explained by the nontrivial topology of the bands. In that case, the nonlinearity induces Rabi oscillations of atoms between the two (or more) lowest bands. If the sum of the dynamical Chern numbers of the populated bands is zero, oscillatory dynamics of a matter soliton in space occurs, which corresponds to the transport breakdown. Otherwise the sum of the Chern numbers of the nonlinearity-excited bands determines the direction and magnitude of the average velocity of matter solitons, which remains quantized and admits fractional values. Thus, even in the strongly nonlinear regime the topology of the linear bands is responsible for the evolution of solitons. The transition between different dynamical regimes is accurately described by the perturbation theory for solitons.

preprint2022arXiv

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization. We propose OFA, a Task-Agnostic and Modality-Agnostic framework that supports Task Comprehensiveness. OFA unifies a diverse set of cross-modal and unimodal tasks, including image generation, visual grounding, image captioning, image classification, language modeling, etc., in a simple sequence-to-sequence learning framework. OFA follows the instruction-based learning in both pretraining and finetuning stages, requiring no extra task-specific layers for downstream tasks. In comparison with the recent state-of-the-art vision & language models that rely on extremely large cross-modal datasets, OFA is pretrained on only 20M publicly available image-text pairs. Despite its simplicity and relatively small-scale training data, OFA achieves new SOTAs in a series of cross-modal tasks while attaining highly competitive performances on uni-modal tasks. Our further analysis indicates that OFA can also effectively transfer to unseen tasks and unseen domains. Our code and models are publicly available at https://github.com/OFA-Sys/OFA.

preprint2022arXiv

Probing Phase Structure of Black Holes with Lyapunov Exponents

We conjecture that there exists a relationship between Lyapunov exponents and black hole phase transitions. To support our conjecture, Lyapunov exponents of the motion of particles and ring strings are calculated for Reissner-Nordström-AdS black holes. When a phase transition occurs, the Lyapunov exponents become multivalued, and branches of the Lyapunov exponents coincide with black hole phases. Moreover, the discontinuous change in the Lyapunov exponents can be treated as an order parameter, and has a critical exponent of $1/2$ near the critical point. Our findings reveal that Lyapunov exponents can be an efficient tool to study phase structure of black holes.

preprint2022arXiv

Progressively-connected Light Field Network for Efficient View Synthesis

This paper presents a Progressively-connected Light Field network (ProLiF) for the novel view synthesis of complex forward-facing scenes. ProLiF encodes a 4D light field, which allows rendering a large batch of rays in one training step for image- or patch-level losses. Directly learning a neural light field from images has difficulty in rendering multi-view consistent images due to its unawareness of the underlying 3D geometry. To address this problem, we propose a progressive training scheme and regularization losses to infer the underlying geometry during training, both of which enforce multi-view consistency and thus greatly improve the rendering quality. Experiments demonstrate that our method is able to achieve significantly better rendering quality than the vanilla neural light fields and comparable results to NeRF-like rendering methods on the challenging LLFF dataset and Shiny Object dataset. Moreover, we demonstrate better compatibility with LPIPS loss to achieve robustness to varying light conditions and CLIP loss to control the rendering style of the scene. Project page: https://totoro97.github.io/projects/prolif.

preprint2022arXiv

Prompt Tuning for Generative Multimodal Pretrained Models

Prompt tuning has become a new paradigm for model tuning and it has demonstrated success in natural language pretraining and even vision pretraining. In this work, we explore the transfer of prompt tuning to multimodal pretraining, with a focus on generative multimodal pretrained models, instead of contrastive ones. Specifically, we implement prompt tuning on the unified sequence-to-sequence pretrained model adaptive to both understanding and generation tasks. Experimental results demonstrate that the light-weight prompt tuning can achieve comparable performance with finetuning and surpass other light-weight tuning methods. Besides, in comparison with finetuned models, the prompt-tuned models demonstrate improved robustness against adversarial attacks. We further find that experimental factors, including the prompt length, prompt depth, and reparameterization, have great impacts on the model performance, and thus we empirically provide a recommendation for the setups of prompt tuning. Despite the observed advantages, we still find some limitations in prompt tuning, and we correspondingly point out the directions for future studies. Code is available at https://github.com/OFA-Sys/OFA.

preprint2022arXiv

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

Scene text recognition (STR) has attracted much attention over the years because of its wide application. Most methods train STR models in a fully supervised manner, which requires large amounts of labeled data. Although synthetic data contributes a lot to STR, it suffers from the real-to-synthetic domain gap that restricts model performance. In this work, we aim to boost STR models by leveraging both synthetic data and the numerous real unlabeled images, eliminating human annotation cost entirely. A robust consistency-regularization-based semi-supervised framework is proposed for STR, which can effectively solve the instability issue due to domain inconsistency between synthetic and real images. A character-level consistency regularization is designed to mitigate the misalignment between characters in sequence recognition. Extensive experiments on standard text recognition benchmarks demonstrate the effectiveness of the proposed method. It can steadily improve existing STR models, and boost an STR model to achieve new state-of-the-art results. To the best of our knowledge, this is the first consistency-regularization-based framework that applies successfully to STR.

preprint2022arXiv

Quasinormal Modes of Black Holes with Multiple Photon Spheres

For a static and spherically symmetric black hole, a photon sphere is composed of circular null geodesics of fixed radius, and plays an important role in observing the black hole. Recently, in an Einstein-Maxwell-scalar model with a non-minimal coupling between the scalar and electromagnetic fields, a class of hairy black holes has been found to possess two unstable and one stable circular null geodesics on the equatorial plane, corresponding to three photon spheres outside the event horizon. In this paper, we study quasinormal modes of the scalar field, which are associated with these circular null geodesics, in the hairy black hole spacetime. In the eikonal regime with $l\gg1$, the real part of the quasinormal modes is determined by the angular velocity of the corresponding circular geodesics. The imaginary part of the quasinormal modes associated with the unstable circular null geodesics encodes the information about the Lyapunov exponent of the corresponding circular geodesics. Interestingly, we find long-lived and sub-long-lived modes, which are associated with the stable and one of the unstable circular null geodesics, respectively. Due to tunneling through potential barriers, the damping times of the long-lived and sub-long-lived modes can be exponentially and logarithmically large in terms of $l$, respectively.

preprint2022arXiv

Quenching of Massive Disk Galaxies in the IllustrisTNG Simulation

A rare population of massive disk galaxies has been found to invade the red sequence dominated by early-type galaxies. These red/quenched massive disk galaxies have recently attracted great interest regarding their formation and origins. The usually proposed quenching mechanisms, such as bar quenching and environment quenching, do not seem suitable for those bulge-less quenched disks in low-density environments. In this paper, we use the IllustrisTNG-300 simulation to investigate the formation of massive quenched central disk galaxies. It is found that these galaxies contain less gas and harbor giant supermassive black holes (SMBHs) (above $10^{8}M_{\odot}$) than their star-forming counterparts. By tracing their formation history, we found that quenched disk galaxies formed early and preserved disk morphology for cosmological time scales. They have experienced less than one major merger on average, and it is mainly mini-mergers (mass ratio $<1/10$) that contribute to the growth of their SMBHs. In the IllustrisTNG simulation the black hole feedback mode switches from thermal to kinetic feedback when the black hole mass exceeds $\sim 10^{8}M_{\odot}$, which is more efficient at ejecting gas from the galaxy and suppressing further cooling of the hot gaseous halo. We conclude that kinetic AGN feedback in massive red/quenched disk galaxies is the dominant quenching mechanism.

preprint2022arXiv

Relation Regularized Scene Graph Generation

Scene graph generation (SGG) is built on top of detected objects to predict object pairwise visual relations for describing the image content abstraction. Existing works have revealed that if the links between objects are given as prior knowledge, the performance of SGG is significantly improved. Inspired by this observation, in this article, we propose a relation regularized network (R2-Net), which can predict whether there is a relationship between two objects and encode this relation into object feature refinement and better SGG. Specifically, we first construct an affinity matrix among detected objects to represent the probability of a relationship between two objects. Graph convolution networks (GCNs) over this relation affinity matrix are then used as object encoders, producing relation-regularized representations of objects. With these relation-regularized features, our R2-Net can effectively refine object labels and generate scene graphs. Extensive experiments are conducted on the visual genome dataset for three SGG tasks (i.e., predicate classification, scene graph classification, and scene graph detection), demonstrating the effectiveness of our proposed method. Ablation studies also verify the key roles of our proposed components in performance improvement.

preprint2022arXiv

Robust Security Analysis Based on Random Geometry Theory for Satellite-Terrestrial-Vehicle Network

Driven by B5G and 6G technologies, multi-network fusion is an indispensable tendency for future communications. In this paper, we focus on and analyze the security performance (SP) of the satellite-terrestrial downlink transmission (STDT). Here, the STDT is composed of a satellite network and a vehicular network with a legitimate mobile receiver and a mobile eavesdropper. To better analyze the SP of this system theoretically from the perspective of mobile terminals, the random geometry theory is adopted, which assumes that both terrestrial vehicles are distributed stochastically in one beam of the satellite. Furthermore, based on this theory, closed-form analytical expressions are derived for two crucial and specific indicators in the STDT, namely the secrecy outage probability and the ergodic secrecy capacity. Additionally, several related variables restricting the SP of the STDT are discussed, and specific schemes are presented to enhance the SP. Then, the asymptotic property is investigated in the high signal-to-noise ratio scenario, and accurate and asymptotic closed-form expressions are given. Finally, simulation results show that, under the precondition of guaranteeing the reliability of the STDT, the asymptotic solutions outperform the corresponding accurate results significantly in the effectiveness.

preprint2022arXiv

Self-Contrastive Learning based Semi-Supervised Radio Modulation Classification

This paper presents a semi-supervised learning framework that is new in being designed for automatic modulation classification (AMC). By carefully utilizing unlabeled signal data with a self-supervised contrastive-learning pre-training step, our framework achieves higher performance given smaller amounts of labeled data, thereby largely reducing the labeling burden of deep learning. We evaluate the performance of our semi-supervised framework on a public dataset. The evaluation results demonstrate that our semi-supervised approach significantly outperforms supervised frameworks thereby substantially enhancing our ability to train deep neural networks for automatic modulation classification in a manner that leverages unlabeled data.

preprint2022arXiv

Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer

We propose a semi-supervised network for wide-angle portraits correction. Wide-angle images often suffer from skew and distortion affected by perspective distortion, especially noticeable at the face regions. Previous deep learning based approaches need the ground-truth correction flow maps for training guidance. However, such labels are expensive, which can only be obtained manually. In this work, we design a semi-supervised scheme and build a high-quality unlabeled dataset with rich scenarios, allowing us to simultaneously use labeled and unlabeled data to improve performance. Specifically, our semi-supervised scheme takes advantage of the consistency mechanism, with several novel components such as direction and range consistency (DRC) and regression consistency (RC). Furthermore, different from the existing methods, we propose the Multi-Scale Swin-Unet (MS-Unet) based on the multi-scale swin transformer block (MSTB), which can simultaneously learn short-distance and long-distance information to avoid artifacts. Extensive experiments demonstrate that the proposed method is superior to the state-of-the-art methods and other representative baselines. The source code and dataset are available at: https://github.com/megvii-research/Portraits_Correction.

preprint2022arXiv

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

We introduce SparseNeuS, a novel neural rendering based method for the task of surface reconstruction from multi-view images. This task becomes more difficult when only sparse images are provided as input, a scenario where existing neural reconstruction approaches usually produce incomplete or distorted results. Moreover, their inability to generalize to unseen new scenes impedes their application in practice. In contrast, SparseNeuS can generalize to new scenes and work well with sparse images (as few as 2 or 3). SparseNeuS adopts the signed distance function (SDF) as the surface representation, and learns generalizable priors from image features by introducing geometry encoding volumes for generic surface prediction. Moreover, several strategies are introduced to effectively leverage sparse views for high-quality reconstruction, including 1) a multi-level geometry reasoning framework to recover the surfaces in a coarse-to-fine manner; 2) a multi-scale color blending scheme for more reliable color prediction; 3) a consistency-aware fine-tuning scheme to control the inconsistent regions caused by occlusion and noise. Extensive experiments demonstrate that our approach not only outperforms the state-of-the-art methods, but also exhibits good efficiency, generalizability, and flexibility.

preprint2022arXiv

Spin conservation of cosmic filaments

Cosmic filaments are the largest collapsing structures in the Universe. Recently both observations and simulations inferred that cosmic filaments have coherent angular momenta (spins). Here we use filament finders to identify the filamentary structures in cosmological simulations and study their physical origins, which are well described by the primordial tidal torque of their Lagrangian counterpart regions -- protofilaments. These initial angular momenta statistically preserve their directions to low redshifts. We further show that a spin reconstruction method can predict the spins of filaments and potentially relate their spins to the initial conditions of the Universe. This correlation provides a new way of constraining and obtaining additional information on the initial perturbations of the Universe.

preprint2022arXiv

Thermodynamic Geometry of Black Holes Enclosed by a Cavity in Extended Phase Space

Recently, the phase space of black holes in a spherical cavity of radius $r_{B}$ has been extended by introducing a thermodynamic volume $V\equiv4\pi r_{B}^{3}/3$. In the extended phase space, we consider the thermodynamic geometry, which provides a powerful tool to understand the microscopic structure of black holes, of Reissner-Nordström (RN) black holes in a cavity, as well as that of Reissner-Nordström-AdS black holes. Although the phase structures of the cavity and AdS cases show striking resemblance, we find that there exist significant differences between the thermodynamic geometries of these two cases. In particular, a reentrant transition of the type of the microstructure interactions, i.e., repulsive $\rightarrow$ attractive $\rightarrow$ repulsive with increasing temperature in an isobaric process, is observed for RN black holes in a cavity.

preprint2022arXiv

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high-level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".

preprint2021arXiv

Beam smoothing based on prism pair for multistep pulse compressor in PW lasers

Ultra-short ultra-intense lasers provide unprecedented experimental tools and extreme physical conditions to explore frontier secrets of nature. Recently, the multistep pulse compressor (MPC) was proposed to break through the limitations that the size and damage threshold of the compressor grating impose on reaching higher peak laser power. In the MPC method, beam smoothing in the pre-compressor is a very important process. Here, beam smoothing based on a prism pair was studied, in which both the spatial profiles and the spectral dispersive properties were analyzed in detail. The simulation results show clearly that the prism pair can effectively smooth the laser beam. Furthermore, the beam smoothing is much more efficient, with a shorter separation distance, if two prism pairs are arranged to induce spatial dispersion in one direction or two directions. The beam smoothing results presented here will help optimize the optical designs of all PW laser systems to improve their output and running safety.

preprint2021arXiv

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

Holistically understanding an object and its 3D movable parts through visual perception models is essential for enabling an autonomous agent to interact with the world. For autonomous driving, the dynamics and states of vehicle parts such as doors, the trunk, and the bonnet can provide meaningful semantic information and interaction states, which are essential to ensuring the safety of the self-driving vehicle. Existing visual perception models mainly focus on coarse parsing such as object bounding box detection or pose estimation and rarely tackle these situations. In this paper, we address this important autonomous driving problem by solving three critical issues. First, to deal with data scarcity, we propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images before reconstructing human-vehicle interaction (VHI) scenarios. Our approach is fully automatic without any human interaction, which can generate a large number of vehicles in uncommon states (VUS) for training deep neural networks (DNNs). Second, to perform fine-grained vehicle perception, we present a multi-task network for VUS parsing and a multi-stream network for VHI parsing. Third, to quantitatively evaluate the effectiveness of our data augmentation approach, we build the first VUS dataset in real traffic scenarios (e.g., getting on/out or placing/removing luggage). Experimental results show that our approach advances other baseline methods in 2D detection and instance segmentation by a big margin (over 8%). In addition, our network yields large improvements in discovering and understanding these uncommon cases. Moreover, we have released the source code, the dataset, and the trained model on Github (https://github.com/zongdai/EditingForDNN).

preprint2021arXiv

Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification

Learning cross-view consistent feature representation is the key to accurate vehicle Re-identification (ReID), since the visual appearance of vehicles changes significantly under different viewpoints. To this end, most existing approaches resort to supervised cross-view learning using extensive extra viewpoint annotations, which, however, is difficult to deploy in real applications due to the expensive labelling cost and the continuous viewpoint variation that makes it hard to define discrete viewpoint labels. In this study, we present a pluggable Weakly-supervised Cross-View Learning (WCVL) module for vehicle ReID. Through hallucinating the cross-view samples as the hardest positive counterparts in the feature domain, we can learn the consistent feature representation via minimizing the cross-view feature distance based on vehicle IDs only, without using any viewpoint annotation. More importantly, the proposed method can be seamlessly plugged into most existing vehicle ReID baselines for cross-view learning without re-training the baselines. To demonstrate its efficacy, we plug the proposed method into a number of off-the-shelf baselines and obtain significant performance improvement on four public benchmark datasets, i.e., VeRi-776, VehicleID, VRIC and VRAI.

preprint2021arXiv

Quantifying Bounds of Model Gap for Synchronous Generators

In practice, uncertainties in parameters and model structures always cause a gap between a model and the corresponding physical entity. Hence, to evaluate the performance of a model, the bounds of this gap must be assessed. In this paper, we propose a trajectory-sensitivity-based approach to quantify the bounds of the gap. The trajectory sensitivity is expressed as a linear time-varying system. We thus first derive several bounds for a general linear time-varying system in different scenarios. The derived bounds are then applied to obtain bounds of the model gap for generator plant models with different types of structural information, e.g., models of different orders. Case studies are carried out to show the efficacy of the bounds through synchronous generator models on different accuracy levels.

preprint2021arXiv

Scalable Learning With a Structural Recurrent Neural Network for Short-Term Traffic Prediction

This paper presents a scalable deep learning approach for short-term traffic prediction based on historical traffic data in a vehicular road network. Capturing the spatio-temporal relationship of the big data often requires a significant amount of computational burden or an ad-hoc design aiming for a specific type of road network. To tackle the problem, we combine a road network graph with recurrent neural networks (RNNs) to construct a structural RNN (SRNN). The SRNN employs a spatio-temporal graph to infer the interaction between adjacent road segments as well as the temporal dynamics of the time series data. The model is scalable thanks to two key aspects. First, the proposed SRNN architecture is built by using the semantic similarity of the spatio-temporal dynamic interactions of all segments. Second, we design the architecture to deal with fixed-length tensors regardless of the graph topology. With the real traffic speed data measured in the city of Santander, we demonstrate the proposed SRNN outperforms the image-based approaches using the capsule network (CapsNet) by 14.1% and the convolutional neural network (CNN) by 5.87%, respectively, in terms of root mean squared error (RMSE). Moreover, we show that the proposed model is scalable. The SRNN model trained with data of a road network is able to predict traffic speed of different road networks, with the fixed number of parameters to train.

preprint2021arXiv

Scalarized Einstein-Maxwell-scalar Black Holes in Anti-de Sitter Spacetime

In this paper, we study spontaneous scalarization of asymptotically anti-de Sitter charged black holes in the Einstein-Maxwell-scalar model with a non-minimal coupling between the scalar and Maxwell fields. In this model, Reissner-Nordström-AdS (RNAdS) black holes are scalar-free black hole solutions, and may induce scalarized black holes due to the presence of a tachyonic instability of the scalar field near the event horizon. For RNAdS and scalarized black hole solutions, we investigate the domain of existence, perturbative stability against spherical perturbations and phase structure. In a micro-canonical ensemble, scalarized solutions are always thermodynamically preferred over RNAdS black holes. However, the system has a much richer phase structure and phase transitions in a canonical ensemble. In particular, we report an RNAdS BH/scalarized BH/RNAdS BH reentrant phase transition, which is composed of a zeroth-order phase transition and a second-order one.

preprint2021arXiv

Space-and-time-synchronized simultaneous vehicle tracking/formation using cascaded prescribed-time control

In this paper, we present a space-and-time-synchronized control method with application to simultaneous tracking/formation. In the framework of polar coordinates, by correlating and decoupling the reference/actual kinematics between the self vehicle and the target, time and space are separated and controlled independently. As such, the specified state can be achieved at the predetermined terminal time, while the relative trajectory in space is independent of time. In addition, for stabilization before the predesigned time, a cascaded prescribed-time control theorem is provided as a preliminary for vehicle tracking control. The obtained results can be directly extended to the simultaneous tracking/formation of multiple vehicles. Finally, numerical examples are provided to verify the effectiveness and superiority of the proposed scheme.

preprint2021arXiv

Thermodynamics and Phase Structure of an Einstein-Maxwell-scalar Model in Extended Phase Space

In this paper, we study thermodynamics and phase structure of asymptotically AdS hairy and Reissner-Nordström-AdS (RNAdS) black holes in the extended phase space, where the cosmological constant is interpreted as a thermal pressure. The RNAdS and hairy black holes are black hole solutions of an Einstein-Maxwell-scalar (EMS) model with a non-minimal coupling between the scalar and electromagnetic fields. The Smarr relation, the first law of thermodynamics and the free energy are derived for black hole solutions in the EMS model. Moreover, the phase structure of the RNAdS and hairy black holes is investigated in canonical and grand canonical ensembles. Interestingly, RNAdS BH/hairy BH/RNAdS BH reentrant phase transitions, consisting of zeroth-order and second-order phase transitions, are found in both ensembles.

preprint2021arXiv

Validity of Thermodynamic Laws and Weak Cosmic Censorship for AdS Black Holes and Black Holes in a Cavity

By throwing a test charged particle into a Reissner-Nordström (RN) black hole, we test the validity of the first and second laws of thermodynamics and the weak cosmic censorship conjecture (WCCC) with two types of boundary conditions, i.e., the asymptotically anti-de Sitter (AdS) space and a Dirichlet cavity wall placed in the asymptotically flat space. For the RN-AdS black hole, the second law of thermodynamics is satisfied, and the WCCC is violated for both extremal and near-extremal black holes. For the RN black hole in a cavity, the entropy can either increase or decrease depending on the change in the charge, and the WCCC is satisfied/violated for the extremal/near-extremal black hole. Our results indicate that there may be a connection between black hole thermodynamics and the boundary condition imposed on the black hole.

preprint2020arXiv

A Robust Attentional Framework for License Plate Recognition in the Wild

Recognizing car license plates in natural scene images is an important yet still challenging task in realistic applications. Many existing approaches perform well for license plates collected under constrained conditions, e.g., shooting in frontal and horizontal view-angles and under good lighting conditions. However, their performance drops significantly in an unconstrained environment that features rotation, distortion, occlusion, blurring, shading or extreme dark or bright conditions. In this work, we propose a robust framework for license plate recognition in the wild. It is composed of a tailored CycleGAN model for license plate image generation and an elaborately designed image-to-sequence network for plate recognition. On one hand, the CycleGAN-based plate generation engine alleviates the exhausting human annotation work. A massive amount of training data can be obtained with a more balanced character distribution and various shooting conditions, which helps to boost the recognition accuracy to a large extent. On the other hand, the 2D attention-based license plate recognizer with an Xception-based CNN encoder is capable of recognizing license plates with different patterns under various scenarios accurately and robustly. Without using any heuristic rules or post-processing, our method achieves state-of-the-art performance on four public datasets, which demonstrates the generality and robustness of our framework. Moreover, we released a new license plate dataset, named "CLPD", with 1200 images from all 31 provinces in mainland China. The dataset is available from: https://github.com/wangpengnorman/CLPD_dataset.

preprint2020arXiv

A robust determination of halo environment in the cosmic field

A number of methods for studying the large-scale cosmic matter distribution exist in the literature. One particularly common method employed to define the cosmic web is to examine the density, velocity or potential field. Such methods are advantageous since a Hessian matrix can be constructed whose eigenvectors (and eigenvalues) indicate the principal directions (and strength) of local collapse or expansion. Technically this is achieved by diagonalizing the Hessian matrix using a fixed finite grid. The resultant large-scale structure quantification is thus inherently limited by the grid's finite resolution. Here, we overcome the obstacle of finite grid resolution by introducing a new method to determine halo environment using an adaptive interpolation which is more robust to resolution than the typical "Nearest Grid Point" (NGP) method. Essentially, instead of computing and diagonalizing the Hessian matrix once for the entire grid, we suggest doing so once for each halo or galaxy in question. We examine how the eigenvalues and eigenvector directions computed using our algorithm and the NGP method converge for different grid resolutions, finding that our new method converges faster: changes of resolution have a much smaller effect than in the NGP method. We therefore suggest this method for future use by the community.
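As a rough illustration of the per-halo idea in this abstract, the sketch below builds a finite-difference Hessian on a grid and then evaluates it at a halo position either by nearest grid point or by trilinear interpolation before diagonalizing. The trilinear scheme, the function names and the toy potential are assumptions made for the example; the paper's adaptive interpolation may differ.

```python
import numpy as np

def hessian_field(phi, spacing):
    """Finite-difference Hessian of a scalar field; shape (3, 3, nx, ny, nz)."""
    grads = np.gradient(phi, spacing, edge_order=2)
    hess = np.empty((3, 3) + phi.shape)
    for i in range(3):
        second = np.gradient(grads[i], spacing, edge_order=2)
        for j in range(3):
            hess[i, j] = second[j]
    return hess

def hessian_ngp(hess, pos, spacing):
    """Hessian taken from the grid cell nearest to the halo position."""
    idx = np.rint(np.asarray(pos, dtype=float) / spacing).astype(int)
    return hess[:, :, idx[0], idx[1], idx[2]]

def hessian_trilinear(hess, pos, spacing):
    """Hessian interpolated trilinearly to the halo position (assumed scheme)."""
    u = np.asarray(pos, dtype=float) / spacing
    i0 = np.clip(np.floor(u).astype(int), 0, np.array(hess.shape[2:]) - 2)
    f = u - i0
    out = np.zeros((3, 3))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((f[0] if dx else 1 - f[0])
                     * (f[1] if dy else 1 - f[1])
                     * (f[2] if dz else 1 - f[2]))
                out += w * hess[:, :, i0[0] + dx, i0[1] + dy, i0[2] + dz]
    return out

def web_eigen(h):
    """Eigenvalues (ascending) and eigenvectors of the symmetrized Hessian."""
    return np.linalg.eigh(0.5 * (h + h.T))

# Toy potential (hypothetical) whose xx-curvature varies with x: H_xx = 1 + x.
n, spacing = 10, 1.0
x = np.arange(n) * spacing
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
phi = 0.5 * (X**2 + 2 * Y**2 + 3 * Z**2) + X**3 / 6
hess = hessian_field(phi, spacing)
halo = (2.3, 3.6, 4.1)
vals_ngp, _ = web_eigen(hessian_ngp(hess, halo, spacing))
vals_interp, _ = web_eigen(hessian_trilinear(hess, halo, spacing))
```

In this toy case the interpolated eigenvalues track the local curvature at the halo, while the NGP values are frozen to the nearest cell, which is the resolution sensitivity the abstract describes.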

preprint2020arXiv

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

As a voxel-wise labeling task, semantic scene completion (SSC) tries to simultaneously infer the occupancy and semantic labels for a scene from a single depth and/or RGB image. The key challenge for SSC is how to effectively take advantage of the 3D context to model various objects or stuff with severe variations in shapes, layouts and visibility. To handle such variations, we propose a novel module called anisotropic convolution, which offers flexibility and power impossible for competing methods such as standard 3D convolution and some of its variations. In contrast to the standard 3D convolution that is limited to a fixed 3D receptive field, our module is capable of modeling the dimensional anisotropy voxel-wise. The basic idea is to enable an anisotropic 3D receptive field by decomposing a 3D convolution into three consecutive 1D convolutions, and the kernel size for each such 1D convolution is adaptively determined on the fly. By stacking multiple such anisotropic convolution modules, the voxel-wise modeling capability can be further enhanced while maintaining a controllable amount of model parameters. Extensive experiments on two SSC benchmarks, NYU-Depth-v2 and NYUCAD, show the superior performance of the proposed method. Our code is available at https://waterljwant.github.io/SSC/
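The decomposition described in this abstract can be sketched in plain NumPy: one pass per axis, where each voxel mixes the outputs of several candidate 1D kernels using softmax weights. The soft per-voxel selection, the function names and the `logits` layout are assumptions for illustration, not the released implementation.

```python
import numpy as np

def conv1d_along(vol, kernel, axis):
    """Zero-padded 'same' 1D convolution of a 3D volume along one axis."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, vol)

def softmax(logits, axis=0):
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def anisotropic_conv(vol, kernels, logits):
    """Three consecutive 1D convolutions with per-voxel kernel selection.

    kernels: list of K odd-length 1D kernels (candidate receptive fields).
    logits:  array of shape (3, K, D, H, W); softmax over K gives the
             per-voxel mixing weights for each axis (assumed soft selection).
    """
    out = vol
    for axis in range(3):
        weights = softmax(logits[axis], axis=0)                 # (K, D, H, W)
        candidates = np.stack(
            [conv1d_along(out, k, axis) for k in kernels])      # (K, D, H, W)
        out = (weights * candidates).sum(axis=0)
    return out
```

Because the weights vary per voxel and per axis, the effective receptive field can be long along one dimension and short along another, which is the anisotropy the abstract refers to.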

preprint2020arXiv

Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

Prior work in standardized science exams requires support from a large text corpus, such as a targeted science corpus from Wikipedia or SimpleWikipedia. However, retrieving knowledge from the large corpus is time-consuming and questions embedded in complex semantic representation may interfere with retrieval. Inspired by the dual process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module. Specifically, our method is based on meta-learning and the large language model BERT, and can efficiently solve science problems by learning from related example questions without relying on external knowledge bases. We evaluate our method on the AI2 Reasoning Challenge (ARC), and the experimental results show that the meta-classifier yields considerable classification performance on emerging question types. The information provided by the meta-classifier significantly improves the accuracy of the reasoning module from 46.6% to 64.2%, which has a competitive advantage over retrieval-based QA methods.

preprint2020arXiv

Chaotic Motion around a Black Hole under Minimal Length Effects

We use the Melnikov method to identify chaotic behavior in geodesic motion perturbed by the minimal length effects around a Schwarzschild black hole. Unlike the integrable unperturbed geodesic motion, our results show that the perturbed homoclinic orbit, which is a geodesic joining the unstable circular orbit to itself, becomes chaotic in the sense that a Smale horseshoe chaotic structure is present in phase space.

preprint2020arXiv

Construct $α^{\prime}$ corrected or loop corrected solutions without curvature singularities

For the bosonic gravi-dilaton system, we provide systematic approaches to construct non-perturbative string cosmological solutions without curvature singularities, which can match the perturbative solution to any order in the $α^{\prime}$ expansion. When higher order $α^{\prime}$ corrections are calculated, they can be straightforwardly plugged in to generate compatible non-perturbative evolutions without curvature singularities. We also give a (phenomenological) map between the $α^{\prime}$ corrected EOM and the loop corrected EOM. This map enables us to easily generate a loop corrected solution from an $α^{\prime}$ corrected solution, and vice versa, and therefore substantially enlarges the solution space.

preprint2020arXiv

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

Referring expression comprehension (REF) aims at identifying a particular object in a scene by a natural language expression. It requires joint reasoning over the textual and visual domains to solve the problem. Some popular referring expression datasets, however, fail to provide an ideal test bed for evaluating the reasoning ability of the models, mainly because 1) their expressions typically describe only some simple distinctive properties of the object and 2) their images contain limited distracting information. To bridge the gap, we propose a new dataset for visual reasoning in context of referring expression comprehension with two main features. First, we design a novel expression engine rendering various reasoning logics that can be flexibly combined with rich visual properties to generate expressions with varying compositionality. Second, to better exploit the full reasoning chain embodied in an expression, we propose a new test setting by adding additional distracting images containing objects sharing similar properties with the referent, thus minimising the success rate of reasoning-free cross-domain alignment. We evaluate several state-of-the-art REF models, but find none of them can achieve promising performance. A proposed modular hard mining strategy performs the best but still leaves substantial room for improvement. We hope this new dataset and task can serve as a benchmark for deeper visual reasoning analysis and foster the research on referring expression comprehension.

preprint2020arXiv

Dislocation Slip or Phase Transformation Lead to Room-Temperature Plasticity in Diamond: Comment on Plastic Deformation of Single-Crystal Diamond Nanopillars

Despite decades of extensive research on mechanical properties of diamond, much remains to be understood in terms of plastic deformation mechanisms due to the poor deformability at room temperature. In a recent work in Advanced Materials, it was claimed that room-temperature plasticity occurred in <001>-oriented single-crystal diamond nanopillars based on observation of unrecovered deformation inside a scanning electron microscope. The plastic deformation was suggested to be mediated by a phase transition from sp3 carbon to an O8-carbon phase by molecular dynamics simulations. By comparison, our in-situ transmission electron microscopy study reveals that the room-temperature plasticity can be carried out by dislocation slip in both <100> and <111>-oriented diamond nanopillars. The brittle-to-ductile transition is highly dependent on the stress state. We note that the surface structure may play a significant role in the deformation mechanisms as the incipient plasticity always occurs from the surface region in nanoscale diamonds.

preprint2020arXiv

Giant Polarization and Abnormal Flexural Deformation in Bent Freestanding Perovskite Oxides

Recent realizations of ultrathin freestanding perovskite oxides offer a unique platform to probe novel properties in two-dimensional oxides. Here, we observed a giant flexoelectric response in freestanding BiFeO3 and SrTiO3 in their bent state arising from strain gradients up to 4x10e7/m, suggesting a promising approach for realizing extremely large polarizations. Additionally, a substantial reversible change in thickness was discovered in bent freestanding BiFeO3, which implies an unusual bending-expansion/shrinkage and thickness-dependent Poisson's ratios in this ferroelectric membrane that have never been seen before in crystalline materials. Our theoretical modeling reveals that this unprecedented flexural deformation within the membrane is attributable to a flexoelectricity-piezoelectricity interplay. The finding unveils intriguing nanoscale electromechanical properties and provides guidance for their practical applications in flexible nanoelectromechanical systems.

preprint2020arXiv

Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge

Conventional referring expression comprehension (REF) assumes people query something from an image by describing its visual appearance and spatial location, but in practice, we often ask for an object by describing its affordance or other non-visual attributes, especially when we do not have a precise target. For example, sometimes we say 'Give me something to eat'. In this case, we need to use commonsense knowledge to identify the objects in the image. Unfortunately, there is no existing referring expression dataset reflecting this requirement, not to mention a model to tackle this challenge. In this paper, we collect a new referring expression dataset, called KB-Ref, containing 43k expressions on 16k images. In KB-Ref, to answer each expression (detect the target object referred to by the expression), at least one piece of commonsense knowledge is required. We then test state-of-the-art (SoTA) REF models on KB-Ref, finding that all of them present a large drop compared to their outstanding performance on general REF datasets. We also present an expression conditioned image and fact attention (ECIFA) network that extracts information from correlated image regions and commonsense knowledge facts. Our method leads to a significant improvement over SoTA REF models, although there is still a gap between this strong baseline and human performance. The dataset and baseline models will be released.

preprint2020arXiv

Incubation Induced Light Concentration Beyond the Diffraction Limit for High-Resolution Glass Printing

In the past two decades, tremendous efforts have been exerted to understand and control the delivery of ultrashort laser pulses into various types of transparent materials ranging from glass and crystal to polymer and even bio-materials. This approach opens up the route toward deterministic and highly localized modification within the transparent materials, enabling three-dimensional (3D) micromachining of the materials into sophisticated structures and devices with extreme geometrical flexibility. Owing to the linear diffraction and nonlinear self-focusing effects, the focal volume typically exhibits an asymmetric profile stretching along the longitudinal direction. This effect becomes more severe when focusing deeply into the transparent substrates for printing objects of large heights. In this work, a new laser-material interaction regime is identified with the exceptional incubation effect originating from self-regulated multiple-pulse interactions with accumulated material changes. Our finding reveals a focal-volume-invariant modification deep inside the fused silica glass, in striking contrast to the traditional belief that the geometrical shape of the laser induced modification follows the intensity distribution of the inscription laser. A macro-scale geometrically complex glass sculpture is successfully manufactured with the incubation assisted ultrashort laser inscription at uniform micrometer resolutions in all three dimensions.

preprint2020arXiv

MeisterMorxrc at SemEval-2020 Task 9: Fine-Tune Bert and Multitask Learning for Sentiment Analysis of Code-Mixed Tweets

Natural language processing (NLP) has been applied to various fields including text classification and sentiment analysis. In the shared task of sentiment analysis of code-mixed tweets, which is a part of the SemEval-2020 competition~\cite{patwa2020sentimix}, we preprocess datasets by replacing emoji, deleting uncommon characters and so on, and then fine-tune the Bidirectional Encoder Representation from Transformers (BERT) to perform the best. After exhausting our top-3 submissions, our team MeisterMorxrc achieves an averaged F1 score of 0.730 in this task, and our codalab username is MeisterMorxrc.

preprint2020arXiv

Minimal Length Effects on Motion of a Particle in Rindler Space

Various quantum theories of gravity predict the existence of a minimal measurable length. In this paper, we study effects of the minimal length on the motion of a particle in the Rindler space under a harmonic potential. This toy model captures key features of particle dynamics near a black hole horizon, and allows us to make three observations. First, we find that the chaotic behavior is stronger with the increases of the minimal length effects, which manifests that the maximum Lyapunov characteristic exponents mostly grow, and the KAM curves on Poincaré surfaces of section tend to disintegrate into chaotic layers. Second, in the presence of the minimal length effects, it can take a finite amount of Rindler time for a particle to cross the Rindler horizon, which implies a shorter scrambling time of black holes. Finally, it shows that some Lyapunov characteristic exponents can be greater than the surface gravity of the horizon, violating the recently conjectured universal upper bound. In short, our results reveal that quantum gravity effects may make black holes prone to more chaos and faster scrambling.

preprint2020arXiv

NAS-FCOS: Fast Neural Architecture Search for Object Detection

The success of deep neural networks relies on significant architecture engineering. Recently neural architecture search (NAS) has emerged as a promise to greatly reduce manual effort in network design by automatically searching for optimal architectures, although typically such algorithms need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, on challenging vision tasks such as object detection, NAS, especially fast versions of NAS, is less studied. Here we propose to search for the decoder structure of object detectors with search efficiency being taken into consideration. To be more specific, we aim to efficiently search for the feature pyramid network (FPN) as well as the prediction head of a simple anchor-free object detector, namely FCOS, using a tailored reinforcement learning paradigm. With carefully designed search space, search algorithms and strategies for evaluating network quality, we are able to efficiently search a top-performing detection architecture within 4 days using 8 V100 GPUs. The discovered architecture surpasses state-of-the-art object detection models (such as Faster R-CNN, RetinaNet and FCOS) by 1.5 to 3.5 points in AP on the COCO dataset, with comparable computation complexity and memory footprint, demonstrating the efficacy of the proposed NAS for object detection.

preprint2020arXiv

Non-equilibrium transport of inhomogeneous shale gas under ultra-tight confinement

The non-equilibrium transport of inhomogeneous and dense gases highly confined by surface is encountered in many engineering applications. For example, in the shale gas production process, methane is extracted from ultra-tight pores under high pressure so the gas is inhomogeneous and dense. Currently, the complex non-equilibrium transport of inhomogeneous and dense gases where gas surface interactions play a key role is commonly investigated by molecular dynamics or on a continuum-assumption basis. Here, a tractable kinetic model based on the generalized Enskog equation and the mean-field theory is employed to couple the effects of the volume exclusion and the long-range intermolecular attraction forces. The interactions between gas molecules and confined surface are modelled by a 10-4-3 Lennard-Jones potential, which can capture gas surface adsorption. The cross-sectional density profiles of methane under different confinements are in good agreement with the molecular dynamics results reported in the literature, and the transport behaviors are validated by the non-equilibrium molecular dynamics. The velocity of methane flow in shale matrix is plug-like due to its dense characteristics in nanopores. The influence of pressure, temperature, pore size and shale composition on density and velocity profiles is analyzed quantitatively. Our results show that the Klinkenberg correction is not applicable to model shale gas flow in the production process; the Navier-Stokes model using the second-order slip boundary condition cannot produce the proper velocity profiles, and consequently fails to predict the accurate flow rate in nanopores. This study sheds new light on understanding the physics of non-equilibrium dense gas flows in shale strata.

preprint2020arXiv

Non-ideal gas dynamics under confinement: rarefaction effect, dense effect and molecular interaction

The effects of volume exclusion and long-range intermolecular attraction are investigated by the simplified kinetic model for surface-confined inhomogeneous fluids. Gas dynamics of the ideal gas, the hard-sphere fluid and the real gas are simulated by the Boltzmann equation, the Enskog equation and the simple kinetic equation, respectively. Only the Knudsen minimum appears for the ideal gas, while both the Knudsen minimum and the Knudsen maximum occur for the hard-sphere fluid and the real gas under certain confinements, beyond which the maximum and minimum may disappear. The Boltzmann equation and the Enskog equation overestimate and underestimate, respectively, the mass flow rate of the real gas dynamics under confinement, where the volume exclusion and the long-range intermolecular attractive potential among molecules are not ignorable. With the increase of the channel width, gas dynamics of the hard-sphere fluid and the real gas tend to the Boltzmann prediction gradually. The density inhomogeneity, which hinders the flow under confinement, is more obvious when the solid fraction is larger. The anomalous slip occurs for real gas under constant confinement. The flow at a smaller Knudsen number (larger solid fraction or channel width) contributes a more practical amount of mass transfer, although the rarefaction effect is more prominent at larger Knudsen numbers. The temperature has no effect on density and velocity profiles of the ideal gas and the hard-sphere fluid, but the energy parameter among the real gas molecules decreases with increasing temperature and the real gas dynamics consequently tend to the hard-sphere ones.

preprint2020arXiv

ODE-CNN: Omnidirectional Depth Extension Networks

Omnidirectional 360° cameras proliferate rapidly for autonomous robots since they significantly enhance the perception ability by widening the field of view (FoV). However, corresponding 360° depth sensors, which are also critical for the perception system, are still difficult or expensive to have. In this paper, we propose a low-cost 3D sensing system that combines an omnidirectional camera with a calibrated projective depth camera, where the depth from the limited FoV can be automatically extended to the rest of the recorded omnidirectional image. To accurately recover the missing depths, we design an omnidirectional depth extension convolutional neural network (ODE-CNN), in which a spherical feature transform layer (SFTL) is embedded at the end of feature encoding layers, and a deformable convolutional spatial propagation network (D-CSPN) is appended at the end of feature decoding layers. The former resamples the neighborhood of each pixel in the omnidirectional coordination to the projective coordination, which reduces the difficulty of feature learning, and the latter automatically finds a proper context to well align the structures in the estimated depths via CNN w.r.t. the reference image, which significantly improves the visual quality. Finally, we demonstrate the effectiveness of the proposed ODE-CNN over the popular 360D dataset and show that ODE-CNN significantly outperforms (relatively 33% reduction in depth error) other state-of-the-art (SoTA) methods.

preprint2020arXiv

On symmetric Willmore surfaces in spheres II: the orientation reversing case

In this paper we provide a systematic treatment of Willmore surfaces with orientation reversing symmetries and illustrate the theory by (old and new) examples. We apply our theory to isotropic Willmore two-spheres in $S^4$ and derive a necessary condition for such (possibly branched) isotropic surfaces to descend to (possibly branched) maps from $\mathbb{R} P^2$ to $S^4$. The Veronese sphere and several other examples of non-branched Willmore immersions from $\mathbb{R} P^2$ to $S^4$ are derived as an illustration of the general theory. The Willmore immersions of $\mathbb{R} P^2$, just mentioned and different from the Veronese sphere, are new to the authors' best knowledge.

preprint2020arXiv

Person Re-identification in Aerial Imagery

Nowadays, with the rapid development of consumer Unmanned Aerial Vehicles (UAVs), visual surveillance utilizing the UAV platform has become very attractive. Most research on UAV-captured visual data is focused on the tasks of object detection and tracking. However, limited attention has been paid to the task of person Re-identification (ReID), which has been widely studied for ordinary surveillance cameras with fixed emplacements. In this paper, to facilitate the research of person ReID in aerial imagery, we collect a large-scale airborne person ReID dataset named Person ReID for Aerial Imagery (PRAI-1581), which consists of 39,461 images of 1581 person identities. The images of the dataset are shot by two DJI consumer UAVs flying at an altitude ranging from 20 to 60 meters above the ground, which covers most of the real UAV surveillance scenarios. In addition, we propose to utilize subspace pooling of convolution feature maps to represent the input person images. Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained in an end-to-end fashion efficiently. We conduct extensive experiments on the proposed dataset and the experimental results demonstrate that re-identifying persons in aerial imagery is a challenging problem, where our method performs favorably against state-of-the-art methods. Our dataset can be accessed via https://github.com/stormyoung/PRAI-1581.

preprint2020arXiv

Real-time Segmentation and Facial Skin Tones Grading

Modern approaches for semantic segmentation usually pay too much attention to the accuracy of the model, and therefore it is strongly recommended to introduce cumbersome backbones, which brings heavy computation burden and memory footprint. To alleviate this problem, we propose an efficient segmentation method based on deep convolutional neural networks (DCNNs) for the task of hair and facial skin segmentation, which achieves a remarkable trade-off between speed and performance on three benchmark datasets. As far as we know, the accuracy of skin tones classification is usually unsatisfactory due to the influence of external environmental factors such as illumination and background noise. Therefore, we use the segmented face to obtain a specific face area, and further exploit the color moment algorithm to extract its color features. Specifically, for a 224 x 224 standard input, using our high-resolution spatial detail information and low-resolution contextual information fusion network (HLNet), we achieve 90.73% Pixel Accuracy on the Figaro1k dataset at over 16 FPS in a CPU environment. Additional experiments on the CamVid dataset further confirm the universality of the proposed model. We further use masked color moments for skin tones grade evaluation, and approximately 80% classification accuracy demonstrates the feasibility of the proposed scheme. Code is available at https://github.com/JACKYLUO1991/Face-skin-hair-segmentaiton-and-skin-color-evaluation.
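The masked color moment step mentioned in this abstract can be illustrated with the common three-moment definition (mean, standard deviation, cube-root skewness per channel) restricted to the segmented skin pixels. This is a minimal sketch under that assumption; the function name and feature ordering are hypothetical and the repository's implementation may differ.

```python
import numpy as np

def masked_color_moments(image, mask):
    """First three color moments per channel over the pixels where mask is True.

    image: (H, W, 3) array; mask: (H, W) boolean array (e.g. a skin mask).
    Returns a 9-vector: [means, standard deviations, cube-root skewness].
    """
    pixels = image[mask].astype(float)           # (N, 3) selected pixels
    if pixels.size == 0:
        raise ValueError("mask selects no pixels")
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    std = np.sqrt((centered ** 2).mean(axis=0))
    skew = np.cbrt((centered ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])
```

The resulting 9-dimensional vector would then be passed to whatever classifier grades the skin tone; restricting the statistics to the mask keeps hair and background colors out of the feature.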

preprint2020arXiv

Resilient Load Restoration in Microgrids Considering Mobile Energy Storage Fleets: A Deep Reinforcement Learning Approach

Mobile energy storage systems (MESSs) provide mobility and flexibility to enhance distribution system resilience. The paper proposes a Markov decision process (MDP) formulation for an integrated service restoration strategy that coordinates the scheduling of MESSs and resource dispatching of microgrids. The uncertainties in load consumption are taken into account. A deep reinforcement learning (DRL) algorithm is utilized to solve the MDP for optimal scheduling. Specifically, the twin delayed deep deterministic policy gradient (TD3) is applied to train the deep Q-network and policy network; the well-trained policy can then be deployed in an online manner to perform multiple actions simultaneously. The proposed model is demonstrated on an integrated test system with three microgrids connected by the Sioux Falls transportation network. The simulation results indicate that mobile and stationary energy resources can be well coordinated to improve system resilience.
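For readers unfamiliar with TD3, the core of the critic update is the clipped double-Q target with target policy smoothing. The sketch below shows that target in NumPy with callables standing in for the target networks; the function name, default hyperparameters and action bounds are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=None):
    """Clipped double-Q learning target used by TD3.

    target_actor(s) -> actions; target_q1(s, a), target_q2(s, a) -> Q values.
    All three are placeholders for the target networks.
    """
    rng = np.random.default_rng() if rng is None else rng
    next_action = target_actor(next_state)
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, action_low, action_high)
    # Take the smaller of the two critics to curb overestimation.
    q_min = np.minimum(target_q1(next_state, next_action),
                       target_q2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_min
```

Both critics regress toward this shared target, and the actor is updated less frequently than the critics, which is the "delayed" part of the method's name.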

preprint2020arXiv

Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Humans are able to describe image contents with coarse to fine details as they wish. However, most image captioning models are intention-agnostic and cannot actively generate diverse descriptions according to different user intentions. In this work, we propose the Abstract Scene Graph (ASG) structure to represent user intention at a fine-grained level and control what and how detailed the generated description should be. The ASG is a directed graph consisting of three types of abstract nodes (object, attribute, relationship) grounded in the image without any concrete semantic labels. Thus it is easy to obtain either manually or automatically. From the ASG, we propose a novel ASG2Caption model, which is able to recognise user intentions and semantics in the graph, and therefore generate desired captions according to the graph structure. Our model achieves better controllability conditioning on ASGs than carefully designed baselines on both VisualGenome and MSCOCO datasets. It also significantly improves the caption diversity via automatically sampling diverse ASGs as control signals.

preprint2020arXiv

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks

Most existing crowd counting systems rely on the availability of object location annotations, which can be expensive to obtain. To reduce the annotation cost, one attractive solution is to leverage a large number of unlabeled images to build a crowd counting model in a semi-supervised fashion. This paper tackles the semi-supervised crowd counting problem from the perspective of feature learning. Our key idea is to leverage the unlabeled images to train a generic feature extractor rather than the entire network of a crowd counter. The rationale of this design is that learning the feature extractor can be more reliable and robust towards the inevitable noisy supervision generated from the unlabeled data. Also, on top of a good feature extractor, it is possible to build a density map regressor with much fewer density map annotations. Specifically, we propose a novel semi-supervised crowd counting method which is built upon two innovative components: (1) a set of inter-related binary segmentation tasks are derived from the original density map regression task as the surrogate prediction target; (2) the surrogate target predictors are learned from both labeled and unlabeled data by utilizing a proposed self-training scheme which fully exploits the underlying constraints of these binary segmentation tasks. Through experiments, we show that the proposed method is superior to the existing semi-supervised crowd counting method and other representative baselines.
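One plausible reading of the surrogate tasks in this abstract is that each binary segmentation asks whether the density at a pixel exceeds a given threshold, so the masks are nested: a pixel above a higher threshold must also be above every lower one. The sketch below derives such masks and counts violations of the nesting constraint; the thresholds and function names are assumptions for illustration, not the paper's configuration.

```python
import numpy as np

def surrogate_masks(density, thresholds):
    """Binary masks 'density > t' for each threshold, in ascending order."""
    thresholds = np.sort(np.asarray(thresholds, dtype=float))
    return np.stack([density > t for t in thresholds])

def nesting_violations(masks):
    """Count pixels set at a higher threshold but not at the next lower one.

    For masks derived from a single density map this is always zero; for
    independently predicted masks a nonzero count flags inconsistent
    pseudo-labels.
    """
    return int(np.sum(masks[1:] & ~masks[:-1]))
```

A self-training loop could, for example, keep only pseudo-labels on unlabeled images where the predictors agree with this constraint, which is one way to exploit the inter-task structure the abstract mentions.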

preprint2020arXiv

Sympathetic eruptions of two filaments with an identifiable causal link observed by the Solar Dynamics Observatory

Filament eruptions occurring at different places within a relatively short time interval, but with a certain physical causal connection, are usually known as sympathetic eruptions. Studies on sympathetic eruptions are not uncommon. However, in existing reports, the causal links between sympathetic eruptions remain rather speculative. In this work, we present detailed observations of a sympathetic filament eruption event, where an identifiable causal link between two eruptive filaments is observed. On 2015 November 15, two filaments (F1 in the north and F2 in the south) were located in the southwestern quadrant of the solar disk. Their main axes were almost parallel to each other. Around 22:20 UT, F1 began to erupt, forming two flare ribbons. The southwestern ribbon apparently moved to the southwest and intruded into the southeast part of F2. This continuous intrusion caused F2's eventual eruption. Accompanying the eruption of F2, flare ribbons and post-flare loops appeared in the northwest region of F2. Meanwhile, neither flare ribbons nor post-flare loops could be observed in the southeastern area of F2. In addition, the nonlinear force-free field (NLFFF) extrapolations show that the magnetic fields above F2 in the southeast region are much weaker than those in the northwest region. These results imply that the overlying magnetic fields of F2 were not uniform. So we propose that the southwest ribbon formed by eruptive F1 invaded F2 from its southeast region, which has relatively weaker overlying magnetic fields than its northwest region, disturbing F2 and eventually leading F2 to erupt.

preprint2020arXiv

TEDL: A Text Encryption Method Based on Deep Learning

Recent years have seen an increasing emphasis on information security, and various encryption methods have been proposed. However, for symmetric encryption methods, the well-known encryption techniques still rely on the key space to guarantee security and suffer from frequent key updating. Aiming to solve those problems, this paper proposes a novel text encryption method based on deep learning called TEDL, where the secret key includes the hyperparameters of a deep learning model and the core step of encryption is transforming input data into weights trained under those hyperparameters. First, both communication parties establish a word vector table by training a deep learning model according to specified hyperparameters. Then, a self-updating codebook is constructed on the word vector table with the SHA-256 function and other tricks. When communication starts, encryption and decryption are equivalent to indexing and inverted indexing on the codebook, respectively, thus achieving the transformation between plaintext and ciphertext. Results of experiments and relevant analyses show that TEDL performs well in terms of security, efficiency, and generality, and has a lower demand for the frequency of key redistribution. In particular, as a supplement to current encryption methods, the time-consuming process of constructing a codebook increases the difficulty of brute-force attacks while not degrading the communication efficiency.

preprint2020arXiv

The alignment of satellite systems with cosmic filaments in the SDSS DR12

Galaxies, as well as their satellites, are known to form within the cosmic web: the large, multi-scale distribution of matter in the universe. It is known that the surrounding large scale structure (LSS) can impact and influence the formation of galaxies, e.g. the spin and shape of haloes or galaxies are correlated with the LSS and the correlation depends on halo mass or galaxy morphology. In this work, we use group and filament catalogues constructed from the SDSS DR12 to investigate the correlation between satellite systems and the large scale filaments they are located in. We find that the distribution of satellites is significantly correlated with filaments, namely the major axes of the satellite systems are preferentially aligned with the spine of the closest filament. Stronger alignment signals are found for systems located away from the filament spine, while systems close to the filament spine show significantly weaker alignment. Our results suggest that satellites are accreted along filaments, which agrees with previous works. The systems located away from the filament spine may help us to understand how the filament forms as well as the peculiar satellite distribution in the Local Universe.

preprint2020arXiv

The Hestia project: simulations of the Local Group

We present the Hestia simulation suite: High-resolution Environmental Simulations of The Immediate Area, a set of cosmological simulations of the Local Group. Initial conditions constrained by the observed peculiar velocities of nearby galaxies are employed to accurately simulate the local cosmography. Halo pairs that resemble the Local Group are found in low-resolution, constrained, dark-matter-only simulations, and selected for higher resolution magnetohydrodynamic simulation using the Arepo code. Baryonic physics follows the Auriga model of galaxy formation. The simulations contain a high resolution region of 3-5 Mpc in radius from the Local Group midpoint embedded in the correct cosmographic landscape. Within this region a simulated Local Group consisting of a Milky Way-like and an Andromeda-like galaxy forms, whose description is in excellent agreement with observations. The simulated Local Group galaxies resemble the Milky Way and Andromeda in terms of their halo mass, mass ratio, stellar disc mass, morphology, separation, relative velocity, rotation curves, bulge-disc morphology, satellite galaxy stellar mass function, satellite radial distribution and, in some cases, the presence of a Magellanic Cloud-like object. Because these simulations properly model the Local Group in its cosmographic context, they provide a testing ground for questions where environment is thought to play an important role.

preprint2020arXiv

Thermodynamic extremality relations in the massive gravity

A universal relation between the leading correction to the entropy and extremality was obtained in the work of Goon and Penco. In this paper, we extend this work to massive gravity and investigate thermodynamic extremality relations in a topologically higher-dimensional black hole. A rescaled cosmological constant is added to the action of the massive gravity as a perturbative correction. This correction modifies the extremality bound of the black hole and leads to shifts of the mass, entropy, etc. The Goon-Penco relation is recovered. Regarding the cosmological constant as a variable related to pressure, we derive by exact calculation the thermodynamic extremality relations between the mass and, respectively, the pressure, the charge, and the parameters $c_i$. Finally, these relations are verified by a triple product identity, which shows that the universal relation exists in black holes.
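
For orientation, the two identities named in this abstract can be written out in their standard textbook form. The notation is an assumption made here and may differ from the paper's: $M$ is the mass, $S$ the entropy, $T$ the temperature, $\epsilon$ the perturbation parameter, and $Q$ stands for the other quantities held fixed.

```latex
% Triple product identity among M, S and \epsilon (other quantities held fixed):
\left(\frac{\partial M}{\partial \epsilon}\right)_{S}
\left(\frac{\partial \epsilon}{\partial S}\right)_{M}
\left(\frac{\partial S}{\partial M}\right)_{\epsilon} = -1 .

% With T = (\partial M / \partial S)_{\epsilon}, this rearranges to
\left(\frac{\partial M}{\partial \epsilon}\right)_{S}
  = -\,T \left(\frac{\partial S}{\partial \epsilon}\right)_{M} ,

% which, taken in the extremal limit, has the form of the Goon-Penco relation:
\frac{\partial M_{\mathrm{ext}}(Q,\epsilon)}{\partial \epsilon}
  = \lim_{M \to M_{\mathrm{ext}}}
    -\,T \left(\frac{\partial S(M,Q,\epsilon)}{\partial \epsilon}\right)_{M,Q} .
```

The analogous relations for the pressure, the charge and the $c_i$ would follow by replacing $\epsilon$ with the corresponding variable in the triple product.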

preprint2020arXiv

To Balance or Not to Balance: A Simple-yet-Effective Approach for Learning with Long-Tailed Distributions

Real-world visual data often exhibits a long-tailed distribution, where some "head" classes have a large number of samples, yet only a few samples are available for "tail" classes. Such an imbalanced distribution poses a great challenge for learning a deep neural network, which boils down to a dilemma: on the one hand, we prefer to increase the exposure of tail class samples to avoid the excessive dominance of head classes in the classifier training. On the other hand, oversampling tail classes makes the network prone to over-fitting, since head class samples are consequently often under-represented. To resolve this dilemma, in this paper, we propose a simple-yet-effective auxiliary learning approach. The key idea is to split a network into a classifier part and a feature extractor part, and then employ different training strategies for each part. Specifically, to promote the awareness of tail classes, a class-balanced sampling scheme is utilised for training both the classifier and the feature extractor. For the feature extractor, we also introduce an auxiliary training task, which is to train a classifier under the regular random sampling scheme. In this way, the feature extractor is jointly trained from both sampling strategies and thus can take advantage of all training data and avoid the over-fitting issue. Apart from this basic auxiliary task, we further explore the benefit of using self-supervised learning as the auxiliary task. Without using any bells and whistles, our model achieves superior performance over the state-of-the-art solutions.
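
The difference between the two sampling schemes contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the toy 90/10 label set are hypothetical, and the class-balanced scheme is assumed to mean "pick a class uniformly, then pick an example of that class uniformly".

```python
import random
from collections import Counter


def instance_sampling_probs(labels):
    """Per-class draw probability under regular random (instance-level) sampling."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


def class_balanced_sampling_probs(labels):
    """Per-class draw probability when every class is equally likely to be picked."""
    classes = set(labels)
    return {cls: 1.0 / len(classes) for cls in classes}


def class_balanced_sample(labels, num_draws, seed=0):
    """Return example indices: choose a class uniformly, then an example within it."""
    rng = random.Random(seed)
    by_class = {}
    for index, cls in enumerate(labels):
        by_class.setdefault(cls, []).append(index)
    classes = sorted(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(num_draws)]


# Hypothetical toy dataset: 90 "head" examples and 10 "tail" examples.
labels = ["head"] * 90 + ["tail"] * 10
print(instance_sampling_probs(labels))        # tail classes are rarely seen
print(class_balanced_sampling_probs(labels))  # every class is seen equally often
```

Under the balanced scheme the ten tail examples are revisited far more often than any single head example, which is the over-fitting risk the abstract describes; the auxiliary classifier trained under regular random sampling is what lets the feature extractor still see the head data at its natural frequency.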

preprint2020arXiv

Toward Interpretability of Dual-Encoder Models for Dialogue Response Suggestions

This work shows how to improve and interpret the commonly used dual encoder model for response suggestion in dialogue. We present an attentive dual encoder model that includes an attention mechanism on top of the extracted word-level features from two encoders, one for context and one for label respectively. To improve the interpretability in the dual encoder models, we design a novel regularization loss to minimize the mutual information between unimportant words and desired labels, in addition to the original attention method, so that important words are emphasized while unimportant words are de-emphasized. This can help not only with model interpretability, but can also further improve model accuracy. We propose an approximation method that uses a neural network to calculate the mutual information. Furthermore, by adding a residual layer between raw word embeddings and the final encoded context feature, word-level interpretability is preserved at the final prediction of the model. We compare the proposed model with existing methods for the dialogue response task on two public datasets (Persona and Ubuntu). The experiments demonstrate the effectiveness of the proposed model in terms of better Recall@1 accuracy and visualized interpretability.

preprint2020arXiv

Using Sampled Network Data With The Autologistic Actor Attribute Model

Social science research increasingly benefits from statistical methods for understanding the structured nature of social life, including for social network data. However, the application of statistical network models within large-scale community research is hindered by too little understanding of the validity of their inferences under realistic data collection conditions, including sampled or missing network data. The autologistic actor attribute model (ALAAM) is a statistical model based on the well-established exponential random graph model (ERGM) for social networks. ALAAMs can be regarded as a social influence model, predicting an individual-level outcome based on the actor's network ties, concurrent outcomes of his/her network partners, and attributes of the actor and his/her network partners. In particular, an ALAAM can be used to measure contagion effects, that is, the propensity of two actors connected by a social network tie to both have the same value of an attribute. We investigate the effect of using simple random samples and snowball samples of network data on ALAAM parameter inference, and find that parameter inference can still work well even with a nontrivial fraction of missing nodes. However, it is safer to take a snowball sample of the network and estimate conditional on the snowball sampling structure.

preprint2020arXiv

Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video

Thin structures, such as wire-frame sculptures, fences, cables, power lines, and tree branches, are common in the real world. It is extremely challenging to acquire their 3D digital models using traditional image-based or depth-based reconstruction methods because thin structures often lack distinct point features and have severe self-occlusion. We propose the first approach that simultaneously estimates camera motion and reconstructs the geometry of complex 3D thin structures in high quality from a color video captured by a handheld camera. Specifically, we present a new curve-based approach to estimate accurate camera poses by establishing correspondences between featureless thin objects in the foreground in consecutive video frames, without requiring visual texture in the background scene to lock on. Enabled by this effective curve-based camera pose estimation strategy, we develop an iterative optimization method with tailored measures on geometry, topology as well as self-occlusion handling for reconstructing 3D thin structures. Extensive validations on a variety of thin structures show that our method achieves accurate camera pose estimation and faithful reconstruction of 3D thin structures with complex shape and topology at a level that has not been attained by other existing reconstruction methods.

preprint2020arXiv

Wettability and surface energy of parylene F

Parylenes are barrier materials employed as protective layers. However, many parylenes are unsuitable for applications under harsh conditions. A new material, parylene F, demonstrates considerable potential for a wide range of applications due to its high temperature and UV resistance. For the first time, the wettability and surface energy of parylene F were investigated to determine the feasibility of parylene F as an alternative to the commonly employed parylene C. The results show that parylene F has a hydrophobic surface with a water contact angle of 109.63 degrees. We found that a probe liquid volume of 3.5 µL is optimal for the contact angle measurement of parylene F. Moreover, we found that the Owens-Wendt-Kaelble and the Lifshitz-van der Waals/acid-base approaches are unsuitable for determining the surface energy of parylene F, whereas an approach based on the limitless liquid-solid interface wetting system is compatible. Furthermore, the results show that parylene F has a surface energy of 39.05 mJ/m². Considering the improved resistance, relatively low cost, and the desirable properties, parylene F can replace parylene C for applications under harsh conditions.

preprint2020arXiv

What has quenched the massive spiral galaxies?

Quenched massive spiral galaxies have attracted great attention recently, as more data is available to constrain their environment and cold gas content. However, the quenching mechanism is still uncertain, as it depends on the mass range and baryon budget of the galaxy. In this letter, we report the identification of a rare population of very massive, quenched spiral galaxies with stellar mass $\gtrsim10^{11}{\rm~M_\odot}$ and halo mass $\gtrsim10^{13}{\rm~M_\odot}$ from the Sloan Digital Sky Survey at redshift $z\sim0.1$. Our CO observations using the IRAM-30m telescope show that these galaxies contain only a small amount of molecular gas. Similar galaxies are also seen in the state-of-the-art semi-analytical models and hydro-dynamical simulations. It is found from these theoretical models that these quenched spiral galaxies harbor massive black holes, suggesting that feedback from the central black holes has quenched these spiral galaxies. This quenching mechanism seems to challenge the popular scenario of the co-evolution between massive black holes and massive bulges.

preprint2020arXiv

Zero Correlation Zone Sequences With Flexible Block-Repetitive Spectral Constraints

A general construction of a set of time-domain sequences with sparse periodic correlation functions, having multiple segments of consecutive zero values, i.e. multiple zero correlation zones (ZCZs), is presented. All such sequences have a common, block-repetitive structure of the positions of zeros in their Discrete Fourier Transform (DFT) sequences, where the exact positions of zeros in a DFT sequence do not impact the positions and sizes of the ZCZs. This property offers a completely new degree of flexibility in designing signals with good correlation properties under various spectral constraints. The non-zero values of the DFT sequences are determined by the corresponding frequency-domain modulation sequences, constructed as the element-by-element product of two component sequences: a "long" one, which is common to the set of time-domain sequences, and which controls the peak-to-average power ratio (PAPR) properties of the time-domain sequences; and a "short" one, periodically extended to match the length of the "long" component sequence, which controls the non-zero cross-correlation values of all time-domain sequences. It is shown that a 0 dB PAPR of the time-domain sequences can be obtained if the "long" frequency-domain component sequence is selected to be a modulatable constant amplitude zero autocorrelation (MCAZAC) sequence. A generalized and simplified unified construction of MCAZAC sequences is presented.
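
The link between a constant amplitude zero autocorrelation (CAZAC) frequency-domain sequence and a 0 dB PAPR can be checked numerically with a short Python sketch. It rests on assumptions not taken from the abstract: a plain Zadoff-Chu sequence stands in for the paper's MCAZAC sequence, the length 13 and root 1 are arbitrary, and the spectral zeros and the "short" component of the actual construction are left out. All function names are hypothetical.

```python
import cmath
import math


def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    assert length % 2 == 1 and math.gcd(root, length) == 1
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_autocorrelation(seq, lag):
    """Periodic (cyclic) autocorrelation of seq at the given lag."""
    n = len(seq)
    return sum(seq[(k + lag) % n] * seq[k].conjugate() for k in range(n))


def inverse_dft(freq):
    """Direct O(N^2) inverse DFT, adequate for short illustrative sequences."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def papr_db(signal):
    """Peak-to-average power ratio of a complex sequence, in dB."""
    powers = [abs(sample) ** 2 for sample in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


# Treat the Zadoff-Chu sequence as the frequency-domain sequence.
freq_seq = zadoff_chu(root=1, length=13)
time_seq = inverse_dft(freq_seq)

worst_sidelobe = max(abs(periodic_autocorrelation(freq_seq, lag))
                     for lag in range(1, len(freq_seq)))
print("largest autocorrelation sidelobe:", worst_sidelobe)  # numerically ~0
print("time-domain PAPR in dB:", papr_db(time_seq))          # numerically ~0
```

Because a sequence with an ideal periodic autocorrelation has a flat spectrum, its inverse DFT has constant magnitude, so peak power equals average power. This is only the simplest special case of the property stated in the abstract.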

preprint2019arXiv

A compact and efficient three-dimensional microfluidic mixer

Microfluidic mixing is a fundamental functionality in most lab-on-a-chip (LOC) systems, yet efficient mixing is challenging to realize in microfluidic channels due to the small Reynolds numbers. Here, we design and fabricate a compact three-dimensional (3D) micromixer to enable efficient mixing at various flow rates. The performance of the fabricated micromixer was examined using blue and red inks. The extreme flexibility of femtosecond laser micromachining in fabricating microfluidic structures of arbitrary 3D geometries allows us to tackle the major effects that limit the mixing efficiency.

preprint2019arXiv

Can we find steady-state solutions to multiscale rarefied gas flows within dozens of iterations?

One of the central problems in the study of rarefied gas dynamics is to find the steady-state solution of the Boltzmann equation quickly. When the Knudsen number is large, i.e. the system is highly rarefied, the conventional iteration scheme can lead to convergence within a few iterations. However, when the Knudsen number is small, i.e. the flow falls in the near-continuum regime, hundreds of thousands of iterations are needed, and yet the "converged" solutions are prone to be contaminated by accumulated error and large numerical dissipation. Recently, based on the gas kinetic models, the implicit unified gas kinetic scheme (UGKS) and its variants have significantly reduced the number of iterations in the near-continuum flow regime, but it is still much higher than that required for highly rarefied gas flows. In this paper, we put forward a general synthetic iteration scheme (GSIS) to find the steady-state solutions of general rarefied gas flows within dozens of iterations at any Knudsen number. As the GSIS does not rely on the specific kinetic model/collision operator, it can be naturally extended to quickly find converged solutions for mixture flows and even flows involving chemical reactions. These two advantages are also expected to accelerate the slow convergence in the simulation of near-continuum flows via the direct simulation Monte Carlo method and its low-variance version.

preprint2019arXiv

Comparative analysis of layered structures in empirical investor networks and cellphone communication networks

Empirical investor networks (EIN), proposed by Ozsoylev, Walden, Yavuz and Bildik (2014), are assumed to capture the information spreading path among investors. Here, we perform a comparative analysis between the EIN and cellphone communication networks (CN) to test whether the EIN is an information exchanging network from the perspective of the layer structures of ego networks. We employ two clustering algorithms (the $k$-means algorithm and the $H/T$ break algorithm) to detect the layer structures for each node in both networks. We find that the nodes in both networks can be clustered into two groups: one has a layer structure similar to the theoretical Dunbar Circle, in which the alters in ego networks exhibit a four-layer hierarchical structure with cumulative numbers of 5, 15, 50 and 150 from the inner layer to the outer layer, and the other has an additional inner layer with about 2 alters compared with the Dunbar Circle. We also find that the scale ratios, which are estimated based on the unique parameters in the theoretical model of layer structures (Tamarit, Cuesta, Dunbar and Sanchez, 2018), conform to a log-normal distribution for both networks. Our results not only deepen our understanding of the topological structures of the EIN, but also provide empirical evidence of the channels of information diffusion among investors.

preprint2019arXiv

Evaluating Local Geometric Feature Representations for 3D Rigid Data Matching

Local geometric descriptors remain an essential component for 3D rigid data matching and fusion. The design of a rotation-invariant local geometric descriptor usually consists of two steps: local reference frame (LRF) construction and feature representation. Existing evaluation efforts have mainly focused on the LRF or the overall descriptor, yet the quantitative comparison of feature representations remains unexplored. This paper fills this gap by comprehensively evaluating nine state-of-the-art local geometric feature representations. Our evaluation leverages ground-truth LRFs, so that the ranking of the tested feature representations is more convincing than in existing studies. The experiments are conducted on six standard datasets with various application scenarios (shape retrieval, point cloud registration, and object recognition) and data modalities (LiDAR, Kinect, and Space Time) as well as perturbations including Gaussian noise, shot noise, data decimation, clutter, occlusion, and limited overlap. The evaluated terms cover the major concerns for a feature representation, e.g., distinctiveness, robustness, compactness, and efficiency. The outcomes present interesting findings that may shed new light on this community and provide complementary perspectives to existing evaluations on the topic of local geometric feature description. A summary of the evaluated methods and their peculiarities is also presented to guide real-world applications and the crafting of new descriptors.

preprint2019arXiv

Free-fall Rainbow BTZ Black Hole

Doubly special relativity (DSR) is an effective model for encoding quantum gravity in flat spacetime. To incorporate DSR into general relativity, one could use "Gravity's rainbow", where the spacetime background felt by a test particle would depend on its energy. For a black hole, there are two natural orthonormal frames: the stationary one hovering above it and the freely falling one along geodesics. Since the rainbow metric is the metric that the radiated particles "see", a more natural orthonormal frame is the one anchored to the particles. The cases with the stationary orthonormal frame have been extensively studied in the literature. In this paper, we investigate properties of rainbow BTZ black holes in the scenario with the free-fall orthonormal frame. We first review the thermodynamic properties of a BTZ black hole. Furthermore, we obtain the free-fall (FF) rainbow BTZ black hole and then calculate its Hawking temperature via the Hamilton-Jacobi method. Finally, we discuss the thermodynamic properties of an FF stationary rainbow BTZ black hole.

preprint2019arXiv

Freeform microfluidic networks encapsulated in laser printed three-dimensional macro-scale glass objects

Large-scale microfluidic microsystems with complex three-dimensional (3D) configurations are in high demand in both fundamental research and industrial applications, holding the potential to foster a wide range of innovative applications such as lab-on-a-chip and organ-on-a-chip as well as continuous-flow manufacturing of fine chemicals. However, freeform fabrication of such systems remains challenging for most current fabrication techniques in terms of fabrication resolution, flexibility, and achievable footprint size. Here, we report ultrashort pulse laser microfabrication of freeform microfluidic circuits with high aspect ratios and tunable diameters embedded in 3D printed glass objects. We achieve a uniform microfluidic channel diameter by carefully distributing a string of extra access ports along the microfluidic channels to avoid over-etching in the thin microfluidic channels. After the chemical etching is completed, the extra access ports are sealed using carbon dioxide laser induced localized glass melting. We demonstrate a model hand of fused silica with a size of ~3 cm × 2.7 cm × 1.1 cm in which the whole blood vessel system is encapsulated.

preprint2019arXiv

Optimal Power Flow in Hybrid AC and Multi-terminal HVDC Networks with Offshore Wind Farm Integration Based on Semidefinite Programming

Multi-terminal high voltage direct current (MT-HVDC) technology is a promising technology for offshore wind farm integration, which requires new control and operation schemes. The optimal power flow problem for this system is therefore important for achieving optimal economic operation. In this paper, a convex relaxation model based on semidefinite programming for the MT-HVDC system considering DC/DC converters is proposed to solve the optimal power flow problem. A hybrid AC and MT-HVDC system for offshore wind farm integration is used for the test. The simulation results validate the effectiveness of the proposed model and guarantee that the global optimum solution is achieved.

preprint2019arXiv

Phase Structures and Transitions of Born-Infeld Black Holes in a Grand Canonical Ensemble

To make a Born-Infeld (BI) black hole thermally stable, we consider two types of boundary conditions, i.e., the asymptotically anti-de Sitter (AdS) space and a Dirichlet wall placed in the asymptotically flat space. The phase structures and transitions of these two types of BI black holes, namely BI-AdS black holes and BI black holes in a cavity, are investigated in a grand canonical ensemble, where the temperature and the potential are fixed. For BI-AdS black holes, the globally stable phases can be the thermal AdS space. For small values of the potential, there is a Hawking-Page-like first order phase transition between the BI-AdS black holes and the thermal AdS space. However, the phase transition becomes zeroth order when the values of the potential are large enough. For BI black holes in a cavity, the globally stable phases can be a naked singularity or an extremal black hole with the horizon merging with the wall, both of which are on the boundaries of the physical parameter region. The thermal flat space is never globally preferred. Besides a first order phase transition, there is a second order phase transition between the globally stable phases. Thus, the phase structures and transitions of BI black holes with these two different boundary conditions have several dissimilarities.

preprint2019arXiv

Thermodynamic Geometry of AdS Black Holes and Black Holes in a Cavity

Thermodynamic geometry has proved to be quite useful in understanding the microscopic structure of black holes. We investigate the phase structure, thermodynamic geometry and critical behavior of a Reissner-Nordström-AdS black hole and a Reissner-Nordström black hole in a cavity, which can reach equilibrium in a canonical ensemble. Although the phase structure and critical behavior of the two cases show a striking resemblance, we find that there exist significant differences between the thermodynamic geometries of these two cases. Our results imply that there may be a connection between the black hole microstates and the boundary condition.

preprint2019arXiv

Thermodynamics and Phase Transition of a Gauss-Bonnet Black Hole in a Cavity

Considering a canonical ensemble, in which the temperature and the charge on a wall of the cavity are fixed, we investigate the thermodynamics of a D-dimensional Gauss-Bonnet black hole in a finite spherical cavity. Moreover, we show that the first law of thermodynamics is still satisfied. We then discuss the phase structure and transition in both five and six dimensions. Specifically, we show that there always exist two regions in the parameter space. In one region, the system possesses a single phase. In the other region, however, three phases can coexist and a van der Waals-like phase transition occurs. Finally, we find that there is a fairly close resemblance in the thermodynamic properties and phase structure of a Gauss-Bonnet-Maxwell black hole, either in a cavity or in anti-de Sitter space.

preprint2019arXiv

Thermodynamics and Phase Transition of a Nonlinear Electrodynamics Black Hole in a Cavity

We discuss the thermodynamics of a general nonlinear electrodynamics (NLED) asymptotically flat black hole enclosed in a finite spherical cavity. A canonical ensemble is considered, which means that the temperature and the charge on the wall of the cavity are fixed. After the free energy is obtained by computing the Euclidean action, we show that the first law of thermodynamics is satisfied at the locally stationary points of the free energy. Focusing on a Born-Infeld (BI) black hole in a cavity, the phase structure and transition in various regions of the parameter space are investigated. In the region where the BI electrodynamics has weak nonlinearities, Hawking-Page-like and van der Waals-like phase transitions occur, and a tricritical point appears. In the region where the BI electrodynamics has strong enough nonlinearities, only Hawking-Page-like phase transitions occur. The phase diagram of the BI black hole in a cavity can differ from that of a BI black hole with asymptotically anti-de Sitter boundary conditions. The difference may stem from the lack of an appropriate reference state with the same charge and temperature for the BI-AdS black hole.

preprint2018arXiv

GPU acceleration of an iterative scheme for gas-kinetic model equations with memory reduction techniques

This paper presents a graphics processing unit (GPU) acceleration method for an iterative scheme for gas-kinetic model equations. Unlike the previous GPU parallelization of explicit kinetic schemes, this work features a fast converging iterative scheme. The memory reduction techniques in this method enable full three-dimensional (3D) solutions of kinetic model equations, which would otherwise need terabytes of memory, on contemporary GPUs that usually have a limited memory capacity. The GPU algorithm is validated against DSMC simulations of the 3D lid-driven cavity flow and the supersonic rarefied gas flow past a cube, with grid sizes up to 0.7 trillion points in the phase space. The performance of the GPU algorithm is assessed by comparing it with the corresponding parallel CPU program using the Message Passing Interface (MPI). Profiling on several models of GPUs shows that the algorithm has a medium to high level of utilization of the GPUs' computing and memory resources. A $190\times$ speedup can be achieved on the Tesla K40 GPUs against a single core of an Intel Xeon-E5-2680v3 CPU for the 3D lid-driven cavity flow.

preprint2018arXiv

High-Order Implicit Hybridizable Discontinuous Galerkin Method for the Boltzmann Equation

The high-order hybridizable discontinuous Galerkin (HDG) method, combined with an implicit iterative scheme, is used to find the steady-state solution of the Boltzmann equation with the full collision integral on two-dimensional triangular meshes. The velocity distribution function and its trace are approximated in the piecewise polynomial space of degree up to 4. The fast spectral method (FSM) is incorporated into the DG discretization to evaluate the collision operator. A specific polynomial approximation is proposed for the collision term to reduce the computational cost. The proposed scheme is shown to be accurate and efficient.

preprint2018arXiv

Minimal Length Effects on Chaotic Motion of Particles around Black Hole Horizon

Recently, it was conjectured that the Lyapunov exponent of chaotic motion of a particle in a black hole is universally bounded from above by the surface gravity of the black hole. On the other hand, the minimal length appears in various theories of quantum gravity and leads to the deformed canonical position-momentum commutation relation. In this paper, we use the Hamilton-Jacobi method to study effects of the minimal length on the motion of a massive particle perturbed away from an unstable equilibrium near the black hole horizon. We find that the minimal length effects make the particle move faster away from the equilibrium, and hence the corresponding Lyapunov exponent is greater than that in the usual case with the absence of the minimal length. It therefore shows that if the minimal length effects are taken into account, the conjectured universal bound on the Lyapunov exponent could be violated.

preprint2018arXiv

Polarization-insensitive space-selective etching in fused silica induced by picosecond laser irradiation

It is well known that when fused silica is irradiated with focused femtosecond laser beams, space-selective chemical etching can be achieved. The etching rate depends sensitively on the polarization of the laser. Surprisingly, we observe that by chirping the Fourier-transform-limited femtosecond laser pulses to picosecond pulses, the polarization dependence of the etching rate disappears, whereas an efficient etching rate can still be maintained. Observation with a scanning electron microscope reveals that the chirped pulses can induce interconnected nanocracks in the irradiated areas, which facilitate efficient introduction of the etchant into the microchannel. The reported technology is of great use for the fabrication of three-dimensional (3D) microfluidic systems and glass-based 3D printing.

preprint2017arXiv

Self-Taught Convolutional Neural Networks for Short Text Clustering

Short text clustering is a challenging problem due to the sparseness of its text representation. Here we propose a flexible Self-Taught Convolutional neural network framework for Short Text Clustering (dubbed STC^2), which can flexibly and successfully incorporate more useful semantic features and learn non-biased deep text representations in an unsupervised manner. In our framework, the original raw text features are first embedded into compact binary codes using an existing unsupervised dimensionality reduction method. Then, word embeddings are explored and fed into convolutional neural networks to learn deep feature representations, while the output units are used to fit the pre-trained binary codes during training. Finally, we obtain the optimal clusters by employing K-means to cluster the learned representations. Extensive experimental results demonstrate that the proposed framework is effective and flexible, and that it outperforms several popular clustering methods when tested on three public short text datasets.

250 published items

Ability Transfer and Recovery via Modularized Parameters Localization

BOLT: Online Lightweight Adaptation for Preparation-Free Heterogeneous Cooperative Perception

CA-GCL: Cross-Anatomy Global-Local Contrastive Learning for Robust 3D Medical Image Understanding

Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei

DecoRec: Decomposed 3D Scene Reconstruction from Single-View Images via Object-Level Diffusion

Generative 3D Gaussians with Learned Density Control

GeoTopoDiff: Learning Geometry--Topology Graph Priors through Boundary-Constrained Mixed Diffusion for Sparse-Slice 3D Porous Reconstruction

GP-GS: Gaussian Processes Densification for 3D Gaussian Splatting

GSMap: 2D Gaussians for Online HD Mapping

HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection

Is One Score Enough? Rethinking the Evaluation of Sequentially Evolving LLM Memory

LLMs Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions

PDR: A Plug-and-Play Positional Decay Framework for LLM Pre-training Data Detection

Softmax-GS: Generalized Gaussians Learning When to Blend or Bound

Towards Robust Sequential Decomposition for Complex Image Editing

Quaternion Approximation Networks for Enhanced Image Classification and Oriented Object Detection

Interferometric Signatures of Black Holes with Multiple Photon Spheres

Timelike entanglement entropy and $T\bar{T}$ deformation

Timelike entanglement entropy in dS$_3$/CFT$_2$

Classification of Minimal Immersions of Conformally Flat $3$-Tori and $4$-Tori in Spheres by The First Eigenfunctions

A Wearable ECG Monitor for Deep Learning Based Real-Time Cardiovascular Disease Detection

Anisotropic satellite accretion onto the Local Group with HESTIA

Appearance of an Infalling Star in Black Holes with Multiple Photon Spheres

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

Balanced control between performance and saturation for constrained nonlinear systems

CapOnImage: Context-driven Dense-Captioning on Image

Chaos bound and its violation in charged Kiselev black hole

Cold and Hot gas distribution around the Milky-Way-M31 system in the HESTIA simulations

Connections between reflected entropies and hyperbolic string vertices

Convergence and Recovery Guarantees of the K-Subspaces Method for Subspace Clustering

Diffraction properties of lights with transverse orbital angular momentum

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization

Dual-Level Decoupled Transformer for Video Captioning

Echoes from Hairy Black Holes

Effects of Born-Infeld electrodynamics on black hole shadows

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

Exact Community Recovery over Signed Graphs

Fast-Spanning Ant Colony Optimisation (FaSACO) for Mobile Robot Coverage Path Planning

FastRE: Towards Fast Relation Extraction with Convolutional Encoder and Improved Cascade Binary Tagging Framework

Gravitational Lensing by Black Holes with Multiple Photon Spheres

Identical Photons from Multiple Tin-Vacancy Centers in Diamond

Instance Image Retrieval by Learning Purely From Within the Dataset

Investigation of Bare-bones Algorithms from Quantum Perspective: A Quantum Dynamical Global Optimizer

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

Leveraging Structural Information to Improve Point Line Visual-Inertial Odometry

LoS-Map Construction for Proactive Relay of Opportunity Selection in 6G V2X Systems

Multi-Domain Joint Training for Person Re-Identification

Multistage smoothing based multistep pulse compressor for ultrahigh peak power lasers

Negative-ResNet: Noisy Ambulatory Electrocardiogram Signal Classification Scheme

Neural Rays for Occlusion-aware Image-based Rendering

NightLab: A Dual-level Architecture with Hardness Detection for Segmentation at Night

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

Nonlinear Thouless pumping: solitons and transport breakdown

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Probing Phase Structure of Black Holes with Lyapunov Exponents

Progressively-connected Light Field Network for Efficient View Synthesis

Prompt Tuning for Generative Multimodal Pretrained Models

Pushing the Performance Limit of Scene Text Recognizer without Human Annotation

Quasinormal Modes of Black Holes with Multiple Photon Spheres

Quenching of Massive Disk Galaxies in the IllustrisTNG Simulation

Relation Regularized Scene Graph Generation

Robust Security Analysis Based on Random Geometry Theory for Satellite-Terrestrial-Vehicle Network

Self-Contrastive Learning based Semi-Supervised Radio Modulation Classification

Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

Spin conservation of cosmic filaments

Thermodynamic Geometry of Black Holes Enclosed by a Cavity in Extended Phase Space

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

Beam smoothing based on prism pair for multistep pulse compressor in PW lasers

Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation

Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification

Quantifying Bounds of Model Gap for Synchronous Generators

Scalable Learning With a Structural Recurrent Neural Network for Short-Term Traffic Prediction

Scalarized Einstein-Maxwell-scalar Black Holes in Anti-de Sitter Spacetime