Source author record

Bowen Li

Bowen Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Catalog footprint

What is connected

33 works

24 topics

4 close collaborators

preprint | 2026 | arXiv

Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought

Despite notable advancements in prompting methods for Large Language Models (LLMs), such as Chain-of-Thought (CoT), existing strategies still suffer from excessive token usage and limited generalisability across diverse reasoning tasks. To address these limitations, we propose an Adaptive Causal Prompting with Sketch-of-Thought (ACPS) framework, which leverages structural causal models to infer the causal effect of a query on its answer and adaptively select an appropriate intervention (i.e., standard front-door and conditional front-door adjustments). This design enables generalisable causal reasoning across heterogeneous tasks without task-specific retraining. By replacing verbose CoT with concise Sketch-of-Thought, ACPS enables efficient reasoning that significantly reduces token usage and inference cost. Extensive experiments on multiple reasoning benchmarks and LLMs demonstrate that ACPS consistently outperforms existing prompting baselines in terms of accuracy, robustness, and computational efficiency.

preprint | 2026 | arXiv

When to Invoke: Refining LLM Fairness with Toxicity Assessment

Large Language Models (LLMs) are increasingly used for toxicity assessment in online moderation systems, where fairness across demographic groups is essential for equitable treatment. However, LLMs often produce inconsistent toxicity judgements for subtle expressions, particularly those involving implicit hate speech, revealing underlying biases that are difficult to correct through standard training. This raises a key question that existing approaches often overlook: when should corrective mechanisms be invoked to ensure fair and reliable assessments? To address this, we propose FairToT, an inference-time framework that enhances LLM fairness through prompt-guided toxicity assessment. FairToT identifies cases where demographic-related variation is likely to occur and determines when additional assessment should be applied. In addition, we introduce two interpretable fairness indicators that detect such cases and improve inference consistency without modifying model parameters. Experiments on benchmark datasets show that FairToT reduces group-level disparities while maintaining stable and reliable toxicity predictions, demonstrating that inference-time refinement offers an effective and practical approach for fairness improvement in LLM-based toxicity assessment systems. The source code can be found at https://aisuko.github.io/fair-tot/.

preprint | 2023 | arXiv

Linear Convergence of ISTA and FISTA

In this paper, we revisit the class of iterative shrinkage-thresholding algorithms (ISTA) for solving the linear inverse problem with sparse representation, which arises in signal and image processing. It is shown in the numerical experiment to deblur an image that the convergence behavior in the logarithmic-scale ordinate tends to be linear instead of logarithmic, approximating to be flat. Making meticulous observations, we find that the previous assumption for the smooth part to be convex weakens the least-square model. Specifically, assuming the smooth part to be strongly convex is more reasonable for the least-square model, even though the image matrix is probably ill-conditioned. Furthermore, we tighten the pivotal inequality, first found in [Li et al., 2022], for composite optimization in which the smooth part is strongly convex rather than generally convex. Based on this pivotal inequality, we generalize the linear convergence to composite optimization in both the objective value and the squared proximal subgradient norm. Meanwhile, we replace the original blur matrix with a simple ill-conditioned matrix whose singular values are easy to compute. The new numerical experiment shows the proximal generalization of Nesterov's accelerated gradient descent (NAG) for the strongly convex function has a faster linear convergence rate than ISTA. Based on the tighter pivotal inequality, we also generalize the faster linear convergence rate to composite optimization, in both the objective value and the squared proximal subgradient norm, by taking advantage of the well-constructed Lyapunov function with a slight modification and the phase-space representation based on the high-resolution differential equation framework from the implicit-velocity scheme.

preprint | 2022 | arXiv

A Smart Contract based Crowdfunding Mechanism for Hierarchical Federated Learning

Hierarchical Federated Learning (HFL) is introduced as a promising technique that allows model owners to fully exploit computational resources and bandwidth resources to train the global model. However, due to the high training cost, a single model owner may not be able to deploy HFL. To address this issue, we develop a smart contract based trust crowdfunding mechanism for HFL, which enables multiple model owners to obtain a crowdfunding model with high social utility for multiple crowdfunding participants. To ensure the authenticity of the crowdfunding mechanism, we implemented the Vickrey-Clarke-Groves (VCG) mechanism to encourage all crowdfunding participants and clients to provide realistic bids and offers. At the same time, in order to ensure guaranteed trustworthiness of crowdfunding and automatic distribution of funds, we develop and implement a smart contract to record the crowdfunding process and training results in the blockchain. We prove that the proposed scheme satisfies the budget balance and participant constraint. Finally, we implement a prototype of this smart contract on an Ethereum private chain and evaluate the proposed VCG mechanism. The experimental results demonstrate that the proposed scheme can effectively improve social utility while ensuring the authenticity and trustworthiness of the crowdfunding process.

preprint | 2022 | arXiv

Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Perceptual quality assessment of videos acquired in the wild is of vital importance for quality assurance of video services. The inaccessibility of reference videos with pristine quality and the complexity of authentic distortions pose great challenges for this kind of blind video quality assessment (BVQA) task. Although model-based transfer learning is an effective and efficient paradigm for the BVQA task, it remains a challenge to explore what and how to bridge the domain shifts for better video representation. In this work, we propose to transfer knowledge from image quality assessment (IQA) databases with authentic distortions and large-scale action recognition with rich motion patterns. We rely on both groups of data to learn the feature extractor. We train the proposed model on the target VQA databases using a mixed list-wise ranking loss function. Extensive experiments on six databases demonstrate that our method performs very competitively under both individual database and mixed database training settings. We also verify the rationality of each component of the proposed method and explore a simple manner for further improvement.

preprint | 2022 | arXiv

Correlation Filters for Unmanned Aerial Vehicle-Based Aerial Tracking: A Review and Experimental Evaluation

Aerial tracking, which has exhibited its omnipresent dedication and splendid performance, is one of the most active applications in the remote sensing field. In particular, unmanned aerial vehicle (UAV)-based remote sensing systems, equipped with a visual tracking approach, have been widely used in aviation, navigation, agriculture, transportation, and public security, etc. As is mentioned above, the UAV-based aerial tracking platform has been gradually developed from research to practical application stage, reaching one of the main aerial remote sensing technologies in the future. However, due to the real-world onerous situations, e.g., harsh external challenges, the vibration of the UAV mechanical structure (especially under strong wind conditions), the maneuvering flight in complex environments, and the limited computation resources onboard, accuracy, robustness, and high efficiency are all crucial for the onboard tracking methods. Recently, the discriminative correlation filter (DCF)-based trackers have stood out for their high computational efficiency and appealing robustness on a single CPU, and have flourished in the UAV visual tracking community. In this work, the basic framework of the DCF-based trackers is firstly generalized, based on which, 23 state-of-the-art DCF-based trackers are orderly summarized according to their innovations for solving various issues. Besides, exhaustive and quantitative experiments have been extended on various prevailing UAV tracking benchmarks, i.e., UAV123, UAV123@10fps, UAV20L, UAVDT, DTB70, and VisDrone2019-SOT, which contain 371,903 frames in total. The experiments show the performance, verify the feasibility, and demonstrate the current challenges of DCF-based trackers onboard UAV tracking.

preprint | 2022 | arXiv

DarkLighter: Light Up the Darkness for UAV Tracking

Recent years have witnessed the fast evolution and promising performance of the convolutional neural network (CNN)-based trackers, which aim at imitating biological visual systems. However, current CNN-based trackers can hardly generalize well to low-light scenes, which are commonly lacking in existing training sets. In indistinguishable night scenarios frequently encountered in unmanned aerial vehicle (UAV) tracking-based applications, the robustness of the state-of-the-art (SOTA) trackers drops significantly. To facilitate aerial tracking in the dark through a general fashion, this work proposes a low-light image enhancer, namely DarkLighter, which is dedicated to iteratively alleviating the impact of poor illumination and noise. A lightweight map estimation network, i.e., ME-Net, is trained to efficiently estimate illumination maps and noise maps jointly. Experiments are conducted with several SOTA trackers on numerous UAV dark tracking scenes. Exhaustive evaluations demonstrate the reliability and universality of DarkLighter, with high efficiency. Moreover, DarkLighter has further been implemented on a typical UAV system. Real-world tests at night scenes have verified its practicability and dependability.

preprint | 2022 | arXiv

Dissipative soliton generation and real-time dynamics in microresonator-filtered fiber lasers

Optical frequency combs in microresonators (microcombs) have a wide range of applications in science and technology, due to their compact size and access to considerably larger comb spacing. Despite recent successes, the problems of self-starting, high mode efficiency as well as high output power have not been fully addressed for conventional soliton microcombs. The recent demonstration of laser cavity soliton microcombs, achieved by nesting a microresonator into a fiber cavity, shows great potential to solve these problems. Here we comprehensively study the dissipative soliton generation and interaction dynamics in a microresonator-filtered fiber laser in both theory and experiment. We first bring theoretical insight into the mode-locking principle, discuss the effect of parameters on soliton properties and provide experimental guidelines for broadband soliton generation. We predict chirped bright dissipative solitons with a flat-top spectral envelope in microresonators with normal dispersion, which is fundamentally infeasible for the externally driven case. Furthermore, we experimentally achieve soliton microcombs with a large bandwidth of ~10 nm and a high mode efficiency of 90.7%. Finally, by taking advantage of an ultrahigh-speed time magnifier, we study the real-time soliton formation and interaction dynamics and experimentally observe a soliton Newton's cradle. Our study will benefit the design of novel, high-efficiency and self-starting microcombs for real-world applications.

preprint | 2022 | arXiv

Experimental Realization of the Rabi-Hubbard Model with Trapped Ions

Quantum simulation provides important tools in studying strongly correlated many-body systems with controllable parameters. As a hybrid of two fundamental models in quantum optics and in condensed matter physics, the Rabi-Hubbard model demonstrates rich physics through the competition between local spin-boson interactions and long-range boson hopping. Here we report an experimental realization of the Rabi-Hubbard model using up to $16$ trapped ions and present a controlled study of its equilibrium properties and quantum dynamics. We observe the ground-state quantum phase transition by slowly quenching the coupling strength, and measure the quantum dynamical evolution in various parameter regimes. With the magnetization and the spin-spin correlation as probes, we verify the prediction of the model Hamiltonian by comparing theoretical results in small system sizes with experimental observations. For larger-size systems of $16$ ions and $16$ phonon modes, the effective Hilbert space dimension exceeds $2^{57}$, whose dynamics is intractable for classical supercomputers.

preprint | 2022 | arXiv

FedIPR: Ownership Verification for Federated Deep Neural Network Models

Federated learning models are collaboratively developed upon valuable training data owned by multiple parties. During the development and deployment of federated models, they are exposed to risks including illegal copying, re-distribution, misuse and/or free-riding. To address these risks, the ownership verification of federated learning models is a prerequisite that protects federated learning model intellectual property rights (IPR), i.e., FedIPR. We propose a novel federated deep neural network (FedDNN) ownership verification scheme that allows private watermarks to be embedded and verified to claim legitimate IPR of FedDNN models. In the proposed scheme, each client independently verifies the existence of the model watermarks and claims respective ownership of the federated model without disclosing either private training data or private watermark information. The effectiveness of embedded watermarks is theoretically justified by the rigorous analysis of conditions under which watermarks can be privately embedded and detected by multiple clients. Moreover, extensive experimental results on computer vision and natural language processing tasks demonstrate that varying bit-length watermarks can be embedded and reliably detected without compromising original model performances. Our watermarking scheme is also resilient to various federated training settings and robust against removal attacks.

preprint | 2022 | arXiv

Lightweight Long-Range Generative Adversarial Networks

In this paper, we introduce novel lightweight generative adversarial networks, which can effectively capture long-range dependencies in the image generation process, and produce high-quality results with a much simpler architecture. To achieve this, we first introduce a long-range module, allowing the network to dynamically adjust the number of focused sampling pixels and to also augment sampling locations. Thus, it can break the limitation of the fixed geometric structure of the convolution operator, and capture long-range dependencies in both spatial and channel-wise directions. Also, the proposed long-range module can highlight negative relations between pixels, working as a regularization to stabilize training. Furthermore, we propose a new generation strategy through which we introduce metadata into the image generation process to provide basic information about target images, which can stabilize and speed up the training process. Our novel long-range module only introduces few additional parameters and is easily inserted into existing models to capture long-range dependencies. Extensive experiments demonstrate the competitive performance of our method with a lightweight architecture.

preprint | 2022 | arXiv

Memory-Driven Text-to-Image Generation

We introduce a memory-driven semi-parametric approach to text-to-image generation, which is based on both parametric and non-parametric techniques. The non-parametric component is a memory bank of image features constructed from a training set of images. The parametric component is a generative adversarial network. Given a new text description at inference time, the memory bank is used to selectively retrieve image features that are provided as basic information of target images, which enables the generator to produce realistic synthetic results. We also incorporate the content information into the discriminator, together with semantic features, allowing the discriminator to make a more reliable prediction. Experimental results demonstrate that the proposed memory-driven semi-parametric approach produces more realistic images than purely parametric approaches, in terms of both visual fidelity and text-image semantic consistency.

preprint | 2022 | arXiv

Photonic frequency microcombs based on dissipative Kerr and quadratic cavity solitons

Optical frequency comb, with precisely controlled spectral lines spanning a broad range, has been the key enabling technology for many scientific breakthroughs. In addition to the traditional implementation based on modelocked lasers, photonic frequency microcombs based on dissipative Kerr and quadratic cavity solitons in high-Q microresonators have become invaluable in applications requiring compact footprint, low cost, good energy efficiency, large comb spacing, and access to nonconventional spectral regions. In this review, we comprehensively examine the recent progress of photonic frequency microcombs and discuss how various phenomena can be utilized to enhance the microcomb performances that benefit a plethora of applications including optical atomic clockwork, optical frequency synthesizer, precision spectroscopy, astrospectrograph calibration, biomedical imaging, optical communications, coherent ranging, and quantum information science.

preprint | 2022 | arXiv

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

The importance of building text-to-SQL parsers which can be applied to new databases has long been acknowledged, and a critical step to achieve this goal is schema linking, i.e., properly recognizing mentions of unseen columns or tables when generating SQLs. In this work, we propose a novel framework to elicit relational structures from large-scale pre-trained language models (PLMs) via a probing procedure based on Poincaré distance metric, and use the induced relations to augment current graph-based parsers for better schema linking. Compared with commonly-used rule-based methods for schema linking, we found that probing relations can robustly capture semantic correspondences, even when surface forms of mentions and entities differ. Moreover, our probing procedure is entirely unsupervised and requires no additional parameters. Extensive experiments show that our framework sets new state-of-the-art performance on three benchmarks. We empirically verify that our probing procedure can indeed find desired relational structures through qualitative analysis. Our code can be found at https://github.com/AlibabaResearch/DAMO-ConvAI.

preprint | 2022 | arXiv

Robotic Interestingness via Human-Informed Few-Shot Object Detection

Interestingness recognition is crucial for decision making in autonomous exploration for mobile robots. Previous methods proposed an unsupervised online learning approach that can adapt to environments and detect interesting scenes quickly, but lack the ability to adapt to human-informed interesting objects. To solve this problem, we introduce a human-interactive framework, AirInteraction, that can detect human-informed objects via few-shot online learning. To reduce the communication bandwidth, we first apply an online unsupervised learning algorithm on the unmanned vehicle for interestingness recognition and then only send the potential interesting scenes to a base-station for human inspection. The human operator is able to draw and provide bounding box annotations for particular interesting objects, which are sent back to the robot to detect similar objects via few-shot learning. Only using few human-labeled examples, the robot can learn novel interesting object categories during the mission and detect interesting scenes that contain the objects. We evaluate our method on various interesting scene recognition datasets. To the best of our knowledge, it is the first human-informed few-shot object detection framework for autonomous exploration.

preprint | 2022 | arXiv

S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

The task of converting a natural language question into an executable SQL query, known as text-to-SQL, is an important branch of semantic parsing. The state-of-the-art graph-based encoder has been successfully used in this task but does not model the question syntax well. In this paper, we propose S$^2$SQL, injecting Syntax to question-Schema graph encoder for Text-to-SQL parsers, which effectively leverages the syntactic dependency information of questions in text-to-SQL to improve the performance. We also employ the decoupling constraint to induce diverse relational edge embedding, which further improves the network's performance. Experiments on Spider and the robustness setting Spider-Syn demonstrate that the proposed approach outperforms all existing methods when pre-training models are used, ranking first on the Spider leaderboard.

preprint | 2022 | arXiv

Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Unmanned aerial vehicle (UAV)-based visual object tracking has enabled a wide range of applications and attracted increasing attention in the field of intelligent transportation systems because of its versatility and effectiveness. As an emerging force in the revolutionary trend of deep learning, Siamese networks shine in UAV-based object tracking with their promising balance of accuracy, robustness, and speed. Thanks to the development of embedded processors and the gradual optimization of deep neural networks, Siamese trackers receive extensive research and realize preliminary combinations with UAVs. However, due to the UAV's limited onboard computational resources and the complex real-world circumstances, aerial tracking with Siamese networks still faces severe obstacles in many aspects. To further explore the deployment of Siamese networks in UAV-based tracking, this work presents a comprehensive review of leading-edge Siamese trackers, along with an exhaustive UAV-specific analysis based on the evaluation using a typical UAV onboard processor. Then, the onboard tests are conducted to validate the feasibility and efficacy of representative Siamese trackers in real-world UAV deployment. Furthermore, to better promote the development of the tracking community, this work analyzes the limitations of existing Siamese trackers and conducts additional experiments represented by low-illumination evaluations. In the end, prospects for the development of Siamese tracking for UAV-based intelligent transportation systems are deeply discussed. The unified framework of leading-edge Siamese trackers, i.e., code library, and the results of their experimental evaluations are available at https://github.com/vision4robotics/SiameseTracking4UAV .

preprint | 2022 | arXiv

Universal Learned Image Compression With Low Computational Cost

Recently, learned image compression methods have developed rapidly and exhibited excellent rate-distortion performance when compared to traditional standards, such as JPEG, JPEG2000 and BPG. However, the learning-based methods suffer from high computational costs, which hinders deployment on devices with limited resources. To this end, we propose shift-addition parallel modules (SAPMs), including SAPM-E for the encoder and SAPM-D for the decoder, to largely reduce the energy consumption. To be specific, they can be taken as plug-and-play components to upgrade existing CNN-based architectures, where the shift branch is used to extract large-grained features as compared to small-grained features learned by the addition branch. Furthermore, we thoroughly analyze the probability distribution of latent representations and propose to use Laplace Mixture Likelihoods for more accurate entropy estimation. Experimental results demonstrate that the proposed methods can achieve comparable or even better performance on both PSNR and MS-SSIM metrics than the convolutional counterpart, with an approximately 2x reduction in energy.

preprint | 2022 | arXiv

Unveiling the relative timing jitter in counter propagating all normal dispersion (CANDi) dual-comb fiber laser

Counter-propagating all-normal dispersion (CANDi) fiber laser is an emerging high-energy single-cavity dual-comb laser source. Its relative timing jitter (RTJ), a critical parameter for dual-comb timing precision and spectral resolution, has not been comprehensively investigated. In this paper, we enhance the state-of-the-art CANDi fiber laser pulse energy from 1 nJ to 8 nJ. We then introduce a novel reference-free RTJ characterization technique that provides shot-to-shot measurement capability at femtosecond precision for the first time. The measurement noise floor reaches $1.6\times10^{-7}$ fs$^2$/Hz, and the corresponding integrated measurement precision is only 1.8 fs (1 kHz, 20 MHz). With this new characterization tool, we are able to study the physical origin of CANDi laser's RTJ in detail. We first verify that the cavity length fluctuation does not contribute to the RTJ. Then we measure the integrated RTJ to be 39 fs (1 kHz, 20 MHz) and identify the pump relative intensity noise (RIN) to be the dominant factor responsible for it. In particular, pump RIN is coupled to the RTJ through the Gordon-Haus effect. Finally, solutions to reduce the free-running CANDi laser's RTJ are discussed. This work provides a general guideline to improve the performance of compact single-cavity dual-comb systems like CANDi laser benefitting various dual-comb applications.

preprint | 2021 | arXiv

All-Day Object Tracking for Unmanned Aerial Vehicle

Visual object tracking, a major interest in the image processing field, has facilitated numerous real-world applications. Among them, equipping unmanned aerial vehicles (UAVs) with real-time robust visual trackers for all-day aerial maneuvers is attracting increasing attention and has remarkably broadened the scope of applications of object tracking. However, prior tracking methods have focused merely on robust tracking in well-illuminated scenes, while ignoring trackers' ability to be deployed in the dark. In darkness, conditions can be more complex and harsh, easily leading to less robust tracking or even tracking failure. To this end, this work proposes a novel discriminative correlation filter-based tracker with illumination-adaptive and anti-dark capability, namely ADTrack. ADTrack first exploits image illuminance information to adapt the model to the given light condition. Then, by virtue of an efficient and effective image enhancer, ADTrack carries out image pretreatment, in which a target-aware mask is generated. Benefiting from the mask, ADTrack solves a dual regression problem in which dual filters, i.e., the context filter and the target-focused filter, are trained with a mutual constraint. Thus ADTrack is able to maintain continuously favorable performance in all-day conditions. In addition, this work constructs a UAV nighttime tracking benchmark, UAVDark135, comprising more than 125k manually annotated frames, which is also the very first UAV nighttime tracking benchmark. Exhaustive experiments on authoritative daytime benchmarks, i.e., UAV123@10fps and DTB70, and on the newly built dark benchmark UAVDark135 validate the superiority of ADTrack in both bright and dark conditions on a single CPU.

preprint | 2021 | arXiv

Ferroelectric-HfO2/Oxide Interfaces, Oxygen Distribution Effect and Implications for Device Performance

Atomic-scale understanding of HfO2 ferroelectricity is important to help address many challenges in developing reliable and high-performance ferroelectric HfO2 (fe-HfO2) based devices. Though investigated from different angles, a factor that is relevant to real devices and clearly deserves more attention has largely been overlooked by previous research, namely, the fe-HfO2/dielectric interface. Here, we investigate the electronic structures of several typical interfaces formed between ultrathin fe-HfO2 and oxide dielectrics in the sub-3-nm region. We find that interface formation introduces strong depolarizing fields in fe-HfO2, which are detrimental to ferroelectric polarization but can be a merit if tamed for tunneling devices, as recently demonstrated. Asymmetric oxygen distribution-induced polarity, intertwined with ferroelectric polarization or not, is also investigated as a relevant interfacial effect in real devices. Though considered detrimental from certain aspects, such as inducing a built-in field (independent of ferroelectric polarization) and exacerbating depolarization (intertwined with ferroelectric polarization), it can be partly balanced out by other effects, such as annealing (extrinsic) and polarity-induced defect formation (intrinsic). This work provides insights into ferroelectric-HfO2/dielectric interfaces and some useful implications for the development of devices.

preprint | 2021 | arXiv

Internal reverse-biased p-n junctions: a possible origin of the high resistance in phase change superlattice

Phase change superlattice is one of the emerging material technologies for ultralow-power phase change memories. However, the resistance switching mechanism of phase change superlattice is still hotly debated. Early electrical measurements and recent materials characterizations have suggested that the Kooi phase is very likely to be the as-fabricated low-resistance state. Due to the difficulty in in-situ characterization at atomic resolution, the structure of the electrically switched superlattice in its high-resistance state is still unknown and mainly investigated by theoretical modellings. So far, there has been no simple model that can unify experimental results obtained from device-level electrical measurements and atomic-level materials characterizations. In this work, we carry out atomistic transport modellings of the phase change superlattice device and propose a simple mechanism accounting for its high resistance. The modeled high-resistance state is based on the interfacial phase changed superlattice that has previously been mistaken for the low-resistance state. This work advances the understanding of phase change superlattice for emerging memory applications.

preprint | 2021 | arXiv

Van Hove Singularity Arising from Mexican-Hat-Shaped Inverted Bands in the Topological Insulator Sn-doped Bi$_{1.1}$Sb$_{0.9}$Te$_{2}$S

The optical properties of Sn-doped Bi$_{1.1}$Sb$_{0.9}$Te$_{2}$S, the most bulk-insulating topological insulator thus far, have been examined at different temperatures over a broad frequency range. No Drude response is detected in the low-frequency range down to 30~cm$^{-1}$, corroborating the excellent bulk-insulating property of this material. Intriguingly, we observe a sharp peak at about 2\,200~cm$^{-1}$ in the optical conductivity at 5~K. Further quantitative analyses of the line shape and temperature dependence of this sharp peak, in combination with first-principles calculations, suggest that it corresponds to a van Hove singularity arising from Mexican-hat-shaped inverted bands. Such a van Hove singularity is a pivotal ingredient of various strongly correlated phases.

preprint2020arXiv

An improved transformation between Fibonacci FSRs and Galois FSRs

Feedback shift registers (FSRs), which have two configurations, Fibonacci and Galois, are a primitive building block in stream ciphers. In this paper, an improved transformation is proposed between Fibonacci FSRs and Galois FSRs. In previous results, the number of stages is identical when constructing the equivalent FSRs. Here, there is no requirement to keep the number of stages equal for two equivalent FSRs. More precisely, it is verified that an equivalent Galois FSR with fewer stages cannot be found for a Fibonacci FSR, but the converse is not true. Furthermore, the total number of equivalent Galois FSRs for a given Fibonacci FSR with n stages is calculated. In order to reduce the propagation time and memory, an effective algorithm is developed to find an equivalent Galois FSR, which is proved to have the minimal number of operators and stages. Finally, the feasibility of our proposed strategies for mutually transforming Fibonacci FSRs and Galois FSRs is demonstrated by numerical examples.
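
As a rough illustration of what "equivalent" means for the two configurations, the sketch below covers only the simplest case: a linear, 4-stage register with equal stage counts, not the paper's general transformation. The recurrence a[n+4] = a[n+1] XOR a[n] (polynomial x^4 + x + 1), the tap mask, and all function names are assumptions chosen for the example.

```python
def fibonacci_lfsr(state, steps):
    """4-stage Fibonacci LFSR: one feedback bit is shifted in at the top.

    Implements a[n+4] = a[n+1] XOR a[n]; bit 0 of `state` is the output.
    """
    out = []
    for _ in range(steps):
        out.append(state & 1)
        new_bit = (state ^ (state >> 1)) & 1    # a[n] XOR a[n+1]
        state = (state >> 1) | (new_bit << 3)
    return out


def galois_lfsr(state, steps, mask=0b1100):
    """4-stage Galois LFSR: the output bit is XORed into several stages.

    With mask 0b1100 the output obeys the same recurrence as above.
    """
    out = []
    for _ in range(steps):
        lsb = state & 1
        out.append(lsb)
        state >>= 1
        if lsb:
            state ^= mask
    return out


def matching_galois_state(fib_state):
    """Initial Galois state that reproduces the Fibonacci output stream.

    For this particular register only the top bit changes: g3 = a3 XOR a0.
    """
    return fib_state ^ ((fib_state & 1) << 3)


if __name__ == "__main__":
    seed = 0b0001
    print(fibonacci_lfsr(seed, 15))
    print(galois_lfsr(matching_galois_state(seed), 15))
```

Both calls print the same bit stream. The Galois form applies its XORs to separate stages in parallel rather than chaining them into one feedback bit, which is the usual motivation for preferring it when propagation time matters.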

preprint2020arXiv

Atomic origin for hydrogenation promoted bulk oxygen vacancies removal in vanadium dioxide

Oxygen vacancies (VO), a common type of point defect in metal oxide materials, play important roles in their physical and chemical properties. To obtain a stoichiometric oxide crystal, the pre-existing VO is always removed via careful post-annealing treatment at high temperature in an air or oxygen atmosphere. However, the annealing conditions are difficult to control, and the removal of VO in the bulk phase is restrained by the high energy barrier of VO migration. Here, we selected VO2 crystal film as the model system and developed an alternative annealing treatment aided by controllable hydrogen doping, which can realize effective removal of VO defects in VO2-δ crystal at lower temperature. This finding is attributed to the hydrogenation-accelerated recovery of oxygen vacancies in VO2-δ crystal. Theoretical calculations revealed that the H-doping-induced electrons are prone to accumulate around the oxygen defects in VO2-δ film, which facilitates the diffusion of VO and thus makes it easier to remove. The methodology is expected to be applicable to other metal oxides for the control of oxygen-related point defects.

preprint2020arXiv

Image-to-Image Translation with Text Guidance

The goal of this paper is to embed controllable factors, i.e., natural language descriptions, into image-to-image translation with generative adversarial networks, which allows text descriptions to determine the visual attributes of synthetic images. We propose four key components: (1) the implementation of part-of-speech tagging to filter out non-semantic words in the given description, (2) the adoption of an affine combination module to effectively fuse text and image features from different modalities, (3) a novel refined multi-stage architecture to strengthen the differential ability of discriminators and the rectification ability of generators, and (4) a new structure loss to further improve discriminators to better distinguish real and synthetic images. Extensive experiments on the COCO dataset demonstrate that our method achieves superior performance on both visual realism and semantic consistency with given descriptions.

preprint2020arXiv

Knowledge Graph Extraction from Videos

Nearly all existing techniques for automated video annotation (or captioning) describe videos using natural language sentences. However, this has several shortcomings: (i) it is very hard to then further use the generated natural language annotations in automated data processing, (ii) generating natural language annotations requires solving the hard subtask of generating semantically precise and syntactically correct natural language sentences, which is actually unrelated to the task of video annotation, (iii) it is difficult to quantitatively measure performance, as standard metrics (e.g., accuracy and F1-score) are inapplicable, and (iv) annotations are language-specific. In this paper, we propose the new task of knowledge graph extraction from videos, i.e., producing a description in the form of a knowledge graph of the contents of a given video. Since no datasets exist for this task, we also include a method to automatically generate them, starting from datasets where videos are annotated with natural language. We then describe an initial deep-learning model for knowledge graph extraction from videos, and report results on MSVD* and MSR-VTT*, two datasets obtained from MSVD and MSR-VTT using our method.

preprint2020arXiv

ManiGAN: Text-Guided Image Manipulation

The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method. Code is available at https://github.com/mrlibw/ManiGAN.

preprint2020arXiv

Reliable Liver Fibrosis Assessment from Ultrasound using Global Hetero-Image Fusion and View-Specific Parameterization

Ultrasound (US) is a critical modality for diagnosing liver fibrosis. Unfortunately, assessment is very subjective, motivating automated approaches. We introduce a principled deep convolutional neural network (CNN) workflow that incorporates several innovations. First, to avoid overfitting on non-relevant image features, we force the network to focus on a clinical region of interest (ROI), encompassing the liver parenchyma and upper border. Second, we introduce global hetero-image fusion (GHIF), which allows the CNN to fuse features from an arbitrary number of images in a study, increasing its versatility and flexibility. Finally, we use 'style'-based view-specific parameterization (VSP) to tailor the CNN processing for different viewpoints of the liver, while keeping the majority of parameters the same across views. Experiments on a dataset of 610 patient studies (6979 images) demonstrate that our pipeline can contribute roughly 7% and 22% improvements in partial area under the curve and recall at 90% precision, respectively, over conventional classifiers, validating our approach to this crucial problem.
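
For readers unfamiliar with the "recall at 90% precision" metric quoted above, the following is a minimal sketch of one common way to compute it from per-study scores. The function name, the tie handling, and the toy data are assumptions for illustration; this is not the authors' evaluation code.

```python
def recall_at_precision(labels, scores, min_precision=0.9):
    """Best recall over all score thresholds whose precision >= min_precision.

    labels: iterable of 0/1 ground-truth values (1 = positive).
    scores: classifier scores, higher means more likely positive.
    Returns 0.0 if no threshold reaches the requested precision.
    """
    labels = list(labels)
    scores = list(scores)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Walk thresholds from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Items with equal scores fall on the same side of any threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        if precision >= min_precision:
            best = max(best, tp / total_pos)
    return best


if __name__ == "__main__":
    toy_labels = [1, 1, 0, 1, 0]            # made-up data
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.1]
    print(recall_at_precision(toy_labels, toy_scores, 0.9))
```

On the toy data only the two highest-scoring items can be accepted before precision drops below 0.9, so two of the three positives are recovered.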

preprint2020arXiv

Singularities of plane gravitational waves and their memory effects

Similar to the Schwarzschild coordinates for spherical black holes, the Baldwin, Jeffery and Rosen (BJR) coordinates for plane gravitational waves are often singular, and extensions beyond such singularities are necessary, before studying asymptotic properties of such spacetimes at the null infinity of the plane, on which the gravitational waves propagate. The latter is closely related to the studies of memory effects and soft graviton theorems. In this paper, we point out that in the BJR coordinates all the spacetimes are singular physically at the focused point $u = u_s$, except for the two cases: (1) $\alpha = 1/2, \; \forall \; \chi_n$; and (2) $\alpha = 1, \; \chi_i = 0 \; (i = 1, 2, 3)$, where $\chi_n$ are the coefficients in the expansion $\chi \equiv \left[\det\left(g_{ab}\right)\right]^{1/4} = \left(u - u_s\right)^{\alpha} \sum_{n = 0}^{\infty} \chi_n \left(u - u_s\right)^n$ with $\chi_0 \neq 0$, the constant $\alpha \in (0, 1]$ characterizes the strength of the singularities, and $g_{ab}$ denotes the reduced metric on the two-dimensional plane orthogonal to the propagation direction of the wave. Therefore, the hypersurfaces $u = u_s$ already represent the boundaries of such spacetimes, and the null infinity does not belong to them. As a result, they cannot be used to study properties of plane gravitational waves at null infinities, including memory effects and soft graviton theorems.

preprint2020arXiv

Spectral self-adaptive absorber/emitter for harvesting energy from the sun and outer space

The sun (~6000 K) and outer space (~3 K) are the original heat source and sink for human beings on Earth. Absorbing solar irradiation and harvesting the coldness of outer space for energy utilization have attracted considerable interest from researchers. However, combining these two functions in a static device for continuous energy harvesting is unachievable due to the intrinsic infrared spectral conflict. In this study, we developed a spectral self-adaptive absorber/emitter (SSA/E) that switches between daytime photothermal and nighttime radiative sky cooling modes depending on the phase transition of the vanadium dioxide coated layer. A 24-hour day-night test showed that the fabricated SSA/E has continuous energy harvesting ability and improved overall energy utilization performance, thus showing remarkable potential in future energy applications.

preprint2020arXiv

Super-resolution in recovering embedded electromagnetic sources in high contrast media

The purpose of this work is to provide a rigorous mathematical analysis of the expected super-resolution phenomenon in the time-reversal imaging of electromagnetic (EM) radiating sources embedded in a high contrast medium. It is known that the resolution limit is essentially determined by the sharpness of the imaginary part of the EM Green's tensor for the associated background. We first establish the close connection between the resolution and the material parameters and the resolvent of the electric integral operator, via the Lippmann-Schwinger representation formula. We then present an insightful characterization of the spectral structure of the integral operator for a general bounded domain and derive the pole-pencil decomposition of its resolvent in the high contrast regime. For the special case of a spherical domain, we provide some quantitative asymptotic behavior of the eigenvalues and eigenfunctions. These mathematical findings shall enable us to provide a concise and rigorous illustration of the super-resolution in the EM source reconstruction in high contrast media. Some numerical examples are also presented to verify our main theoretical results.

preprint2016arXiv

Active control of surface plasmon resonance in MoS2-Ag hybrid nanostructures

Molybdenum disulfide (MoS2) monolayers have attracted much attention for their novel optical properties and efficient light-matter interactions. When excited by an incident laser, the optical response of MoS2 monolayers is effectively modified by photo-excited excitons owing to their large exciton binding energy, which can be exploited for optically controllable exciton-plasmon interactions. Inspired by this concept, we experimentally investigated active light control of surface plasmon resonance (SPR) in MoS2-Ag hybrid nanostructures. The white light spectra of SPR were gradually red-shifted by increasing laser power, which was distinctly different from the behavior of the bare Ag nanostructure. This spectroscopic tunability can be further controlled by the near-field coupling strength and the polarization state of light, and selectively applied to the control of the plasmonic dark mode. An analytical Lorentz model for the modulation of the MoS2 dielectric function induced by photo-excited excitons was developed to explain the underlying physics of this SPR tunability. Our study opens new possibilities for the development of all-optically controlled nanophotonic devices based on 2D materials.

33 published items

Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought

When to Invoke: Refining LLM Fairness with Toxicity Assessment

Linear Convergence of ISTA and FISTA

A Smart Contract based Crowdfunding Mechanism for Hierarchical Federated Learning

Blindly Assess Quality of In-the-Wild Videos via Quality-aware Pre-training and Motion Perception

Correlation Filters for Unmanned Aerial Vehicle-Based Aerial Tracking: A Review and Experimental Evaluation

DarkLighter: Light Up the Darkness for UAV Tracking

Dissipative soliton generation and real-time dynamics in microresonator-filtered fiber lasers

Experimental Realization of the Rabi-Hubbard Model with Trapped Ions

FedIPR: Ownership Verification for Federated Deep Neural Network Models

Lightweight Long-Range Generative Adversarial Networks

Memory-Driven Text-to-Image Generation

Photonic frequency microcombs based on dissipative Kerr and quadratic cavity solitons

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

Robotic Interestingness via Human-Informed Few-Shot Object Detection

S$^2$SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers

Siamese Object Tracking for Unmanned Aerial Vehicle: A Review and Comprehensive Analysis

Universal Learned Image Compression With Low Computational Cost

Unveiling the relative timing jitter in counter propagating all normal dispersion (CANDi) dual-comb fiber laser

All-Day Object Tracking for Unmanned Aerial Vehicle

Ferroelectric-HfO2/Oxide Interfaces, Oxygen Distribution Effect and Implications for Device Performance

Internal reverse-biased p-n junctions: a possible origin of the high resistance in phase change superlattice

Van Hove Singularity Arising from Mexican-Hat-Shaped Inverted Bands in the Topological Insulator Sn-doped Bi$_{1.1}$Sb$_{0.9}$Te$_{2}$S

An improved transformation between Fibonacci FSRs and Galois FSRs

Atomic origin for hydrogenation promoted bulk oxygen vacancies removal in vanadium dioxide

Image-to-Image Translation with Text Guidance

Knowledge Graph Extraction from Videos

ManiGAN: Text-Guided Image Manipulation

Reliable Liver Fibrosis Assessment from Ultrasound using Global Hetero-Image Fusion and View-Specific Parameterization

Singularities of plane gravitational waves and their memory effects

Spectral self-adaptive absorber/emitter for harvesting energy from the sun and outer space

Super-resolution in recovering embedded electromagnetic sources in high contrast media

Active control of surface plasmon resonance in MoS2-Ag hybrid nanostructures