Source author record

Fu Li

Fu Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision; eess.IV; quant-ph; Computational Complexity; Human-Computer Interaction; physics.optics; Artificial Intelligence; Computer Science and Game Theory; cond-mat.mtrl-sci; Distributed, Parallel, and Cluster Computing; Machine Learning; math.LO; physics.med-ph; physics.soc-ph; Populations and Evolution; Social and Information Networks

Catalog footprint

What is connected

22 works

16 topics

4 close collaborators


preprint, 2026, arXiv

Learning from Brain Topography: A Hierarchical Local-Global Graph-Transformer Network for EEG Emotion Recognition

Understanding how local neurophysiological patterns interact with global brain dynamics is essential for decoding human emotions from EEG signals. However, existing deep learning approaches often overlook the brain's intrinsic spatial organization, failing to simultaneously capture local topological relations and global dependencies. To address these challenges, we propose Neuro-HGLN, a Neurologically-informed Hierarchical Graph-Transformer Learning Network that integrates biologically grounded priors with hierarchical representation learning. Neuro-HGLN first constructs a spatial Euclidean prior graph based on physical electrode distances to serve as an anatomically grounded inductive bias. A learnable global dynamic graph is then introduced to model functional connectivity across the entire brain. In parallel, to capture fine-grained regional dependencies, Neuro-HGLN builds region-level local graphs using a multi-head self-attention mechanism. These graphs are processed synchronously through local-constrained parallel GCN layers to produce region-specific representations. Subsequently, an iTransformer encoder aggregates these features to capture cross-region dependencies under a dimension-as-token formulation. Extensive experiments demonstrate that Neuro-HGLN achieves state-of-the-art performance on multiple benchmarks, providing enhanced interpretability grounded in neurophysiological structure. These results highlight the efficacy of unifying local topological learning with cross-region dependency modeling for robust EEG emotion recognition.
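The spatial prior graph mentioned in this abstract can be sketched as follows. This is only an illustration: the abstract says the graph is built from physical electrode distances but does not give the weighting, so the Gaussian kernel, the `sigma` parameter, the function name, and the electrode coordinates below are all assumptions.

```python
import math

def euclidean_prior_graph(coords, sigma=1.0):
    """Return a dense adjacency matrix from 3-D electrode coordinates.

    Edge weights use a Gaussian kernel on pairwise Euclidean distance.
    The kernel and `sigma` are assumptions for illustration only.
    """
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops in this sketch
                d = math.dist(coords[i], coords[j])
                adj[i][j] = math.exp(-(d * d) / (2.0 * sigma * sigma))
    return adj

# Three hypothetical electrode positions (arbitrary units).
electrodes = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
prior = euclidean_prior_graph(electrodes)
```

Closer electrodes receive larger weights, which is one simple way to encode an anatomical inductive bias before any learned graph is added.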

preprint, 2022, arXiv

3-D Stochastic Numerical Breast Phantoms for Enabling Virtual Imaging Trials of Ultrasound Computed Tomography

Ultrasound computed tomography (USCT) is an emerging imaging modality for breast imaging that can produce quantitative images that depict the acoustic properties of tissues. Computer-simulation studies, also known as virtual imaging trials, provide researchers with an economical and convenient route to systematically explore imaging system designs and image reconstruction methods. When simulating an imaging technology intended for clinical use, it is essential to employ realistic numerical phantoms that can facilitate the objective, or task-based, assessment of image quality. Moreover, when computing objective image quality measures, an ensemble of such phantoms should be employed that displays the variability in anatomy and object properties that is representative of the to-be-imaged patient cohort. Such stochastic phantoms for clinically relevant applications of USCT are currently lacking. In this work, a methodology for producing realistic three-dimensional (3D) numerical breast phantoms for enabling clinically relevant computer-simulation studies of USCT breast imaging is presented. By extending and adapting an existing stochastic 3D breast phantom for use with USCT, methods for creating ensembles of numerical acoustic breast phantoms are established. These breast phantoms will possess clinically relevant variations in breast size, composition, acoustic properties, tumor locations, and tissue textures. To demonstrate the use of the phantoms in virtual USCT studies, two brief case studies are presented that address the development and assessment of image reconstruction procedures. Examples of breast phantoms produced by use of the proposed methods and a collection of 52 sets of simulated USCT measurement data have been made open source for use in image reconstruction development.

preprint, 2022, arXiv

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. This challenge includes two tracks. Track 1 aims at the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular dataset DIV2K as the training, validation and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted the final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state-of-the-art of super-resolution on compressed image and video. The proposed LDV 3.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge is at https://github.com/RenYang-home/AIM22_CompressSR.

preprint, 2022, arXiv

Boosting Video-Text Retrieval with Explicit High-Level Semantics

Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a text (video) query. Existing methods typically employ completely heterogeneous visual-textual information to align video and text, whilst lacking awareness of the homogeneous high-level semantic information residing in both modalities. To fill this gap, in this work, we propose a novel visual-linguistic aligning model named HiSE for VTR, which improves the cross-modal representation by incorporating explicit high-level semantics. First, we explore the hierarchical property of explicit high-level semantics, and further decompose it into two levels, i.e. discrete semantics and holistic semantics. Specifically, for the visual branch, we exploit an off-the-shelf semantic entity predictor to generate discrete high-level semantics. In parallel, a trained video captioning model is employed to output holistic high-level semantics. As for the textual modality, we parse the text into three parts: occurrence, action and entity. In particular, the occurrence corresponds to the holistic high-level semantics, while both action and entity represent the discrete ones. Then, different graph reasoning techniques are utilized to promote the interaction between holistic and discrete high-level semantics. Extensive experiments demonstrate that, with the aid of explicit high-level semantics, our method achieves superior performance over state-of-the-art methods on three benchmark datasets, including MSR-VTT, MSVD and DiDeMo.

preprint, 2022, arXiv

Egalitarian Resource Sharing Over Multiple Rounds

It is often beneficial for agents to pool their resources in order to better accommodate fluctuations in individual demand. Many multi-round resource allocation mechanisms operate in an online manner: in each round, the agents specify their demands for that round, and the mechanism determines a corresponding allocation. In this paper, we focus instead on the offline setting in which the agents specify their demand for each round at the outset. We formulate a specific resource allocation problem in this setting, and design and analyze an associated mechanism based on the solution concept of lexicographic maximin fairness. We present an efficient implementation of our mechanism, and prove that it is envy-free, non-wasteful, resource monotonic, population monotonic, and group strategyproof. We also prove that our mechanism guarantees each agent at least half of the utility that they can obtain by not sharing their resources. We complement these positive results by proving that no maximin fair mechanism can improve on the aforementioned factor of one-half.

preprint, 2022, arXiv

NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at a fixed bit-rate. In addition, the quality enhancement of Tracks 1 and 3 targets improving the fidelity (PSNR), and Track 2 targets enhancing the perceptual quality. The three tracks attracted a total of 482 registrations. In the test phase, 12 teams, 8 teams and 11 teams submitted the final results of Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of video quality enhancement. The homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh

preprint, 2022, arXiv

OSOP: A Multi-Stage One Shot Object Pose Estimation Framework

We present a novel one-shot method for object detection and 6 DoF pose estimation, that does not require training on target objects. At test time, it takes as input a target image and a textured 3D query model. The core idea is to represent a 3D model with a number of 2D templates rendered from different viewpoints. This enables CNN-based direct dense feature extraction and matching. The object is first localized in 2D, then its approximate viewpoint is estimated, followed by dense 2D-3D correspondence prediction. The final pose is computed with PnP. We evaluate the method on LineMOD, Occlusion, Homebrewed, YCB-V and TLESS datasets and report very competitive performance in comparison to the state-of-the-art methods trained on synthetic data, even though our method is not trained on the object models used for testing.

preprint, 2022, arXiv

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

To achieve disentangled image manipulation, previous works depend heavily on manual annotation. Meanwhile, the available manipulations are limited to a pre-defined set the models were trained for. We propose a novel framework, i.e., Predict, Prevent, and Evaluate (PPE), for disentangled text-driven image manipulation that requires little manual annotation while being applicable to a wide variety of manipulations. Our method approaches the targets by deeply exploiting the power of the large-scale pre-trained vision-language model CLIP. Concretely, we firstly Predict the possibly entangled attributes for a given text command. Then, based on the predicted attributes, we introduce an entanglement loss to Prevent entanglements during training. Finally, we propose a new evaluation metric to Evaluate the disentangled image manipulation. We verify the effectiveness of our method on the challenging face editing task. Extensive experiments show that the proposed PPE framework achieves much better quantitative and qualitative results than the up-to-date StyleCLIP baseline.

preprint, 2021, arXiv

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Conventionally, spatiotemporal modeling networks and their complexity are the two most studied research topics in video action recognition. Existing state-of-the-art methods achieve excellent accuracy regardless of complexity, while efficient spatiotemporal modeling solutions are slightly inferior in performance. In this paper, we attempt to achieve both efficiency and effectiveness simultaneously. First, besides traditionally treating H x W x T video frames as a space-time signal (viewing from the Height-Width spatial plane), we propose to also model video from the other two planes, Height-Time and Width-Time, to capture the dynamics of video thoroughly. Second, our model is based on 2D CNN backbones, and model complexity is kept in mind by design. Specifically, we introduce a novel multi-view fusion (MVF) module to exploit video dynamics using separable convolution for efficiency. It is a plug-and-play module and can be inserted into off-the-shelf 2D CNNs to form a simple yet effective model called MVFNet. Moreover, MVFNet can be thought of as a generalized video modeling framework, and it can specialize to existing methods such as C2D, SlowOnly, and TSM under different settings. Extensive experiments are conducted on popular benchmarks (i.e., Something-Something V1 & V2, Kinetics, UCF-101, and HMDB-51) to show its superiority. The proposed MVFNet can achieve state-of-the-art performance with 2D CNN's complexity.
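The three viewing planes described in this abstract can be illustrated with a small slicing sketch. It only shows how one H x W x T volume yields Height-Width, Height-Time, and Width-Time slices; it is not the MVF module itself, which per the abstract uses separable convolution. The `video[t][h][w]` layout, the function name, and the toy values are assumptions.

```python
def plane_views(video):
    """Slice a video stored as video[t][h][w] into three stacks of 2-D planes."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    # One H x W slice per frame (the usual spatial view).
    hw = [[[video[t][h][w] for w in range(W)] for h in range(H)] for t in range(T)]
    # One H x T slice per image column.
    ht = [[[video[t][h][w] for t in range(T)] for h in range(H)] for w in range(W)]
    # One W x T slice per image row.
    wt = [[[video[t][h][w] for t in range(T)] for w in range(W)] for h in range(H)]
    return {"HW": hw, "HT": ht, "WT": wt}

# Toy clip with T=2, H=3, W=4; each value encodes its own (t, h, w) index.
clip = [[[100 * t + 10 * h + w for w in range(4)] for h in range(3)] for t in range(2)]
views = plane_views(clip)
```

Every voxel appears once in each stack, so a 2D operator applied per slice in the HT or WT stack sees temporal change directly.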

preprint, 2020, arXiv

A Novel Transferability Attention Neural Network Model for EEG Emotion Recognition

Existing methods for electroencephalograph (EEG) emotion recognition always train their models on all EEG samples indiscriminately. However, some of the source (training) samples may have a negative influence because they are significantly dissimilar to the target (test) samples. It is therefore necessary to give more attention to the EEG samples with strong transferability rather than forcefully training a classification model on all the samples. Furthermore, from the perspective of neuroscience, not all the brain regions of an EEG sample contain emotional information that can be transferred to the test data effectively. Data from some brain regions can even have a strong negative effect on learning the emotion classification model. Considering these two issues, in this paper, we propose a transferable attention neural network (TANN) for EEG emotion recognition, which learns emotionally discriminative information by adaptively highlighting the transferable EEG brain-region data and samples through local and global attention mechanisms. This can be implemented by measuring the outputs of multiple brain-region-level discriminators and one single sample-level discriminator. We conduct extensive experiments on three public EEG emotion datasets. The results validate that the proposed model achieves state-of-the-art performance.

preprint, 2020, arXiv

Anisotropic circular photogalvanic effect in colloidal tin sulfide nanosheets

Tin sulfide promises very interesting properties such as a high optical absorption coefficient and a small band gap, while being less toxic compared to other metal chalcogenides. However, the limitations in growing atomically thin structures of tin sulfide hinder the experimental realization of these properties. Due to the flexibility of the colloidal synthesis, it is possible to synthesize very thin and at the same time large nanosheets. Electrical transport measurements show that these nanosheets can function as field-effect transistors with high on/off ratio and p-type behavior. The temperature dependency of the charge transport reveals that defects in the crystal are responsible for the formation of holes as majority carriers. During illumination with circularly polarized light, these crystals generate a helicity dependent photocurrent at zero-volt bias, since their symmetry is broken by asymmetric interfaces (substrate and vacuum). Further, the observed circular photogalvanic effect shows a pronounced in-plane anisotropy, with a higher photocurrent along the armchair direction, originating from the higher absorption coefficient in this direction. Our new insights show the potential of tin sulfide for new functionalities in electronics and optoelectronics, for instance as polarization sensors.

preprint, 2020, arXiv

Cost-effectiveness Analysis of Antiepidemic Policies and Global Situation Assessment of COVID-19

With a two-layer contact-dispersion model and data in China, we analyze the cost-effectiveness of three types of antiepidemic measures for COVID-19: regular epidemiological control, local social interaction control, and inter-city travel restriction. We find that: 1) intercity travel restriction has minimal or even negative effect compared to the other two at the national level; 2) the time of reaching turning point is independent of the current number of cases, and only related to the enforcement stringency of epidemiological control and social interaction control measures; 3) strong enforcement at the early stage is the only opportunity to maximize both antiepidemic effectiveness and cost-effectiveness; 4) mediocre stringency of social interaction measures is the worst choice. Subsequently, we cluster countries/regions into four groups based on their control measures and provide situation assessment and policy suggestions for each group.

preprint, 2020, arXiv

NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with a focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor of 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable of producing high resolution results with the best perceptual quality and similar to the ground truth. The track had 280 registered participants, and 19 teams submitted the final results. They gauge the state-of-the-art in single image super-resolution.

preprint, 2020, arXiv

NTIRE 2020 Challenge on Video Quality Mapping: Methods and Results

This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issues of quality mapping from source video domain to target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the map from more compressed videos to less compressed videos in a supervised training manner. In track 2, algorithms are required to learn the quality mapping from one device to another when their quality varies substantially and weakly-aligned video pairs are available. For track 1, in total 7 teams competed in the final test phase, demonstrating novel and effective solutions to the problem. For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.

preprint, 2020, arXiv

Squeezed Light Induced Two-photon Absorption Fluorescence of Fluorescein Biomarkers

Two-photon absorption (TPA) fluorescence of biomarkers has been decisive in advancing the fields of biosensing and deep-tissue in vivo imaging of live specimens. However, due to the extremely small TPA cross section and the quadratic dependence on the input photon flux, extremely high peak-intensity pulsed lasers are imperative, which can result in significant photo- and thermal-damage. Previous works on entangled TPA (ETPA) with spontaneous parametric down-conversion (SPDC) light sources found a linear dependence on the input photon-pair flux, but are limited by low optical powers, along with a very broad spectrum. We report that by using a high-flux squeezed light source for TPA, a fluorescence enhancement of 47 is achieved in fluorescein biomarkers as compared to classical TPA. Moreover, a polynomial behavior of the TPA rate is observed in the DCM laser dye.

preprint, 2019, arXiv

Photon statistics of quantum light on scattering from rotating ground glass

When a laser beam passes through a rotating ground glass (RGG), the scattered light exhibits thermal statistics. This is extensively used in speckle imaging. This scattering process has not been addressed in photon picture and is especially relevant if non-classical light is scattered by the RGG. We develop the photon picture for the scattering process using the Bose statistics for distributing $N$ photons in $M$ pixels. We obtain analytical form for the P-distribution of the output field in terms of the P-distribution of the input field. In particular we obtain a general relation for the $n$-th order correlation function of the scattered light, i.e., $g_{\text{out}}^{(n)}\simeq n!\,g_{\text{in}}^{(n)}$, which holds for any order-$n$ and for arbitrary input states. This result immediately recovers the classical transformation of coherent light to pseudo-thermal light by RGG.
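As a quick check of the relation $g_{\text{out}}^{(n)}\simeq n!\,g_{\text{in}}^{(n)}$ quoted above, the sketch below evaluates it for two standard input states. The function name is hypothetical; the input values used (all orders equal to 1 for coherent light, $g^{(2)}=0$ for a single-photon Fock state) are textbook facts rather than results from the abstract.

```python
from math import factorial

def g_out(n, g_in):
    """Approximate n-th order correlation after the RGG: n! * g_in^(n)."""
    return factorial(n) * g_in

# Coherent input has g_in^(n) = 1 at every order, so the output follows
# the thermal values n!.
coherent = [g_out(n, 1.0) for n in range(1, 5)]

# A single-photon Fock state has g_in^(2) = 0, so the relation keeps it at 0.
single_photon_g2 = g_out(2, 0.0)
```

The coherent case reproduces the pseudo-thermal result the abstract mentions, with second-order correlation 2 and third-order correlation 6.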

preprint, 2015, arXiv

Characterizing Propositional Proofs as Non-Commutative Formulas

Does every Boolean tautology have a short propositional-calculus proof? Here, a propositional calculus (i.e. Frege) proof is a proof starting from a set of axioms and deriving new Boolean formulas using a set of fixed sound derivation rules. Establishing any super-polynomial size lower bound on Frege proofs (in terms of the size of the formula proved) is a major open problem in proof complexity, and among a handful of fundamental hardness questions in complexity theory by and large. Non-commutative arithmetic formulas, on the other hand, constitute a quite weak computational model, for which exponential-size lower bounds were shown already back in 1991 by Nisan [Nis91] who used a particularly transparent argument. In this work we show that Frege lower bounds in fact follow from corresponding size lower bounds on non-commutative formulas computing certain polynomials (and that such lower bounds on non-commutative formulas must exist, unless NP=coNP). More precisely, we demonstrate a natural association between tautologies $T$ to non-commutative polynomials $p$, such that: if $T$ has a polynomial-size Frege proof then $p$ has a polynomial-size non-commutative arithmetic formula; and conversely, when $T$ is a DNF, if $p$ has a polynomial-size non-commutative arithmetic formula over $GF(2)$ then $T$ has a Frege proof of quasi-polynomial size.

preprint, 2015, arXiv

Combining Traditional Marketing and Viral Marketing with Amphibious Influence Maximization

In this paper, we propose the amphibious influence maximization (AIM) model that combines traditional marketing via content providers and viral marketing to consumers in social networks in a single framework. In AIM, a set of content providers and consumers form a bipartite network while consumers also form their social network, and influence propagates from the content providers to consumers and among consumers in the social network following the independent cascade model. An advertiser needs to select a subset of seed content providers and a subset of seed consumers, such that the influence from the seed providers passing through the seed consumers could reach a large number of consumers in the social network in expectation. We prove that the AIM problem is NP-hard to approximate to within any constant factor via a reduction from Feige's k-prover proof system for 3-SAT5. We also give evidence that even when the social network graph is trivial (i.e. has no edges), a polynomial time constant factor approximation for AIM is unlikely. However, when we assume that the weighted bi-adjacency matrix that describes the influence of content providers on consumers is of constant rank, a common assumption often used in recommender systems, we provide a polynomial-time algorithm that achieves approximation ratio of $(1-1/e-ε)^3$ for any (polynomially small) $ε> 0$. Our algorithmic results still hold for a more general model where cascades in social network follow a general monotone and submodular function.
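The independent cascade model that the abstract relies on can be sketched as a short simulation. This is the generic cascade process only, not the AIM seed-selection algorithm or its approximation scheme; the graph encoding, node names, function names, and trial count are assumptions made for illustration.

```python
import random

def independent_cascade(graph, seeds, rng):
    """Run one cascade; graph maps node -> list of (neighbor, probability)."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly active node gets one chance per inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(graph, seeds, trials=2000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, rng)) for _ in range(trials))
    return total / trials

# Hypothetical network: provider "p1" reaches consumer "a", who may pass it on.
toy = {"p1": [("a", 1.0)], "a": [("b", 1.0), ("c", 0.0)], "b": []}
reached = independent_cascade(toy, ["p1"], random.Random(0))
spread = expected_spread(toy, ["p1"], trials=50)
```

With probabilities of exactly 1 and 0 the toy cascade is deterministic, which makes the mechanics easy to follow before moving to fractional probabilities. For scale, the ratio $(1-1/e-ε)^3$ in the abstract approaches roughly 0.253 as $ε$ goes to 0.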

preprint, 2014, arXiv

An ideal experiment to determine the 'past of a particle' in the nested Mach-Zehnder Interferometer

An ideal experiment is designed to determine the past of a particle in the nested Mach-Zehnder interferometer (MZI) by using standard quantum mechanics with quantum non-demolition measurements. We find that when the photon reaches the detector, it only follows one arm of the outer interferometer and leaves no trace in the inner MZI, whereas when it goes through the inner MZI, it cannot reach the detector. Our result, obtained from standard quantum mechanics, contradicts the statement based on the two-state vector formalism, "the photon did not enter the (inner) interferometer, the photon never left the interferometer, but it was there". Therefore, the statement and also the overlap claim are incorrect.

preprint, 2014, arXiv

Generating Matrix Identities and Proof Complexity

Motivated by the fundamental lower bounds questions in proof complexity, we initiate the study of matrix identities as hard instances for strong proof systems. A matrix identity of $d \times d$ matrices over a field $\mathbb{F}$, is a non-commutative polynomial $f(x_1,\ldots,x_n)$ over $\mathbb{F}$ such that $f$ vanishes on every $d \times d$ matrix assignment to its variables. We focus on arithmetic proofs, which are proofs of polynomial identities operating with arithmetic circuits and whose axioms are the polynomial-ring axioms (these proofs serve as an algebraic analogue of the Extended Frege propositional proof system; and over $GF(2)$ they constitute formally a sub-system of Extended Frege [HT12]). We introduce a decreasing in strength hierarchy of proof systems within arithmetic proofs, in which the $d$th level is a sound and complete proof system for proving $d \times d$ matrix identities (over a given field). For each level $d>2$ in the hierarchy, we establish a proof-size lower bound in terms of the number of variables in the matrix identity proved: we show the existence of a family of matrix identities $f_n$ with $n$ variables, such that any proof of $f_n=0$ requires $Ω(n^{2d})$ number of lines. The lower bound argument uses fundamental results from the theory of algebras with polynomial identities together with a generalization of the arguments in [Hru11]. We then set out to study matrix identities as hard instances for (full) arithmetic proofs. We present two conjectures, one about non-commutative arithmetic circuit complexity and the other about proof complexity, under which up to exponential-size lower bounds on arithmetic proofs (in terms of the arithmetic circuit size of the identities proved) hold. Finally, we discuss the applicability of our approach to strong propositional proof systems such as Extended Frege.
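To make the notion of a matrix identity concrete, the sketch below evaluates the standard polynomial $s_k(x_1,\ldots,x_k)=\sum_{\sigma}\mathrm{sgn}(\sigma)\,x_{\sigma(1)}\cdots x_{\sigma(k)}$ on $2 \times 2$ integer matrices. By the Amitsur-Levitzki theorem (a classical result, not stated in the abstract), $s_4$ vanishes on all $2 \times 2$ matrices, while the commutator $s_2$ does not. The helper names and the sample matrices are assumptions, and this is unrelated to the proof systems or lower bounds the paper studies.

```python
from itertools import permutations

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sign(perm):
    """Sign of a permutation, computed from its inversion count."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def standard_polynomial(mats):
    """Evaluate s_k on the k given matrices (signed sum over all orderings)."""
    n = len(mats[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(mats))):
        prod = mats[perm[0]]
        for idx in perm[1:]:
            prod = matmul(prod, mats[idx])
        s = sign(perm)
        total = [[total[i][j] + s * prod[i][j] for j in range(n)] for i in range(n)]
    return total

# Four arbitrary 2 x 2 integer matrices (exact arithmetic, no rounding).
mats = [[[1, 2], [3, 4]], [[0, 1], [5, -2]], [[2, 0], [1, 7]], [[-3, 4], [6, 1]]]
s4 = standard_polynomial(mats)      # vanishes on 2 x 2 matrices
s2 = standard_polynomial(mats[:2])  # the commutator AB - BA, not an identity here
```

Here `s4` is the zero matrix for any choice of four $2 \times 2$ matrices, whereas `s2` is nonzero for the first two matrices above, so $xy - yx$ is an identity for $1 \times 1$ matrices only.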

preprint, 2014, arXiv

The effect of dissipation in direct communication scheme

The effects of dissipation and of a finite number of beam splitters are discussed. A method using balanced dissipation is proposed to improve communication with a finite number of beam splitters; it greatly increases communication reliability at the expense of decreased communication efficiency.

preprint, 2013, arXiv

Energy-Aware Aggregation of Dynamic Temporal Workload in Data Centers

Data center providers seek to minimize their total cost of ownership (TCO), while power consumption has become a social concern. We present formulations to minimize server energy consumption and server cost under three different data center server setups (homogeneous, heterogeneous, and hybrid hetero-homogeneous clusters) with dynamic temporal workload. Our studies show that the homogeneous model significantly differs from the heterogeneous model in computational time (by an order of magnitude). To be able to compute optimal configurations in near real-time for large scale data centers, we propose two modes, aggregation by maximum and aggregation by mean. In addition, we propose two aggregation methods, static (periodic) aggregation and dynamic (aperiodic) aggregation. We found that in the aggregation by maximum mode, the dynamic aggregation resulted in cost savings of up to approximately 18% over the static aggregation. In the aggregation by mean mode, the dynamic aggregation by mean could save up to approximately 50% workload rearrangement compared to the static aggregation by mean mode. Overall, our methodology helps to understand the trade-off in energy-aware aggregation.
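The two aggregation modes named in this abstract can be sketched for the static (periodic) case as follows. The abstract does not define the exact procedure, so this is one plausible reading: a per-slot workload series is collapsed into fixed-length windows, keeping either the peak or the average of each window. The function name, window length, and demand values are assumptions, and dynamic (aperiodic) aggregation is not covered.

```python
def aggregate_static(workload, window, mode="max"):
    """Collapse a per-slot workload series into fixed-length windows."""
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    out = []
    for start in range(0, len(workload), window):
        chunk = workload[start:start + window]
        # "max" provisions for the peak; "mean" provisions for the average.
        out.append(max(chunk) if mode == "max" else sum(chunk) / len(chunk))
    return out

# Hypothetical demand per time slot (arbitrary units).
demand = [10, 30, 20, 40, 80, 60]
by_max = aggregate_static(demand, 2, "max")
by_mean = aggregate_static(demand, 2, "mean")
```

Aggregating by maximum never under-provisions within a window, while aggregating by mean yields a lower planned capacity and can fall below the actual demand in some slots.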
