Source author record

Gang He

Gang He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Computer Vision, cond-mat.stat-mech, eess.IV, Multimedia, Artificial Intelligence, Computation and Language, cond-mat.dis-nn, nlin.AO, physics.soc-ph

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Allegory of the Cave: Measurement-Grounded Vision-Language Learning

Vision-language models typically reason over post-ISP RGB images, although RGB rendering can clip, suppress, or quantize sensor evidence before inference. We study whether grounding improves when the visual interface is moved closer to the underlying camera measurement. We formulate measurement-grounded vision-language learning and instantiate it as PRISM-VL, which combines RAW-derived Meas.-XYZ inputs, camera-conditioned grounding, and Exposure-Bracketed Supervision Aggregation for transferring supervision from RGB proxies to measurement-domain observations. Using a quality-controlled 150K instruction-tuning set and a held-out benchmark targeting low-light, HDR, visibility-sensitive, and hallucination-sensitive cases, PRISM-VL-8B reaches 0.6120 BLEU, 0.4571 ROUGE-L, and 82.66% LLM-Judge accuracy, improving over the RGB Qwen3-VL-8B baseline by +0.1074 BLEU, +0.1071 ROUGE-L, and +4.46 percentage points. These results suggest that part of VLM grounding error arises from information lost during RGB rendering, and that preserving measurement-domain evidence can improve multimodal reasoning.
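The baseline scores are not stated in the abstract, but they follow from the reported numbers if the gains are read as absolute differences against the RGB Qwen3-VL-8B baseline (an assumption; the subscript RGB is notation introduced here):

```latex
\begin{align*}
\mathrm{BLEU}_{\mathrm{RGB}}     &= 0.6120 - 0.1074 = 0.5046 \\
\mathrm{ROUGE\text{-}L}_{\mathrm{RGB}} &= 0.4571 - 0.1071 = 0.3500 \\
\mathrm{Acc}_{\mathrm{RGB}}      &= 82.66\% - 4.46\ \text{pp} = 78.20\%
\end{align*}
```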

preprint, 2026, arXiv

Beyond Feature Mapping GAP: Integrating Real HDRTV Priors for Superior SDRTV-to-HDRTV Conversion

The rise of HDR-WCG display devices has highlighted the need to convert SDRTV to HDRTV, as most video sources are still in SDR. Existing methods primarily focus on designing neural networks to learn a single-style mapping from SDRTV to HDRTV. However, the limited information in SDRTV and the diversity of styles in real-world conversions render this process an ill-posed problem, thereby constraining the performance and generalization of these methods. Inspired by generative approaches, we propose a novel method for SDRTV to HDRTV conversion guided by real HDRTV priors. Despite the limited information in SDRTV, introducing real HDRTV as reference priors significantly constrains the solution space of the originally high-dimensional ill-posed problem. This shift transforms the task from solving an unreferenced prediction problem to making a referenced selection, thereby markedly enhancing the accuracy and reliability of the conversion process. Specifically, our approach comprises two stages: the first stage employs a Vector Quantized Generative Adversarial Network to capture HDRTV priors, while the second stage matches these priors to the input SDRTV content to recover realistic HDRTV outputs. We evaluate our method on public datasets, demonstrating its effectiveness with significant improvements in both objective and subjective metrics across real and synthetic datasets.

preprint, 2022, arXiv

Hard-sample Guided Hybrid Contrast Learning for Unsupervised Person Re-Identification

Unsupervised person re-identification (Re-ID) is a promising and very challenging research problem in computer vision. Learning robust and discriminative features with unlabeled data is of central importance to Re-ID. Recently, more attention has been paid to unsupervised Re-ID algorithms based on clustered pseudo-labels. However, previous approaches did not fully exploit the information in hard samples, simply using cluster centroids or all instances for contrastive learning. In this paper, we propose a Hard-sample Guided Hybrid Contrast Learning (HHCL) approach combining cluster-level loss with instance-level loss for unsupervised person Re-ID. Our approach applies a cluster centroid contrastive loss to ensure that the network is updated in a more stable way. Meanwhile, the introduction of a hard instance contrastive loss further mines the discriminative information. Extensive experiments on two popular large-scale Re-ID benchmarks demonstrate that our HHCL outperforms previous state-of-the-art methods and significantly improves the performance of unsupervised person Re-ID. The code of our work will be available soon at https://github.com/bupt-ai-cz/HHCL-ReID.

preprint, 2022, arXiv

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR) observations, which might suffer from under- or over-exposed regions and different sources of noise. The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints. In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations). In Track 2, participants are asked to minimize the complexity of their solutions while imposing a constraint on fidelity scores (i.e., solutions are required to obtain a higher fidelity score than the prescribed baseline). Both tracks use the same data and metrics: fidelity is measured by means of PSNR with respect to a ground-truth HDR image (computed both directly and with a canonical tonemapping operation), while complexity metrics include the number of Multiply-Accumulate (MAC) operations and runtime (in seconds).
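The two fidelity scores mentioned above can be illustrated with a minimal sketch. The abstract does not name the tonemapping operator, so the mu-law curve with mu = 5000, the assumption that pixel values are already normalized to [0, 1], and all function names below are illustrative choices rather than the challenge's official evaluation code.

```python
import math


def mu_law_tonemap(x, mu=5000.0):
    """Compress a linear HDR value in [0, 1] with a mu-law curve.

    The operator and mu=5000 are assumptions made for this sketch.
    """
    return math.log(1.0 + mu * x) / math.log(1.0 + mu)


def psnr(pred, target, peak=1.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def hdr_fidelity(pred, target, mu=5000.0):
    """Return (PSNR on linear values, PSNR after tonemapping both signals)."""
    linear = psnr(pred, target)
    tonemapped = psnr(
        [mu_law_tonemap(p, mu) for p in pred],
        [mu_law_tonemap(t, mu) for t in target],
    )
    return linear, tonemapped


if __name__ == "__main__":
    linear_db, tonemapped_db = hdr_fidelity([0.1, 0.2], [0.1, 0.25])
    print(f"linear PSNR: {linear_db:.2f} dB, tonemapped PSNR: {tonemapped_db:.2f} dB")
```

Because the tonemapping curve is steep near zero, the same absolute error shifts the tonemapped score by different amounts depending on the brightness of the region where it occurs, which is one reason to report both numbers.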

preprint, 2022, arXiv

SDRTV-to-HDRTV via Hierarchical Dynamic Context Feature Mapping

In this work, we address the task of converting SDR videos to HDR videos (SDRTV-to-HDRTV). Previous approaches use global feature modulation for SDRTV-to-HDRTV. Feature modulation scales and shifts the features in the original feature space, which has limited mapping capability. In addition, the global image mapping cannot restore detail in HDR frames due to the luminance differences in different regions of SDR frames. To resolve these issues, we propose a two-stage solution. The first stage is a Hierarchical Dynamic Context Feature Mapping (HDCFM) model. HDCFM learns the SDR frame to HDR frame mapping function via a hierarchical feature modulation module (HME and HM) and a dynamic context feature transformation (DCT) module. HME estimates the feature modulation vector, while HM performs hierarchical feature modulation, consisting of global feature modulation in series with local feature modulation, and is capable of adaptive mapping of local image features. The DCT module constructs a feature transformation module in conjunction with the context, which is capable of adaptively generating a feature transformation matrix for feature mapping. Compared with simple feature scaling and shifting, the DCT module can map features into a new feature space and thus has a stronger feature mapping capability. In the second stage, we introduce a patch discriminator-based context generation model (PDCG) to obtain subjective quality enhancement of over-exposed regions. PDCG addresses the difficulty of training the model caused by the proportion of overexposed regions in the image. The proposed method can achieve state-of-the-art objective and subjective quality results. Specifically, HDCFM achieves a PSNR gain of 0.81 dB with about 100K parameters. The number of parameters is 1/14th that of the previous state-of-the-art methods. The test code will be released soon.
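The contrast the abstract draws between scale-and-shift modulation and a matrix feature transformation can be shown on a toy feature vector. This is a minimal sketch, not the HDCFM implementation: the function names and the fixed matrix are hypothetical, whereas the paper's DCT module is described as generating its matrix adaptively from context.

```python
def modulate(features, scale, shift):
    """Scale-and-shift modulation: output channel i depends only on input channel i."""
    return [g * f + b for f, g, b in zip(features, scale, shift)]


def transform(features, matrix):
    """Matrix transformation: every output channel may mix all input channels."""
    return [sum(w * f for w, f in zip(row, features)) for row in matrix]


if __name__ == "__main__":
    features = [1.0, 2.0]

    # Each channel is rescaled and offset independently.
    print(modulate(features, scale=[2.0, 0.5], shift=[0.0, 1.0]))

    # A hypothetical matrix that swaps the two channels; per-channel
    # scaling and shifting cannot express this mixing for arbitrary inputs.
    print(transform(features, matrix=[[0.0, 1.0], [1.0, 0.0]]))
```

Scale-and-shift is the special case of a diagonal matrix plus a bias, so the full matrix form is strictly more expressive, which matches the abstract's claim about mapping features into a new feature space.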

preprint, 2021, arXiv

RR-DnCNN v2.0: Enhanced Restoration-Reconstruction Deep Neural Network for Down-Sampling Based Video Coding

Integrating deep learning techniques into the video coding framework yields significant improvement over standard compression techniques, especially when super-resolution (up-sampling) is applied as post-processing in down-sampling based video coding. However, besides up-sampling degradation, the various artifacts introduced by compression make the super-resolution problem more difficult to solve. The straightforward solution is to integrate artifact removal techniques before super-resolution. However, some helpful features may be removed together with the artifacts, degrading the super-resolution performance. To address this problem, we proposed an end-to-end restoration-reconstruction deep neural network (RR-DnCNN) using the degradation-aware technique, which entirely solves degradation from compression and sub-sampling. Besides, we proved that the compression degradation produced by the Random Access configuration is rich enough to cover other degradation types, such as Low Delay P and All Intra, for training. Since the straightforward RR-DnCNN network, with many layers in a chain, has poor learning capability due to the vanishing gradient problem, we redesign the network architecture to let reconstruction leverage the features captured by restoration using up-sampling skip connections. Our novel architecture is called the restoration-reconstruction u-shaped deep neural network (RR-DnCNN v2.0). As a result, our RR-DnCNN v2.0 outperforms previous works and can attain a 17.02% BD-rate reduction on UHD resolution for all-intra, anchored by the standard H.265/HEVC. The source code is available at https://minhmanho.github.io/rrdncnn/.

preprint, 2016, arXiv

Critical noise of majority-vote model on complex networks

The majority-vote model with noise is one of the simplest nonequilibrium statistical models and has been extensively studied in the context of complex networks. However, an understanding of the relationship between the critical noise, where the order-disorder phase transition takes place, and the topology of the underlying networks is still lacking. In this paper, we use the heterogeneous mean-field theory to derive the rate equation governing the model's dynamics, which can analytically determine the critical noise $f_c$ in the limit of infinite network size $N\rightarrow \infty$. The result shows that $f_c$ depends on the ratio of ${\left\langle k \right\rangle }$ to ${\left\langle k^{3/2} \right\rangle }$, where ${\left\langle k \right\rangle }$ and ${\left\langle k^{3/2} \right\rangle }$ are the average degree and the $3/2$ order moment of the degree distribution, respectively. Furthermore, we consider the finite size effect, where stochastic fluctuations should be involved. To this end, we derive the Langevin equation and obtain the potential of the corresponding Fokker-Planck equation. This allows us to calculate the effective critical noise $f_c(N)$ at which the susceptibility is maximal in finite size networks. We find that the difference $f_c-f_c(N)$ decays with $N$ in a power-law way and vanishes for $N\rightarrow \infty$. All the theoretical results are confirmed by extensive Monte Carlo simulations in random $k$-regular networks, Erdős-Rényi random networks and scale-free networks.
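The two degree moments named above are straightforward to compute from a degree sequence. The sketch below only evaluates $\left\langle k \right\rangle$, $\left\langle k^{3/2} \right\rangle$ and their ratio; the abstract does not give the functional form linking this ratio to $f_c$, so none is assumed here. The function names and the example degree sequences are illustrative.

```python
def degree_moment(degrees, order):
    """Return the moment <k^order> of a non-empty degree sequence."""
    if len(degrees) == 0:
        raise ValueError("degree sequence must be non-empty")
    return sum(k ** order for k in degrees) / len(degrees)


def moment_ratio(degrees):
    """Return <k> / <k^(3/2)>, the ratio the abstract says f_c depends on."""
    return degree_moment(degrees, 1.0) / degree_moment(degrees, 1.5)


if __name__ == "__main__":
    regular = [4, 4, 4, 4]          # every node has degree 4
    heterogeneous = [1, 1, 1, 9]    # one hub, three leaves
    print(moment_ratio(regular))        # equals 4 ** -0.5 for a 4-regular network
    print(moment_ratio(heterogeneous))
```

For a $k$-regular network the ratio reduces to $k^{-1/2}$, so it depends on the degree alone; for heterogeneous networks the hubs inflate $\left\langle k^{3/2} \right\rangle$ more than $\left\langle k \right\rangle$.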

preprint, 2014, arXiv

Complex activated transition in a system of two coupled bistable oscillators

We study the fluctuation-activated transition process in a system of two coupled bistable oscillators, in which each oscillator is driven by one constant force and an independent Gaussian white noise. The transition pathway has been identified and the transition rate has been computed as the coupling strength $\mu$ and the mismatch $\sigma$ in the force constants are varied. For identical oscillators ($\sigma=0$), as $\mu$ is increased the transition changes from a two-step process with two candidate pathways, to a one-step process with two candidate pathways, and finally to a one-step process with a single pathway. For nonidentical oscillators ($\sigma\neq0$), a novel transition emerges that is a mixture of a two-step pathway and a one-step pathway. Interestingly, we find that the total transition rate depends nonmonotonically on $\mu$: a maximal rate appears at an intermediate magnitude of $\mu$. Moreover, in the presence of weak coupling the rate also exhibits an unexpected maximum as a function of $\sigma$. The results are in excellent agreement with our numerical simulations by forward flux sampling.

preprint, 2013, arXiv

How does degree heterogeneity affect nucleation of Ising model on complex networks?

We investigate the nucleation of the Ising model on complex networks and focus on the role played by the heterogeneity of the degree distribution in the nucleation rate. Using Monte Carlo simulation combined with forward flux sampling, we find that for a weak external field the nucleation rate decreases monotonically as degree heterogeneity increases. Interestingly, for a relatively strong external field the nucleation rate exhibits a nonmonotonic dependence on degree heterogeneity, with a maximal nucleation rate at an intermediate level of degree heterogeneity. Furthermore, we develop a heterogeneous mean-field theory for evaluating the free-energy barrier of nucleation. The theoretical estimations are qualitatively consistent with the simulation results. Our study suggests that degree heterogeneity plays a nontrivial role in the dynamics of phase transition in networked Ising systems.
