Source author record

Hongbo Zhao

Hongbo Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, math.QA, Artificial Intelligence, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.str-el, Machine Learning

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators

preprint · 2026 · arXiv

Practical Continual Forgetting for Pre-trained Vision Models

Owing to privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming increasingly evident. In real-world scenarios, erasure requests can originate at any time from both users and model owners, and these requests usually form a sequence. Therefore, under such a setting, selected information is expected to be continuously removed from a pre-trained model while the rest is maintained. We define this problem as continual forgetting and identify three key challenges. (i) For unwanted knowledge, efficient and effective deletion is crucial. (ii) For remaining knowledge, the impact of the forgetting procedure should be minimal. (iii) In real-world scenarios, the training samples may be scarce or partially missing during the process of forgetting. To address them, we first propose Group Sparse LoRA (GS-LoRA). Specifically, towards (i), we introduce Low-Rank Adaptation (LoRA) modules to fine-tune the Feed-Forward Network (FFN) layers in Transformer blocks for each forgetting task independently, and towards (ii), a simple group sparse regularization is adopted, enabling automatic selection of specific LoRA groups and zeroing out the others. To further extend GS-LoRA to more practical scenarios, we incorporate prototype information as additional supervision and introduce a more practical approach, GS-LoRA++. For each forgotten class, we move the logits away from its original prototype. For the remaining classes, we pull the logits closer to their respective prototypes. We conduct extensive experiments on face recognition, object detection, and image classification and demonstrate that our method manages to forget specific classes with minimal impact on other classes. Code has been released at https://github.com/bjzhb666/GS-LoRA.
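The group sparse regularization mentioned in the abstract can be illustrated with a standard group lasso penalty. The following is a minimal sketch, not the released GS-LoRA code: it assumes each LoRA group is flattened into a plain list of weights, and the names `group_sparse_penalty`, `group_soft_threshold`, `alpha` and `threshold` are hypothetical. The soft-threshold step is a generic proximal operator added here only to show how whole groups can end up exactly zero; the abstract does not state how the authors optimize the penalty.

```python
import math

def group_sparse_penalty(groups, alpha=1.0):
    """Group lasso penalty: alpha times the sum of per-group L2 norms.

    `groups` is a list of weight lists, one per (hypothetical) LoRA group.
    """
    return alpha * sum(math.sqrt(sum(w * w for w in g)) for g in groups)

def group_soft_threshold(group, threshold):
    """Proximal step for the group penalty.

    A group whose L2 norm is at most `threshold` is zeroed as a whole;
    otherwise every weight in the group is shrunk by the same factor.
    """
    norm = math.sqrt(sum(w * w for w in group))
    if norm <= threshold:
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [w * scale for w in group]

# Toy data (illustrative only): one group with large weights, one with tiny weights.
lora_groups = [[3.0, 4.0], [0.03, 0.04]]
penalty = group_sparse_penalty(lora_groups)
shrunk = [group_soft_threshold(g, 0.1) for g in lora_groups]
```

In this toy case the second group falls below the threshold and is removed entirely, while the first is only slightly shrunk, which is the selection behaviour the abstract attributes to group sparsity.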

preprint · 2026 · arXiv

The Midas Touch for Metric Depth

Recent advances have markedly improved the cross-scene generalization of relative depth estimation, yet its practical applicability remains limited by the absence of metric scale, local inconsistencies, and low computational efficiency. To address these issues, we present Midas Touch for Depth (MTD), a mathematically interpretable approach that converts relative depth into metric depth using only extremely sparse 3D data. To eliminate local scale inconsistencies, it applies a segment-wise recovery strategy via sparse graph optimization, followed by a pixel-wise refinement strategy using a discontinuity-aware geodesic cost. MTD exhibits strong generalization and achieves substantial accuracy improvements over previous depth completion and depth estimation methods. Moreover, its lightweight, plug-and-play design facilitates deployment and integration into diverse downstream 3D tasks. The project page is available at https://mias.group/MTD.
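For orientation, the simplest way to attach metric scale to relative depth is a single global least-squares fit of a scale and shift from a few sparse metric samples. The sketch below shows only that baseline, under the assumption that relative and metric values are related affinely; it is not MTD, whose segment-wise graph optimization and geodesic refinement are not described in enough detail here to reproduce. The function name `fit_scale_shift` and the sample values are hypothetical.

```python
def fit_scale_shift(relative, metric):
    """Least-squares fit of metric ~ scale * relative + shift.

    `relative` holds relative-depth values at the sparse sample locations,
    `metric` holds the known metric depths at the same locations.
    """
    n = len(relative)
    if n < 2 or n != len(metric):
        raise ValueError("need at least two paired samples")
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0:
        raise ValueError("relative depths must not all be equal")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift

# Three sparse samples generated from metric = 4 * relative + 1 (illustrative data).
rel_samples = [0.1, 0.5, 0.9]
met_samples = [1.4, 3.0, 4.6]
scale, shift = fit_scale_shift(rel_samples, met_samples)

# Apply the fitted transform to any other relative-depth value.
metric_at_03 = scale * 0.3 + shift
```

A single global fit like this cannot correct the local scale inconsistencies the abstract refers to, which is the motivation for fitting per segment instead.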

preprint · 2025 · arXiv

MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark

Continual learning enables AI systems to acquire new knowledge while retaining previously learned information. While traditional unimodal methods have made progress, the rise of Multimodal Large Language Models (MLLMs) brings new challenges in Multimodal Continual Learning (MCL), where models are expected to address both catastrophic forgetting and cross-modal coordination. To advance research in this area, we present MCITlib, a comprehensive library for Multimodal Continual Instruction Tuning. MCITlib currently implements 8 representative algorithms and conducts evaluations on 3 benchmarks under 2 backbone models. The library will be continuously updated to support future developments in MCL. The codebase is released at https://github.com/Ghy0501/MCITlib.

preprint · 2022 · arXiv

PGGANet: Pose Guided Graph Attention Network for Person Re-identification

Person re-identification (reID) aims at retrieving a person from images captured by different cameras. For deep-learning-based reID methods, it has been shown that using local features together with global features helps to give a robust representation for person retrieval. Human pose information can provide the locations of the human skeleton, effectively guiding the network to pay more attention to these key areas, and can also help to reduce noise from background or occlusion. However, methods proposed by previous pose-based works might not fully exploit the benefits of pose information, and few of them take into consideration the different contributions of separate local features. In this paper, we propose a pose guided graph attention network, a multi-branch architecture consisting of one branch for global features, one branch for mid-granular body features and one branch for fine-granular key point features. We use a pre-trained pose estimator to generate the key-point heatmaps for local feature learning and carefully design a graph attention convolution layer to re-assign the contribution weights of extracted local features by modeling their similarity relations. Experimental results demonstrate the effectiveness of our approach on discriminative feature learning, and we show that our model achieves state-of-the-art performance on several mainstream evaluation datasets. We also conduct numerous ablation studies and design different kinds of comparison experiments for our network to demonstrate its effectiveness and robustness, including occlusion experiments and cross-domain tests.

preprint · 2021 · arXiv

Correlative image learning of chemo-mechanics in phase-transforming solids

Constitutive laws underlie most physical processes in nature. However, learning such equations in heterogeneous solids (e.g., due to phase separation) is challenging. One such relationship is between composition and eigenstrain, which governs the chemo-mechanical expansion in solids. In this work, we developed a generalizable, physically constrained image-learning framework to algorithmically learn the chemo-mechanical constitutive law at the nanoscale from correlative four-dimensional scanning transmission electron microscopy and X-ray spectro-ptychography images. We demonstrated this approach on Li$_X$FePO$_4$, a technologically relevant battery positive electrode material. We uncovered the functional form of the composition-eigenstrain relation in this two-phase binary solid across the entire composition range (0 $\leq$ X $\leq$ 1), including inside the thermodynamically unstable miscibility gap. The learned relation directly validates Vegard's law of linear response at the nanoscale. Our physics-constrained data-driven approach directly visualizes the residual strain field (by removing the compositional and coherency strain), which is otherwise impossible to quantify. Heterogeneities in the residual strain arise from misfit dislocations and were independently verified by X-ray diffraction line profile analysis. Our work provides the means to simultaneously quantify chemical expansion, coherency strain and dislocations in battery electrodes, which has implications for rate capabilities and lifetime. Broadly, this work also highlights the potential of integrating correlative microscopy and image learning for extracting material properties and physics.
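As a reading aid, Vegard's law in this setting amounts to a linear interpolation between the end-member compositions. The expression below is a sketch in assumed notation (the symbol $\varepsilon^{0}$ for a given eigenstrain component is not taken from the abstract), with $\varepsilon^{0}(0)$ and $\varepsilon^{0}(1)$ denoting the values at $X = 0$ and $X = 1$:

```latex
\varepsilon^{0}(X) = (1 - X)\,\varepsilon^{0}(0) + X\,\varepsilon^{0}(1),
\qquad 0 \leq X \leq 1
```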

preprint · 2020 · arXiv

Lie Conformal Algebra and Dual Pair Type Realizations of Some Moonshine Type VOAs, and Calculations of the Correlation Functions

In this paper we use Lie conformal algebras to realize some moonshine type VOAs whose Griess algebras are Jordan algebras. We also consider some free fields which realize the corresponding simple VOAs. As an application, we can calculate the correlation functions of these VOAs in a relatively easy way.

preprint · 2016 · arXiv

On Correlation Functions of Vertex Operator Algebras Associated to Jordan Algebras

In this paper we study certain vertex operator algebras associated to Jordan algebras and compute the correlation functions of basic fields.

preprint · 2016 · arXiv

Simplicities of VOAs Associated to Jordan Algebras of Type $B$ and Character Formulas for Simple Quotients

In this paper we study the VOA $V_{\mathcal{J},r}$ constructed by Ashihara and Miyamoto. We construct simple quotients of $V_{\mathcal{J},r}$, $r\in\mathbb{Z}_{\neq 0}$, explicitly using dual-pair type constructions. We also compute the character formula of the simple quotients when $r=-2n$, $n\geq 1$.

preprint · 2012 · arXiv

Triplet excitations in carbon nanostructures

We show that the energy differences between the lowest optical singlet exciton and the lowest triplet exciton in semiconducting single-walled carbon nanotubes with diameter $\sim 1$ nm and graphene nanoribbons with widths $\sim 2$ nm are an order of magnitude smaller than in the $\pi$-conjugated polymer poly(para-phenylenevinylene). Our calculated energy gaps between the singlet and triplet excitons are in excellent agreement with the measured values in three different nanotubes with diameters close to 1 nm. The spatial extent of the triplet exciton is nearly the same as that of the singlet exciton in wide nanotubes and nanoribbons, in contrast to that in $\pi$-conjugated polymers, in which the triplet exciton exhibits strong spatial confinement. Weakly confined behavior of the triplet state begins in nanoribbons with widths as narrow as 2.5 times the graphene unit lattice vector. We discuss possible consequences of the small singlet-triplet energy difference in the carbon nanostructures on device applications.
