Researcher profile

Bo Huang

Bo Huang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
15 works
0 followers
15 topics
4 close collaborators


Published work

15 published item(s)

preprint, 2026, arXiv

From Holo Pockets to Electron Density: GPT-style Drug Design with Density

Recent advances in generative modeling have enabled significant progress in structure-based drug design (SBDD). Existing methods typically condition molecule generation on empty binding pockets from holo complexes, overlooking informative components such as the filler (ligands and solvent). Here, we leverage low-resolution electron density (ED) derived from the filler as a physically grounded condition for de novo drug design. We consider two types of ED, calculated and cryo-EM/X-ray, obtainable from computational or experimental sources, supporting unified pre-training and experimental integration. Compared with rigid pocket representations, experimental ED naturally captures conformational flexibility and provides a more faithful description of the binding environment. Based on this, we introduce EDMolGPT, a decoder-only autoregressive framework that generates molecules from low-resolution ED point clouds. By grounding generation in physically meaningful density signals, EDMolGPT mitigates structural bias and produces molecules with 3D conformations. Evaluations on 101 biological targets verify its effectiveness. Our project page: https://jiahaochen1.github.io/EDMolGPT_Page/.

preprint, 2026, arXiv

Parallel Dynamic Spatial Indexes

Maintaining spatial data (points in two or three dimensions) is crucial and has a wide range of applications, such as graphics, GIS, and robotics. To handle spatial data, many data structures, called spatial indexes, have been proposed, e.g. kd-trees, oct/quadtrees (also called Orth-trees), R-trees, and bounding volume hierarchies (BVHs). In real-world applications, spatial datasets tend to be highly dynamic, requiring batch updates of points with low latency. This calls for efficient parallel batch updates on spatial indexes. Unfortunately, there is very little work that achieves this. In this paper, we systematically study parallel spatial indexes, with a special focus on achieving high update performance for highly dynamic workloads. We select two types of spatial indexes that are considered optimized for low-latency updates: Orth-tree and R-tree/BVH. We propose two data structures: the P-Orth tree, a parallel Orth-tree, and the SPaC-tree family, a parallel R-tree/BVH. Both the P-Orth tree and the SPaC-tree deliver superior performance in batch updates compared to existing parallel kd-trees and Orth-trees, while preserving better or competitive query performance relative to their corresponding Orth-tree and R-tree counterparts. We also present comprehensive experiments comparing the performance of various parallel spatial indexes and share our findings at the end of the paper.

preprint, 2025, arXiv

TrimTokenator-LC: Towards Adaptive Visual Token Pruning for Large Multimodal Models with Long Contexts

Large Multimodal Models (LMMs) have proven effective on various tasks. They typically encode visual inputs into sequences of tokens, which are then concatenated with textual tokens and jointly processed by the language model. However, the growing number of visual tokens greatly increases inference cost. Visual token pruning has emerged as a promising solution, but existing methods often overlook scenarios involving long-context inputs with multiple images. In this paper, we analyze the challenges of visual token pruning in long-context, multi-image settings and introduce an adaptive pruning method tailored for such scenarios. We decompose redundancy into intra-image and inter-image components and quantify them through intra-image diversity and inter-image variation, which jointly guide dynamic budget allocation. Our approach consists of two stages. The intra-image stage allocates each image a content-aware token budget and greedily selects its most representative tokens. The inter-image stage performs global diversity filtering to form a candidate pool and then applies a Pareto selection procedure that balances diversity with text alignment. Extensive experiments show that our approach can remove up to 80% of visual tokens while maintaining performance in long-context settings.

preprint, 2022, arXiv

Large Intrinsic Valley Polarization and High Curie Temperature in Stable Two-dimensional Ferrovalley YX$_2$(X=I,Br and Cl)

Ferrovalley materials with spontaneous valley polarization are crucial to valleytronic applications. Based on first-principles calculations, we demonstrate that two-dimensional (2D) YX$_2$ (X = I, Br, and Cl) in the 2H structure constitute a series of promising ferrovalley semiconductors with large spontaneous valley polarization and high Curie temperature. Our calculations reveal that YX$_2$ are dynamically and thermally stable 2D ferromagnetic semiconductors with a Curie temperature above 200 K. Due to the natural noncentrosymmetric structure, intrinsic ferromagnetic order, and strong spin-orbit coupling, large spontaneous valley polarizations of 108.98, 57.70, and 22.35 meV are also predicted in single-layer YX$_2$ (X = I, Br, and Cl), respectively. The anomalous valley Hall effect is also proposed based on the valley-contrasting Berry curvature. Moreover, the ferromagnetism and valley polarization are found to be effectively tuned by applying a biaxial strain. Interestingly, the suppressed valley physics of YBr$_2$ and YCl$_2$ can be switched on by applying a moderate compressive strain. The present findings promise YX$_2$ as competitive candidates for further experimental studies and practical applications in valleytronics.

preprint, 2021, arXiv

Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data

Corrupted labels and class imbalance are commonly encountered in practically collected training data and easily lead to over-fitting of deep neural networks (DNNs). Existing approaches alleviate these issues with a sample re-weighting strategy, in which samples are re-weighted by a designed weighting function. However, this strategy is applicable only to training data containing a single type of data bias. In practice, biased samples with corrupted labels and samples from tail classes commonly co-exist in training data. How to handle them simultaneously is a key but under-explored problem. In this paper, we find that these two types of biased samples, though they have similar transient loss, have distinguishable trends and characteristics in their loss curves, which could provide valuable priors for sample weight assignment. Motivated by this, we delve into the loss curves and propose a novel probe-and-allocate training strategy: in the probing stage, we train the network on the whole biased training data without intervention and record the loss curve of each sample as an additional attribute; in the allocating stage, we feed the resulting attribute to a newly designed curve-perception network, named CurveNet, to learn to identify the bias type of each sample and adaptively assign proper weights through meta-learning. The slow training speed of meta-learning also hinders its application. To address this, we propose a method named skip layer meta optimization (SLMO) that accelerates training by skipping the bottom layers. Extensive synthetic and real experiments validate the proposed method, which achieves state-of-the-art performance on multiple challenging benchmarks.

preprint, 2021, arXiv

Reimagining City Configuration: Automated Urban Planning via Adversarial Learning

Urban planning refers to the effort of designing land-use configurations. Effective urban planning can help to mitigate the operational and social vulnerability of an urban system, such as high tax, crimes, traffic congestion and accidents, pollution, depression, and anxiety. Due to the high complexity of urban systems, such tasks are mostly completed by professional planners, but human planners need a long time. The recent advance of deep learning motivates us to ask: can machines learn, at a human level of capability, to automatically and quickly calculate land-use configurations, so that human planners can finally adjust machine-generated plans for specific needs? To this end, we formulate the automated urban planning problem as a task of learning to configure land uses, given the surrounding spatial contexts. To set up the task, we define a land-use configuration as a longitude-latitude-channel tensor, where each channel is a category of POIs and the value of an entry is the number of POIs. The objective is then to propose an adversarial learning framework that can automatically generate such a tensor for an unplanned area. In particular, we first characterize the contexts of the areas surrounding an unplanned area by learning representations from spatial graphs using geographic and human mobility data. Second, we combine each unplanned area and its surrounding context representation as a tuple, and categorize all the tuples into positive samples (well-planned areas) and negative samples (poorly-planned areas). Third, we develop an adversarial land-use configuration approach, where the surrounding context representation is fed into a generator to generate a land-use configuration, and a discriminator learns to distinguish between positive and negative samples.
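The longitude-latitude-channel tensor defined in this abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the function name `land_use_tensor`, the uniform grid, the bounding box, and the category names are hypothetical and are not taken from the paper.

```python
import numpy as np

def land_use_tensor(pois, bounds, grid_shape, categories):
    """Count POIs per grid cell and category.

    pois: iterable of (lon, lat, category) tuples (hypothetical input format).
    bounds: (lon_min, lon_max, lat_min, lat_max) of the area.
    grid_shape: (n_lon, n_lat) number of cells along each axis.
    categories: list of POI category names; one channel per category.
    """
    lon_min, lon_max, lat_min, lat_max = bounds
    n_lon, n_lat = grid_shape
    channel = {c: k for k, c in enumerate(categories)}
    tensor = np.zeros((n_lon, n_lat, len(categories)), dtype=np.int64)
    for lon, lat, cat in pois:
        inside = lon_min <= lon <= lon_max and lat_min <= lat <= lat_max
        if cat not in channel or not inside:
            continue  # skip unknown categories and POIs outside the area
        # Map coordinates to cell indices; clamp points on the upper edge.
        i = min(int((lon - lon_min) / (lon_max - lon_min) * n_lon), n_lon - 1)
        j = min(int((lat - lat_min) / (lat_max - lat_min) * n_lat), n_lat - 1)
        tensor[i, j, channel[cat]] += 1
    return tensor

example = land_use_tensor(
    pois=[(0.1, 0.1, "school"), (0.15, 0.2, "school"),
          (0.9, 0.9, "shop"), (2.0, 0.5, "shop")],
    bounds=(0.0, 1.0, 0.0, 1.0),
    grid_shape=(2, 2),
    categories=["school", "shop"],
)
```

Each entry `example[i, j, k]` holds the number of POIs of category `k` in cell `(i, j)`, which matches the definition in the abstract; how the paper discretizes space is not stated here.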

preprint, 2020, arXiv

An Inter- and Intra-Band Loss for Pansharpening Convolutional Neural Networks

Pansharpening aims to fuse panchromatic and multispectral satellite images to generate images with both high spatial and high spectral resolution. With the successful application of deep learning in computer vision, many convolutional neural networks (CNNs) have been proposed for the pansharpening task. These pansharpening networks focus on various distinctive CNN structures, and most of them are trained with an L2 loss between fused images and simulated desired multispectral images. However, the L2 loss is designed to directly minimize the difference in spectral information of each band and does not consider inter-band relations during training. In this letter, we propose a novel inter- and intra-band (IIB) loss to overcome this drawback of the original L2 loss. Our proposed IIB loss can effectively preserve both inter- and intra-band relations and can be directly applied to different pansharpening CNNs.

preprint, 2020, arXiv

BEC immersed in a Fermi sea: Theory of static and dynamic behavior across phase separation

We theoretically study the static and dynamic behavior of a BEC immersed in a large Fermi sea of ultracold atoms under conditions of tunable interspecies interaction. The degenerate Bose-Fermi mixture is kept in an elongated trap, typical for a single-beam optical dipole trap. We focus on the case of repulsive Bose-Fermi interaction and develop mean-field models to simulate the system over a wide range of repulsion strength. We further get analytical solutions in the regimes of phase separation and weak interaction. We obtain static density profiles and the frequency of the radial breathing mode, which is an elementary dynamic phenomenon of the mixture. Our results unveil the structure of the Bose-Fermi interface and describe the origin of the frequency shift of the breathing mode when the components become phase-separated at strong repulsion. We show that the mediated interaction between bosons induced by the Fermi sea can be understood as an adiabatic second-order mean-field effect, which is valid also beyond the weak-interaction regime for relevant experimental conditions. These results are consistent with our recent observations in a mixture of $^{41}$K and $^6$Li.

preprint, 2020, arXiv

Inverse design of multilayer nanoparticles using artificial neural networks and genetic algorithm

The light scattering of multilayer nanoparticles can be solved using Maxwell's equations. However, the inverse design of multilayer nanoparticles is difficult to solve with the traditional trial-and-error method. Here, we present a method for forward simulation and inverse design of multilayer nanoparticles. We combine the global search ability of a genetic algorithm with the local search ability of a neural network. First, the genetic algorithm is used to find a suitable solution, and then the neural network is used to fine-tune it. Due to the non-unique relationship between physical structures and optical responses, we first train a forward neural network and then apply it to the inverse design of multilayer nanoparticles. Beyond this case, the method can easily be extended to predict and find the best design parameters for other optical structures.

preprint, 2020, arXiv

Multiform Fonts-to-Fonts Translation via Style and Content Disentangled Representations of Chinese Character

This paper treats the generation of personalized fonts as an image style transfer problem. Its main purpose is to design a network framework that can extract and recombine the content and style of characters. This approach can be used to synthesize an entire font set from only a small number of characters. The paper combines various deep networks, such as Convolutional Neural Network, Multi-layer Perceptron, and Residual Network, to find the optimal model for extracting the features of font characters. The results show that the generated characters are very close to real characters under the Structural Similarity index and Peak Signal-to-Noise Ratio evaluation criteria.
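For readers unfamiliar with the Peak Signal-to-Noise Ratio mentioned above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the 8-bit peak value of 255, and the toy images are assumptions for illustration; the abstract does not say how the authors computed the metric.

```python
import math
import numpy as np

def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    # Convert to float first so that uint8 subtraction cannot wrap around.
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

real_glyph = np.zeros((4, 4), dtype=np.uint8)   # toy stand-in for a real character image
fake_glyph = np.ones((4, 4), dtype=np.uint8)    # toy stand-in for a generated character image
score = psnr(real_glyph, fake_glyph)
```

Higher values mean the generated image is closer to the reference. The Structural Similarity index is more involved and is not sketched here.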

preprint, 2020, arXiv

Side-On transition radiation detector: a detector prototype for TeV energy scale calibration of calorimeters in space

Transition Radiation (TR) plays an important role in particle identification in high-energy physics and its characteristics provide a feasible method of energy calibration in the energy range up to 10 TeV, which is of interest for dark matter searches in cosmic rays. In a Transition Radiation Detector (TRD), the TR signal is superimposed onto the ionization energy loss signal induced by incident charged particles. In order to make the TR signal stand out from the background of ionization energy loss in a significant way, we optimized both the radiators and the detector. We have designed a new prototype of regular radiator optimized for a maximal TR photon yield, combined with the Side-On TRD which is supposed to improve the detection efficiency of TR. We started a test beam experiment with the Side-On TRD at Conseil Européen pour la Recherche Nucléaire (CERN), and found that the experimental data is consistent with the simulation results.

preprint, 2020, arXiv

u-net CNN based fourier ptychography

Fourier ptychography is a recently explored imaging method for overcoming the diffraction limit of conventional cameras, with applications in microscopy, that yields high-resolution images. To splice together low-resolution images taken under different illumination angles of a coherent light source, an iterative phase retrieval algorithm is adopted. However, the reconstruction procedure is slow, requires substantial overlap in the Fourier domain between consecutively recorded low-resolution images, and degrades under system aberrations such as noise or a random update sequence. In this paper, we propose a new retrieval algorithm based on convolutional neural networks. Once well trained, our model can perform high-quality reconstruction rapidly on a graphics processing unit. The experiments demonstrate that our model achieves better reconstruction results and is more robust under system aberrations.

preprint, 2020, arXiv

UBER-GNN: A User-Based Embeddings Recommendation based on Graph Neural Networks

Session-based recommendation aims to predict a user's next actions based on session histories. Previous methods model session histories as sequences and estimate user latent features with RNN and GNN methods to make recommendations. However, under massive-scale and complicated financial recommendation scenarios with both virtual and real commodities, such methods are not sufficient to represent accurate user latent features and neglect the long-term characteristics of users. To take long-term preferences and dynamic interests into account, we propose a novel method, User-Based Embeddings Recommendation with Graph Neural Network, or UBER-GNN for brevity. UBER-GNN takes advantage of structured data to generate long-term user preferences, and transforms session sequences into graphs to generate graph-based dynamic interests. The final user latent feature is then represented as the composition of the long-term preferences and the dynamic interests using an attention mechanism. Extensive experiments conducted on a real Ping An scenario show that UBER-GNN outperforms the state-of-the-art session-based recommendation methods.

preprint, 2019, arXiv

Alternative Analysis Methods for Time to Event Endpoints under Non-proportional Hazards: A Comparative Analysis

The log-rank test is most powerful under proportional hazards (PH). In practice, non-PH patterns are often observed in clinical trials, such as in immuno-oncology; therefore, alternative methods are needed to restore the efficiency of statistical testing. Three categories of testing methods were evaluated, including weighted log-rank tests, Kaplan-Meier curve-based tests (including weighted Kaplan-Meier and Restricted Mean Survival Time, RMST), and combination tests (including Breslow test, Lee's combo test, and MaxCombo test). Nine scenarios representing the PH and various non-PH patterns were simulated. The power, type I error, and effect estimates of each method were compared. In general, all tests control type I error well. There is not a single most powerful test across all scenarios. In the absence of prior knowledge regarding the PH or non-PH patterns, the MaxCombo test is relatively robust across patterns. Since the treatment effect changes over time under non-PH, the overall profile of the treatment effect may not be represented comprehensively by a single measure. Thus, multiple measures of the treatment effect should be pre-specified as sensitivity analyses to evaluate the totality of the data.
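As background for the Restricted Mean Survival Time named above, here is a minimal sketch that computes RMST as the area under a Kaplan-Meier curve up to a truncation time. The function names, the truncation time `tau`, and the toy data are hypothetical, and this is a textbook-style illustration, not the simulation code used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] steps at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= removed
    return steps

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)

rmst_all_events = rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
rmst_with_censoring = rmst([1, 2, 3], [1, 0, 1], tau=3)
```

Comparing RMST between two arms gives a treatment-effect summary that does not rely on the proportional hazards assumption, which is why it appears among the Kaplan-Meier curve-based methods in the abstract.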