Source author record

Chuang Wang

Chuang Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

cond-mat.dis-nn, Computer Vision, Information Theory, math.IT, cond-mat.mtrl-sci, cond-mat.stat-mech, Machine Learning, physics.comp-ph, cond-mat.mes-hall, math.NA, math.OC, Numerical Analysis

Catalog footprint

What is connected

17 works

12 topics

4 close collaborators

preprint, 2026, arXiv

RxnCaption: Reformulating Reaction Diagram Parsing as Visual Prompt Guided Captioning

Large-scale chemical reaction datasets are crucial for AI research in chemistry. However, existing chemical reaction data often exist as images within papers, making them not machine-readable and unusable for training machine learning models. In response to this challenge, we propose the RxnCaption framework for the task of chemical Reaction Diagram Parsing (RxnDP). Our framework reformulates the traditional coordinate prediction driven parsing process into an image captioning problem, which Large Vision Language Models (LVLMs) handle naturally. We introduce a strategy termed BBox and Index as Visual Prompt (BIVP), which uses our state-of-the-art molecular detector, MolYOLO, to pre-draw molecular bounding boxes and indices directly onto the input image. This turns the downstream parsing into a natural-language description problem. Extensive experiments show that the BIVP strategy significantly improves structural extraction quality while simplifying model design. We further construct the RxnCaption-15k dataset, an order of magnitude larger than prior real-world literature benchmarks, with a balanced test subset across four layout archetypes. Experiments demonstrate that RxnCaption-VL achieves state-of-the-art performance on multiple metrics. We believe our method, dataset, and models will advance structured information extraction from chemical literature and catalyze broader AI applications in chemistry. We will release data, models, and code on GitHub.

preprint, 2026, arXiv

VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation

Scalable Vector Graphics (SVG) animation generation is pivotal for professional design due to its structural editability and resolution independence. However, this task remains challenging as it requires bridging discrete code representations with continuous visual dynamics. Existing optimization-based methods often destroy topological consistency, while general-purpose LLMs rely on rigid CSS/SMIL transformations, failing to model geometry-level non-rigid deformations. To address these limitations, we present VAnim, the first LLM-based framework for open-domain text-to-SVG animation. We reconceptualize animation not as sequence generation, but as Sparse State Updates (SSU) on a persistent SVG DOM tree. This paradigm compresses sequence length by over 9.8x while preserving the SVG DOM structure and non-participating elements by construction. To enable precise control, we propose an Identification-First Motion Planning mechanism that grounds textual instructions in explicit visual entities. Furthermore, to overcome the non-differentiable nature of SVG rendering, we employ Rendering-Aware Reinforcement Learning via Group Relative Policy Optimization (GRPO). By leveraging a hybrid reward from a state-of-the-art video perception encoder, we align discrete code updates with high-fidelity visual feedback. We also introduce SVGAnim-134k, the first benchmark for vector animation. Extensive experiments demonstrate that VAnim significantly outperforms state-of-the-art baselines in semantic alignment and structural validity, with additional appendix metrics further validating motion quality and identity preservation.

preprint, 2024, arXiv

Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training

Exemplar-based sketch-to-photo synthesis allows users to generate photo-realistic images based on sketches. Recently, diffusion-based methods have achieved impressive performance on image generation tasks, enabling highly flexible control through text-driven generation or energy functions. However, generating photo-realistic images with color and texture from sketch images remains challenging for diffusion models. Sketches typically consist of only a few strokes, with most regions left blank, making it difficult for diffusion-based methods to produce photo-realistic images. In this work, we propose a two-stage method named "Inversion-by-Inversion" for exemplar-based sketch-to-photo synthesis. This approach includes shape-enhancing inversion and full-control inversion. During the shape-enhancing inversion process, an uncolored photo is generated with the guidance of a shape-energy function. This step is essential to ensure control over the shape of the generated photo. In the full-control inversion process, we propose an appearance-energy function to control the color and texture of the final generated photo. Importantly, our Inversion-by-Inversion pipeline is training-free and can accept different types of exemplars for color and texture control. We conducted extensive experiments to evaluate our proposed method, and the results demonstrate its effectiveness. The code and project can be found at https://ximinng.github.io/inversion-by-inversion-project/.

preprint, 2020, arXiv

FISHING Net: Future Inference of Semantic Heatmaps In Grids

For autonomous robots to navigate a complex environment, it is crucial to understand the surrounding scene both geometrically and semantically. Modern autonomous robots employ multiple sets of sensors, including lidars, radars, and cameras. Managing the different reference frames and characteristics of the sensors, and merging their observations into a single representation, complicates perception. Choosing a single unified representation for all sensors simplifies the task of perception and fusion. In this work, we present an end-to-end pipeline that performs semantic segmentation and short-term prediction using a top-down representation. Our approach consists of an ensemble of neural networks which take in sensor data from different sensor modalities and transform them into a single common top-down semantic grid representation. We find this representation favorable as it is agnostic to sensor-specific reference frames and captures both the semantic and geometric information for the surrounding scene. Because the modalities share a single output representation, they can be easily aggregated to produce a fused output. In this work we predict short-term semantic grids, but the framework can be extended to other tasks. This approach offers a simple, extensible, end-to-end approach for multi-modal perception and prediction.

preprint, 2020, arXiv

Lossless Attention in Convolutional Networks for Facial Expression Recognition in the Wild

Unlike the constrained frontal-face condition, faces in the wild are subject to various unconstrained interference factors, such as complex illumination, changing perspective and various occlusions. Facial expression recognition (FER) in the wild is a challenging task on which existing methods do not perform well. However, for occluded faces (containing occlusion caused by other objects and self-occlusion caused by head posture changes), the attention mechanism has the ability to focus on the non-occluded regions automatically. In this paper, we propose a Lossless Attention Model (LLAM) for convolutional neural networks (CNN) to extract attention-aware features from faces. Our module avoids information decay in the process of generating attention maps by using the information of the previous layer and not reducing the dimensionality. Subsequently, we adaptively refine the feature responses by fusing the attention map with the feature map. We participate in the seven basic expression classification sub-challenge of the FG-2020 Affective Behavior Analysis in-the-wild Challenge, and we validate our method on the Aff-Wild2 dataset released by the Challenge. The total accuracy (Accuracy) and the unweighted mean (F1) of our method on the validation set are 0.49 and 0.38 respectively, and the final result is 0.42 (0.67 F1-Score + 0.33 Accuracy).
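The weighting quoted at the end of this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `challenge_score` is hypothetical, and it assumes the reported 0.42 was obtained by applying the stated weights (0.67 for F1, 0.33 for accuracy) to the rounded validation-set figures, which the abstract does not state explicitly.

```python
def challenge_score(f1: float, accuracy: float,
                    w_f1: float = 0.67, w_acc: float = 0.33) -> float:
    """Weighted combination quoted in the abstract: 0.67 * F1 + 0.33 * Accuracy."""
    return w_f1 * f1 + w_acc * accuracy


# Validation-set figures quoted in the abstract (rounded values).
score = challenge_score(f1=0.38, accuracy=0.49)
print(round(score, 2))  # 0.2546 + 0.1617 = 0.4163, which rounds to 0.42
```

Under that assumption the combined score matches the figure given in the abstract to two decimal places.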

preprint, 2019, arXiv

Electronic, magnetic, and optical properties of Mn-doped GaSb: a first-principles study

Half-metallic ferromagnets can produce fully spin-polarized conduction electrons and can be applied to fabricate spintronic devices. Thus, in this study, the electronic structure, magnetic properties, and optical properties of GaSb, which has exhibited half-metallicity, doped with Mn, a 3d transition metal, are calculated using the generalized gradient approximation and Heyd-Scuseria-Ernzerhof (HSE) functional. Ga$_{1-x}$Mn$_x$Sb ($x = 0.25, 0.5, 0.75$) materials exhibit ferromagnetic half-metallic properties and a high Curie temperature, indicating that this series can be applied in spintronic devices. Meanwhile, they absorb strongly in the infrared band, suggesting that Ga$_{1-x}$Mn$_{x}$Sb also has potential applications in infrared photoelectric devices.

preprint, 2019, arXiv

Study of Constrained Network Structures for WGANs on Numeric Data Generation

Some recent studies have suggested using GANs for numeric data generation, for example to generate data that completes imbalanced numeric datasets. Considering the significant difference between the dimensions of numeric data and images, as well as the strong correlations between features of numeric data, conventional GANs normally face an overfitting problem, which consequently leads to an ill-conditioning problem in generating numeric and structured data. This paper studies constrained network structures between the generator G and the discriminator D in WGAN, and designs several structures including isomorphic, mirror and self-symmetric structures. We evaluate the performance of the constrained WGANs in data augmentation, taking the non-constrained GANs and WGANs as the baselines. The twenty groups of experiments cover four UCI Machine Learning Repository datasets (Australian Credit Approval, German Credit, Pima Indians Diabetes and SPECT heart data) and five conventional classifiers. The constrained structures improve results in 17/20 groups of experiments; in particular, Isomorphic WGAN is the best in 15/20 experiments. Finally, we theoretically prove the effectiveness of the constrained structures by directed graphical model (DGM) analysis.

preprint, 2019, arXiv

Theoretical study of structure and magnetism of Ga$_{1-x}$V$_x$Sb compounds for spintronic applications

In this paper, the structural, electronic and magnetic properties of zinc-blende Ga$_{1-x}$V$_x$Sb compounds, with $x$ ranging from the dilute doping regime to the extreme doping limit, were systematically investigated by first-principles calculations. V atoms prefer to substitute the Ga atoms, and the formation energy is lower in Sb-rich than in Ga-rich growth conditions. Meanwhile, the Sb$_{\mathrm{Ga}}$ antisite defects can effectively decrease the energy barrier of the substitution process, from 0.85 eV to 0.53 eV. The diffusion of V atoms in the GaSb lattice proceeds through meta-stable interstitial sites with an energy barrier of 0.6 eV. At a low V concentration, $x = 0.0625$, V atoms prefer a homogeneous distribution and antiferromagnetic coupling among them. However, starting from $x = 0.5$, the magnetic coupling among V atoms becomes ferromagnetic, due to the enhanced superexchange interaction between $e_g$ and $t_{2g}$ states of neighbouring V atoms. At the extreme limit of $x = 1.00$, we found that zinc-blende VSb as well as its analogs VAs and VP are intrinsic ferromagnetic semiconductors, with a large change of light absorption at the Curie temperature. These results indicate that Ga$_{1-x}$V$_x$Sb compounds can provide a platform to design new electronic, spintronic and optoelectronic devices.

preprint, 2016, arXiv

On one-step replica symmetry breaking in the Edwards-Anderson spin glass model

We consider a one-step replica symmetry breaking description of the Edwards-Anderson spin glass model in 2D. The ingredients of this description are a Kikuchi approximation to the free energy and a second-level statistical model built on the extremal points of the Kikuchi approximation, which are also fixed points of a Generalized Belief Propagation (GBP) scheme. We show that a generalized free energy can be constructed where these extremal points are exponentially weighted by their Kikuchi free energy and a Parisi parameter $y$, and that the Kikuchi approximation of this generalized free energy leads to second-level, one-step replica symmetry breaking (1RSB), GBP equations. We then proceed analogously to the Bethe approximation case for tree-like graphs, where it has been shown that 1RSB Belief Propagation equations admit a Survey Propagation solution. We discuss when and how the 1RSB GBP equations that we obtain also allow a simpler class of solutions which can be interpreted as a class of Generalized Survey Propagation equations for the single instance graph case.

preprint, 2016, arXiv

Online Learning for Sparse PCA in High Dimensions: Exact Dynamics and Phase Transitions

We study the dynamics of an online algorithm for learning a sparse leading eigenvector from samples generated from a spiked covariance model. This algorithm combines the classical Oja's method for online PCA with an element-wise nonlinearity at each iteration to promote sparsity. In the high-dimensional limit, the joint empirical measure of the underlying sparse eigenvector and its estimate provided by the algorithm is shown to converge weakly to a deterministic, measure-valued process. This scaling limit is characterized as the unique solution of a nonlinear PDE, and it provides exact information regarding the asymptotic performance of the algorithm. For example, performance metrics such as the cosine similarity and the misclassification rate in sparse support recovery can be obtained by examining the limiting dynamics. A steady-state analysis of the nonlinear PDE also reveals an interesting phase transition phenomenon. Although our analysis is asymptotic in nature, numerical simulations show that the theoretical predictions are accurate for moderate signal dimensions.
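The kind of update this abstract describes, an Oja step followed by an element-wise nonlinearity, can be sketched in plain Python. Everything specific here is an assumption made for illustration and is not taken from the paper: soft-thresholding as the nonlinearity, renormalization after every step, the step size, the threshold `beta`, and the helper names `sparse_oja`, `soft_threshold` and `spiked_samples`.

```python
import math
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t (an assumed sparsity-promoting nonlinearity)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def sparse_oja(samples, n, step=0.002, beta=0.1, seed=0):
    """Online estimate of a sparse leading eigenvector from streaming samples."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in x))
    x = [v / norm for v in x]
    for y in samples:
        proj = sum(yi * xi for yi, xi in zip(y, x))          # y . x
        x = [xi + step * proj * yi for xi, yi in zip(x, y)]  # Oja step
        x = [soft_threshold(xi, step * beta) for xi in x]    # element-wise shrinkage
        norm = math.sqrt(sum(v * v for v in x)) or 1.0       # guard against a zero vector
        x = [v / norm for v in x]
    return x


def spiked_samples(xi, snr, count, seed=1):
    """Draw samples y = sqrt(snr) * c * xi + noise (spiked covariance model)."""
    rng = random.Random(seed)
    for _ in range(count):
        c = rng.gauss(0.0, 1.0)
        yield [math.sqrt(snr) * c * v + rng.gauss(0.0, 1.0) for v in xi]


n, k = 50, 5
xi = [1 / math.sqrt(k)] * k + [0.0] * (n - k)   # unit-norm sparse spike
est = sparse_oja(spiked_samples(xi, snr=10.0, count=4000), n)
cosine = abs(sum(a * b for a, b in zip(xi, est)))
print(f"cosine similarity: {cosine:.3f}")
```

The cosine similarity printed at the end is one of the performance metrics the abstract mentions; the sign of the estimate is arbitrary, hence the absolute value.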

preprint, 2015, arXiv

Randomized Kaczmarz Algorithm for Inconsistent Linear Systems: An Exact MSE Analysis

We provide a complete characterization of the randomized Kaczmarz algorithm (RKA) for inconsistent linear systems. The Kaczmarz algorithm, known in some fields as the algebraic reconstruction technique, is a classical method for solving large-scale overdetermined linear systems through a sequence of projection operators; the randomized Kaczmarz algorithm is a recent proposal by Strohmer and Vershynin to randomize the sequence of projections in order to guarantee exponential convergence (in mean square) to the solutions. A flurry of work followed this development, with renewed interest in the algorithm, its extensions, and various bounds on their performance. Earlier, we studied the special case of consistent linear systems and provided an exact formula for the mean squared error (MSE) in the value reconstructed by RKA, as well as a simple way to compute the exact decay rate of the error. In this work, we consider the case of inconsistent linear systems, which is a more relevant scenario for most applications. First, by using a "lifting trick", we derive an exact formula for the MSE given a fixed noise vector added to the measurements. Then we show how to average over the noise when it is drawn from a distribution with known first and second-order statistics. Finally, we demonstrate the accuracy of our exact MSE formulas through numerical simulations, which also illustrate that previous upper bounds in the literature may be several orders of magnitude too high.
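For readers unfamiliar with the algorithm analysed here, the following is a minimal sketch of the standard randomized Kaczmarz iteration: pick a row with probability proportional to its squared norm, then project the current estimate onto that row's hyperplane. The function name, the tiny example system, and the iteration count are illustrative assumptions; the paper's exact MSE formulas are not reproduced.

```python
import random


def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Approximately solve A x = b by random row projections."""
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    row_norms_sq = [sum(a * a for a in row) for row in A]
    for _ in range(iters):
        # Sample row i with probability proportional to ||a_i||^2.
        i = rng.choices(range(len(A)), weights=row_norms_sq)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        scale = residual / row_norms_sq[i]
        # Project x onto the hyperplane {z : a_i . z = b_i}.
        x = [xj + scale * a for xj, a in zip(x, row)]
    return x


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_consistent = randomized_kaczmarz(A, [1.0, 2.0, 3.0])  # exact solution is (1, 2)
x_noisy = randomized_kaczmarz(A, [1.0, 2.0, 3.5])       # inconsistent right-hand side
print(x_consistent)
print(x_noisy)
```

With the consistent right-hand side the iterate converges to the exact solution. With the inconsistent one it does not settle, because each step satisfies only the sampled equation; this is the regime whose mean squared error the abstract says is characterized exactly.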

preprint, 2014, arXiv

Cavity Method: Message Passing from a Physics Perspective

In this three-section lecture, the cavity method is introduced as a heuristic framework from a physics perspective to solve probabilistic graphical models, and it is presented at both the replica symmetric (RS) and 1-step replica symmetry breaking (1RSB) levels. This technique has been applied with success to a wide range of models and problems such as spin glasses, random constraint satisfaction problems (rCSP), error correcting codes, etc. Firstly, the RS cavity solution for the Sherrington-Kirkpatrick model (a fully connected spin glass model) is derived and its equivalence to the RS solution obtained using replicas is discussed. Then, the general cavity method for diluted graphs is illustrated at both the RS and 1RSB levels. The latter was a significant breakthrough in the last decade and has direct applications to rCSP. Finally, as an example of an actual problem, K-SAT is investigated using belief and survey propagation.

preprint, 2014, arXiv

Topological invariant tensor renormalization group method for spin glasses

The tensor renormalization group (TRG) method is a real-space renormalization group approach. It has been successfully applied to both classical and quantum systems. In this paper, we study a disordered and frustrated system, the two-dimensional Edwards-Anderson model, by a new topological invariant TRG scheme. We propose an approach to calculate the local magnetizations and nearest pair correlations simultaneously. The Nishimori multi-critical point predicted by the topological invariant TRG agrees well with the recent Monte-Carlo results. The TRG schemes outperform the mean field methods on the calculation of the partition function. We notice that the method may yield a negative partition function at sufficiently low temperatures. However, the negative contribution can be neglected if the system is large enough. This topological invariant TRG can also be used to study three-dimensional spin glass systems.

preprint, 2013, arXiv

Simplifying Generalized Belief Propagation on Redundant Region Graphs

The cluster variation method has been developed into a general theoretical framework for treating short-range correlations in many-body systems after it was first proposed by Kikuchi in 1951. On the numerical side, a message-passing approach called generalized belief propagation (GBP) was proposed by Yedidia, Freeman and Weiss about a decade ago as a way of computing the minimal value of the cluster variational free energy and the marginal distributions of clusters of variables. However, the GBP equations are often redundant, and it is quite a non-trivial task to make the GBP iteration converge to a fixed point. These drawbacks hinder the application of the GBP approach to finite-dimensional frustrated and disordered systems. In this work we report an alternative and simple derivation of the GBP equations starting from the partition function expression. Based on this derivation we propose a natural and systematic way of removing the redundancy of the GBP equations. We apply the simplified generalized belief propagation (SGBP) equations to the two-dimensional and the three-dimensional ferromagnetic Ising model and Edwards-Anderson spin glass model. The numerical results confirm that the SGBP message-passing approach is able to achieve satisfactory performance on these model systems. We also suggest that a subset of the SGBP equations can be neglected in the numerical iteration process without affecting the final results.

preprint, 2012, arXiv

Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results

Graphical models for finite-dimensional spin glasses and real-world combinatorial optimization and satisfaction problems usually have an abundant number of short loops. The cluster variation method and its extension, the region graph method, are theoretical approaches for treating the complicated short-loop-induced local correlations. For graphical models represented by non-redundant or redundant region graphs, approximate free energy landscapes are constructed in this paper through the mathematical framework of region graph partition function expansion. Several free energy functionals are obtained, each of which uses a set of probability distribution functions or functionals as order parameters. These probability distribution functions or functionals are required to satisfy the region graph belief-propagation equation or the region graph survey-propagation equation to ensure vanishing correction contributions of region subgraphs with dangling edges. As a simple application of the general theory, we perform region graph belief-propagation simulations on the square-lattice ferromagnetic Ising model and the Edwards-Anderson model. Considerable improvements over the conventional Bethe-Peierls approximation are achieved. Collective domains of different sizes in the disordered and frustrated square lattice are identified by the message-passing procedure. Such collective domains and the frustrations among them are responsible for the low-temperature glass-like dynamical behaviors of the system.

preprint, 2011, arXiv

Partition Function Expansion on Region-Graphs and Message-Passing Equations

Disordered and frustrated graphical systems are ubiquitous in physics, biology, and information science. For models on complete graphs or random graphs, deep understanding has been achieved through the mean-field replica and cavity methods. But finite-dimensional 'real' systems remain very challenging because of the abundance of short loops and strong local correlations. A statistical mechanics theory is constructed in this paper for finite-dimensional models based on the mathematical framework of partition function expansion and the concept of region-graphs. Rigorous expressions for the free energy and grand free energy are derived. Message-passing equations on the region-graph, such as belief-propagation and survey-propagation, are also derived rigorously.

preprint, 2010, arXiv

Ground-state configuration space heterogeneity of random finite-connectivity spin glasses and random constraint satisfaction problems

We demonstrate through two case studies, one on the p-spin interaction model and the other on the random K-satisfiability problem, that a heterogeneity transition occurs in the ground-state configuration space of a random finite-connectivity spin glass system at a certain critical value of the constraint density. At the transition point, exponentially many configuration communities emerge from the ground-state configuration space, making the entropy density s(q) of configuration-pairs a non-concave function of configuration-pair overlap q. Each configuration community is a collection of relatively similar configurations and it forms a stable thermodynamic phase in the presence of a suitable external field. We calculate s(q) by the replica-symmetric and the first-step replica-symmetry-broken cavity methods, and show by simulations that the configuration space heterogeneity leads to dynamical heterogeneity of particle diffusion processes because of the entropic trapping effect of configuration communities. This work clarifies the fine structure of the ground-state configuration space of random spin glass models; it also sheds light on the glassy behavior of hard-sphere colloidal systems at relatively high particle volume fraction.
