Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
30 works
0 followers
23 topics
4 close collaborators

Published work

30 published item(s)

preprint 2026 arXiv

BenchCAD: A Comprehensive, Industry-Standard Benchmark for Programmatic CAD

Industrial Computer-Aided Design (CAD) code generation requires models to produce executable parametric programs from visual or textual inputs. Beyond recognizing the outer shape of a part, this task involves understanding its 3D structure, inferring engineering parameters, and choosing CAD operations that reflect how the part would be designed and manufactured. Despite the promise of Multimodal large language models (MLLMs) for this task, they are rarely evaluated on whether these capabilities jointly hold in realistic industrial CAD settings. We present BenchCAD, a unified benchmark for industrial CAD reasoning. BenchCAD contains 17,900 execution-verified CadQuery programs across 106 industrial part families, including bevel gears, compression springs, twist drills, and other reusable engineering designs. It evaluates models through visual question answering, code question answering, image-to-code generation, and instruction-guided code editing, enabling fine-grained analysis across perception, parametric abstraction, and executable program synthesis. Across 10+ frontier models, BenchCAD shows that current systems often recover coarse outer geometry but fail to produce faithful parametric CAD programs. Common failures include missing fine 3D structure, misinterpreting industrial design parameters, and replacing essential operations such as sweeps, lofts, and twist-extrudes with simpler sketch-and-extrude patterns. Fine-tuning and reinforcement learning improve in-distribution performance, but generalization to unseen part families remains limited. These results position BenchCAD as a benchmark for measuring and improving the industrial readiness of multimodal CAD automation.

preprint 2026 arXiv

HetScene: Heterogeneity-Aware Diffusion for Dense Indoor Scene Generation

Generating controllable and physically plausible indoor scenes is a pivotal prerequisite for constructing high-fidelity simulation environments for embodied AI. However, existing deep learning-based methods usually treat all objects as homogeneous instances within a unified generation process. While effective for sparse and simplistic layouts, they struggle to model realistic layouts with dense object arrangements and complex spatial dependencies, leading to limited scalability and degraded physical plausibility. To deal with these challenges, we revisit indoor layout generation from the perspective of structural heterogeneity and decompose the objects into primary objects and secondary objects according to their distinct roles in shaping a scene. Based on this decomposition, we propose HetScene, a heterogeneous two-stage generation framework that decouples indoor layout synthesis into Structural Layout Generation (SLG) and Contextual Layout Generation (CLG). SLG first generates globally coherent structural layouts with only primary objects conditioned on text descriptions, top-down binary room masks, and spatial relation graphs, establishing a stable global macro-skeleton of large core furniture.

preprint 2026 arXiv

SciFig: Towards Automating Scientific Figure Generation

Creating high-quality figures and visualizations for scientific papers is a time-consuming task that requires both deep domain knowledge and professional design skills. Despite over 2.5 million scientific papers published annually, the figure generation process remains largely manual. We introduce SciFig, an end-to-end AI agent system that generates publication-ready pipeline figures directly from research paper texts. SciFig uses a hierarchical layout generation strategy, which parses research descriptions to identify component relationships, groups related elements into functional modules, and generates inter-module connections to establish visual organization. Furthermore, an iterative chain-of-thought (CoT) feedback mechanism progressively improves layouts through multiple rounds of visual analysis and reasoning. We introduce a rubric-based evaluation framework that analyzes 2,219 real scientific figures to extract evaluation rubrics and automatically generates comprehensive evaluation criteria. SciFig demonstrates remarkable performance: achieving 70.1% overall quality on dataset-level evaluation and 66.2% on paper-specific evaluation, and consistently high scores across metrics such as visual clarity, structural organization, and scientific accuracy. SciFig figure generation pipeline and our evaluation benchmark will be open-sourced.

preprint 2025 arXiv

Distributed Information Bottleneck Theory for Multi-Modal Task-Aware Semantic Communication

Semantic communication shifts the focus from bit-level accuracy to task-relevant semantic delivery, enabling efficient and intelligent communication for next-generation networks. However, existing multi-modal solutions often process all available data modalities indiscriminately, ignoring that their contributions to downstream tasks are often unequal. This not only leads to severe resource inefficiency but also degrades task inference performance due to irrelevant or redundant information. To tackle this issue, we propose a novel task-aware distributed information bottleneck (TADIB) framework, which quantifies the contribution of any set of modalities to given tasks. Based on this theoretical framework, we design a practical coding scheme that intelligently selects and compresses only the most task-relevant modalities at the transmitter. To find the optimal selection and the codecs in the network, we adopt the probabilistic relaxation of discrete selection, enabling distributed encoders to make coordinated decisions with score function estimation and common randomness. Extensive experiments on public datasets demonstrate that our solution matches or surpasses the inference quality of full-modal baselines while significantly reducing communication and computational costs.
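
The abstract above does not spell out the TADIB objective. As a reference point only, the two standard ingredients it builds on can be written as follows: the classical information bottleneck Lagrangian, and the score-function identity commonly used to differentiate through a discrete selection. The symbols X (source), Y (task variable), Z (compressed representation), the trade-off weight β, the selection variable s with distribution p_θ, and the loss L are assumptions introduced here for illustration, not notation from the paper.

```latex
% Classical information bottleneck (a textbook form, not the TADIB objective itself)
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0

% Score-function gradient for a discrete selection s \sim p_\theta(s)
\nabla_\theta \, \mathbb{E}_{s \sim p_\theta}\!\left[ L(s) \right]
  = \mathbb{E}_{s \sim p_\theta}\!\left[ L(s) \, \nabla_\theta \log p_\theta(s) \right]
```

The second identity is what makes a probabilistic relaxation of a discrete choice trainable by sampling, which appears to be the role "score function estimation" plays in the abstract.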

preprint 2024 arXiv

Order by projection in single-band Hubbard model: a DMRG study

In a Fermi system near or at half-filling, a specific superconducting pairing channel, if not explicitly included in the Hamiltonian, can be boosted by suppressing a competing pairing channel; this is exemplified by the enhancement of extended $s$-wave correlations upon suppressing $s$-wave Cooper pairing. This phenomenon, originally found by the use of generalized uncertainty relations, is referred to as order by projection. The case of zero on-site Coulomb interaction in the thermodynamic limit confirms this mechanism through the analytical solution. In this study, we go further and systematically investigate this mechanism for a strongly correlated fermionic Hubbard model, now with finite on-site interaction, on a square lattice with an extended set of hopping parameters. We explore the behaviors of different pairing channels when one of them is suppressed, utilizing density matrix renormalization group calculations. Our findings provide numerical evidence supporting the existence of order by projection in the strongly correlated system we studied. We also investigate the effect of the strength of Hubbard $U$, next-nearest neighbor $t'$, hole-doping, as well as finite-size scaling approaching the thermodynamic limit.

preprint 2022 arXiv

A fugacity-based Lattice Boltzmann method for multicomponent multiphase systems

The free energy model can extend the Lattice Boltzmann method to multiphase systems. However, there is a lack of models capable of simulating multicomponent multiphase fluids with partial miscibility. In addition, existing models cannot be generalized to honor thermodynamic information provided by any multicomponent equation of state of choice. In this paper, we introduce a free energy Lattice Boltzmann model where the forcing term is determined by the fugacity of the species, the thermodynamic property that connects species partial pressure to chemical potential calculations. By doing so, we are able to carry out multicomponent multiphase simulations of partially miscible fluids and generalize the methodology for use with any multicomponent equation of state of interest. We test this fugacity-based Lattice Boltzmann method for the cases of vapor-liquid equilibrium for two and three-component mixtures in various temperature and pressure conditions. We demonstrate that the model is able to reliably reproduce phase densities and compositions as predicted by multicomponent thermodynamics and can reproduce different characteristic pressure-composition and temperature-composition envelopes with a high degree of accuracy. We also demonstrate that the model can offer accurate predictions under dynamic conditions.

preprint 2022 arXiv

A quantum-inspired tensor network method for constrained combinatorial optimization problems

Combinatorial optimization is of general interest for both theoretical study and real-world applications. Fast-developing quantum algorithms provide a different perspective on solving combinatorial optimization problems. In this paper, we propose a quantum-inspired tensor-network-based algorithm for general locally constrained combinatorial optimization problems. Our algorithm constructs a Hamiltonian for the problem of interest, effectively mapping it to a quantum problem, then encodes the constraints directly into a tensor network state and solves the optimal solution by evolving the system to the ground state of the Hamiltonian. We demonstrate our algorithm with the open-pit mining problem, which results in a quadratic asymptotic time complexity. Our numerical results show the effectiveness of this construction and potential applications in further studies for general combinatorial optimization problems.

preprint 2022 arXiv

Disordered vector models: from higher spins to incipient strings

We present a one-parameter family of large $N$ disordered models, with and without supersymmetry, in three spacetime dimensions. They interpolate from the critical large $N$ vector model dual to a classical higher spin theory, towards a theory with a classical string dual. We analyze the spectrum and OPE data of the theories. While the supersymmetric model is always well-behaved, the non-supersymmetric model is unitary only over a small parameter range. We offer some speculations on the origin of strings from the higher spins.

preprint 2022 arXiv

Emergence of Crystalline Few-body Correlations in Mass-imbalanced Fermi Polarons

Polarons can serve as an ideal platform to identify few-body correlations in tackling complex many-body problems. In this work, we reveal various crystalline few-body correlations smoothly emergent from the mass-imbalanced Fermi polarons in two dimensions. A unified variational approach up to three particle-hole excitations allows us to extract the dominant dimer, trimer or tetramer correlation in a single framework. When the fermion-impurity mass ratio is beyond a certain critical value, the Fermi polaron is found to undergo a smooth crossover, instead of a sharp transition, from the polaronic to trimer and tetramer regimes with increasing fermion-impurity attraction. The emergent trimer and tetramer correlations result in the momentum-space crystallization of particle-hole excitations featuring a stable diagonal or triangular structure, as can be directly probed through the density-density correlation of majority fermions. Our results shed light on the intriguing quantum phases in the mass-imbalanced Fermi-Fermi mixtures beyond the pairing superfluid paradigm.

preprint 2022 arXiv

Giant enhancement of third-harmonic generation in graphene-metal heterostructures

Nonlinear nanophotonics leverages engineered nanostructures to funnel light into small volumes and intensify nonlinear optical processes with spectral and spatial control. Due to its intrinsically large and electrically tunable nonlinear optical response, graphene is an especially promising nanomaterial for nonlinear optoelectronic applications. Here we report on exceptionally strong optical nonlinearities in graphene-insulator-metal heterostructures, demonstrating an enhancement by three orders of magnitude in the third-harmonic signal compared to bare graphene. Furthermore, by increasing the graphene Fermi energy through an external gate voltage, we find that graphene plasmons mediate the optical nonlinearity and modify the third-harmonic signal. Our findings show that graphene-insulator-metal is a promising heterostructure for optically-controlled and electrically-tunable nano-optoelectronic components.

preprint 2022 arXiv

Half-Wormholes and Ensemble Averages

We study "half-wormhole-like" saddle point contributions to spectral correlators in a variety of ensemble average models, including various statistical models, generalized 0d SYK models, 1d Brownian SYK models and an extension of it. In statistical ensemble models, where more general distributions of the random variables can be studied in great detail, we find that the accuracy of the previously proposed approximation for the half-wormholes can be improved when the distribution of the random variables deviates significantly from Gaussian distributions. We propose a modified approximation scheme for the half-wormhole contributions that also works well in these more general theories. In various generalized 0d SYK models we identify new half-wormhole-like saddle point contributions. In the 0d SYK model and 1d Brownian SYK model, apart from the wormhole and half-wormhole saddles, we find new non-trivial saddles in the spectral correlators that would potentially give contributions of the same order as the trivial self-averaging saddles. However, after a careful Lefschetz-thimble analysis we show that these non-trivial saddles should not be included. We also clarify the difference between "linked half-wormholes" and "unlinked half-wormholes" in some models.

preprint 2022 arXiv

HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D Medical Image Segmentation using HyperNet

Semantic segmentation of 3D medical images is a challenging task due to the high variability of the shape and pattern of objects (such as organs or tumors). Given the recent success of deep learning in medical image segmentation, Neural Architecture Search (NAS) has been introduced to find high-performance 3D segmentation network architectures. However, because of the massive computational requirements of 3D data and the discrete optimization nature of architecture search, previous NAS methods require a long search time or necessary continuous relaxation, and commonly lead to sub-optimal network architectures. While one-shot NAS can potentially address these disadvantages, its application in the segmentation domain has not been well studied in the expansive multi-scale multi-path search space. To enable one-shot NAS for medical image segmentation, our method, named HyperSegNAS, introduces a HyperNet to assist super-net training by incorporating architecture topology information. Such a HyperNet can be removed once the super-net is trained and introduces no overhead during architecture search. We show that HyperSegNAS yields better performing and more intuitive architectures compared to the previous state-of-the-art (SOTA) segmentation networks; furthermore, it can quickly and accurately find good architecture candidates under different computing constraints. Our method is evaluated on public datasets from the Medical Segmentation Decathlon (MSD) challenge, and achieves SOTA performances.

preprint 2022 arXiv

PDRF: Progressively Deblurring Radiance Field for Fast and Robust Scene Reconstruction from Blurry Images

We present Progressively Deblurring Radiance Field (PDRF), a novel approach to efficiently reconstruct high quality radiance fields from blurry images. While current State-of-The-Art (SoTA) scene reconstruction methods achieve photo-realistic rendering results from clean source views, their performance suffers when the source views are affected by blur, which is commonly observed for images in the wild. Previous deblurring methods either do not account for 3D geometry, or are computationally intense. To address these issues, PDRF, a progressively deblurring scheme in radiance field modeling, accurately models blur by incorporating 3D scene context. PDRF further uses an efficient importance sampling scheme, which results in fast scene optimization. Specifically, PDRF proposes a Coarse Ray Renderer to quickly estimate voxel density and feature; a Fine Voxel Renderer is then used to achieve high quality ray tracing. We perform extensive experiments and show that PDRF is 15X faster than previous SoTA while achieving better performance on both synthetic and real scenes.

preprint 2022 arXiv

Potential energy surface and formation of superheavy nuclei with the Skyrme energy-density functional

Within the framework of Skyrme energy-density functional theory, the nucleus-nucleus potential is calculated and the potential energy surface is obtained with different effective forces for accurately estimating the formation cross sections of superheavy nuclei in massive fusion reactions. The width and height of the potential pocket are influenced by the Skyrme effective forces SkM, SkM$^{\ast}$, SkP, SIII, Ska and SLy4, which correspond to different equations of state for isospin-symmetric nuclear matter. It is found that the nucleus-nucleus potential is associated with the collision orientation and Skyrme parameters. The nuclear potential becomes more repulsive as the incompressibility modulus of nuclear matter increases. The available data in the fusion-evaporation reaction of $^{48}$Ca+$^{238}$U are nicely reproduced with the SkM$^{\ast}$ parameter when it is implemented in the dinuclear system model.

preprint 2022 arXiv

REGAS: REspiratory-GAted Synthesis of Views for Multi-Phase CBCT Reconstruction from a single 3D CBCT Acquisition

It is a long-standing challenge to reconstruct Cone Beam Computed Tomography (CBCT) of the lung under respiratory motion. This work takes a step further to address a challenging setting in reconstructing a multi-phase 4D lung image from just a single 3D CBCT acquisition. To this end, we introduce REspiratory-GAted Synthesis of views, or REGAS. REGAS proposes a self-supervised method to synthesize the undersampled tomographic views and mitigate aliasing artifacts in reconstructed images. This method allows a much better estimation of between-phase Deformation Vector Fields (DVFs), which are used to enhance reconstruction quality from direct observations without synthesis. To address the large memory cost of deep neural networks on high resolution 4D data, REGAS introduces a novel Ray Path Transformation (RPT) that allows for distributed, differentiable forward projections. REGAS requires no additional measurements like prior scans, air-flow volume, or breathing velocity. Our extensive experiments show that REGAS significantly outperforms comparable methods in quantitative metrics and visual quality.

preprint 2022 arXiv

Towards performant and reliable undersampled MR reconstruction via diffusion model sampling

Magnetic Resonance (MR) image reconstruction from under-sampled acquisition promises faster scanning time. To this end, current State-of-The-Art (SoTA) approaches leverage deep neural networks and supervised training to learn a recovery model. While these approaches achieve impressive performances, the learned model can be fragile on unseen degradation, e.g. when given a different acceleration factor. These methods are also generally deterministic and provide a single solution to an ill-posed problem; as such, it can be difficult for practitioners to understand the reliability of the reconstruction. We introduce DiffuseRecon, a novel diffusion model-based MR reconstruction method. DiffuseRecon guides the generation process based on the observed signals and a pre-trained diffusion model, and does not require additional training on specific acceleration factors. DiffuseRecon is stochastic in nature and generates results from a distribution of fully-sampled MR images; as such, it allows us to explicitly visualize different potential reconstruction solutions. Lastly, DiffuseRecon proposes an accelerated, coarse-to-fine Monte-Carlo sampling scheme to approximate the most likely reconstruction candidate. The proposed DiffuseRecon achieves SoTA performances reconstructing from raw acquisition signals in fastMRI and SKM-TEA. Code will be open-sourced at www.github.com/cpeng93/DiffuseRecon.

preprint 2022 arXiv

Undersampled MRI Reconstruction with Side Information-Guided Normalisation

Magnetic resonance (MR) images exhibit various contrasts and appearances based on factors such as different acquisition protocols, views, manufacturers, scanning parameters, etc. This generally accessible appearance-related side information affects deep learning-based undersampled magnetic resonance imaging (MRI) reconstruction frameworks, but has been overlooked in the majority of current works. In this paper, we investigate the use of such side information as normalisation parameters in a convolutional neural network (CNN) to improve undersampled MRI reconstruction. Specifically, a Side Information-Guided Normalisation (SIGN) module, containing only a few layers, is proposed to efficiently encode the side information and output the normalisation parameters. We examine the effectiveness of such a module on two popular reconstruction architectures, D5C5 and OUCR. The experimental results on both brain and knee images under various acceleration rates demonstrate that the proposed method improves on its corresponding baseline architectures by a significant margin.

preprint 2022 arXiv

Universal tetramer and pentamer in two-dimensional fermionic mixtures

We study the emergence of universal tetramer and pentamer bound states in the two-dimensional $(N+1)$ system, which consists of $N$ identical heavy fermions interacting with a light atom. We show that the critical heavy-light mass ratio to support a ($3+1$) tetramer below the trimer threshold is $3.38$, and to support a ($4+1$) pentamer below the tetramer threshold is $5.14$. While the ground-state tetramer and pentamer both have zero total angular momentum, they exhibit very different density distributions and correlations in momentum space, due to their distinct angular momentum decompositions in the dimer-fermion frame. These universal bound states are accessible in a number of Fermi-Fermi mixtures now realized in cold-atom laboratories, which also suggests novel few-body correlations dominant in their corresponding many-body systems.
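
As a small worked example of the two thresholds quoted above, the sketch below compares heavy-light mass ratios of a few fermionic isotope pairs against 3.38 and 5.14. The isotope pairs and the helper name `classify` are chosen here for illustration and are not taken from the abstract; the masses are rounded atomic masses in unified atomic mass units. Exceeding a threshold is only the mass-ratio condition stated in the abstract, not a full prediction for a given experiment.

```python
# Rounded atomic masses in unified atomic mass units (u).
# The choice of isotopes is illustrative, not from the abstract.
MASS_U = {"6Li": 6.0151, "40K": 39.9640, "53Cr": 52.9406, "161Dy": 160.9269}

TETRAMER_RATIO = 3.38  # (3+1) tetramer below the trimer threshold (from the abstract)
PENTAMER_RATIO = 5.14  # (4+1) pentamer below the tetramer threshold (from the abstract)


def classify(heavy: str, light: str) -> dict:
    """Compare the heavy-light mass ratio of one pair with both thresholds."""
    ratio = MASS_U[heavy] / MASS_U[light]
    return {
        "ratio": ratio,
        "tetramer": ratio > TETRAMER_RATIO,
        "pentamer": ratio > PENTAMER_RATIO,
    }


if __name__ == "__main__":
    for heavy, light in [("161Dy", "40K"), ("40K", "6Li"), ("53Cr", "6Li")]:
        result = classify(heavy, light)
        print(f"{heavy}-{light}: ratio={result['ratio']:.2f}, "
              f"tetramer={result['tetramer']}, pentamer={result['pentamer']}")
```

With these masses, the 161Dy-40K ratio (about 4.0) lies between the two thresholds, while 40K-6Li (about 6.6) and 53Cr-6Li (about 8.8) lie above both.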

preprint 2022 arXiv

XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors

A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane. Hence, radiograph analysis naturally requires physicians to relate the prior about 3D human anatomy to 2D radiographs. Synthesizing novel radiographic views in a small range can assist physicians in interpreting anatomy more reliably; however, radiograph view synthesis is heavily ill-posed, lacking in paired data, and lacking in differentiable operations to leverage learning-based approaches. To address these problems, we use Computed Tomography (CT) for radiograph simulation and design a differentiable projection algorithm, which enables us to achieve geometrically consistent transformations between the radiography and CT domains. Our method, XraySyn, can synthesize novel views on real radiographs through a combination of realistic simulation and finetuning on real radiographs. To the best of our knowledge, this is the first work on radiograph view synthesis. We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.

preprint 2021 arXiv

Entanglement and Confinement in Coupled Quantum Systems

We study some general properties of coupled quantum systems. We consider simple interactions between two copies of identical Hamiltonians such as the SYK model, Pauli spin chains with random magnetic field and harmonic oscillators. Such couplings make the ground states close to the thermofield double states of the uncoupled Hamiltonians. For the coupled SYK model, we push the numerical computation further towards the thermodynamic limit so that an extrapolation in the size of the system is possible. We find good agreement between the extrapolated numerical result and the analytic result in the large-$q$ limit. We also consider the coupled gauged matrix model and vector model, and argue that the deconfinement is associated with the loss of the entanglement, similarly to the previous observation for the coupled SYK model. The understanding of the microscopic mechanism of the confinement/deconfinement transition enables us to estimate the quantum entanglement precisely, and backs up the dual gravity interpretation which relates the deconfinement to the disappearance of the wormhole. Our results demonstrate the importance of the entanglement between the color degrees of freedom in the emergence of the bulk geometry from quantum field theory via holography.

preprint 2021 arXiv

Gapless spin liquid and pair density wave of the Hubbard model on three-leg triangular cylinders

We study the ground state properties of the Hubbard model on three-leg triangular cylinders using large-scale density-matrix renormalization group simulations. At half-filling, we identify an intermediate gapless spin liquid phase between a metallic phase at weak coupling and a Mott insulating dimer phase at strong interaction, which has one gapless spin mode and algebraic spin-spin correlations but exponentially decaying scalar chiral-chiral correlations. Upon lightly doping the gapless spin liquid, the system exhibits power-law charge-density-wave (CDW) correlations but short-range single-particle, spin-spin, and chiral-chiral correlations. Similar to CDW correlations, the superconducting correlations are also quasi-long-ranged but oscillate in sign as a function of distance, which is consistent with the striped pair-density wave. When further doping the gapless spin liquid phase or doping the dimer order phase, another phase takes over, which has similar CDW correlations but all other correlations decay exponentially.

preprint 2021 arXiv

Production of neutron-rich heavy nuclei around N = 162 in multinucleon transfer reactions

Within the framework of the dinuclear system model, the production mechanism of neutron-rich heavy nuclei around N = 162 has been investigated systematically. The isotopic yields in the multinucleon transfer reaction of $^{238}$U + $^{248}$Cm were analyzed and compared with the available experimental data. Systematics on the production of superheavy nuclei via $^{238}$U on $^{252,254}$Cf, $^{254}$Es and $^{257}$Fm is investigated. It is found that the shell effect is of importance in the formation of neutron-rich nuclei around N = 162 owing to the enhancement of the fission barrier. The fragments in the multinucleon transfer reactions show a broad isotopic distribution and are dependent on the beam energy. The polar angles of the fragments tend to the forward emission with increasing beam energy. The production cross sections of new isotopes are estimated and heavier targets are available for the neutron-rich superheavy nucleus formation. The optimal system and beam energy are proposed for the future experimental measurements.

preprint 2021 arXiv

U-DuDoNet: Unpaired dual-domain network for CT metal artifact reduction

Recently, both supervised and unsupervised deep learning methods have been widely applied on the CT metal artifact reduction (MAR) task. Supervised methods such as Dual Domain Network (Du-DoNet) work well on simulation data; however, their performance on clinical data is limited due to domain gap. Unsupervised methods are more generalized, but do not eliminate artifacts completely through the sole processing on the image domain. To combine the advantages of both MAR methods, we propose an unpaired dual-domain network (U-DuDoNet) trained using unpaired data. Unlike the artifact disentanglement network (ADN) that utilizes multiple encoders and decoders for disentangling content from artifact, our U-DuDoNet directly models the artifact generation process through additions in both sinogram and image domains, which is theoretically justified by an additive property associated with metal artifact. Our design includes a self-learned sinogram prior net, which provides guidance for restoring the information in the sinogram domain, and cyclic constraints for artifact reduction and addition on unpaired data. Extensive experiments on simulation data and clinical images demonstrate that our novel framework outperforms the state-of-the-art unpaired approaches.

preprint 2020 arXiv

Ellipse R-CNN: Learning to Infer Elliptical Object from Clustering and Occlusion

Images of heavily occluded objects in cluttered scenes, such as fruit clusters in trees, are hard to segment. To further retrieve the 3D size and 6D pose of each individual object in such cases, bounding boxes are not reliable from multiple views since only a small portion of the object's geometry is captured. We introduce the first CNN-based ellipse detector, called Ellipse R-CNN, to represent and infer occluded objects as ellipses. We first propose a robust and compact ellipse regression based on the Mask R-CNN architecture for elliptical object detection. Our method can infer the parameters of multiple elliptical objects even when they are occluded by other neighboring objects. For better occlusion handling, we exploit refined feature regions for the regression stage, and integrate the U-Net structure for learning different occlusion patterns to compute the final detection score. The correctness of ellipse regression is validated through experiments performed on synthetic data of clustered ellipses. We further quantitatively and qualitatively demonstrate that our approach outperforms the state-of-the-art model (i.e., Mask R-CNN followed by ellipse fitting) and its three variants on both synthetic and real datasets of occluded and clustered elliptical objects.

preprint 2020 arXiv

Potential Field: Interpretable and Unified Representation for Trajectory Prediction

Predicting an agent's future trajectory is a challenging task given the complicated stimuli (environmental/inertial/social) of motion. Prior works learn individual stimuli from different modules and fuse the representations in an end-to-end manner, which makes it hard to understand what is actually captured and how the representations are fused. In this work, we borrow the notion of potential field from physics as an interpretable and unified representation to model all stimuli. This allows us to not only supervise the intermediate learning process, but also have a coherent method to fuse the information of different sources. From the generated potential fields, we further estimate future motion direction and speed, which are modeled as Gaussian distributions to account for the multi-modal nature of the problem. The final prediction results are generated by recurrently moving the past location based on the estimated motion direction and speed. We show state-of-the-art results on the ETH, UCY, and Stanford Drone datasets.
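
The last step described above, recurrently moving a location using a direction and a speed drawn from Gaussian distributions, can be sketched as a simple rollout. This is a minimal illustration under stated assumptions, not the paper's implementation: the function `rollout` and the callables `direction_at` and `speed_at` are hypothetical stand-ins for whatever the potential-field model would estimate at each location, and each is assumed to return a (mean, standard deviation) pair.

```python
import math
import random
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Gaussian = Tuple[float, float]  # (mean, standard deviation)


def rollout(
    start: Point,
    direction_at: Callable[[Point], Gaussian],  # hypothetical: heading in radians
    speed_at: Callable[[Point], Gaussian],      # hypothetical: distance per step
    steps: int,
    rng: Optional[random.Random] = None,
) -> List[Point]:
    """Move a point forward step by step, sampling heading and speed each time."""
    rng = rng or random.Random(0)
    x, y = start
    path: List[Point] = []
    for _ in range(steps):
        mean_theta, std_theta = direction_at((x, y))
        mean_speed, std_speed = speed_at((x, y))
        theta = rng.gauss(mean_theta, std_theta)
        speed = max(0.0, rng.gauss(mean_speed, std_speed))  # speed cannot be negative
        x += speed * math.cos(theta)
        y += speed * math.sin(theta)
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Toy fields: always head along +x at unit speed, with a little noise.
    samples = [
        rollout((0.0, 0.0), lambda p: (0.0, 0.1), lambda p: (1.0, 0.05), 5,
                rng=random.Random(seed))
        for seed in range(3)
    ]
    for path in samples:
        print([(round(px, 2), round(py, 2)) for px, py in path])
```

Running the rollout several times with different seeds gives several plausible futures from the same starting point, which is one simple way to picture the multi-modal prediction the abstract refers to.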

preprint2020arXiv

SAINT: Spatially Aware Interpolation NeTwork for Medical Slice Synthesis

Deep learning-based single image super-resolution (SISR) methods face various challenges when applied to 3D medical volumetric data (i.e., CT and MR images) due to the high memory cost and anisotropic resolution, which adversely affect their performance. Furthermore, mainstream SISR methods are designed to work over specific upsampling factors, which makes them ineffective in clinical practice. In this paper, we introduce a Spatially Aware Interpolation NeTwork (SAINT) for medical slice synthesis to alleviate the memory constraint that volumetric data poses. Compared to other super-resolution methods, SAINT utilizes voxel spacing information to provide desirable levels of detail, and allows for the upsampling factor to be determined on the fly. Our evaluations based on 853 CT scans from four datasets that contain liver, colon, hepatic vessels, and kidneys show that SAINT consistently outperforms other SISR methods in terms of medical slice synthesis quality, while using only a single model to deal with different upsampling factors.

preprint2019arXiv

Anatomy of Deconfinement

In the weak coupling limit of ${\rm SU}(N)$ Yang-Mills theory and the ${\rm O}(N)$ vector model, explicit state counting allows us to demonstrate the existence of a partially deconfined phase: $M$ of $N$ colors deconfine, and $\frac{M}{N}$ gradually grows from zero (confinement) to one (complete deconfinement). We point out that the mechanism admits a simple interpretation in the form of spontaneous breaking of gauge symmetry. In terms of the dual gravity theory, such breaking occurs during the formation of a black hole. We speculate whether the breaking and restoration of gauge symmetry can serve as an alternative definition of the deconfinement transition in theories without center symmetry, such as QCD. We also discuss the role of the color degrees of freedom in the emergence of the bulk geometry in holographic duality.

preprint2019arXiv

Generative Tensor Network Classification Model for Supervised Machine Learning

Tensor networks (TNs) have recently attracted extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train a generative TN for each class of samples to construct the classifiers. The classification is implemented by comparing the distances in the many-body Hilbert space. Numerical experiments with GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with that of the state-of-the-art convolutional neural network and higher than that of the naive Bayes classifier (a generative classifier) and the support vector machine. Moreover, GTNC is more efficient than the existing TN models, which are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples naturally cluster in such a space; and (b) bounding the bond dimensions of the TNs to finite values corresponds to removing redundant information in the image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.
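The idea of classifying by comparing distances in the many-body Hilbert space can be sketched on a toy scale. This is a simplified illustration under stated assumptions, not the GTNC method itself: the per-pixel feature map, the Euclidean distance, and the helper names (`feature_map`, `product_state`, `classify`) are hypothetical choices, and each class state is stored as a dense vector rather than as a trained generative tensor network. The dense vector grows as 2^n in the number of pixels, so this is only feasible for tiny inputs.

```python
import math


def feature_map(pixel):
    # Assumed feature map: a pixel in [0, 1] becomes a normalized
    # 2-component vector, i.e. the state of one "site".
    angle = math.pi * pixel / 2
    return (math.cos(angle), math.sin(angle))


def product_state(pixels):
    # Kronecker product of the per-pixel vectors: the sample viewed
    # as a product state in the many-body Hilbert space.
    state = [1.0]
    for p in pixels:
        a, b = feature_map(p)
        state = [amp * c for amp in state for c in (a, b)]
    return state


def distance(u, v):
    # Euclidean distance between two state vectors (an assumed metric).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(pixels, class_states):
    # Pick the class whose state is closest to the sample's state.
    phi = product_state(pixels)
    return min(class_states, key=lambda label: distance(phi, class_states[label]))


# Stand-in class states built from prototypes; in a generative approach
# these would instead be learned from each class's training samples.
class_states = {
    "dark": product_state([0.0, 0.0]),
    "bright": product_state([1.0, 1.0]),
}
label = classify([0.1, 0.2], class_states)
```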

preprint2019arXiv

Lecture Notes of Tensor Network Contractions

Tensor network (TN), a young mathematical tool of high vitality and great potential, has undergone extremely rapid development over the last two decades, gaining tremendous success in condensed matter physics, atomic physics, quantum information science, statistical physics, and so on. In these lecture notes, we focus on the contraction algorithms of TN as well as some of the applications to the simulations of quantum many-body systems. Starting from basic concepts and definitions, we first explain the relations between TN and physical problems, including the TN representations of classical partition functions, quantum many-body states (by matrix product state, tree TN, and projected entangled pair state), time evolution simulations, etc. These problems, which are challenging to solve, can be transformed into TN contraction problems. We then present several paradigmatic algorithms based on the ideas of the numerical renormalization group and/or boundary states, including density matrix renormalization group, time-evolving block decimation, coarse-graining/corner tensor renormalization group, and several distinguished variational algorithms. Finally, we revisit the TN approaches from the perspective of multi-linear algebra (also known as tensor algebra or tensor decompositions) and quantum simulation. Despite the apparent differences in the ideas and strategies of different TN algorithms, we aim at revealing the underlying relations and resemblances in order to present a systematic picture to understand the TN contraction approaches.
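As a small illustration of what a TN contraction looks like in practice, the sketch below evaluates one amplitude of a matrix product state by multiplying one matrix per site. It is not taken from the lecture notes: the data layout (`tensors[i][s]` is the matrix for site `i` with physical index `s`, with 1×D and D×1 boundary matrices), the helper names, and the unnormalized three-site example state |000⟩ + |111⟩ are assumptions chosen for illustration.

```python
def matmul(a, b):
    # Plain matrix product of two nested-list matrices.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def mps_amplitude(tensors, config):
    """Contract an open-boundary matrix product state for one configuration.

    Fixing the physical index at every site leaves a chain of matrices;
    contracting the shared bond indices is a sequence of matrix products.
    """
    result = tensors[0][config[0]]
    for site, s in zip(tensors[1:], config[1:]):
        result = matmul(result, site[s])
    return result[0][0]  # the boundary matrices make the result 1x1


# Hypothetical example: unnormalized |000> + |111> with bond dimension 2.
ghz = [
    {0: [[1, 0]], 1: [[0, 1]]},                  # left boundary, 1x2
    {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]},  # bulk site, 2x2
    {0: [[1], [0]], 1: [[0], [1]]},              # right boundary, 2x1
]
amp_000 = mps_amplitude(ghz, (0, 0, 0))
amp_010 = mps_amplitude(ghz, (0, 1, 0))
```

The cost of this contraction grows linearly with the number of sites for a fixed bond dimension, whereas storing the full state vector would grow exponentially.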