Source author record

Chao Tang

Chao Tang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

19 works

23 topics

4 close collaborators


preprint, 2026, arXiv

Towards Customized Multimodal Role-Play

Unified multimodal understanding and generation models enable richer human-AI interaction. Yet jointly customizing a character's persona, dialogue style, and visual identity while maintaining output consistency across modalities remains largely unexplored. To mitigate this gap, we introduce a new task, Customized Multimodal Role-Play (CMRP). We construct the RoleScape-20 dataset comprising 20 characters, including training and evaluation data that cover persona, stylistic descriptions, visual/expressive cues, and text-image interactions. Building on a unified model, we devise UniCharacter, a two-stage training framework containing Unified Supervised Finetuning (Unified-SFT) and character-specific group relative policy optimization (Character-GRPO). Given only 10 images plus corresponding interaction examples, the model acquires the target character and exhibits coherent persona, style, and visual identity in both generated text and images. This process takes about 100 GPU hours. Experiments on the RoleScape-20 dataset show that the proposed method substantially outperforms prior approaches. Ablation studies further validate the effectiveness of our cross-modal consistency design and few-shot customization strategy. We argue that CMRP, coupled with unified modeling, provides a basis for next-generation characterful and immersive interactive agents.
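
The abstract does not spell out how Character-GRPO scores its samples. As a rough illustration of the generic group-relative step that GRPO-style methods build on, the sketch below normalizes each reward against the other responses sampled for the same prompt. The function name, the reward values, and the choice of the population standard deviation are assumptions made for illustration, not details from the paper.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumed choice)
    # eps guards against division by zero when all rewards in the group are equal
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four responses sampled for one prompt.
rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group mean receive positive advantages and those below it receive negative ones, so no separate value network is needed to provide a baseline.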

preprint, 2026, arXiv

Transient learning dynamics drive escape from sharp valleys in Stochastic Gradient Descent

Stochastic gradient descent (SGD) is central to deep learning, yet the dynamical origin of its preference for flatter, more generalizable solutions remains unclear. Here, by analyzing SGD learning dynamics, we identify a nonequilibrium mechanism governing solution selection. Numerical experiments reveal a transient exploratory phase in which SGD trajectories repeatedly escape sharp valleys and transition toward flatter regions of the loss landscape. By using a tractable physical model, we show that the SGD noise reshapes the landscape into an effective potential that favors flat solutions. Crucially, we uncover a transient freezing mechanism: as training proceeds, growing energy barriers suppress inter-valley transitions and ultimately trap the dynamics within a single basin. Increasing the SGD noise strength delays this freezing, which enhances convergence to flatter minima. Together, these results provide a unified physical framework linking learning dynamics, loss-landscape geometry, and generalization, and suggest principles for the design of more effective optimization algorithms.

preprint, 2022, arXiv

MorphoSim: An efficient and scalable phase-field framework for accurately simulating multicellular morphologies

The phase field model can accurately simulate the evolution of microstructures with complex morphologies, and it has been widely used for cell modeling in the last two decades. However, compared to other cellular models such as the coarse-grained model and the vertex model, its high computational cost, caused by three-dimensional spatial discretization, has hampered its application and scalability, especially for multicellular organisms. Recently, we built a phase field model coupled with in vivo imaging data to accurately reconstruct the embryonic morphogenesis of Caenorhabditis elegans from the 1- to 8-cell stages [Kuang et al., PLoS Comput. Biol., 2022]. In this work, we propose an improved phase field model that uses a stabilized numerical scheme and a modified volume constriction. We then present a scalable phase-field framework, MorphoSim, which is 100 times more efficient than the previous one and can simulate over 100 mechanically interacting cells. Finally, we demonstrate how MorphoSim can be applied to reproduce the assembly, self-repair, and dissociation of a synthetic multicellular system, the synNotch system.

preprint, 2022, arXiv

Primitive Shape Recognition for Object Grasping

Shape informs how an object should be grasped, both in terms of where and how. As such, this paper describes a segmentation-based architecture for decomposing objects sensed with a depth camera into multiple primitive shapes, along with a post-processing pipeline for robotic grasping. Segmentation employs a deep network, called PS-CNN, trained on synthetic data with 6 classes of primitive shapes and generated using a simulation engine. Each primitive shape is designed with parametrized grasp families, permitting the pipeline to identify multiple grasp candidates per shape region. The grasps are rank ordered, with the first feasible one chosen for execution. For task-free grasping of individual objects, the method achieves a 94.2% success rate placing it amongst the top performing grasp methods when compared to top-down and SE(3)-based approaches. Additional tests involving variable viewpoints and clutter demonstrate robustness to setup. For task-oriented grasping, PS-CNN achieves a 93.0% success rate. Overall, the outcomes support the hypothesis that explicitly encoding shape primitives within a grasping pipeline should boost grasping performance, including task-free and task-relevant grasp prediction.
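
The selection rule described above (rank the grasp candidates, then execute the first feasible one) can be sketched as follows. This is a minimal illustration under stated assumptions: the `GraspCandidate` type, the numeric scores, and the feasibility callback are hypothetical stand-ins, not the paper's actual data structures or planner checks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass(frozen=True)
class GraspCandidate:
    name: str     # hypothetical label for the shape region that produced the grasp
    score: float  # hypothetical ranking score; higher is better

def select_grasp(candidates: Iterable[GraspCandidate],
                 is_feasible: Callable[[GraspCandidate], bool]) -> Optional[GraspCandidate]:
    """Rank candidates by descending score and return the first feasible one."""
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if is_feasible(cand):
            return cand
    return None  # no candidate passed the feasibility check

candidates = [
    GraspCandidate("cylinder-side", 0.91),
    GraspCandidate("handle-top", 0.87),
    GraspCandidate("cuboid-edge", 0.60),
]
# Stand-in feasibility check: pretend the top-ranked grasp is unreachable.
chosen = select_grasp(candidates, lambda c: c.name != "cylinder-side")
```

Here the second-ranked candidate is chosen because the first fails the stand-in check. In a real pipeline the callback would wrap whatever reachability or collision test the robot uses.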

preprint, 2022, arXiv

Spontaneous mechanical and energetic state transitions during Caenorhabditis elegans gastrulation

Gastrulation, namely cell internalization, is a significant milestone in the development of metazoans from worm to human; it generates multiple embryonic layers with distinct cell fates and spatial organizations. Although many molecular activities are known to facilitate this process, in this paper we focus on gastrulation of the nematode Caenorhabditis elegans and theoretically demonstrate that even a group of cells with only isotropic repulsive and attractive interactions can undergo such internalization when dividing within a confined space. As the cell number increases and cell size decreases, the cells in contact with the eggshell move closer to each other and experience stronger lateral compression, and a cell that internalizes can effectively increase the distance between neighboring cells and lower the potential energy of the system. The multicellular structure transitions spontaneously from a single layer to a double layer, with bistable states existing from the 15- to 44-cell stages, near the timing of gastrulation in vivo. Specifically, cells that are larger or located near a boundary of smaller curvature internalize more easily. Actively regulating the internalization of a few cells can make morphogenesis noise-resistant. Our work recapitulates the key characteristics of C. elegans gastrulation and provides a rational interpretation of how this phenomenon emerges and is optimally programmed.

preprint, 2021, arXiv

Efficient Frequency Doubling with Active Stabilization on Chip

Thin-film lithium niobate (TFLN) is superior for integrated nanophotonics due to its outstanding properties in nearly all aspects: strong second-order nonlinearity, fast and efficient electro-optic effects, a wide transparency window, and little two-photon absorption and free-carrier scattering. Together, they permit highly integrated nanophotonic circuits capable of complex photonic processing by incorporating disparate elements on the same chip. Yet a demonstration that synergizes those superior properties for system advantage has been missing. Here we demonstrate such a chip, which capitalizes on TFLN's favorable ferroelectricity, high second-order nonlinearity, and strong electro-optic effects. It consists of a monolithic circuit integrating a Z-cut, quasi-phase-matched microring with a high quality factor and a phase modulator used in active feedback control. By Pound-Drever-Hall locking, it realizes stable frequency doubling at about 50% conversion with only milliwatt pump power, marking the highest by far among all nanophotonic platforms with milliwatt pumping. Our demonstration addresses a long-outstanding challenge facing cavity-based optical processing, including frequency conversion, frequency comb generation, and all-optical switching, whose stable performance is hindered by photorefractive or thermal effects. Our results further establish TFLN as an excellent material capable of optical multitasking, as is desirable for building multi-functional chip devices.

preprint, 2020, arXiv

Deciphering gene regulation from gene expression dynamics using deep neural network

Complex biological functions are carried out by the interaction of genes and proteins. Uncovering the gene regulation network behind a function is one of the central themes in biology. Typically, it involves extensive experiments of genetics, biochemistry and molecular biology. In this paper, we show that much of the inference task can be accomplished by a deep neural network (DNN), a form of machine learning or artificial intelligence. Specifically, the DNN learns from the dynamics of the gene expression. The learnt DNN behaves like an accurate simulator of the system, on which one can perform in-silico experiments to reveal the underlying gene network. We demonstrate the method with two examples: biochemical adaptation and the gap-gene patterning in fruit fly embryogenesis. In the first example, the DNN can successfully find the two basic network motifs for adaptation - the negative feedback and the incoherent feed-forward. In the second and much more complex example, the DNN can accurately predict behaviors of essentially all the mutants. Furthermore, the regulation network it uncovers is strikingly similar to the one inferred from experiments. In doing so, we develop methods for deciphering the gene regulation network hidden in the DNN "black box". Our interpretable DNN approach should have broad applications in genotype-phenotype mapping.

preprint, 2020, arXiv

Generative Adversarial Network-Based Sinogram Super-Resolution for Computed Tomography Imaging

Compared with the conventional 1*1 acquisition mode of projection in computed tomography (CT) image reconstruction, the 2*2 acquisition mode improves the collection efficiency of the projection and reduces the X-ray exposure time. However, the collected projection based on the 2*2 acquisition mode has low resolution (LR) and the reconstructed image quality is poor, thus limiting the use of this mode in CT imaging systems. In this study, a novel sinogram-super-resolution generative adversarial network (SSR-GAN) model is proposed to obtain high-resolution (HR) sinograms from LR sinograms, thereby improving the reconstruction image quality under the 2*2 acquisition mode. The proposed generator is based on the residual network for LR sinogram feature extraction and super-resolution (SR) sinogram generation. A relativistic discriminator is designed to render the network capable of obtaining more realistic SR sinograms. Moreover, we combine the cycle consistency loss, sinogram domain loss, and reconstruction image domain loss in the total loss function to supervise SR sinogram generation. Then, a trained model can be obtained by inputting the paired LR/HR sinograms into the network. Finally, the classic FBP reconstruction algorithm is used for CT image reconstruction based on the generated SR sinogram. The qualitative and quantitative results of evaluations on digital and real data illustrate that the proposed model not only obtains clean SR sinograms from noisy LR sinograms but also outperforms its counterparts.

preprint, 2020, arXiv

New structure candidates for the experimentally synthesized heptazine-based and triazine-based two dimensional graphitic carbon nitride

The widely used crystal structures for both heptazine-based and triazine-based two-dimensional (2D) graphitic carbon nitride (g-C$_3$N$_4$) are the flat P-6m2 configurations. However, experimentally synthesized 2D g-C$_3$N$_4$ has thicknesses in the range 0.2-0.5 nm, indicating that the flat P-6m2 configurations used in theory are not the correct ground states. In this work, we propose three new corrugated structures, P321, P3m1 and Pca21, with energies 66 (86), 77 (87) and 78 (89) meV/atom lower, respectively, than that of the corresponding heptazine-based (triazine-based) g-C$_3$N$_4$ in the flat P-6m2 configuration. These corrugated structures have periodic patterns very similar to the flat P-6m2 ones and are difficult to distinguish from them in top view. The optimized thicknesses of the three corrugated structures, ranging from 1.347 to 3.142 Å, are in good agreement with the experimental results. The first-principles results show that these corrugated structural candidates are also semiconductors, with band gaps slightly larger than those of the corresponding flat P-6m2 ones. Furthermore, they also possess suitable band edge positions for sunlight-driven water splitting in both $pH=0$ and $pH=7$ environments. Our results show that these three new structures are more promising candidates for the experimentally synthesized g-C$_3$N$_4$.

preprint, 2020, arXiv

Recognizing Object Affordances to Support Scene Reasoning for Manipulation Tasks

Affordance information about a scene provides important clues as to what actions may be executed in pursuit of a specified goal state. Thus, integrating affordance-based reasoning into symbolic action planning pipelines would enhance the flexibility of robot manipulation. Unfortunately, the top performing affordance recognition methods use object category priors to boost the accuracy of affordance detection and segmentation. Object priors limit generalization to unknown object categories. This paper describes an affordance recognition pipeline based on a category-agnostic region proposal network for proposing instance regions of an image across categories. To guide affordance learning in the absence of category priors, the training process includes the auxiliary task of explicitly inferring the affordances present within a proposal. In addition, a self-attention mechanism trained to interpret each proposal learns to capture rich contextual dependencies across the region. Visual benchmarking shows that the trained network, called AffContext, reduces the performance gap between object-agnostic and object-informed affordance recognition. AffContext is linked to the Planning Domain Definition Language (PDDL) with an augmented state keeper for action planning across temporally spaced goal-oriented tasks. Manipulation experiments show that AffContext can successfully parse scene content to seed a symbolic planner problem specification, whose execution completes the target task. Additionally, task-oriented grasping for cutting and pounding actions demonstrates the exploitation of multiple affordances for a given object to complete specified tasks.

preprint, 2020, arXiv

Theoretical prediction of a low-energy Stone-Wales graphene with intrinsic type-III Dirac-cone

Based on first-principles calculations, we predict a new low-energy Stone-Wales graphene, SW40, which has an orthorhombic lattice with Pbam symmetry and 40 carbon atoms in its crystalline cell forming well-arranged Stone-Wales patterns. The calculated total energy of SW40 is only about 133 meV higher than that of graphene, indicating excellent stability that exceeds that of all previously proposed graphene allotropes. We find that SW40 possesses an intrinsic type-III Dirac cone (Phys. Rev. Lett., 120, 237403, 2018) formed by the crossing of a local linear band and a local flat band, which can result in highly anisotropic fermions in the system. Interestingly, this intrinsic type-III Dirac cone can be effectively tuned by inner-layer strains: it transforms into type-II and type-I Dirac cones under tensile and compressive strains, respectively. Finally, a general tight-binding model was constructed to understand the electronic properties near the Fermi level in SW40. The results show that the type-III Dirac-cone feature can be well understood in terms of the $\pi$-electron interactions between adjacent Stone-Wales defects.

preprint, 2020, arXiv

Ultra-bright Quantum Photon Sources on Chip

Quantum photon sources of high rate, brightness, and purity are increasingly desirable as quantum information systems are quickly scaled up and applied to many fields. Using a periodically poled lithium niobate microresonator on chip, we demonstrate photon-pair generation at high rates of 8.5 MHz and 36.3 MHz using only 3.4 $\mu$W and 13.4 $\mu$W of pump power, respectively, marking orders-of-magnitude improvement over the state of the art. The measured coincidence-to-accidental ratio is well above 100 at those high rates and reaches $14{,}682\pm 4427$ at a lower pump power. The same chip enables heralded single-photon generation at tens of megahertz rates, with low auto-correlation $g^{(2)}_{H}(0)=0.008$ and $0.097$ for the two microwatt pump powers. Such distinct performance, facilitated by the chip device's noiseless and giant optical nonlinearity, will contribute to the forthcoming pervasive adoption of quantum optical information technologies.
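
One way to compare the two operating points quoted above is to divide each pair rate by its pump power. The short sketch below does only that arithmetic on the abstract's numbers. The helper name and the choice of MHz per microwatt as the unit are illustrative assumptions, and the abstract itself does not report this normalized figure.

```python
# (pair rate in MHz, pump power in microwatts), as quoted in the abstract
operating_points = [(8.5, 3.4), (36.3, 13.4)]

def rate_per_power(rate_mhz: float, pump_uw: float) -> float:
    """Pair-generation rate normalized by pump power, in MHz per microwatt."""
    return rate_mhz / pump_uw

normalized = [rate_per_power(r, p) for r, p in operating_points]
for (r, p), n in zip(operating_points, normalized):
    print(f"{r} MHz at {p} uW -> {n:.2f} MHz/uW")
```

Both points come out at roughly 2.5 to 2.7 MHz per microwatt, which would be consistent with a pair rate that grows approximately in proportion to pump power over this range.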

preprint, 2020, arXiv

Ultra-efficient and highly-tunable second-harmonic generation in Z-cut periodically poled lithium niobate nanowaveguides

Thin-film lithium niobate on insulator (LNOI) has emerged as a superior integrated-photonics platform for linear, nonlinear, and electro-optics. Here we combine quasi-phase-matching, dispersion engineering, and tight mode confinement to realize nonlinear parametric processes with both high efficiency and wide wavelength tunability. On a millimeter-long, Z-cut LNOI waveguide, we demonstrate ultra-efficient ($1900\pm500\,\%\,\mathrm{W}^{-1}\,\mathrm{cm}^{-2}$) and highly tunable ($-1.71$ nm/K) second-harmonic generation from 1530 to 1583 nm by type-0 quasi-phase-matching. Our technique is applicable to optical harmonic generation, quantum light sources, frequency conversion, and many other photonic information processing tasks across the visible to mid-IR spectral bands.

preprint, 2018, arXiv

Critical slowing down and attractive manifold: a mechanism for dynamic robustness in yeast cell-cycle process

Biological processes that execute multiple complex functions, such as the cell cycle, must ensure the order of sequential events and maintain dynamic robustness against various fluctuations. Here, we examine the dynamic mechanism and the fundamental structure that achieve these properties in the cell-cycle process of the budding yeast Saccharomyces cerevisiae. We show that the budding yeast cell-cycle process behaves like an excitable system containing three well-coupled saddle-node bifurcations that execute the DNA replication and mitosis events. The yeast cell-cycle regulatory network can be separated into a G1/S phase module, an early M module and a late M phase module, where the positive feedbacks in each module and the interactions among the modules play an important role. If the cell-cycle process operates near the critical points of the saddle-node bifurcations, there is a critical slowing down, or ghost effect. This can provide the cell-cycle process with a sufficient duration for each event and an attractive manifold for checking the completion of DNA replication and mitosis; moreover, fluctuations in an earlier module or event are prevented from propagating to a later module or event. Our results suggest both a fundamental structure of the cell-cycle regulatory network and a hint for the evolution of eukaryotic cell-cycle processes, from the dynamic checking mechanism to the molecular checkpoint pathway.
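
The ghost effect mentioned above can be illustrated with the textbook normal form of a saddle-node bifurcation, dx/dt = r + x^2. Just past the critical point (small positive r) the trajectory lingers near x = 0, and the passage time grows like pi/sqrt(r). The sketch below is a generic illustration under that assumption. It is not the authors' yeast cell-cycle model, and the function names, integration limits, and step size are arbitrary choices.

```python
import math

def passage_time(r: float, x_start: float = -10.0, x_end: float = 10.0,
                 dt: float = 1e-3) -> float:
    """Forward-Euler time for dx/dt = r + x**2 to travel from x_start to x_end."""
    x, t = x_start, 0.0
    while x < x_end:
        x += (r + x * x) * dt
        t += dt
    return t

def analytic_time(r: float, limit: float = 10.0) -> float:
    """Exact passage time between -limit and +limit; tends to pi/sqrt(r) as r -> 0."""
    s = math.sqrt(r)
    return 2.0 / s * math.atan(limit / s)

# Passage time grows as r approaches the bifurcation point r = 0.
for r in (1.0, 0.1, 0.01):
    print(r, round(passage_time(r), 2), round(analytic_time(r), 2))
```

The closer r is to zero, the longer the system dwells in the bottleneck. This is the kind of slow passage that the abstract describes as giving each cell-cycle event a sufficient duration.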

preprint, 2016, arXiv

Dynamics of ellipsoidal tracers in swimming algal suspensions

Enhanced diffusion of passive tracers immersed in active fluids is a universal feature of active fluids and has been extensively studied in recent years. Similar to microrheology for equilibrium complex fluids, the unusual enhanced particle dynamics reveal intrinsic properties of active fluids. Nevertheless, previous studies have shown that the translational dynamics of spherical tracers are qualitatively similar, independent of whether active particles are pushers or pullers, the two fundamental classes of active fluids. Is it possible to distinguish pushers from pullers by simply imaging the dynamics of passive tracers? Here, we investigated the diffusion of isolated ellipsoids in algal C. reinhardtii suspensions, a model for puller-type active fluids. In combination with our previous results on pusher-type E. coli suspensions [Peng et al., Phys. Rev. Lett. 116, 068303 (2016)], we showed that the dynamics of asymmetric tracers show a profound difference in pushers and pullers due to their rotational degree of freedom. Although the laboratory-frame translation and rotation of ellipsoids are enhanced in both pushers and pullers, similar to spherical tracers, the anisotropic diffusion in the body frame of ellipsoids shows opposite trends in the two classes of active fluids. An ellipsoid diffuses fastest along its major axis when immersed in pullers, whereas it diffuses slowest along the major axis in pushers. This striking difference can be qualitatively explained using a simple hydrodynamic model. In addition, our study on algal suspensions reveals that the influence of the near-field advection of algal swimming flows on the translation and rotation of ellipsoids shows different ranges and strengths. Our work provides not only new insights into universal organizing principles of active fluids, but also a convenient tool for detecting the class of active particles.

preprint, 2015, arXiv

Two dimensional topological insulators with tunable band gaps: HgTe and HgSe monolayers

Employing ab initio electronic calculations, we propose a new type of two-dimensional (2D) topological insulator (TI), monolayer (ML) low-buckled (LB) mercury telluride (HgTe) and mercury selenide (HgSe), with tunable band gaps. Monolayer LB HgTe undergoes a transition to a topologically nontrivial phase under appropriate in-plane tensile strain (ε > 2.6%) due to the combined effects of strain and spin-orbit coupling (SOC). For tensile strain 2.6% < ε < 4.2%, both the band inversion and the topologically nontrivial gap are induced by the SOC. For ε > 4.2%, the band inversion is already realized by strain, but the topological gap is induced by SOC. The band gap of the monolayer LB HgTe TI phase can be tuned over a wide range, from 0 eV to 0.20 eV, as the tensile strain increases from 2.6% to 7.4%. Similarly, the topological phase transition of monolayer LB HgSe is induced by strain and SOC for strain ε > 3.1%. The topological band gap reaches 0.05 eV as the strain increases to about 4.6%. The large band gaps of 2D LB HgTe and HgSe monolayers make this type of material suitable for practical applications at room temperature.

preprint, 2014, arXiv

Community detection for networks with unipartite and bipartite structure

Finding community structures in networks is important in network science, technology, and applications. To date, most algorithms that aim to find community structures focus only on either unipartite or bipartite networks. A unipartite network consists of one set of nodes, and a bipartite network consists of two nonoverlapping sets of nodes with links joining only nodes in different sets. However, a third type of network exists, defined here as the mixture network. Just like a bipartite network, a mixture network also consists of two sets of nodes, but some nodes may simultaneously belong to both sets, which breaks the nonoverlapping restriction of a bipartite network. The mixture network can be considered as a general case, with unipartite and bipartite networks viewed as its limiting cases. A mixture network can represent not only all the unipartite and bipartite networks, but also a wide range of real-world networks that cannot be properly represented as either unipartite or bipartite networks in fields such as biology and social science. Based on this observation, we first propose a probabilistic model that can find modules in unipartite, bipartite, and mixture networks in a unified framework based on the link community model for a unipartite undirected network [B. Ball et al., Phys. Rev. E 84, 036103 (2011)]. We test our algorithm on synthetic networks (with both overlapping and nonoverlapping communities) and apply it to two real-world networks: a southern women bipartite network and a human transcriptional regulatory mixture network. The results suggest that our model performs well for all three types of networks, is competitive with other algorithms for unipartite or bipartite networks, and is applicable to real-world networks.
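
The three network types defined above can be told apart from how much the two node sets overlap. The sketch below encodes one reading of the abstract: no overlap gives a bipartite network, identical sets give a unipartite one, and partial overlap gives a mixture network. The function name, the validation rule for edges, and the toy regulatory example are assumptions for illustration. This is not the paper's probabilistic community model.

```python
def classify_network(set_a, set_b, edges):
    """Label a two-set network as 'unipartite', 'bipartite', or 'mixture'.

    Assumes each undirected edge joins a node in set A to a node in set B.
    """
    a, b = set(set_a), set(set_b)
    for u, v in edges:
        if not ((u in a and v in b) or (v in a and u in b)):
            raise ValueError(f"edge {(u, v)} does not join set A to set B")
    if not (a & b):
        return "bipartite"   # limiting case: the two sets do not overlap
    if a == b:
        return "unipartite"  # limiting case: the two sets coincide
    return "mixture"         # some, but not all, nodes belong to both sets

# Hypothetical regulatory example: TF2 is both a regulator and a target.
regulators = {"TF1", "TF2"}
targets = {"TF2", "geneX", "geneY"}
edges = [("TF1", "TF2"), ("TF2", "geneX"), ("TF1", "geneY")]
kind = classify_network(regulators, targets, edges)
```

In this toy case the result is "mixture", because TF2 sits in both sets while the other nodes do not.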

preprint, 2014, arXiv

Generic Properties of Random Gene Regulatory Networks

Modeling gene regulatory networks (GRNs) is an important topic in systems biology. Although much work has focused on various specific systems, the generic behavior of GRNs with continuous variables is still elusive. In particular, it is not clear how attractors typically partition among the three types of orbits (steady state, periodic and chaotic) and how the dynamical properties change with the network's topological characteristics. In this work, we first investigated these questions in random GRNs with different network sizes, connectivities, fractions of inhibitory links and transcription regulation rules. Then we searched for the core motifs that govern the dynamic behavior of large GRNs. We show that the stability of a random GRN is typically governed by a few embedded motifs of small size, and can therefore in general be understood in the context of these short motifs. Our results provide insights for the study and design of genetic networks.

preprint, 2001, arXiv

Fast Tree Search for Enumeration of a Lattice Model of Protein Folding

Using a fast tree-searching algorithm and a Pentium cluster, we enumerated all the sequences and compact conformations (structures) for a protein folding model on a cubic lattice of size $4\times3\times3$. We used two types of amino acids, hydrophobic (H) and polar (P), to make up the sequences, so there were $2^{36} \approx 6.87 \times 10^{10}$ different sequences. The total number of distinct structures was 84,731,192. We made use of a simple solvation model in which the energy of a sequence folded into a structure is minus the number of hydrophobic amino acids in the "core" of the structure. For every sequence, we found its ground state or ground states, i.e., the structure or structures for which its energy is lowest. About 0.3% of the sequences have a unique ground state. The number of structures that are unique ground states of at least one sequence is 2,662,050, about 3% of the total number of structures. However, these "designable" structures differ drastically in their designability, defined as the number of sequences whose unique ground state is that structure. To understand this variation in designability, we studied the distribution of structures in a high dimensional space in which each structure is represented by a string of 1's and 0's, denoting core and surface sites, respectively.
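
The energy rule and the definition of designability described above can be tried out on a toy system. The sketch below enumerates every H/P sequence for a handful of core/surface strings and counts how many sequences have each structure as their unique ground state. The 4-site structures are made-up stand-ins chosen so the enumeration is tiny. They are not compact conformations of the $4\times3\times3$ lattice, and the brute-force loop is not the paper's tree-search algorithm.

```python
from itertools import product

def energy(sequence: str, structure: str) -> int:
    """Minus the number of hydrophobic (H) residues sitting on core sites ('1')."""
    return -sum(1 for res, site in zip(sequence, structure)
                if res == "H" and site == "1")

def designability(structures, length):
    """Count, per structure, the sequences whose unique ground state it is."""
    counts = {s: 0 for s in structures}
    for residues in product("HP", repeat=length):
        seq = "".join(residues)
        energies = {s: energy(seq, s) for s in structures}
        lowest = min(energies.values())
        ground = [s for s, e in energies.items() if e == lowest]
        if len(ground) == 1:  # sequences with degenerate ground states are skipped
            counts[ground[0]] += 1
    return counts

# Toy stand-in: 4 sites and three made-up core/surface patterns (not from the paper).
toy_structures = ["1100", "0110", "0011"]
toy_counts = designability(toy_structures, 4)

# Arithmetic on the figures quoted in the abstract.
n_sequences = 2 ** 36                          # number of 36-residue H/P sequences
designable_fraction = 2_662_050 / 84_731_192   # share of structures that are designable
```

Even in this toy setting the structures end up with different counts, which mirrors the abstract's point that designability varies from structure to structure. The last two lines simply reproduce the quoted sequence count and the roughly 3% designable fraction.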
