Source author record

Yu Deng

Yu Deng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, math.AP, math-ph, math.MP, Computation and Language, cond-mat.mtrl-sci, Artificial Intelligence, Databases, Digital Libraries, eess.IV, Information Retrieval, math.CA, math.NT, math.PR, Multimedia, Social and Information Networks, Software Engineering

Catalog footprint

What is connected

18 works

17 topics

4 close collaborators

preprint, 2026, arXiv

Cognifold: Always-On Proactive Memory via Cognitive Folding

Existing agent memory remains predominantly reactive and retrieval-based, lacking the capacity to autonomously organize experience into persistent cognitive structure. Toward genuinely autonomous agents, we introduce Cognifold, a brain-inspired "always-on" agent memory designed for the next generation of proactive assistants. CogniFold continuously folds fragmented event streams into self-emerging cognitive structures, bootstrapping progressively higher-level cognition from incoming events and accumulated knowledge. We ground this by extending Complementary Learning Systems (CLS) theory from two layers (hippocampus, neocortex) to three, adding a prefrontal intent layer. Emulating the prefrontal cortex as the locus of intentional control and decision-making, CogniFold achieves this through graph-topology self-organization: cognitive structures proactively assemble under the stream, merge when semantically similar, decay when stale, relink through associative recall, and surface intents when concept-cluster density crosses a threshold. We evaluate structural formation using CogEval-Bench, demonstrating that CogniFold uniquely produces memory structures that match cognitive expectations and concept emergence. Furthermore, across 7 broad-coverage benchmarks spanning five cognitive domains, we validate that CogniFold simultaneously performs robustly on conventional memory benchmarks.

preprint, 2022, arXiv

Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects

3D-aware generative models have demonstrated their superb performance to generate 3D neural radiance fields (NeRF) from a collection of monocular 2D images even for topology-varying object categories. However, these methods still lack the capability to separately control the shape and appearance of the objects in the generated radiance fields. In this paper, we propose a generative model for synthesizing radiance fields of topology-varying objects with disentangled shape and appearance variations. Our method generates deformable radiance fields, which builds the dense correspondence between the density fields of the objects and encodes their appearances in a shared template field. Our disentanglement is achieved in an unsupervised manner without introducing extra labels to previous 3D-aware GAN training. We also develop an effective image inversion scheme for reconstructing the radiance field of an object in a real monocular image and manipulating its shape and appearance. Experiments show that our method can successfully learn the generative model from unstructured monocular images and well disentangle the shape and appearance for objects (e.g., chairs) with large topological variance. The model trained on synthetic data can faithfully reconstruct the real object in a given single image and achieve high-quality texture and shape editing results.

preprint, 2022, arXiv

GRAM: Generative Radiance Manifolds for 3D-Aware Image Generation

3D-aware image generative modeling aims to generate 3D-consistent images with explicitly controllable camera poses. Recent works have shown promising results by training neural radiance field (NeRF) generators on unstructured 2D images, but still can not generate highly-realistic images with fine details. A critical reason is that the high memory and computation cost of volumetric representation learning greatly restricts the number of point samples for radiance integration during training. Deficient sampling not only limits the expressive power of the generator to handle fine details but also impedes effective GAN training due to the noise caused by unstable Monte Carlo sampling. We propose a novel approach that regulates point sampling and radiance field learning on 2D manifolds, embodied as a set of learned implicit surfaces in the 3D volume. For each viewing ray, we calculate ray-surface intersections and accumulate their radiance generated by the network. By training and rendering such radiance manifolds, our generator can produce high quality images with realistic fine details and strong visual 3D consistency.

preprint, 2022, arXiv

Invariant Gibbs measures for the three dimensional cubic nonlinear wave equation

We prove the invariance of the Gibbs measure under the dynamics of the three-dimensional cubic wave equation, which is also known as the hyperbolic $Φ^4_3$-model. This result is the hyperbolic counterpart to seminal works on the parabolic $Φ^4_3$-model by Hairer '14 and Hairer-Matetski '18. The heart of the matter lies in establishing local in time existence and uniqueness of solutions on the statistical ensemble, which is achieved by using a para-controlled Ansatz for the solution, the analytical framework of the random tensor theory, and the combinatorial molecule estimates. The singularity of the Gibbs measure with respect to the Gaussian free field brings out a new caloric representation of the Gibbs measure and a synergy between the parabolic and hyperbolic theories embodied in the analysis of heat-wave stochastic objects. Furthermore from a purely hyperbolic standpoint our argument relies on key new ingredients that include a hidden cancellation between sextic stochastic objects and a new bilinear random tensor estimate.

preprint, 2022, arXiv

Propagation of chaos and the higher order statistics in the wave kinetic theory

This manuscript continues and extends in various directions the result in arXiv:2104.11204, which gave a full derivation of the wave kinetic equation (WKE) from the nonlinear Schrödinger (NLS) equation in dimensions $d\geq 3$. The wave kinetic equation describes the effective dynamics of the second moments of the Fourier modes of the NLS solution at the kinetic timescale, and in the kinetic limit in which the size of the system diverges to infinity and the strength of the nonlinearity vanishes asymptotically according to a specified scaling law. Here, we investigate the behavior of the joint distribution of these Fourier modes and derive their effective limit dynamics at the kinetic timescale. In particular, we prove propagation of chaos in the wave setting: initially independent Fourier modes retain this independence in the kinetic limit. Such statements are central to the formal derivations of all kinetic theories, dating back to the work of Boltzmann (Stosszahlansatz). We obtain this by deriving the asymptotics of the higher Fourier moments, which are given by solutions of the wave kinetic hierarchy (WKH) with factorized initial data. As a byproduct, we also provide a rigorous justification of this hierarchy for general (not necessarily factorized) initial data. We treat both Gaussian and non-Gaussian initial distributions. In the Gaussian setting, we prove propagation of Gaussianity as we show that the asymptotic distribution retains the Gaussianity of the initial data in the limit. In the non-Gaussian setting, we derive the limiting equations for the higher order moments, as well as for the density function (PDF) of the solution. Some of the results we prove were conjectured in the physics literature, others appear to be new. This gives a complete description of the statistics of the solutions in the kinetic limit.

preprint, 2022, arXiv

Rigorous justification of the wave kinetic theory

The main purpose of this expository note is to give a short account of the recent developments in mathematical wave kinetic theory. After reviewing the physical theory, we explain the importance of the notion of a scaling law, which dictates the relation between the asymptotic parameters as the kinetic limit is taken. This sets some natural limitations on the kinetic approximation that were not precisely understood in the literature as far as we know. We then describe our recent and upcoming works that give the first full, mathematically rigorous, derivation of the wave kinetic theory at the natural kinetic timescale. The key new ingredient is a delicate analysis of the diagrammatic expansion that allows to a) uncover highly elaborate cancellations at arbitrary large order of diagrams, and b) overcome difficulties coming from factorial divergences in the expansion and the criticality of the problem. The results mentioned in this note appear in our recent works [16, 17] as well as an upcoming one in [18].

preprint, 2021, arXiv

1550 nm compatible ultrafast photoconductive material based on a GaAs/ErAs/GaAs heterostructure

The sub-bandgap absorption and ultrafast relaxation in a GaAs/ErAs/GaAs heterostructure are reported. The infrared absorption and 1550 nm-excited ultrafast photo-response are studied by Fourier transform infrared (FTIR) spectrometry and the time-domain pump-probe technique. The two absorption peaks located at 2.0 µm (0.62 eV) and 2.7 µm (0.45 eV) originate from the ErAs/GaAs interfacial Schottky states and the sub-bandgap transition within GaAs, respectively. The photo-induced carrier lifetime, excited using 1550 nm light, is measured to be as low as 190 fs for the GaAs/ErAs/GaAs heterostructure, making it a promising material for 1550-nm-technology-compatible, high critical-breakdown-field THz devices. The relaxation mechanism is proposed and the functionality of ErAs is revealed.

preprint, 2021, arXiv

A Deep Learning-Based Approach to Extracting Periosteal and Endosteal Contours of Proximal Femur in Quantitative CT Images

Automatic CT segmentation of proximal femur is crucial for the diagnosis and risk stratification of orthopedic diseases; however, current methods for the femur CT segmentation mainly rely on manual interactive segmentation, which is time-consuming and has limitations in both accuracy and reproducibility. In this study, we proposed an approach based on deep learning for the automatic extraction of the periosteal and endosteal contours of proximal femur in order to differentiate cortical and trabecular bone compartments. A three-dimensional (3D) end-to-end fully convolutional neural network, which can better combine the information between neighbor slices and get more accurate segmentation results, was developed for our segmentation task. 100 subjects aged from 50 to 87 years with 24,399 slices of proximal femur CT images were enrolled in this study. The separation of cortical and trabecular bone derived from the QCT software MIAF-Femur was used as the segmentation reference. We randomly divided the whole dataset into a training set with 85 subjects for 10-fold cross-validation and a test set with 15 subjects for evaluating the performance of models. Two models with the same network structures were trained and they achieved a dice similarity coefficient (DSC) of 97.87% and 96.49% for the periosteal and endosteal contours, respectively. To verify the excellent performance of our model for femoral segmentation, we measured the volume of different parts of the femur and compared it with the ground truth and the relative errors between predicted result and ground truth are all less than 5%. It demonstrated a strong potential for clinical use, including the hip fracture risk prediction and finite element analysis.

preprint, 2021, arXiv

Spread Mechanism and Influence Measurement of Online Rumors in China During the COVID-19 Pandemic

In early 2020, the Corona Virus Disease 2019 (COVID-19) pandemic swept the world. In China, COVID-19 has caused severe consequences. Moreover, online rumors during the COVID-19 pandemic increased people's panic about public health and social stability. At present, understanding and curbing the spread of online rumors is an urgent task. Therefore, we analyzed the rumor spreading mechanism and propose a method to quantify a rumor's influence by the speed of new insiders. The search frequency of the rumor is used as an observation variable of new insiders. The peak coefficient and the attenuation coefficient are calculated for the search frequency, which conforms to the exponential distribution. We designed several rumor features and used the above two coefficients as predictable labels. A 5-fold cross-validation experiment using the mean square error (MSE) as the loss function showed that the decision tree was suitable for predicting the peak coefficient, and the linear regression model was ideal for predicting the attenuation coefficient. Our feature analysis showed that precursor features were the most important for the outbreak coefficient, while location information and rumor entity information were the most important for the attenuation coefficient. Meanwhile, features that were conducive to the outbreak were usually harmful to the continued spread of rumors. At the same time, anxiety was a crucial rumor-causing factor. Finally, we discuss how to use deep learning technology to reduce the forecast loss by using the Bidirectional Encoder Representations from Transformers (BERT) model.

preprint, 2021, arXiv

Strichartz estimates for the Schroedinger equation on non-rectangular two-dimensional tori

We propose a conjecture for long time Strichartz estimates on generic (non-rectangular) flat tori. We proceed to partially prove it in dimension 2. Our arguments involve, on the one hand, Weyl bounds and, on the other hand, bounds on the number of solutions of Diophantine problems.

preprint, 2020, arXiv

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set

Recently, deep learning based 3D face reconstruction methods have shown promising results in both quality and efficiency. However, training deep neural networks typically requires a large volume of data, whereas face images with ground-truth 3D face shapes are scarce. In this paper, we propose a novel deep 3D face reconstruction approach that 1) leverages a robust, hybrid loss function for weakly-supervised learning which takes into account both low-level and perception-level information for supervision, and 2) performs multi-image face reconstruction by exploiting complementary information from different images for shape aggregation. Our method is fast, accurate, and robust to occlusion and large pose. We provide comprehensive experiments on three datasets, systematically comparing our method with fifteen recent methods and demonstrating its state-of-the-art performance.

preprint, 2020, arXiv

Crossing Variational Autoencoders for Answer Retrieval

Answer retrieval is to find the most aligned answer from a large set of candidates given a question. Learning vector representations of questions/answers is the key factor. Question-answer alignment and question/answer semantics are two important signals for learning the representations. Existing methods learned semantic representations with dual encoders or dual variational auto-encoders. The semantic information was learned from language models or question-to-question (answer-to-answer) generative processes. However, the alignment and semantics were too separate to capture the aligned semantics between question and answer. In this work, we propose to cross variational auto-encoders by generating questions with aligned answers and generating answers with aligned questions. Experiments show that our method outperforms the state-of-the-art answer retrieval method on SQuAD.

preprint, 2020, arXiv

Deep 3D Portrait from a Single Image

In this paper, we present a learning-based approach for recovering the 3D geometry of human head from a single portrait image. Our method is learned in an unsupervised manner without any ground-truth 3D data. We represent the head geometry with a parametric 3D face model together with a depth map for other head regions including hair and ear. A two-step geometry learning scheme is proposed to learn 3D head reconstruction from in-the-wild face images, where we first learn face shape on single images using self-reconstruction and then learn hair and ear geometry using pairs of images in a stereo-matching fashion. The second step is based on the output of the first to not only improve the accuracy but also ensure the consistency of overall head geometry. We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images. We alter pose based on the recovered geometry and apply a refinement network trained with adversarial learning to ameliorate the reprojected images and translate them to the real image domain. Extensive evaluations and comparison with previous methods show that our new method can produce high-fidelity 3D head geometry and head pose manipulation results.

preprint, 2020, arXiv

Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning

We propose DiscoFaceGAN, an approach for face image generation of virtual people with disentangled, precisely-controllable latent representations for identity of non-existing people, expression, pose, and illumination. We embed 3D priors into adversarial learning and train the network to imitate the image formation of an analytic 3D face deformation and rendering process. To deal with the generation freedom induced by the domain gap between real and rendered faces, we further introduce contrastive learning to promote disentanglement by comparing pairs of generated images. Experiments show that through our imitative-contrastive learning, the factor variations are very well disentangled and the properties of a generated face can be precisely controlled. We also analyze the learned latent space and present several meaningful properties supporting factor disentanglement. Our method can also be used to embed real images into the disentangled latent space. We hope our method could provide new understandings of the relationship between physical properties and deep image synthesis.

preprint, 2020, arXiv

Random tensors, propagation of randomness, and nonlinear dispersive equations

The purpose of this paper is twofold. We introduce the theory of random tensors, which naturally extends the method of random averaging operators in our earlier work arXiv:1910.08492, to study the propagation of randomness under nonlinear dispersive equations. By applying this theory we also solve Conjecture 1.7 in arXiv:1910.08492, and establish almost-sure local well-posedness for semilinear Schrödinger equations in spaces that are subcritical in the probabilistic scaling. The solution we find has an explicit expansion in terms of multilinear Gaussians with adapted random tensor coefficients. In the random setting, the probabilistic scaling is the natural scaling for dispersive equations, and is different from the natural scaling for parabolic equations. Our theory, which covers the full subcritical regime in the probabilistic scaling, can be viewed as the dispersive counterpart of the existing parabolic theories (regularity structure, para-controlled calculus and renormalization group techniques).

preprint, 2019, arXiv

Photoluminescence mapping and time-domain thermo-photoluminescence for rapid imaging and measurement of thermal conductivity of boron arsenide

Cubic boron arsenide (BAs) is attracting greater attention due to the recent experimental demonstration of ultrahigh thermal conductivity κ above 1000 W/mK. However, its bandgap has not been settled and a simple yet effective method to probe its crystal quality is missing. Furthermore, traditional κ measurement methods are destructive and time consuming, thus they cannot meet the urgent demand for fast screening of high κ materials. After we experimentally established 1.82 eV as the indirect bandgap of BAs and observed room-temperature band-edge photoluminescence, we developed two new optical techniques that can provide rapid and non-destructive characterization of κ with little sample preparation: photoluminescence mapping (PL-mapping) and time-domain thermo-photoluminescence (TDTP). PL-mapping provides nearly real-time image of crystal quality and κ over mm-sized crystal surfaces; while TDTP allows us to pick up any spot on the sample surface and measure its κ using nanosecond laser pulses. These new techniques reveal that the apparent single crystals are not only non-uniform in κ, but also are made of domains of very distinct κ. Because PL-mapping and TDTP are based on the band-edge PL and its dependence on temperature, they can be applied to other semiconductors, thus paving the way for rapid identification and development of high-κ semiconducting materials.

preprint, 2016, arXiv

Multispeed Klein-Gordon systems in dimension three

We consider long time evolution of small solutions to general multispeed Klein-Gordon systems in 3+1 dimensions. We prove that such solutions are always global and scatter to a linear flow, thus extending previous partial results. The main new ingredients of our method are an improved linear dispersion estimate exploiting the asymptotic spherical symmetry of Klein-Gordon waves, and a corresponding bilinear oscillatory integral estimate.

preprint, 2009, arXiv

OntoELAN: An Ontology-based Linguistic Multimedia Annotator

Despite its scientific, political, and practical value, comprehensive information about human languages, in all their variety and complexity, is not readily obtainable and searchable. One reason is that many language data are collected as audio and video recordings, which poses a challenge to document indexing and retrieval. Annotation of multimedia data provides an opportunity for making the semantics explicit and facilitates the searching of multimedia documents. We have developed OntoELAN, an ontology-based linguistic multimedia annotator that features: (1) support for loading and displaying ontologies specified in OWL; (2) creation of a language profile, which allows a user to choose a subset of terms from an ontology and conveniently rename them if needed; (3) creation of ontological tiers, which can be annotated with profile terms and, therefore, corresponding ontological terms; and (4) saving annotations in the XML format as Multimedia Ontology class instances and, linked to them, class instances of other ontologies used in ontological tiers. To the best of our knowledge, OntoELAN is the first audio/video annotation tool in the linguistic domain that provides support for ontology-based annotation.
