Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
24 works
0 followers
13 topics
4 close collaborators

Published work

24 published item(s)

preprint 2026 arXiv

Contour-Native Bridge Defect Detection and Compact Digital Archiving with Frequency-Supervised Fourier Contours

AI-assisted bridge defect inspection often produces bounding boxes with crude geometry or raster masks that are costly to store, transmit, and reuse. This study investigates how detected defects can be represented as compact, recoverable contour-level vector records in image space. We propose Frequency-Supervised Fourier Series Detection (FS-FSD), which directly regresses Fourier contour descriptors and evaluates boxes, masks, and contours under a unified polygon-space protocol. On 3,767 UAV-collected bridge images with 42,346 defect instances, FS-FSD achieves higher polygon-space accuracy and better matched-TP geometric quality than representative detection, segmentation, and contour baselines. These results show that, compared with bounding boxes and raster masks, Fourier contour records preserve defect-boundary geometry in a more compact, recoverable, and shareable form for engineering review and downstream information workflows. Future work will study the modeling of multi-region, fragmented, and adjacent bridge-defect boundaries and extend the framework toward long-term bridge-defect tracking and lifecycle-oriented management.
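As a rough illustration of why a Fourier contour record can be both compact and recoverable, the sketch below encodes a closed boundary as a handful of complex Fourier coefficients and decodes it again. It is a generic truncated Fourier descriptor, not the FS-FSD network described in the abstract; the function names `encode_contour` and `decode_contour` are hypothetical.

```python
import cmath
import math

def encode_contour(points, order):
    """Return Fourier coefficients c_k, k = -order..order, of a closed contour.

    points: list of (x, y) tuples sampled in order around the boundary.
    """
    n = len(points)
    z = [complex(x, y) for x, y in points]
    coeffs = {}
    for k in range(-order, order + 1):
        # Discrete Fourier coefficient of the complex boundary signal.
        coeffs[k] = sum(
            z[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)
        ) / n
    return coeffs

def decode_contour(coeffs, num_samples):
    """Rebuild num_samples boundary points from the stored coefficients."""
    out = []
    for j in range(num_samples):
        t = j / num_samples
        z = sum(c * cmath.exp(2j * math.pi * k * t) for k, c in coeffs.items())
        out.append((z.real, z.imag))
    return out

# Toy boundary: a circle of radius 2 centred at (5, 3), sampled at 64 points.
circle = [(5 + 2 * math.cos(2 * math.pi * j / 64),
           3 + 2 * math.sin(2 * math.pi * j / 64)) for j in range(64)]
record = encode_contour(circle, order=2)       # 5 complex numbers
rebuilt = decode_contour(record, num_samples=64)
max_error = max(math.dist(p, q) for p, q in zip(circle, rebuilt))
```

Here 64 boundary points are stored as 5 coefficients; irregular defect outlines would need a higher order, and the trade-off between order and fidelity is the design choice such a record exposes.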

preprint 2022 arXiv

A Frequency-aware Software Cache for Large Recommendation System Embeddings

Deep learning recommendation models (DLRMs) have been widely applied in Internet companies. The embedding tables of DLRMs are too large to fit entirely in GPU memory. We propose a GPU-based software cache approach that dynamically manages the embedding table across CPU and GPU memory by leveraging the ID frequency statistics of the target dataset. Our proposed software cache is efficient in training entire DLRMs on GPU in a synchronized update manner. It also scales to multiple GPUs in combination with the widely used hybrid parallel training approaches. Evaluation of our prototype system shows that we can keep only 1.5% of the embedding parameters on the GPU and still obtain a decent end-to-end training speed.
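The idea of using ID frequency statistics to decide which embedding rows stay in fast memory can be sketched as follows. This is a simplified static hot set in plain Python, assumed here for illustration only; the cache described in the abstract is managed dynamically on real GPU memory, and `build_hot_set` and `hit_rate` are hypothetical names.

```python
from collections import Counter

def build_hot_set(id_stream, num_rows, cache_ratio):
    """Pick the most frequent embedding IDs to keep in the fast (GPU) tier."""
    capacity = max(1, round(num_rows * cache_ratio))
    counts = Counter(id_stream)
    return {row_id for row_id, _ in counts.most_common(capacity)}

def hit_rate(id_stream, hot_set):
    """Fraction of lookups that would be served from the fast tier."""
    hits = sum(1 for row_id in id_stream if row_id in hot_set)
    return hits / len(id_stream)

# Toy skewed access pattern over a 200-row table: IDs 0, 1 and 2 dominate.
stream = [0] * 50 + [1] * 30 + [2] * 15 + list(range(3, 8))
hot = build_hot_set(stream, num_rows=200, cache_ratio=0.015)  # 1.5% of 200 rows = 3 rows
rate = hit_rate(stream, hot)
```

With a skewed access pattern like this toy one, a small resident fraction already serves most lookups, which is the intuition behind keeping only a small share of the table on the GPU.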

preprint 2022 arXiv

Assessing Mediational Processes Using Piecewise Linear Growth Curve Models with Individual Measurement Occasions

Longitudinal processes often unfold concurrently, where the growth of two or more longitudinal outcomes is associated. Additionally, if the study under investigation is long, the growth curves may exhibit nonconstant change with respect to time. Multiple existing studies have developed multivariate growth models with nonlinear functional forms to explore joint development where two longitudinal records are correlated over time. However, the relationship between multiple longitudinal outcomes may also be unidirectional. Accordingly, it is of interest to estimate regression coefficients of such unidirectional paths. One statistical tool for such analyses is longitudinal mediation models. In this study, we develop two models to evaluate mediational processes where the linear-linear piecewise growth model is utilized to capture the change patterns. We define the mediational process as either the baseline covariate or the change in covariate influencing the change in the mediator, which, in turn, affects the change in the outcome. We present the proposed models through simulation studies and real-world data analyses. Our simulation studies demonstrate that the proposed mediational models can provide unbiased and accurate point estimates with target coverage probabilities for a 95% confidence interval. The empirical analyses demonstrate that the proposed model can estimate covariates' direct and indirect effects on the change in the outcome. We also provide the corresponding code for the proposed models.

preprint 2022 arXiv

CODE-MVP: Learning to Represent Source Code from Multiple Views with Contrastive Pre-Training

Recent years have witnessed increasing interest in code representation learning, which aims to represent the semantics of source code as distributed vectors. Currently, various works have been proposed to represent the complex semantics of source code from different views, including plain text, Abstract Syntax Tree (AST), and several kinds of code graphs (e.g., Control/Data Flow Graph). However, most of them only consider a single view of source code independently, ignoring the correspondences among different views. In this paper, we propose to integrate different views with the natural-language description of source code into a unified framework with Multi-View contrastive Pre-training, and name our model CODE-MVP. Specifically, we first extract multiple code views using compiler tools, and learn the complementary information among them under a contrastive learning framework. Inspired by the type checking in compilation, we also design a fine-grained type inference objective in the pre-training. Experiments on three downstream tasks over five datasets demonstrate the superiority of CODE-MVP when compared with several state-of-the-art baselines. For example, we achieve gains of 2.4/2.3/1.1 in terms of MRR/MAP/Accuracy metrics on natural language code retrieval, code similarity, and code defect detection tasks, respectively.
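A contrastive objective over multiple views can be pictured with the InfoNCE-style loss below, which pulls an anchor view towards its matching view and away from the others. The abstract does not state the exact loss used by CODE-MVP, so this is an assumed, commonly used formulation; `info_nce` and the toy vectors are hypothetical.

```python
import math

def info_nce(anchor, candidates, positive_index, temperature=0.1):
    """InfoNCE loss for one anchor against candidate views (cosine similarity)."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    logits = [cosine(anchor, c) / temperature for c in candidates]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[positive_index]

# Toy embeddings: the anchor (say, a text view) and three candidate views.
anchor = [1.0, 0.0]
views = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce(anchor, views, positive_index=0)     # positive is the similar view
loss_misaligned = info_nce(anchor, views, positive_index=2)  # positive is the opposite view
```

The loss is small when the designated positive view is already the most similar candidate and large otherwise, which is the signal that drives different views of the same program towards a shared representation.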

preprint 2022 arXiv

Compilable Neural Code Generation with Compiler Feedback

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering. Existing deep-learning approaches model code generation as text generation, either constrained by grammar structures in the decoder, or driven by language models pre-trained on large-scale code corpora (e.g., CodeGPT, PLBART, and CodeT5). However, few of them account for compilability of the generated programs. To improve compilability of the generated programs, this paper proposes COMPCODER, a three-stage pipeline utilizing compiler feedback for compilable code generation, including language model fine-tuning, compilability reinforcement, and compilability discrimination. Comprehensive experiments on two code generation tasks demonstrate the effectiveness of our proposed approach, improving the success rate of compilation from 44.18 to 89.18 in code completion on average and from 70.3 to 96.2 in text-to-code generation, respectively, compared with the state-of-the-art CodeGPT.
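The compiler-feedback signal mentioned above can be approximated in a few lines: each generated program is passed to a compiler and the outcome becomes a binary label. The sketch uses Python's built-in `compile()` as a stand-in compiler, which only checks syntax; it is an assumption for illustration and not the COMPCODER pipeline, and `compiles` and `compilation_rate` are hypothetical helper names.

```python
def compiles(source):
    """Return True if the source parses and byte-compiles as a Python module."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

def compilation_rate(candidates):
    """Share of candidate programs that compile."""
    return sum(compiles(c) for c in candidates) / len(candidates)

# Toy "generated" programs: two valid, one with a syntax error.
samples = [
    "def add(a, b):\n    return a + b\n",
    "def broken(:\n    pass\n",
    "print('ok')\n",
]
rate = compilation_rate(samples)
```

A success rate of this kind is the sort of quantity the abstract reports, and the same boolean could serve as a reward or as a training label for a discriminator.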

preprint 2022 arXiv

Deep Dimension Reduction for Supervised Representation Learning

The goal of supervised representation learning is to construct effective data representations for prediction. Among all the characteristics of an ideal nonparametric representation of high-dimensional complex data, sufficiency, low dimensionality and disentanglement are some of the most essential ones. We propose a deep dimension reduction approach to learning representations with these characteristics. The proposed approach is a nonparametric generalization of the sufficient dimension reduction method. We formulate the ideal representation learning task as that of finding a nonparametric representation that minimizes an objective function characterizing conditional independence and promoting disentanglement at the population level. We then estimate the target representation at the sample level nonparametrically using deep neural networks. We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero. Our extensive numerical experiments using simulated and real benchmark data demonstrate that the proposed methods have better performance than several existing dimension reduction methods and the standard deep learning models in the context of classification and regression.

preprint 2022 arXiv

Extending Growth Mixture Model to Assess Heterogeneity in Joint Development with Piecewise Linear Trajectories in the Framework of Individual Measurement Occasions

Researchers continue to be interested in exploring the effects that covariates have on the heterogeneity in trajectories. The inclusion of covariates associated with latent classes allows for a clearer understanding of individual differences and a more meaningful interpretation of latent class membership. Many theoretical and empirical studies have focused on investigating heterogeneity in change patterns of a univariate repeated outcome and examining the effects of baseline covariates that inform the cluster formation. However, developmental processes rarely unfold in isolation; therefore, empirical researchers often desire to examine two or more outcomes over time, hoping to understand their joint development where these outcomes and their change patterns are correlated. This study examines the heterogeneity in parallel nonlinear trajectories and identifies baseline characteristics as predictors of latent classes. Our simulation studies show that the proposed model can tell the clusters of parallel trajectories apart and, in general, provide unbiased and accurate point estimates with target coverage probabilities for the parameters of interest. We illustrate how to apply the model to investigate the heterogeneity in the joint development of reading and mathematics ability from Grade K to 5. In this real-world example, we also demonstrate how to select covariates that contribute the most to the latent classes and transform candidate covariates from a large set into a more manageable set while retaining the meaningful properties of the original set in the structural equation modeling framework.

preprint 2022 arXiv

Identification of Autism spectrum disorder based on a novel feature selection method and Variational Autoencoder

The development of noninvasive brain imaging such as resting-state functional magnetic resonance imaging (rs-fMRI) and its combination with AI algorithms provide a promising solution for the early diagnosis of Autism spectrum disorder (ASD). However, the performance of the current ASD classification based on rs-fMRI still needs to be improved. This paper introduces a classification framework to aid ASD diagnosis based on rs-fMRI. In the framework, we propose a novel filter feature selection method based on the difference between step distribution curves (DSDC) to select remarkable functional connectivities (FCs) and utilize a multilayer perceptron (MLP), pretrained by a simplified Variational Autoencoder (VAE), for classification. We also design a pipeline consisting of a normalization procedure and a modified hyperbolic tangent (tanh) activation function to replace the original tanh function, further improving the model accuracy. Our model was evaluated by 10 repetitions of 10-fold cross-validation and achieved an average accuracy of 78.12%, outperforming the state-of-the-art methods reported on the same dataset. Given the importance of sensitivity and specificity in disease diagnosis, two constraints were designed in our model which can improve the model's sensitivity and specificity by up to 9.32% and 10.21%, respectively. The added constraints allow our model to handle different application scenarios and to be used broadly.

preprint 2022 arXiv

Low-threshold nanolasers based on miniaturized bound states in the continuum

The pursuit of compact lasers with low thresholds has imposed strict requirements on tight light confinement with minimized radiation losses. Bound states in the continuum (BICs) have been recently demonstrated as an effective mechanism to trap light along the out-of-plane direction, paving the way to low-threshold lasers. To date, most reported BIC lasers are still bulky due to the absence of in-plane light confinement. In this work, we combine BICs and photonic band gaps to realize three-dimensional (3D) light confinement, referred to as miniaturized (mini-) BICs. Together with 3D carrier confinement provided by quantum dots (QDs) as optical gain materials, we have realized highly compact active BIC resonators with a record-high quality ($Q$) factor up to 32500, which enables single-mode continuous wave (CW) lasing with the lowest threshold of 80 W/cm$^{2}$ among the reported BIC lasers. In addition, our photon statistics measurements under both CW and pulsed excitations confirm the occurrence of the phase transition from spontaneous emission to stimulated emission, further suggesting that conventional criteria of input-output and linewidth are not sufficient for claiming nanoscale lasing. Our work reveals a viable path towards compact BIC lasers with ultra-low power consumption and may boost applications in cavity quantum electrodynamics (QED), nonlinear optics and integrated photonics.

preprint 2022 arXiv

Obtaining interpretable parameters from reparameterizing longitudinal models: transformation matrices between growth factors in two parameter spaces

The linear spline growth model (LSGM), which approximates complex patterns using at least two linear segments, is a popular tool for examining nonlinear change patterns. Among such models, the linear-linear piecewise change pattern is the most straightforward one. An earlier study has proved that, in addition to the intercept and slopes, the knot (or change-point), at which two linear segments join together, can be estimated as a growth factor in a reparameterized longitudinal model in the latent growth curve modeling framework. However, the reparameterized coefficients were no longer directly related to the underlying developmental process and therefore lacked meaningful, substantive interpretation, although they were simple functions of the original parameters. This study proposes transformation matrices between parameters in the original and reparameterized models so that the interpretable coefficients directly related to the underlying change pattern can be derived from reparameterized ones. Additionally, the study extends the existing linear-linear piecewise model to allow for individual measurement occasions, and investigates predictors for the individual differences in change patterns. We present the proposed methods with simulation studies and a real-world data analysis. Our simulation studies demonstrate that the proposed method can generally provide an unbiased and consistent estimation of model parameters of interest and confidence intervals with satisfactory coverage probabilities. An empirical example using longitudinal mathematics achievement scores shows that the model can estimate the growth factor coefficients and path coefficients directly related to the underlying developmental process, thereby providing meaningful interpretation. For easier implementation, we also provide the corresponding code for the proposed models.
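The relationship between the interpretable and reparameterized coefficients of a linear-linear piecewise trajectory can be illustrated for a single individual. The sketch assumes one common reparameterization (level at the knot, mean of the two slopes, half their difference, combined through an absolute-value term); the paper's own parameterization and transformation matrices may differ, and all function names here are hypothetical.

```python
def bilinear_original(t, eta0, eta1, eta2, knot):
    """Trajectory with intercept eta0, pre-knot slope eta1 and post-knot slope eta2."""
    if t <= knot:
        return eta0 + eta1 * t
    return eta0 + eta1 * knot + eta2 * (t - knot)

def to_reparameterized(eta0, eta1, eta2, knot):
    """Map (intercept, slope 1, slope 2) to (level at knot, mean slope, half difference)."""
    return (eta0 + eta1 * knot, (eta1 + eta2) / 2, (eta2 - eta1) / 2)

def to_original(level_at_knot, mean_slope, half_diff, knot):
    """Inverse map back to the directly interpretable coefficients."""
    eta1 = mean_slope - half_diff
    eta2 = mean_slope + half_diff
    return (level_at_knot - eta1 * knot, eta1, eta2)

def bilinear_reparameterized(t, level_at_knot, mean_slope, half_diff, knot):
    """Same trajectory written without a branch, via |t - knot|."""
    return level_at_knot + mean_slope * (t - knot) + half_diff * abs(t - knot)

# Toy individual: starts at 10, grows by 2 per wave until the knot at t = 3, then by 0.5.
original = (10.0, 2.0, 0.5)
knot = 3.0
reparam = to_reparameterized(*original, knot)
recovered = to_original(*reparam, knot)
curve_a = [bilinear_original(t, *original, knot) for t in range(7)]
curve_b = [bilinear_reparameterized(t, *reparam, knot) for t in range(7)]
```

Both forms trace the same curve; the reparameterized one is convenient for estimation because it has no branch at the knot, while the original one carries the substantive interpretation, which is why a mapping between the two is useful.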

preprint 2022 arXiv

Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation

Real human conversation data are complicated, heterogeneous, and noisy, which makes building open-domain dialogue systems from them a challenging task. Such dialogue data nevertheless contain a wealth of information and knowledge that has not been fully explored. In this paper, we show that existing open-domain dialogue generation methods, which memorize context-response paired data with autoregressive or encoder-decoder language models, underutilize the training data. Unlike current approaches that use external knowledge, we explore a retrieval-generation training framework that can take advantage of the heterogeneous and noisy training data by considering them as "evidence". In particular, we use BERTScore for retrieval, which improves the quality of both the evidence and the generation. Experiments over publicly available datasets demonstrate that our method can help models generate better responses, even though such training data are usually regarded as low quality. The performance gain is comparable to, or even better than, that obtained by enlarging the training set. We also found that the model performance has a positive correlation with the relevance of the retrieved evidence. Moreover, our method performed well in zero-shot experiments, which indicates that it can be more robust to real-world data.

preprint 2022 arXiv

Semantic decomposition Network with Contrastive and Structural Constraints for Dental Plaque Segmentation

Segmenting dental plaque from images of medical reagent staining provides valuable information for diagnosis and the determination of follow-up treatment plans. However, accurate dental plaque segmentation is a challenging task that requires identifying teeth and dental plaque subject to semantic-blur regions (i.e., confused boundaries in border regions between teeth and dental plaque) and complex variations of instance shapes, which are not fully addressed by existing methods. Therefore, we propose a semantic decomposition network (SDNet) that introduces two single-task branches to separately address the segmentation of teeth and dental plaque and designs additional constraints to learn category-specific features for each branch, thus facilitating the semantic decomposition and improving the performance of dental plaque segmentation. Specifically, SDNet learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch that specifies a category tends to yield accurate segmentation. To help these two branches better focus on category-specific features, two constraint modules are further proposed: 1) a contrastive constraint module (CCM) to learn discriminative feature representations by maximizing the distance between different category representations, so as to reduce the negative impact of semantic-blur regions on feature extraction; 2) a structural constraint module (SCM) to provide complete structural information for dental plaque of various shapes through the supervision of a boundary-aware geometric constraint. Besides, we construct a large-scale open-source Stained Dental Plaque Segmentation dataset (SDPSeg), which provides high-quality annotations for teeth and dental plaque. Experimental results on the SDPSeg dataset show that SDNet achieves state-of-the-art performance.

preprint 2022 arXiv

Syntax Controlled Knowledge Graph-to-Text Generation with Order and Semantic Consistency

The knowledge graph (KG) stores a large amount of structural knowledge, but it is not easy for humans to understand directly. Knowledge graph-to-text (KG-to-text) generation aims to generate easy-to-understand sentences from the KG while maintaining semantic consistency between the generated sentences and the KG. Existing KG-to-text generation methods frame this task as a sequence-to-sequence generation task with a linearized KG as input and address the consistency between the generated text and the KG through a simple selection between a decoded sentence word and a KG node word at each time step. However, the linearized KG order is commonly obtained through a heuristic search without data-driven optimization. In this paper, we optimize the knowledge description order prediction under the order supervision extracted from the caption and further enhance the consistency of the generated sentences and KG through syntactic and semantic regularization. We incorporate Part-of-Speech (POS) syntactic tags to constrain the positions at which words are copied from the KG and employ a semantic context scoring function to evaluate the semantic fitness of each word in its local context when decoding each word in the generated sentence. Extensive experiments conducted on two datasets, WebNLG and DART, show that the method achieves state-of-the-art performance.

preprint 2022 arXiv

Tailoring solid-state single-photon sources with stimulated emissions

The coherent interaction of electromagnetic fields with solid-state two-level systems can yield deterministic quantum light sources for photonic quantum technologies. To date, the performance of semiconductor single-photon sources based on three-level systems is limited mainly due to a lack of high photon indistinguishability. Here, we tailor the cavity-enhanced spontaneous emission from a ladder-type three-level system in a single epitaxial quantum dot (QD) through stimulated emission. After populating the biexciton (XX) of the QD through two-photon resonant excitation (TPE), we use another laser pulse to selectively depopulate the XX state into an exciton (X) state with a predefined polarization. The stimulated XX-X emission modifies the X decay dynamics and yields improved polarized single-photon source characteristics such as a source brightness of 0.030(2), a single-photon purity of 0.998(1), and an indistinguishability of 0.926(4). Our method can be readily applied to existing QD single-photon sources and expands the capabilities of three-level systems for advanced quantum photonic functionalities.

preprint 2022 arXiv

Two-step growth mixture model to examine heterogeneity in nonlinear trajectories

Empirical researchers are usually interested in investigating the impacts that baseline covariates have when uncovering sample heterogeneity and separating samples into more homogeneous groups. However, a considerable number of studies in the structural equation modeling (SEM) framework usually start with vague hypotheses in terms of heterogeneity and possible reasons. This suggests that (1) the determination and specification of a proper model with covariates is not straightforward, and (2) the exploration process may be computationally intensive given that a model in the SEM framework is usually complicated and the pool of candidate covariates is usually huge in the psychological and educational domain where the SEM framework is widely employed. Following Bakk and Kuha (2017), this article presents a two-step growth mixture model (GMM) that examines the relationship between latent classes of nonlinear trajectories and baseline characteristics. Our simulation studies demonstrate that the proposed model is capable of clustering the nonlinear change patterns and estimating the parameters of interest unbiasedly and precisely, while exhibiting appropriate confidence interval coverage. Considering that the pool of candidate covariates is usually huge and highly correlated, this study also proposes implementing exploratory factor analysis (EFA) to reduce the dimension of the covariate space. We illustrate how to use the hybrid method, the two-step GMM and EFA, to efficiently explore the heterogeneity of nonlinear trajectories of longitudinal mathematics achievement data.

preprint 2021 arXiv

Estimating Knots and Their Association in Parallel Bilinear Spline Growth Curve Models in the Framework of Individual Measurement Occasions

Latent growth curve models with spline functions are flexible and accessible statistical tools for investigating nonlinear change patterns that exhibit distinct phases of development in manifested variables. Among such models, the bilinear spline growth model (BLSGM) is the most straightforward and intuitive, yet useful. An existing study has demonstrated that the BLSGM allows the knot (or change-point), at which two linear segments join together, to be an additional growth factor other than the intercept and slopes so that researchers can estimate the knot and its variability in the framework of individual measurement occasions. However, developmental processes usually unfold in a joint development where two or more outcomes and their change patterns are correlated over time. As an extension of the existing BLSGM with an unknown knot, this study considers a parallel BLSGM (PBLSGM) for investigating multiple nonlinear growth processes and estimating the knot with its variability of each process as well as the knot-knot association in the framework of individual measurement occasions. We present the proposed model through simulation studies and a real-world data analysis. Our simulation studies demonstrate that the proposed PBLSGM generally estimates the parameters of interest unbiasedly and precisely, and exhibits appropriate confidence interval coverage. An empirical example using longitudinal reading scores, mathematics scores, and science scores shows that the model can estimate the knot with its variance for each growth curve and the covariance between two knots. We also provide the corresponding code for the proposed model.

preprint 2021 arXiv

Extending Mixture of Experts Model to Investigate Heterogeneity of Trajectories: When, Where and How to Add Which Covariates

Researchers are usually interested in examining the impact of covariates when separating heterogeneous samples into latent classes that are more homogeneous. The majority of theoretical and empirical studies with such aims have focused on identifying covariates as predictors of class membership in the structural equation modeling framework. In other words, the covariates only indirectly affect the sample heterogeneity. However, the covariates' influence on between-individual differences can also be direct. This article presents a mixture model that investigates covariates to explain within-cluster and between-cluster heterogeneity simultaneously, known as a mixture-of-experts (MoE) model. This study aims to extend the MoE framework to investigate heterogeneity in nonlinear trajectories: to identify latent classes, covariates as predictors to clusters, and covariates that explain within-cluster differences in change patterns over time. Our simulation studies demonstrate that the proposed model generally estimates the parameters unbiasedly, precisely and exhibits appropriate empirical coverage for a nominal 95% confidence interval. This study also proposes implementing structural equation model forests to shrink the covariate space of the proposed mixture model. We illustrate how to select covariates and construct the proposed model with longitudinal mathematics achievement data. Additionally, we demonstrate that the proposed mixture model can be further extended in the structural equation modeling framework by allowing the covariates that have direct effects to be time-varying.

preprint 2020 arXiv

A highly efficient integrated source of twisted single-photons

Photons with a helical phase front (twisted photons) can carry a discrete and, in principle, unbounded amount of orbital angular momentum (OAM). Twisted single-photons have been demonstrated as a high-dimensional quantum system with information processing ability far beyond the widely used two-level qubits. To date, the generation of single-photons carrying OAM relies solely on non-linear processes in bulk crystals, e.g., spontaneous parametric down-conversion (SPDC), which unavoidably limits both the efficiency and the scalability of the source. Therefore, an on-demand OAM quantum light source on a semiconductor chip is still elusive and highly desirable for integrated photonic quantum technologies. Here we demonstrate highly efficient emission of twisted single-photons from solid-state quantum emitters embedded in a microring with angular gratings. The cavity QED effect allows the generation of single-photons and the encoding of OAM in the same nanostructure and therefore enables the realization of devices with very small footprints and great scalability. The OAM states of single-photons are clearly identified via quantum interference of single-photons with themselves. Our device may boost the development of integrated quantum photonic devices with potential applications towards high-dimensional quantum information processing.

preprint 2020 arXiv

A Novel Recurrent Encoder-Decoder Structure for Large-Scale Multi-view Stereo Reconstruction from An Open Aerial Dataset

A great deal of research has demonstrated recently that multi-view stereo (MVS) matching can be solved with deep learning methods. However, these efforts were focused on close-range objects, and only a very few of the deep learning-based methods were specifically designed for large-scale 3D urban reconstruction due to the lack of multi-view aerial image benchmarks. In this paper, we present a synthetic aerial dataset, called the WHU dataset, that we created for MVS tasks and that, to our knowledge, is the first large-scale multi-view aerial dataset. It was generated from a highly accurate 3D digital surface model produced from thousands of real aerial images with precise camera parameters. We also introduce a novel network, called RED-Net, for wide-range depth inference, which we developed from a recurrent encoder-decoder structure to regularize cost maps across depths and a 2D fully convolutional network as framework. RED-Net's low memory requirements and high performance make it suitable for large-scale and highly accurate 3D Earth surface reconstruction. Our experiments confirmed that our method not only exceeded the current state-of-the-art MVS methods by more than 50% in mean absolute error (MAE) with less memory and computational cost, but was also more efficient. It outperformed one of the best commercial software programs based on conventional methods, improving on its efficiency 16 times over. Moreover, we showed that our RED-Net model pre-trained on the synthetic WHU dataset can be efficiently transferred to very different multi-view aerial image datasets without any fine-tuning. The dataset is available at http://gpcv.whu.edu.cn/data.

preprint 2020 arXiv

A Support Detection and Root Finding Approach for Learning High-dimensional Generalized Linear Models

Feature selection is important for modeling high-dimensional data, where the number of variables can be much larger than the sample size. In this paper, we develop a support detection and root finding procedure to learn high-dimensional sparse generalized linear models and denote this method by GSDAR. Based on the KKT condition for $\ell_0$-penalized maximum likelihood estimation, GSDAR generates a sequence of estimators iteratively. Under some restricted invertibility conditions on the maximum likelihood function and a sparsity assumption on the target coefficients, the error of the proposed estimate decays exponentially to the optimal order. Moreover, the oracle estimator can be recovered if the target signal is stronger than the detectable level. We conduct simulations and real data analysis to illustrate the advantages of our proposed method over several existing methods, including Lasso and MCP.

preprint 2020 arXiv

Accounting for correlated horizontal pleiotropy in two-sample Mendelian randomization using correlated instrumental variants

Mendelian randomization (MR) is a powerful approach to examine the causal relationships between health risk factors and outcomes from observational studies. Due to the proliferation of genome-wide association studies (GWASs) and abundant fully accessible GWAS summary statistics, a variety of two-sample MR methods for summary data have been developed to either detect or account for horizontal pleiotropy, primarily based on the assumption that the effects of variants on exposure (γ) and horizontal pleiotropy (α) are independent. This assumption is too strict and can be easily violated because of correlated horizontal pleiotropy (CHP). To account for CHP, we propose a Bayesian approach, MR-Corr2, that uses the orthogonal projection to reparameterize the bivariate normal distribution for γ and α, and a spike-slab prior to mitigate the impact of CHP. We develop an efficient algorithm with parallel Gibbs sampling. To demonstrate the advantages of MR-Corr2 over existing methods, we conducted comprehensive simulation studies comparing both type-I error control and point estimates in various scenarios. By applying MR-Corr2 to study the relationships between pairs in two sets of complex traits, we did not identify the contradictory causal relationship between HDL-c and CAD. Moreover, the results provide a new perspective on the causal network among complex traits. The developed R package and code to reproduce all the results are available at https://github.com/QingCheng0218/MR.Corr2.

preprint2020arXiv

Large few-layer hexagonal boron nitride flakes for nonlinear optics

Hexagonal boron nitride (hBN) is a layered dielectric material with a wide range of applications in optics and photonics. In this work, we demonstrate a fabrication method for few-layer hBN flakes with areas up to 5000 $\rm μm^2$. We show that hBN in this form can be integrated with photonic microstructures: as an example, we use a circular Bragg grating (CBG). The layer quality of the exfoliated hBN flake on a CBG is confirmed by second-harmonic generation (SHG) microscopy. We show that the SHG signal is uniform across the hBN sample outside the CBG and is amplified in the centre of the CBG.

preprint2020arXiv

Learning Implicit Generative Models with Theoretical Guarantees

We propose a unified framework for implicit generative modeling (UnifiGem) with theoretical guarantees by integrating approaches from optimal transport, numerical ODE, density-ratio (density-difference) estimation and deep neural networks. First, the problem of implicit generative learning is formulated as that of finding the optimal transport map between the reference distribution and the target distribution, which is characterized by a fully nonlinear Monge-Ampère equation. Interpreting the infinitesimal linearization of the Monge-Ampère equation from the perspective of gradient flows in measure spaces leads to the continuity equation or the McKean-Vlasov equation. We then solve the McKean-Vlasov equation numerically using the forward Euler iteration, where the forward Euler map depends on the density ratio (density difference) between the distribution at the current iteration and the underlying target distribution. We further estimate the density ratio (density difference) via deep density-ratio (density-difference) fitting and derive explicit upper bounds on the estimation error. Experimental results on both synthetic datasets and real benchmark datasets support our theoretical findings and demonstrate the effectiveness of UnifiGem.
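
The forward Euler particle iteration mentioned in the abstract above can be sketched in one dimension. This is an illustration under strong assumptions, not the UnifiGem algorithm: it assumes the KL divergence as the energy functional, so the velocity field is the negative gradient of the log density ratio between the current particle distribution and the target, and it replaces the deep density-ratio estimator with a closed-form Gaussian fit to the particles. The function name `euler_kl_flow` and all parameter values are hypothetical.

```python
import numpy as np

def euler_kl_flow(particles, target_mean, target_std, step=0.05, n_steps=200):
    """Forward Euler particle updates along a KL gradient flow (1-D sketch).

    Assumptions: the target is Gaussian with known mean and standard
    deviation, and the current particle density q is approximated by a
    Gaussian fitted to the particles. A learned density-ratio estimator
    would take the place of this plug-in fit.
    """
    x = np.asarray(particles, dtype=float).copy()
    for _ in range(n_steps):
        mu, var = x.mean(), x.var()
        grad_log_q = -(x - mu) / var                      # fitted current density
        grad_log_p = -(x - target_mean) / target_std**2   # target density
        # Velocity = -d/dx log(q(x) / p(x)) = grad_log_p - grad_log_q.
        x = x + step * (grad_log_p - grad_log_q)
    return x

# Push standard normal reference samples toward N(3, 0.5^2).
rng = np.random.default_rng(0)
samples = euler_kl_flow(rng.standard_normal(1000), target_mean=3.0, target_std=0.5)
print(samples.mean(), samples.std())
```

The step size matters here: with a target variance of 0.25, a step of 0.05 keeps the mean and scale updates contractive, while a much larger step could make the iteration oscillate or diverge.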

preprint2020arXiv

TrajectoryNet: a new spatio-temporal feature learning network for human motion prediction

Human motion prediction is an increasingly interesting topic in computer vision and robotics. In this paper, we propose a new 2D CNN-based network, TrajectoryNet, to predict future poses in the trajectory space. Compared with most existing methods, our model focuses on modeling the motion dynamics with coupled spatio-temporal features, local-global spatial features and global temporal co-occurrence features of the previous pose sequence. Specifically, the coupled spatio-temporal features describe the spatial and temporal structure information hidden in the natural human motion sequence, which can be mined by covering the space and time dimensions of the input pose sequence with convolutional filters. The local-global spatial features, which encode different correlations between different joints of the human body (e.g. strong correlations between joints of one limb, weak correlations between joints of different limbs), are captured hierarchically by enlarging the receptive field layer by layer and by adding residual connections from lower layers to deeper layers in our proposed convolutional network. The global temporal co-occurrence features represent the co-occurrence relationships among different subsequences that appear simultaneously in a complex motion sequence, and they can be obtained automatically with our proposed TrajectoryNet by reorganizing the temporal information as the depth dimension of the input tensor. Finally, future poses are approximated based on the captured motion dynamics features. Extensive experiments show that our method achieves state-of-the-art performance on three challenging benchmarks (Human3.6M, G3D, and FNTU), which demonstrates the effectiveness of our proposed method. The code will be available if the paper is accepted.