Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
22 works
0 followers
20 topics
4 close collaborators

Published work

22 published items

preprint 2026 arXiv

DGA-Net: Enhancing SAM with Depth Prompting and Graph-Anchor Guidance for Camouflaged Object Detection

To fully exploit depth cues in Camouflaged Object Detection (COD), we present DGA-Net, a specialized framework that adapts the Segment Anything Model (SAM) via a novel "depth prompting" paradigm. Unlike existing approaches that primarily rely on sparse prompts (e.g., points or boxes), our method introduces a holistic mechanism for constructing and propagating dense depth prompts. Specifically, we propose a Cross-modal Graph Enhancement (CGE) module that synthesizes RGB semantics and depth geometry within a heterogeneous graph to form a unified guidance signal. Furthermore, we design an Anchor-Guided Refinement (AGR) module. To counteract the inherent information decay in feature hierarchies, AGR forges a global anchor and establishes direct non-local pathways to broadcast this guidance from deep to shallow layers, ensuring precise and consistent segmentation. Quantitative and qualitative experimental results demonstrate that the proposed DGA-Net outperforms state-of-the-art COD methods.

preprint 2026 arXiv

RePose: A Real-Time 3D Human Pose Estimation and Biomechanical Analysis Framework for Rehabilitation

We propose RePose, a real-time 3D human pose estimation and motion analysis method for rehabilitation training. It monitors and evaluates patients' motion during rehabilitation in real time, providing immediate feedback and guidance to help patients execute rehabilitation exercises correctly. First, we introduce a unified pipeline for end-to-end real-time human pose estimation and motion analysis using RGB video input from multiple cameras, which can be applied to rehabilitation training. The pipeline helps monitor and correct patients' actions, thus aiding them in regaining muscle strength and motor functions. Second, we propose a fast tracking method for medical rehabilitation scenarios with multiple-person interference, which requires less than 1 ms to track a single frame. Additionally, we modify SmoothNet for real-time posture estimation, effectively reducing pose estimation errors and restoring the patient's true motion state, making it visually smoother. Finally, we use the Unity platform to monitor and evaluate patients' motion in real time during rehabilitation and to display muscle stress conditions to assist patients with their rehabilitation training.

preprint 2026 arXiv

Structural Energy Guidance for View-Consistent Text-to-3D Generation

Text-to-3D generation based on diffusion models often suffers from the Janus problem, leading to inconsistent geometry across viewpoints. This work identifies viewpoint bias in 2D diffusion priors as the main cause and proposes Structural Energy-Guided Sampling (SEGS), a training-free and plug-and-play framework to improve multi-view consistency. SEGS constructs a structural energy in the PCA subspace of U-Net features and injects its gradient into the denoising process. It can be easily integrated into SDS/VSD pipelines without retraining. Experiments show that SEGS reduces the Janus Rate by about 10% on average and improves View-CS scores across multiple baselines, including DreamFusion, Magic3D, and LucidDreamer. This method effectively alleviates viewpoint artifacts while preserving appearance fidelity, providing a flexible solution for high-quality text-to-3D content generation.

preprint 2025 arXiv

Cuddle-Fish: Exploring a Soft Floating Robot with Flapping Wings for Physical Interactions

Flying robots, such as quadrotor drones, offer new possibilities for human-robot interaction but often pose safety risks due to fast-spinning propellers, rigid structures, and noise. In contrast, lighter-than-air flapping-wing robots, inspired by animal movement, offer a soft, quiet, and touch-safe alternative. Building on these advantages, we present Cuddle-Fish, a soft flapping-wing floating robot designed for close-proximity interactions in indoor spaces. Through a user study with 24 participants, we explored their perceptions of the robot and experiences during a series of co-located demonstrations in which the robot moved near them. Results showed that participants felt safe, willingly engaged in touch-based interactions with the robot, and exhibited spontaneous affective behaviours, such as patting, stroking, hugging, and cheek-touching, without external prompting. They also reported positive emotional responses towards the robot. These findings suggest that the soft floating robot with flapping wings can serve as a novel and socially acceptable alternative to traditional rigid flying robots, opening new potential for applications in companionship, affective interaction, and play in everyday indoor environments.

preprint 2025 arXiv

Ultrafast Exciton-Polariton Transport and Relaxation in Halide Perovskite

Halide perovskites offer a great platform for room-temperature exciton-polaritons (EPs) due to their strong oscillator strength and large exciton binding energy, promising applications in next-generation photonic and polaritonic devices. Efficient manipulation of EP transport and relaxation is critical for device performance, yet their spatiotemporal dynamics across different in-plane momenta (k//) remain poorly understood due to limitations in experimental access. In this work, we employ energy-resolved transient reflectance microscopy (TRM) combined with the dispersion relation of EPs to achieve high-resolution imaging of EP transport at specific k//. This approach directly reveals the quasi-ballistic transport and ultrafast relaxation of EPs in different k// regions, showcasing diffusion as fast as ~490 cm$^2$/s and a relaxation time of ~95.1 fs. Furthermore, by tuning the detuning parameter, we manipulate the ballistic transport group velocity and relaxation time of EPs across varying k//. Our results reveal key insights into the dynamics of EP transport and relaxation, providing valuable guidance for the design and optimization of polaritonic devices.

preprint 2022 arXiv

Antiphase boundary in CH$_3$NH$_3$PbI$_3$ repels charge carriers while promotes fast ion migrations

Defects in organic-inorganic hybrid perovskites (OIHPs) greatly influence their optoelectronic properties. Identifying and better understanding the defects present in OIHPs is an essential step towards fabricating high-performance perovskite solar cells. However, directly visualizing the defects is still a challenge for OIHPs due to their sensitivity during electron microscopy characterization. Here, by using low-dose scanning transmission electron microscopy techniques, we observe the common existence of antiphase boundaries (APBs) in CH$_3$NH$_3$PbI$_3$ (MAPbI$_3$), resolve their atomic structure, and correlate it with the electrical/ionic activities and structural instabilities. Such an APB is caused by a half-unit-cell shift of the [PbI$_6$]$^{4-}$ octahedron along the [100]/[010] direction, leading to a transformation from corner-sharing [PbI$_6$]$^{4-}$ octahedra in bulk MAPbI$_3$ into edge-sharing ones at the APB. Based on the identified atomic-scale configuration, we further carry out density functional theory calculations and reveal that the APB in MAPbI$_3$ repels both electrons and holes while serving as a fast ion-migration channel, causing a rapid decomposition into PbI$_2$ that is detrimental to optoelectronic performance. These findings provide valuable insights into the relationships between structures and optoelectronic properties of OIHPs and suggest that controlling the APB is essential for their stability.

preprint 2022 arXiv

Atomic Filter: a Weak Form of Shift Operator for Graph Signals

The shift operation plays a crucial role in classical signal processing. It is the generator of all filters and the basic operation for time-frequency analysis, such as the windowed Fourier transform and the wavelet transform. With the rapid development of internet technology and big data science, a large amount of data is expressed as signals defined on graphs. In order to establish the theory of filtering, windowed Fourier transforms and wavelet transforms in the setting of graph signals, we need to extend the shift operation of classical signals to graph signals. This is a fundamental problem, since the vertex set of a graph is usually not a vector space and addition cannot be defined on it. In this paper, based on our understanding of the core role of the shift operation in classical signal processing, we propose the concept of atomic filters, which can be viewed as a weak form of the shift operator for graph signals. Then, we study the conditions under which an atomic filter is norm-preserving, periodic, or real-preserving. The real-preserving property holds naturally in classical signal processing, but no research has been reported on this topic in the graph signal setting. With these conditions we propose the concept of normal atomic filters for graph signals, which degenerates into the classical shift operator under mild conditions if the graph is circulant. Typical examples of graphs that do or do not have normal atomic filters are given. Finally, as an application, atomic filters are utilized to construct time-frequency atoms which constitute a frame of the graph signal space.
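
For orientation, the classical special case that the abstract refers to can be sketched as follows. This is not the paper's atomic-filter construction; it only illustrates, on a directed cycle graph (a circulant graph), the three properties the abstract names: norm-preserving, periodic, and real-preserving. The function names are hypothetical.

```python
import math

def cyclic_shift(signal):
    """Classical shift on a directed cycle: the value at vertex i moves to vertex i+1 (mod N)."""
    return signal[-1:] + signal[:-1]

def norm(signal):
    """Euclidean norm of a real or complex signal."""
    return math.sqrt(sum(abs(v) ** 2 for v in signal))

x = [1.0, 2.0, 3.0, 4.0]
shifted = cyclic_shift(x)

# Norm-preserving: shifting only permutes the entries.
same_norm = math.isclose(norm(shifted), norm(x))

# Periodic: N shifts on an N-vertex cycle return the original signal.
y = x
for _ in range(len(x)):
    y = cyclic_shift(y)
is_periodic = (y == x)

# Real-preserving: a real input stays real.
stays_real = all(isinstance(v, float) for v in shifted)
```

On a general graph there is no such canonical permutation of vertices, which is the difficulty the abstract describes.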

preprint 2022 arXiv

Caging-Pnictogen-Induced Superconductivity in Skutterudites IrX3 (X = As, P)

Here we report on a new kind of compound, X$_\delta$Ir$_4$X$_{12-\delta}$ (X = P, As), the first hole-doped skutterudite superconductor. We provide atomic-resolution images of the caging As atoms using scanning transmission electron microscopy (STEM). By inserting As atoms into the caged structure under high pressure, superconductivity emerges with a maximum transition temperature ($T_c$) of 4.4 K (4.8 K) in IrAs$_3$ (IrP$_3$). In contrast to all of the electron-doped skutterudites, the electronic states around the Fermi level in X$_\delta$Ir$_4$X$_{12-\delta}$ are dominated by the caged X atom, which can be described by a simple body-centered tight-binding model, implying a distinct pairing mechanism. Our density functional theory (DFT) calculations reveal an intimate relationship between the pressure-dependent local-phonon mode and the enhancement of $T_c$. The discovery of X$_\delta$Ir$_4$X$_{12-\delta}$ provides an arena to investigate the uncharted territory of hole-doped skutterudites, and the method proposed here represents a new strategy of carrier doping in caged structures, without introducing extra elements.

preprint 2022 arXiv

Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space

Diverse human motion prediction aims at predicting multiple possible future pose sequences from a sequence of observed poses. Previous approaches usually employ deep generative networks to model the conditional distribution of data, and then randomly sample outcomes from the distribution. While different results can be obtained, they are usually the most likely ones, which are not diverse enough. Recent work explicitly learns multiple modes of the conditional distribution via a deterministic network, which however can only cover a fixed number of modes within a limited range. In this paper, we propose a novel sampling strategy for drawing very diverse results from an imbalanced multimodal distribution learned by a deep generative model. Our method works by generating an auxiliary space and making random sampling from the auxiliary space equivalent to diverse sampling from the target distribution. We propose a simple yet effective network architecture that implements this novel sampling strategy, which incorporates a Gumbel-Softmax coefficient matrix sampling method and an aggressive diversity-promoting hinge loss function. Extensive experiments demonstrate that our method significantly improves both the diversity and accuracy of the samples compared with previous state-of-the-art sampling approaches. Code and pre-trained models are available at https://github.com/Droliven/diverse_sampling.
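
As background for the Gumbel-Softmax component named above, here is a minimal sketch of the generic Gumbel-Softmax trick in plain Python. It is not taken from the linked repository and does not reproduce the paper's coefficient-matrix architecture; the function name, the `temperature` parameter and the clamping constant are assumptions made for illustration.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed one-hot sample from the categorical distribution given by logits."""
    perturbed = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        perturbed.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(perturbed)
    exps = [math.exp(v - peak) for v in perturbed]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
soft_sample = gumbel_softmax([1.0, 2.0, 3.0], temperature=0.5, rng=rng)
sharp_sample = gumbel_softmax([0.0, 0.0, 10.0], temperature=0.01, rng=rng)
```

Lower temperatures push the output towards a one-hot vector, while higher temperatures give smoother weights. In a deep learning framework the same computation is differentiable with respect to the logits, which is the usual reason for choosing it.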

preprint 2022 arXiv

Higher central charges and Witt groups

In this paper, we introduce the definitions of signatures of braided fusion categories, which are proved to be invariants of their Witt equivalence classes. These signature assignments define group homomorphisms on the Witt group. The higher central charges of pseudounitary modular categories can be expressed in terms of these signatures, which are applied to prove that the Ising modular categories have infinitely many square roots in the Witt group. This result is further applied to prove a conjecture of Davydov-Nikshych-Ostrik on the super-Witt group: the torsion subgroup generated by the completely anisotropic s-simple braided fusion categories has infinite rank.

preprint 2022 arXiv

Layer-by-layer growth of bilayer graphene single-crystals enabled by self-transmitting catalytic activity

Direct growth of large-area vertically stacked two-dimensional (2D) van der Waals (vdW) materials is a prerequisite for their high-end applications in integrated electronics, optoelectronics and photovoltaics. Currently, centimetre- to even metre-scale monolayers of single-crystal graphene (MLG) and hexagonal boron nitride (h-BN) have been achieved by epitaxial growth on various single-crystalline substrates. However, in principle, this success in monolayer epitaxy seems extremely difficult to replicate for bi- or few-layer growth, as full coverage of the first layer was believed to terminate the reactivity of the adopted catalytic metal surfaces. Here, we report an exceptional layer-by-layer chemical vapour deposition (CVD) growth of large-size bi-layer graphene single-crystals, enabled by self-transmitting catalytic activity from platinum (Pt) surfaces to the outermost graphene layers. In-situ growth and real-time surveillance experiments, under well-controlled environments, unambiguously verify that the growth does follow the layer-by-layer mode on open surfaces of MLG/Pt(111). First-principles calculations indicate that the transmittal of catalytic activity is allowed by an appreciable electronic hybridisation between graphene overlayers and Pt surfaces, enabling catalytic dissociation of hydrocarbons and subsequently direct graphitisation of their radicals on the outermost sp$^2$ carbon surface. This self-transmitting catalytic activity is also proven to be robust for tube-furnace CVD in fabricating single-crystalline graphene bi-, tri- and tetra-layers, as well as h-BN few-layers. Our findings offer an exceptional strategy for potential controllable, layer-by-layer and wafer-scale growth of vertically stacked few-layered 2D single crystals.

preprint 2022 arXiv

Progressively Generating Better Initial Guesses Towards Next Stages for High-Quality Human Motion Prediction

This paper presents a high-quality human motion prediction method that accurately predicts future human poses given observed ones. Our method is based on the observation that a good initial guess of the future poses is very helpful in improving the forecasting accuracy. This motivates us to propose a novel two-stage prediction framework, consisting of an init-prediction network that computes a good guess and a formal-prediction network that predicts the target future poses based on that guess. More importantly, we extend this idea further and design a multi-stage prediction framework where each stage predicts an initial guess for the next stage, which brings further performance gain. To fulfill the prediction task at each stage, we propose a network comprising Spatial Dense Graph Convolutional Networks (S-DGCN) and Temporal Dense Graph Convolutional Networks (T-DGCN). Alternately executing the two networks helps extract spatiotemporal features over the global receptive field of the whole pose sequence. Together, these design choices make our method outperform previous approaches by large margins: 6%-7% on Human3.6M, 5%-10% on CMU-MoCap, and 13%-16% on 3DPW.

preprint 2021 arXiv

Growth morphology and symmetry selection of interfacial instabilities in anisotropic environments

The displacement of a fluid by another less viscous one in a quasi-two dimensional geometry typically leads to complex fingering patterns. In an isotropic system, dense-branching growth arises, which is characterized by repeated tip-splitting of evolving fingers. When anisotropy is present in the interfacial dynamics, the growth morphology changes to dendritic growth characterized by regular structures. We introduce anisotropy by engraving a six-fold symmetric lattice of channels on a Hele-Shaw cell. We show that the morphology transition in miscible fluids depends not only on the previously reported degree of anisotropy set by the lattice topography, but also on the viscosity ratio between the two fluids. Remarkably, the viscosity ratio and the degree of anisotropy also govern the global features of the dendritic patterns, inducing a systematic change from six-fold towards twelve-fold symmetric dendrites. Varying either control parameter provides a new method to tune the symmetry of complex patterns, which may also have relevance for analogous phenomena of gradient-driven interfacial dynamics, such as directional solidification or electrodeposition.

preprint 2021 arXiv

Modular categories with transitive Galois actions

In this paper, we study modular categories whose Galois group actions on their simple objects are transitive. We show that such modular categories admit unique factorization into prime transitive factors. The representations of $SL_2(\mathbb{Z})$ associated with transitive modular categories are proven to be minimal and irreducible. Together with the Verlinde formula, we characterize prime transitive modular categories as the Galois conjugates of the adjoint subcategory of the quantum group modular category $\mathcal{C}(\mathfrak{sl}_2,p-2)$ for some prime $p > 3$. As a consequence, we completely classify transitive modular categories. Transitivity of super-modular categories can be similarly defined. A unique factorization of any transitive super-modular category into s-simple transitive factors is obtained, and the split transitive super-modular categories are completely classified.

preprint 2021 arXiv

Toward the endoscopic classification of unipotent representations of $p$-adic $G_2$

We begin this paper by reviewing the Langlands correspondence for unipotent representations of the exceptional group of type $G_2$ over a $p$-adic field $F$ and present it in an explicit form. Then we compute all ABV-packets, as defined in [CFM+21] following ideas from Vogan's 1993 paper The local Langlands Conjecture, and prove that these packets satisfy properties derived from the expectation that they are generalized A-packets. We attach distributions to ABV-packets for $G_2$ and its endoscopic groups and study a geometric endoscopic transfer of these distributions. This paper builds on earlier work by the same authors.

preprint 2020 arXiv

Automated classification of stems and leaves of potted plants based on point cloud data

The accurate classification of plant organs is a key step in monitoring the growing status and physiology of plants. A method is proposed to classify the leaves and stems of potted plants automatically from point cloud data of the plants, which can be acquired nondestructively. The leaf point training samples were automatically extracted by using the three-dimensional convex hull algorithm, while stem point training samples were extracted by using the point density of a two-dimensional projection. The two training sets were used to classify all the points into leaf points and stem points by utilizing the support vector machine (SVM) algorithm. The proposed method was tested on the point cloud data of three potted plants and compared with two other methods; the results showed that it can classify leaf and stem points accurately and efficiently.

preprint 2020 arXiv

Automatic marker-free registration of tree point-cloud data based on rotating projection

Point-cloud data acquired using a terrestrial laser scanner (TLS) play an important role in digital forestry research. Multiple scans are generally used to overcome occlusion effects and obtain complete tree structural information. However, it is time-consuming and difficult to place artificial reflectors in a forest with complex terrain for marker-based registration, a process that reduces registration automation and efficiency. In this study, we propose an automatic coarse-to-fine method for the registration of point-cloud data from multiple scans of a single tree. In coarse registration, point clouds produced by each scan are projected onto a spherical surface to generate a series of two-dimensional (2D) images, which are used to estimate the initial positions of multiple scans. Corresponding feature-point pairs are then extracted from these series of 2D images. In fine registration, point-cloud data slicing and fitting methods are used to extract corresponding central stem and branch centers for use as tie points to calculate fine transformation parameters. To evaluate the accuracy of registration results, we propose an error-evaluation model that calculates the distances between center points of corresponding branches in adjacent scans. For accurate evaluation, we conducted experiments on two simulated trees and a real-world tree. Average registration errors of the proposed method were approximately 0.26 m on the simulated tree point clouds and approximately 0.05 m on the real-world tree point cloud.
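
The error measure described above can be illustrated with a short sketch. It assumes the branch centers from two adjacent scans have already been matched pairwise and that the reported error is the mean Euclidean distance over those pairs; the abstract does not spell out the exact formula, and the function name and sample coordinates are hypothetical.

```python
import math

def mean_center_distance(centers_a, centers_b):
    """Mean Euclidean distance between matched 3D center points from two scans."""
    if len(centers_a) != len(centers_b) or not centers_a:
        raise ValueError("need two non-empty, equally long lists of matched points")
    distances = [math.dist(p, q) for p, q in zip(centers_a, centers_b)]
    return sum(distances) / len(distances)

# Hypothetical matched branch centers (metres) from two registered scans.
scan_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scan_2 = [(0.0, 0.0, 0.1), (1.0, 0.2, 0.0)]
error = mean_center_distance(scan_1, scan_2)  # (0.1 + 0.2) / 2 = 0.15
```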

preprint 2020 arXiv

Deep Filtering

This paper develops a deep learning method for linear and nonlinear filtering. The idea is to start with a nominal dynamic model and generate Monte Carlo sample paths. These samples are then used to train a deep neural network, with a least-squares error as the loss function. The resulting weights are then applied to Monte Carlo samples from an actual dynamic model. The deep filter obtained in this way compares favorably to the traditional Kalman filter in linear cases and the extended Kalman filter in nonlinear cases. Moreover, a switching model with jumps is studied to show the adaptiveness and power of our deep filtering method. A main advantage of deep filtering is its robustness when the nominal model and actual model differ. Another advantage of deep filtering is that real data can be used directly to train the deep neural network. Therefore, one does not need to calibrate the model.
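
The train-on-nominal, apply-to-actual workflow in this abstract can be sketched in a toy setting. The sketch below deliberately replaces the deep neural network with a single least-squares weight on a scalar linear model, so it shows only the data flow, not the paper's method. The model coefficients, noise levels and function names are all assumptions chosen for illustration.

```python
import random

def simulate(a, q_std, r_std, steps, rng):
    """One path of x_{k+1} = a*x_k + w_k; returns the final state and its noisy observation."""
    x = rng.gauss(0.0, 1.0)
    for _ in range(steps):
        x = a * x + rng.gauss(0.0, q_std)
    y = x + rng.gauss(0.0, r_std)
    return x, y

def fit_weight(samples):
    """Least-squares weight w minimizing the sum of (x - w*y)^2 over the samples."""
    numerator = sum(x * y for x, y in samples)
    denominator = sum(y * y for _, y in samples)
    return numerator / denominator

def mean_squared_error(w, samples):
    """Mean squared error of the estimate w*y against the true state x."""
    return sum((x - w * y) ** 2 for x, y in samples) / len(samples)

rng = random.Random(0)
# Train on Monte Carlo samples from the nominal model (a = 0.9, assumed).
nominal = [simulate(0.9, 0.5, 0.5, 20, rng) for _ in range(5000)]
w = fit_weight(nominal)
# Apply the trained weight to samples from a different "actual" model (a = 0.8, assumed).
actual = [simulate(0.8, 0.5, 0.5, 20, rng) for _ in range(5000)]
mse_trained = mean_squared_error(w, actual)
mse_raw = mean_squared_error(1.0, actual)  # using the raw observation as the estimate
```

Comparing `mse_trained` with `mse_raw` gives a rough sense of how well a filter fitted on the nominal model transfers to the mismatched one.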

preprint 2020 arXiv

Enhancing Underexposed Photos using Perceptually Bidirectional Similarity

Although remarkable progress has been made, existing methods for enhancing underexposed photos tend to produce visually unpleasing results due to the existence of visual artifacts (e.g., color distortion, loss of details and uneven exposure). We observed that this is because they fail to ensure the perceptual consistency of visual information between the source underexposed image and its enhanced output. To obtain high-quality results free of these artifacts, we present a novel underexposed photo enhancement approach that is able to maintain the perceptual consistency. We achieve this by proposing an effective criterion, referred to as perceptually bidirectional similarity, which explicitly describes how to ensure the perceptual consistency. Particularly, we adopt the Retinex theory and cast the enhancement problem as a constrained illumination estimation optimization, where we formulate perceptually bidirectional similarity as constraints on illumination and solve for the illumination which can recover the desired artifact-free enhancement results. In addition, we describe a video enhancement framework that adopts the presented illumination estimation for handling underexposed videos. To this end, a probabilistic approach is introduced to propagate illuminations of sampled keyframes to the entire video by tackling a Bayesian Maximum A Posteriori problem. Extensive experiments demonstrate the superiority of our method over the state-of-the-art methods.

preprint 2020 arXiv

Observation of topological polaritons and photonic magic angles in twisted van der Waals bi-layers

Twisted two-dimensional bi-layers offer exquisite control on the electronic bandstructure through the interlayer rotation and coupling, enabling magic-angle flat-band superconductivity and moiré excitons. Here, we demonstrate how analogous principles, combined with large anisotropy, enable extreme control and manipulation of the photonic dispersion of phonon polaritons (PhPs) in van der Waals (vdW) bi-layers. We experimentally observe tunable topological transitions from open (hyperbolic) to closed (elliptic) dispersion contours in twisted bi-layered α-MoO$_3$ at photonic magic angles, induced by polariton hybridization and robustly controlled by a topological quantity. At these transitions the bilayer dispersion flattens, exhibiting low-loss tunable polariton canalization and diffractionless propagation with resolution below $\lambda_0/40$. Our findings extend twistronics and moiré physics to nanophotonics and polaritonics, with great potential for nano-imaging, nanoscale light propagation, energy transfer and quantum applications.

preprint 2019 arXiv

Hybrid Exciton-Plasmon-Polaritons in van der Waals Semiconductor Gratings

Van der Waals materials and heterostructures manifesting strongly bound room-temperature exciton states exhibit emergent physical phenomena and hold great promise for optoelectronic applications. Here, we demonstrate that nanostructured multilayer transition metal dichalcogenides (TMDCs) by themselves provide an ideal platform for excitation and control of excitonic modes, paving the way to exciton-photonics. Hence, we show that by patterning the TMDCs into nanoresonators, strong dispersion and avoided crossing of excitons and hybrid polaritons with interaction potentials exceeding 410 meV may be controlled with great precision. We further observe that inherently strong TMDC exciton absorption resonances may be completely suppressed due to excitation of hybrid photon states and their interference. Our work paves the way to a next generation of integrated exciton optoelectronic nano-devices and applications in light generation, computing, and sensing.

preprint 2018 arXiv

Applying Deep Learning To Airbnb Search

The application to search ranking is one of the biggest machine learning success stories at Airbnb. Much of the initial gains were driven by a gradient boosted decision tree model. The gains, however, plateaued over time. This paper discusses the work done in applying neural networks in an attempt to break out of that plateau. We present our perspective not with the intention of pushing the frontier of new modeling techniques. Instead, ours is a story of the elements we found useful in applying neural networks to a real life product. Deep learning was steep learning for us. To other teams embarking on similar journeys, we hope an account of our struggles and triumphs will provide some useful pointers. Bon voyage!