Researcher profile

Dawei Zhang

Dawei Zhang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
9 topics
4 close collaborators

Published work

8 published items

Preprint, 2026, arXiv

SAMOFT: Robust Multi-Object Tracking via Region and Flow

Multi-object tracking (MOT) is a fundamental task in computer vision that requires continuously tracking multiple targets while maintaining consistent identities across frames. However, most existing approaches primarily rely on instance-level object features for trajectory association, which often leads to degraded performance under challenging conditions such as object deformation, nonlinear motion, and occlusion. In this work, we propose SAMOFT, a robust tracker that leverages pixel-level cues to improve robustness under complex motion scenarios. Specifically, we introduce a Pixel Motion Matching (PMM) module that integrates the Segment Anything Model (SAM) with dense optical flow to refine Kalman filter-based motion prediction using instantaneous foreground pixel motion. To further enhance robustness under unreliable detections, we design a Centroid Distance Matching (CDM) module that performs flexible mask-based centroid matching for low-confidence or partially occluded observations. Moreover, a Distribution-Based Correction (DBC) module models long-tailed motion patterns in a training-free manner using historical optical flow statistics and dynamically corrects trajectory states online. We also incorporate a Cluster-Aware ReID (CA-ReID) strategy to improve the stability and discriminative power of trajectory appearance features. Extensive experiments on the DanceTrack and MOTChallenge benchmarks demonstrate that SAMOFT consistently improves baseline trackers and achieves competitive performance compared with recent state-of-the-art methods, validating the effectiveness of leveraging pixel-level cues for robust multi-object tracking.

Preprint, 2022, arXiv

21-Component Compositionally Complex Ceramics: Discovery of Ultrahigh-Entropy Weberite and Fergusonite Phases and a Pyrochlore-Weberite Transition

Two new high-entropy ceramics (HECs) in the weberite and fergusonite structures, along with the unexpected formation of ordered pyrochlore phases with ultrahigh-entropy compositions and an abrupt pyrochlore-weberite transition, are discovered in a 21-component oxide system. While the Gibbs phase rule allows 21 equilibrium phases, nine out of the 13 compositions examined possess single HEC phases (with ultrahigh ideal configurational entropies: ~2.7 k_B per cation or higher on one sublattice in most cases). Notably, (15RE1/15)(Nb1/2Ta1/2)O4 possesses a single monoclinic fergusonite (C2/c) phase and (15RE1/15)3(Nb1/2Ta1/2)1O7 forms a single orthorhombic (C222_1) weberite phase, where 15RE1/15 represents Sc1/15Y1/15La1/15Pr1/15Nd1/15Sm1/15Eu1/15Gd1/15Tb1/15Dy1/15Ho1/15Er1/15Tm1/15Yb1/15Lu1/15. Moreover, a series of eight (15RE1/15)2+x(Ti1/4Zr1/4Ce1/4Hf1/4)2-2x(Nb1/2Ta1/2)xO7 specimens all exhibit single phases, where a pyrochlore-weberite transition occurs within 0.75 < x < 0.8125. This cubic-to-orthorhombic transition does not change the temperature-dependent thermal conductivity appreciably, as the amorphous limit may have already been achieved in the ultrahigh-entropy 21-component oxides. These discoveries expand the diversity and complexity of HECs, towards many-component compositionally complex ceramics (CCCs) and ultrahigh-entropy ceramics.
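
The entropy figure of about 2.7 k_B per cation quoted above is consistent with the standard ideal-mixing formula, assuming 15 equimolar rare-earth cations share one sublattice. The worked step below is an illustrative check under that assumption, not a derivation taken from the paper; the symbols N, x_i and S_conf are our own notation.

```latex
S_{\mathrm{conf}} = -k_B \sum_{i=1}^{N} x_i \ln x_i
\quad\xrightarrow{\;x_i = 1/N\;}\quad
S_{\mathrm{conf}} = k_B \ln N,
\qquad
N = 15:\;\; S_{\mathrm{conf}} = k_B \ln 15 \approx 2.71\,k_B \ \text{per cation}
```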

Preprint, 2022, arXiv

Adaptive Pseudo-Siamese Policy Network for Temporal Knowledge Prediction

Temporal knowledge prediction is a crucial task for event early warning that has gained increasing attention in recent years. It aims to predict future facts by using relevant historical facts on temporal knowledge graphs. There are two main difficulties in this prediction task. First, from the perspective of historical facts, how to model the evolutionary patterns of the facts to predict the query accurately. Second, from the query perspective, how to handle the two cases where the query contains seen and unseen entities in a unified framework. Driven by these two problems, we propose a novel adaptive pseudo-siamese policy network for temporal knowledge prediction based on reinforcement learning. Specifically, we design the policy network in our model as a pseudo-siamese policy network that consists of two sub-policy networks. In sub-policy network I, the agent searches for the answer to the query along the entity-relation paths to capture the static evolutionary patterns. In sub-policy network II, the agent searches for the answer to the query along the relation-time paths to deal with unseen entities. Moreover, we develop a temporal relation encoder to capture the temporal evolutionary patterns. Finally, we design a gating mechanism to adaptively integrate the results of the two sub-policy networks to help the agent focus on the destination answer. To assess our model's performance, we conduct link prediction on four benchmark datasets; the experimental results demonstrate that our method obtains considerable performance compared with existing methods.

Preprint, 2022, arXiv

Different Channels to Transmit Information in a Scattering Medium

A channel must be built to transmit information from one place to another. Imaging is information communication in two or more dimensions. Conventionally, an imaging channel comprises a lens and the free space on both sides of it. The transfer function of each part is known; thus, the response of a conventional imaging channel is known as well. When the lens is replaced with a scattering layer, the image can still be extracted from the detection plane. That is to say, the scattering medium reconstructs the channel for imaging. Aided by deep learning, we find that, unlike a lens, a scattering medium contains different channels, i.e., the same scattering medium can construct different channels to match different manners of source encoding. Moreover, we find that without a valid channel the convolution law for a shift-invariant system, i.e., that the output is the convolution of its point spread function (PSF) and the input object, is broken, and information cannot be transmitted onto the detection plane. In other words, valid channels are essential to transmit image information through even a shift-invariant system.
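
The convolution law for a shift-invariant system that the abstract refers to can be illustrated with a toy 1D example. This is a minimal sketch, not the authors' method: the `convolve` helper, the three-tap PSF and the point-like object are illustrative assumptions introduced here.

```python
def convolve(obj, psf):
    """Full discrete 1D convolution of an input object with a PSF."""
    out = [0.0] * (len(obj) + len(psf) - 1)
    for i, o in enumerate(obj):
        for j, p in enumerate(psf):
            out[i + j] += o * p
    return out


if __name__ == "__main__":
    psf = [0.25, 0.5, 0.25]           # hypothetical blur kernel
    obj = [0.0, 1.0, 0.0, 0.0]        # point-like object
    shifted = [0.0, 0.0, 1.0, 0.0]    # same object moved by one sample

    image = convolve(obj, psf)
    image_shifted = convolve(shifted, psf)

    # Shift invariance: moving the object moves the image by the same
    # amount without changing its shape.
    print(image)
    print(image_shifted)
```

In this toy model the image of the shifted object is the original image displaced by one sample, which is the property the abstract reports as broken when no valid channel exists.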

Preprint, 2022, arXiv

Roles of scattered and ballistic photons in imaging through scattering media: a deep learning-based study

Scattering of light in complex media scrambles optical wavefronts and breaks the principles of conventional imaging methods. For decades, researchers have endeavored to conquer the problem by inventing approaches such as adaptive optics, iterative wavefront shaping, and transmission matrix measurement. That said, imaging through or into thick scattering media remains challenging to date. With the rapid development of computing power, deep learning has been introduced and has shown potential to reconstruct target information through complex media or from rough surfaces. But it also fails when it comes to optically thick media, where ballistic photons become negligible. Here, instead of treating deep learning only as an image extraction method, whose main selling point is that it avoids complicated physical models, we exploit it as a tool to explore the underlying physical principles. By adjusting the weights of ballistic and scattered photons through a random phase mask, it is found that although deep learning can extract images from both scattered and ballistic light, the mechanisms are different: scattering may function as an encryption key and decryption from scattered light is key-sensitive, while extraction from ballistic light is stable. Based on this finding, it is hypothesized and experimentally confirmed that the foundation of the generalization capability of trained neural networks for different diffusers can be traced back to the contribution of ballistic photons, even though their share of the detected photon counts is not that significant. Moreover, the study may pave the way for using deep learning as a probe in exploring unknown physical principles in various fields.

Preprint, 2022, arXiv

Transfer and evolution of structured polarization in a double-V atomic system

We numerically investigate the transfer of optical information from a vector-vortex control beam to an unstructured probe beam, as mediated by an atomic vapour. The right and left circular components of these beams drive the atomic transitions of a double-$V$ system, with the atoms acting as a spatially varying circular birefringent medium. Modelling the propagation of the light fields, we find that, for short distances, the vectorial light structure is transferred from the control field to the probe. However, for larger propagation lengths, diffraction causes the circular components of the probe field to spatially separate. We model this system for the D1 line of cold rubidium atoms. Our investigation is a first step towards investigating the coupled dynamics of internal and external degrees of freedom of atoms in four-wave mixing.

Preprint, 2021, arXiv

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

Recently, the x-vector has been a successful and popular approach for speaker verification, which employs a time delay neural network (TDNN) and statistics pooling to extract speaker-characterizing embeddings from variable-length utterances. Improvement upon the x-vector has been an active research area, and many neural networks have been elaborately designed based on the x-vector, e.g., extended TDNN (E-TDNN), factorized TDNN (F-TDNN), and densely connected TDNN (D-TDNN). In this work, we try to identify the optimal architectures from a TDNN-based search space employing neural architecture search (NAS), an approach we name SpeechNAS. Leveraging recent advances in speaker recognition, such as high-order statistics pooling, the multi-branch mechanism, D-TDNN and angular additive margin softmax (AAM) loss with a minimum hyper-spherical energy (MHE), SpeechNAS automatically discovers five network architectures, from SpeechNAS-1 to SpeechNAS-5, of various numbers of parameters and GFLOPs on the large-scale text-independent speaker recognition dataset VoxCeleb1. Our derived best neural network achieves an equal error rate (EER) of 1.02% on the standard test set of VoxCeleb1, which surpasses previous TDNN-based state-of-the-art approaches by a large margin. Code and trained weights are available at https://github.com/wentaozhu/speechnas.git

Preprint, 2019, arXiv

Haptic Teleoperation of UAVs through Control Barrier Functions

This paper presents a novel approach to haptic teleoperation. Specifically, we use control barrier functions (CBFs) to generate force feedback to help human operators safely fly quadrotor UAVs. CBFs take a control signal as input and output a control signal that is as close as possible to the initial control signal, while also meeting specified safety constraints. In the proposed method, we generate haptic force feedback based on the difference between a command issued by the human operator and the safe command returned by a CBF. In this way, if the user issues an unsafe control command, the haptic feedback will help guide the user towards the safe input command that is closest to their current command. We conducted a within-subject user study, in which 12 participants flew a simulated UAV in a virtual hallway environment. Participants completed the task with our proposed CBF-based haptic feedback, no haptic feedback, and haptic feedback generated via parametric risk fields, which is a state-of-the-art method described in the literature. The results of this study show that CBF-based haptic feedback can improve a human operator's ability to safely fly a UAV and reduce the operator's perceived workload, without sacrificing task efficiency.
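
The idea of deriving force feedback from the gap between the operator's command and the CBF-filtered command can be sketched in one dimension. This is a simplified illustration, not the paper's implementation: it assumes a single-integrator model (velocity command `u`, position `x`), a hallway bounded by hypothetical walls `x_min` and `x_max`, and made-up gains `alpha` and `k_force`. With barrier functions h1 = x_max - x and h2 = x - x_min, the CBF conditions dh/dt >= -alpha * h reduce to simple bounds on `u`, so the closest safe command is a clip.

```python
def cbf_safe_command(x, u_des, x_min=-1.0, x_max=1.0, alpha=2.0):
    """Return the velocity command closest to u_des that keeps x in [x_min, x_max].

    For x' = u, the barrier conditions give
        u <=  alpha * (x_max - x)   (from h1 = x_max - x)
        u >= -alpha * (x - x_min)   (from h2 = x - x_min)
    so the minimally modified command is u_des clipped to this interval.
    """
    upper = alpha * (x_max - x)
    lower = -alpha * (x - x_min)
    return min(max(u_des, lower), upper)


def haptic_force(x, u_des, k_force=1.0, **cbf_params):
    """Force proportional to the difference between the safe and desired commands."""
    u_safe = cbf_safe_command(x, u_des, **cbf_params)
    return k_force * (u_safe - u_des)


if __name__ == "__main__":
    # Operator pushes hard towards the wall at x_max while already close to it.
    print(cbf_safe_command(0.5, 3.0))  # command is reduced
    print(haptic_force(0.5, 3.0))      # force pushes back against the input
    # A command that is already safe produces no force.
    print(haptic_force(0.0, 0.5))
```

The force is zero whenever the operator's command already satisfies the constraints and grows with the size of the correction, which matches the behaviour the abstract describes: feedback guides the user towards the nearest safe input.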