Source author record

Miao Zhang

Miao Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

42 works

28 topics

4 close collaborators

Preprint, 2026, arXiv

CAGS: Color-Adaptive Volumetric Video Streaming with Dynamic 3D Gaussian Splatting

Volumetric video (VV) streaming enables real-time, immersive access to remote 3D environments, powering telepresence, ecological monitoring, and robotic teleoperation. These applications turn VV streaming into a real-time interface to remote physical environments, imposing new system-level demands for photorealistic scene representation, low-latency interaction, and robust performance under heterogeneous networks. 3D Gaussian Splatting (3DGS) has been widely used for real-time photorealistic rendering, offering superior visual quality and rendering performance, but its bandwidth consumption remains a challenge. Furthermore, as the foundation of adaptive VV streaming, existing Levels of Detail (LoD) methods based on density are not well-suited to Gaussian representations, leading to visible gaps and severe quality degradation. Recent studies have also explored attribute compression techniques to reduce bandwidth consumption. Our preliminary studies reveal that aggressive attribute compression primarily causes color distortion, which can be effectively corrected in the rendered image using a reference image. Motivated by these findings, we propose a novel Color-Adaptive scheme for adaptive VV streaming that uses vector quantization (VQ) to establish LoDs and corrects color distortions with low-resolution reference images. We further present CAGS, an adaptive VV streaming system compatible with diverse Gaussian representations, which integrates the Color-Adaptive scheme by rendering reference images on the streaming server and performing color restoration on the client. Extensive experiments on our prototype system demonstrate that CAGS outperforms existing adaptive streaming systems in PSNR by 5 to 20 dB under fluctuating bandwidth, operates significantly faster than existing scalable Gaussian compression methods, and generalizes across different Gaussian representations.
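As a rough illustration of the vector quantization step mentioned in this abstract, the toy sketch below replaces each color vector with the index of its nearest codeword, so that a smaller codebook gives a coarser (cheaper) level of detail. The function names, the sample colors, and the two-entry codebook are all hypothetical; this is not the CAGS implementation, which the abstract does not describe in detail.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(vectors: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each vector to the index of its nearest codeword."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]


def vq_decode(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Reconstruct approximate vectors by codebook lookup."""
    return [codebook[i] for i in indices]


# Hypothetical per-Gaussian RGB colors and a very coarse two-entry codebook.
colors = [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1), (0.1, 0.1, 0.9), (0.2, 0.1, 0.8)]
coarse_codebook = [(0.85, 0.15, 0.1), (0.15, 0.1, 0.85)]

indices = vq_encode(colors, coarse_codebook)
restored = vq_decode(indices, coarse_codebook)
```

Only the indices and the codebook need to be transmitted; the difference between `colors` and `restored` is the kind of color distortion the abstract says can be corrected afterwards with a reference image.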

Preprint, 2026, arXiv

Earthquake Source Depth Determination using Single Station Waveforms and Deep Learning

In areas with limited station coverage, earthquake depth constraints are much less accurate than their latitude and longitude. Traditional travel-time-based location methods struggle to constrain depths due to imperfect station distribution and the strong trade-off between source depth and origin time. Identifying depth phases at regional distances is usually hindered by strong wave scattering, which is particularly challenging for low-magnitude events. Deep learning algorithms, capable of extracting various features from seismic waveforms, including phase arrivals, phase amplitudes, as well as phase frequency, offer promising constraints to earthquake depths. In this work, we propose a novel depth feature extraction network (named VGGDepth), which directly maps seismic waveforms to earthquake depth using three-component waveforms. The network structure is adapted from VGG16 in computer vision. It is designed to take single-station three-component waveforms as inputs and produce depths as outputs, achieving a direct mapping from waveforms to depths. Two scenarios are considered in our model development: (1) training and testing solely on the same seismic station, and (2) generalizing by training and testing on different seismic stations within a particular region. We demonstrate the efficacy of our methodology using seismic data from the 2016-2017 Central Apennines, Italy earthquake sequence. Results demonstrate that earthquake depths can be estimated from single stations with uncertainties of hundreds of meters. These uncertainties are further reduced by averaging results from multiple stations. Our method shows strong potential for earthquake depth determination, particularly for events recorded by single or sparsely distributed stations, such as historically instrumented earthquakes.

Preprint, 2026, arXiv

Identity-Robust Language Model Generation via Content Integrity Preservation

Large Language Model (LLM) outputs often vary across user sociodemographic attributes, leading to disparities in factual accuracy, utility, and safety, even for objective questions where demographic information is irrelevant. Unlike prior work on stereotypical or representational bias, this paper studies identity-dependent degradation of core response quality. We show empirically that such degradation arises from biased generation behavior, despite factual knowledge being robustly encoded across identities. Motivated by this mismatch, we propose a lightweight, training-free framework for identity-robust generation that selectively neutralizes non-critical identity information while preserving semantically essential attributes, thus maintaining output content integrity. Experiments across four benchmarks and 18 sociodemographic identities demonstrate an average 77% reduction in identity-dependent bias compared to vanilla prompting and a 45% reduction relative to prompt-based defenses. Our work addresses a critical gap in mitigating the impact of user identity cues in prompts on core generation quality.

Preprint, 2026, arXiv

MacVQA: Adaptive Memory Allocation and Global Noise Filtering for Continual Visual Question Answering

Visual Question Answering (VQA) requires models to reason over multimodal information, combining visual and textual data. With the development of continual learning, significant progress has been made in retaining knowledge and adapting to new information in the VQA domain. However, current methods often struggle with balancing knowledge retention, adaptation, and robust feature representation. To address these challenges, we propose a novel framework with adaptive memory allocation and global noise filtering called MacVQA for visual question answering. MacVQA fuses visual and question information while filtering noise to ensure robust representations, and employs prototype-based memory allocation to optimize feature quality and memory usage. These designs enable MacVQA to balance knowledge acquisition, retention, and compositional generalization in continual VQA learning. Experiments on ten continual VQA tasks show that MacVQA outperforms existing baselines, achieving 43.38% average accuracy and 2.32% average forgetting on standard tasks, and 42.53% average accuracy and 3.60% average forgetting on novel composition tasks.

Preprint, 2026, arXiv

MyGram: Modality-aware Graph Transformer with Global Distribution for Multi-modal Entity Alignment

Multi-modal entity alignment aims to identify equivalent entities between two multi-modal knowledge graphs by integrating multi-modal data, such as images and text, to enrich the semantic representations of entities. However, existing methods may overlook the structural contextual information within each modality, making them vulnerable to interference from shallow features. To address these challenges, we propose MyGram, a modality-aware graph transformer with global distribution for multi-modal entity alignment. Specifically, we develop a modality diffusion learning module to capture deep structural contextual information within modalities and enable fine-grained multi-modal fusion. In addition, we introduce a Gram Loss that acts as a regularization constraint by minimizing the volume of a 4-dimensional parallelotope formed by multi-modal features, thereby achieving global distribution consistency across modalities. We conduct experiments on five public datasets. Results show that MyGram outperforms baseline models, achieving a maximum improvement of 4.8% in Hits@1 on FBDB15K, 9.9% on FBYG15K, and 4.3% on DBP15K.
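The volume that the Gram Loss is said to minimize can be illustrated with the standard identity that k vectors span a parallelotope of volume sqrt(det(G)), where G is their Gram matrix of pairwise inner products. The sketch below assumes four feature vectors per entity (one per modality), which the abstract does not confirm, and it is not MyGram's actual loss; all function names are hypothetical.

```python
import math
from typing import List, Sequence


def gram_matrix(vectors: List[Sequence[float]]) -> List[List[float]]:
    """Matrix of pairwise inner products G[i][j] = <v_i, v_j>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]


def determinant(matrix: List[List[float]]) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular: the vectors are linearly dependent
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def parallelotope_volume(vectors: List[Sequence[float]]) -> float:
    """Volume of the parallelotope spanned by the vectors: sqrt(det(Gram))."""
    return math.sqrt(max(determinant(gram_matrix(vectors)), 0.0))


# Four mutually orthogonal unit vectors: maximal disagreement, volume 1.
orthogonal = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Four identical unit vectors: perfect agreement, volume 0.
aligned = [[1, 0, 0, 0]] * 4

volume_orthogonal = parallelotope_volume(orthogonal)
volume_aligned = parallelotope_volume(aligned)
```

Under this reading, driving the volume toward zero pulls the modality features of the same entity toward a common direction, which is one way to interpret "distribution consistency across modalities".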

Preprint, 2026, arXiv

Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping

Recently, post-training methods based on reinforcement learning, with a particular focus on Group Relative Policy Optimization (GRPO), have emerged as a robust paradigm for further advancement of text-to-image (T2I) models. However, these methods are often prone to reward hacking, wherein models exploit biases in imperfect reward functions rather than yielding genuine performance gains. In this work, we find that normalization can lead to miscalibration, and that directly removing the prompt-level standard deviation term yields an optimal policy ascent direction that is linear in the advantage but still limits the separation of genuine signals from noise. To mitigate these issues, we propose Super-Linear Advantage Shaping (SLAS) by revisiting the functional update from an information geometry perspective. By extending the Fisher-Rao information metric with advantage-dependent weighting, SLAS introduces a non-linear geometric structure that reshapes the local policy space. This design relaxes constraints along high-advantage directions to amplify informative updates, while tightening those in low-advantage regions to suppress illusory gradients. In addition, batch-level normalization is applied to stabilize training under varying reward scales. Extensive evaluations demonstrate that SLAS consistently surpasses the DanceGRPO baseline across multiple backbones and benchmarks. In particular, it yields faster training dynamics, improved out-of-domain performance on GenEval and UniGenBench++, and enhanced robustness to model scaling, while mitigating reward hacking and preserving semantic and compositional fidelity in generations.
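The role of the prompt-level standard deviation term can be seen in a small sketch of group-relative advantages, computed with and without dividing by the group's standard deviation. This shows only the baseline normalization the abstract discusses, not SLAS itself; the function name, the `eps` constant, the use of the population standard deviation, and the reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], use_std: bool = True,
                     eps: float = 1e-8) -> List[float]:
    """Center rewards within one prompt group; optionally divide by the group std."""
    mu = mean(rewards)
    centered = [r - mu for r in rewards]
    if not use_std:
        return centered
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [c / (sigma + eps) for c in centered]


# Hypothetical group whose samples are nearly tied (differences are mostly noise).
near_tied = [0.50, 0.51, 0.49, 0.50]

normalized = group_advantages(near_tied, use_std=True)
centered_only = group_advantages(near_tied, use_std=False)
```

With the standard deviation term, the 0.01 reward gap in the near-tied group is scaled up to an advantage of about 1.41, the same magnitude a clearly separated group would receive. Without it, the advantage stays at 0.01, which is the linear behavior the abstract refers to.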

Preprint, 2026, arXiv

Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents

Rapid biodiversity loss underscores the urgency of effective monitoring, yet manual surveys remain resource-intensive. While on-device AI offers a scalable alternative, its performance in the wild is often challenged by environmental variability. Current methods rely heavily on cloud resources, which requires continuous uploading of field data for model retraining. This approach is unsuitable for remote deployments because it consumes limited power and network connectivity. To address these constraints, this research proposes a shift from model adaptation to knowledge adaptation. We introduce an architecture that separates visual perception from reasoning, combining a visual encoder with a dynamic knowledge base. We use an explicit knowledge base instead of implicitly encoding expert knowledge into model parameters. This method also supports knowledge sustainability by preserving expert insights in a structured form. Through cross-disciplinary collaboration with biologists and Indigenous communities, this work advances ethical AI co-development, fostering responsible and culturally informed ecosystem management.

Preprint, 2026, arXiv

TBPLaS 2.0: a Tight-Binding Package for Large-scale Simulation

The common exact diagonalization-based techniques for solving tight-binding models suffer from O(N^2) and O(N^3) scaling with respect to model size in memory and CPU time, respectively, hindering their application to large tight-binding models. In contrast, the tight-binding propagation method (TBPM) can achieve linear scaling in both memory and CPU time, and is capable of handling large tight-binding models with billions of orbitals. In this paper, we introduce version 2.0 of TBPLaS, a package for large-scale simulation based on TBPM. This new version brings significant improvements with many new features. Existing Python/Cython modeling tools have been thoroughly optimized, and a compatible C++ implementation of the modeling tools is now available, offering an efficiency enhancement of several orders of magnitude. The solvers have been rewritten in C++ from scratch, with the efficiency enhanced by several times or even by an order of magnitude. The workflow for using the solvers has also been unified into a more comprehensive and consistent form. New features include spin texture, Berry curvature and Chern number calculation, partial diagonalization for specific eigenvalues and eigenstates, analytical Hamiltonian, and GPU computing support. The documentation and tutorials have also been updated to the new version. In this paper, we discuss the revisions with respect to version 1.3 and demonstrate the new features. Benchmarks on modeling tools and solvers are also provided.

Preprint, 2026, arXiv

Utilizing Earth Foundation Models to Enhance the Simulation Performance of Hydrological Models with AlphaEarth Embeddings

Predicting river flow in places without streamflow records is challenging because basins respond differently to climate, terrain, vegetation, and soils. Traditional basin attributes describe some of these differences, but they cannot fully represent the complexity of natural environments. This study examines whether AlphaEarth Foundation embeddings, which are learned from large collections of satellite images rather than designed by experts, offer a more informative way to describe basin characteristics. These embeddings summarize patterns in vegetation, land surface properties, and long-term environmental dynamics. We find that models using them achieve higher accuracy when predicting flows in basins not used for training, suggesting that they capture key physical differences more effectively than traditional attributes. We further investigate how selecting appropriate donor basins influences prediction in ungauged regions. Similarity based on the embeddings helps identify basins with comparable environmental and hydrological behavior, improving performance, whereas adding many dissimilar basins can reduce accuracy. The results show that satellite-informed environmental representations can strengthen hydrological forecasting and support the development of models that adapt more easily to different landscapes.

Preprint, 2022, arXiv

FedHiSyn: A Hierarchical Synchronous Federated Learning Framework for Resource and Data Heterogeneity

Federated Learning (FL) enables training a global model without sharing the decentralized raw data stored on multiple devices to protect data privacy. Due to the diverse capacity of the devices, FL frameworks struggle to tackle the problems of straggler effects and outdated models. In addition, data heterogeneity incurs severe accuracy degradation of the global model in the FL training process. To address the aforementioned issues, we propose a hierarchical synchronous FL framework, i.e., FedHiSyn. FedHiSyn first clusters all available devices into a small number of categories based on their computing capacity. After a certain interval of local training, the models trained in different categories are simultaneously uploaded to a central server. Within a single category, the devices communicate the locally updated model weights to each other based on a ring topology. As the efficiency of training in the ring topology favors devices with homogeneous resources, the classification based on computing capacity mitigates the impact of straggler effects. In addition, the combination of the synchronous update of multiple categories and the device communication within a single category helps address the data heterogeneity issue while achieving high accuracy. We evaluate the proposed framework on the MNIST, EMNIST, CIFAR10 and CIFAR100 datasets and diverse heterogeneous settings of devices. Experimental results show that FedHiSyn outperforms six baseline methods, e.g., FedAvg, SCAFFOLD, and FedAT, in terms of training accuracy and efficiency.

Preprint, 2022, arXiv

Handling Data Heterogeneity with Generative Replay in Collaborative Learning for Medical Imaging

Collaborative learning, which enables collaborative and decentralized training of deep neural networks at multiple institutions in a privacy-preserving manner, is rapidly emerging as a valuable technique in healthcare applications. However, its distributed nature often leads to significant heterogeneity in data distributions across institutions. In this paper, we present a novel generative replay strategy to address the challenge of data heterogeneity in collaborative learning methods. Unlike traditional methods that directly aggregate the model parameters, we leverage generative adversarial learning to aggregate the knowledge from all the local institutions. Specifically, instead of directly training a model for task performance, we develop a novel dual model architecture: a primary model learns the desired task, and an auxiliary "generative replay model" allows aggregating knowledge from the heterogeneous clients. The auxiliary model is then broadcast to the central server to regulate the training of the primary model with an unbiased target distribution. Experimental results demonstrate the capability of the proposed method in handling heterogeneous data across institutions. On highly heterogeneous data partitions, our model achieves a ~4.88% improvement in prediction accuracy on a diabetic retinopathy classification dataset, and a ~49.8% reduction in mean absolute error on a Bone Age prediction dataset, compared to state-of-the-art collaborative learning methods.

Preprint, 2022, arXiv

MFNet: Multi-class Few-shot Segmentation Network with Pixel-wise Metric Learning

In visual recognition tasks, few-shot learning requires the ability to learn object categories with few support examples. Its renewed popularity with the rise of deep learning has centered mainly on image classification. This work focuses on few-shot semantic segmentation, which is still a largely unexplored field. The few recent advances are often restricted to single-class few-shot segmentation. In this paper, we first present a novel multi-way (class) encoding and decoding architecture which effectively fuses multi-scale query information and multi-class support information into one query-support embedding. Multi-class segmentation is directly decoded upon this embedding. For better feature fusion, a multi-level attention mechanism is proposed within the architecture, which includes attention for support feature modulation and attention for multi-scale combination. Finally, to enhance the embedding space learning, an additional pixel-wise metric learning module is introduced with triplet loss formulated on the pixel-level embedding of the input image. Extensive experiments on standard benchmarks PASCAL-5i and COCO-20i show clear benefits of our method over the state of the art in few-shot segmentation.

Preprint, 2022, arXiv

Segmenting across places: The need for fair transfer learning with satellite imagery

The increasing availability of high-resolution satellite imagery has enabled the use of machine learning to support land-cover measurement and inform policy-making. However, labelling satellite images is expensive, and labels are available for only some locations. This prompts the use of transfer learning to adapt models from data-rich locations to others. Given the potential for high-impact applications of satellite imagery across geographies, a systematic assessment of transfer learning implications is warranted. In this work, we consider the task of land-cover segmentation and study the fairness implications of transferring models across locations. We leverage a large satellite image segmentation benchmark with 5987 images from 18 districts (9 urban and 9 rural). Via fairness metrics we quantify disparities in model performance along two axes -- across urban-rural locations and across land-cover classes. Findings show that state-of-the-art models have better overall accuracy in rural areas compared to urban areas, though unsupervised domain adaptation methods transfer learning better to urban than to rural areas and enlarge fairness gaps. In analyzing the reasons for these findings, we show that raw satellite images are overall more dissimilar between source and target districts for rural than for urban locations. This work highlights the need to conduct fairness analysis for satellite imagery segmentation models and motivates the development of methods for fair transfer learning in order not to introduce disparities between places, particularly urban and rural locations.

Preprint, 2022, arXiv

SplitAVG: A heterogeneity-aware federated deep learning method for medical imaging

Federated learning is an emerging research paradigm that enables collaborative training of deep learning models without sharing patient data. However, the data from different institutions are usually heterogeneous, which may reduce the performance of models trained using federated learning. In this study, we propose a novel heterogeneity-aware federated learning method, SplitAVG, to overcome the performance drops from data heterogeneity in federated learning. Unlike previous federated methods that require complex heuristic training or hyperparameter tuning, our SplitAVG leverages simple network split and feature map concatenation strategies to encourage the federated model to train an unbiased estimator of the target data distribution. We compare SplitAVG with seven state-of-the-art federated learning methods, using centrally hosted training data as the baseline, on a suite of both synthetic and real-world federated datasets. We find that the performance of models trained using all the comparison federated learning methods degraded significantly with increasing degrees of data heterogeneity. In contrast, the SplitAVG method achieves results comparable to the baseline method under all heterogeneous settings: on highly heterogeneous data partitions, it achieves 96.2% of the accuracy and 110.4% of the mean absolute error obtained by the baseline on a diabetic retinopathy binary classification dataset and a bone age prediction dataset, respectively. We conclude that the SplitAVG method can effectively overcome the performance drops from variability in data distributions across institutions. Experimental results also show that SplitAVG can be adapted to different base networks and generalized to various types of medical imaging tasks.

Preprint, 2022, arXiv

Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective

Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data. Nevertheless, it is well known that deep GCNs suffer from the over-smoothing problem, where node representations tend to become indistinguishable as more layers are stacked. The theoretical research to date on deep GCNs has focused primarily on expressive power rather than trainability, an optimization perspective. Compared to expressivity, trainability attempts to address a more fundamental question: Given a sufficiently expressive space of models, can we successfully find a good solution via gradient descent-based optimizers? This work fills this gap by exploiting the Graph Neural Tangent Kernel (GNTK), which governs the optimization trajectory under gradient descent for wide GCNs. We formulate the asymptotic behavior of the GNTK at large depth, which enables us to reveal that the trainability of wide and deep GCNs drops at an exponential rate in the optimization process. Additionally, we extend our theoretical framework to analyze residual connection-based techniques, which are found to mitigate the exponential decay of trainability only mildly. Inspired by our theoretical insights on trainability, we propose Critical DropEdge, a connectivity-aware and graph-adaptive sampling method, to alleviate the exponential decay problem more fundamentally. Experimental evaluation consistently confirms that our proposed method achieves better results than relevant counterparts in both infinite-width and finite-width settings.

Preprint, 2021, arXiv

Anomalous interfacial dynamics of single proton charges in binary aqueous solutions

Understanding the dynamics of charge exchange between a solid surface and a liquid is fundamental to various situations, ranging from nanofiltration to catalysis and electrochemistry. Charge transfer is ultimately determined by physicochemical processes (surface group dissociation, ion adsorption, etc.) occurring in the few layers of molecules at the interface between the solid and the liquid. Unfortunately, these processes remain largely uncharted due to the experimental challenges in probing interfacial charge dynamics with sufficiently high spatial and temporal resolution. Here, we resolve, at the single-charge scale, the dynamics of proton charges at the interface between an hBN crystal and binary mixtures of water and organic amphiphilic solvents (e.g. alcohol), evidencing a dramatic influence of solvation on interfacial dynamics. Our observations rely on the application of spectral Single Molecule Localization Microscopy (sSMLM) to two types of optically active defects at the hBN surface, which act as intrinsic optical markers for both surface protonation and interaction with apolar alkyl groups of the organic solvent. We use sSMLM to reveal interfacial proton charge transport as a succession of jumps between the titratable surface defects, mediated by the transport of the solvated proton charge along the solid/liquid interface. By changing the relative concentration of water in binary mixtures, we evidence a non-trivial effect on interfacial proton charge dynamics, leading at intermediate water concentration to an increased affinity of the proton charge to the solid surface, accompanied by an increased surface diffusivity. These measurements confirm the strong role of solvation in interfacial proton charge transport and establish the potential of single-molecule localization techniques to probe a wide range of dynamic processes at solid/liquid interfaces.

Preprint, 2021, arXiv

Exploiting Deep Learning for Secure Transmission in an Underlay Cognitive Radio Network

This paper investigates a machine learning-based power allocation design for secure transmission in a cognitive radio (CR) network. In particular, a neural network (NN)-based approach is proposed to maximize the secrecy rate of the secondary receiver under constraints on the total transmit power of the secondary transmitter and the interference leakage to the primary receiver, within which three different regularization schemes are developed. The key advantage of the proposed algorithm over conventional approaches is the capability to solve the power allocation problem with both perfect and imperfect channel state information. In a conventional setting, two completely different optimization frameworks have to be designed, namely the robust and non-robust designs. Furthermore, conventional algorithms are often based on iterative techniques, and hence they require a considerable number of iterations, rendering them less suitable for future wireless networks with very stringent delay constraints. To meet the unprecedented requirements of future ultra-reliable low-latency networks, we propose an NN-based approach that can determine the power allocation in a CR network with significantly reduced computational time and complexity. As the trained NN requires only a small number of linear operations to yield the required power allocations, the approach can also be extended to different delay-sensitive applications and services in future wireless networks. When evaluated against conventional approaches on a suitable test set, the proposed approach achieves more than 94% of the secrecy rate performance with less than 1% of the computation time and more than 93% satisfaction of interference leakage constraints. These results are obtained with a significant reduction in computational time, which we believe makes the approach suitable for future real-time wireless applications.

Preprint, 2020, arXiv

Accurate RGB-D Salient Object Detection via Collaborative Learning

Benefiting from the spatial cues embedded in depth images, recent progress on RGB-D saliency detection shows impressive ability in some challenging scenarios. However, there are still two limitations. On the one hand, the pooling and upsampling operations in FCNs might blur object boundaries. On the other hand, using an additional depth network to extract depth features might lead to high computation and storage costs. The reliance on depth inputs during testing also limits the practical applications of current RGB-D models. In this paper, we propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way, which addresses these problems. The explicitly extracted edge information goes together with saliency to give more emphasis to the salient regions and object boundaries. Depth and saliency learning is integrated into the high-level feature learning process in a mutually beneficial manner. This strategy frees the network from using extra depth networks and depth inputs at inference time. As a result, our model is more lightweight, faster and more versatile. Experimental results on seven benchmark datasets show its superior performance.

Preprint, 2020, arXiv

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

Light field data exhibit favorable characteristics conducive to saliency detection. The success of learning-based light field saliency detection is heavily dependent on how a comprehensive dataset can be constructed for higher generalizability of models, how high-dimensional light field data can be effectively exploited, and how a flexible model can be designed to achieve versatility for desktop computers and mobile devices. To answer these questions, we first introduce a large-scale dataset to enable versatile applications for RGB, RGB-D and light field saliency detection, containing 102 classes and 4204 samples. Second, we present an asymmetrical two-stream model consisting of the Focal stream and RGB stream. The Focal stream is designed to achieve higher performance on desktop computers and transfer focusness knowledge to the RGB stream, relying on two tailor-made modules. The RGB stream guarantees flexibility and memory/computation efficiency on mobile devices through three distillation schemes. Experiments demonstrate that our Focal stream achieves state-of-the-art performance. The RGB stream achieves Top-2 F-measure on DUTLF-V2 while reducing the model size by 83% and boosting FPS by 5 times compared with the best performing method. Furthermore, our proposed distillation schemes are applicable to RGB saliency models, achieving impressive performance gains while ensuring flexibility.

Preprint, 2020, arXiv

Feasibility and A Fast Algorithm for Euclidean Distance Matrix Optimization with Ordinal Constraints

Euclidean distance matrix optimization with ordinal constraints (EDMOC) has found important applications in sensor network localization and molecular conformation. It can also be viewed as a matrix formulation of multidimensional scaling, which is to embed $n$ points in an $r$-dimensional space such that the resulting distances follow the ordinal constraints. The ordinal constraints, though proved to be quite useful, may result in only the zero solution when too many are added, leaving the feasibility of EDMOC as a question. In this paper, we first study the feasibility of EDMOC systematically. We show that if $r\ge n-2$, EDMOC always admits a nontrivial solution. Otherwise, it may have only the zero solution. The latter explains the numerical observations of the 'crowding phenomenon'. Next we overcome two obstacles in designing fast algorithms for EDMOC, i.e., the low-rankness and the potentially huge number of ordinal constraints. We apply the technique developed to take the low rank constraint as the conditional positive semidefinite cone with rank cut. This leads to a majorization penalty approach. The ordinal constraints are left to the subproblem, which is exactly the weighted isotonic regression, and can be solved by the enhanced implementation of the Pool Adjacent Violators Algorithm (PAVA). Extensive numerical results demonstrate the superior performance of the proposed approach over some state-of-the-art solvers.
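For readers unfamiliar with the subproblem named in this abstract, the sketch below is a textbook Pool Adjacent Violators Algorithm for weighted isotonic regression: it finds the non-decreasing sequence x minimizing the weighted sum of squares sum_i w_i (x_i - y_i)^2. It is a plain reference version written for illustration, not the "enhanced implementation" the abstract mentions, and the function name `pava` is an assumption.

```python
from typing import List, Optional


def pava(y: List[float], w: Optional[List[float]] = None) -> List[float]:
    """Weighted isotonic regression by pooling adjacent violators.

    Returns the non-decreasing sequence closest to y in weighted least squares.
    """
    if w is None:
        w = [1.0] * len(y)

    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks: List[List[float]] = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    # Expand each block back to one fitted value per original point.
    fitted: List[float] = []
    for block_mean, _, count in blocks:
        fitted.extend([block_mean] * int(count))
    return fitted


unweighted_fit = pava([1, 3, 2, 4])      # the pair (3, 2) is pooled to 2.5
weighted_fit = pava([3, 1], w=[3, 1])    # pooled mean is (3*3 + 1*1) / 4
```

Each point is pushed once and each merge removes a block, so the loop runs in linear time overall, which is why the method stays practical even when there are many ordinal constraints.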

preprint2020arXiv

Multi-Domain Learning and Identity Mining for Vehicle Re-Identification

This paper introduces our solution for Track2 of the AI City Challenge 2020 (AICITY20). Track2 is a vehicle re-identification (ReID) task with both real-world and synthetic data. Our solution is based on a strong baseline with bag of tricks (BoT-BS) proposed in person ReID. First, we propose a multi-domain learning method that jointly uses the real-world and synthetic data to train the model. Then, we propose the Identity Mining method to automatically generate pseudo labels for a part of the testing data, which is better than k-means clustering. A tracklet-level re-ranking strategy with weighted features is also used to post-process the results. Finally, with a multiple-model ensemble, our method achieves an mAP score of 0.7322, which yields third place in the competition. The code is available at https://github.com/heshuting555/AICITY2020_DMT_VehicleReID.

preprint2020arXiv

Real-time Earthquake Early Warning with Deep Learning: Application to the 2016 Central Apennines, Italy Earthquake Sequence

Earthquake early warning systems are required to report earthquake locations and magnitudes as quickly as possible before the damaging S wave arrival to mitigate seismic hazards. Deep learning techniques provide potential for extracting earthquake source information from full seismic waveforms instead of seismic phase picks. We developed a novel deep learning earthquake early warning system that utilizes fully convolutional networks to simultaneously detect earthquakes and estimate their source parameters from continuous seismic waveform streams. The system determines earthquake location and magnitude as soon as one station receives earthquake signals and evolutionarily improves the solutions by receiving continuous data. We apply the system to the 2016 Mw 6.0 earthquake in Central Apennines, Italy and its subsequent sequence. Earthquake locations and magnitudes can be reliably determined as early as four seconds after the earliest P phase, with mean error ranges of 6.8-3.7 km and 0.31-0.23, respectively.

preprint2020arXiv

Resource Allocation Technique for Hybrid TDMA-NOMA System with Opportunistic Time Assignment

In this paper, we develop a resource allocation technique for a hybrid time division multiple access (TDMA) - non-orthogonal multiple access (NOMA) system with opportunistic time assignment. In particular, the available transmission time is divided into several time-slots, through which multiple users are served by exploiting power-domain NOMA. To fully exploit underlying benefits of this hybrid TDMA-NOMA system, we utilize the available resources efficiently by jointly allocating transmit power and time-slots to several groups of users in the system. Furthermore, these resources are allocated to maximize minimum rate of the users in the system. However, this max-min resource allocation problem is non-convex due to coupled design parameters of time and power allocations. Hence, we exploit a novel second-order cone formulation to overcome this non-convexity issue and develop an iterative algorithm to realize a solution to the original max-min problem. Simulation results show that this joint resource allocation technique has a considerable performance enhancement in terms of both minimum achieved rate and overall system throughput compared to that of the conventional resource allocation technique where equal time-slots are assigned to the groups of users.

preprint2020arXiv

Super-resolved optical mapping of reactive sulfur-vacancy in 2D transition metal dichalcogenides

Transition metal dichalcogenides (TMDs) represent an entirely new class of semiconducting 2D materials with exciting properties. Defects in 2D TMDs can crucially affect their physical and chemical properties. However, characterization of the presence and spatial distribution of defects is limited either in throughput or in resolution. Here, we demonstrate large-area mapping of reactive sulfur-deficient defects in 2D TMDs by coupling single-molecule localization microscopy with fluorescence labeling using thiol chemistry. Our method, reminiscent of PAINT strategies, relies on the specific binding by reversible physisorption of fluorescent probes to sulfur vacancies via a thiol group and on their intermittent emission to localize the labeled defects with a precision down to 15 nm. Tuning the distance between the fluorophore and the docking thiol site allows us to control the Förster Resonance Energy Transfer (FRET) process and reveal large structural defects such as grain boundaries and line defects, due to the local irregular lattice structure. Our methodology provides a simple and fast alternative for large-scale mapping of non-radiative defects in 2D materials and paves the way for in-situ and spatially resolved monitoring of the interaction between chemical agents and the defects in 2D materials, which has general implications for defect engineering in aqueous conditions.

preprint2016arXiv

High-pressure Phase Stability and Superconductivity of Pnictogen Hydrides and Chemical Trends for Compressed Hydrides

Binary hydrides formed by the pnictogens of phosphorus, arsenic and antimony are studied at high pressures using first principles methods. Stable structures are predicted and their electronic, vibrational and superconducting properties are investigated. We predict that SbH$_{4}$ and AsH$_{8}$ will be high-temperature superconductors at megabar pressures, with critical temperatures in excess of 100 K. The highly symmetric hexagonal SbH$_{4}$ phase is predicted to be stabilized above about 150 GPa, which is readily achievable in diamond anvil cell experiments. We find that all phosphorus hydrides are metastable with respect to decomposition into the elements within the pressure range studied. Trends based on our results and literature data reveal a connection between the high-pressure behaviors and ambient-pressure chemical quantities which provides insight into understanding which elements may form hydrogen-rich high-temperature superconducting phases at high pressures.

preprint2016arXiv

Secrecy Rate Maximization for MISO Multicasting SWIPT System with Power Splitting Scheme

This paper considers transmit covariance matrix design for the secrecy rate maximization problem in a multiple-input single-output (MISO) multicasting simultaneous wireless information and power transfer (SWIPT) system. In order to enhance the performance of the system, artificial noise (AN) is added to the transmit signal for two purposes: to reduce the received signal-to-noise ratio (SNR) at the eavesdroppers and to increase the harvested energy. We assume that all the channel-state-information (CSI) is perfectly known at the transmitter and that all legitimate users are capable of simultaneously receiving information and harvesting energy. In addition, all the eavesdroppers are passive and can harvest energy only when they are not intercepting or eavesdropping on the messages intended for the legitimate users. The original secrecy rate maximization problem is not convex in terms of the transmit and AN covariance matrices and the power splitting (PS) ratio. In order to circumvent this non-convexity issue, we exploit the Charnes-Cooper transformation and semidefinite relaxation (SDR) to convert the original problem into a convex one. However, this convex problem does not always yield the rank-one transmit and AN covariance matrices needed to obtain the solution of the original problem. Therefore, we analyze the optimality conditions and utilize a Gaussian randomization (GR) method to construct rank-one solutions from the non-rank-one results. Simulation results are provided to demonstrate the performance of the proposed transmit covariance matrix design for the MISO multicasting SWIPT system.

preprint2016arXiv

Social- and Mobility-Aware Device-to-Device Content Delivery

Mobile online social network services have grown rapidly, and the huge amount of user-generated social media content propagating between users via social connections has significantly challenged the traditional content delivery paradigm: replicating all of the content generated by users to edge servers that well "fit" the receivers becomes difficult due to limited bandwidth and storage capacities. Motivated by device-to-device (D2D) communication, which allows users with smart devices to transfer content directly, we propose replicating bandwidth-intensive social content in a device-to-device manner. Based on large-scale measurement studies on social content propagation and user mobility patterns in edge-network regions, we observe that (1) device-to-device replication can significantly help users download social content from nearby neighboring peers; (2) both social propagation and mobility patterns affect how content should be replicated; (3) the replication strategies depend on regional characteristics (e.g., how users move across regions). Using these measurement insights, we propose a joint propagation- and mobility-aware content replication strategy for edge-network regions, in which social content is assigned to users in edge-network regions according to a joint consideration of the social graph, content propagation and user mobility. We formulate the replication scheduling as an optimization problem and design a distributed algorithm that uses only historical, local and partial information to solve it. Trace-driven experiments further verify the superiority of our proposal: compared with conventional pure movement-based and popularity-based approaches, our design can significantly ($2-4$ times) improve the amount of social content successfully delivered by device-to-device replication.

preprint2015arXiv

Weak values could reveal the hidden effects of quantum interactions

Due to the reduced probability of successful post-selection, weak-value amplification seems to be unavailable for parameter estimation. Here, we show theoretically that some effects due to weak interactions are present only in the properly post-selected sub-ensemble, whereas they cancel out in the total ensemble. From this point of view, the post-selection-induced weak value could be one of the feasible methods for measuring the weak interaction, since the standard measurement does not work. Additionally, we employ a system of trapped ions to simulate the weak measurement and calculate relevant results without the frequently used weak-interaction approximation.

preprint2014arXiv

Hardness of FeB4: Density functional theory investigation

A recent experimental study reported the successful synthesis of orthorhombic FeB4 with a high hardness of 62 GPa, which has reignited extensive interest in whether transition metal boride (TRB) compounds can become superhard materials. However, this contradicts some theoretical studies suggesting that transition metal boron compounds are unlikely to become superhard materials. Here, we examined the structural and electronic properties of FeB4 using density functional theory. The electronic calculations show good metallicity and covalent Fe-B bonding. Meanwhile, we extensively investigated the stress-strain relations of FeB4 under various tensile and shear loading directions. The calculated weakest tensile and shear stresses are 40 GPa and 25 GPa, respectively. Further simulations (e.g. electron localization function and bond length along the weakest loading direction) on FeB4 show that the weak Fe-B bonding is responsible for this low hardness. Moreover, these results are consistent with the Vickers hardness values (11.7 to 32.3 GPa) obtained from different empirical hardness models, which lie below the superhardness threshold of 40 GPa. Our current results suggest FeB4 is a hard material and unlikely to become superhard.

preprint2013arXiv

Fast Polarization Switching Demonstration Using Crossed-Planar Undulator in a Seeded Free Electron Laser

Fast polarization switching of light sources is required over a wide spectral range to investigate the symmetry of matter. In this Letter, we report the first experimental demonstration of the crossed-planar undulator technique at a seeded free-electron laser, which holds great promise for the full control and fast switching of the polarization of short-wavelength radiation. In the experiment, the polarization state of the coherent radiation at the 2nd harmonic of the seed laser is switched successfully. The experimental results confirm the theory and pave the way for applying the crossed-planar undulator technique to seeded X-ray free-electron lasers.

preprint2013arXiv

Frequency-doubled scattering of symmetry-breaking surface-state electrons on liquid Helium

Any system with symmetry-breaking eigenstates can effectively radiate photons at double the frequency of the incident light, which is known as second harmonic generation. Here, we study second-order nonlinear effects in the system of surface-state electrons on liquid Helium. Due to the symmetry-breaking eigenstates, we show that a Rabi oscillation between two levels of the surface-state electrons can be realized beyond the usual resonant driving. Consequently, an electromagnetic field at double the frequency of the applied driving could be effectively radiated. This can be regarded as a frequency-doubled fluorescence, and interestingly, it works in the unusual Terahertz range.

preprint2013arXiv

Photon-induced thermal effects in superconducting coplanar waveguide resonators

We experimentally investigated the optical responses of a superconducting niobium resonator. It was found that, with increasing radiation power, the resonance frequency increases monotonically below around 500 mK, decreases monotonically above around 1 K and exhibits nonmonotonic behavior at around 700 mK. These observations show that one can operate the irradiated resonator in three temperature regimes, depending on whether two-level system (TLS) effects or kinetic inductance effects dominate. Furthermore, we found that the optical responses at ultra-low temperatures can be qualitatively regarded as a photon-induced thermalization effect of TLSs, which could be utilized to achieve thermally sensitive photon detection.

preprint2012arXiv

Image encryption schemes for JPEG and GIF formats based on 3D baker with compound chaotic sequence generator

This paper proposes several methods to transplant the compound chaotic image encryption scheme with permutation based on the 3D baker map into image formats such as Joint Photographic Experts Group (JPEG) and Graphics Interchange Format (GIF). The new method avoids the lossy Discrete Cosine Transform and quantization and can encrypt and decrypt JPEG images losslessly. Our proposed method for GIF successfully preserves the animation property. The security test results indicate that the proposed methods have high security. Since the JPEG and GIF image formats are currently popular, this paper shows that the prospect of chaotic image encryption is promising.

preprint2012arXiv

Polarization control proposal for Shanghai deep ultraviolet free electron laser

In this paper, a fully coherent radiation option with controllable polarization is proposed for the Shanghai deep ultraviolet free electron laser (FEL) test facility. Intensive start-to-end simulation suggests that the two crossed planar undulators, which generate the horizontal and vertical linearly polarized FEL respectively, should be placed as close together as possible to avoid degrading the polarization performance of the final combined FEL radiation. With the phase-shifter between the two crossed radiators, Fourier-transform-limited output radiation with 100 nJ order pulse energy, 5 ps full pulse length and circular polarization degree above 90% could be achieved.

preprint2012arXiv

Quantum gates implementations in the separated ion-traps by fast laser pulses

An approach is proposed to implement the universal quantum gates between the ions confined individually in the separated traps. Instead of the typical adiabatic operations, performed for manipulating the ion-ion coupling, here the switchable couplings between ions are implemented non-adiabatically by using the fast laser pulses. Consequently, the desirable quantum gates between the ions could be implemented by using only a series of laser pulses. The proposal may be conveniently generalized to the quantum computation with the scalable ion-traps.

preprint2012arXiv

Spin-orbit couplings between distant electrons trapped individually on liquid helium

We propose an approach to entangle spins of electrons floating on liquid helium by coherently manipulating their spin-orbit interactions. The configuration consists of single electrons, confined individually on liquid helium by the microelectrodes, moving along the surface as the harmonic oscillators. It has been known that the spin of an electron could be coupled to its orbit (i.e., the vibrational motion) by properly applying a magnetic field. Based on this single electron spin-orbit coupling, here we show that a Jaynes-Cummings (JC) type interaction between the spin of an electron and the orbit of another electron at a distance could be realized via the strong Coulomb interaction between the electrons. Consequently, the proposed JC interaction could be utilized to realize a strong orbit-mediated spin-spin coupling and implement the desirable quantum information processing between the distant electrons trapped individually on liquid helium.

preprint2012arXiv

Status of polarization control experiment at Shanghai deep ultraviolet free electron laser

A polarization control experiment by utilizing a pair of crossed undulators has been proposed for the Shanghai deep ultraviolet free electron laser test facility. Numerical simulations indicate that, with the electromagnetic phase-shifter located between the two crossed planar undulators, fully coherent radiation with 100 nJ order pulse energy, 5 picoseconds pulse length and circular polarization degree above 90% could be generated. The physical design study and the preparation status of the experiment are presented in the paper.

preprint2011arXiv

Coherently manipulating cold ions in separated traps by their vibrational couplings

Recent experiments [K. R. Brown, et al., Nature 471, 196 (2011); and M. Harlander, et al., Nature 471, 200 (2011)] have demonstrated the coherent manipulations on the external vibrations of two ions, confined individually in the separated ion traps. Using these recently developed techniques, we propose here an approach to realize the coherent operations, e.g., the universal quantum gates, between the separated ion-trap qubits encoded by two internal atomic states of the trapped ions. Our proposal operates beyond the usual Lamb-Dicke limits, and could be applied to the scalable ion traps coupled by their vibrations.

preprint2011arXiv

Entangling a series of trapped ions by moving cavity bus

Entangling multiple qubits is one of the central tasks of quantum information processing. Here, we propose an approach to entangle a number of cold ions (individually trapped in a string of microtraps) by a moving cavity. The cavity is pushed to include the ions one by one with a uniform velocity, and thus the information stored in former ions could be transferred to the latter ones by such a moving cavity bus. Since the positions of the trapped ions are precisely located, the strengths and durations of the ion-cavity interactions can be exactly controlled. As a consequence, by properly setting the relevant parameters, typical multi-ion entangled states, e.g., the $W$ state for 10 ions, could be deterministically generated. The feasibility of the proposal is also discussed.

preprint2009arXiv

Jaynes-Cummings Models with trapped electrons on liquid Helium

The Jaynes-Cummings model is a typical model in quantum optics and has been realized with various physical systems (e.g., cavity QED, trapped ions, and circuit QED) of two-level atoms interacting with quantized bosonic fields. Here, we propose a new implementation of this model by using a single classical laser beam to drive an electron floating on liquid Helium. The two lowest levels of the vertical motion of the electron act as a two-level "atom", and the quantized vibration of the electron along one of the parallel directions, e.g., the $x$-direction, serves as the bosonic mode. These two degrees of freedom of the trapped electron can be coupled together by using a classical laser field. If the frequencies of the applied laser fields are properly set, the desired Jaynes-Cummings models could be effectively realized.
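For orientation, the Jaynes-Cummings Hamiltonian in its standard textbook form (rotating-wave approximation) can be written as below. The symbols are assumptions for illustration and are not notation taken from the paper: $\omega_0$ is the transition frequency of the two-level "atom" (here, the two lowest vertical levels), $\nu$ is the frequency of the bosonic mode (here, the in-plane vibration), $g$ is the coupling strength, $a$ and $a^\dagger$ are the mode operators, and $\sigma_z$, $\sigma_\pm$ are the Pauli operators of the two-level system.

```latex
H_{\mathrm{JC}} = \frac{\hbar\omega_0}{2}\,\sigma_z
  + \hbar\nu\, a^\dagger a
  + \hbar g \left( \sigma_+ a + \sigma_- a^\dagger \right)
```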

preprint2009arXiv

Jaynes-Cummings Models with trapped surface-state electrons in THz cavities

An electron floating on liquid Helium is proposed to be trapped (by a micro-electrode set below the liquid Helium) in a high-finesse cavity. The two lowest levels of the vertical motion of the electron act as a two-level "atom", which could resonantly interact with the THz cavity. In the Lamb-Dicke regime, wherein the electron's in-plane activity region is much smaller than the wavelength of the cavity mode, the famous Jaynes-Cummings model (JCM) could be realized. By applying an additional external classical laser beam to the electron, a driven JCM could also be implemented. With such a driven JCM, certain quantum states of the THz cavity field, e.g., coherent states and Schrödinger cat states, could be prepared by one-step evolution. The numerical results show that, for the typical parameters of the cavity and electron on liquid Helium, a strong coupling between the artificial atom and the THz cavity could be obtained.

preprint2008arXiv

Simplified approach to generate controlled-NOT gates with single trapped ions for arbitrary Lamb-Dicke parameters

For certain specific (or "magic") Lamb-Dicke (LD) parameters, Monroe et al. showed [Phys. Rev. A 55, R2489 (1997)] that a two-qubit quantum operation between the external and internal degrees of freedom of a single trapped ion could be implemented by applying a single carrier laser pulse. Here, we further show that such a two-qubit operation (which is equivalent to the standard CNOT gate apart from certain phase factors) could also be realized well for arbitrarily selected LD parameters. Instead of the so-called "$\pi$-pulses" used in the previous demonstrations, the durations of the pulses applied in the present proposal are required to be accurately set within the decoherence times of the ion. We also propose a simple approach using only one off-resonant (e.g., blue-sideband) laser pulse to eliminate the unwanted phase factors present in the above two-qubit operations and thereby generate the standard CNOT gates.

42 published item(s)

CAGS: Color-Adaptive Volumetric Video Streaming with Dynamic 3D Gaussian Splatting

Earthquake Source Depth Determination using Single Station Waveforms and Deep Learning

Identity-Robust Language Model Generation via Content Integrity Preservation

MacVQA: Adaptive Memory Allocation and Global Noise Filtering for Continual Visual Question Answering

MyGram: Modality-aware Graph Transformer with Global Distribution for Multi-modal Entity Alignment

Power Reinforcement Post-Training of Text-to-Image Models with Super-Linear Advantage Shaping

Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents

TBPLaS 2.0: a Tight-Binding Package for Large-scale Simulation

Utilizing Earth Foundation Models to Enhance the Simulation Performance of Hydrological Models with AlphaEarth Embeddings

FedHiSyn: A Hierarchical Synchronous Federated Learning Framework for Resource and Data Heterogeneity

Handling Data Heterogeneity with Generative Replay in Collaborative Learning for Medical Imaging

MFNet: Multi-class Few-shot Segmentation Network with Pixel-wise Metric Learning

Segmenting across places: The need for fair transfer learning with satellite imagery

SplitAVG: A heterogeneity-aware federated deep learning method for medical imaging

Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective

Anomalous interfacial dynamics of single proton charges in binary aqueous solutions

Exploiting Deep Learning for Secure Transmission in an Underlay Cognitive Radio Network

Accurate RGB-D Salient Object Detection via Collaborative Learning

DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection

Feasibility and A Fast Algorithm for Euclidean Distance Matrix Optimization with Ordinal Constraints

Multi-Domain Learning and Identity Mining for Vehicle Re-Identification

Real-time Earthquake Early Warning with Deep Learning: Application to the 2016 Central Apennines, Italy Earthquake Sequence

Resource Allocation Technique for Hybrid TDMA-NOMA System with Opportunistic Time Assignment

Super-resolved optical mapping of reactive sulfur-vacancy in 2D transition metal dichalcogenides

High-pressure Phase Stability and Superconductivity of Pnictogen Hydrides and Chemical Trends for Compressed Hydrides

Secrecy Rate Maximization for MISO Multicasting SWIPT System with Power Splitting Scheme

Social- and Mobility-Aware Device-to-Device Content Delivery

Weak values could reveal the hidden effects of quantum interactions

Hardness of FeB4: Density functional theory investigation

Fast Polarization Switching Demonstration Using Crossed-Planar Undulator in a Seeded Free Electron Laser

Frequency-doubled scattering of symmetry-breaking surface-state electrons on liquid Helium

Photon-induced thermal effects in superconducting coplanar waveguide resonators

Image encryption schemes for JPEG and GIF formats based on 3D baker with compound chaotic sequence generator

Polarization control proposal for Shanghai deep ultraviolet free electron laser

Quantum gates implementations in the separated ion-traps by fast laser pulses

Spin-orbit couplings between distant electrons trapped individually on liquid helium

Status of polarization control experiment at Shanghai deep ultraviolet free electron laser

Coherently manipulating cold ions in separated traps by their vibrational couplings

Entangling a series of trapped ions by moving cavity bus

Jaynes-Cummings Models with trapped electrons on liquid Helium

Jaynes-Cummings Models with trapped surface-state electrons in THz cavities

Simplified approach to generate controlled-NOT gates with single trapped ions for arbitrary Lamb-Dicke parameters