Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
19 works
0 followers
23 topics
4 close collaborators

Published work

19 published items

preprint 2026 arXiv

Caracal: Causal Architecture via Spectral Mixing

The scalability of Large Language Models to long sequences is hindered by the quadratic cost of attention and the limitations of positional encodings. To address these, we introduce Caracal, a novel architecture that replaces attention with a parameter-efficient, O(L log(L)) Multi-Head Fourier (MHF) module. Our contributions are threefold: (1) We leverage the Fast Fourier Transform (FFT) for sequence mixing, inherently addressing both bottlenecks mentioned above. (2) We apply a frequency-domain causal masking technique that enforces autoregressive capabilities via asymmetric padding and truncation, overcoming a critical barrier for Fourier-based generative models. (3) Unlike efficient models relying on hardware-specific implementations (e.g., Mamba), we use standard library operators. This ensures robust portability, eliminating common deployment barriers. Evaluations demonstrate that Caracal performs competitively with Transformer and SSM baselines, offering a scalable and simple pathway for efficient long-sequence modeling. Code is available in the Appendix.
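The padding-and-truncation idea in contribution (2) can be illustrated with the textbook FFT-based causal convolution. This is only a sketch under assumptions: the function names and the single real-valued mixing kernel are hypothetical, and it is not the paper's MHF module. It shows why zero-padding to twice the sequence length and keeping the first L outputs makes position t depend only on positions 0..t.

```python
import numpy as np

def causal_fft_mix(x, kernel):
    """Causally mix a length-L sequence with a length-L kernel via FFT.

    Zero-padding to 2L turns the FFT's circular convolution into a linear
    one; keeping only the first L outputs drops the tail, so y[t] depends
    on x[0..t] only. (Generic technique, assumed for illustration.)
    """
    L = x.shape[0]
    n = 2 * L                      # padded length, avoids wrap-around
    X = np.fft.rfft(x, n=n)        # rfft zero-pads the input up to n
    K = np.fft.rfft(kernel, n=n)
    y = np.fft.irfft(X * K, n=n)   # pointwise product = convolution in time
    return y[:L]                   # truncate to the causal part

def causal_direct(x, kernel):
    """O(L^2) reference: y[t] = sum_{s<=t} kernel[s] * x[t - s]."""
    L = len(x)
    return np.array(
        [sum(kernel[s] * x[t - s] for s in range(t + 1)) for t in range(L)]
    )
```

The FFT version costs O(L log L) and uses only standard NumPy operators, while the direct sum is the quadratic reference it should agree with.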

preprint 2022 arXiv

Identity-Sensitive Knowledge Propagation for Cloth-Changing Person Re-identification

Cloth-changing person re-identification (CC-ReID), which aims to match person identities under clothing changes, has become an active research topic in recent years. However, typical biometrics-based CC-ReID methods often require cumbersome pose or body part estimators to learn cloth-irrelevant features from human biometric traits, which comes with high computational costs. Besides, the performance is significantly limited due to the resolution degradation of surveillance images. To address the above limitations, we propose an effective Identity-Sensitive Knowledge Propagation framework (DeSKPro) for CC-ReID. Specifically, a Cloth-irrelevant Spatial Attention module is introduced to eliminate the distraction of clothing appearance by acquiring knowledge from the human parsing module. To mitigate the resolution degradation issue and mine identity-sensitive cues from human faces, we propose to restore the missing facial details using prior facial knowledge, which is then propagated to a smaller network. After training, the extra computations for human parsing or face restoration are no longer required. Extensive experiments show that our framework outperforms state-of-the-art methods by a large margin. Our code is available at https://github.com/KimbingNg/DeskPro.

preprint 2022 arXiv

Implicit Channel Learning for Machine Learning Applications in 6G Wireless Networks

With the deployment of the fifth generation (5G) wireless systems gathering momentum across the world, possible technologies for 6G are under active research discussions. In particular, the role of machine learning (ML) in 6G is expected to enhance and aid emerging applications such as virtual and augmented reality, vehicular autonomy, and computer vision. This will result in large segments of wireless data traffic comprising image, video and speech. The ML algorithms process these for classification/recognition/estimation through the learning models located on cloud servers. This requires wireless transmission of data from edge devices to the cloud server. Channel estimation, handled separately from the recognition step, is critical for accurate learning performance. Toward combining the learning for both the channel and the ML data, we introduce implicit channel learning to perform the ML tasks without estimating the wireless channel. Here, the ML models are trained with channel-corrupted datasets in place of nominal data. Without channel estimation, the proposed approach exhibits approximately 60% improvement in image and speech classification tasks for diverse scenarios such as millimeter wave and IEEE 802.11p vehicular channels.

preprint 2022 arXiv

Preserving Dense Features for Ki67 Nuclei Detection

Nuclei detection is a key task in Ki67 proliferation index estimation in breast cancer images. Deep learning algorithms have shown strong potential in nuclei detection tasks. However, they face challenges when applied to pathology images with dense medium and overlapping nuclei since fine details are often diluted or completely lost by early maxpooling layers. This paper introduces an optimized UV-Net architecture, specifically developed to recover nuclear details with high-resolution through feature preservation for Ki67 proliferation index computation. UV-Net achieves an average F1-score of 0.83 on held-out test patch data, while other architectures obtain 0.74-0.79. On tissue microarrays (unseen) test data obtained from multiple centers, UV-Net's accuracy exceeds other architectures by a wide margin, including 9-42% on Ontario Veterinary College, 7-35% on Protein Atlas and 0.3-3% on University Health Network.

preprint 2022 arXiv

RobustAnalog: Fast Variation-Aware Analog Circuit Design Via Multi-task RL

Analog/mixed-signal circuit design is one of the most complex and time-consuming stages in the whole chip design process. Due to various process, voltage, and temperature (PVT) variations from chip manufacturing, analog circuits inevitably suffer from performance degradation. Although there has been plenty of work on automating analog circuit design under the typical condition, limited research has been done on exploring robust designs under real and unpredictable silicon variations. Automatic analog design against variations requires prohibitive computation and time costs. To address the challenge, we present RobustAnalog, a robust circuit design framework that involves the variation information in the optimization process. Specifically, circuit optimizations under different variations are considered as a set of tasks. Similarities among tasks are leveraged and competitions are alleviated to realize a sample-efficient multi-task training. Moreover, RobustAnalog prunes the task space according to the current performance in each iteration, leading to a further simulation cost reduction. In this way, RobustAnalog can rapidly produce a set of circuit parameters that satisfies diverse constraints (e.g. gain, bandwidth, noise...) across variations. We compare RobustAnalog with Bayesian optimization, Evolutionary algorithm, and Deep Deterministic Policy Gradient (DDPG) and demonstrate that RobustAnalog can significantly reduce required optimization time by 14-30 times. Therefore, our study provides a feasible method to handle various real silicon conditions.

preprint 2022 arXiv

SpinQ Triangulum: a commercial three-qubit desktop quantum computer

SpinQ Triangulum is the second generation of the desktop quantum computers designed and manufactured by SpinQ Technology. SpinQ's desktop quantum computer series, based on a room-temperature NMR spectrometer, provides lightweight, cost-effective and maintenance-free quantum computing platforms that aim to provide real-device experience for quantum computing education at the K-12 and college levels. These platforms also feature quantum control design capabilities for studying quantum control and quantum noise. Compared with the first-generation product, the two-qubit SpinQ Gemini, Triangulum features a three-qubit QPU, smaller dimensions (61 × 33 × 56 cm^3) and a lower weight (40 kg). Furthermore, the magnetic field is more stable and quantum control is more accurate. This paper introduces the system design of Triangulum and its new features. As an example of performing quantum computing tasks, we present the implementation of the Harrow-Hassidim-Lloyd (HHL) algorithm on Triangulum, demonstrating Triangulum's capability of undertaking complex quantum computing tasks. SpinQ will continue to develop desktop quantum computing platforms with more qubits. Meanwhile, a simplified version of SpinQ Gemini, namely Gemini Mini (https://www.spinq.cn/products#geminiMini-anchor), has been recently realised. Gemini Mini is much more portable (20 × 35 × 26 cm^3, 14 kg) and affordable for most K-12 schools around the world.

preprint 2022 arXiv

Supervised Contrastive CSI Representation Learning for Massive MIMO Positioning

A similarity metric is crucial for massive MIMO positioning utilizing channel state information (CSI). In this letter, we propose a novel massive MIMO CSI similarity learning method via a deep convolutional neural network (DCNN) and contrastive learning. A contrastive loss function is designed considering multiple positive and negative CSI samples drawn from a training dataset. The DCNN encoder is trained using the loss so that positive samples are mapped to points close to the anchor's encoding, while encodings of negative samples are kept away from the anchor's in the representation space. Evaluation results of fingerprint-based positioning on a real-world CSI dataset show that the learned similarity metric improves positioning accuracy significantly compared with other known state-of-the-art methods.
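A loss with several positives and negatives per anchor can be sketched as below. The exact loss used in the letter is not given in the abstract, so this is an assumed InfoNCE-style form with cosine similarity; the function name, the `temperature` parameter and the toy embeddings are all hypothetical.

```python
import numpy as np

def multi_positive_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Assumed multi-positive contrastive loss on encoder outputs.

    anchor: (d,), positives: (P, d), negatives: (N, d).
    Each positive is scored against all positives and negatives; the loss
    is low when positives are close to the anchor and negatives are far.
    """
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    a = normalize(anchor)
    pos = normalize(positives) @ a / temperature   # cosine logits, shape (P,)
    neg = normalize(negatives) @ a / temperature   # cosine logits, shape (N,)
    logits = np.concatenate([pos, neg])
    m = logits.max()                               # stabilise the log-sum-exp
    log_denominator = m + np.log(np.exp(logits - m).sum())
    return float(np.mean(log_denominator - pos))   # mean of -log softmax(pos)

# Toy embeddings standing in for DCNN encodings of CSI samples.
anchor = np.array([1.0, 0.0])
near = np.array([[1.0, 0.0], [0.9, 0.1]])
far = np.array([[-1.0, 0.0], [0.0, 1.0]])
good = multi_positive_contrastive_loss(anchor, near, far)
bad = multi_positive_contrastive_loss(anchor, far, near)
```

Here `good` is the loss when the positives really are the nearby encodings, and `bad` is the loss when the labels are swapped; minimising the loss pushes the encoder toward the first arrangement.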

preprint 2021 arXiv

Obfuscation of Images via Differential Privacy: From Facial Images to General Images

Due to the pervasiveness of image capturing devices in every-day life, images of individuals are routinely captured. Although this has enabled many benefits, it also infringes on personal privacy. A promising direction in research on obfuscation of facial images has been the work in the k-same family of methods which employ the concept of k-anonymity from database privacy. However, there are a number of deficiencies of k-anonymity that carry over to the k-same methods, detracting from their usefulness in practice. In this paper, we first outline several of these deficiencies and discuss their implications in the context of facial obfuscation. We then develop a framework through which we obtain a formal differentially private guarantee for the obfuscation of facial images in generative machine learning models. Our approach provides a provable privacy guarantee that is not susceptible to the outlined deficiencies of k-same obfuscation and produces photo-realistic obfuscated output. In addition, we demonstrate through experimental comparisons that our approach can achieve comparable utility to k-same obfuscation in terms of preservation of useful features in the images. Furthermore, we propose a method to achieve differential privacy for any image (i.e., without restriction to facial images) through the direct modification of pixel intensities. Although the addition of noise to pixel intensities does not provide the high visual quality obtained via generative machine learning models, it offers greater versatility by eliminating the need for a trained model. We demonstrate that our proposed use of the exponential mechanism in this context is able to provide superior visual quality to pixel-space obfuscation using the Laplace mechanism.

preprint 2021 arXiv

SpinQ Gemini: a desktop quantum computer for education and research

SpinQ Gemini is a commercial desktop quantum computer designed and manufactured by SpinQ Technology. It is an integrated hardware-software system. The first generation product with two qubits was launched in January 2020. The hardware is based on an NMR spectrometer, with permanent magnets providing a magnetic field of about 1 T. SpinQ Gemini operates at room temperature (0-30 °C) and is lightweight (55 kg with a volume of 70 × 40 × 80 cm^3), cost-effective (under 50k USD), and maintenance-free. SpinQ Gemini aims to provide real-device experience for quantum computing education for K-12 and at the college level. It also features quantum control design capabilities that benefit researchers studying quantum control and quantum noise. Since its first launch, SpinQ Gemini has been shipped to institutions in Canada, Taiwan and Mainland China. This paper introduces the system design of SpinQ Gemini, from hardware to software. We also demonstrate examples of performing quantum computing tasks on SpinQ Gemini, including one task for a variational quantum eigensolver of a two-qubit Heisenberg model. The next generations of SpinQ quantum computing devices will adopt models with more qubits and advanced control functions for researchers at comparable cost, as well as simplified models at much lower cost (under 5k USD) for K-12 education. We believe that low-cost portable quantum computer products will facilitate hands-on experience for teaching quantum computing at all levels and prepare younger generations of students and researchers for the future of quantum technologies.

preprint 2020 arXiv

Accelerating Incremental Gradient Optimization with Curvature Information

This paper studies an acceleration technique for the incremental aggregated gradient (IAG) method through the use of curvature information for solving strongly convex finite sum optimization problems. These optimization problems of interest arise in large-scale learning applications. Our technique utilizes a curvature-aided gradient tracking step to produce accurate gradient estimates incrementally using Hessian information. We propose and analyze two methods utilizing the new technique, the curvature-aided IAG (CIAG) method and the accelerated CIAG (A-CIAG) method, which are analogous to the gradient method and Nesterov's accelerated gradient method, respectively. Setting $\kappa$ to be the condition number of the objective function, we prove R-linear convergence rates of $1 - \frac{4c_0 \kappa}{(\kappa+1)^2}$ for the CIAG method and $1 - \sqrt{\frac{c_1}{2\kappa}}$ for the A-CIAG method, where $c_0, c_1 \leq 1$ are constants inversely proportional to the distance between the initial point and the optimal solution. When the initial iterate is close to the optimal solution, the R-linear convergence rates match those of the gradient and accelerated gradient methods, even though CIAG and A-CIAG operate in an incremental setting with strictly lower computation complexity. Numerical experiments confirm our findings. The source code used for this paper can be found at http://github.com/hoitowai/ciag/.
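As a quick check of the claim that the CIAG rate matches the gradient method near the optimum, one can take the best case $c_0 = 1$ (an assumption made here for illustration; the abstract only states $c_0 \leq 1$) and simplify the quoted rate:

```latex
\[
1 - \frac{4\kappa}{(\kappa+1)^2}
  = \frac{(\kappa+1)^2 - 4\kappa}{(\kappa+1)^2}
  = \left(\frac{\kappa-1}{\kappa+1}\right)^{2}
\]
```

The right-hand side is the square of the classical gradient descent contraction factor $(\kappa-1)/(\kappa+1)$. For example, $\kappa = 100$ gives $(99/101)^2 \approx 0.9608$, while the A-CIAG expression with $c_1 = 1$ gives $1 - \sqrt{1/200} \approx 0.9293$, reflecting the $1/\sqrt{\kappa}$ dependence typical of accelerated methods.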

preprint 2020 arXiv

Chip-scale Full-Stokes Spectropolarimeter in Silicon Photonic Circuits

Wavelength-dependent polarization state of light carries crucial information about light-matter interactions. However, its measurement is limited to bulky, energy-consuming devices, which prohibits many modern, portable applications. Here, we propose and demonstrate a chip-scale spectropolarimeter implemented using a CMOS-compatible silicon photonics technology. Four compact Vernier microresonator spectrometers are monolithically integrated with a broadband polarimeter consisting of a 2D nanophotonic antenna and a polarimetric circuit to achieve full-Stokes spectropolarimetric analysis. The proposed device offers a solid-state spectropolarimetry solution with a small footprint of 1 × 0.6 mm^2 and low power consumption of 360 mW. Full-Stokes spectral detection across a broad spectral range of 50 nm with a resolution of 1 nm is demonstrated in characterizing a material possessing structural chirality. The proposed device may enable a broader application of spectropolarimetry in the fields ranging from biomedical diagnostics and chemical analysis to observational astronomy.

preprint 2020 arXiv

Differential Privacy Via a Truncated and Normalized Laplace Mechanism

When querying databases containing sensitive information, the privacy of individuals stored in the database has to be guaranteed. Such guarantees are provided by differentially private mechanisms which add controlled noise to the query responses. However, most such mechanisms do not take into consideration the valid range of the query being posed. Thus, noisy responses that fall outside of this range may potentially be produced. To rectify this and therefore improve the utility of the mechanism, the commonly used Laplace distribution can be truncated to the valid range of the query and then normalized. However, such a data-dependent operation of normalization leaks additional information about the true query response thereby violating the differential privacy guarantee. Here, we propose a new method which preserves the differential privacy guarantee through a careful determination of an appropriate scaling parameter for the Laplace distribution. We also generalize the privacy guarantee in the context of the Laplace distribution to account for data-dependent normalization factors and study this guarantee for different classes of range constraint configurations. We provide derivations of the optimal scaling parameter (i.e., the minimal value that preserves differential privacy) for each class or provide an approximation thereof. As a consequence of this work, one can use the Laplace distribution to answer queries in a range-adherent and differentially private manner.
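The truncate-then-normalize step described above can be sketched with inverse-CDF sampling. This is an illustration under assumptions: the function names are hypothetical, and the scale `b` is simply passed in. Choosing `b` so that the differential privacy guarantee survives the data-dependent normalization is the paper's contribution and is not reproduced here.

```python
import math
import random

def laplace_cdf(x, mu, b):
    """CDF of the Laplace distribution with location mu and scale b."""
    z = (x - mu) / b
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def laplace_inv_cdf(p, mu, b):
    """Inverse CDF (quantile function) of Laplace(mu, b), for 0 < p < 1."""
    if p < 0.5:
        return mu + b * math.log(2.0 * p)
    return mu - b * math.log(2.0 * (1.0 - p))

def truncated_laplace_sample(true_answer, b, lo, hi, rng=random):
    """Sample Laplace noise around true_answer, restricted to [lo, hi].

    Drawing the uniform variate between CDF(lo) and CDF(hi) is equivalent
    to truncating the density to [lo, hi] and renormalizing it.
    """
    p_lo = laplace_cdf(lo, true_answer, b)
    p_hi = laplace_cdf(hi, true_answer, b)
    u = rng.uniform(p_lo, p_hi)
    x = laplace_inv_cdf(u, true_answer, b)
    return min(max(x, lo), hi)   # guard against floating-point overshoot
```

Every response stays inside the valid query range, which is the utility gain the abstract describes; the privacy guarantee depends entirely on how `b` is set.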

preprint 2020 arXiv

Joint Embedding in Named Entity Linking on Sentence Level

Named entity linking maps an ambiguous mention in a document to an entity in a knowledge base. The task is challenging because a mention in a document can have multiple candidate entities. A mention that appears multiple times in a document is difficult to link, since the contexts around its appearances can conflict. In addition, training datasets are small because mentions are linked to their entities manually. Among the many studies reported in the literature, recent embedding methods learn vectors of entities from the training dataset at the document level. To address these issues, we focus on linking entities to mentions at the sentence level, which reduces the noise introduced by different appearances of the same mention in a document, at the expense of having less information to use. We propose a new unified embedding method by maximizing the relationships learned from knowledge graphs. We confirm the effectiveness of our method in our experimental studies.

preprint 2020 arXiv

Push-Pull Gradient Methods for Distributed Optimization in Networks

In this paper, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider new distributed gradient-based methods where each node maintains two estimates, namely, an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents' objective functions. From the viewpoint of an agent, the information about the gradients is pushed to the neighbors, while the information about the decision variable is pulled from the neighbors hence giving the name "push-pull gradient methods". The methods utilize two different graphs for the information exchange among agents, and as such, unify the algorithms with different types of distributed architecture, including decentralized (peer-to-peer), centralized (master-slave), and semi-centralized (leader-follower) architecture. We show that the proposed algorithms and their many variants converge linearly for strongly convex and smooth objective functions over a network (possibly with unidirectional data links) in both synchronous and asynchronous random-gossip settings. In particular, under the random-gossip setting, "push-pull" is the first class of algorithms for distributed optimization over directed graphs. Moreover, we numerically evaluate our proposed algorithms in both scenarios, and show that they outperform other existing linearly convergent schemes, especially for ill-conditioned problems and networks that are not well balanced.
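The two-estimate scheme described above can be sketched as follows. The update form is an assumption consistent with the abstract: decision variables are pulled through a row-stochastic matrix `R`, gradient trackers are pushed through a column-stochastic matrix `C`. The four-agent graphs, the scalar quadratic costs and the step size are hypothetical choices for illustration only.

```python
import numpy as np

def push_pull(grad, x0, R, C, alpha, iters):
    """Assumed push-pull iteration with gradient tracking.

    x[i] is agent i's estimate of the optimal decision variable and
    y[i] is its running estimate of the aggregate gradient.
    """
    x = x0.copy()
    g = grad(x)
    y = g.copy()                      # trackers start at the local gradients
    for _ in range(iters):
        x_new = R @ (x - alpha * y)   # pull decision variables from in-neighbors
        g_new = grad(x_new)
        y = C @ y + g_new - g         # push gradient estimates to out-neighbors
        x, g = x_new, g_new
    return x

# Local costs f_i(x) = 0.5 * a[i] * (x - b[i])**2, so grad_i = a[i] * (x - b[i]).
a = np.array([1.0, 2.0, 1.5, 1.0])
b = np.array([0.0, 1.0, -2.0, 4.0])
grad = lambda x: a * (x - b)
x_star = (a * b).sum() / a.sum()      # minimizer of the summed cost

n = 4
R = np.zeros((n, n))
C = np.zeros((n, n))
for i in range(n):
    R[i, i] = R[i, (i - 1) % n] = 1.0   # agent i pulls from itself and i-1
    C[i, i] = C[(i + 1) % n, i] = 1.0   # agent i pushes to itself and i+1
R[0, 2] = 1.0                           # extra pull link 2 -> 0
C[2, 0] = 1.0                           # extra push link 0 -> 2
R /= R.sum(axis=1, keepdims=True)       # make every row sum to 1
C /= C.sum(axis=0, keepdims=True)       # make every column sum to 1

x_final = push_pull(grad, np.zeros(n), R, C, alpha=0.02, iters=2000)
```

The pull and push graphs differ on purpose, and neither matrix is doubly stochastic, which is the directed-graph setting the abstract emphasizes. With this small step size all entries of `x_final` should agree with `x_star`.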

preprint 2020 arXiv

Self-Refining Deep Symmetry Enhanced Network for Rain Removal

Rain removal aims to remove the rain streaks on rain images. The state-of-the-art methods are mostly based on Convolutional Neural Network (CNN). However, as CNN is not equivariant to object rotation, these methods are unsuitable for dealing with the tilted rain streaks. To tackle this problem, we propose Deep Symmetry Enhanced Network (DSEN) that is able to explicitly extract the rotation equivariant features from rain images. In addition, we design a self-refining mechanism to remove the accumulated rain streaks in a coarse-to-fine manner. This mechanism reuses DSEN with a novel information link which passes the gradient flow to the higher stages. Extensive experiments on both synthetic and real-world rain images show that our self-refining DSEN yields the top performance.

preprint 2020 arXiv

SNEAP: A Fast and Efficient Toolchain for Mapping Large-Scale Spiking Neural Network onto NoC-based Neuromorphic Platform

The spiking neural network (SNN), as the third generation of artificial neural networks, has been widely adopted in vision and audio tasks. Nowadays, many neuromorphic platforms support SNN simulation and adopt a Network-on-Chip (NoC) architecture for multi-core interconnection. However, interconnection brings huge area overhead to the platform. Moreover, run-time communication on the interconnection has a significant effect on the total power consumption and performance of the platform. In this paper, we propose a toolchain called SNEAP for mapping SNNs to multi-core neuromorphic platforms, which aims to reduce the energy and latency brought by spike communication on the interconnection. SNEAP includes two key steps: partitioning the SNN to reduce the spikes communicated between partitions, and mapping the partitions of the SNN to the NoC to reduce the average hop of spikes under the constraint of hardware resources. Compared with other toolchains, SNEAP reduces the spikes communicated on the NoC interconnection by more and spends less time in the partitioning phase. Moreover, SNEAP reduces the average hop of spikes by more within a given time period, which effectively reduces the energy and latency on the NoC-based neuromorphic platform. The experimental results show that SNEAP can achieve a 418x reduction in end-to-end execution time, and reduce energy consumption and spike latency, on average, by 23% and 51% respectively, compared with SpiNeMap.
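The mapping objective, the average hop of spikes, can be made concrete with a small sketch. It assumes a 2D mesh NoC where the hop count between two cores is their Manhattan distance; the abstract does not state the topology, and the function name, spike counts and placements are hypothetical.

```python
def average_hop(spike_counts, placement):
    """Spike-weighted average hop count for a placement on a 2D mesh.

    spike_counts maps (src_partition, dst_partition) to a number of spikes;
    placement maps each partition to the (row, col) of its core.
    """
    total_spikes = sum(spike_counts.values())
    total_hops = 0
    for (src, dst), count in spike_counts.items():
        (r1, c1), (r2, c2) = placement[src], placement[dst]
        total_hops += count * (abs(r1 - r2) + abs(c1 - c2))  # Manhattan distance
    return total_hops / total_spikes

# Partition 0 sends most of its spikes to partition 1.
spike_counts = {(0, 1): 90, (0, 2): 10}
heavy_pair_adjacent = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
heavy_pair_apart = {0: (0, 0), 1: (1, 1), 2: (0, 1)}
hops_adjacent = average_hop(spike_counts, heavy_pair_adjacent)  # 1.1
hops_apart = average_hop(spike_counts, heavy_pair_apart)        # 1.9
```

Placing the heavily communicating partitions on neighboring cores lowers the average hop, which is the quantity the mapping step tries to reduce under the hardware resource constraints.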

preprint 2020 arXiv

The Unusual Eruption of the Extragalactic Classical Nova M31N 2017-09a

M31N 2017-09a is a classical nova and was observed for some 160 days following its initial eruption, during which time it underwent a number of bright secondary outbursts. The light-curve is characterized by continual variation with excursions of at least 0.5 magnitudes on a daily time-scale. The lower envelope of the eruption suggests that a single power-law can describe the decline rate. The eruption is relatively long with $t_2 = 111$, and $t_3 = 153$ days.

preprint 2019 arXiv

A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates

This paper proposes a novel proximal-gradient algorithm for a decentralized optimization problem with a composite objective containing smooth and non-smooth terms. Specifically, the smooth and non-smooth terms are dealt with by gradient and proximal updates, respectively. The proposed algorithm is closely related to a previous algorithm, PG-EXTRA (Shi et al., 2015), but has a few advantages. First of all, agents use uncoordinated step-sizes, and the stable upper bounds on step-sizes are independent of network topologies. The step-sizes depend on local objective functions, and they can be as large as those of gradient descent. Secondly, for the special case without non-smooth terms, linear convergence can be achieved under the strong convexity assumption. The dependence of the convergence rate on the objective functions and the network are separated, and the convergence rate of the new algorithm is as good as one of the two convergence rates that match the typical rates for the general gradient descent and the consensus averaging. We provide numerical experiments to demonstrate the efficacy of the introduced algorithm and validate our theoretical discoveries.

preprint 2019 arXiv

Automatic image-domain Moire artifact reduction method in grating-based x-ray interferometry imaging

The aim of this study is to demonstrate the feasibility of removing the image Moire artifacts caused by system inaccuracies in grating-based x-ray interferometry imaging systems via a convolutional neural network (CNN) technique. Instead of minimizing these inconsistencies between the acquired phase stepping data via certain optimized signal retrieval algorithms, our newly proposed CNN-based method reduces the Moire artifacts in the image domain via a learned image post-processing procedure. To ease the training data preparation, we propose to synthesize the training data from numerical natural images and experimentally obtained Moire artifact-only images. Moreover, a fast signal processing method has also been developed to generate the needed large number of high-quality Moire artifact-only images from a finite number of acquired experimental phase stepping data. Experimental results show that the CNN method is able to remove Moire artifacts effectively, while maintaining the signal accuracy and image resolution.