Source author record

Zheng Zhao

Zheng Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Computation and Language, gr-qc, hep-th, Methodology, Applications, cond-mat.dis-nn, cond-mat.quant-gas, Emerging Technologies, hep-ph, math.DS, math.OC, math.ST, physics.app-ph, physics.optics, quant-ph, Statistics Theory

Catalog footprint

What is connected

13 works

17 topics

4 close collaborators

Preprint, 2022, arXiv

A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning

The optical neural network (ONN) is a promising hardware platform for next-generation neurocomputing due to its high parallelism, low latency, and low energy consumption. Previous ONN architectures are mainly designed for general matrix multiplication (GEMM), leading to unnecessarily large area cost and high control complexity. Here, we move beyond classical GEMM-based ONNs and propose an optical subspace neural network (OSNN) architecture, which trades the universality of weight representation for lower optical component usage, area cost, and energy consumption. We devise a butterfly-style photonic-electronic neural chip to implement our OSNN with up to 7x fewer trainable optical components compared to GEMM-based ONNs. Additionally, a hardware-aware training framework is provided to minimize the required device programming precision, lessen the chip area, and boost the noise robustness. We experimentally demonstrate the utility of our neural chip in practical image recognition tasks, showing that a measured accuracy of 94.16% can be achieved in hand-written digit recognition tasks with 3-bit weight programming precision.

Preprint, 2022, arXiv

Multidimensional Projection Filters via Automatic Differentiation and Sparse-Grid Integration

The projection filter is a technique for approximating the solutions of optimal filtering problems. In projection filters, the Kushner–Stratonovich stochastic partial differential equation that governs the propagation of the optimal filtering density is projected to a manifold of parametric densities, resulting in a finite-dimensional stochastic differential equation. Although projection filters are capable of representing complicated probability densities, their current implementations are limited to the Gaussian family or unidimensional filtering applications. This work considers a combination of numerical integration and automatic differentiation to construct projection filter algorithms for more generic problems. Specifically, we provide a detailed exposition of this combination for the manifold of the exponential family, and show how to apply the projection filter to multidimensional cases. We demonstrate numerically, through comparison with a finite-difference solution to the Kushner–Stratonovich equation and a bootstrap particle filter with systematic resampling, that the proposed algorithm retains an accurate approximation of the filtering density while requiring a comparatively low number of quadrature points. Due to the sparse-grid integration and automatic differentiation used to calculate the expected values of the natural statistics and the Fisher metric, the proposed filtering algorithms are highly scalable. They are therefore suitable for many applications in which the number of dimensions exceeds the practical limit of particle filters, but where Gaussian approximations are deemed unsatisfactory.
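The abstract names a bootstrap particle filter with systematic resampling as one of its baselines. As a rough illustration of that resampling step only (not the paper's code; the function name `systematic_resample` and its parameters are assumptions made for this sketch), a single uniform offset is drawn and then stepped through the cumulative weights at evenly spaced positions:

```python
import random


def systematic_resample(weights, u=None, rng=random):
    """Return resampled particle indices using systematic resampling.

    weights: non-negative particle weights (normalised internally).
    u: optional offset in [0, 1); drawn from rng when omitted.
    Hypothetical helper for illustration, not taken from the paper.
    """
    n = len(weights)
    total = sum(weights)
    if u is None:
        u = rng.random()
    # One random offset, then n evenly spaced positions in [0, 1).
    positions = [(u + k) / n for k in range(n)]
    indices = []
    i = 0
    cumulative = weights[0] / total
    for pos in positions:
        # Advance to the first particle whose cumulative weight covers pos.
        while pos > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices


if __name__ == "__main__":
    # A fixed offset makes the example reproducible.
    print(systematic_resample([0.1, 0.2, 0.7], u=0.5))
```

Because only one random number is drawn per resampling step, the positions are stratified across the unit interval, which is why this scheme is often preferred over independent multinomial draws.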

Preprint, 2022, arXiv

Revisiting Shallow Discourse Parsing in the PDTB-3: Handling Intra-sentential Implicits

In the PDTB-3, several thousand implicit discourse relations were newly annotated within individual sentences, adding to the over 15,000 implicit relations annotated across adjacent sentences in the PDTB-2. Given that the position of the arguments to these intra-sentential implicits is no longer as well-defined as with inter-sentential implicits, a discourse parser must identify both their location and their sense. That is the focus of the current work. The paper provides a comprehensive analysis of our results, showcasing model performance under different scenarios, pointing out limitations and noting future directions.

Preprint, 2022, arXiv

To Adapt or to Fine-tune: A Case Study on Abstractive Summarization

Recent advances in the field of abstractive summarization leverage pre-trained language models rather than train a model from scratch. However, such models are sluggish to train and accompanied by a massive overhead. Researchers have proposed a few lightweight alternatives such as smaller adapters to mitigate the drawbacks. Nonetheless, it remains uncertain whether using adapters benefits the task of summarization, in terms of improved efficiency without an unpleasant sacrifice in performance. In this work, we carry out multifaceted investigations on fine-tuning and adapters for summarization tasks with varying complexity: language, domain, and task transfer. In our experiments, fine-tuning a pre-trained language model generally attains a better performance than using adapters; the performance gap positively correlates with the amount of training data used. Notably, adapters exceed fine-tuning under extremely low-resource conditions. We further provide insights on multilinguality, model convergence, and robustness, hoping to shed light on the pragmatic choice of fine-tuning or adapters in abstractive summarization.

Preprint, 2020, arXiv

Pentaquark components in low-lying baryon resonances

We study pentaquark states of both light $q^4\bar q$ and hidden heavy $q^3 Q\bar Q$ (q = u,d,s quark in SU(3) flavor symmetry; Q = c, b quark) systems with a general group theory approach in the constituent quark model, and the spectrum of light baryon resonances in the ansatz that the $l=1$ baryon states may consist of the $q^3$ as well as $q^4\bar q$ pentaquark component. The model is fitted to ground state baryons and light baryon resonances which are believed to be normal three-quark states. The work reveals that the $N(1535)1/2^{-}$ and $N(1520)3/2^-$ may consist of a large $q^4\bar q$ component while the $N(1895)1/2^{-}$ and $N(1875)3/2^-$ are respectively their partners, and the $N^+(1685)$ might be a $q^4\bar q$ state. In addition, a new set of color-spin-flavor-spatial wave functions for $q^3 Q\bar Q$ systems in the compact pentaquark picture is constructed systematically for studying hidden charm pentaquark states.

Preprint, 2020, arXiv

Taylor Moment Expansion for Continuous-Discrete Gaussian Filtering and Smoothing

The paper is concerned with non-linear Gaussian filtering and smoothing in continuous-discrete state-space models, where the dynamic model is formulated as an Itô stochastic differential equation (SDE), and the measurements are obtained at discrete time instants. We propose a novel Taylor moment expansion (TME) Gaussian filter and smoother which approximate the moments of the SDE with a temporal Taylor expansion. Unlike classical linearisation or Itô–Taylor approaches, the Taylor expansion is formed for the moment functions directly and in the time variable, not by using a Taylor expansion on the non-linear functions in the model. We analyse the theoretical properties, including the positive definiteness of the covariance estimate and stability of the TME Gaussian filter and smoother. By numerical experiments, we demonstrate that the proposed TME Gaussian filter and smoother significantly outperform the state-of-the-art methods in terms of estimation accuracy and numerical stability.

Preprint, 2015, arXiv

Phase diagram of the 3D Anderson model for uncorrelated speckle potentials

We investigate the localization properties of atoms moving in a three-dimensional optical lattice in the presence of an uncorrelated disorder potential having the same probability distribution $P(V)$ as laser speckles. We find that the disorder-averaged (single-particle) Green's function, calculated via the coherent potential approximation, is in very good agreement with exact numerics. Using the transfer-matrix method, we compute the phase diagram in the energy-disorder plane and show that its peculiar shape can be understood from the self-consistent theory of localization. In particular, we recover the large asymmetry in the position of the mobility edge for blue and red speckles, which was recently observed numerically for correlated speckle potentials.

Preprint, 2015, arXiv

Successive Ray Refinement and Its Application to Coordinate Descent for LASSO

Coordinate descent is one of the most popular approaches for solving Lasso and its extensions due to its simplicity and efficiency. When applying coordinate descent to solving Lasso, we update one coordinate at a time while fixing the remaining coordinates. Such an update, which is usually easy to compute, greedily decreases the objective function value. In this paper, we aim to improve its computational efficiency by reducing the number of coordinate descent iterations. To this end, we propose a novel technique called Successive Ray Refinement (SRR). SRR makes use of the following ray continuation property on the successive iterations: for a particular coordinate, the value obtained in the next iteration almost always lies on a ray that starts at its previous iteration and passes through the current iteration. Motivated by this ray-continuation property, we propose that coordinate descent be performed not directly on the previous iteration but on a refined search point that has the following properties: on one hand, it lies on a ray that starts at a history solution and passes through the previous iteration, and on the other hand, it achieves the minimum objective function value among all the points on the ray. We propose two schemes for defining the search point and show that the refined search point can be efficiently obtained. Empirical results for real and synthetic data sets show that the proposed SRR can significantly reduce the number of coordinate descent iterations, especially for small Lasso regularization parameters.
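For orientation, the baseline that the abstract describes (updating one coordinate at a time while fixing the rest) can be sketched as below. This is plain cyclic coordinate descent with soft-thresholding, not the SRR refinement itself; the objective scaling 0.5 * ||y - X b||^2 + lam * ||b||_1 and the names `soft_threshold` and `lasso_coordinate_descent` are assumptions chosen for this sketch.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100, tol=1e-10):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows and y a list of targets (pure Python, no dependencies).
    Illustrative baseline only; it does not implement Successive Ray Refinement.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column never enters the model
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break  # a full sweep changed nothing noticeable
    return beta


if __name__ == "__main__":
    # Orthonormal design: the solution is the soft-thresholded targets.
    print(lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], 1.0))
```

Each coordinate update is cheap because the residual is maintained incrementally, so the total cost is driven by the number of sweeps, which is the quantity the abstract says SRR aims to reduce.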

Preprint, 2014, arXiv

Safe Screening With Variational Inequalities and Its Application to LASSO

Sparse learning techniques have been routinely used for feature selection as the resulting model usually has a small number of non-zero entries. Safe screening, which eliminates the features that are guaranteed to have zero coefficients for a certain value of the regularization parameter, is a technique for improving the computational efficiency. Safe screening is gaining increasing attention since 1) solving sparse learning formulations usually has a high computational cost, especially when the number of features is large, and 2) one needs to try several regularization parameters to select a suitable model. In this paper, we propose an approach called "Sasvi" (Safe screening with variational inequalities). Sasvi makes use of the variational inequality that provides the sufficient and necessary optimality condition for the dual problem. Several existing approaches for Lasso screening can be cast as relaxed versions of the proposed Sasvi; thus Sasvi provides a stronger safe screening rule. We further study the monotone properties of Sasvi for Lasso, based on which a sure removal regularization parameter can be identified for each feature. Experimental results on both synthetic and real data sets are reported to demonstrate the effectiveness of the proposed Sasvi for Lasso screening.

Preprint, 2013, arXiv

Safe and Efficient Screening For Sparse Support Vector Machine

Screening is an effective technique for speeding up the training process of a sparse learning model by removing the features that are guaranteed to be inactive during the process. In this paper, we present an efficient screening technique for sparse support vector machine based on variational inequality. The technique is both efficient and safe.

Preprint, 2010, arXiv

A note on the Hawking radiation calculated by the quasi-classical tunneling method

Since Parikh and Wilczek's tunneling method was proposed, there have been many generalizations, such as its application to massive charged particles' tunneling and other spacetimes. Moreover, a variant tunneling method that is independent of coordinates was recently proposed by Angheben et al. However, there are some subtleties in the calculation of Hawking radiation, particularly the so-called factor of 2 problem in calculating the Hawking temperature. The most popular opinion on this problem is that it is just a problem of the choice of coordinates. However, following other treatments, we show that this problem can also be regarded as arising from not considering the contribution from P(absorption). Moreover, we also discuss some subtleties in the balance method and give some comparisons with other treatments. In addition, as Parikh and Wilczek's original works have shown that if one takes the tunneling particles' back-reaction into account, the Hawking radiation would be modified, and that this modification is fundamentally consistent with the unitary theory, we further find that this modification is also fundamentally correlated with the laws of black hole thermodynamics. Furthermore, we show that this tunneling method may be valid only when the tunneling process is reversible.

Preprint, 2010, arXiv

Discussion on Event Horizon and Quantum Ergosphere of Evaporating Black Holes in a Tunnelling Framework

In this paper, with the Parikh-Wilczek tunnelling framework the positions of the event horizon of the Vaidya black hole and the Vaidya-Bonner black hole are calculated respectively. We find that the event horizon and the apparent horizon of these two black holes correspond respectively to the two turning points of the Hawking radiation tunnelling barrier. That is, the quantum ergosphere coincides with the tunnelling barrier. Our calculation also implies that the Hawking radiation comes from the apparent horizon.

Preprint, 2010, arXiv

Tortoise coordinate and Hawking effect in the Kinnersley spacetime

The Hawking effect from the Kinnersley spacetime is investigated using the improved Damour-Ruffini method with a new coordinate transformation. The Hawking temperature of the horizons can be obtained point by point. It is found that the Hawking temperatures of different points on the horizons are different. In particular, the Hawking temperature of the Rindler horizon is investigated. The touch between a Kinnersley black hole and its Rindler horizon is considered, and it shows that the phenomenon is related to the third law of thermodynamics.
