Source author record

Tianwen Wei

Tianwen Wei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, math.ST, Statistics Theory, Computation, Computation and Language, math.NA, math.OC, math.SP

Catalog footprint

What is connected

9 works

8 topics

4 close collaborators


preprint, 2022, arXiv

A Flexible Multi-Task Model for BERT Serving

In this demonstration, we present an efficient BERT-based multi-task (MT) framework that is particularly suitable for iterative and incremental development of tasks. The proposed framework is based on the idea of partial fine-tuning, i.e. fine-tuning only some top layers of BERT while keeping the other layers frozen. For each task, we independently train a single-task (ST) model using partial fine-tuning. Then we compress the task-specific layers in each ST model using knowledge distillation. The compressed ST models are finally merged into one MT model so that the frozen layers of the former are shared across the tasks. We exemplify our approach on eight GLUE tasks, demonstrating that it achieves both strong performance and efficiency. We have implemented our method in the utterance understanding system of XiaoAI, a commercial AI assistant developed by Xiaomi. We estimate that our model reduces the overall serving cost by 86%.
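The merging step described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the class name `MultiTaskModel`, the helper `scale`, and the use of plain functions in place of BERT layers are all hypothetical. It only illustrates why sharing the frozen layers saves work at serving time: the shared layers run once per input, however many tasks are attached.

```python
from typing import Callable, Dict, List

# A "layer" here is just a function on a list of floats (a stand-in for a BERT layer).
Layer = Callable[[List[float]], List[float]]


def scale(k: float) -> Layer:
    """Hypothetical toy layer: multiply every component by k."""
    return lambda v: [k * a for a in v]


class MultiTaskModel:
    """Toy merged model: shared frozen layers plus per-task top layers."""

    def __init__(self, frozen_layers: List[Layer]):
        self.frozen_layers = frozen_layers
        self.task_layers: Dict[str, List[Layer]] = {}
        self.frozen_layer_calls = 0  # counts shared-layer evaluations

    def add_task(self, name: str, layers: List[Layer]) -> None:
        """Attach the task-specific layers of one single-task model."""
        self.task_layers[name] = layers

    def forward(self, x: List[float]) -> Dict[str, List[float]]:
        # Shared frozen layers are evaluated once per input.
        hidden = x
        for layer in self.frozen_layers:
            hidden = layer(hidden)
            self.frozen_layer_calls += 1
        # Each task only pays for its own top layers.
        outputs = {}
        for name, layers in self.task_layers.items():
            out = hidden
            for layer in layers:
                out = layer(out)
            outputs[name] = out
        return outputs
```

Adding a new task in this sketch means calling `add_task` with that task's layers; the frozen layers and the other tasks are untouched, which mirrors the incremental development the abstract emphasizes.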

preprint, 2016, arXiv

On the subdifferential of symmetric convex functions of the spectrum for symmetric and orthogonally decomposable tensors

The subdifferential of convex functions of the singular spectrum of real matrices has been widely studied in matrix analysis, optimization and automatic control theory. Convex optimization over spaces of tensors is now gaining much interest due to its potential applications in signal processing, statistics and engineering. The goal of this paper is to present an extension of the approach by Lewis \cite{lewis1995convex} for the analysis of the subdifferential of certain convex functions of the spectrum of symmetric tensors. We give a complete characterization of the subdifferential of Schatten-type tensor norms for symmetric tensors. Some partial results in this direction are also given for orthogonally decomposable tensors.

preprint, 2015, arXiv

A convergence and asymptotic analysis of the generalized symmetric FastICA algorithm

This contribution deals with the generalized symmetric FastICA algorithm in the domain of Independent Component Analysis (ICA). The generalized symmetric version of FastICA has been shown to have the potential to achieve the Cramér-Rao Bound (CRB) by allowing the usage of different nonlinearity functions in its parallel implementations of one-unit FastICA. In spite of this appealing property, a rigorous study of the asymptotic error of the generalized symmetric FastICA algorithm is still missing in the community. In fact, all the existing results exhibit certain limitations, such as ignoring the impact of data standardization on the asymptotic statistics or being based on a heuristic approach. In this work, we aim to fill this gap. The first result of this contribution is the characterization of the limits of the generalized symmetric FastICA. It is shown that the algorithm optimizes a function that is a sum of the contrast functions used by traditional one-unit FastICA with a correction of the sign. Based on this characterization, we derive a closed-form analytic expression of the asymptotic covariance matrix of the generalized symmetric FastICA estimator using the method of estimating equations and M-estimators.

preprint, 2015, arXiv

An Overview of the Asymptotic Performance of the Family of the FastICA Algorithms

This contribution summarizes the results on the asymptotic performance of several variants of the FastICA algorithm. A number of new closed-form expressions are presented.

preprint, 2015, arXiv

Convex recovery of tensors using nuclear norm penalization

The subdifferential of convex functions of the singular spectrum of real matrices has been widely studied in matrix analysis, optimization and automatic control theory. Convex analysis and optimization over spaces of tensors is now gaining much interest due to its potential applications to signal processing, statistics and engineering. The goal of this paper is to present an application to the problem of low rank tensor recovery based on linear random measurements by extending the results of Tropp to the tensor setting.

preprint, 2015, arXiv

Joint estimation and model order selection for one dimensional ARMA models via convex optimization: a nuclear norm penalization approach

The problem of estimating ARMA models is computationally interesting due to the nonconcavity of the log-likelihood function. Recent results were based on convex minimization. Joint model selection using penalization by a convex norm, e.g. the nuclear norm of a certain matrix related to the state space formulation, was extensively studied from a computational viewpoint. The goal of the present short note is to present a theoretical study of a nuclear norm penalization based variant of the method of \cite{Bauer:Automatica05,Bauer:EconTh05} under the assumption of a Gaussian noise process.

preprint, 2015, arXiv

Sensing tensors with Gaussian filters

Sparse recovery from linear Gaussian measurements has been the subject of much investigation since the breakthrough papers \cite{CRT:IEEEIT06} and \cite{donoho2006compressed} on Compressed Sensing. Application to sparse vectors and sparse matrices via least squares penalized with sparsity promoting norms is now well understood using tools such as Gaussian mean width, statistical dimension and the notion of descent cones \cite{tropp2014convex} \cite{Vershynin:ArXivEstimation14}. Extension of these ideas to low rank tensor recovery is starting to enjoy considerable interest due to its many potential applications to Independent Component Analysis, Hidden Markov Models and Gaussian Mixture Models \cite{AnandkumarEtAl:JMLR14}, hyperspectral image analysis \cite{zhang2008tensor}, to name a few. In this paper, we demonstrate that the recent approach of \cite{Vershynin:ArXivEstimation14} provides very useful error bounds in the tensor setting using the nuclear norm or the Romera-Paredes--Pontil \cite{RomeraParedesPontil:NIPS13} penalization.
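For orientation, "least squares penalized with sparsity promoting norms" generically refers to estimators of the following form. This is a schematic statement only; the symbols $y$, $\mathcal{A}$, $X_0$, $\varepsilon$, $\lambda$ and $\|\cdot\|_{\mathrm{pen}}$ are notation assumed here for illustration, and the exact estimator and normalization analyzed in the paper may differ.

```latex
% Observations: m linear Gaussian measurements of an unknown tensor X_0
y = \mathcal{A}(X_0) + \varepsilon, \qquad y \in \mathbb{R}^m ,

% Penalized least squares with a structure-promoting norm
\hat{X} \in \operatorname*{arg\,min}_{X}\;
  \tfrac{1}{2}\,\bigl\| y - \mathcal{A}(X) \bigr\|_2^2
  \;+\; \lambda\, \| X \|_{\mathrm{pen}} ,
  \qquad \lambda > 0 .
```

Here $\|\cdot\|_{\mathrm{pen}}$ stands for the chosen penalty, for example a tensor nuclear norm or the Romera-Paredes--Pontil penalization mentioned in the abstract.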

preprint, 2015, arXiv

Von Neumann's inequality for tensors

For two matrices in $\mathbb R^{n_1\times n_2}$, the von Neumann inequality says that their scalar product is less than or equal to the scalar product of their singular spectra. In this short note, we extend this result to real tensors and provide a complete study of the equality case.
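Written out for the matrix case, the classical inequality reads as follows. The notation ($A$, $B$, $\sigma_i$, $U$, $V$) is assumed here for illustration; the tensor extension in the note itself is not reproduced.

```latex
% von Neumann's trace inequality for A, B in R^{n_1 x n_2},
% with singular values sorted in nonincreasing order
\langle A, B \rangle \;=\; \operatorname{tr}\!\bigl(A^{\top} B\bigr)
  \;\le\; \sum_{i=1}^{\min(n_1, n_2)} \sigma_i(A)\, \sigma_i(B)
  \;=\; \bigl\langle \sigma(A), \sigma(B) \bigr\rangle .

% Equality holds if and only if A and B admit a simultaneous
% singular value decomposition, i.e. there exist orthogonal U, V with
A = U \operatorname{diag}\bigl(\sigma(A)\bigr) V^{\top}, \qquad
B = U \operatorname{diag}\bigl(\sigma(B)\bigr) V^{\top} .
```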

preprint, 2014, arXiv

A study of the fixed points and spurious solutions of the FastICA algorithm

The FastICA algorithm is one of the most popular iterative algorithms in the domain of linear independent component analysis. Despite its success, it is observed that FastICA occasionally yields outcomes that do not correspond to any true solutions (known as demixing vectors) of the ICA problem. These outcomes are commonly referred to as spurious solutions. Although FastICA is among the most extensively studied ICA algorithms, the occurrence of spurious solutions is not yet completely understood by the community. In this contribution, we aim at addressing this issue. In the first part of this work, we are interested in the relationship between demixing vectors, local optimizers of the contrast function and (attractive or unattractive) fixed points of the FastICA algorithm. Characterizations of these sets are given, and an inclusion relationship is discovered. In the second part, we investigate the possible scenarios where spurious solutions occur. We show that when certain bimodal Gaussian mixture distributions are involved, there may exist spurious solutions that are attractive fixed points of FastICA. In this case, popular nonlinearities such as "gauss" or "tanh" tend to yield spurious solutions, whereas only "kurtosis" may give reliable results. Some advice is given for the practical choice of nonlinearity function.
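To make "fixed point of FastICA" concrete, here is a minimal sketch of the textbook one-unit fixed-point update with the "kurtosis" (cubic) nonlinearity, w <- E[x (w^T x)^3] - 3w followed by renormalization. It is not code from the paper. It assumes 2-D data that are already whitened, and the demo data (two uniform sources mixed by a rotation) and all function names are hypothetical choices for illustration. It shows a well-behaved case only and does not reproduce the spurious solutions studied in the paper.

```python
import math
import random


def normalize(v):
    """Scale a 2-D vector to unit Euclidean norm."""
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


def fastica_one_unit_kurtosis(samples, w0, iters=30):
    """One-unit FastICA with the cubic nonlinearity.

    Assumes `samples` is a list of whitened 2-D points
    (zero mean, identity covariance).
    """
    w = normalize(w0)
    n = len(samples)
    for _ in range(iters):
        # Empirical estimate of E[x * (w^T x)^3]
        acc0 = acc1 = 0.0
        for x0, x1 in samples:
            y3 = (w[0] * x0 + w[1] * x1) ** 3
            acc0 += x0 * y3
            acc1 += x1 * y3
        # Fixed-point update, then projection back to the unit circle
        w = normalize((acc0 / n - 3.0 * w[0], acc1 / n - 3.0 * w[1]))
    return w


def make_demo_data(n=5000, theta=0.6, seed=0):
    """Hypothetical demo: two unit-variance uniform sources mixed by a rotation.

    Returns the mixed samples and the two true demixing vectors
    (the columns of the rotation matrix).
    """
    rng = random.Random(seed)
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    samples = []
    for _ in range(n):
        src1 = rng.uniform(-half_width, half_width)
        src2 = rng.uniform(-half_width, half_width)
        samples.append((cos_t * src1 - sin_t * src2,
                        sin_t * src1 + cos_t * src2))
    return samples, ((cos_t, sin_t), (-sin_t, cos_t))
```

A typical use is `samples, demixing = make_demo_data()` followed by `w = fastica_one_unit_kurtosis(samples, (1.0, 0.2))`, then comparing `w` with the entries of `demixing` up to sign. The sign ambiguity is expected: with negative-kurtosis sources the iterate flips sign at each step.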
