Source author record

Dag Tjøstheim

Dag Tjøstheim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

math.ST, Statistics Theory, Methodology, Machine Learning

Catalog footprint

What is connected

5 works

4 topics

4 close collaborators

preprint, 2022, arXiv

Statistical embedding: Beyond principal components

There has been intense recent activity in the embedding of very high dimensional and nonlinear data structures, much of it in the data science and machine learning literature. We survey this activity in four parts. In the first part we cover nonlinear methods such as principal curves, multidimensional scaling, local linear methods, ISOMAP, graph based methods and diffusion mapping, kernel based methods and random projections. The second part is concerned with topological embedding methods, in particular mapping topological properties into persistence diagrams and the Mapper algorithm. Another type of data set with tremendous growth is very high-dimensional network data. The task considered in part three is how to embed such data in a vector space of moderate dimension to make the data amenable to traditional techniques such as clustering and classification. Arguably this is the part where the contrast between algorithmic machine learning methods and statistical modeling, the so-called stochastic block modeling, is at its greatest. In the paper, we discuss the pros and cons of the two approaches. The final part of the survey deals with embedding in $\mathbb{R}^2$, i.e. visualization. Three methods are presented: $t$-SNE, UMAP and LargeVis based on methods in parts one, two and three, respectively. The methods are illustrated and compared on two simulated data sets: one consisting of a triplet of noisy Ranunculoid curves, and one consisting of networks of increasing complexity generated with stochastic block models and with two types of nodes.

preprint, 2020, arXiv

Nonlinear spectral analysis: A local Gaussian approach

The spectral distribution $f(ω)$ of a stationary time series $\{Y_t\}_{t\in\mathbb{Z}}$ can be used to investigate whether or not periodic structures are present in $\{Y_t\}_{t\in\mathbb{Z}}$, but $f(ω)$ has some limitations due to its dependence on the autocovariances $γ(h)$. For example, $f(ω)$ cannot distinguish white i.i.d. noise from GARCH-type models (whose terms are dependent, but uncorrelated), which implies that $f(ω)$ can be an inadequate tool when $\{Y_t\}_{t\in\mathbb{Z}}$ contains asymmetries and nonlinear dependencies. Asymmetries between the upper and lower tails of a time series can be investigated by means of the local Gaussian autocorrelations introduced in Tjøstheim and Hufthammer (2013), and these local measures of dependence can be used to construct the local Gaussian spectral density presented in this paper. A key feature of the new local spectral density is that it coincides with $f(ω)$ for Gaussian time series, which implies that it can be used to detect non-Gaussian traits in the time series under investigation. In particular, if $f(ω)$ is flat, then peaks and troughs of the new local spectral density can indicate nonlinear traits, which might reveal local periodic phenomena that remain undetected in an ordinary spectral analysis.
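
The limitation described in this abstract, that a GARCH-type series is uncorrelated yet dependent, can be seen in a small simulation. The sketch below is illustrative only and is not the local Gaussian method of the paper: it simulates a GARCH(1,1) series and compares the lag-1 sample autocorrelation of $Y_t$ with that of $Y_t^2$. The function names `simulate_garch` and `sample_acf`, the parameter values, the sample size and the seed are all assumptions chosen for the example.

```python
import numpy as np

def simulate_garch(n, omega=0.1, alpha=0.3, beta=0.4, seed=0):
    """Simulate a GARCH(1,1) series Y_t = sigma_t * Z_t with Z_t ~ N(0, 1).

    Parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    y = np.empty(n)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the stationary variance
    for t in range(n):
        y[t] = np.sqrt(sigma2) * z[t]
        # Conditional variance for the next step depends on the current value.
        sigma2 = omega + alpha * y[t] ** 2 + beta * sigma2
    return y

def sample_acf(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

y = simulate_garch(20000)
acf_levels = sample_acf(y, 1)        # close to zero: the terms are uncorrelated
acf_squares = sample_acf(y ** 2, 1)  # clearly positive: the terms are dependent

print(f"lag-1 sample ACF of Y_t:   {acf_levels:.3f}")
print(f"lag-1 sample ACF of Y_t^2: {acf_squares:.3f}")
```

Because the autocovariances $γ(h)$ of such a series vanish for $h \neq 0$, any quantity built only from them, including $f(ω)$, looks the same as for i.i.d. noise, while the squared series reveals the dependence.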

preprint, 2016, arXiv

Estimation for single-index and partially linear single-index integrated models

Estimation mainly for two classes of popular models, single-index and partially linear single-index models, is studied in this paper. Such models feature nonstationarity. Orthogonal series expansion is used to approximate the unknown integrable link functions in the models and a profile approach is used to derive the estimators. The findings include the dual rate of convergence of the estimators for the single-index models and a trio of convergence rates for the partially linear single-index models. A new central limit theorem is established for a plug-in estimator of the unknown link function. Meanwhile, a considerable extension to a class of partially nonlinear single-index models is discussed in Section 4. Monte Carlo simulation verifies these theoretical results. An empirical study furnishes an application of the proposed estimation procedures in practice.

preprint, 2016, arXiv

Estimation in nonlinear regression with Harris recurrent Markov chains

In this paper, we study parametric nonlinear regression under the Harris recurrent Markov chain framework. We first consider the nonlinear least squares estimators of the parameters in the homoskedastic case, and establish asymptotic theory for the proposed estimators. Our results show that the convergence rates for the estimators rely not only on the properties of the nonlinear regression function, but also on the number of regenerations for the Harris recurrent Markov chain. Furthermore, we discuss the estimation of the parameter vector in a conditional volatility function, and apply our results to the nonlinear regression with $I(1)$ processes and derive an asymptotic distribution theory which is comparable to that obtained by Park and Phillips [Econometrica 69 (2001) 117-161]. Some numerical studies including simulation and empirical application are provided to examine the finite sample performance of the proposed approaches and results.

preprint, 2007, arXiv

Exploring spatial nonlinearity using additive approximation

We propose to approximate the conditional expectation of a spatial random variable given its nearest-neighbour observations by an additive function. The setting is meaningful in practice and requires no unilateral ordering. It is capable of catching nonlinear features in spatial data and exploring local dependence structures. Our approach is different from both Markov field methods and disjunctive kriging. The asymptotic properties of the additive estimators have been established for $α$-mixing spatial processes by extending the theory of the backfitting procedure to the spatial case. This facilitates the construction of confidence intervals for the component functions, although the asymptotic biases have to be estimated via (wild) bootstrap. Simulation results are reported. Applications to real data illustrate that the improvement in describing the data over the auto-normal scheme is significant when nonlinearity or non-Gaussianity is pronounced.