Source author record

Jeremy Levesley

Jeremy Levesley appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA, Computation and Language, math.CA, Applications, cond-mat.stat-mech, Numerical Analysis, physics.data-an, Artificial Intelligence, Human-Computer Interaction, Information Theory, Machine Learning, math.IT, Quantitative Methods

Catalog footprint

What is connected

16 works

13 topics

4 close collaborators

preprint · 2022 · arXiv

An Informational Space Based Semantic Analysis for Scientific Texts

One major problem in Natural Language Processing is the automatic analysis and representation of human language. Human language is ambiguous, and deeper understanding of semantics and the creation of human-to-machine interaction have required effort in creating schemes for the act of communication and in building common-sense knowledge bases for the 'meaning' in texts. This paper introduces computational methods for semantic analysis and for quantifying the meaning of short scientific texts. Computational methods that extract semantic features are used to analyse the relations between texts of messages and 'representations of situations' for a newly created large collection of scientific texts, the Leicester Scientific Corpus. The representation of science-specific meaning is standardised by replacing the situation representations, rather than psychological properties, with vectors of attributes: a list of the scientific subject categories that the text belongs to. First, this paper introduces a 'Meaning Space' in which the informational representation of meaning is extracted from the occurrence of a word in texts across the scientific categories, i.e., the meaning of a word is represented by a vector of Relative Information Gain about the subject categories. Then, the meaning space is statistically analysed for the Leicester Scientific Dictionary-Core, and we investigate 'Principal Components of the Meaning' to describe the adequate dimensions of the meaning. The research in this paper lays the foundation for the geometric representation of the meaning of texts.
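As an illustration of a "vector of Relative Information Gain about the subject categories", the sketch below uses one common definition, RIG = (H(C) - H(C | W)) / H(C), where C is membership of a category and W is presence of the word in a text. The abstract does not state the formula, so this definition, the function names and the toy corpus are all assumptions made for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a yes/no variable with P(yes) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def relative_information_gain(docs, word, category):
    """Assumed definition: (H(C) - H(C | W)) / H(C).

    docs is a list of (set_of_words, set_of_categories) pairs.
    """
    n = len(docs)
    in_cat = [category in cats for _, cats in docs]
    has_word = [word in words for words, _ in docs]
    h_c = entropy(sum(in_cat) / n)
    if h_c == 0.0:
        return 0.0  # no uncertainty about the category, so nothing to gain
    h_c_given_w = 0.0
    for flag in (True, False):
        subset = [c for c, w in zip(in_cat, has_word) if w == flag]
        if subset:
            h_c_given_w += len(subset) / n * entropy(sum(subset) / len(subset))
    return (h_c - h_c_given_w) / h_c

def meaning_vector(docs, word, categories):
    """One RIG value per category: the word's coordinates in the meaning space."""
    return [relative_information_gain(docs, word, c) for c in categories]

# Invented toy corpus, not data from the Leicester Scientific Corpus.
TOY_DOCS = [
    ({"kernel", "grid"}, {"Mathematics"}),
    ({"kernel", "spline"}, {"Mathematics"}),
    ({"corpus", "word"}, {"Linguistics"}),
    ({"corpus", "kernel"}, {"Linguistics", "Mathematics"}),
]
CATEGORIES = ["Mathematics", "Linguistics"]

print(meaning_vector(TOY_DOCS, "corpus", CATEGORIES))
```

In this toy corpus the word "corpus" appears in exactly the Linguistics texts, so its Linguistics coordinate is 1, while it is only partly informative about Mathematics.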

preprint · 2022 · arXiv

Convergence of sparse grid Gaussian convolution approximation for multi-dimensional periodic function

We consider the problem of approximating $[0,1]^{d}$-periodic functions by convolution with a scaled Gaussian kernel. We start by establishing convergence rates for functions from periodic Sobolev spaces and we show that the saturation rate is $O(h^{2})$, where $h$ is the scale of the Gaussian kernel. Taken from a discrete point of view, this result can be interpreted as the accuracy that can be achieved on the uniform grid with spacing $h$. In the discrete setting, the curse of dimensionality would place severe restrictions on the computation of the approximation. For instance, a spacing of $2^{-n}$ would provide an approximation converging at a rate of $O(2^{-2n})$ but would require $(2^{n}+1)^{d}$ grid points. To overcome this we introduce a sparse grid version of Gaussian convolution approximation, where substantially fewer grid points are required, and show that the sparse grid version delivers a saturation rate of $O(n^{d-1}2^{-2n})$. This rate is in line with what one would expect in the sparse grid setting (where the full grid error only deteriorates by a factor of order $n^{d-1}$); however, the analysis that leads to the result is novel in that it draws on results from the theory of special functions and key observations regarding the form of certain weighted geometric sums.
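The $O(h^{2})$ saturation and the $(2^{n}+1)^{d}$ full-grid cost can be seen in a one-mode example. The sketch below assumes the kernel is the normalised Gaussian density with standard deviation $h$ (the abstract does not fix the scaling convention). Under that assumption, convolving $\cos(2\pi k x)$ with the kernel multiplies it by $\exp(-2\pi^{2}k^{2}h^{2})$, so the sup-norm error is $1-\exp(-2\pi^{2}k^{2}h^{2}) \approx 2\pi^{2}k^{2}h^{2}$. The function names are hypothetical, and the sparse grid scheme itself is not implemented here.

```python
import math

def smoothing_error(h, k=1):
    """Sup-norm error after convolving cos(2*pi*k*x) with a Gaussian of std h.

    Assumes the kernel is the normalised Gaussian density with standard
    deviation h, so the k-th Fourier mode is damped by exp(-2*pi^2*k^2*h^2).
    """
    return -math.expm1(-2.0 * (math.pi * k * h) ** 2)

def full_grid_points(n, d):
    """Number of points on the uniform grid with spacing 2**-n in d dimensions."""
    return (2 ** n + 1) ** d

# Halving h should divide the error by about 4, i.e. O(h^2) behaviour,
# while the full-grid point count grows like 2**(n*d).
for n in range(3, 8):
    h = 2.0 ** -n
    ratio = smoothing_error(h) / smoothing_error(h / 2)
    print(n, full_grid_points(n, 3), f"{smoothing_error(h):.3e}", f"{ratio:.3f}")
```

The printed ratio approaches 4 as $n$ grows, which is the second-order behaviour the abstract describes, while the point count in three dimensions grows by roughly a factor of 8 per level.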

preprint · 2020 · arXiv

Convergence of Multilevel Stationary Gaussian Convolution

It is well-known that polynomial reproduction is not possible when approximating with Gaussian kernels. Quasi-interpolation schemes have been developed which use a finite number of Gaussians at different scales, which then reproduce polynomials of low degree \cite{beatson}, and thus achieve polynomial orders of convergence. At the same time, interpolation with kernels of fixed width suffers from an explosion in condition number, and information from all data points influences the approximation at any one data point (no localisation). In \cite{HL1} the authors show that, for periodic convolution with the Gaussian kernel, a multilevel scheme can give orders of approximation faster than any polynomial. In this paper we present a new multilevel quasi-interpolation algorithm, the discrete version of the algorithm in \cite{HL1}, which mimics the continuous algorithm well, to single precision accuracy, and gives excellent convergence rates for band limited periodic functions. In this paper we explain how the algorithm works, and why we achieve the numerical results we do. The estimates developed have two parts, one involving the convergence of a low degree polynomial truncation term and one involving the control of the remainder of the truncation as the algorithm proceeds.

preprint · 2020 · arXiv

Personality Traits and Drug Consumption. A Story Told by Data

This is a preprint version of the first book from the series: "Stories told by data". In this book a story is told about the psychological traits associated with drug consumption. The book includes:

- A review of published works on the psychological profiles of drug users.
- Analysis of a new original database with information on 1885 respondents and usage of 18 drugs. (Database is available online.)
- An introductory description of the data mining and machine learning methods used for the analysis of this dataset.
- The demonstration that the personality traits (five factor model, impulsivity, and sensation seeking), together with simple demographic data, give the possibility of predicting the risk of consumption of individual drugs with sensitivity and specificity above 70% for most drugs.
- The analysis of correlations of use of different substances and the description of the groups of drugs with correlated use (correlation pleiades).
- Proof of significant differences of personality profiles for users of different drugs. This is explicitly proved for benzodiazepines, ecstasy, and heroin.
- Tables of personality profiles for users and non-users of 18 substances.

The book is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners. No previous knowledge of machine learning, advanced data mining concepts or modern psychology of personality is assumed. For more detailed introduction into statistical methods we recommend several undergraduate textbooks. Familiarity with basic statistics and some experience in the use of probabilities would be helpful as well as some basic technical understanding of psychology.
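For readers unfamiliar with the two figures quoted above, sensitivity is the fraction of actual users that a classifier flags, and specificity is the fraction of non-users it correctly clears. The sketch below computes both from binary labels; the function name and the toy labels are invented for illustration and are not taken from the book's database.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = user, 0 = non-user)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)

# Invented toy labels: 4 users and 4 non-users.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(sensitivity_specificity(y_true, y_pred))  # (0.75, 0.75)
```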

preprint · 2020 · arXiv

Principal Components of the Meaning

In this paper we argue that (lexical) meaning in science can be represented in a 13-dimensional Meaning Space. This space is constructed using principal component analysis (singular value decomposition) on the matrix of word-category relative information gains, where the categories are those used by the Web of Science, and the words are taken from a reduced word set from texts in the Web of Science. We show that this reduced word set plausibly represents all texts in the corpus, so that the principal component analysis has some objective meaning with respect to the corpus. We argue that 13 dimensions are adequate to describe the meaning of scientific texts, and hypothesise about the qualitative meaning of the principal components.

preprint · 2019 · arXiv

Automatic Short Answer Grading and Feedback Using Text Mining Methods

Automatic grading is not a new approach, but the need to adapt the latest technology to automatic grading has become very important. As the technology has rapidly become more powerful at scoring exams and essays, especially from the 1990s onwards, partially or wholly automated grading systems using computational methods have evolved and have become a major area of research. In particular, the demand for scoring of natural language responses has created a need for tools that can be applied to automatically grade these responses. In this paper, we focus on the automatic grading of short answer questions, such as those typical of the UK GCSE system, and on providing students with useful feedback on their answers. We present experimental results on a dataset from the introductory computer science class at the University of North Texas. We first apply standard data mining techniques to the corpus of student answers for the purpose of measuring similarity between the student answers and the model answer. This is based on the number of common words. We then evaluate the relation between these similarities and marks awarded by scorers. We then consider an approach that groups student answers into clusters. Each cluster would be awarded the same mark, and the same feedback given to each answer in a cluster. In this manner, we demonstrate that clusters indicate the groups of students who are awarded the same or similar scores. Words in each cluster are compared to show that clusters are constructed based on how many and which words of the model answer have been used. The main novelty in this paper is that we design a model to predict marks based on the similarities between the student answers and the model answer.
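A similarity "based on the number of common words" can be sketched as follows. The tokenisation rule, the normalisation by the model answer's vocabulary size and the example answers are assumptions for illustration; the abstract does not specify the paper's preprocessing (for example, whether stop words are removed).

```python
import re

def tokens(text):
    """Lower-case the text and return its set of distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def common_word_count(student, model):
    """Number of distinct words shared by the student answer and the model answer."""
    return len(tokens(student) & tokens(model))

def overlap_score(student, model):
    """Shared words as a fraction of the model answer's vocabulary (assumed normalisation)."""
    model_words = tokens(model)
    if not model_words:
        return 0.0
    return len(tokens(student) & model_words) / len(model_words)

# Invented example answers, not taken from the University of North Texas dataset.
MODEL = "A stack is a last in first out data structure"
STUDENT = "A stack stores data, last in first out."

print(common_word_count(STUDENT, MODEL), round(overlap_score(STUDENT, MODEL), 3))
```

Scores of this kind could then be compared with the marks awarded by scorers, or used as features when grouping answers into clusters.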

preprint · 2016 · arXiv

Quasi-interpolation on a sparse grid with Gaussian

Motivated by the recent multilevel sparse kernel-based interpolation (MuSIK) algorithm proposed in [Georgoulis, Levesley and Subhan, SIAM J. Sci. Comput., 35(2), pp. A815-A831, 2013], we introduce the new quasi-multilevel sparse interpolation with kernels (Q-MuSIK) via the combination technique. The Q-MuSIK scheme achieves better convergence and run time in comparison with classical quasi-interpolation; namely, the Q-MuSIK algorithm is generally superior to the MuSIK methods in terms of run time in particular in high-dimensional interpolation problems, since there is no need to solve large algebraic systems. We subsequently propose a fast, low complexity, high-dimensional quadrature formula based on Q-MuSIK interpolation of the integrand. We present the results of numerical experimentation for both interpolation and quadrature in high dimension.

preprint · 2015 · arXiv

Fast multilevel sparse Gaussian kernels for high-dimensional approximation and integration

A fast multilevel algorithm based on directionally scaled tensor-product Gaussian kernels on structured sparse grids is proposed for interpolation of high-dimensional functions and for the numerical integration of high-dimensional integrals. The algorithm is based on the recent Multilevel Sparse Kernel-based Interpolation (MLSKI) method (Georgoulis, Levesley & Subhan, SIAM J. Sci. Comput., 35(2), pp. A815-A831, 2013), with particular focus on the fast implementation of Gaussian-based MLSKI for interpolation and integration problems of high-dimensional functions $f:[0,1]^d\to\mathbb{R}$, with $5\le d\le 10$. The MLSKI interpolation procedure is shown to be interpolatory and a fast implementation is proposed. More specifically, exploiting the tensor-product nature of anisotropic Gaussian kernels, one-dimensional cardinal basis functions on a sequence of hierarchical equidistant nodes are precomputed to machine precision, rendering the interpolation problem into a fully parallelisable ensemble of linear combinations of function evaluations. A numerical integration algorithm is also proposed, based on interpolating the (high-dimensional) integrand. A series of numerical experiments highlights the applicability of the proposed algorithm for interpolation and integration for up to 10-dimensional problems.

preprint · 2015 · arXiv

Magnetic Flux Leakage Method: Large-Scale Approximation

We consider the application of the magnetic flux leakage (MFL) method to the detection of defects in ferromagnetic (steel) tubulars. The problem setup corresponds to the cases where the distance between the casing and the point where the magnetic field is measured is small compared to the curvature radius of the undamaged casing and the scale of inhomogeneity of the magnetic field in the defect-free case. Mathematically this corresponds to the planar ferromagnetic layer in a uniform magnetic field oriented along this layer. Defects in the layer surface result in a strong deformation of the magnetic field, which provides opportunities for the reconstruction of the surface profile from measurements of the magnetic field. We deal with large-scale defects whose depth is small compared to their longitudinal sizes, these being typical of corrosive damage. Within the framework of large-scale approximation, analytical relations between the casing thickness profile and the measured magnetic field can be derived.

preprint · 2015 · arXiv

Noise-Produced Patterns in Images Constructed from Magnetic Flux Leakage Data

Magnetic flux leakage measurements help identify the position, size and shape of corrosion-related defects in steel casings used to protect boreholes drilled into oil and gas reservoirs. Images constructed from magnetic flux leakage data contain patterns related to noise inherent in the method. We investigate the patterns and their scaling properties for the case of delta-correlated input noise, and consider the implications for the method's ability to resolve defects. The analytical evaluation of the noise-produced patterns is made possible by model reduction facilitated by large-scale approximation. With appropriate modification, the approach can be employed to analyze noise-produced patterns in other situations where the data of interest are not measured directly, but are related to the measured data by a complex linear transform involving integrations with respect to spatial coordinates.

preprint · 2013 · arXiv

Lévy driven models and derivative pricing

We develop a general method for derivative pricing. This approach has its roots in Shannon's Information Theory. The notion of $λ$-analyticity of Lévy models is introduced, on the basis of which new representations of the pricing integral are obtained. It is shown that the Lévy models popular in applications are $λ$-analytic. We apply these results to derive a general algorithm for pricing European call options.

preprint · 2012 · arXiv

A Multiplier Version of the Bernstein Inequality on the Complex Sphere

We prove a multiplier version of the Bernstein inequality on the complex sphere. Included in this is a new result relating a bivariate sum involving Jacobi polynomials and Gegenbauer polynomials, which relates the sum of reproducing kernels on spaces of polynomials irreducibly invariant under the unitary group, with the reproducing kernel of the sum of these spaces, which is irreducibly invariant under the action of the orthogonal group.

preprint · 2012 · arXiv

Approximation on the complex sphere

We develop new elements of harmonic analysis on the complex sphere, on the basis of which Bernstein's, Jackson's and Kolmogorov's inequalities are established. We apply these results to get order-sharp estimates of $m$-term approximations. The results obtained are a synthesis of new results on classical orthogonal polynomials, harmonic analysis on manifolds and geometric properties of Euclidean spaces.

preprint · 2012 · arXiv

Multilevel Sparse Kernel-Based Interpolation

A multilevel kernel-based interpolation method, suitable for moderately high-dimensional function interpolation problems, is proposed. The method, termed multilevel sparse kernel-based interpolation (MLSKI, for short), uses both level-wise and direction-wise multilevel decomposition of structured (or mildly unstructured) interpolation data sites in conjunction with the application of kernel-based interpolants with different scaling in each direction. The multilevel interpolation algorithm is based on a hierarchical decomposition of the data sites, whereby at each level the detail is added to the interpolant by interpolating the resulting residual of the previous level. On each level, anisotropic radial basis functions are used for solving a number of small interpolation problems, which are subsequently linearly combined to produce the interpolant. MLSKI can be viewed as an extension of $d$-boolean interpolation (which is closely related to ideas in the sparse grid and hyperbolic cross literature) to kernel-based functions, within the hierarchical multilevel framework to achieve accelerated convergence. Numerical experiments suggest that the new algorithm is numerically stable and efficient for the reconstruction of large data in $\mathbb{R}^{d}\times \mathbb{R}$, for $d = 2, 3, 4$, with tens or even hundreds of thousands of data points. Also, MLSKI appears to be generally superior to classical radial basis function methods in terms of complexity, run time and convergence, at least for large data sets.

preprint · 2012 · arXiv

On the density of polyharmonic splines

This article treats the question of fundamentality of the translates of a polyharmonic spline kernel (also known as a surface spline) in the space of continuous functions on a compact set $Ω\subset \mathbb{R}^d$ when the translates are restricted to $Ω$. Fundamentality is not hard to demonstrate when a low degree polynomial may be added or when translates are permitted to lie outside of $Ω$; the challenge of this problem stems from the presence of the boundary, for which all successful approximation schemes require an added polynomial. When $Ω$ is the unit ball, we demonstrate that translates of polyharmonic splines are fundamental by considering two related problems: the fundamentality in the space of functions vanishing at the boundary and fundamentality of the restricted kernel in the space of continuous functions on the sphere. This gives rise to a new approximation scheme composed of two parts: one which approximates purely on $\partial Ω$, and a second part involving a shift invariant approximant of a function vanishing outside of a neighborhood $Ω$.

preprint · 2010 · arXiv

Time Step Expansions and the Invariant Manifold Approach to Lattice Boltzmann Models

The classical method for deriving the macroscopic dynamics of a lattice Boltzmann system is to use a combination of different approximations and expansions. Usually a Chapman-Enskog analysis is performed, either on the continuous Boltzmann system, or its discrete velocity counterpart. Separately a discrete time approximation is introduced to the discrete velocity Boltzmann system, to achieve a practically useful approximation to the continuous system, for use in computation. Thereafter, with some additional arguments, the dynamics of the Chapman-Enskog expansion are linked to the discrete time system to produce the dynamics of the completely discrete scheme. In this paper we put forward a different route to the macroscopic dynamics. We begin with the system discrete in both velocity space and time. We hypothesize that the alternating steps of advection and relaxation, common to all lattice Boltzmann schemes, give rise to a slow invariant manifold. We perform a time step expansion of the discrete time dynamics using the invariance of the manifold. Finally we calculate the dynamics arising from this system. By choosing the fully discrete scheme as a starting point we avoid mixing approximations and arrive at a general form of the microscopic dynamics up to the second order in the time step. We calculate the macroscopic dynamics of two commonly used lattice schemes up to the first order, and hence find the precise form of the deviation from the Navier-Stokes equations in the dissipative term, arising from the discretization of velocity space. Finally we perform a short wave perturbation on the dynamics of these example systems, to find the necessary conditions for their stability.
