Source author record

Stephen Clark

Stephen Clark appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computation and Language, Artificial Intelligence, math.SP, math-ph, math.MP, Machine Learning, Computer Vision, Information Retrieval, math.CT, Applications, econ.GN, Logic in Computer Science, math.CA, Multiagent Systems, physics.soc-ph, q-fin.EC, quant-ph

Catalog footprint

What is connected

25 works

17 topics

4 close collaborators


preprint, 2022, arXiv

The Conceptual VAE

In this report we present a new model of concepts, based on the framework of variational autoencoders, which is designed to have attractive properties such as factored conceptual domains, and at the same time be learnable from data. The model is inspired by, and closely related to, the Beta-VAE model of concepts, but is designed to be more closely connected with language, so that the names of concepts form part of the graphical model. We provide evidence that our model -- which we call the Conceptual VAE -- is able to learn interpretable conceptual representations from simple images of coloured shapes together with the corresponding concept labels. We also show how the model can be used as a concept classifier, and how it can be adapted to learn from fewer labels per instance. Finally, we formally relate our model to Gardenfors' theory of conceptual spaces, showing how the Gaussians we use to represent concepts can be formalised in terms of "fuzzy concepts" in such a space.
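As a rough illustration of how a Gaussian can act as a "fuzzy concept", the sketch below scores a point by an unnormalised diagonal Gaussian, so that membership is 1 at the concept's centre and decays towards 0 away from it. The function name `membership`, the two toy domains, and all numeric values are assumptions made for this example; the report's own formalisation may differ.

```python
import math
from typing import Sequence


def membership(point: Sequence[float],
               mean: Sequence[float],
               std: Sequence[float]) -> float:
    """Graded membership in [0, 1] from an unnormalised diagonal Gaussian.

    Each coordinate is treated as its own domain (an assumption), so the
    overall membership is the product of per-domain memberships.
    """
    exponent = sum(((x - m) / s) ** 2 for x, m, s in zip(point, mean, std))
    return math.exp(-0.5 * exponent)


# Hypothetical concept with two domains (for example, colour and size).
concept_mean = [0.2, 0.7]
concept_std = [0.1, 0.1]

at_centre = membership([0.2, 0.7], concept_mean, concept_std)
off_centre = membership([0.4, 0.7], concept_mean, concept_std)
```

A point at the centre receives full membership, and a point two standard deviations away in one domain receives a much lower score.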

preprint, 2021, arXiv

Formalising Concepts as Grounded Abstractions

The notion of concept has been studied for centuries, by philosophers, linguists, cognitive scientists, and researchers in artificial intelligence (Margolis & Laurence, 1999). There is a large literature on formal, mathematical models of concepts, including a whole sub-field of AI -- Formal Concept Analysis -- devoted to this topic (Ganter & Obiedkov, 2016). Recently, researchers in machine learning have begun to investigate how methods from representation learning can be used to induce concepts from raw perceptual data (Higgins, Sonnerat, et al., 2018). The goal of this report is to provide a formal account of concepts which is compatible with this latest work in deep learning. The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces. The mathematics of partial orders and lattices is a standard tool for modelling conceptual spaces (Ch.2, Mitchell (1997), Ganter and Obiedkov (2016)); however, there is no formal work that we are aware of which defines a conceptual lattice on top of a representation that is induced using unsupervised deep learning (Goodfellow et al., 2016). The advantages of partially-ordered lattice structures are that these provide natural mechanisms for use in concept discovery algorithms, through the meets and joins of the lattice.

preprint, 2021, arXiv

Imitating Interactive Intelligence

A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, grounded language comprehension and production, and multi-agent social interaction. To build agents that can robustly interact with humans, we would ideally train them while they interact with humans. However, this is presently impractical. Therefore, we approximate the role of the human with another learned agent, and use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour. Rigorously evaluating our agents poses a great challenge, so we develop a variety of behavioural tests, including evaluation by humans who watch videos of agents or interact directly with them. These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone. Further, we demonstrate that agent capabilities generalise beyond literal experiences in the dataset. Finally, we train evaluation models whose ratings of agents agree well with human judgement, thus permitting the evaluation of new agent models without additional effort. Taken together, our results in this virtual environment provide evidence that large-scale human behavioural imitation is a promising tool to create intelligent, interactive agents, and the challenge of reliably evaluating such agents is possible to surmount.

preprint, 2020, arXiv

Environmental drivers of systematicity and generalization in a situated agent

The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparatively generic agent architecture that exhibits strong performance on these tests. We then identify three aspects of the training regime and environment that make a significant difference to its performance: (a) the number of object/word experiences in the training set; (b) the visual invariances afforded by the agent's perspective, or frame of reference; and (c) the variety of visual input inherent in the perceptual aspect of the agent's perception. Our findings indicate that the degree of generalisation that networks exhibit can depend critically on particulars of the environment in which a given task is instantiated. They further suggest that the propensity for neural networks to generalise in systematic ways may increase if, like human children, those networks have access to many frames of richly varying, multi-modal observations as they learn.

preprint, 2020, arXiv

Learning to Personalize for Web Search Sessions

The task of session search focuses on using interaction data to improve relevance for the user's next query at the session level. In this paper, we formulate session search as a personalization task under the framework of learning to rank. Personalization approaches re-rank results to match a user model. Such user models are usually accumulated over time based on the user's browsing behaviour. We use a pre-computed and transparent set of user models based on concepts from the social science literature. Interaction data are used to map each session to these user models. Novel features are then estimated based on such models as well as sessions' interaction data. Extensive experiments on test collections from the TREC session track show statistically significant improvements over current session search algorithms.

preprint, 2020, arXiv

Learning to Segment Actions from Observation and Narration

We apply a generative segmental model of task structure, guided by narration, to action segmentation in video. We focus on unsupervised and weakly-supervised settings where no action labels are known during training. Despite its simplicity, our model performs competitively with previous work on a dataset of naturalistic instructional videos. Our model allows us to vary the sources of supervision used in training, and we find that both task structure and narrative language provide large benefits in segmentation quality.

preprint, 2020, arXiv

Probing Emergent Semantics in Predictive Agents via Question Answering

Recent work has shown how predictive modeling can endow agents with rich knowledge of their surroundings, improving their ability to act in complex environments. We propose question-answering as a general paradigm to decode and understand the representations that such agents develop, applying our method to two recent approaches to predictive modeling: action-conditional CPC (Guo et al., 2018) and SimCore (Gregor et al., 2019). After training agents with these predictive objectives in a visually-rich, 3D environment with an assortment of objects, colors, shapes, and spatial configurations, we probe their internal state representations with synthetic (English) questions, without backpropagating gradients from the question-answering decoder into the agent. The performance of different agents when probed this way reveals that they learn to encode factual, and seemingly compositional, information about objects, properties and spatial relations from their physical environment. Our approach is intuitive, i.e. humans can easily interpret responses of the model as opposed to inspecting continuous vectors, and model-agnostic, i.e. applicable to any modeling approach. By revealing the implicit knowledge of objects, quantities, properties and relations acquired by agents as they learn, question-conditional agent probing can stimulate the design and development of stronger predictive learning objectives.

preprint, 2020, arXiv

Who voted for a No Deal Brexit? A Composition Model of Great Britains 2019 European Parliamentary Elections

The purpose of this paper is to use the votes cast at the 2019 European elections held in the United Kingdom to re-visit the analysis conducted subsequent to its 2016 European Union referendum vote. This exercise provides a staging post on public opinion as the United Kingdom moves to leave the European Union during 2020. A composition data analysis in a seemingly unrelated regression framework is adopted that respects the compositional nature of the vote outcome; each outcome is a share, the shares add up to 100%, and each outcome is related to the alternatives. Contemporary explanatory data for each counting area is sourced from the themes of socio-demographics, employment, life satisfaction and place. The study finds that there are still strong and stark divisions in the United Kingdom, defined by age, qualifications, employment and place. The use of a compositional analysis approach produces challenges with regard to the interpretation of these models, but marginal plots are seen to aid the interpretation somewhat.
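To make the "shares that add up to 100%" constraint concrete, the sketch below applies the additive log-ratio (ALR) transform, one standard way of mapping a composition to unconstrained coordinates before regression. Whether the paper uses ALR or another log-ratio transform is not stated in the abstract, so this choice, the function names, and the toy vote shares are all assumptions.

```python
import math
from typing import List, Sequence


def alr(shares: Sequence[float]) -> List[float]:
    """Additive log-ratio transform, using the last share as the reference."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    if any(s <= 0 for s in shares):
        raise ValueError("shares must be strictly positive")
    reference = shares[-1]
    return [math.log(s / reference) for s in shares[:-1]]


def alr_inverse(coords: Sequence[float]) -> List[float]:
    """Map unconstrained ALR coordinates back to shares that sum to 1."""
    exps = [math.exp(c) for c in coords] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Invented three-way vote shares for a single counting area.
shares = [0.5, 0.3, 0.2]
coords = alr(shares)          # two unconstrained real numbers
recovered = alr_inverse(coords)
```

Because the inverse always returns shares that sum to one, predictions made in the transformed space respect the compositional constraint automatically, at the cost of coefficients that are harder to interpret directly.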

preprint, 2018, arXiv

Latent Tree Learning with Differentiable Parsers: Shift-Reduce Parsing and Chart Parsing

Latent tree learning models represent sentences by composing their words according to an induced parse tree, all based on a downstream task. These models often outperform baselines which use (externally provided) syntax trees to drive the composition order. This work contributes (a) a new latent tree learning model based on shift-reduce parsing, with competitive downstream performance and non-trivial induced trees, and (b) an analysis of the trees learned by our shift-reduce model and by a chart-based model.
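For readers unfamiliar with shift-reduce parsing, the sketch below shows how a sequence of shift and reduce actions determines a binary tree over a sentence. In a latent tree model the actions would be chosen by a learned component; here they are supplied by hand, and the function name `shift_reduce` and the example sentence are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def shift_reduce(words: Sequence[str], actions: Sequence[str]) -> Tree:
    """Build a binary tree from words using "shift" and "reduce" actions."""
    buffer: List[str] = list(words)
    stack: List[Tree] = []
    for action in actions:
        if action == "shift":
            # Move the next word from the buffer onto the stack.
            stack.append(buffer.pop(0))
        elif action == "reduce":
            # Combine the top two stack items into one constituent.
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer or len(stack) != 1:
        raise ValueError("actions do not yield a single complete tree")
    return stack[0]


tree = shift_reduce(
    ["the", "cat", "sleeps"],
    ["shift", "shift", "reduce", "shift", "reduce"],
)
```

A sentence of n words needs exactly n shifts and n - 1 reduces, so different action orders correspond to different bracketings of the same words.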

preprint, 2017, arXiv

Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs

We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural language chart parser. Our model simultaneously optimises both the composition function and the parser, thus eliminating the need for externally-provided parse trees which are normally required for Tree-LSTM. It can therefore be seen as a tree-based RNN that is unsupervised with respect to the parse trees. As it is fully differentiable, our model is easily trained with an off-the-shelf gradient descent method and backpropagation. We demonstrate that it achieves better performance compared to various supervised Tree-LSTM architectures on a textual entailment task and a reverse dictionary task.

preprint, 2016, arXiv

Characterization of self-adjoint extensions for discrete symplectic systems

All self-adjoint extensions of the minimal linear relation associated with the discrete symplectic system are characterized. In particular, for the scalar case on a finite discrete interval, some equivalent forms and the uniqueness of the given expression are discussed, and the Krein--von Neumann extension is described explicitly. In addition, a limit point criterion for symplectic systems is established. The result partially generalizes even a classical limit point criterion for the second order Sturm--Liouville difference equations.

preprint, 2016, arXiv

On discrete symplectic systems: Associated maximal and minimal linear relations and nonhomogeneous problems

In this paper we characterize the definiteness of the discrete symplectic system, study a nonhomogeneous discrete symplectic system, and introduce the minimal and maximal linear relations associated with these systems. Fundamental properties of the corresponding deficiency indices, including a relationship between the number of square summable solutions and the dimension of the defect subspace, are also derived. Moreover, a sufficient condition for the existence of a densely defined operator associated with the symplectic system is provided.

preprint, 2016, arXiv

Virtual Embodiment: A Scalable Long-Term Strategy for Artificial Intelligence Research

Meaning has been called the "holy grail" of a variety of scientific disciplines, ranging from linguistics to philosophy, psychology and the neurosciences. The field of Artificial Intelligence (AI) is very much a part of that list: the development of sophisticated natural language semantics is a sine qua non for achieving a level of intelligence comparable to humans. Embodiment theories in cognitive science hold that human semantic representation depends on sensori-motor experience; the abundant evidence that human meaning representation is grounded in the perception of physical reality leads to the conclusion that meaning must depend on a fusion of multiple (perceptual) modalities. Despite this, AI research in general, and its subdisciplines such as computational linguistics and computer vision in particular, have focused primarily on tasks that involve a single modality. Here, we propose virtual embodiment as an alternative, long-term strategy for AI research that is multi-modal in nature and that allows for the kind of scalability required to develop the field coherently and incrementally, in an ethically responsible fashion.

preprint, 2015, arXiv

Principal Solutions Revisited

The main objective of this paper is to identify principal solutions associated with Sturm-Liouville operators on arbitrary open intervals $(a,b) \subseteq \mathbb{R}$, as introduced by Leighton and Morse in the scalar context in 1936 and by Hartman in the matrix-valued situation in 1957, with Weyl-Titchmarsh solutions, as long as the underlying Sturm-Liouville differential expression is nonoscillatory (resp., disconjugate or bounded from below near an endpoint) and in the limit point case at the endpoint in question. In addition, we derive an explicit formula for Weyl-Titchmarsh functions in this case (the latter appears to be new in the matrix-valued context).

preprint, 2014, arXiv

Learning Type-Driven Tensor-Based Meaning Representations

This paper investigates the learning of 3rd-order tensors representing the semantics of transitive verbs. The meaning representations are part of a type-driven tensor-based semantic framework, from the newly emerging field of compositional distributional semantics. Standard techniques from the neural networks literature are used to learn the tensors, which are tested on a selectional preference-style task with a simple 2-dimensional sentence space. Promising results are obtained against a competitive corpus-based baseline. We argue that extending this work beyond transitive verbs, and to higher-dimensional sentence spaces, is an interesting and challenging problem for the machine learning community to consider.
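As a minimal sketch of what a 3rd-order verb tensor does, the code below contracts such a tensor with a subject vector and an object vector to produce a vector in a 2-dimensional sentence space. The index order (sentence, subject, object), the function name `apply_verb`, and all numeric values are assumptions for illustration; in the paper the tensor entries are learned rather than set by hand.

```python
from typing import List

Vector = List[float]
Tensor3 = List[List[List[float]]]  # indexed [sentence][subject][object]


def apply_verb(verb: Tensor3, subj: Vector, obj: Vector) -> Vector:
    """Contract a 3rd-order verb tensor with subject and object vectors."""
    return [
        sum(
            verb[s][i][j] * subj[i] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj))
        )
        for s in range(len(verb))
    ]


# Toy values, invented for this example only.
verb = [
    [[1.0, 0.0], [0.0, 1.0]],  # slice feeding sentence dimension 0
    [[0.0, 1.0], [1.0, 0.0]],  # slice feeding sentence dimension 1
]
subj = [1.0, 2.0]
obj = [3.0, 4.0]

sentence = apply_verb(verb, subj, obj)
```

With 2-dimensional nouns and a 2-dimensional sentence space, the tensor has 2 x 2 x 2 = 8 parameters; the parameter count grows with the product of the three dimensions, which is part of why higher-dimensional sentence spaces are harder to learn.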

preprint, 2014, arXiv

The Frobenius anatomy of word meanings I: subject and object relative pronouns

This paper develops a compositional vector-based semantics of subject and object relative pronouns within a categorical framework. Frobenius algebras are used to formalise the operations required to model the semantics of relative pronouns, including passing information between the relative clause and the modified noun phrase, as well as copying, combining, and discarding parts of the relative clause. We develop two instantiations of the abstract semantics, one based on a truth-theoretic approach and one based on corpus statistics.

preprint, 2014, arXiv

The Frobenius anatomy of word meanings II: possessive relative pronouns

Within the categorical compositional distributional model of meaning, we provide semantic interpretations for the subject and object roles of the possessive relative pronoun 'whose'. This is done in terms of Frobenius algebras over compact closed categories. These algebras and their diagrammatic language expose how meanings of words in relative clauses interact with each other. We show how our interpretation is related to Montague-style semantics and provide a truth-theoretic interpretation. We also show how vector spaces provide a concrete interpretation and provide preliminary corpus-based experimental evidence. In a prequel to this paper, we used similar methods and dealt with the case of subject and object relative pronouns.

preprint, 2014, arXiv

Using Sentence Plausibility to Learn the Semantics of Transitive Verbs

The functional approach to compositional distributional semantics considers transitive verbs to be linear maps that transform the distributional vectors representing nouns into a vector representing a sentence. We conduct an initial investigation that uses a matrix consisting of the parameters of a logistic regression classifier trained on a plausibility task as a transitive verb function. We compare our method to a commonly used corpus-based method for constructing a verb matrix and find that the plausibility training may be more effective for disambiguation tasks.

preprint, 2013, arXiv

A quantum teleportation inspired algorithm produces sentence meaning from word meaning and grammatical structure

We discuss an algorithm which produces the meaning of a sentence given meanings of its words, and its resemblance to quantum teleportation. In fact, this protocol was the main source of inspiration for this algorithm which has many applications in the area of Natural Language Processing.

preprint, 2012, arXiv

Boundary Data Maps and Krein's Resolvent Formula for Sturm-Liouville Operators on a Finite Interval

We continue the study of boundary data maps, that is, generalizations of spectral parameter dependent Dirichlet-to-Neumann maps for (three-coefficient) Sturm-Liouville operators on the finite interval $(a,b)$, to more general boundary conditions. While earlier studies of boundary data maps focused on the case of general separated boundary conditions at $a$ and $b$, the present work develops a unified treatment for all possible self-adjoint boundary conditions (i.e., separated as well as non-separated ones). In the course of this paper we describe the connections with Krein's resolvent formula for self-adjoint extensions of the underlying minimal Sturm-Liouville operator (parametrized in terms of boundary conditions), with some emphasis on the Krein extension, develop the basic trace formulas for resolvent differences of self-adjoint extensions, especially, in terms of the associated spectral shift functions, and describe the connections between various parametrizations of all self-adjoint extensions, including the precise relation to von Neumann's basic parametrization in terms of unitary maps between deficiency subspaces.

preprint, 2010, arXiv

Boundary Data Maps for Schrodinger Operators on a Compact Interval

We provide a systematic study of boundary data maps, that is, $2 \times 2$ matrix-valued Dirichlet-to-Neumann and more generally, Robin-to-Robin maps, associated with one-dimensional Schrodinger operators on a compact interval $[0,R]$ with separated boundary conditions at $0$ and $R$. Most of our results are formulated in the non-self-adjoint context. Our principal results include explicit representations of these boundary data maps in terms of the resolvent of the underlying Schrodinger operator and the associated boundary trace maps, Krein-type resolvent formulas relating Schrodinger operators corresponding to different (separated) boundary conditions, and a derivation of the Herglotz property of boundary data maps (up to right multiplication by an appropriate diagonal matrix) in the special self-adjoint case.

preprint, 2010, arXiv

Concrete Sentence Spaces for Compositional Distributional Models of Meaning

Coecke, Sadrzadeh, and Clark (arXiv:1003.4394v1 [cs.CL]) developed a compositional model of meaning for distributional semantics, in which each word in a sentence has a meaning vector and the distributional meaning of the sentence is a function of the tensor products of the word vectors. Abstractly speaking, this function is the morphism corresponding to the grammatical structure of the sentence in the category of finite dimensional vector spaces. In this paper, we provide a concrete method for implementing this linear meaning map, by constructing a corpus-based vector space for the type of sentence. Our construction method is based on structured vector spaces whereby meaning vectors of all sentences, regardless of their grammatical structure, live in the same vector space. Our proposed sentence space is the tensor product of two noun spaces, in which the basis vectors are pairs of words each augmented with a grammatical role. This enables us to compare meanings of sentences by simply taking the inner product of their vectors.
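The sketch below illustrates the idea that sentence vectors living in the tensor product of two noun spaces can be compared with an inner product. The particular construction used here, weighting the outer product of subject and object by a verb matrix entry by entry, is an assumption chosen to keep the example short, as are the function names and all toy values; the paper's corpus-based construction should be consulted for the actual method.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def outer(u: Vector, v: Vector) -> Matrix:
    """Outer (tensor) product of two noun vectors."""
    return [[a * b for b in v] for a in u]


def sentence_vector(verb: Matrix, subj: Vector, obj: Vector) -> Vector:
    """Flattened vector in the noun-tensor-noun space for "subj verb obj"."""
    pair = outer(subj, obj)
    return [
        verb[i][j] * pair[i][j]
        for i in range(len(subj))
        for j in range(len(obj))
    ]


def inner(x: Vector, y: Vector) -> float:
    """Inner product used to compare two sentence vectors."""
    return sum(a * b for a, b in zip(x, y))


# Toy noun vectors and verb weights, invented for this example.
verb = [[1.0, 2.0], [3.0, 4.0]]
s1 = sentence_vector(verb, [1.0, 0.0], [0.0, 1.0])
s2 = sentence_vector(verb, [1.0, 1.0], [1.0, 1.0])
similarity = inner(s1, s2)
```

Because both sentences are mapped into the same space, the comparison does not depend on how either sentence was built up grammatically.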

preprint, 2010, arXiv

Mathematical Foundations for a Compositional Distributional Model of Meaning

We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner-product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how the information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our 'categorical model' which involves constraining the scalars of the vector spaces to the semiring of Booleans results in a Montague-style Boolean-valued semantics.
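A worked instance may help make the 'lifting' step concrete. The block below shows the pregroup type reduction for a transitive sentence and the corresponding linear map on vector spaces. The choice of a transitive sentence as the example, and the symbols $n$, $s$, $N$, $S$ and $\epsilon_N$, are assumptions made for this illustration, following the usual conventions for pregroup grammars.

```latex
% Types: noun n, sentence s, transitive verb n^r s n^l.
% Pregroup axioms used: n n^r \le 1 and n^l n \le 1.
n \cdot (n^{r}\, s\, n^{l}) \cdot n
  \;=\; (n\, n^{r}) \cdot s \cdot (n^{l}\, n)
  \;\le\; 1 \cdot s \cdot 1
  \;=\; s

% Lifted morphism in the category of finite-dimensional vector spaces,
% where \epsilon_N denotes the inner product on the noun space N:
\epsilon_N \otimes 1_S \otimes \epsilon_N \;:\;
  N \otimes (N \otimes S \otimes N) \otimes N \;\longrightarrow\; S
```

Applied to a subject vector, a verb tensor in $N \otimes S \otimes N$, and an object vector, this map pairs the subject and object with the verb's two noun slots and leaves a vector in the sentence space $S$.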

preprint, 2010, arXiv

Minimal Rank Decoupling of Full-Lattice CMV Operators with Scalar- and Matrix-Valued Verblunsky Coefficients

Relations between half- and full-lattice CMV operators with scalar- and matrix-valued Verblunsky coefficients are investigated. In particular, the decoupling of full-lattice CMV operators into a direct sum of two half-lattice CMV operators by a perturbation of minimal rank is studied. Contrary to the Jacobi case, decoupling a full-lattice CMV matrix by changing one of the Verblunsky coefficients results in a perturbation of twice the minimal rank. The explicit form for the minimal rank perturbation and the resulting two half-lattice CMV matrices are obtained. In addition, formulas relating the Weyl--Titchmarsh $m$-functions (resp., matrices) associated with the involved CMV operators and their Green's functions (resp., matrices) are derived.

preprint, 2010, arXiv

Weyl-Titchmarsh Theory and Borg-Marchenko-type Uniqueness Results for CMV Operators with Matrix-Valued Verblunsky Coefficients

We prove local and global versions of Borg-Marchenko-type uniqueness theorems for half-lattice and full-lattice CMV operators (CMV for Cantero, Moral, and Velazquez) with matrix-valued Verblunsky coefficients. While our half-lattice results are formulated in terms of matrix-valued Weyl-Titchmarsh functions, our full-lattice results involve the diagonal and main off-diagonal Green's matrices. We also develop the basics of Weyl-Titchmarsh theory for CMV operators with matrix-valued Verblunsky coefficients as this is of independent interest and an essential ingredient in proving the corresponding Borg-Marchenko-type uniqueness theorems.
