Researcher profile

Steven Schockaert

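Steven Schockaert's listed preprints cover pre-trained language models, cross-lingual word embeddings, few-shot image classification, description logics, answer set programming, and machine learning interatomic potentials.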

Published work

9 published items

preprint, 2026, arXiv

Benchmarking Compositional Generalisation for Machine Learning Interatomic Potentials

Machine Learning Interatomic Potentials play a fundamental role in computational chemistry and materials science, enabling applications from molecular dynamics simulations to drug design and materials discovery. While recent approaches can estimate inter-atomic forces with high precision, it remains unclear to what extent they can generalise to previously unseen molecules. Do they learn the compositional structure of chemistry, capturing how molecular fragments and their combinations determine properties, or do they primarily learn to interpolate patterns that are specific to the training examples? To address this question, we propose a benchmark consisting of four tasks that require some form of compositional generalisation. In each task, models are tested on molecules that were unseen during training, but the training data is chosen such that generalisation to the test examples should be feasible for models that learn the underlying physical principles. Our empirical analysis shows that the considered tasks are highly challenging for state-of-the-art models, with errors on out-of-distribution examples often an order of magnitude higher than on in-distribution examples, even when using foundation models that have been pre-trained on millions of molecules.

preprint, 2022, arXiv

BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Identify Analogies?

Analogies play a central role in human commonsense reasoning. The ability to recognize analogies such as "eye is to seeing what ear is to hearing", sometimes referred to as analogical proportions, shapes how we structure knowledge and understand language. Surprisingly, however, the task of identifying such analogies has not yet received much attention in the language model era. In this paper, we analyze the capabilities of transformer-based language models on this unsupervised task, using benchmarks obtained from educational settings, as well as more commonly used datasets. We find that off-the-shelf language models can identify analogies to a certain extent, but struggle with abstract and complex relations, and results are highly sensitive to model architecture and hyperparameters. Overall the best results were obtained with GPT-2 and RoBERTa, while configurations using BERT were not able to outperform word embedding models. Our results raise important questions for future work about how, and to what extent, pre-trained language models capture knowledge about abstract semantic relations.
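To make the analogy task concrete, here is a minimal sketch of one common way a word embedding model can be used to pick the best analogy among candidates: compare the vector offset of the query pair with the offset of each candidate pair. The abstract does not say how the word embedding baseline was scored, so this scoring rule is an assumption, and the three-dimensional vectors are hand-made toy values, not real embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def offset(vectors, pair):
    """Vector difference tail - head for a (head, tail) word pair."""
    head, tail = pair
    return [t - h for h, t in zip(vectors[head], vectors[tail])]

def best_analogy(vectors, query, candidates):
    """Return the candidate pair whose offset is most similar to the query's offset."""
    query_offset = offset(vectors, query)
    return max(candidates, key=lambda c: cosine(query_offset, offset(vectors, c)))

# Hypothetical toy vectors, chosen by hand purely for illustration.
TOY_VECTORS = {
    "eye": [1.0, 0.0, 0.0],
    "seeing": [1.0, 1.0, 0.0],
    "ear": [0.0, 0.0, 1.0],
    "hearing": [0.0, 1.0, 1.0],
}

CANDIDATES = [("ear", "eye"), ("ear", "hearing"), ("hearing", "ear")]
answer = best_analogy(TOY_VECTORS, ("eye", "seeing"), CANDIDATES)
```

With these toy vectors, the offsets of ("eye", "seeing") and ("ear", "hearing") coincide, so that candidate is selected.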

preprint, 2021, arXiv

Distilling Relation Embeddings from Pre-trained Language Models

Pre-trained language models have been found to capture a surprisingly rich amount of lexical knowledge, ranging from commonsense properties of everyday concepts to detailed factual knowledge about named entities. Among other things, this makes it possible to distill high-quality word vectors from pre-trained language models. However, it is currently unclear to what extent it is possible to distill relation embeddings, i.e. vectors that characterize the relationship between two words. Such relation embeddings are appealing because they can, in principle, encode relational knowledge in a more fine-grained way than is possible with knowledge graphs. To obtain relation embeddings from a pre-trained language model, we encode word pairs using a (manually or automatically generated) prompt, and we fine-tune the language model such that relationally similar word pairs yield similar output vectors. We find that the resulting relation embeddings are highly competitive on analogy (unsupervised) and relation classification (supervised) benchmarks, even without any task-specific fine-tuning. Source code to reproduce our experimental results and the model checkpoints are available in the following repository: https://github.com/asahi417/relbert

preprint, 2021, arXiv

Few-shot Image Classification with Multi-Facet Prototypes

The aim of few-shot learning (FSL) is to learn how to recognize image categories from a small number of training examples. A central challenge is that the available training examples are normally insufficient to determine which visual features are most characteristic of the considered categories. To address this challenge, we organize these visual features into facets, which intuitively group features of the same kind (e.g. features that are relevant to shape, color, or texture). This is motivated by the assumptions that (i) the importance of each facet differs from category to category and (ii) it is possible to predict facet importance from a pre-trained embedding of the category names. In particular, we propose an adaptive similarity measure, relying on predicted facet importance weights for a given set of categories. This measure can be used in combination with a wide array of existing metric-based methods. Experiments on miniImageNet and CUB show that our approach improves the state-of-the-art in metric-based FSL.
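The idea of a facet-weighted similarity can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the weights are given by hand rather than predicted from category-name embeddings, one weight vector per category is assumed, cosine similarity is used within each facet, and all names and toy values are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def faceted_similarity(query, prototype, weights):
    """Weighted average of per-facet cosine similarities.

    query, prototype: dict mapping facet name -> feature vector.
    weights: dict mapping facet name -> non-negative importance weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("facet weights must have a positive sum")
    score = sum(w * cosine(query[f], prototype[f]) for f, w in weights.items())
    return score / total

def classify(query, prototypes, weights_per_category):
    """Pick the category whose prototype is most similar under its own facet weights."""
    return max(
        prototypes,
        key=lambda c: faceted_similarity(query, prototypes[c], weights_per_category[c]),
    )

# Hypothetical toy data with two facets.
query = {"shape": [1.0, 0.0], "color": [0.0, 1.0]}
prototypes = {
    "A": {"shape": [1.0, 0.0], "color": [1.0, 0.0]},  # matches the query on shape only
    "B": {"shape": [0.0, 1.0], "color": [0.0, 1.0]},  # matches the query on color only
}
weights_per_category = {
    "A": {"shape": 0.9, "color": 0.1},  # shape assumed to matter most for A
    "B": {"shape": 0.5, "color": 0.5},
}
predicted = classify(query, prototypes, weights_per_category)
```

With uniform weights the two toy categories would tie at 0.5; the category-specific weights break the tie in favour of "A".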

preprint, 2020, arXiv

Learning Cross-lingual Embeddings from Twitter via Distant Supervision

Cross-lingual embeddings represent the meaning of words from different languages in the same vector space. Recent work has shown that it is possible to construct such representations by aligning independently learned monolingual embedding spaces, and that accurate alignments can be obtained even without external bilingual data. In this paper we explore a research direction that has been surprisingly neglected in the literature: leveraging noisy user-generated text to learn cross-lingual embeddings particularly tailored towards social media applications. While the noisiness and informal nature of the social media genre pose additional challenges to cross-lingual embedding methods, we find that it also provides key opportunities due to the abundance of code-switching and the existence of a shared vocabulary of emoji and named entities. Our contribution consists of a very simple post-processing step that exploits these phenomena to significantly improve the performance of state-of-the-art alignment methods.

preprint, 2020, arXiv

On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meaning are represented by similar vectors, regardless of the language. Recent developments which construct these embeddings by aligning monolingual spaces have shown that accurate alignments can be obtained with little or no supervision. However, the focus has been on a particular controlled scenario for evaluation, and there is no strong evidence on how current state-of-the-art systems would fare with noisy text or for language pairs with major linguistic differences. In this paper we present an extensive evaluation over multiple cross-lingual embedding models, analyzing their strengths and limitations with respect to different variables such as target language, training corpora and amount of supervision. Our conclusions put in doubt the view that high-quality cross-lingual embeddings can always be learned without much supervision.

preprint, 2020, arXiv

Plausible Reasoning about EL-Ontologies using Concept Interpolation

Description logics (DLs) are standard knowledge representation languages for modelling ontologies, i.e. knowledge about concepts and the relations between them. Unfortunately, DL ontologies are difficult to learn from data and time-consuming to encode manually. As a result, ontologies for broad domains are almost inevitably incomplete. In recent years, several data-driven approaches have been proposed for automatically extending such ontologies. One family of methods relies on characterizations of concepts that are derived from text descriptions. While such characterizations do not capture ontological knowledge directly, they encode information about the similarity between different concepts, which can be exploited for filling in the gaps in existing ontologies. To this end, several inductive inference mechanisms have already been proposed, but these have been defined and used in a heuristic fashion. In this paper, we instead propose an inductive inference mechanism which is based on a clear model-theoretic semantics, and can thus be tightly integrated with standard deductive reasoning. We particularly focus on interpolation, a powerful commonsense reasoning mechanism which is closely related to cognitive models of category-based induction. Apart from the formalization of the underlying semantics, as our main technical contribution we provide computational complexity bounds for reasoning in EL with this interpolation mechanism.

preprint, 2015, arXiv

Solving stable matching problems using answer set programming

Since the introduction of the stable marriage problem (SMP) by Gale and Shapley (1962), several variants and extensions have been investigated. While this variety is useful to widen the application potential, each variant requires a new algorithm for finding the stable matchings. To address this issue, we propose an encoding of the SMP using answer set programming (ASP), which can straightforwardly be adapted and extended to suit the needs of specific applications. The use of ASP also means that we can take advantage of highly efficient off-the-shelf solvers. To illustrate the flexibility of our approach, we show how our ASP encoding naturally allows us to select optimal stable matchings, i.e. matchings that are optimal according to some user-specified criterion. To the best of our knowledge, our encoding offers the first exact implementation to find sex-equal, minimum regret, egalitarian or maximum cardinality stable matchings for SMP instances in which individuals may designate unacceptable partners and ties between preferences are allowed. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).
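For readers unfamiliar with the stable marriage problem, the following sketch shows the classical Gale-Shapley deferred-acceptance procedure together with a stability check. It is plain Python, not the ASP encoding proposed in the paper, and it covers only the basic setting with complete, strict preference lists and equally sized sides; ties, unacceptable partners and optimality criteria are not handled. All names and the toy instance are hypothetical.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as a dict proposer -> receiver.

    Both arguments map each agent to a complete, strictly ordered list of
    agents on the other side (most preferred first).
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged_to = {}                               # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p       # r trades up; the old partner becomes free again
            free.append(current)
        else:
            free.append(p)          # r rejects p; p will try its next choice
    return {p: r for r, p in engaged_to.items()}

def is_stable(matching, proposer_prefs, receiver_prefs):
    """True if no proposer and receiver both prefer each other to their partners."""
    partner_of = {r: p for p, r in matching.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break  # remaining receivers are ranked below p's partner
            r_prefs = receiver_prefs[r]
            if r_prefs.index(p) < r_prefs.index(partner_of[r]):
                return False  # (p, r) is a blocking pair
    return True

# Hypothetical toy instance.
proposer_prefs = {"a": ["x", "y"], "b": ["x", "y"]}
receiver_prefs = {"x": ["b", "a"], "y": ["a", "b"]}
matching = gale_shapley(proposer_prefs, receiver_prefs)
```

Here both proposers rank "x" first, and "x" prefers "b", so "a" ends up with "y". Each variant of the problem would need changes to this procedure, which is the maintenance burden the declarative encoding in the abstract is meant to avoid.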

preprint, 2013, arXiv

Characterizing and Extending Answer Set Semantics using Possibility Theory

Answer Set Programming (ASP) is a popular framework for modeling combinatorial problems. However, ASP cannot easily be used for reasoning about uncertain information. Possibilistic ASP (PASP) is an extension of ASP that combines possibilistic logic and ASP. In PASP a weight is associated with each rule, where this weight is interpreted as the certainty with which the conclusion can be established when the body is known to hold. As such, it allows us to model and reason about uncertain information in an intuitive way. In this paper we present new semantics for PASP, in which rules are interpreted as constraints on possibility distributions. Special models of these constraints are then identified as possibilistic answer sets. In addition, since ASP is a special case of PASP in which all the rules are entirely certain, we obtain a new characterization of ASP in terms of constraints on possibility distributions. This allows us to uncover a new form of disjunction, called weak disjunction, that has not been previously considered in the literature. In addition to introducing and motivating the semantics of weak disjunction, we also pinpoint its computational complexity. In particular, while the complexity of most reasoning tasks coincides with standard disjunctive ASP, we find that brave reasoning for programs with weak disjunctions is easier.
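The notion of a rule acting as a constraint on a possibility distribution can be illustrated with the standard possibility and necessity measures. The sketch below is a simplification under stated assumptions: it handles only rules with positive atoms in the body, and it assumes the constraint takes the form N(head) >= min(N(body), certainty), which the abstract does not spell out. Worlds are represented as sets of true atoms; all names and degrees are hypothetical.

```python
def possibility(pi, holds):
    """Pi(phi): the largest degree pi(w) over worlds w in which phi holds."""
    return max((degree for world, degree in pi.items() if holds(world)), default=0.0)

def necessity(pi, holds):
    """N(phi) = 1 - Pi(not phi)."""
    return 1.0 - possibility(pi, lambda world: not holds(world))

def satisfies_rule(pi, head, body, certainty):
    """Check the assumed constraint N(head) >= min(N(body), certainty).

    head: an atom; body: a list of atoms read as a conjunction (empty for a fact).
    """
    n_head = necessity(pi, lambda world: head in world)
    n_body = necessity(pi, lambda world: all(atom in world for atom in body))
    return n_head >= min(n_body, certainty)

def world(*atoms):
    """A world is the set of atoms that are true in it."""
    return frozenset(atoms)

# Program (hypothetical): fact "a" with certainty 1.0, rule "b <- a" with certainty 0.75.
pi_ok = {world("a", "b"): 1.0, world("a"): 0.25, world("b"): 0.0, world(): 0.0}
pi_bad = {world("a", "b"): 1.0, world("a"): 0.5, world("b"): 0.0, world(): 0.0}

fact_ok = satisfies_rule(pi_ok, "a", [], 1.0)
rule_ok = satisfies_rule(pi_ok, "b", ["a"], 0.75)
rule_bad = satisfies_rule(pi_bad, "b", ["a"], 0.75)
```

In `pi_ok` the world where "a" holds without "b" has possibility 0.25, so N(b) = 0.75 and the rule constraint is met; raising that degree to 0.5 in `pi_bad` lowers N(b) to 0.5 and violates it.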