Source author record

Mark Coates

Mark Coates appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Networking and Internet Architecture Artificial Intelligence Information Retrieval Social and Information Networks physics.soc-ph Computation Cryptography and Security Distributed, Parallel, and Cluster Computing eess.IV math.ST Methodology Statistics Theory

Catalog footprint

What is connected

23works

13topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Abductive Reasoning with Probabilistic Commonsense

Recent efforts to improve the reasoning abilities of Large Language Models (LLMs) have focused on integrating formal logic solvers within neurosymbolic frameworks. A key challenge is that formal solvers lack commonsense world knowledge, preventing them from making reasoning steps that humans find obvious. Prior methods address this by using LLMs to supply missing commonsense assumptions, but these approaches implicitly assume universal agreement on such commonsense facts. In reality, commonsense beliefs vary across individuals. We propose a probabilistic framework for abductive commonsense reasoning that explicitly models this variation, aiming to determine whether most people would judge a statement as true or false. We introduce Probabilistic Abductive CommonSense (PACS), a novel algorithm that uses an LLM and a formal solver to sample proofs as observations of individuals' distinct commonsense beliefs, and aggregates conclusions across these samples. Empirically, PACS outperforms chain-of-thought reasoning, prior neurosymbolic methods, and search-based approaches across multiple benchmarks.

preprint2024arXiv

DyG2Vec: Efficient Representation Learning for Dynamic Graphs

Temporal graph neural networks have shown promising results in learning inductive representations by automatically extracting temporal patterns. However, previous works often rely on complex memory modules or inefficient random walk methods to construct temporal representations. To address these limitations, we present an efficient yet effective attention-based encoder that leverages temporal edge encodings and window-based subgraph sampling to generate task-agnostic embeddings. Moreover, we propose a joint-embedding architecture using non-contrastive SSL to learn rich temporal embeddings without labels. Experimental results on 7 benchmark datasets indicate that on average, our model outperforms SoTA baselines on the future link prediction task by 4.23% for the transductive setting and 3.30% for the inductive setting while only requiring 5-10x less training/inference time. Lastly, different aspects of the proposed framework are investigated through experimental analysis and ablation studies. The code is publicly available at https://github.com/huawei-noah/noah-research/tree/master/graph_atlas.

preprint2022arXiv

Bag Graph: Multiple Instance Learning using Bayesian Graph Neural Networks

Multiple Instance Learning (MIL) is a weakly supervised learning problem where the aim is to assign labels to sets or bags of instances, as opposed to traditional supervised learning where each instance is assumed to be independent and identically distributed (IID) and is to be labeled individually. Recent work has shown promising results for neural network models in the MIL setting. Instead of focusing on each instance, these models are trained in an end-to-end fashion to learn effective bag-level representations by suitably combining permutation invariant pooling techniques with neural architectures. In this paper, we consider modelling the interactions between bags using a graph and employ Graph Neural Networks (GNNs) to facilitate end-to-end learning. Since a meaningful graph representing dependencies between bags is rarely available, we propose to use a Bayesian GNN framework that can generate a likely graph structure for scenarios where there is uncertainty in the graph or when no graph is available. Empirical results demonstrate the efficacy of the proposed technique for several MIL benchmark tasks and a distribution regression task.

preprint2022arXiv

Multi-relation Message Passing for Multi-label Text Classification

A well-known challenge associated with the multi-label classification problem is modelling dependencies between labels. Most attempts at modelling label dependencies focus on co-occurrences, ignoring the valuable information that can be extracted by detecting label subsets that rarely occur together. For example, consider customer product reviews; a product probably would not simultaneously be tagged by both "recommended" (i.e., reviewer is happy and recommends the product) and "urgent" (i.e., the review suggests immediate action to remedy an unsatisfactory experience). Aside from the consideration of positive and negative dependencies, the direction of a relationship should also be considered. For a multi-label image classification problem, the "ship" and "sea" labels have an obvious dependency, but the presence of the former implies the latter much more strongly than the other way around. These examples motivate the modelling of multiple types of bi-directional relationships between labels. In this paper, we propose a novel method, entitled Multi-relation Message Passing (MrMP), for the multi-label classification problem. Experiments on benchmark multi-label text classification datasets show that the MrMP module yields similar or superior performance compared to state-of-the-art methods. The approach imposes only minor additional computational and memory overheads.

preprint2022arXiv

Node Copying: A Random Graph Model for Effective Graph Sampling

There has been an increased interest in applying machine learning techniques on relational structured-data based on an observed graph. Often, this graph is not fully representative of the true relationship amongst nodes. In these settings, building a generative model conditioned on the observed graph allows to take the graph uncertainty into account. Various existing techniques either rely on restrictive assumptions, fail to preserve topological properties within the samples or are prohibitively expensive for larger graphs. In this work, we introduce the node copying model for constructing a distribution over graphs. Sampling of a random graph is carried out by replacing each node's neighbors by those of a randomly sampled similar node. The sampled graphs preserve key characteristics of the graph structure without explicitly targeting them. Additionally, sampling from this model is extremely simple and scales linearly with the nodes. We show the usefulness of the copying model in three tasks. First, in node classification, a Bayesian formulation based on node copying achieves higher accuracy in sparse data settings. Second, we employ our proposed model to mitigate the effect of adversarial attacks on the graph topology. Last, incorporation of the model in a recommendation system setting improves recall over state-of-the-art methods.

preprint2022arXiv

Spectral image clustering on dual-energy CT scans using functional regression mixtures

Dual-energy computed tomography (DECT) is an advanced CT scanning technique enabling material characterization not possible with conventional CT scans. It allows the reconstruction of energy decay curves at each 3D image voxel, representing varying image attenuation at different effective scanning energy levels. In this paper, we develop novel functional data analysis (FDA) techniques and adapt them to the analysis of DECT decay curves. More specifically, we construct functional mixture models that integrate spatial context in mixture weights, with mixture component densities being constructed upon the energy decay curves as functional observations. We design unsupervised clustering algorithms by developing dedicated expectation maximization (EM) algorithms for the maximum likelihood estimation of the model parameters. To our knowledge, this is the first article to adapt statistical FDA tools and model-based clustering to take advantage of the full spectral information provided by DECT. We evaluate our methods on 91 head and neck cancer DECT scans. We compare our unsupervised clustering results to tumor contours traced manually by radiologists, as well as to several baseline algorithms. Given the inter-rater variability even among experts at delineating head and neck tumors, and given the potential importance of tissue reactions surrounding the tumor itself, our proposed methodology has the potential to add value in downstream machine learning applications for clinical outcome prediction based on DECT data in head and neck cancer.

preprint2021arXiv

Knowledge-Enhanced Top-K Recommendation in Poincaré Ball

Personalized recommender systems are increasingly important as more content and services become available and users struggle to identify what might interest them. Thanks to the ability for providing rich information, knowledge graphs (KGs) are being incorporated to enhance the recommendation performance and interpretability. To effectively make use of the knowledge graph, we propose a recommendation model in the hyperbolic space, which facilitates the learning of the hierarchical structure of knowledge graphs. Furthermore, a hyperbolic attention network is employed to determine the relative importances of neighboring entities of a certain item. In addition, we propose an adaptive and fine-grained regularization mechanism to adaptively regularize items and their neighboring representations. Via a comparison using three real-world datasets with state-of-the-art methods, we show that the proposed model outperforms the best existing models by 2-16% in terms of NDCG@K on Top-K recommendation.

preprint2021arXiv

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this, we develop a distance-based recommendation model with several novel aspects: (i) each user and item are parameterized by Gaussian distributions to capture the learning uncertainties; (ii) an adaptive margin generation scheme is proposed to generate the margins regarding different training triplets; (iii) explicit user-user/item-item similarity modeling is incorporated in the objective function. The Wasserstein distance is employed to determine preferences because it obeys the triangle inequality and can measure the distance between probabilistic distributions. Via a comparison using five real-world datasets with state-of-the-art methods, the proposed model outperforms the best existing models by 4-22% in terms of recall@K on Top-K recommendation.

preprint2020arXiv

Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation

Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.

preprint2020arXiv

GraphSAIL: Graph Structure Aware Incremental Learning for Recommender Systems

Given the convenience of collecting information through online services, recommender systems now consume large scale data and play a more important role in improving user experience. With the recent emergence of Graph Neural Networks (GNNs), GNN-based recommender models have shown the advantage of modeling the recommender system as a user-item bipartite graph to learn representations of users and items. However, such models are expensive to train and difficult to perform frequent updates to provide the most up-to-date recommendations. In this work, we propose to update GNN-based recommender models incrementally so that the computation time can be greatly reduced and models can be updated more frequently. We develop a Graph Structure Aware Incremental Learning framework, GraphSAIL, to address the commonly experienced catastrophic forgetting problem that occurs when training a model in an incremental fashion. Our approach preserves a user's long-term preference (or an item's long-term property) during incremental model updating. GraphSAIL implements a graph structure preservation strategy which explicitly preserves each node's local structure, global structure, and self-information, respectively. We argue that our incremental training framework is the first attempt tailored for GNN based recommender systems and demonstrate its improvement compared to other incremental learning techniques on two public datasets. We further verify the effectiveness of our framework on a large-scale industrial dataset.

preprint2020arXiv

Multi-Graph Convolution Collaborative Filtering

Personalized recommendation is ubiquitous, playing an important role in many online services. Substantial research has been dedicated to learning vector representations of users and items with the goal of predicting a user's preference for an item based on the similarity of the representations. Techniques range from classic matrix factorization to more recent deep learning based methods. However, we argue that existing methods do not make full use of the information that is available from user-item interaction data and the similarities between user pairs and item pairs. In this work, we develop a graph convolution-based recommendation framework, named Multi-Graph Convolution Collaborative Filtering (Multi-GCCF), which explicitly incorporates multiple graphs in the embedding learning process. Multi-GCCF not only expressively models the high-order information via a partite user-item interaction graph, but also integrates the proximal information by building and processing user-user and item-item graphs. Furthermore, we consider the intrinsic difference between user nodes and item nodes when performing graph convolution on the bipartite graph. We conduct extensive experiments on four publicly accessible benchmarks, showing significant improvements relative to several state-of-the-art collaborative filtering and graph neural network-based recommendation models. Further experiments quantitatively verify the effectiveness of each component of our proposed model and demonstrate that the learned embeddings capture the important relationship structure.

preprint2020arXiv

Node Copying for Protection Against Graph Neural Network Topology Attacks

Adversarial attacks can affect the performance of existing deep learning models. With the increased interest in graph based machine learning techniques, there have been investigations which suggest that these models are also vulnerable to attacks. In particular, corruptions of the graph topology can degrade the performance of graph based learning algorithms severely. This is due to the fact that the prediction capability of these algorithms relies mostly on the similarity structure imposed by the graph connectivity. Therefore, detecting the location of the corruption and correcting the induced errors becomes crucial. There has been some recent work which tackles the detection problem, however these methods do not address the effect of the attack on the downstream learning task. In this work, we propose an algorithm that uses node copying to mitigate the degradation in classification that is caused by adversarial attacks. The proposed methodology is applied only after the model for the downstream task is trained and the added computation cost scales well for large graphs. Experimental results show the effectiveness of our approach for several real world datasets.

preprint2020arXiv

Non-Parametric Graph Learning for Bayesian Graph Neural Networks

Graphs are ubiquitous in modelling relational structures. Recent endeavours in machine learning for graph-structured data have led to many architectures and learning algorithms. However, the graph used by these algorithms is often constructed based on inaccurate modelling assumptions and/or noisy data. As a result, it fails to represent the true relationships between nodes. A Bayesian framework which targets posterior inference of the graph by considering it as a random quantity can be beneficial. In this paper, we propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices. The proposed model is flexible in the sense that it can effectively take into account the output of graph-based learning algorithms that target specific tasks. In addition, model inference scales well to large graphs. We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.

preprint2016arXiv

Bayesian Inference of Diffusion Networks with Unknown Infection Times

The analysis of diffusion processes in real-world propagation scenarios often involves estimating variables that are not directly observed. These hidden variables include parental relationships, the strengths of connections between nodes, and the moments of time that infection occurs. In this paper, we propose a framework in which all three sets of parameters are assumed to be hidden and we develop a Bayesian approach to infer them. After justifying the model assumptions, we evaluate the performance efficiency of our proposed approach through numerical simulations on synthetic datasets and real-world diffusion processes.

preprint2016arXiv

Semi-parametric Order-based Generalized Multivariate Regression

In this paper, we consider a generalized multivariate regression problem where the responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove that our algorithm, which maximizes the rank correlation of responses and linear transformations of predictors, is a consistent estimator of the true coefficient matrix. We also identify the rate of convergence and show that the squared estimation error decays with a rate of $o(1/\sqrt{n})$. We then propose a greedy algorithm to maximize the highly non-smooth objective function of our model and examine its performance through extensive simulations. Finally, we compare our algorithm with traditional multivariate regression algorithms over synthetic and real data.

preprint2016arXiv

Sparse Multivariate Factor Regression

We consider the problem of multivariate regression in a setting where the relevant predictors could be shared among different responses. We propose an algorithm which decomposes the coefficient matrix into the product of a long matrix and a wide matrix, with an elastic net penalty on the former and an $\ell_1$ penalty on the latter. The first matrix linearly transforms the predictors to a set of latent factors, and the second one regresses the responses on these factors. Our algorithm simultaneously performs dimension reduction and coefficient estimation and automatically estimates the number of latent factors from the data. Our formulation results in a non-convex optimization problem, which despite its flexibility to impose effective low-dimensional structure, is difficult, or even impossible, to solve exactly in a reasonable time. We specify an optimization algorithm based on alternating minimization with three different sets of updates to solve this non-convex problem and provide theoretical results on its convergence and optimality. Finally, we demonstrate the effectiveness of our algorithm via experiments on simulated and real data.

preprint2015arXiv

Optimal Forwarding in Opportunistic Delay Tolerant Networks with Meeting Rate Estimations

Data transfer in opportunistic Delay Tolerant Networks (DTNs) must rely on unscheduled sporadic meetings between nodes. The main challenge in these networks is to develop a mechanism based on which nodes can learn to make nearly optimal forwarding decision rules despite having no a-priori knowledge of the network topology. The forwarding mechanism should ideally result in a high delivery probability, low average latency and efficient usage of the network resources. In this paper, we propose both centralized and decentralized single-copy message forwarding algorithms that, under relatively strong assumptions about the networks behaviour, minimize the expected latencies from any node in the network to a particular destination. After proving the optimality of our proposed algorithms, we develop a decentralized algorithm that involves a recursive maximum likelihood procedure to estimate the meeting rates. We confirm the improvement that our proposed algorithms make in the system performance through numerical simulations on datasets from synthetic and real-world opportunistic networks.

preprint2013arXiv

Content Distribution Strategies in Opportunistic Networks

This paper describes a mechanism for content distribution through opportunistic contacts between subscribers. A subset of subscribers in the network are seeded with the content. The remaining subscribers obtain the information through opportunistic contact with a user carrying the updated content. We study how the rate of content retrieval by subscribers is affected by the number of initial seeders in the network. We also study the rate of content retrieval by the subscribers under coding strategies (Network Coding, Erasure Coding) and under Flooding, Epidemic Routing.

preprint2011arXiv

GANC: Greedy Agglomerative Normalized Cut

This paper describes a graph clustering algorithm that aims to minimize the normalized cut criterion and has a model order selection procedure. The performance of the proposed algorithm is comparable to spectral approaches in terms of minimizing normalized cut. However, unlike spectral approaches, the proposed algorithm scales to graphs with millions of nodes and edges. The algorithm consists of three components that are processed sequentially: a greedy agglomerative hierarchical clustering procedure, model order selection, and a local refinement. For a graph of n nodes and O(n) edges, the computational complexity of the algorithm is O(n log^2 n), a major improvement over the O(n^3) complexity of spectral methods. Experiments are performed on real and synthetic networks to demonstrate the scalability of the proposed approach, the effectiveness of the model order selection procedure, and the performance of the proposed algorithm in terms of minimizing the normalized cut metric.

preprint2010arXiv

Large scale probabilistic available bandwidth estimation

The common utilization-based definition of available bandwidth and many of the existing tools to estimate it suffer from several important weaknesses: i) most tools report a point estimate of average available bandwidth over a measurement interval and do not provide a confidence interval; ii) the commonly adopted models used to relate the available bandwidth metric to the measured data are invalid in almost all practical scenarios; iii) existing tools do not scale well and are not suited to the task of multi-path estimation in large-scale networks; iv) almost all tools use ad-hoc techniques to address measurement noise; and v) tools do not provide enough flexibility in terms of accuracy, overhead, latency and reliability to adapt to the requirements of various applications. In this paper we propose a new definition for available bandwidth and a novel framework that addresses these issues. We define probabilistic available bandwidth (PAB) as the largest input rate at which we can send a traffic flow along a path while achieving, with specified probability, an output rate that is almost as large as the input rate. PAB is expressed directly in terms of the measurable output rate and includes adjustable parameters that allow the user to adapt to different application requirements. Our probabilistic framework to estimate network-wide probabilistic available bandwidth is based on packet trains, Bayesian inference, factor graphs and active sampling. We deploy our tool on the PlanetLab network and our results show that we can obtain accurate estimates with a much smaller measurement overhead compared to existing approaches.

preprint2010arXiv

Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning

Knowing the largest rate at which data can be sent on an end-to-end path such that the egress rate is equal to the ingress rate with high probability can be very practical when choosing transmission rates in video streaming or selecting peers in peer-to-peer applications. We introduce probabilistic available bandwidth, which is defined in terms of ingress rates and egress rates of traffic on a path, rather than in terms of capacity and utilization of the constituent links of the path like the standard available bandwidth metric. In this paper, we describe a distributed algorithm, based on a probabilistic graphical model and Bayesian active learning, for simultaneously estimating the probabilistic available bandwidth of multiple paths through a network. Our procedure exploits the fact that each packet train provides information not only about the path it traverses, but also about any path that shares a link with the monitored path. Simulations and PlanetLab experiments indicate that this process can dramatically reduce the number of probes required to generate accurate estimates.

preprint2010arXiv

Real-Time Multi-path Tracking of Probabilistic Available Bandwidth

Applications such as traffic engineering and network provisioning can greatly benefit from knowing, in real time, what is the largest input rate at which it is possible to transmit on a given path without causing congestion. We consider a probabilistic formulation for available bandwidth where the user specifies the probability of achieving an output rate almost as large as the input rate. We are interested in estimating and tracking the network-wide probabilistic available bandwidth (PAB) on multiple paths simultaneously with minimal overhead on the network. We propose a novel framework based on chirps, Bayesian inference, belief propagation and active sampling to estimate the PAB. We also consider the time evolution of the PAB by forming a dynamic model and designing a tracking algorithm based on particle filters. We implement our method in a lightweight and practical tool that has been deployed on the PlanetLab network to do online experiments. We show through these experiments and simulations that our approach outperforms block-based algorithms in terms of input rate cost and probability of successful transmission.

preprint2009arXiv

Greedy Gossip with Eavesdropping

This paper presents greedy gossip with eavesdropping (GGE), a novel randomized gossip algorithm for distributed computation of the average consensus problem. In gossip algorithms, nodes in the network randomly communicate with their neighbors and exchange information iteratively. The algorithms are simple and decentralized, making them attractive for wireless network applications. In general, gossip algorithms are robust to unreliable wireless conditions and time varying network topologies. In this paper we introduce GGE and demonstrate that greedy updates lead to rapid convergence. We do not require nodes to have any location information. Instead, greedy updates are made possible by exploiting the broadcast nature of wireless communications. During the operation of GGE, when a node decides to gossip, instead of choosing one of its neighbors at random, it makes a greedy selection, choosing the node which has the value most different from its own. In order to make this selection, nodes need to know their neighbors' values. Therefore, we assume that all transmissions are wireless broadcasts and nodes keep track of their neighbors' values by eavesdropping on their communications. We show that the convergence of GGE is guaranteed for connected network topologies. We also study the rates of convergence and illustrate, through theoretical bounds and numerical simulations, that GGE consistently outperforms randomized gossip and performs comparably to geographic gossip on moderate-sized random geometric graph topologies.

Mark Coates

What is connected

Connect this record

See the researcher in context

Building this map preview

23 published item(s)

Abductive Reasoning with Probabilistic Commonsense

DyG2Vec: Efficient Representation Learning for Dynamic Graphs

Bag Graph: Multiple Instance Learning using Bayesian Graph Neural Networks

Multi-relation Message Passing for Multi-label Text Classification

Node Copying: A Random Graph Model for Effective Graph Sampling

Spectral image clustering on dual-energy CT scans using functional regression mixtures

Knowledge-Enhanced Top-K Recommendation in Poincaré Ball

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation

GraphSAIL: Graph Structure Aware Incremental Learning for Recommender Systems

Multi-Graph Convolution Collaborative Filtering

Node Copying for Protection Against Graph Neural Network Topology Attacks

Non-Parametric Graph Learning for Bayesian Graph Neural Networks

Bayesian Inference of Diffusion Networks with Unknown Infection Times

Semi-parametric Order-based Generalized Multivariate Regression

Sparse Multivariate Factor Regression

Optimal Forwarding in Opportunistic Delay Tolerant Networks with Meeting Rate Estimations

Content Distribution Strategies in Opportunistic Networks

GANC: Greedy Agglomerative Normalized Cut

Large scale probabilistic available bandwidth estimation

Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning

Real-Time Multi-path Tracking of Probabilistic Available Bandwidth

Greedy Gossip with Eavesdropping