Source author record

Vincent Gripon

Vincent Gripon appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

36works

21topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2023arXiv

Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach

The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists in training a feature extractor using a large and diverse dataset to be applied to the considered few-shot task. Thanks to the encoded priors in the feature extractors, classification tasks with as little as one example (or "shot'') for each class can be solved with high accuracy, even when the shots display individual features not representative of their classes. Yet, the problem becomes more complicated when some of the given shots display multiple objects. In this paper, we present a strategy which aims at detecting the presence of multiple and previously unseen objects in a given shot. This methodology is based on identifying the corners of a simplex in a high dimensional space. We introduce an optimization routine and showcase its ability to successfully detect multiple (previously unseen) objects in raw images. Then, we introduce a downstream classifier meant to exploit the presence of multiple objects to improve the performance of few-shot classification, in the case of extreme settings where only one shot is given for its class. Using standard benchmarks of the field, we show the ability of the proposed method to slightly, yet statistically significantly, improve accuracy in these settings.

preprint2022arXiv

EASY: Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients

Few-shot learning aims at leveraging knowledge learned by one or more deep learning models, in order to obtain good classification performance on new problems, where only a few labeled samples per class are available. Recent years have seen a fair number of works in the field, introducing methods with numerous ingredients. A frequent problem, though, is the use of suboptimally trained models to extract knowledge, leading to interrogations on whether proposed approaches bring gains compared to using better initial models without the introduced ingredients. In this work, we propose a simple methodology, that reaches or even beats state of the art performance on multiple standardized benchmarks of the field, while adding almost no hyperparameters or parameters to those used for training the initial deep learning models on the generic dataset. This methodology offers a new baseline on which to propose (and fairly compare) new techniques or adapt existing ones.

preprint2022arXiv

Preserving Fine-Grain Feature Information in Classification via Entropic Regularization

Labeling a classification dataset implies to define classes and associated coarse labels, that may approximate a smoother and more complicated ground truth. For example, natural images may contain multiple objects, only one of which is labeled in many vision datasets, or classes may result from the discretization of a regression problem. Using cross-entropy to train classification models on such coarse labels is likely to roughly cut through the feature space, potentially disregarding the most meaningful such features, in particular losing information on the underlying fine-grain task. In this paper we are interested in the problem of solving fine-grain classification or regression, using a model trained on coarse-grain labels only. We show that standard cross-entropy can lead to overfitting to coarse-related features. We introduce an entropy-based regularization to promote more diversity in the feature space of trained models, and empirically demonstrate the efficacy of this methodology to reach better performance on the fine-grain problems. Our results are supported through theoretical developments and empirical validation.

preprint2022arXiv

Preventing Manifold Intrusion with Locality: Local Mixup

Mixup is a data-dependent regularization technique that consists in linearly interpolating input samples and associated outputs. It has been shown to improve accuracy when used to train on standard machine learning datasets. However, authors have pointed out that Mixup can produce out-of-distribution virtual samples and even contradictions in the augmented training set, potentially resulting in adversarial effects. In this paper, we introduce Local Mixup in which distant input samples are weighted down when computing the loss. In constrained settings we demonstrate that Local Mixup can create a trade-off between bias and variance, with the extreme cases reducing to vanilla training and classical Mixup. Using standardized computer vision benchmarks , we also show that Local Mixup can improve test accuracy.

preprint2022arXiv

Pruning Graph Convolutional Networks to select meaningful graph frequencies for fMRI decoding

Graph Signal Processing is a promising framework to manipulate brain signals as it allows to encompass the spatial dependencies between the activity in regions of interest in the brain. In this work, we are interested in better understanding what are the graph frequencies that are the most useful to decode fMRI signals. To this end, we introduce a deep learning architecture and adapt a pruning methodology to automatically identify such frequencies. We experiment with various datasets, architectures and graphs, and show that low graph frequencies are consistently identified as the most important for fMRI decoding, with a stronger contribution for the functional graph over the structural one. We believe that this work provides novel insights on how graph-based methods can be deployed to increase fMRI decoding accuracy and interpretability.

preprint2022arXiv

Rethinking Weight Decay For Efficient Neural Network Pruning

Introduced in the late 1980s for generalization purposes, pruning has now become a staple for compressing deep neural networks. Despite many innovations in recent decades, pruning approaches still face core issues that hinder their performance or scalability. Drawing inspiration from early work in the field, and especially the use of weight decay to achieve sparsity, we introduce Selective Weight Decay (SWD), which carries out efficient, continuous pruning throughout training. Our approach, theoretically grounded on Lagrangian smoothing, is versatile and can be applied to multiple tasks, networks, and pruning structures. We show that SWD compares favorably to state-of-the-art approaches, in terms of performance-to-parameters ratio, on the CIFAR-10, Cora, and ImageNet ILSVRC2012 datasets.

preprint2021arXiv

Graph-based Interpolation of Feature Vectors for Accurate Few-Shot Classification

In few-shot classification, the aim is to learn models able to discriminate classes using only a small number of labeled examples. In this context, works have proposed to introduce Graph Neural Networks (GNNs) aiming at exploiting the information contained in other samples treated concurrently, what is commonly referred to as the transductive setting in the literature. These GNNs are trained all together with a backbone feature extractor. In this paper, we propose a new method that relies on graphs only to interpolate feature vectors instead, resulting in a transductive learning setting with no additional parameters to train. Our proposed method thus exploits two levels of information: a) transfer features obtained on generic datasets, b) transductive information obtained from other samples to be classified. Using standard few-shot vision classification datasets, we demonstrate its ability to bring significant gains compared to other works.

preprint2021arXiv

Improving Classification Accuracy with Graph Filtering

In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to a same class depending on the similarity of their representation in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance, while maintaining the mean. While our approach applies to all classification problems in general, it is particularly useful in few-shot settings, where intra-class noise can have a huge impact due to the small sample selection. Using standardized benchmarks in the field of vision, we empirically demonstrate the ability of the proposed method to slightly improve state-of-the-art results in both cases of few-shot and standard classification.

preprint2021arXiv

Inferring Graph Signal Translations as Invariant Transformations for Classification Tasks

The field of Graph Signal Processing (GSP) has proposed tools to generalize harmonic analysis to complex domains represented through graphs. Among these tools are translations, which are required to define many others. Most works propose to define translations using solely the graph structure (i.e. edges). Such a problem is ill-posed in general as a graph conveys information about neighborhood but not about directions. In this paper, we propose to infer translations as edge-constrained operations that make a supervised classification problem invariant using a deep learning framework. As such, our methodology uses both the graph structure and labeled signals to infer translations. We perform experiments with regular 2D images and abstract hyperlink networks to show the effectiveness of the proposed methodology in inferring meaningful translations for signals supported on graphs.

preprint2021arXiv

Leveraging the Feature Distribution in Transfer-based Few-Shot Learning

Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples. In the past few years, many methods have been proposed to solve few-shot classification, among which transfer-based methods have proved to achieve the best performance. Following this vein, in this paper we propose a novel transfer-based method that builds on two successive steps: 1) preprocessing the feature vectors so that they become closer to Gaussian-like distributions, and 2) leveraging this preprocessing using an optimal-transport inspired algorithm (in the case of transductive settings). Using standardized vision benchmarks, we prove the ability of the proposed methodology to achieve state-of-the-art accuracy with various datasets, backbone architectures and few-shot settings.

preprint2020arXiv

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

Neural networks have demonstrably achieved state-of-the art accuracy using low-bitlength integer quantization, yielding both execution time and energy benefits on existing hardware designs that support short bitlengths. However, the question of finding the minimum bitlength for a desired accuracy remains open. We introduce a training method for minimizing inference bitlength at any granularity while maintaining accuracy. Namely, we propose a regularizer that penalizes large bitlength representations throughout the architecture and show how it can be modified to minimize other quantifiable criteria, such as number of operations or memory footprint. We demonstrate that our method learns thrifty representations while maintaining accuracy. With ImageNet, the method produces an average per layer bitlength of 4.13, 3.76 and 4.36 bits on AlexNet, ResNet18 and MobileNet V2 respectively, remaining within 2.0%, 0.5% and 0.5% of the base TOP-1 accuracy.

preprint2020arXiv

GPU-based Self-Organizing Maps for Post-Labeled Few-Shot Unsupervised Learning

Few-shot classification is a challenge in machine learning where the goal is to train a classifier using a very limited number of labeled examples. This scenario is likely to occur frequently in real life, for example when data acquisition or labeling is expensive. In this work, we consider the problem of post-labeled few-shot unsupervised learning, a classification task where representations are learned in an unsupervised fashion, to be later labeled using very few annotated examples. We argue that this problem is very likely to occur on the edge, when the embedded device directly acquires the data, and the expert needed to perform labeling cannot be prompted often. To address this problem, we consider an algorithm consisting of the concatenation of transfer learning with clustering using Self-Organizing Maps (SOMs). We introduce a TensorFlow-based implementation to speed-up the process in multi-core CPUs and GPUs. Finally, we demonstrate the effectiveness of the method using standard off-the-shelf few-shot classification benchmarks.

preprint2020arXiv

Graph topology inference benchmarks for machine learning

Graphs are nowadays ubiquitous in the fields of signal processing and machine learning. As a tool used to express relationships between objects, graphs can be deployed to various ends: I) clustering of vertices, II) semi-supervised classification of vertices, III) supervised classification of graph signals, and IV) denoising of graph signals. However, in many practical cases graphs are not explicitly available and must therefore be inferred from data. Validation is a challenging endeavor that naturally depends on the downstream task for which the graph is learnt. Accordingly, it has often been difficult to compare the efficacy of different algorithms. In this work, we introduce several ease-to-use and publicly released benchmarks specifically designed to reveal the relative merits and limitations of graph inference methods. We also contrast some of the most prominent techniques in the literature.

preprint2020arXiv

Predicting the Accuracy of a Few-Shot Classifier

In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We then investigate the case of using transfer-based solutions, and consider three settings: i) supervised where we only have access to a few labeled samples, ii) semi-supervised where we have access to both a few labeled samples and a set of unlabeled samples and iii) unsupervised where we only have access to unlabeled samples. For each setting, we propose reasonable measures that we empirically demonstrate to be correlated with the generalization ability of considered classifiers. We also show that these simple measures can be used to predict generalization up to a certain confidence. We conduct our experiments on standard few-shot vision datasets.

preprint2020arXiv

ThriftyNets : Convolutional Neural Networks with Tiny Parameter Budget

Typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lay in the final layers, while a large portion of the computations are performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network at its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsamplings and shortcut ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameters budget, exceeding 91% accuracy on CIFAR-10 with less than 40K parameters in total, and 74.3% on CIFAR-100 with less than 600K parameters.

preprint2020arXiv

Towards an Intrinsic Definition of Robustness for a Classifier

The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs. Therefore, finding good measures of robustness of a trained classifier is a key issue in the field. In this paper, we point out that averaging the radius of robustness of samples in a validation set is a statistically weak measure. We propose instead to weight the importance of samples depending on their difficulty. We motivate the proposed score by a theoretical case study using logistic regression, where we show that the proposed score is independent of the choice of the samples it is evaluated upon. We also empirically demonstrate the ability of the proposed score to measure robustness of classifiers with little dependence on the choice of samples in more complex settings, including deep convolutional neural networks and real datasets.

preprint2019arXiv

Efficient Hardware Implementation of Incremental Learning and Inference on Chip

In this paper, we tackle the problem of incrementally learning a classifier, one example at a time, directly on chip. To this end, we propose an efficient hardware implementation of a recently introduced incremental learning procedure that achieves state-of-the-art performance by combining transfer learning with majority votes and quantization techniques. The proposed design is able to accommodate for both new examples and new classes directly on the chip. We detail the hardware implementation of the method (implemented on FPGA target) and show it requires limited resources while providing a significant acceleration compared to using a CPU.

preprint2016arXiv

A Comparative Study of Sparse Associative Memories

We study various models of associative memories with sparse information, i.e. a pattern to be stored is a random string of $0$s and $1$s with about $\log N$ $1$s, only. We compare different synaptic weights, architectures and retrieval mechanisms to shed light on the influence of the various parameters on the storage capacity.

preprint2016arXiv

Compression of Deep Neural Networks on the Fly

Thanks to their state-of-the-art performance, deep neural networks are increasingly used for object recognition. To achieve these results, they use millions of parameters to be trained. However, when targeting embedded applications the size of these models becomes problematic. As a consequence, their usage on smartphones or other resource limited devices is prohibited. In this paper we introduce a novel compression method for deep neural networks that is performed during the learning phase. It consists in adding an extra regularization term to the cost function of fully-connected layers. We combine this method with Product Quantization (PQ) of the trained weights for higher savings in storage consumption. We evaluate our method on two data sets (MNIST and CIFAR10), on which we achieve significantly larger compression rates than state-of-the-art methods.

preprint2016arXiv

Graph reconstruction from the observation of diffused signals

Signal processing on graphs has received a lot of attention in the recent years. A lot of techniques have arised, inspired by classical signal processing ones, to allow studying signals on any kind of graph. A common aspect of these technique is that they require a graph correctly modeling the studied support to explain the signals that are observed on it. However, in many cases, such a graph is unavailable or has no real physical existence. An example of this latter case is a set of sensors randomly thrown in a field which obviously observe related information. To study such signals, there is no intuitive choice for a support graph. In this document, we address the problem of inferring a graph structure from the observation of signals, under the assumption that they were issued of the diffusion of initially i.i.d. signals. To validate our approach, we design an experimental protocol, in which we diffuse signals on a known graph. Then, we forget the graph, and show that we are able to retrieve it very precisely from the only knowledge of the diffused signals.

preprint2016arXiv

Neighborhood-Preserving Translations on Graphs

In many domains (e.g. Internet of Things, neuroimaging) signals are naturally supported on graphs. These graphs usually convey information on similarity between the values taken by the signal at the corresponding vertices. An interest of using graphs is that it allows to define ad hoc operators to perform signal processing. Among them, ones of paramount importance in many tasks are translations. In this paper we are interested in defining translations on graphs using a few simple properties. Namely we propose to define translations as functions from vertices to adjacent ones, that preserve neighborhood properties of the graph. We show that our definitions, contrary to other works on the subject, match usual translations on grid graphs.

preprint2016arXiv

Toward An Uncertainty Principle For Weighted Graphs

The uncertainty principle states that a signal cannot be localized both in time and frequency. With the aim of extending this result to signals on graphs, Agaskar&Lu introduce notions of graph and spectral spreads. They show that a graph uncertainty principle holds for some families of unweighted graphs. This principle states that a signal cannot be simultaneously localized both in graph and spectral domains. In this paper, we aim to extend their work to weighted graphs. We show that a naive extension of their definitions leads to inconsistent results such as discontinuity of the graph spread when regarded as a function of the graph structure. To circumvent this problem, we propose another definition of graph spread that relies on an inverse similarity matrix. We also discuss the choice of the distance function that appears in this definition. Finally, we compute and plot uncertainty curves for families of weighted graphs.

preprint2016arXiv

Towards a characterization of the uncertainty curve for graphs

Signal processing on graphs is a recent research domain that aims at generalizing classical tools in signal processing, in order to analyze signals evolving on complex domains. Such domains are represented by graphs, for which one can compute a particular matrix, called the normalized Laplacian. It was shown that the eigenvalues of this Laplacian correspond to the frequencies of the Fourier domain in classical signal processing. Therefore, the frequency domain is not the same for every support graph. A consequence of this is that there is no non-trivial generalization of Heisenberg's uncertainty principle, that states that a signal cannot be fully localized both in the time domain and in the frequency domain. A way to generalize this principle, introduced by Agaskar and Lu, consists in determining a curve that represents a lower bound on the compromise between precision in the graph domain and precision in the spectral domain. The aim of this paper is to propose a characterization of the signals achieving this curve, for a larger class of graphs than the one studied by Agaskar and Lu.

preprint2014arXiv

Combating Corrupt Messages in Sparse Clustered Associative Memories

In this paper we analyze and extend the neural network based associative memory proposed by Gripon and Berrou. This associative memory resembles the celebrated Willshaw model with an added partite cluster structure. In the literature, two retrieving schemes have been proposed for the network dynamics, namely sum-of-sum and sum-of-max. They both offer considerably better performance than Willshaw and Hopfield networks, when comparable retrieval scenarios are considered. Former discussions and experiments concentrate on the erasure scenario, where a partial message is used as a probe to the network, in the hope of retrieving the full message. In this regard, sum-of-max outperforms sum-of-sum in terms of retrieval rate by a large margin. However, we observe that when noise and errors are present and the network is queried by a corrupt probe, sum-of-max faces a severe limitation as its stringent activation rule prevents a neuron from reviving back into play once deactivated. In this manuscript, we categorize and analyze different error scenarios so that both the erasure and the corrupt scenarios can be treated consistently. We make an amendment to the network structure to improve the retrieval rate, at the cost of an extra scalar per neuron. Afterwards, five different approaches are proposed to deal with corrupt probes. As a result, we extend the network capability, and also increase the robustness of the retrieving procedure. We then experimentally compare all these proposals and discuss pros and cons of each approach under different types of errors. Simulation results show that if carefully designed, the network is able to preserve both a high retrieval rate and a low running time simultaneously, even when queried by a corrupt probe.

preprint2014arXiv

Storing sequences in binary tournament-based neural networks

An extension to a recently introduced architecture of clique-based neural networks is presented. This extension makes it possible to store sequences with high efficiency. To obtain this property, network connections are provided with orientation and with flexible redundancy carried by both spatial and temporal redundancy, a mechanism of anticipation being introduced in the model. In addition to the sequence storage with high efficiency, this new scheme also offers biological plausibility. In order to achieve accurate sequence retrieval, a double layered structure combining hetero-association and auto-association is also proposed.

preprint2013arXiv

A Low-Power Content-Addressable-Memory Based on Clustered-Sparse-Networks

A low-power Content-Addressable-Memory (CAM) is introduced employing a new mechanism for associativity between the input tags and the corresponding address of the output data. The proposed architecture is based on a recently developed clustered-sparse-network using binary-weighted connections that on-average will eliminate most of the parallel comparisons performed during a search. Therefore, the dynamic energy consumption of the proposed design is significantly lower compared to that of a conventional low-power CAM design. Given an input tag, the proposed architecture computes a few possibilities for the location of the matched tag and performs the comparisons on them to locate a single valid match. A 0.13 um CMOS technology was used for simulation purposes. The energy consumption and the search delay of the proposed design are 9.5%, and 30.4% of that of the conventional NAND architecture respectively with a 3.4% higher number of transistors.

preprint2013arXiv

A Massively Parallel Associative Memory Based on Sparse Neural Networks

Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories. Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression algorithms, and face recognition systems. A classical example of an associative memory is the Hopfield neural network. Recently, Gripon and Berrou have introduced an alternative construction which builds on ideas from the theory of error correcting codes and which greatly outperforms the Hopfield network in capacity, diversity, and efficiency. In this paper we implement a variation of the Gripon-Berrou associative memory on a general purpose graphical processing unit (GPU). The work of Gripon and Berrou proposes two retrieval rules, sum-of-sum and sum-of-max. The sum-of-sum rule uses only matrix-vector multiplication and is easily implemented on the GPU. The sum-of-max rule is much less straightforward to implement because it involves non-linear operations. However, the sum-of-max rule gives significantly better retrieval error rates. We propose a hybrid rule tailored for implementation on a GPU which achieves a 880-fold speedup without sacrificing any accuracy.

preprint2013arXiv

A study of retrieval algorithms of sparse messages in networks of neural cliques

Associative memories are data structures addressed using part of the content rather than an index. They offer good fault reliability and biological plausibility. Among different families of associative memories, sparse ones are known to offer the best efficiency (ratio of the amount of bits stored to that of bits used by the network itself). Their retrieval process performance has been shown to benefit from the use of iterations. However classical algorithms require having prior knowledge about the data to retrieve such as the number of nonzero symbols. We introduce several families of algorithms to enhance the retrieval process performance in recently proposed sparse associative memories based on binary neural networks. We show that these algorithms provide better performance, along with better plausibility than existing techniques. We also analyze the required number of iterations and derive corresponding curves.

preprint2013arXiv

Improving Sparse Associative Memories by Escaping from Bogus Fixed Points

The Gripon-Berrou neural network (GBNN) is a recently invented recurrent neural network embracing a LDPC-like sparse encoding setup which makes it extremely resilient to noise and errors. A natural use of GBNN is as an associative memory. There are two activation rules for the neuron dynamics, namely sum-of-sum and sum-of-max. The latter outperforms the former in terms of retrieval rate by a huge margin. In prior discussions and experiments, it is believed that although sum-of-sum may lead the network to oscillate, sum-of-max always converges to an ensemble of neuron cliques corresponding to previously stored patterns. However, this is not entirely correct. In fact, sum-of-max often converges to bogus fixed points where the ensemble only comprises a small subset of the converged state. By taking advantage of this overlooked fact, we can greatly improve the retrieval rate. We discuss this particular issue and propose a number of heuristics to push sum-of-max beyond these bogus fixed points. To tackle the problem directly and completely, a novel post-processing algorithm is also developed and customized to the structure of GBNN. Experimental results show that the new algorithm achieves a huge performance boost in terms of both retrieval rate and run-time, compared to the standard sum-of-max and all the other heuristics.

preprint2013arXiv

Maximum Likelihood Associative Memories

Associative memories are structures that store data in such a way that it can later be retrieved given only a part of its content -- a sort-of error/erasure-resilience property. They are used in applications ranging from caches and memory management in CPUs to database engines. In this work we study associative memories built on the maximum likelihood principle. We derive minimum residual error rates when the data stored comes from a uniform binary source. Second, we determine the minimum amount of memory required to store the same data. Finally, we bound the computational complexity for message retrieval. We then compare these bounds with two existing associative memory architectures: the celebrated Hopfield neural networks and a neural network architecture introduced more recently by Gripon and Berrou.

preprint2013arXiv

Reconstructing a Graph from Path Traces

This paper considers the problem of inferring the structure of a network from indirect observations. Each observation (a "trace") is the unordered set of nodes which are activated along a path through the network. Since a trace does not convey information about the order of nodes within the path, there are many feasible orders for each trace observed, and thus the problem of inferring the network from traces is, in general, illposed. We propose and analyze an algorithm which inserts edges by ordering each trace into a path according to which pairs of nodes in the path co-occur most frequently in the observations. When all traces involve exactly 3 nodes, we derive necessary and sufficient conditions for the reconstruction algorithm to exactly recover the graph. Finally, for a family of random graphs, we present expressions for reconstruction error probabilities (false discoveries and missed detections).

preprint2013arXiv

Storing non-uniformly distributed messages in networks of neural cliques

Associative memories are data structures that allow retrieval of stored messages from part of their content. They thus behave similarly to human brain that is capable for instance of retrieving the end of a song given its beginning. Among different families of associative memories, sparse ones are known to provide the best efficiency (ratio of the number of bits stored to that of bits used). Nevertheless, it is well known that non-uniformity of the stored messages can lead to dramatic decrease in performance. We introduce several strategies to allow efficient storage of non-uniform messages in recently introduced sparse associative memories. We analyse and discuss the methods introduced. We also present a practical application example.

preprint2012arXiv

Forwarding Without Repeating: Efficient Rumor Spreading in Bounded-Degree Graphs

We study a gossip protocol called forwarding without repeating (FWR). The objective is to spread multiple rumors over a graph as efficiently as possible. FWR accomplishes this by having nodes record which messages they have forwarded to each neighbor, so that each message is forwarded at most once to each neighbor. We prove that FWR spreads a rumor over a strongly connected digraph, with high probability, in time which is within a constant factor of optimal for digraphs with bounded out-degree. Moreover, on digraphs with bounded out-degree and bounded number of rumors, the number of transmissions required by FWR is arbitrarily better than that of existing approaches. Specifically, FWR requires O(n) messages on bounded-degree graphs with n nodes, whereas classical forwarding and an approach based on network coding both require ω(n) messages. Our results are obtained using combinatorial and probabilistic arguments. Notably, they do not depend on expansion properties of the underlying graph, and consequently the message complexity of FWR is arbitrarily better than classical forwarding even on constant-degree expander graphs, as n \rightarrow \infty. In resource-constrained applications, where each transmission consumes battery power and bandwidth, our results suggest that using a small amount of memory at each node leads to a significant savings.

preprint2012arXiv

Learning sparse messages in networks of neural cliques

An extension to a recently introduced binary neural network is proposed in order to allow the learning of sparse messages, in large numbers and with high memory efficiency. This new network is justified both in biological and informational terms. The learning and retrieval rules are detailed and illustrated by various simulation results.

preprint2011arXiv

Qualitative Concurrent Stochastic Games with Imperfect Information

We study a model of games that combines concurrency, imperfect information and stochastic aspects. Those are finite states games in which, at each round, the two players choose, simultaneously and independently, an action. Then a successor state is chosen accordingly to some fixed probability distribution depending on the previous state and on the pair of actions chosen by the players. Imperfect information is modeled as follows: both players have an equivalence relation over states and, instead of observing the exact state, they only know to which equivalence class it belongs. Therefore, if two partial plays are indistinguishable by some player, he should behave the same in both of them. We consider reachability (does the play eventually visit a final state?) and Büchi objective (does the play visit infinitely often a final state?). Our main contribution is to prove that the following problem is complete for 2-ExpTime: decide whether the first player has a strategy that ensures her to almost-surely win against any possible strategy of her oponent. We also characterise those strategies needed by the first player to almost-surely win.

preprint2011arXiv

Sparse neural networks with large learning diversity

Coded recurrent neural networks with three levels of sparsity are introduced. The first level is related to the size of messages, much smaller than the number of available neurons. The second one is provided by a particular coding rule, acting as a local constraint in the neural activity. The third one is a characteristic of the low final connection density of the network after the learning phase. Though the proposed network is very simple since it is based on binary neurons and binary connections, it is able to learn a large number of messages and recall them, even in presence of strong erasures. The performance of the network is assessed as a classifier and as an associative memory.

Vincent Gripon

What is connected

Connect this record

See the researcher in context

Building this map preview

36 published item(s)

Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach

EASY: Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients

Preserving Fine-Grain Feature Information in Classification via Entropic Regularization

Preventing Manifold Intrusion with Locality: Local Mixup

Pruning Graph Convolutional Networks to select meaningful graph frequencies for fMRI decoding

Rethinking Weight Decay For Efficient Neural Network Pruning

Graph-based Interpolation of Feature Vectors for Accurate Few-Shot Classification

Improving Classification Accuracy with Graph Filtering

Inferring Graph Signal Translations as Invariant Transformations for Classification Tasks

Leveraging the Feature Distribution in Transfer-based Few-Shot Learning

BitPruning: Learning Bitlengths for Aggressive and Accurate Quantization

GPU-based Self-Organizing Maps for Post-Labeled Few-Shot Unsupervised Learning

Graph topology inference benchmarks for machine learning

Predicting the Accuracy of a Few-Shot Classifier

ThriftyNets : Convolutional Neural Networks with Tiny Parameter Budget

Towards an Intrinsic Definition of Robustness for a Classifier

Efficient Hardware Implementation of Incremental Learning and Inference on Chip

A Comparative Study of Sparse Associative Memories

Compression of Deep Neural Networks on the Fly

Graph reconstruction from the observation of diffused signals

Neighborhood-Preserving Translations on Graphs

Toward An Uncertainty Principle For Weighted Graphs

Towards a characterization of the uncertainty curve for graphs

Combating Corrupt Messages in Sparse Clustered Associative Memories

Storing sequences in binary tournament-based neural networks

A Low-Power Content-Addressable-Memory Based on Clustered-Sparse-Networks

A Massively Parallel Associative Memory Based on Sparse Neural Networks

A study of retrieval algorithms of sparse messages in networks of neural cliques

Improving Sparse Associative Memories by Escaping from Bogus Fixed Points

Maximum Likelihood Associative Memories

Reconstructing a Graph from Path Traces

Storing non-uniformly distributed messages in networks of neural cliques

Forwarding Without Repeating: Efficient Rumor Spreading in Bounded-Degree Graphs

Learning sparse messages in networks of neural cliques

Qualitative Concurrent Stochastic Games with Imperfect Information

Sparse neural networks with large learning diversity