Source author record

Christophe Cérin

Christophe Cérin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Distributed, Parallel, and Cluster Computing; Artificial Intelligence; Machine Learning; Performance

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

Preprint · 2019 · arXiv

A Distributed and Approximated Nearest Neighbors Algorithm for an Efficient Large Scale Mean Shift Clustering

In this paper we target the class of modal clustering methods, in which clusters are defined in terms of the local modes of the probability density function that generates the data. The best-known modal clustering method is k-means clustering. Mean Shift clustering is a generalization of k-means that computes arbitrarily shaped clusters, defined as the basins of attraction of the local modes reached by density gradient ascent paths. Despite its potential, the Mean Shift approach is a computationally expensive method for unsupervised learning. We therefore introduce two contributions that aim to provide clustering algorithms with linear time complexity, as opposed to the quadratic time complexity of exact Mean Shift clustering. First, we propose a scalable procedure to approximate the density gradient ascent. Second, we present a scalable cluster labeling technique. Both are based on Locality Sensitive Hashing (LSH) to approximate nearest neighbors. These two techniques may be used for moderate-sized datasets. Furthermore, we show that using our approximation of the density gradient ascent as a pre-processing step in other clustering methods can also improve dedicated classification metrics. For the latter, we propose a distributed implementation written for the Spark/Scala ecosystem. For all the clustering methods considered, we present experimental results illustrating their labeling accuracy and their potential to solve concrete problems.
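
The idea of approximating the gradient ascent with LSH can be illustrated with a minimal single-machine sketch. It is not the authors' Spark/Scala implementation: the hash family (random projections with a bucket width), the flat kernel, and every name and parameter below (`make_hash`, `approx_mean_shift`, `bandwidth`, `width`, `n_proj`) are assumptions chosen for illustration. Each point only looks at the data points that share its hash bucket, so a step costs time proportional to the bucket size rather than to the whole dataset.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, n_proj, width, rng):
    """Build an LSH key: one floor((a.x + b) / w) value per random projection."""
    projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_proj)]
    offsets = [rng.uniform(0.0, width) for _ in range(n_proj)]

    def key(point):
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, point)) + off) / width)
            for proj, off in zip(projections, offsets)
        )

    return key


def approx_mean_shift(data, bandwidth=2.0, width=8.0, n_proj=2, n_iter=10, seed=0):
    """Shift every point toward a local density mode using bucket-local neighbors."""
    rng = random.Random(seed)
    key = make_hash(len(data[0]), n_proj, width, rng)

    # Hash the data once: candidates for a query are the points sharing its bucket.
    buckets = defaultdict(list)
    for p in data:
        buckets[key(p)].append(p)

    shifted = [tuple(p) for p in data]
    for _ in range(n_iter):
        moved = []
        for pos in shifted:
            candidates = buckets.get(key(pos), [])
            near = [q for q in candidates if math.dist(pos, q) <= bandwidth]
            if near:  # flat-kernel mean of the approximate neighbors
                pos = tuple(sum(c) / len(near) for c in zip(*near))
            moved.append(pos)
        shifted = moved
    return shifted


blob_a = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
blob_b = [(x + 10.0, y + 10.0) for x, y in blob_a]
modes = approx_mean_shift(blob_a + blob_b)
```

In this toy run the two blobs are far apart relative to `bandwidth`, so each point can only drift toward the dense region of its own blob. Points that end up close together would then receive the same cluster label; the labeling step itself is not shown here.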

Preprint · 2014 · arXiv

Backtracking algorithms for service selection

In this paper, we explore the automation of service composition, focusing on the service selection problem. In the formulation that we consider, the input is a behavioral composition whose abstract services must be bound to concrete ones. The objective is to find the binding that optimizes the utility of the composition under given service level agreements. We propose a complete solution. First, we show that the service selection problem can be mapped onto a Constraint Satisfaction Problem (CSP). The benefit of this mapping is that the large body of know-how on solving CSPs can be applied to service selection. Among the existing techniques for solving CSPs, we consider backtracking. Our second contribution is a set of backtracking-based algorithms for the service selection problem, with variants inspired by existing CSP heuristics. We analyze the runtime gain of our framework over an intuitive resolution based on exhaustive search. Our last contribution is an experimental evaluation in which we demonstrate that backtracking yields an effective gain over some comparable approaches. The experiments also show that our proposal can find optimal solutions in real time on small and medium service compositions.
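
A backtracking search of the kind described above can be sketched as follows. The model is a deliberate simplification and not the paper's formulation: each abstract service has a hypothetical list of `(name, utility, cost)` candidates, the only service level agreement is a total cost budget, and costs are assumed non-negative. The function name `select_services` and the bounding rule are illustrative assumptions.

```python
def select_services(candidates, budget):
    """Bind each abstract service to one candidate, maximizing total utility.

    candidates[i] is a list of (name, utility, cost) tuples for abstract service i.
    Returns (binding, utility), or (None, -inf) when no binding fits the budget.
    """
    # Optimistic bound: best utility still obtainable from services i..end.
    best_rest = [0.0] * (len(candidates) + 1)
    for i in range(len(candidates) - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] + max(u for _, u, _ in candidates[i])

    best = {"utility": float("-inf"), "binding": None}

    def backtrack(i, binding, utility, cost):
        if cost > budget:
            return  # constraint violated: abandon this partial binding
        if utility + best_rest[i] <= best["utility"]:
            return  # even the optimistic completion cannot beat the incumbent
        if i == len(candidates):
            best["utility"], best["binding"] = utility, list(binding)
            return
        for name, u, c in candidates[i]:
            binding.append(name)
            backtrack(i + 1, binding, utility + u, cost + c)
            binding.pop()

    backtrack(0, [], 0.0, 0.0)
    return best["binding"], best["utility"]


catalog = [
    [("a1", 5, 4), ("a2", 3, 1)],  # candidates for the first abstract service
    [("b1", 5, 3), ("b2", 2, 1)],  # candidates for the second abstract service
]
binding, utility = select_services(catalog, budget=5)
```

Compared with exhaustive search, the two early returns discard whole subtrees: a partial binding that already exceeds the budget is never extended, and neither is one whose best possible completion cannot improve on the current best.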

Preprint · 2012 · arXiv

Intégration des intergiciels de grilles de PC dans le nuage SlapOS : le cas de BOINC (Integrating Desktop Grid Middleware into the SlapOS Cloud: The Case of BOINC)

In this article we describe the problems and solutions related to integrating desktop grid middleware into a cloud, in this case the open source SlapOS cloud. We focus on the recipes that describe the integration and on the problem of confining execution. These constitute two aspects of service-oriented architecture and cloud computing. The way SlapOS solves these two issues differs from what is traditionally done in clouds, because we do not rely on virtual machines and there is no data center (as defined in cloud computing). Moreover, we show that the initial deployment model accommodates not only Web applications, B2B applications... but also applications from the field of grids, with desktop grid middleware serving here as a case study.

Preprint · 2006 · arXiv

Methods for Partitioning Data to Improve Parallel Execution Time for Sorting on Heterogeneous Clusters

The aim of the paper is to introduce general techniques for optimizing the parallel execution time of sorting on distributed architectures with processors of various speeds. Such an application requires a partitioning step. For uniformly related processors (processor speeds are related by a constant factor), we develop a constant-time technique for controlling processor load and execution time in a heterogeneous environment, as well as a technique to deal with unknown cost functions. For non-uniformly related processors, we use a technique based on dynamic programming. In most cases, the solutions run in O(p) time, where p is the number of processors, independent of the problem size n. Consequently, the overhead relative to the sorting problem itself is small, but the approach is inherently limited by the need to know the time complexity of the portion of code that follows the partitioning.
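
As a rough illustration of a partitioning step whose cost depends only on p, the sketch below splits n keys in proportion to relative processor speeds. This is a simplification that assumes a linear cost model; the paper's techniques account for the actual cost function of sorting, which this sketch does not. The function name `partition_sizes` and the integer speed values are hypothetical.

```python
from itertools import accumulate


def partition_sizes(n, speeds):
    """Split n keys among processors in proportion to their relative speeds.

    Runs in O(p) for p processors, independent of n.
    """
    total = sum(speeds)
    # Cumulative rounding keeps every size integral and the sizes summing to n.
    bounds = [0] + [round(n * c / total) for c in accumulate(speeds)]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


# Three processors, the third one twice as fast as the other two.
sizes = partition_sizes(100, [1, 1, 2])
```

Rounding the cumulative boundaries rather than each share separately avoids a second pass to redistribute leftover keys, which keeps the step linear in the number of processors.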