Source author record

Sandro Rigo

Sandro Rigo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Distributed, Parallel, and Cluster Computing Artificial Intelligence Computer Vision Cryptography and Security Databases eess.IV Machine Learning Programming Languages

Catalog footprint

What is connected

5works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Perspectives on risk prioritization of data center vulnerabilities using rank aggregation and multi-objective optimization

Nowadays, data has become an invaluable asset to entities and companies, and keeping it secure represents a major challenge. Data centers are responsible for storing data provided by software applications. Nevertheless, the number of vulnerabilities has been increasing every day. Managing such vulnerabilities is essential for building a reliable and secure network environment. Releasing patches to fix security flaws in software is a common practice to handle these vulnerabilities. However, prioritization becomes crucial for organizations with an increasing number of vulnerabilities since time and resources to fix them are usually limited. This review intends to present a survey of vulnerability ranking techniques and promote a discussion on how multi-objective optimization could benefit the management of vulnerabilities risk prioritization. The state-of-the-art approaches for risk prioritization were reviewed, intending to develop an effective model for ranking vulnerabilities in data centers. The main contribution of this work is to point out multi-objective optimization as a not commonly explored but promising strategy to prioritize vulnerabilities, enabling better time management and increasing security.

preprint2022arXiv

The OpenMP Cluster Programming Model

Despite the various research initiatives and proposed programming models, efficient solutions for parallel programming in HPC clusters still rely on a complex combination of different programming models (e.g., OpenMP and MPI), languages (e.g., C++ and CUDA), and specialized runtimes (e.g., Charm++ and Legion). On the other hand, task parallelism has shown to be an efficient and seamless programming model for clusters. This paper introduces OpenMP Cluster (OMPC), a task-parallel model that extends OpenMP for cluster programming. OMPC leverages OpenMP's offloading standard to distribute annotated regions of code across the nodes of a distributed system. To achieve that it hides MPI-based data distribution and load-balancing mechanisms behind OpenMP task dependencies. Given its compliance with OpenMP, OMPC allows applications to use the same programming model to exploit intra- and inter-node parallelism, thus simplifying the development process and maintenance. We evaluated OMPC using Task Bench, a synthetic benchmark focused on task parallelism, comparing its performance against other distributed runtimes. Experimental results show that OMPC can deliver up to 1.53x and 2.43x better performance than Charm++ on CCR and scalability experiments, respectively. Experiments also show that OMPC performance weakly scales for both Task Bench and a real-world seismic imaging application.

preprint2020arXiv

Binary Segmentation of Seismic Facies Using Encoder-Decoder Neural Networks

The interpretation of seismic data is vital for characterizing sediments' shape in areas of geological study. In seismic interpretation, deep learning becomes useful for reducing the dependence on handcrafted facies segmentation geometry and the time required to study geological areas. This work presents a Deep Neural Network for Facies Segmentation (DNFS) to obtain state-of-the-art results for seismic facies segmentation. DNFS is trained using a combination of cross-entropy and Jaccard loss functions. Our results show that DNFS obtains highly detailed predictions for seismic facies segmentation using fewer parameters than StNet and U-Net.

preprint2020arXiv

Employing Simulation to Facilitate the Design of Dynamic Code Generators

Dynamic Translation (DT) is a sophisticated technique that allows the implementation of high-performance emulators and high-level-language virtual machines. In this technique, the guest code is compiled dynamically at runtime. Consequently, achieving good performance depends on several design decisions, including the shape of the regions of code being translated. Researchers and engineers explore these decisions to bring the best performance possible. However, a real DT engine is a very sophisticated piece of software, and modifying one is a hard and demanding task. Hence, we propose using simulation to evaluate the impact of design decisions on dynamic translators and present RAIn, an open-source DT simulator that facilitates the test of DT's design decisions, such as Region Formation Techniques (RFTs). RAIn outputs several statistics that support the analysis of how design decisions may affect the behavior and the performance of a real DT. We validated RAIn running a set of experiments with six well known RFTs (NET, MRET2, LEI, NETPlus, NET-R, and NETPlus-e-r) and showed that it can reproduce well-known results from the literature without the effort of implementing them on a real and complex dynamic translator engine.

preprint2020arXiv

Similarity Driven Approximation for Text Analytics

Text analytics has become an important part of business intelligence as enterprises increasingly seek to extract insights for decision making from text data sets. Processing large text data sets can be computationally expensive, however, especially if it involves sophisticated algorithms. This challenge is exacerbated when it is desirable to run different types of queries against a data set, making it expensive to build multiple indices to speed up query processing. In this paper, we propose and evaluate a framework called EmApprox that uses approximation to speed up the processing of a wide range of queries over large text data sets. The key insight is that different types of queries can be approximated by processing subsets of data that are most similar to the queries. EmApprox builds a general index for a data set by learning a natural language processing model, producing a set of highly compressed vectors representing words and subcollections of documents. Then, at query processing time, EmApprox uses the index to guide sampling of the data set, with the probability of selecting each subcollection of documents being proportional to its {\em similarity} to the query as computed using the vector representations. We have implemented a prototype of EmApprox as an extension of the Apache Spark system, and used it to approximate three types of queries: aggregation, information retrieval, and recommendation. Experimental results show that EmApprox's similarity-guided sampling achieves much better accuracy than random sampling. Further, EmApprox can achieve significant speedups if users can tolerate small amounts of inaccuracies. For example, when sampling at 10\%, EmApprox speeds up a set of queries counting phrase occurrences by almost 10x while achieving estimated relative errors of less than 22\% for 90\% of the queries.

Sandro Rigo

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Perspectives on risk prioritization of data center vulnerabilities using rank aggregation and multi-objective optimization

The OpenMP Cluster Programming Model

Binary Segmentation of Seismic Facies Using Encoder-Decoder Neural Networks

Employing Simulation to Facilitate the Design of Dynamic Code Generators

Similarity Driven Approximation for Text Analytics