Researcher profile

Antonio Fariña

Antonio Fariña contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified
Verification L1
Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

Preprint, 2020, arXiv

Revisiting compact RDF stores based on k2-trees

We present a new compact representation to efficiently store and query large RDF datasets in main memory. Our proposal, called BMatrix, is based on the k2-tree, a data structure devised to represent binary matrices in a compressed way, and aims at improving the results of previous state-of-the-art alternatives, especially in datasets with a relatively large number of predicates. We introduce our technique, together with some improvements on the basic k2-tree that can be applied to our solution in order to boost compression. Experimental results on the flagship RDF dataset DBpedia show that our proposal achieves better compression than existing alternatives, while yielding competitive query times, particularly in the most frequent triple patterns and in queries with an unbound predicate, in which we outperform existing solutions.
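
To make the underlying idea concrete, the following is a minimal sketch of a basic k2-tree over a small binary matrix. It is not BMatrix and does not reproduce the paper's improvements. The function names `build_k2tree` and `get_cell`, the sample matrix, and the choice k = 2 are assumptions made for illustration, and the rank operation is computed naively where a practical implementation would use a constant-time rank structure.

```python
from typing import List


def build_k2tree(matrix: List[List[int]], k: int = 2) -> List[int]:
    """Return the level-order bit sequence of a k2-tree.

    Assumes a square matrix whose side is a power of k (illustrative limit).
    A bit is 1 if the corresponding submatrix contains at least one 1.
    """
    n = len(matrix)
    bits: List[int] = []
    queue = [(0, 0, n)]  # (top row, left column, side) of nodes to expand
    while queue:
        next_queue = []
        for r0, c0, size in queue:
            sub = size // k
            for i in range(k):
                for j in range(k):
                    r, c = r0 + i * sub, c0 + j * sub
                    has_one = any(
                        matrix[y][x]
                        for y in range(r, r + sub)
                        for x in range(c, c + sub)
                    )
                    bits.append(1 if has_one else 0)
                    # Only non-empty submatrices larger than a cell get children.
                    if has_one and sub > 1:
                        next_queue.append((r, c, sub))
        queue = next_queue
    return bits


def get_cell(bits: List[int], n: int, row: int, col: int, k: int = 2) -> int:
    """Read one cell by descending from the root, one level per iteration."""
    base, size = 0, n
    while True:
        sub = size // k
        idx = base + (row // sub) * k + (col // sub)
        if bits[idx] == 0:
            return 0  # whole submatrix is empty
        if sub == 1:
            return 1  # reached a single cell
        # Children of the node at idx start at rank1(bits, idx) * k^2.
        base = sum(bits[: idx + 1]) * k * k  # naive rank, O(n) per step
        row, col, size = row % sub, col % sub, sub


matrix = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
bits = build_k2tree(matrix)
print(bits)  # [1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
```

In this toy case the 16 cells are stored in 12 bits because the two empty quadrants cost a single 0 bit each; the saving grows with larger, sparser matrices.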

Preprint, 2020, arXiv

Semantrix: A Compressed Semantic Matrix

We present a compact data structure to represent both the duration and length of homogeneous segments of trajectories from moving objects so that, acting as a data warehouse, it allows us to efficiently answer cumulative queries. The division of trajectories into relevant segments has been studied in the literature under the topic of Trajectory Segmentation. In this paper, we design a data structure to compactly represent them and the algorithms to answer the most relevant queries. We experimentally evaluate our proposal in the real context of an enterprise with mobile workers (truck drivers), where we aim to analyze the time they spend in different activities. To test our proposal under higher stress conditions, we generated a large number of realistic synthetic trajectories and evaluated our system on those data to estimate its space needs and its efficiency when answering different types of queries.

Preprint, 2016, arXiv

Universal Indexes for Highly Repetitive Document Collections

Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others. Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space. We introduce new techniques for compressing inverted indexes that exploit this near-copy regularity. They are based on run-length, Lempel-Ziv, or grammar compression of the differential inverted lists, instead of the usual practice of gap-encoding them. We show that, in this highly repetitive setting, our compression methods significantly reduce the space obtained with classical techniques, at the price of moderate slowdowns. Moreover, our best methods are universal, that is, they do not need to know the versioning structure of the collection, nor that a clear versioning structure even exists. We also introduce compressed self-indexes in the comparison. These are designed for general strings (not only natural language texts) and represent the text collection plus the index structure (not an inverted index) in integrated form. We show that these techniques can compress much further, using a small fraction of the space required by our new inverted indexes. Yet, they are orders of magnitude slower.
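
As a rough illustration of one of the ideas named in the abstract, the sketch below run-length encodes the differential (gap) form of a posting list. It assumes that a term occurring in consecutive versions of a document produces consecutive document identifiers, so the gap list contains long runs of 1s. The function names and the sample posting list are hypothetical, and the code does not reproduce the paper's actual encoders.

```python
from typing import List, Tuple


def to_gaps(postings: List[int]) -> List[int]:
    """Differential form of a strictly increasing list of document ids."""
    gaps, prev = [], 0
    for doc_id in postings:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def run_length_encode(gaps: List[int]) -> List[Tuple[int, int]]:
    """Collapse equal consecutive gaps into (gap, repetitions) pairs."""
    runs: List[Tuple[int, int]] = []
    for gap in gaps:
        if runs and runs[-1][0] == gap:
            runs[-1] = (gap, runs[-1][1] + 1)
        else:
            runs.append((gap, 1))
    return runs


def decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Rebuild the original posting list from the run-length pairs."""
    postings, current = [], 0
    for gap, count in runs:
        for _ in range(count):
            current += gap
            postings.append(current)
    return postings


# Hypothetical term present in versions 3..8 and 20..23 of a collection.
postings = [3, 4, 5, 6, 7, 8, 20, 21, 22, 23]
gaps = to_gaps(postings)
runs = run_length_encode(gaps)
print(gaps)  # [3, 1, 1, 1, 1, 1, 12, 1, 1, 1]
print(runs)  # [(3, 1), (1, 5), (12, 1), (1, 3)]
```

Here ten gaps shrink to four pairs. Plain gap encoding would still spend one codeword per posting, which is the regularity the abstract says traditional techniques fail to exploit.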