Source author record

Haridimos Kondylakis

Haridimos Kondylakis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Databases

Catalog footprint

What is connected

3works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Coconut Palm: Static and Streaming Data Series Exploration Now in your Palm

Many modern applications produce massive streams of data series and maintain them in indexes to be able to explore them through nearest neighbor search. Existing data series indexes, however, are expensive to operate as they issue many random I/Os to storage. To address this problem, we recently proposed Coconut, a new infrastructure that organizes data series based on a new sortable format. In this way, Coconut is able to leverage state-of-the-art indexing techniques that rely on sorting for the first time to build, maintain and query data series indexes using fast sequential I/Os. In this demonstration, we present Coconut Palm, a new exploration tool that allows to interactively combine different indexing techniques from within the Coconut infrastructure and to thereby seamlessly explore data series from across various scientific domains. We highlight the rich indexing design choices that Coconut opens up, and we present a new recommender tool that allows users to intelligently navigate them for both static and streaming data exploration scenarios.

preprint2020arXiv

Coconut: a scalable bottom-up approach for building data series indexes

Many modern applications produce massive amounts of data series that need to be analyzed, requiring efficient similarity search operations. However, the state-of-the-art data series indexes that are used for this purpose do not scale well for massive datasets in terms of performance, or storage costs. We pinpoint the problem to the fact that existing summarizations of data series used for indexing cannot be sorted while keeping similar data series close to each other in the sorted order. This leads to two design problems. First, traditional bulk-loading algorithms based on sorting cannot be used. Instead, index construction takes place through slow top-down insertions, which create a non-contiguous index that results in many random I/Os. Second, data series cannot be sorted and split across nodes evenly based on their median value; thus, most leaf nodes are in practice nearly empty. This further slows down query speed and amplifies storage costs. To address these problems, we present Coconut. The first innovation in Coconut is an inverted, sortable data series summarization that organizes data series based on a z-order curve, keeping similar series close to each other in the sorted order. As a result, Coconut is able to use bulk-loading techniques that rely on sorting to quickly build a contiguous index using large sequential disk I/Os. We then explore prefix-based and median-based splitting policies for bottom-up bulk-loading, showing that median-based splitting outperforms the state of the art, ensuring that all nodes are densely populated. Overall, we show analytically and empirically that Coconut dominates the state-of-the-art data series indexes in terms of construction speed, query speed, and storage costs.

preprint2020arXiv

Graph Summarization

The continuous and rapid growth of highly interconnected datasets, which are both voluminous and complex, calls for the development of adequate processing and analytical techniques. One method for condensing and simplifying such datasets is graph summarization. It denotes a series of application-specific algorithms designed to transform graphs into more compact representations while preserving structural patterns, query answers, or specific property distributions. As this problem is common to several areas studying graph topologies, different approaches, such as clustering, compression, sampling, or influence detection, have been proposed, primarily based on statistical and optimization methods. The focus of our chapter is to pinpoint the main graph summarization methods, but especially to focus on the most recent approaches and novel research trends on this topic, not yet covered by previous surveys.

Haridimos Kondylakis

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

Coconut Palm: Static and Streaming Data Series Exploration Now in your Palm

Coconut: a scalable bottom-up approach for building data series indexes

Graph Summarization