Source author record

Dimitrios Gunopulos

Dimitrios Gunopulos appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning; Databases; hep-ex; Computational Geometry; Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing; hep-ph; Human-Computer Interaction; physics.comp-ph; physics.data-an

Catalog footprint

What is connected

8 works

10 topics

4 close collaborators

preprint, 2023, arXiv

A Novel Framework for Handling Sparse Data in Traffic Forecast

The ever-increasing number of GPS-equipped vehicles provides valuable real-time traffic information for the roads they traverse. As a result, a set of sparse, time-evolving traffic reports is generated for each road. These time series are a valuable asset for forecasting future traffic conditions. In this paper we present a deep learning framework that encodes the sparse recent traffic information and forecasts the future traffic condition. Our framework consists of a recurrent part and a decoder. The recurrent part employs an attention mechanism that encodes the traffic reports available in a particular time window. The decoder is responsible for forecasting the future traffic condition.
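To illustrate how an attention step can summarise however many reports happen to fall in one time window, here is a minimal sketch. It is not the framework's actual architecture: the function name `attention_pool`, the scaled dot-product scoring, and the use of a fixed `query` vector (standing in for whatever state the recurrent part would supply) are all assumptions made for illustration.

```python
import math

def attention_pool(reports, query):
    """Summarise a variable-size list of report vectors into one vector.

    Assumed scheme: scaled dot-product scores against `query`, softmax
    weights, then a weighted average. An empty window yields a zero vector.
    """
    if not reports:
        return [0.0] * len(query)
    scale = math.sqrt(len(query))
    scores = [sum(r_i * q_i for r_i, q_i in zip(r, query)) / scale
              for r in reports]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * r[j] for w, r in zip(weights, reports)) / total
            for j in range(len(query))]

# Hypothetical window with two reports, each a (speed, density) pair.
window = [[30.0, 0.8], [50.0, 0.2]]
print(attention_pool(window, [0.1, 0.0]))
```

The output has the same size whether the window holds zero, one, or many reports, which is what lets a sparse stream feed a fixed-size recurrent input.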

preprint, 2023, arXiv

HTTE: A Hybrid Technique For Travel Time Estimation In Sparse Data Environments

Travel time estimation is a critical task, useful to many urban applications at the individual citizen and the stakeholder level. This paper presents a novel hybrid algorithm for travel time estimation that leverages historical and sparse real-time trajectory data. Given a path and a departure time we estimate the travel time taking into account the historical information, the real-time trajectory data and the correlations among different road segments. We detect similar road segments using historical trajectories, and use a latent representation to model the similarities. Our experimental evaluation demonstrates the effectiveness of our approach.

preprint, 2022, arXiv

Particle Cloud Generation with Message Passing Generative Adversarial Networks

In high energy physics (HEP), jets are collections of correlated particles produced ubiquitously in particle collisions such as those at the CERN Large Hadron Collider (LHC). Machine learning (ML)-based generative models, such as generative adversarial networks (GANs), have the potential to significantly accelerate LHC jet simulations. However, despite jets having a natural representation as a set of particles in momentum-space, a.k.a. a particle cloud, there exist no generative models applied to such a dataset. In this work, we introduce a new particle cloud dataset (JetNet), and apply to it existing point cloud GANs. Results are evaluated using (1) 1-Wasserstein distances between high- and low-level feature distributions, (2) a newly developed Fréchet ParticleNet Distance, and (3) the coverage and (4) minimum matching distance metrics. Existing GANs are found to be inadequate for physics applications, hence we develop a new message passing GAN (MPGAN), which outperforms existing point cloud GANs on virtually every metric and shows promise for use in HEP. We propose JetNet as a novel point-cloud-style dataset for the ML community to experiment with, and set MPGAN as a benchmark to improve upon for future generative models. Additionally, to facilitate research and improve accessibility and reproducibility in this area, we release the open-source JetNet Python package with interfaces for particle cloud datasets, implementations for evaluation and loss metrics, and more tools for ML in HEP development.
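As a small illustration of the first metric mentioned above, the 1-Wasserstein distance between two equally sized one-dimensional samples reduces to the mean absolute difference between their sorted values. The sketch below is plain Python and does not use the JetNet package; the function name `wasserstein_1d` and the sample values are assumptions for illustration only.

```python
def wasserstein_1d(real, generated):
    """1-Wasserstein distance between two equal-size 1D samples.

    For empirical distributions with the same number of points, the optimal
    transport plan pairs the sorted values in order.
    """
    if len(real) != len(generated):
        raise ValueError("this sketch assumes equal sample sizes")
    if not real:
        raise ValueError("samples must be non-empty")
    pairs = zip(sorted(real), sorted(generated))
    return sum(abs(x - y) for x, y in pairs) / len(real)

# Hypothetical feature values (for example, one feature per jet).
real_feature = [0.0, 1.0, 2.0]
generated_feature = [1.0, 2.0, 3.0]
print(wasserstein_1d(real_feature, generated_feature))  # 1.0
```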

preprint, 2021, arXiv

Graph Generative Adversarial Networks for Sparse Data Generation in High Energy Physics

We develop a graph generative adversarial network to generate sparse data sets like those produced at the CERN Large Hadron Collider (LHC). We demonstrate this approach by training on and generating sparse representations of MNIST handwritten digit images and jets of particles in proton-proton collisions like those at the LHC. We find the model successfully generates sparse MNIST digits and particle jet data. We quantify agreement between real and generated data with a graph-based Fréchet Inception distance, and the particle and jet feature-level 1-Wasserstein distance for the MNIST and jet datasets respectively.

preprint, 2021, arXiv

Low-Rank Methods in Event Detection and Subsampled Point-to-Subspace Proximity Tests

Monitoring of streamed data to detect abnormal behaviour (variously known as event detection, anomaly detection, change detection, or outlier detection) underlies many applications of the Internet of Things. There, one often collects data from a variety of sources, with asynchronous sampling and missing data. In this setting, one can predict abnormal behaviour using low-rank techniques. In particular, we assume that normal observations come from a low-rank subspace, prior to being corrupted by uniformly distributed noise. Correspondingly, we aim to recover a representation of the subspace, and perform event detection by running point-to-subspace distance queries for incoming data. To obtain the low-rank model, we use a variant of low-rank factorisation, which considers interval uncertainty sets around "known entries", on a suitable flattening of the input data. On-line, we compute the distance of incoming data to the low-rank normal subspace and update the subspace to keep it consistent with the seasonal changes present. For the distance computation, we suggest subsampling. We bound the one-sided error as a function of the number of coordinates employed using techniques from learning theory and computational geometry. In our experimental evaluation, we have tested the ability of the proposed algorithm to identify samples of abnormal behaviour in induction-loop data from Dublin, Ireland.
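The idea of a subsampled point-to-subspace test can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: it uses an ordinary Euclidean least-squares residual rather than the interval-uncertainty formulation, and the function name `subspace_residual`, the synthetic data, and the sample sizes are assumptions.

```python
import numpy as np

def subspace_residual(basis, x, coords=None):
    """Least-squares residual norm of x against the column span of `basis`.

    If `coords` is given, only those coordinates (rows) are used, which is
    cheaper and can only make the residual smaller or leave it equal.
    """
    if coords is not None:
        basis, x = basis[coords], x[coords]
    weights, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return float(np.linalg.norm(basis @ weights - x))

rng = np.random.default_rng(0)
d, r = 200, 3                                    # ambient dimension, rank
basis = rng.standard_normal((d, r))              # assumed "normal" subspace
normal = basis @ rng.standard_normal(r)          # lies in the subspace
anomalous = normal + 5.0 * rng.standard_normal(d)
coords = rng.choice(d, size=40, replace=False)   # subsample of coordinates

print(subspace_residual(basis, normal, coords))     # close to zero
print(subspace_residual(basis, anomalous, coords))  # clearly positive
```

In this Euclidean sketch the error is one-sided in the following sense: a point that lies in the subspace always has zero residual on any subset of coordinates, so subsampling can miss an anomaly but cannot flag a point that truly lies in the subspace.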

preprint, 2015, arXiv

Anima: Adaptive Personalized Software Keyboard

We present a software keyboard for smart touchscreen devices that learns its owner's unique dictionary in order to produce personalized typing predictions. The learning process is accelerated by analysing the user's past typed communication. Moreover, personal temporal user behaviour is captured and exploited in the prediction engine. Computational and storage issues are addressed by dynamically forgetting words that the user no longer types. A prototype implementation is available on the Google Play Store.
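A minimal sketch of a personal dictionary that forgets unused words might look like the following. It is not Anima's implementation: the class name `PersonalDictionary`, the idle-time rule, and the frequency-based ranking are assumptions chosen to illustrate the idea of bounded storage through forgetting.

```python
class PersonalDictionary:
    """Toy word store with prefix prediction and idle-based forgetting."""

    def __init__(self, max_idle):
        self.max_idle = max_idle  # assumed limit on time since last use
        self.count = {}
        self.last_used = {}

    def record(self, word, now):
        """Register that `word` was typed at time `now`."""
        self.count[word] = self.count.get(word, 0) + 1
        self.last_used[word] = now

    def forget_stale(self, now):
        """Drop words idle for longer than `max_idle`; return them."""
        stale = [w for w, t in self.last_used.items()
                 if now - t > self.max_idle]
        for w in stale:
            del self.count[w]
            del self.last_used[w]
        return stale

    def predict(self, prefix, k=3):
        """Return up to k known words with this prefix, most frequent first."""
        matches = [w for w in self.count if w.startswith(prefix)]
        return sorted(matches, key=lambda w: (-self.count[w], w))[:k]

words = PersonalDictionary(max_idle=30)
for day in (0, 1, 2):
    words.record("help", now=day)
words.record("hello", now=50)
print(words.predict("he"))         # ['help', 'hello']
print(words.forget_stale(now=50))  # ['help']
print(words.predict("he"))         # ['hello']
```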

preprint, 2015, arXiv

Elastic Processing of Analytical Query Workloads on IaaS Clouds

Many modern applications require the evaluation of analytical queries on large amounts of data. Such queries entail joins and heavy aggregations that often include user-defined functions (UDFs). The most efficient way to process this specific type of query is to use tree execution plans. In this work, we develop an engine for analytical query processing and a suite of specialized techniques that collectively take advantage of the tree form of such plans. The engine executes these tree plans in an elastic IaaS cloud infrastructure and dynamically adapts by allocating and releasing pertinent resources based on the query workload monitored over a sliding time window. The engine offers its services for a fee according to service-level agreements (SLAs) associated with the incoming queries; its management of cloud resources aims at maximizing the profit after removing the costs of using these resources. We have fully implemented our algorithms in the Exareme dataflow processing system. We present an extensive evaluation that demonstrates that our approach is very efficient (exhibiting fast response times), elastic (successfully adjusting the cloud resources it uses as the engine continually adapts to query workload changes), and profitable (approximating very well the maximum difference between SLA-based income and cloud-based expenses).

preprint, 2012, arXiv

On The Spatiotemporal Burstiness of Terms

Thousands of documents are made available to the users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally exhibited when an unusually high frequency is observed for t. While spatial and temporal burstiness have been studied individually in the past, our work is the first to simultaneously track and measure spatiotemporal term burstiness. In addition, we use the mined burstiness information toward an efficient document-search engine: given a user's query of terms, our engine returns a ranked list of documents discussing influential events with a strong spatiotemporal impact. We demonstrate the efficiency of our methods with an extensive experimental evaluation on real and synthetic datasets.
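The notion of a burst as an unusually high frequency for a term can be illustrated with a simple threshold test. This sketch covers only the temporal dimension and is not the paper's method: the function name `burst_windows`, the z-score criterion, and the threshold of 2.0 are assumptions for illustration.

```python
from statistics import mean, pstdev

def burst_windows(counts, threshold=2.0):
    """Return indices of time windows where a term's count is unusually high.

    A window is flagged when its count exceeds the series mean by more than
    `threshold` population standard deviations (an assumed criterion).
    """
    if not counts:
        return []
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        return []  # a constant series has no bursts
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > threshold]

# Hypothetical daily counts of one term in a document stream.
daily_counts = [3, 4, 2, 5, 3, 40, 4, 3]
print(burst_windows(daily_counts))  # [5]
```

A spatiotemporal variant could keep one such series per region and compare windows across regions as well as across time, but that extension is not shown here.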
