Researcher profile

Christos Anagnostopoulos

Christos Anagnostopoulos contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

MCP-SandboxScan: WASM-based Secure Execution and Runtime Analysis for MCP Tools

Tool-augmented LLM agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended exposure of external inputs (e.g., environment secrets or local files). While existing scanners often focus on static artifacts, analyzing runtime behavior is challenging because directly executing untrusted tools can itself be dangerous. We present MCP-SandboxScan, a lightweight framework motivated by the Model Context Protocol (MCP) that safely executes untrusted tools inside a WebAssembly/WASI sandbox and produces auditable reports of external-to-sink exposures. Our prototype (i) extracts LLM-relevant sinks from runtime outputs (prompt/messages and structured tool-return fields), (ii) instantiates external-input candidates from environment values, mounted file contents, and output-surfaced HTTP fetch intents, and (iii) links sources to sinks via snippet-based substring matching. Case studies on three representative tools show that MCP-SandboxScan can surface provenance evidence when external inputs appear in prompt/messages or tool-return payloads, and can expose filesystem capability violations as runtime evidence. We further compare against a lightweight static string-signature baseline and use a micro-benchmark to characterize false negatives under transformations and false positives from short-token collisions.

preprint2026arXiv

Operational Runtime Behavior Mining for Open-Source Supply Chain Security

Open-source software (OSS) is a critical component of modern software systems, yet supply chain security remains challenging in practice due to unavailable or obfuscated source code. Consequently, security teams often rely on runtime observations collected from sandboxed executions to investigate suspicious third-party components. We present HeteroGAT-Rank, an industry-oriented runtime behavior mining system that supports analyst-in-the-loop supply chain threat investigation. The system models execution-time behaviors of OSS packages as lightweight heterogeneous graphs and applies attention-based graph learning to rank behavioral patterns that are most relevant for security analysis. Rather than aiming for fully automated detection, HeteroGAT-Rank surfaces actionable runtime signals - such as file, network, and command activities - to guide manual investigation and threat hunting. To operate at ecosystem scale, the system decouples offline behavior mining from online analysis and integrates parallel graph construction for efficient processing across multiple ecosystems. An evaluation on a large-scale OSS execution dataset shows that HeteroGAT-Rank effectively highlights meaningful and interpretable behavioral indicators aligned with real-world vulnerability and attack trends, supporting practical security workflows under realistic operational constraints.

preprint2022arXiv

A Civil Protection Early Warning System to Improve the Resilience of Adriatic-Ionian Territories to Natural and Man-made Risk

We are currently witnessing an increased occurrence of extreme weather events, causing a great deal of disruption and distress across the globe. In this setting, the importance and utility of Early Warning Systems is becoming increasingly obvious. In this work, we present the design of an early warning system called TransCPEarlyWarning, aimed at seven countries in the Adriatic-Ionian area in Europe. The overall objective is to increase the level of cooperation among national civil protection institutions in these countries, addressing natural and man-made risks from the early warning stage and improving the intervention capabilities of civil protection mechanisms. The system utilizes an innovative approach with a lever effect, while also aiming to support the whole system of Civil Protection.

preprint2021arXiv

A Soft Method for Outliers Detection at the Edge of the Network

The combination of the Internet of Things and the Edge Computing gives many opportunities to support innovative applications close to end users. Numerous devices present in both infrastructures can collect data upon which various processing activities can be performed. However, the quality of the outcomes may be jeopardized by the presence of outliers. In this paper, we argue on a novel model for outliers detection by elaborating on a `soft' approach. Our mechanism is built upon the concepts of candidate and confirmed outliers. Any data object that deviates from the population is confirmed as an outlier only after the study of its sequence of magnitude values as new data are incorporated into our decision making model. We adopt the combination of a sliding with a landmark window model when a candidate outlier is detected to expand the sequence of data objects taken into consideration. The proposed model is fast and efficient as exposed by our experimental evaluation while a comparative assessment reveals its pros and cons.

preprint2020arXiv

Adaptive Learning of Aggregate Analytics under Dynamic Workloads

Large organizations have seamlessly incorporated data-driven decision making in their operations. However, as data volumes increase, expensive big data infrastructures are called to rescue. In this setting, analytics tasks become very costly in terms of query response time, resource consumption, and money in cloud deployments, especially when base data are stored across geographically distributed data centers. Therefore, we introduce an adaptive Machine Learning mechanism which is light-weight, stored client-side, can estimate the answers of a variety of aggregate queries and can avoid the big data backend. The estimations are performed in milliseconds are inexpensive and accurate as the mechanism learns from past analytical-query patterns. However, as analytic queries are ad-hoc and analysts' interests change over time we develop solutions that can swiftly and accurately detect such changes and adapt to new query patterns. The capabilities of our approach are demonstrated using extensive evaluation with real and synthetic datasets.

preprint2020arXiv

An Intelligent Edge-Centric Queries Allocation Scheme based on Ensemble Models

The combination of Internet of Things (IoT) and Edge Computing (EC) can assist in the delivery of novel applications that will facilitate end users activities. Data collected by numerous devices present in the IoT infrastructure can be hosted into a set of EC nodes becoming the subject of processing tasks for the provision of analytics. Analytics are derived as the result of various queries defined by end users or applications. Such queries can be executed in the available EC nodes to limit the latency in the provision of responses. In this paper, we propose a meta-ensemble learning scheme that supports the decision making for the allocation of queries to the appropriate EC nodes. Our learning model decides over queries' and nodes' characteristics. We provide the description of a matching process between queries and nodes after concluding the contextual information for each envisioned characteristic adopted in our meta-ensemble scheme. We rely on widely known ensemble models, combine them and offer an additional processing layer to increase the performance. The aim is to result a subset of EC nodes that will host each incoming query. Apart from the description of the proposed model, we report on its evaluation and the corresponding results. Through a large set of experiments and a numerical analysis, we aim at revealing the pros and cons of the proposed scheme.

preprint2020arXiv

Data Synopses Management based on a Deep Learning Model

Pervasive computing involves the placement of processing services close to end users to support intelligent applications. With the advent of the Internet of Things (IoT) and the Edge Computing (EC), one can find room for placing services at various points in the interconnection of the aforementioned infrastructures. Of significant importance is the processing of the collected data. Such a processing can be realized upon the EC nodes that exhibit increased computational capabilities compared to IoT devices. An ecosystem of intelligent nodes is created at the EC giving the opportunity to support cooperative models. Nodes become the hosts of geo-distributed datasets formulated by the IoT devices reports. Upon the datasets, a number of queries/tasks can be executed. Queries/tasks can be offloaded for performance reasons. However, an offloading action should be carefully designed being always aligned with the data present to the hosting node. In this paper, we present a model to support the cooperative aspect in the EC infrastructure. We argue on the delivery of data synopses to EC nodes making them capable to take offloading decisions fully aligned with data present at peers. Nodes exchange data synopses to inform their peers. We propose a scheme that detects the appropriate time to distribute synopses trying to avoid the network overloading especially when synopses are frequently extracted due to the high rates at which IoT devices report data to EC nodes. Our approach involves a Deep Learning model for learning the distribution of calculated synopses and estimate future trends. Upon these trends, we are able to find the appropriate time to deliver synopses to peer nodes. We provide the description of the proposed mechanism and evaluate it based on real datasets. An extensive experimentation upon various scenarios reveals the pros and cons of the approach by giving numerical results.

preprint2020arXiv

Explaining Aggregates for Exploratory Analytics

Analysts wishing to explore multivariate data spaces, typically pose queries involving selection operators, i.e., range or radius queries, which define data subspaces of possible interest and then use aggregation functions, the results of which determine their exploratory analytics interests. However, such aggregate query (AQ) results are simple scalars and as such, convey limited information about the queried subspaces for exploratory analysis. We address this shortcoming aiding analysts to explore and understand data subspaces by contributing a novel explanation mechanism coined XAXA: eXplaining Aggregates for eXploratory Analytics. XAXA's novel AQ explanations are represented using functions obtained by a three-fold joint optimization problem. Explanations assume the form of a set of parametric piecewise-linear functions acquired through a statistical learning model. A key feature of the proposed solution is that model training is performed by only monitoring AQs and their answers on-line. In XAXA, explanations for future AQs can be computed without any database (DB) access and can be used to further explore the queried data subspaces, without issuing any more queries to the DB. We evaluate the explanation accuracy and efficiency of XAXA through theoretically grounded metrics over real-world and synthetic datasets and query workloads.

preprint2020arXiv

ML-AQP: Query-Driven Approximate Query Processing based on Machine Learning

As more and more organizations rely on data-driven decision making, large-scale analytics become increasingly important. However, an analyst is often stuck waiting for an exact result. As such, organizations turn to Cloud providers that have infrastructure for efficiently analyzing large quantities of data. But, with increasing costs, organizations have to optimize their usage. Having a cheap alternative that provides speed and efficiency will go a long way. Concretely, we offer a solution that can provide approximate answers to aggregate queries, relying on Machine Learning (ML), which is able to work alongside Cloud systems. Our developed lightweight ML-led system can be stored on an analyst's local machine or deployed as a service to instantly answer analytic queries, having low response times and monetary/computational costs and energy footprint. To accomplish this we leverage the knowledge obtained by previously answered queries and build ML models that can estimate the result of new queries in an efficient and inexpensive manner. The capabilities of our system are demonstrated using extensive evaluation with both real and synthetic datasets/workloads and well known benchmarks.

preprint2020arXiv

On the Use of Interpretable Machine Learning for the Management of Data Quality

Data quality is a significant issue for any application that requests for analytics to support decision making. It becomes very important when we focus on Internet of Things (IoT) where numerous devices can interact to exchange and process data. IoT devices are connected to Edge Computing (EC) nodes to report the collected data, thus, we have to secure data quality not only at the IoT but also at the edge of the network. In this paper, we focus on the specific problem and propose the use of interpretable machine learning to deliver the features that are important to be based for any data processing activity. Our aim is to secure data quality, at least, for those features that are detected as significant in the collected datasets. We have to notice that the selected features depict the highest correlation with the remaining in every dataset, thus, they can be adopted for dimensionality reduction. We focus on multiple methodologies for having interpretability in our learning models and adopt an ensemble scheme for the final decision. Our scheme is capable of timely retrieving the final result and efficiently select the appropriate features. We evaluate our model through extensive simulations and present numerical results. Our aim is to reveal its performance under various experimental scenarios that we create varying a set of parameters adopted in our mechanism.