Source author record

Andreas L. Symeonidis

Andreas L. Symeonidis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Software Engineering, eess.AS, Information Retrieval, Machine Learning, Multimedia, Sound

Catalog footprint

What is connected

4 works

6 topics

4 close collaborators

Preprint · 2026 · arXiv

Towards Online Malware Detection using Process Resource Utilization Metrics

The rapid growth of Cloud Computing and Internet of Things (IoT) has significantly increased the interconnection of computational resources, creating an environment where malicious software (malware) can spread rapidly. To address this challenge, researchers are increasingly utilizing Machine Learning approaches to identify malware through behavioral (i.e., dynamic) cues. However, current approaches are limited by their reliance on large labeled datasets, fixed model training, and the assumption that a trained model remains effective over time, disregarding the ever-evolving sophistication of malware. As a result, they often fail to detect malware attacks that adapt over time. This paper proposes an online learning approach for dynamic malware detection that overcomes these limitations by incorporating temporal information to continuously update its models using behavioral features, specifically process resource utilization metrics. By doing so, the proposed models can incrementally adapt to emerging threats and detect zero-day malware effectively. Upon evaluating our approach against traditional batch algorithms, we find it effective in detecting zero-day malware. Moreover, we demonstrate its efficacy in scenarios with limited data availability, where traditional batch-based approaches often struggle to perform reliably.
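The idea of updating a detector one observation at a time can be illustrated with a minimal sketch. It is not the paper's model: the logistic-regression learner, the two features (CPU and memory share), the synthetic stream, and the learning rate are all assumptions chosen for illustration. The evaluation is prequential (predict first, then learn), which is a common way to score online learners.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with SGD."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss for a single (x, y) pair.
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


def synthetic_stream(n, seed=0):
    """Hypothetical process metrics: (cpu_share, mem_share), label 1 = malware."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.randint(0, 1)
        cpu = min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.1)))
        mem = min(1.0, max(0.0, rng.gauss(0.6 if y else 0.4, 0.1)))
        yield [cpu, mem], y


def prequential_accuracy(model, stream):
    """Test-then-train: score each sample before the model learns from it."""
    correct = total = 0
    for x, y in stream:
        pred = 1 if model.predict_proba(x) >= 0.5 else 0
        correct += int(pred == y)
        total += 1
        model.learn_one(x, y)
    return correct / total


accuracy = prequential_accuracy(OnlineLogisticRegression(2), synthetic_stream(2000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Because the model never sees a sample before predicting on it, the reported accuracy reflects performance on unseen data throughout the stream, including the early phase when little data is available.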

Preprint · 2025 · arXiv

Towards an Interpretable Analysis for Estimating the Resolution Time of Software Issues

Lately, software development has become a predominantly online process, as more teams host and monitor their projects remotely. Sophisticated approaches employ issue tracking systems like Jira, predicting the time required to resolve issues and effectively assigning and prioritizing project tasks. Several methods have been developed to address this challenge, widely known as bug-fix time prediction, yet they exhibit significant limitations. Most consider only textual issue data and/or use techniques that overlook the semantics and metadata of issues (e.g., priority or assignee expertise). Many also fail to distinguish actual development effort from administrative delays, including assignment and review phases, leading to estimates that do not reflect the true effort needed. In this work, we build an issue monitoring system that extracts the actual effort required to fix issues on a per-project basis. Our approach employs topic modeling to capture issue semantics and leverages metadata (components, labels, priority, issue type, assignees) for interpretable resolution time analysis. Final predictions are generated by an aggregated model, enabling contributors to make informed decisions. Evaluation across multiple projects shows the system can effectively estimate resolution time and provide valuable insights.
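One way to picture the distinction between actual effort and administrative delay is to sum only the time an issue spends in an active status. The sketch below is a hypothetical illustration, not the system described in the abstract: the status names, the changelog format, and the `active_effort_hours` helper are assumptions.

```python
from datetime import datetime


def active_effort_hours(transitions, active_statuses=("In Progress",)):
    """Sum the hours an issue spent in 'active' statuses.

    transitions: chronologically ordered (iso_timestamp, new_status) pairs.
    The last entry only closes the final interval.
    """
    total = 0.0
    for (start, status), (end, _) in zip(transitions, transitions[1:]):
        if status in active_statuses:
            delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
            total += delta.total_seconds() / 3600.0
    return total


# Hypothetical status history of a single issue.
history = [
    ("2025-01-06T09:00:00", "Open"),
    ("2025-01-08T09:00:00", "In Progress"),
    ("2025-01-08T17:00:00", "In Review"),
    ("2025-01-10T09:00:00", "In Progress"),
    ("2025-01-10T12:00:00", "Done"),
]

effort = active_effort_hours(history)
elapsed = (
    datetime.fromisoformat(history[-1][0]) - datetime.fromisoformat(history[0][0])
).total_seconds() / 3600.0
print(f"active effort: {effort} h, wall-clock time: {elapsed} h")
```

In this toy history the issue is open for 99 hours of wall-clock time but only 11 of them are spent in an active status, which is the kind of gap that a naive resolution-time target would hide.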

Preprint · 2025 · arXiv

Towards Effective Issue Assignment using Online Machine Learning

Efficient issue assignment in software development relates to faster resolution time, resource optimization, and reduced development effort. To this end, numerous systems have been developed to automate issue assignment, including AI and machine learning approaches. Most of them, however, focus solely on a posteriori analyses of textual features (e.g., issue titles, descriptions), disregarding the temporal characteristics of software development. Thus, they fail to adapt as projects and teams evolve, for example in cases of team evolution or project phase shifts (e.g., from development to maintenance). To incorporate such cases in the issue assignment process, we propose an Online Machine Learning methodology that adapts to the evolving characteristics of software projects. Our system processes issues as a data stream, dynamically learning from new data and adjusting in real time to changes in team composition and project requirements. We incorporate metadata such as issue descriptions, components and labels and leverage adaptive drift detection mechanisms to identify when model re-evaluation is necessary. Upon assessing our methodology on a set of software projects, we conclude that it can be effective for issue assignment, while meeting the evolving needs of software teams.
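The abstract does not name the drift detector it uses, so the following sketch substitutes the Page-Hinkley test, a standard change detector for streams, purely as an illustration. The parameter values and the synthetic error stream (1 = wrong assignee predicted) are assumptions.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated deviation from the running mean
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation from the running mean
        self.cum_min = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one value and return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


# Hypothetical per-issue error indicators: roughly 10% errors for the first
# 200 issues, then 80% errors once the team composition changes.
errors = [1 if i % 10 == 0 else 0 for i in range(200)]
errors += [1 if i % 10 < 8 else 0 for i in range(200)]

detector = PageHinkley()
detected_at = None
for i, e in enumerate(errors):
    if detector.update(e):
        detected_at = i
        break

print("drift signalled at issue index:", detected_at)
```

In an online assignment pipeline, such a signal would typically trigger a re-evaluation or reset of the model; what happens on detection is a design choice that this sketch leaves open.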

Preprint · 2021 · arXiv

Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning

In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach that effectively captures temporal patterns of audio similarity between video pairs. For the robust similarity calculation between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutional Neural Network (CNN) trained on a large-scale dataset of audio events, and then we calculate the similarity matrix derived from the pairwise similarity of these descriptors. The similarity matrix is subsequently fed to a CNN that captures the temporal structures existing within its content. We train our network following a triplet generation process and optimizing the triplet loss function. To evaluate the effectiveness of the proposed approach, we have manually annotated two publicly available video datasets based on the audio duplicity between their videos. The proposed approach achieves very competitive results compared to three state-of-the-art methods. Also, unlike the competing methods, it remains very robust when retrieving audio duplicates generated with speed transformations.
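The two building blocks named in the abstract, a pairwise similarity matrix and a triplet loss, can be sketched in a few lines. This is a simplified illustration under stated assumptions: cosine similarity is assumed as the pairwise measure, the learned CNN is replaced by a plain mean-of-row-maxima aggregation (`video_similarity`, a stand-in, not the paper's network), and the margin and the toy descriptors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(desc_a, desc_b):
    """Pairwise similarity between the audio descriptors of two videos."""
    return [[cosine(u, v) for v in desc_b] for u in desc_a]


def video_similarity(matrix):
    """Stand-in for the learned CNN: mean of the best match per row."""
    return sum(max(row) for row in matrix) / len(matrix)


def triplet_loss(sim_pos, sim_neg, margin=0.5):
    """Penalise triplets where the negative is not at least `margin` less similar."""
    return max(0.0, sim_neg - sim_pos + margin)


# Toy descriptors: one vector per audio segment (values are hypothetical).
anchor = [[1, 0], [0, 1]]
positive = [[1, 0], [0, 1]]      # audio duplicate of the anchor
negative = [[-1, 0], [0, -1]]    # unrelated audio

sim_pos = video_similarity(similarity_matrix(anchor, positive))
sim_neg = video_similarity(similarity_matrix(anchor, negative))
loss = triplet_loss(sim_pos, sim_neg)
print(sim_pos, sim_neg, loss)
```

During training, the loss is zero for triplets that are already ranked correctly by at least the margin, so only the hard or ambiguous triplets contribute gradients.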