Researcher profile

Axel Legay

Axel Legay contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
5 topics
4 close collaborators

Published work

6 published item(s)

Preprint, 2022, arXiv

Malware Analysis with Symbolic Execution and Graph Kernel

Malware analysis techniques are divided into static and dynamic analysis. Both can be bypassed by circumvention techniques such as obfuscation. In a series of works, the authors have promoted the use of symbolic execution combined with machine learning to avoid such traps. Most of those works rely on natural graph-based representations that can then be plugged into graph-based learning algorithms such as gSpan. There are two main problems with this approach. The first is the cost of computing the graph. Indeed, working with graphs requires one to compute and represent the entire state space of the file under analysis. Because such a computation is too cumbersome, the techniques often rely on strategies that compute a representative subgraph of the behaviors. Unfortunately, efficient graph-building strategies remain weakly explored. The second problem is the classification itself. Graph-based machine learning algorithms rely on comparing the biggest common structures. This sidelines small but specific parts of the malware signature. In addition, it does not allow us to work with efficient algorithms such as support vector machines. We propose a new efficient open source toolchain for machine learning-based classification. We also explore how graph-kernel techniques can be used in the process. We focus on the 1-dimensional Weisfeiler-Lehman kernel, which can capture local similarities between graphs. Our experimental results show that our approach outperforms existing ones by an impressive factor.
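The 1-dimensional Weisfeiler-Lehman kernel mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' toolchain: the function names `wl_features` and `wl_kernel`, the adjacency-dict graph encoding, and the toy graphs are assumptions made for illustration only. Each refinement round relabels a node with its own label plus the sorted labels of its neighbours, and the kernel value is the dot product of the resulting label counts.

```python
from collections import Counter

def wl_features(adj, labels, iterations=2):
    """Count node labels over `iterations` rounds of 1-WL refinement.

    adj: dict node -> list of neighbour nodes (hypothetical encoding).
    labels: dict node -> initial label (for example, a call name).
    """
    current = dict(labels)
    counts = Counter((0, lab) for lab in current.values())
    for it in range(1, iterations + 1):
        refined = {}
        for node, neighbours in adj.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            refined[node] = (current[node],
                             tuple(sorted(current[n] for n in neighbours)))
        current = refined
        counts.update((it, lab) for lab in current.values())
    return counts

def wl_kernel(g1, g2, iterations=2):
    """Dot product of the WL label-count vectors of two (adj, labels) graphs."""
    f1 = wl_features(*g1, iterations)
    f2 = wl_features(*g2, iterations)
    return sum(f1[key] * f2[key] for key in f1.keys() & f2.keys())

# Toy graphs (illustrative only): a triangle and a 3-node path, uniform labels.
triangle = ({0: [1, 2], 1: [0, 2], 2: [0, 1]}, {0: "A", 1: "A", 2: "A"})
path = ({0: [1], 1: [0, 2], 2: [1]}, {0: "A", 1: "A", 2: "A"})

similarity = wl_kernel(triangle, path, iterations=1)
```

Because the kernel is a plain inner product of count vectors, a precomputed matrix of such values could in principle be handed to a kernel method such as a support vector machine, which is the kind of pairing the abstract alludes to.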

Preprint, 2022, arXiv

Symbolic analysis meets federated learning to enhance malware identifier

Over the past years, manual methods for creating detection rules have become impractical in anti-malware products because the number of malware threats keeps growing. Turning to machine learning is therefore a promising way to make malware recognition more efficient. Traditional centralized machine learning requires a large amount of data to train a model with excellent performance. To boost malware detection, the training data may come from various kinds of sources, such as host, network and cloud-based anti-malware components, or even from different enterprises. To avoid the expense of data collection as well as the leakage of private data, we present a federated learning system that identifies malware through behavioural graphs, i.e., system call dependency graphs. It is based on a deep learning model that includes a graph autoencoder and a multi-classifier module. This model is trained by a secure learning protocol among clients to protect private data against inference attacks. Using the model to identify malware, we achieve an accuracy of 85% for the homogeneous graph data and 93% for the inhomogeneous graph data.
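The aggregation step at the heart of federated learning can be sketched as follows. This is plain federated averaging, swapped in for illustration: it is not the secure learning protocol or the graph autoencoder model described in the abstract, and the function name `federated_average` and the toy numbers are assumptions. Each client trains locally and sends only its parameter vector; the server combines the vectors weighted by the number of local samples, so raw data never leaves the client.

```python
def federated_average(client_weights, client_sizes):
    """Weighted mean of client parameter vectors (plain federated averaging).

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes: number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Two hypothetical clients: the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count keeps a client with little data from dominating the shared model; a secure protocol would additionally hide the individual vectors from the server.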

Preprint, 2021, arXiv

Quantitative Security Risk Modeling and Analysis with RisQFLan

Domain-specific quantitative modeling and analysis approaches are fundamental in scenarios in which qualitative approaches are inappropriate or unfeasible. In this paper, we present a tool-supported approach to quantitative graph-based security risk modeling and analysis based on attack-defense trees. Our approach is based on QFLan, a successful domain-specific approach to support quantitative modeling and analysis of highly configurable systems, whose domain-specific components have been decoupled to facilitate the instantiation of the QFLan approach in the domain of graph-based security risk modeling and analysis. Our approach incorporates distinctive features from three popular kinds of attack trees, namely enhanced attack trees, capabilities-based attack trees and attack countermeasure trees, into the domain-specific modeling language. The result is a new framework, called RisQFLan, to support quantitative security risk modeling and analysis based on attack-defense diagrams. By offering either exact or statistical verification of probabilistic attack scenarios, RisQFLan constitutes a significant novel contribution to the existing toolsets in that domain. We validate our approach by highlighting the additional features offered by RisQFLan in three illustrative case studies from seminal approaches to graph-based security risk modeling and analysis based on attack trees.
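To make the idea of quantitative analysis on an attack-defense tree concrete, here is a minimal sketch of a bottom-up success-probability computation. It is not RisQFLan's modeling language or its verification engine: the `Node` class, the `defense` field, and the independence assumption between sibling attacks are all simplifications introduced for illustration.

```python
from dataclasses import dataclass, field
from math import prod

@dataclass
class Node:
    kind: str                      # "leaf", "and", or "or"
    prob: float = 0.0              # success probability, used by leaves only
    children: list = field(default_factory=list)
    defense: float = 0.0           # chance a countermeasure blocks this node

def success_probability(node):
    """Attack success probability, assuming independent sub-attacks."""
    if node.kind == "leaf":
        p = node.prob
    elif node.kind == "and":
        # Every child attack must succeed.
        p = prod(success_probability(c) for c in node.children)
    elif node.kind == "or":
        # At least one child attack must succeed.
        p = 1.0 - prod(1.0 - success_probability(c) for c in node.children)
    else:
        raise ValueError(f"unknown node kind: {node.kind}")
    # A defense on this node blocks the attack with probability `defense`.
    return p * (1.0 - node.defense)

# Hypothetical tree: succeed directly, or by combining two sub-attacks.
root = Node("or", children=[
    Node("leaf", prob=0.5),
    Node("and", children=[Node("leaf", prob=0.5), Node("leaf", prob=0.5)]),
])
undefended = success_probability(root)

root.defense = 0.5
defended = success_probability(root)
```

Exact evaluation of this kind only works for simple static trees; richer models with ordering, costs, or attacker behaviour are where exact or statistical model checking, as described in the abstract, becomes necessary.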

Preprint, 2020, arXiv

Featured Games

Feature-based SPL analysis and family-based model checking have seen rapid development. Many model checking problems can be reduced to two-player games on finite graphs. A prominent example is mu-calculus model checking, which is generally done by translating to parity games, but also many quantitative model-checking problems can be reduced to (quantitative) games. In their FASE'20 paper, ter Beek et al. introduce parity games with variability in order to develop family-based mu-calculus model checking of featured transition systems. We generalize their model to general featured games and show how these may be analysed in a family-based manner. We introduce featured reachability games, featured minimum reachability games, featured discounted games, featured energy games, and featured parity games. We show how to compute winners and values of such games in a family-based manner. We also show that all these featured games admit optimal featured strategies, which project to optimal strategies for any product. Further, we develop family-based algorithms, using late splitting, to compute winners, values, and optimal strategies for all the featured games we have introduced.
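A family-based computation of winners in a featured reachability game can be sketched as a fixed-point iteration that tracks, for every vertex, the set of products in which the reachability player wins. This is a simplified illustration and not the paper's late-splitting algorithm: guards are encoded as explicit sets of products rather than symbolic feature expressions, and the function name, the vertex encoding, and the convention that a stuck opponent does not help the reachability player are assumptions.

```python
def featured_reach_winners(vertices, edges, targets, products):
    """Per-vertex set of products in which player 0 can force reaching a target.

    vertices: dict vertex -> owner (0 = reachability player, 1 = opponent).
    edges: list of (source, destination, guard); guard = set of products
           in which the edge exists.
    """
    all_products = frozenset(products)
    win = {v: (all_products if v in targets else frozenset()) for v in vertices}
    changed = True
    while changed:  # least fixed point: winning sets only ever grow
        changed = False
        for v, owner in vertices.items():
            if v in targets:
                continue
            out = [(dst, frozenset(g)) for src, dst, g in edges if src == v]
            if owner == 0:
                # Player 0 needs one enabled edge into a winning vertex.
                new = frozenset().union(*(g & win[dst] for dst, g in out))
            else:
                # Opponent: every enabled edge must lead to a winning vertex,
                # and at least one edge must be enabled in the product.
                enabled = frozenset().union(*(g for _, g in out))
                escaping = frozenset().union(*(g - win[dst] for dst, g in out))
                new = enabled - escaping
            if new != win[v]:
                win[v] = new
                changed = True
    return win

# Hypothetical game: in product "p2" a feature gives the opponent an escape edge.
vertices = {"s": 0, "a": 1, "t": 0, "d": 0}
edges = [
    ("s", "a", {"p1", "p2"}),
    ("a", "t", {"p1", "p2"}),
    ("a", "d", {"p2"}),
    ("d", "d", {"p1", "p2"}),
]
winners = featured_reach_winners(vertices, edges, {"t"}, {"p1", "p2"})
```

In this toy game the reachability player wins from `s` only in product `p1`, because the extra edge in `p2` lets the opponent divert play into the sink `d`. All products are analysed in a single pass over the shared graph instead of solving one game per product.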

Preprint, 2019, arXiv

Secure Architectures Implementing Trusted Coalitions for Blockchained Distributed Learning (TCLearn)

Distributed learning across a coalition of organizations allows the members of the coalition to train and share a model without sharing the data used to optimize this model. In this paper, we propose new secure architectures that guarantee preservation of data privacy, a trustworthy sequence of iterative learning steps, and equitable sharing of the learned model among all members of the coalition by using adequate encryption and blockchain mechanisms. We exemplify its deployment in the case of the distributed optimization of a deep learning convolutional neural network trained on medical images.