Researcher profile

Paul Bogdan

Paul Bogdan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

EMoE: Eigenbasis-Guided Routing for Mixture-of-Experts

The relentless scaling of deep learning models has led to unsustainable computational demands, positioning Mixture-of-Experts (MoE) architectures as a promising path towards greater efficiency. However, MoE models are plagued by two fundamental challenges: 1) a load imbalance problem known as the``rich get richer" phenomenon, where a few experts are over-utilized, and 2) an expert homogeneity problem, where experts learn redundant representations, negating their purpose. Current solutions typically employ an auxiliary load-balancing loss that, while mitigating imbalance, often exacerbates homogeneity by enforcing uniform routing at the expense of specialization. To resolve this, we introduce the Eigen-Mixture-of-Experts (EMoE), a novel architecture that leverages a routing mechanism based on a learned orthonormal eigenbasis. EMoE projects input tokens onto this shared eigenbasis and routes them based on their alignment with the principal components of the feature space. This principled, geometric partitioning of data intrinsically promotes both balanced expert utilization and the development of diverse, specialized experts, all without the need for a conflicting auxiliary loss function. Our code is publicly available at https://github.com/Belis0811/EMoE.

preprint2022arXiv

End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning

To enable heterogeneous computing systems with autonomous programming and optimization capabilities, we propose a unified, end-to-end, programmable graph representation learning (PGL) framework that is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core in heterogeneous hardware platforms. The proposed framework extracts multi-fractal topological features from code graphs, utilizes graph autoencoders to learn how to partition the graph into computational kernels, and exploits graph neural networks (GNN) to predict the correct assignment to a processor type. In the evaluation, we validate the PGL framework and demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.

preprint2022arXiv

Secure Distributed/Federated Learning: Prediction-Privacy Trade-Off for Multi-Agent System

Decentralized learning is an efficient emerging paradigm for boosting the computing capability of multiple bounded computing agents. In the big data era, performing inference within the distributed and federated learning (DL and FL) frameworks, the central server needs to process a large amount of data while relying on various agents to perform multiple distributed training tasks. Considering the decentralized computing topology, privacy has become a first-class concern. Moreover, assuming limited information processing capability for the agents calls for a sophisticated \textit{privacy-preserving decentralization} that ensures efficient computation. Towards this end, we study the \textit{privacy-aware server to multi-agent assignment} problem subject to information processing constraints associated with each agent, while maintaining the privacy and assuring learning informative messages received by agents about a global terminal through the distributed private federated learning (DPFL) approach. To find a decentralized scheme for a two-agent system, we formulate an optimization problem that balances privacy and accuracy, taking into account the quality of compression constraints associated with each agent. We propose an iterative converging algorithm by alternating over self-consistent equations. We also numerically evaluate the proposed solution to show the privacy-prediction trade-off and demonstrate the efficacy of the novel approach in ensuring privacy in DL and FL.

preprint2021arXiv

VRoC: Variational Autoencoder-aided Multi-task Rumor Classifier Based on Text

Social media became popular and percolated almost all aspects of our daily lives. While online posting proves very convenient for individual users, it also fosters fast-spreading of various rumors. The rapid and wide percolation of rumors can cause persistent adverse or detrimental impacts. Therefore, researchers invest great efforts on reducing the negative impacts of rumors. Towards this end, the rumor classification system aims to detect, track, and verify rumors in social media. Such systems typically include four components: (i) a rumor detector, (ii) a rumor tracker, (iii) a stance classifier, and (iv) a veracity classifier. In order to improve the state-of-the-art in rumor detection, tracking, and verification, we propose VRoC, a tweet-level variational autoencoder-based rumor classification system. VRoC consists of a co-train engine that trains variational autoencoders (VAEs) and rumor classification components. The co-train engine helps the VAEs to tune their latent representations to be classifier-friendly. We also show that VRoC is able to classify unseen rumors with high levels of accuracy. For the PHEME dataset, VRoC consistently outperforms several state-of-the-art techniques, on both observed and unobserved rumors, by up to 26.9%, in terms of macro-F1 scores.

preprint2020arXiv

Efficient Task Mapping for Manycore Systems

System-on-chip (SoC) has migrated from single core to manycore architectures to cope with the increasing complexity of real-life applications. Application task mapping has a significant impact on the efficiency of manycore system (MCS) computation and communication. We present WAANSO, a scalable framework that incorporates a Wavelet Clustering based approach to cluster application tasks. We also introduce Ant Swarm Optimization (ASO) based on iterative execution of Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) for task clustering and mapping to the MCS processing elements. We have shown that WAANSO can significantly increase the MCS energy and performance efficiencies. Based on our experiments on a 64-core system, WAANSO improves energy efficiency by 19%, compared to baseline approaches, namely DPSO, ACO and branch and bound (B&B). Additionally, the performance improves by 65.86% compared to Density-Based Spatial Clustering of Applications with Noise (DBSCAN) baseline.

preprint2020arXiv

H2O-Cloud: A Resource and Quality of Service-Aware Task Scheduling Framework for Warehouse-Scale Data Centers -- A Hierarchical Hybrid DRL (Deep Reinforcement Learning) based Approach

Cloud computing has attracted both end-users and Cloud Service Providers (CSPs) in recent years. Improving resource utilization rate (RUtR), such as CPU and memory usages on servers, while maintaining Quality-of-Service (QoS) is one key challenge faced by CSPs with warehouse-scale data centers. Prior works proposed various algorithms to reduce energy cost or to improve RUtR, which either lack the fine-grained task scheduling capabilities, or fail to take a comprehensive system model into consideration. This article presents H2O-Cloud, a Hierarchical and Hybrid Online task scheduling framework for warehouse-scale CSPs, to improve resource usage effectiveness while maintaining QoS. H2O-Cloud is highly scalable and considers comprehensive information such as various workload scenarios, cloud platform configurations, user request information and dynamic pricing model. The hierarchy and hybridity of the framework, combined with its deep reinforcement learning (DRL) engines, enable H2O-Cloud to efficiently start on-the-go scheduling and learning in an unpredictable environment without pre-training. Our experiments confirm the high efficiency of the proposed H2O-Cloud when compared to baseline approaches, in terms of energy and cost while maintaining QoS. Compared with a state-of-the-art DRL-based algorithm, H2O-Cloud achieves up to 201.17% energy cost efficiency improvement, 47.88% energy efficiency improvement and 551.76% reward rate improvement.

preprint2020arXiv

S4oC: A Self-optimizing, Self-adapting Secure System-on-Chip Design Framework to Tackle Unknown Threats -- A Network Theoretic, Learning Approach

We propose a framework for the design and optimization of a secure self-optimizing, self-adapting system-on-chip (S4oC) architecture. The goal is to minimize the impact of attacks such as hardware Trojan and side-channel, by making real-time adjustments. S4oC learns to reconfigure itself, subject to various security measures and attacks, some of which possibly unknown at design time. Furthermore, the data types and patterns of the target applications, environmental conditions, and sources of variations are incorporated. S4oC is a manycore system, modeled as a four-layer graph, representing the model of computation (MoCp), model of connection (MoCn), model of memory (MoM) and model of storage (MoS), with a large number of elements including heterogeneous reconfigurable processing elements in MoCp, and memory elements in the MoM layer. Security driven community detection, and neural networks are utilized for application task clustering, and distributed reinforcement learning (RL) for task mapping.