Researcher profile

Mohammad Mozaffari

Mohammad Mozaffari contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
9 works
0 followers
9 topics
4 close collaborators

Published work

9 published item(s)

preprint · 2026 · arXiv

LEAP: Learnable End-to-End Adaptive Pruning of Large Language Models

Unstructured sparsity is now natively accelerated by recent GPU kernels and dataflow hardware, shifting the bottleneck from inference execution to the pruning algorithm. State-of-the-art methods for unstructured LLM pruning are layer-wise surrogates derived from the Optimal Brain Surgeon principle, and they sacrifice end-to-end accuracy, especially under aggressive sparsity. End-to-end alternatives such as MaskLLM and PATCH show that learnable masks can close this gap, but their categorical-over-patterns parameterization scales with the number of valid masks per row and does not port to the unstructured setting. We introduce LEAP, which replaces this intractable parameterization with a per-weight Bernoulli-via-Gumbel-sigmoid relaxation that makes end-to-end unstructured mask learning tractable. Across five LLM families from 0.5B to 8B parameters at 50% and 60% sparsity, LEAP improves six-task average zero-shot accuracy by +2.59 points on average over ADMM, the best layer-wise baseline in our sweep.
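To make the per-weight relaxation mentioned in this abstract concrete, here is a minimal sketch of a Gumbel-sigmoid (binary Concrete) sample for a Bernoulli keep/drop mask. It is an illustration under stated assumptions, not LEAP's implementation: the function names, the temperature `tau`, and the rule of thresholding logits at zero for the final hard mask are all hypothetical. It relies on the standard fact that the difference of two Gumbel(0, 1) variables is Logistic(0, 1).

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gumbel_sigmoid(logit, tau, rng):
    """Relaxed Bernoulli sample in [0, 1] for one weight's keep-logit.

    The difference of two Gumbel(0, 1) draws is Logistic(0, 1) noise,
    sampled here as log(u) - log(1 - u) with u ~ Uniform(0, 1).
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    noise = math.log(u) - math.log(1.0 - u)
    return sigmoid((logit + noise) / tau)


def relaxed_masks(logits, tau, rng):
    """One independent relaxed mask entry per weight (hypothetical helper)."""
    return [[gumbel_sigmoid(a, tau, rng) for a in row] for row in logits]


def apply_mask(weights, mask):
    """Element-wise product of a weight matrix and a mask."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


def hard_mask(logits):
    """Assumed deployment rule: keep a weight when its logit is positive."""
    return [[1.0 if a > 0 else 0.0 for a in row] for row in logits]
```

Because each weight carries a single logit, the number of learnable mask parameters equals the number of weights, rather than growing with the number of valid patterns per row. Lower values of `tau` push samples toward 0 or 1.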

preprint · 2025 · arXiv

OPTIMA: Optimal One-shot Pruning for LLMs via Quadratic Programming Reconstruction

Post-training model pruning is a promising solution, yet it faces a trade-off: simple heuristics that zero weights are fast but degrade accuracy, while principled joint optimization methods recover accuracy but are computationally infeasible at modern scale. One-shot methods such as SparseGPT offer a practical trade-off in optimality by applying efficient, approximate heuristic weight updates. To close this gap, we introduce OPTIMA, a practical one-shot post-training pruning method that balances accuracy and scalability. OPTIMA casts layer-wise weight reconstruction after mask selection as independent, row-wise Quadratic Programs (QPs) that share a common layer Hessian. Solving these QPs yields the per-row globally optimal update with respect to the reconstruction objective given the estimated Hessian. The shared-Hessian structure makes the problem highly amenable to batching on accelerators. We implement an accelerator-friendly QP solver that accumulates one Hessian per layer and solves many small QPs in parallel, enabling one-shot post-training pruning at scale on a single accelerator without fine-tuning. OPTIMA integrates with existing mask selectors and consistently improves zero-shot performance across multiple LLM families and sparsity regimes, yielding up to 3.97% absolute accuracy improvement. On an NVIDIA H100, OPTIMA prunes an 8B-parameter transformer end-to-end in 40 hours with 60GB peak memory. Together, these results set a new state of the art in the accuracy-efficiency trade-off for one-shot post-training pruning.
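The row-wise reconstruction idea can be illustrated with a small sketch. It assumes the per-row objective is the quadratic form (w - w0)^T H (w - w0) with pruned entries fixed to zero, which makes each row an equality-constrained QP with a closed-form solution on the kept indices: H_SS w_S = (H w0)_S. The exact objective, constraints, and solver used by OPTIMA are not given in the abstract, so the function names and the dense Gaussian-elimination solver below are illustrative assumptions only.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def reconstruct_row(hessian, w0, keep):
    """Minimise (w - w0)^T H (w - w0) subject to w[j] = 0 where keep[j] is False."""
    kept = [j for j, k in enumerate(keep) if k]
    w = [0.0] * len(w0)
    if not kept:
        return w
    h_ss = [[hessian[i][j] for j in kept] for i in kept]
    rhs = [sum(hessian[i][j] * w0[j] for j in range(len(w0))) for i in kept]
    for j, value in zip(kept, solve(h_ss, rhs)):
        w[j] = value
    return w


def reconstruct_layer(hessian, weights, masks):
    """Every row reuses the same layer Hessian; only the mask differs per row."""
    return [reconstruct_row(hessian, w0, keep) for w0, keep in zip(weights, masks)]


def recon_error(hessian, w, w0):
    """Quadratic reconstruction error (w - w0)^T H (w - w0)."""
    d = [a - b for a, b in zip(w, w0)]
    n = len(d)
    return sum(d[i] * hessian[i][j] * d[j] for i in range(n) for j in range(n))
```

For example, with H = [[2, 1, 0], [1, 2, 1], [0, 1, 2]], w0 = [1, -2, 3] and the middle weight pruned, the reconstructed row is [0, 0, 2] with error 4, whereas simply zeroing the pruned weight leaves [1, 0, 3] with error 8. Since the rows are independent given H, a real implementation could batch them on an accelerator, as the abstract describes.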

preprint · 2024 · arXiv

3GPP Release 18 Wake-up Receiver: Feature Overview and Evaluations

Enhancing the energy efficiency of devices stands as one of the key requirements in the fifth-generation (5G) cellular network and its evolutions toward the next generation wireless technology. Specifically, for battery-limited Internet-of-Things (IoT) devices where downlink monitoring significantly contributes to energy consumption, efficient solutions are required for power saving while addressing performance tradeoffs. In this regard, the use of a low-power wake-up receiver (WUR) and wake-up signal (WUS) is an attractive solution for reducing the energy consumption of devices without compromising the downlink latency. This paper provides an overview of the standardization study on the design of low-power WUR and WUS within Release 18 of the third-generation partnership project (3GPP). We describe design principles, receiver architectures, waveform characteristics, and device procedures upon detection of WUS. In addition, we provide representative results to show the performance of the WUR in terms of power saving, coverage, and network overhead along with highlighting design tradeoffs.

preprint · 2022 · arXiv

Toward Smaller and Lower-Cost 5G Devices with Longer Battery Life: An Overview of 3GPP Release 17 RedCap

The fifth generation (5G) wireless technology is primarily developed to support three classes of use cases, namely, enhanced mobile broadband (eMBB), ultra-reliable and low-latency communication (URLLC), and massive machine-type communication (mMTC), with significantly different requirements in terms of data rate, latency, connection density and power consumption. Meanwhile, there are several key use cases, such as industrial wireless sensor networks, video surveillance, and wearables, whose requirements fall in-between those of eMBB, URLLC, and mMTC. In this regard, 5G can be further optimized to efficiently support such mid-range use cases. Therefore, in Release 17, the 3rd generation partnership project (3GPP) developed the essential features to support a new device type enabling reduced capability (RedCap) NR devices aiming at lower cost/complexity, smaller physical size, and longer battery life compared to regular 5G NR devices. In this paper, we provide a comprehensive overview of 3GPP Release 17 RedCap while describing newly introduced features, cost reduction and power saving gains, and performance and coexistence impacts. Moreover, we present key design guidelines, fundamental tradeoffs, and future outlook for RedCap evolution.

preprint · 2021 · arXiv

Coverage Evaluation for 5G Reduced Capability New Radio (NR-RedCap)

The fifth generation (5G) wireless technology is primarily designed to address a wide range of use cases categorized into the enhanced mobile broadband (eMBB), ultra-reliable and low latency communication (URLLC), and massive machine-type communication (mMTC). Nevertheless, there are a few other use cases which are in-between these main use cases such as industrial wireless sensor networks, video surveillance, or wearables. In order to efficiently serve such use cases, in Release 17, the 3rd generation partnership project (3GPP) introduced the reduced capability NR devices (NR-RedCap) with lower cost and complexity, smaller form factor and longer battery life compared to regular NR devices. However, one key potential consequence of device cost and complexity reduction is the coverage loss. In this paper, we provide a comprehensive evaluation of NR RedCap coverage for different physical channels and initial access messages to identify the channels/messages that are potentially coverage limiting for RedCap UEs. We perform the coverage evaluations for RedCap UEs operating in three different scenarios, namely Rural, Urban and Indoor with carrier frequencies 700 MHz, 2.6 GHz and 28 GHz, respectively. Our results confirm that for all the considered scenarios, the amounts of required coverage recovery for RedCap channels are either less than 1 dB or can be compensated by considering smaller data rate targets for RedCap use cases.

preprint · 2020 · arXiv

A Deep Reinforcement Learning Approach to Efficient Drone Mobility Support

The growing deployment of drones in a myriad of applications relies on seamless and reliable wireless connectivity for safe control and operation of drones. Cellular technology is a key enabler for providing essential wireless services to flying drones in the sky. Existing cellular networks targeting terrestrial usage can support the initial deployment of low-altitude drone users, but there are challenges such as mobility support. In this paper, we propose a novel handover framework for providing efficient mobility support and reliable wireless connectivity to drones served by a terrestrial cellular network. Using tools from deep reinforcement learning, we develop a deep Q-learning algorithm to dynamically optimize handover decisions to ensure robust connectivity for drone users. Simulation results show that the proposed framework significantly reduces the number of handovers at the expense of a small loss in signal strength relative to the baseline case where a drone always connects to a base station that provides the strongest received signal strength.
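The trade-off this abstract describes, fewer handovers at the cost of slightly weaker signal, can be sketched as a reward that penalises each handover and a one-step Q-learning update. This is a tabular stand-in for the paper's deep Q-network, used only to keep the example short. The reward shape, the weights `handover_cost` and `signal_weight`, and the state and action encoding are all assumptions for illustration, not details from the paper.

```python
def reward(prev_cell, next_cell, rsrp_dbm, handover_cost=1.0, signal_weight=0.05):
    """Hypothetical reward: scaled signal strength minus a penalty per handover."""
    penalty = handover_cost if next_cell != prev_cell else 0.0
    return signal_weight * rsrp_dbm - penalty


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (a deep Q-network would replace the table)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Pick the cell with the highest learned value for this state."""
    return max(actions, key=lambda a: q.get((state, a), 0.0))
```

Here a state could be, for instance, the drone's position index together with its serving cell, and an action the cell to connect to next. Raising `handover_cost` relative to `signal_weight` makes the learned policy stay on the current cell longer, whereas the strongest-signal baseline corresponds to ignoring the penalty entirely.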

preprint · 2020 · arXiv

Federated Learning in the Sky: Joint Power Allocation and Scheduling with UAV Swarms

Unmanned aerial vehicle (UAV) swarms must exploit machine learning (ML) in order to execute various tasks ranging from coordinated trajectory planning to cooperative target recognition. However, due to the lack of continuous connections between the UAV swarm and ground base stations (BSs), using centralized ML will be challenging, particularly when dealing with a large volume of data. In this paper, a novel framework is proposed to implement distributed federated learning (FL) algorithms within a UAV swarm that consists of a leading UAV and several following UAVs. Each following UAV trains a local FL model based on its collected data and then sends this trained local model to the leading UAV, which aggregates the received models, generates a global FL model, and transmits it to the followers over the intra-swarm network. To identify how wireless factors, like fading, transmission delay, and UAV antenna angle deviations resulting from wind and mechanical vibrations, impact the performance of FL, a rigorous convergence analysis for FL is performed. Then, a joint power allocation and scheduling design is proposed to optimize the convergence rate of FL while taking into account the energy consumption during convergence and the delay requirement imposed by the swarm's control system. Simulation results validate the effectiveness of the FL convergence analysis and show that the joint design strategy can reduce the number of communication rounds needed for convergence by as much as 35% compared with the baseline design.
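The aggregation step performed by the leading UAV can be sketched as a sample-weighted average of the follower models, in the style of federated averaging. The abstract does not state the aggregation rule, so the weighting by local sample count is an assumption, as are the function names and the idea that only followers scheduled in the current round contribute.

```python
def aggregate(local_models, sample_counts):
    """Weighted average of follower model vectors (assumed FedAvg-style rule)."""
    total = sum(sample_counts)
    dim = len(local_models[0])
    return [
        sum(n * model[k] for model, n in zip(local_models, sample_counts)) / total
        for k in range(dim)
    ]


def communication_round(follower_models, sample_counts, scheduled):
    """Leader aggregates only the followers scheduled in this round.

    `scheduled` is a hypothetical list of follower indices produced by
    whatever scheduling policy the swarm uses.
    """
    models = [follower_models[i] for i in scheduled]
    counts = [sample_counts[i] for i in scheduled]
    return aggregate(models, counts)
```

For two followers with models [1.0, 2.0] and [3.0, 4.0] and sample counts 1 and 3, the global model is [2.5, 3.5]. The leader would then send this vector back to the followers over the intra-swarm network for the next round of local training.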

preprint · 2012 · arXiv

Performance Analysis of Sequential Method for Handover in Cognitive Radio Systems

Powerful spectrum handover schemes enable cognitive radios (CRs) to use transmission opportunities in primary users' channels appropriately. In this paper, we consider the cognitive access of primary channels by a secondary user (SU). We evaluate the average detection time and the maximum achievable average throughput of the secondary user when the sequential method for handover (SMHO) is used. We assume that prior knowledge of the primary users' presence and absence probabilities is available. When investigating the maximum achievable throughput of the secondary user, we arrive at an optimization problem in which the optimum value of the sensing time must be selected. In our optimization problem, we take into account the spectrum handover due to false detection of the primary user. We also propose a weight-based handover (WBHO) scheme in which the impacts of channel conditions and primary users' presence probability are considered. This spectrum handover scheme provides higher average throughput for the SU than the SMHO method. The tradeoff between the maximum achievable throughput and consumed energy is discussed, and finally an energy-efficient optimization formulation for finding a proper sensing time is provided.