Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
16 topics
4 close collaborators

Published work

14 published items

preprint, 2026, arXiv

GEM: GPU-Variability-Aware Expert to GPU Mapping for MoE Systems

Mixture-of-Expert (MoE) models enable efficient inference by employing smaller experts and activating only a subset of them per token. MoE serving engines distribute experts across multiple GPUs and route tokens to appropriate GPUs at inference time based on experts activated. They process tokens in lock-step fashion, where tokens within a batch must finish processing before proceeding to the next layer. This synchronization barrier acts as a critical bottleneck because the performance of MoE models is limited by the straggler GPU that finishes last. Stragglers emerge when too many heavily used experts are placed on the same GPU or the slowest GPU. While prior works place experts that balance token loads across GPUs, they all overlook GPU variability and often place highly used experts on the slowest GPUs. We propose GEM, GPU-variability-aware Expert Mapping, a framework for GPU variability-aware expert to GPU mapping for MoE models. GEM exploits two insights. First, we must place experts such that each GPU receives non-uniform token loads based on their variability and they all finish processing a layer at about the same time. Our studies show that there are two types of experts: consistent that are used most of the time and temporal that are often used together for the remaining time. Our second insight is that we must place simultaneously used consistent and temporal experts on different GPUs and avoid placing them on slower GPUs to reduce slowdown. GEM gathers the variability profile of GPUs for each model and task and uses the token load distributions per task to map experts to GPUs. Our experiments show that GEM improves end-to-end latency by 7.9% on average and by up to 16.5% compared to the baseline.
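The first insight in the abstract above, giving each GPU a token load proportional to how fast it is so that all GPUs finish a layer at about the same time, can be illustrated with a small greedy sketch. This is not GEM's algorithm: the function name `map_experts`, the per-expert token loads, and the relative GPU speeds are all hypothetical, and the sketch ignores the second insight about separating co-activated experts.

```python
def map_experts(expert_loads, gpu_speeds):
    """Greedy sketch (hypothetical, not GEM itself).

    expert_loads: dict mapping expert name -> expected tokens per batch (assumed).
    gpu_speeds:   list of relative GPU throughputs, tokens per unit time (assumed).
    Returns (assignment, finish_times).
    """
    num_gpus = len(gpu_speeds)
    assignment = {g: [] for g in range(num_gpus)}
    load = [0.0] * num_gpus

    # Place the most heavily used experts first.
    for expert in sorted(expert_loads, key=expert_loads.get, reverse=True):
        tokens = expert_loads[expert]
        # Choose the GPU that would finish earliest after receiving this expert,
        # so slower GPUs naturally end up with lighter token loads.
        best = min(range(num_gpus),
                   key=lambda g: (load[g] + tokens) / gpu_speeds[g])
        assignment[best].append(expert)
        load[best] += tokens

    finish_times = [load[g] / gpu_speeds[g] for g in range(num_gpus)]
    return assignment, finish_times


if __name__ == "__main__":
    loads = {"e0": 40, "e1": 30, "e2": 20, "e3": 10}
    speeds = [1.0, 0.5]  # GPU 1 is assumed to run at half the speed of GPU 0
    assignment, finish = map_experts(loads, speeds)
    print(assignment, finish)
```

With these made-up numbers, a speed-agnostic split of 50 tokens per GPU would leave the slow GPU finishing at time 100, whereas the speed-aware placement keeps the straggler at time 70.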

preprint, 2026, arXiv

Test-Time Speculation

Speculative decoding accelerates LLM inference by using a fast draft model to generate tokens and a more accurate target model to verify them. Its performance depends on the $\textit{acceptance length}$, or number of draft tokens accepted by the target. Our studies show that the acceptance length of even state-of-the-art speculators, like DFlash, EAGLE-3, and PARD, degrades with generation length, reaching values close to 1 (i.e., no speedup) within just a few thousand output tokens, making speculators ineffective for long-response tasks. Acceptance lengths decline because most speculators are trained offline on short sequences, but are forced to match the target model on much longer outputs at inference, well beyond their training distribution. To address this issue, we propose $\textit{Test-Time Speculation (TTS)}$, an online distillation approach that continuously adapts the speculator at test-time. TTS leverages the key insight that the token verification step already invokes the target model for each draft token, providing the training signal needed to adapt the draft at no additional cost. Treating the draft as the student and the target as a teacher, TTS adjusts the draft over several speculation rounds, with each update improving the draft's accuracy as generation proceeds. Our results across multiple models from the Qwen-3, Qwen-3.5, and Llama3.1 families show that TTS improves acceptance lengths over state-of-the-art speculators by up to $72\%$ and $41\%$ on average, with the benefits scaling with increased generation lengths.
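To make the acceptance-length metric concrete, here is a minimal sketch of how it could be measured under greedy verification. It assumes a common convention in which every speculation round emits at least one token from the target (the accepted draft prefix plus one target token), which is consistent with the abstract's remark that a value near 1 means no speedup. The function names and the token lists are hypothetical, and this is not the TTS method itself.

```python
def tokens_per_round(draft_tokens, target_tokens):
    """Tokens emitted in one speculation round under greedy verification.

    draft_tokens:  tokens proposed by the draft model (assumed input).
    target_tokens: tokens the target model would emit at the same positions.
    The matching prefix is accepted; the target then contributes one token
    (the correction at the first mismatch, or a bonus token if all match).
    """
    accepted = 0
    for drafted, verified in zip(draft_tokens, target_tokens):
        if drafted != verified:
            break
        accepted += 1
    return accepted + 1


def mean_acceptance_length(rounds):
    """Average tokens per round over a list of (draft, target) pairs."""
    lengths = [tokens_per_round(draft, target) for draft, target in rounds]
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    rounds = [
        ([5, 7, 9, 2], [5, 7, 1, 2]),  # two draft tokens accepted, then a correction
        ([4, 4], [3, 4]),              # first draft token rejected
    ]
    print(mean_acceptance_length(rounds))
```

Tracking this average as a function of output position is one way to observe the degradation with generation length that the abstract describes.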

preprint, 2022, arXiv

Assessing Effectiveness of Pulsed Input on Mixing Characteristics of Non-Newtonian fluids in T-shaped Channels

Mixing of reagents in microfluidics is necessary for various applications; however, due to the laminar nature of flows, efficient mixing in a small span of length and time becomes difficult. The analysis of mixing of non-Newtonian fluids is critical as they are commonly encountered in practical applications. Towards this, we investigated an effective way of mixing non-Newtonian fluids using pulsatile velocity inlet conditions. In the present study, the non-Newtonian fluid is modelled using the power law model with varying fluid rheology from shear thickening to shear thinning. To enhance the mixing, a pulsed velocity inlet condition is applied with varying phase angle and frequency and compared with a constant velocity inlet condition. We demonstrated enhanced mixing using the pulsed velocity inlet condition and achieved a maximum mixing of 97.6% using pulsed input velocity with a phase difference of 180° and a frequency of 5 Hz for the case of shear-thinning fluid (n=0.6). For the same condition, the mixing index is 89.1% and 85.2% for Newtonian and shear-thickening fluid (n=1.4), respectively. The present study will be helpful in designing micromixers for mixing non-Newtonian fluids effectively in a small span of length and time.
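The power law model mentioned above relates apparent viscosity to shear rate as K times the shear rate raised to (n - 1), where n is the flow behaviour index. The sketch below evaluates it for the three indices quoted in the abstract; the consistency value K and the shear rates are illustrative assumptions, not values from the study.

```python
def apparent_viscosity(shear_rate, consistency, n):
    """Power-law (Ostwald-de Waele) apparent viscosity.

    shear_rate:  shear rate in 1/s (must be positive).
    consistency: consistency index K (assumed value, units Pa*s^n).
    n:           flow behaviour index; n < 1 is shear thinning,
                 n = 1 is Newtonian, n > 1 is shear thickening.
    """
    return consistency * shear_rate ** (n - 1)


if __name__ == "__main__":
    K = 2.0  # illustrative consistency index, not taken from the paper
    for n in (0.6, 1.0, 1.4):
        low = apparent_viscosity(10.0, K, n)
        high = apparent_viscosity(100.0, K, n)
        print(f"n={n}: viscosity at 10 1/s = {low:.3f}, at 100 1/s = {high:.3f}")
```

For n = 0.6 the viscosity falls as the shear rate rises, for n = 1.4 it rises, and for n = 1 it stays at K.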

preprint, 2022, arXiv

Combined effect of Fluid Rheology and Surface Modification on Eletrokinetic Energy Generation through Finite Length Microchannel

Electrokinetic energy conversion provides a scheme for energy harvesting and storage for on-chip applications. However, the major drawback of electrokinetic energy conversion is its low conversion efficiency. Researchers are in a quest to find ways to improve this efficiency. With the same motive, we investigated the generation of streaming potential by applying surface modification and employing a non-Newtonian fluid to flow through the microchannel under constant pressure difference across its ends. Shear-thickening liquids tend to lessen electrokinetic effects, whereas shear-thinning liquids favour them. Also, having superhydrophobic surfaces improves the magnitude of the generated streaming potential. We examine the combined effect of fluid rheology and surface modification on electrokinetic energy generation. Our combined study yields intriguing insights into the use of non-Newtonian fluids in hydrophobic microchannels. Hydrophobic surfaces do not enhance the efficiency for a fluid with a power-law index below 0.7. The findings of this research can be used towards the selection of a fluid-substrate combination that will optimize electrokinetic power generation efficiency.

preprint, 2022, arXiv

Giant enhancement of third-harmonic generation in graphene-metal heterostructures

Nonlinear nanophotonics leverages engineered nanostructures to funnel light into small volumes and intensify nonlinear optical processes with spectral and spatial control. Due to its intrinsically large and electrically tunable nonlinear optical response, graphene is an especially promising nanomaterial for nonlinear optoelectronic applications. Here we report on exceptionally strong optical nonlinearities in graphene-insulator-metal heterostructures, demonstrating an enhancement by three orders of magnitude in the third-harmonic signal compared to bare graphene. Furthermore, by increasing the graphene Fermi energy through an external gate voltage, we find that graphene plasmons mediate the optical nonlinearity and modify the third-harmonic signal. Our findings show that graphene-insulator-metal is a promising heterostructure for optically-controlled and electrically-tunable nano-optoelectronic components.

preprint, 2022, arXiv

Load Balancing and Resource Allocation in Fog-Assisted 5G Networks: An Incentive-based Game Theoretic Approach

Fog-assisted 5G Networks allow the users within the networks to execute their tasks and processes through fog nodes and cooperation among the fog nodes. As a result, the delay in task execution is reduced compared to independent task execution, where the Base Station (BS) or server is directly involved. In the practical scenario, the ability to cooperate clearly depends on the willingness of fog nodes to cooperate. Hence, in this paper, we propose an incentive-based bargaining approach which encourages the fog nodes to cooperate among themselves by receiving incentives from the end users benefitting from the cooperation. Considering the heterogeneous nature of users and fog nodes based on their storage capacity, energy efficiency, etc., we aim to emphasise a fair incentive mechanism which fairly and uniformly distributes the incentives from users to the participating fog nodes. The proposed incentive-based cooperative approach reduces the cost of end users as well as balances the energy consumption of fog nodes. The proposed system model addresses and models the above approaches and mathematically formulates cost models for both fog nodes and the end users in a fog-assisted 5G network.

preprint, 2022, arXiv

Reshape: Adaptive Result-aware Skew Handling for Exploratory Analysis on Big Data

The process of data analysis, especially in GUI-based analytics systems, is highly exploratory. The user iteratively refines a workflow multiple times before arriving at the final workflow. In such an exploratory setting, it is valuable to the user if the initial results of the workflow are representative of the final answers so that the user can refine the workflow without waiting for the completion of its execution. Partitioning skew may lead to the production of misleading initial results during the execution. In this paper, we explore skew and its mitigation strategies from the perspective of the results shown to the user. We present a novel framework called Reshape that can adaptively handle partitioning skew in pipelined execution. Reshape employs a two-phase approach that transfers load in a fine-tuned manner to mitigate skew iteratively during execution, thus enabling it to handle changes in input-data distribution. Reshape has the ability to adaptively adjust skew-handling parameters, which reduces the technical burden on the users. Reshape supports a variety of operators such as HashJoin, Group-by, and Sort. We implemented Reshape on top of two big data engines, namely Amber and Flink, to demonstrate its generality and efficiency, and report an experimental evaluation using real and synthetic datasets.

preprint, 2022, arXiv

Suboptimal Consensus Protocol Design for a Class of Multiagent Systems

This article presents a new technique for suboptimal consensus protocol design for a class of multiagent systems. The technique is based upon the extension of newly developed sufficient conditions for suboptimal linear-quadratic optimal control design, which are derived in this paper by an explication of a noniterative solution technique of the infinite-horizon linear quadratic regulation problem in the Krotov framework. For suboptimal consensus protocol design, the structural requirements on the overall feedback gain matrix, which are inherently imposed by the agents' dynamics and their interaction topology, are recast on a specific matrix introduced in a suitably formulated convex optimization problem. As a result, preassigning identical feedback gain matrices to a network of homogeneous agents, which act on the relative state variables with respect to their neighbors, is not required. The suboptimality of the computed control laws is quantified by implicitly deriving an upper bound on the cost in terms of the solution of a convex optimization problem and initial conditions instead of specifying it a priori. Numerical examples are provided to demonstrate the implementation of the proposed approaches and their comparison with existing methods in the literature.

preprint, 2021, arXiv

Anomalous heating in a colloidal system

We report anomalous heating in a colloidal system, the first observation of the inverse Mpemba effect, where an initially cold system heats up faster than an identical warm system coupled to the same thermal bath. For an overdamped, Brownian colloidal particle moving in a tilted double-well potential, we find a non-monotonic dependence of the heating times on the initial temperature of the system, as predicted by an eigenfunction expansion of the associated Fokker-Planck equation. By carefully tuning parameters, we also observe a "strong" version of anomalous heating, where a cold system heats up exponentially faster than systems prepared under slightly different conditions.

preprint, 2020, arXiv

Exponentially faster cooling in a colloidal system

As the temperature of a cooling object decreases as it relaxes to thermal equilibrium, it is intuitively assumed that a hot object should take longer to cool than a warm one. Yet, some 2,300 years ago, Aristotle observed that "to cool hot water quickly, begin by putting it in the sun". In the 1960s, this counterintuitive phenomenon was rediscovered as the statement that "hot water can freeze faster than cold water" and has become known as the Mpemba effect; it has since been the subject of much experimental investigation and some controversy. Although many specific mechanisms have been proposed, no general consensus exists as to the underlying cause. Here we demonstrate the Mpemba effect in a controlled setting - the thermal quench of a colloidal system immersed in water, which serves as a heat bath. Our results are reproducible and agree quantitatively with calculations based on a recently proposed theoretical framework. By carefully choosing parameters, we observe cooling that is exponentially faster than that observed using typical parameters, in accord with the recently predicted strong Mpemba effect. Our experiments outline the generic conditions needed to accelerate heat removal and relaxation to thermal equilibrium and support the idea that the Mpemba effect is not simply a scientific curiosity concerning how water freezes into ice - one of the many anomalous features of water - but rather the prototype for a wide range of anomalous relaxation phenomena of broad technological importance.

preprint, 2020, arXiv

Far-field Excitation of Single Graphene Plasmon Cavities with Ultra-compressed Mode-volumes

Acoustic-graphene-plasmons (AGPs) are highly confined electromagnetic modes, carrying large momentum and low loss in the mid-infrared/Terahertz spectra. Owing to their ability to confine light to extremely small dimensions, they bear great potential for ultra-strong light-matter interactions in this long wavelength regime, where molecular fingerprints reside. However, until now AGPs have been restricted to micron-scale areas, reducing their confinement potential by several orders-of-magnitude. Here, by utilizing a new type of graphene-based magnetic-resonance, we realize single, nanometric-scale AGP cavities, reaching record-breaking mode-volume confinement factors of $\thicksim5\cdot10^{-10}$. This AGP cavity acts as a mid-infrared nanoantenna, which is efficiently excited from the far-field, and electrically tunable over an ultra-broadband spectrum. Our approach provides a new platform for studying ultra-strong-coupling phenomena, such as chemical manipulation via vibrational-strong-coupling, and a path to efficient detectors and sensors, in this challenging spectral range.

preprint, 2020, arXiv

Liquid-phase reinforced Metal matrix (LMM) composite with non-intuitive properties

Over the ages, efforts have been made to use composite design to reinforce metals and alloys in order to increase their strength and modulus. On the other hand, nature herself improves the strength, ductility, stiffness and toughness of materials by strengthening them with liquids having zero strength/modulus. Here, emulating nature, efforts have been made to develop a new class of tin based alloy/composite with liquid metal reinforcement (LMM). Based on thermodynamic calculations, a composition has been designed such that on melting and casting it forms a solid metal (tin solid solution) and the eutectic mixture remains in liquid form at room temperature. The composite structure named as LMM shows multifold improvement in hardness, strength, ductility, toughness and wear resistance as compared to conventional solder alloys. A Finite Element Method (FEM) based simulation shows strain distribution in the composite which results in the unique behavior. The LMM also shows a negative coefficient of thermal expansion which is further verified using in-situ microscopy and thermodynamic calculations.

preprint, 2020, arXiv

Low field transport calculations in 2-dimensional electron gas in $\mathrm{β\mbox{-}(Al_{x}Ga_{1-x})_{2}O_{3}/Ga_{2}O_{3}}$ heterostructures

$\mathrm{β}$-Gallium oxide ($\mathrm{β\mbox{-}Ga_{2}O_{3}}$) is an emerging wide-bandgap semiconductor with potential applications in power and RF electronics. Initial theoretical calculations on a 2-dimensional electron gas (2DEG) in $\mathrm{β\mbox{-}(Al_{x}Ga_{1-x})_{2}O_{3}/Ga_{2}O_{3}}$ heterostructures show promise for high-speed transistors. However, the experimental results do not get close to the predicted mobility values. In this work, we perform more comprehensive calculations to study the low-field 2DEG transport properties in the $\mathrm{β\mbox{-}(Al_{x}Ga_{1-x})_{2}O_{3}/Ga_{2}O_{3}}$ heterostructure. A self-consistent Poisson-Schrodinger simulation of the heterostructure is used to obtain the subband energies and wavefunctions. The electronic structure, assuming confinement in a particular direction, and the phonon dispersion are calculated from first-principles methods under the DFT and DFPT frameworks. Phonon confinement is not considered for the sake of simplicity. The scattering mechanisms included in the calculation are phonon (polar and non-polar), remote impurity, alloy, and interface-roughness scattering. We include full dynamic screening of polar optical phonon scattering. We report the temperature-dependent low-field electron mobility.

preprint, 2020, arXiv

Supersonic rotation of a superfluid: a long-lived dynamical ring

We present the experimental realization of a long-lived superfluid flow of a quantum gas rotating in an anharmonic potential, sustained by its own angular momentum. The gas is set into motion by rotating an elliptical deformation of the trap. An evaporation selective in angular momentum yields an acceleration of rotation until the density vanishes at the trap center, resulting in a dynamical ring with an angular momentum of $350\hbar$ per particle. The density profile of the ring corresponds to that of a quasi-two-dimensional superfluid, with a linear velocity reaching Mach 18 and a rotation lasting more than a minute.