Researcher profile

Sameer G. Kulkarni

Sameer G. Kulkarni contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2025arXiv

CPePC: Cooperative and Predictive Popularity based Caching for Named Data Networks

Caching content is an inherent feature of Named Data Networks. Limited cache capacity of routers warrants that the choice of content being cached is judiciously done. Existing techniques resort to caching popular content to maximize utilization. However, these methods experience significant overhead for coordinating and estimating the popularity of content. To address this issue, in this paper, we present CPePC, which is a cooperative caching technique designed to improve performance. It accomplishes this through a combination of two factors. First, CPePC enhances efficiency by minimizing the overhead of popularity estimation. Second, it forecasts a parameter that governs caching decisions. Efficiency in popularity estimation is achieved by dividing the network into several non-overlapping communities using a community estimation algorithm and selecting a leader node to coordinate this on behalf of all the nodes in the community. CPePC bases its caching decisions by predicting a parameter whose value is estimated using current cache occupancy and the popularity of the content into account. We present algorithms for community detection, leader selection, content popularity estimation, and caching decisions made by the CPePC method. We evaluate and compare it with six other state-of-the-art caching techniques, with simulations performed using a discrete event simulator to show that it outperforms others.

preprint2020arXiv

Spatial Sharing of GPU for Autotuning DNN models

GPUs are used for training, inference, and tuning the machine learning models. However, Deep Neural Network (DNN) vary widely in their ability to exploit the full power of high-performance GPUs. Spatial sharing of GPU enables multiplexing several DNNs on the GPU and can improve GPU utilization, thus improving throughput and lowering latency. DNN models given just the right amount of GPU resources can still provide low inference latency, just as much as dedicating all of the GPU for their inference task. An approach to improve DNN inference is tuning of the DNN model. Autotuning frameworks find the optimal low-level implementation for a certain target device based on the trained machine learning model, thus reducing the DNN's inference latency and increasing inference throughput. We observe an interdependency between the tuned model and its inference latency. A DNN model tuned with specific GPU resources provides the best inference latency when inferred with close to the same amount of GPU resources. While a model tuned with the maximum amount of the GPU's resources has poorer inference latency once the GPU resources are limited for inference. On the other hand, a model tuned with an appropriate amount of GPU resources still achieves good inference latency across a wide range of GPU resource availability. We explore the causes that impact the tuning of a model at different amounts of GPU resources. We present many techniques to maximize resource utilization and improve tuning performance. We enable controlled spatial sharing of GPU to multiplex several tuning applications on the GPU. We scale the tuning server instances and shard the tuning model across multiple client instances for concurrent tuning of different operators of a model, achieving better GPU multiplexing. With our improvements, we decrease DNN autotuning time by up to 75 percent and increase throughput by a factor of 5.