Source author record

Puneet Sharma

Puneet Sharma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.DS; Computer Vision; Distributed, Parallel, and Cluster Computing; Machine Learning; Computation and Language; cond-mat.mes-hall; cond-mat.stat-mech; Cryptography and Security; cs.CY; eess.SY; hep-th; Networking and Internet Architecture; Neural and Evolutionary Computing; Performance; physics.flu-dyn; quant-ph; Systems and Control

Catalog footprint

What is connected

12 works

17 topics

4 close collaborators

preprint · 2026 · arXiv

MLCommons Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces

The fast pace of artificial intelligence (AI) innovation demands an agile methodology that supports observation, reproduction, and optimization of distributed machine learning (ML) workload behavior in production AI systems and enables efficient software-hardware (SW-HW) co-design for future systems. We present Chakra, an open and portable ecosystem for performance benchmarking and co-design. The core component of Chakra is an open and interoperable graph-based representation of distributed AI/ML workloads, called the Chakra execution trace (ET). These ETs represent key operations (such as compute, memory, and communication), data and control dependencies, timing, and resource constraints. Additionally, Chakra includes a complementary set of tools and capabilities to enable the collection, analysis, generation, and adoption of Chakra ETs by a broad range of simulators, emulators, and replay tools. We present an analysis of Chakra ETs collected on production AI clusters and demonstrate value via real-world case studies. Chakra has been adopted by MLCommons and has active contributions and engagement across the industry, including NVIDIA, AMD, Meta, Keysight, HPE, and Scala, among others.
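To make the idea of a graph-based execution trace concrete, the following is a minimal sketch of a dependency graph of timed operations. It is not the Chakra ET schema: the `TraceNode` class, its fields, and the helper functions `replay_order` and `critical_path_us` are hypothetical names chosen for illustration, and the durations are made-up values.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class TraceNode:
    """Hypothetical trace node; not the actual Chakra ET schema."""
    node_id: int
    kind: str            # "compute", "memory", or "communication"
    duration_us: float   # illustrative timing value
    deps: list[int] = field(default_factory=list)  # ids this node waits on


def replay_order(nodes):
    """Return node ids in an order that respects every dependency."""
    sorter = TopologicalSorter({n.node_id: n.deps for n in nodes})
    return list(sorter.static_order())


def critical_path_us(nodes):
    """Longest dependency chain, assuming unlimited parallel resources."""
    by_id = {n.node_id: n for n in nodes}
    finish = {}
    for nid in replay_order(nodes):
        node = by_id[nid]
        start = max((finish[d] for d in node.deps), default=0.0)
        finish[nid] = start + node.duration_us
    return max(finish.values(), default=0.0)


# Illustrative trace: a load, a kernel, then a collective overlapping a kernel.
trace = [
    TraceNode(0, "memory", 5.0),
    TraceNode(1, "compute", 20.0, deps=[0]),
    TraceNode(2, "communication", 12.0, deps=[1]),
    TraceNode(3, "compute", 8.0, deps=[1]),
    TraceNode(4, "compute", 3.0, deps=[2, 3]),
]

if __name__ == "__main__":
    print(replay_order(trace))
    print(critical_path_us(trace))
```

A simulator or replay tool could walk the nodes in `replay_order` and use the dependency edges to decide which operations may overlap; the critical-path helper shows one simple analysis such a graph supports.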

preprint · 2022 · arXiv

Insights from a pseudospectral study of a potentially singular solution of the three-dimensional axisymmetric incompressible Euler equation

We develop a Fourier-Chebyshev pseudospectral direct numerical simulation (DNS) to examine a potentially singular solution of the radially bounded, three-dimensional (3D), axisymmetric Euler equations [G. Luo and T.Y. Hou, Proc. Natl. Acad. Sci. USA, 111.36 (2014)]. We demonstrate that: (a) the time of singularity is preceded, in any spectrally truncated DNS, by the formation of oscillatory structures called tygers, first investigated in the one-dimensional (1D) Burgers and two-dimensional (2D) Euler equations; (b) the analyticity-strip method can be generalized to obtain an estimate for the (potential) singularity time.

preprint · 2020 · arXiv

Assertion Detection in Multi-Label Clinical Text using Scope Localization

Multi-label sentences (text) in the clinical domain result from the rich description of scenarios during patient care. State-of-the-art methods for assertion detection mostly address this task in the setting of a single assertion label per sentence (text). In addition, few rule-based and deep learning methods perform negation/assertion scope detection on single-label text. Extending these methods to multi-label sentences without diminishing performance is a significant challenge. Therefore, we developed a convolutional neural network (CNN) architecture to localize multiple labels and their scopes in a single-stage, end-to-end fashion, and demonstrate that our model performs at least 12% better than the state of the art on multi-label clinical text.

preprint · 2020 · arXiv

Multi-boundary entanglement in Chern-Simons theory with finite gauge groups

We study the multi-boundary entanglement structure of the states prepared in (1+1)- and (2+1)-dimensional Chern-Simons theory with a finite discrete gauge group $G$. The states in (1+1)-$d$ are associated with Riemann surfaces of genus $g$ with multiple $S^1$ boundaries, and we use the replica trick to compute the entanglement entropy for such states. In (2+1)-$d$, we focus on the states associated with torus link complements, which live in the tensor product of Hilbert spaces associated with multiple $T^2$. We present a quantitative analysis of the entanglement structure for both abelian and non-abelian groups. For all the states considered in this work, we find that the entanglement entropy for a direct product of groups is the sum of the entropies for the individual groups, i.e. $\text{EE}(G_1 \times G_2) = \text{EE}(G_1)+\text{EE}(G_2)$. Moreover, the reduced density matrix obtained by tracing out a subset of the total Hilbert space has a positive semidefinite partial transpose on any bi-partition of the remaining Hilbert space.
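One standard way an additivity relation of this form can arise is when the reduced density matrix factorizes as a tensor product. The abstract does not state that this is the mechanism used in the paper, so the factorization $\rho = \rho_1 \otimes \rho_2$ below is an assumption made purely for illustration, with $\{p_i\}$ and $\{q_j\}$ denoting the (hypothetical) eigenvalues of $\rho_1$ and $\rho_2$.

```latex
% Assumption (not taken from the abstract): \rho = \rho_1 \otimes \rho_2,
% so the eigenvalues of \rho are the products p_i q_j,
% with \sum_i p_i = \sum_j q_j = 1.
\begin{align*}
S(\rho) &= -\sum_{i,j} p_i q_j \ln (p_i q_j) \\
        &= -\sum_{i,j} p_i q_j \ln p_i \;-\; \sum_{i,j} p_i q_j \ln q_j \\
        &= -\Big(\sum_j q_j\Big) \sum_i p_i \ln p_i
           \;-\; \Big(\sum_i p_i\Big) \sum_j q_j \ln q_j \\
        &= S(\rho_1) + S(\rho_2).
\end{align*}
```

Under that assumption, the von Neumann entropy of the product state is the sum of the entropies of its factors, which has the same shape as the relation $\text{EE}(G_1 \times G_2) = \text{EE}(G_1)+\text{EE}(G_2)$ quoted above.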

preprint · 2020 · arXiv

PACT: Privacy Sensitive Protocols and Mechanisms for Mobile Contact Tracing

The global health threat from COVID-19 has been controlled in a number of instances by large-scale testing and contact tracing efforts. We created this document to suggest three functionalities showing how we might best harness computing technologies to support the goals of public health organizations in minimizing morbidity and mortality associated with the spread of COVID-19, while protecting the civil liberties of individuals. In particular, this work advocates for a third-party-free approach to assisted mobile contact tracing, because such an approach mitigates the security and privacy risks of requiring a trusted third party. We also explicitly consider the inferential risks involved in any contact tracing system, where any alert to a user could itself give rise to de-anonymizing information. More generally, we hope to participate in bringing together colleagues in industry, academia, and civil society to discuss and converge on ideas around a critical issue arising with attempts to mitigate the COVID-19 pandemic.

preprint · 2020 · arXiv

Spatial Sharing of GPU for Autotuning DNN models

GPUs are used for training, inference, and tuning of machine learning models. However, Deep Neural Networks (DNNs) vary widely in their ability to exploit the full power of high-performance GPUs. Spatial sharing of a GPU enables multiplexing several DNNs on the GPU and can improve GPU utilization, thus improving throughput and lowering latency. DNN models given just the right amount of GPU resources can still provide inference latency as low as when the entire GPU is dedicated to their inference task. One approach to improving DNN inference is tuning the DNN model. Autotuning frameworks find the optimal low-level implementation for a given target device based on the trained machine learning model, thus reducing the DNN's inference latency and increasing inference throughput. We observe an interdependency between the tuned model and its inference latency. A DNN model tuned with specific GPU resources provides the best inference latency when inference runs with close to the same amount of GPU resources, while a model tuned with the maximum amount of the GPU's resources has poorer inference latency once the GPU resources are limited for inference. On the other hand, a model tuned with an appropriate amount of GPU resources still achieves good inference latency across a wide range of GPU resource availability. We explore the causes that impact the tuning of a model at different amounts of GPU resources. We present several techniques to maximize resource utilization and improve tuning performance. We enable controlled spatial sharing of the GPU to multiplex several tuning applications on the GPU. We scale the tuning server instances and shard the tuning model across multiple client instances for concurrent tuning of different operators of a model, achieving better GPU multiplexing. With our improvements, we decrease DNN autotuning time by up to 75 percent and increase throughput by a factor of 5.

preprint · 2016 · arXiv

Comparative Advantage Driven Resource Allocation for Virtual Network Functions

As Communication Service Providers (CSPs) adopt the Network Function Virtualization (NFV) paradigm, they need to transition their network function capacity to a virtualized infrastructure with different Network Functions running on a set of heterogeneous servers. This abstract describes a novel technique for allocating server resources (compute, storage, and network) for a given set of Virtual Network Function (VNF) requirements. Our approach helps telco providers decide the most effective way to run several VNFs on servers with different performance characteristics. Our analysis of prior VNF performance characterization on heterogeneous server resource allocations shows that the ability to arbitrarily create many VNFs among different servers' resource allocations leads to a comparative advantage among servers. We propose a VNF resource allocation method called COMPARE that maximizes the total throughput of the system by formulating this resource allocation problem as a comparative advantage problem among heterogeneous servers. There are several applications for the VNF resource allocation from COMPARE, including transitioning current telco deployments to NFV-based solutions and providing initial VNF placement for Service Function Chain (SFC) provisioning.

preprint · 2016 · arXiv

Dynamics of Nonautonomous Discrete Dynamical Systems

In this paper we study the dynamics of a general non-autonomous dynamical system generated by a family of continuous self maps on a compact space $X$. We derive necessary and sufficient conditions for the system to exhibit complex dynamical behavior. In the process we discuss properties like transitivity, weakly mixing, topologically mixing, minimality, sensitivity, topological entropy and Li-Yorke chaoticity for the non-autonomous system. We also give examples to prove that the dynamical behavior of the non-autonomous system in general cannot be characterized in terms of the dynamical behavior of its generating functions.

preprint · 2016 · arXiv

Induced Dynamics on the Hyperspaces

In this paper, we study the dynamics induced by a finite commutative relation. We prove that the dynamics generated by such a non-trivial collection cannot be transitive/super-transitive and hence cannot exhibit higher degrees of mixing. As a consequence, we establish that the dynamics induced by such a collection on the hyperspace endowed with any admissible hit-and-miss topology cannot be transitive and hence cannot exhibit any form of mixing. We also prove that if the system is generated by such a commutative collection, then under suitable conditions the induced system cannot have a dense set of periodic points. Finally, we give an example to show that the induced dynamics in this case may or may not be sensitive.

preprint · 2016 · arXiv

Matrix Characterization of Multidimensional Subshifts of Finite Type

Let $X\subset A^{Z^d}$ be a $2$-dimensional subshift of finite type. We prove that any $2$-dimensional multidimensional subshift of finite type can be characterized by a square matrix of infinite dimension. We extend our result to a general $d$-dimensional case. We prove that the multidimensional shift space is non-empty if and only if the matrix obtained is of positive dimension. In the process, we give an alternative view of the necessary and sufficient conditions obtained for the non-emptiness of the multidimensional shift space. We also give sufficient conditions for the shift space $X$ to exhibit periodic points.

preprint · 2013 · arXiv

Efficient Image Retargeting for High Dynamic Range Scenes

Most real-world scenes have a very high dynamic range (HDR). The mobile phone cameras and digital cameras available on the market are limited in both dynamic range and spatial resolution. The same applies to display devices, which have a limited dynamic range and also differ in spatial resolution and aspect ratio. In this paper, we address the problem of displaying a high-contrast low dynamic range (LDR) image of an HDR scene on a display device whose spatial resolution differs from that of the capturing digital camera. The optimal solution proposed in this work can be employed with any camera that can shoot multiple differently exposed images of a scene. Further, the proposed solutions provide flexibility in depicting the entire contrast of the HDR scene as an LDR image with a user-specified spatial resolution. This task is achieved through an optimized content-aware retargeting framework that preserves salient features, along with an algorithm to combine multi-exposure images. We show that the proposed approach performs exceedingly well in generating high-contrast LDR images of varying spatial resolution compared to an alternate approach.

preprint · 2012 · arXiv

Affine Almost Automorphic Actions on Compact Nilmanifolds

We discuss conditions under which an affine automorphism of a compact nilmanifold is almost automorphic, and the structure of such automorphisms from a dynamical point of view.
