Source author record

Karthik S. Gurumoorthy

Karthik S. Gurumoorthy appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning · Computer Vision · math.NA · Computation · Computational Complexity · Information Theory · math.IT · math.OC · math.PR · math.ST · Neural and Evolutionary Computing · Neurons and Cognition · Numerical Analysis · Statistics Theory

Catalog footprint

What is connected

11 works

14 topics

4 close collaborators

Preprint · 2022 · arXiv

A decision-tree framework to select optimal box-sizes for product shipments

In package-handling facilities, boxes of varying sizes are used to ship products. Improperly sized boxes with box dimensions much larger than the product dimensions create wastage and unduly increase the shipping costs. Since it is infeasible to make unique, tailor-made boxes for each of the $N$ products, the fundamental question that confronts e-commerce companies is: How many $K \ll N$ cuboidal boxes need to be manufactured and what should be their dimensions? In this paper, we propose a solution for the single-count shipment containing one product per box in two steps: (i) reduce it to a clustering problem in the $3$-dimensional space of length, width and height where each cluster corresponds to the group of products that will be shipped in a particular size variant, and (ii) present an efficient forward-backward decision-tree-based clustering method with low computational complexity on $N$ and $K$ to obtain these $K$ clusters and corresponding box dimensions. Our algorithm has multiple constituent parts, each specifically designed to achieve a high-quality clustering solution. As our method generates clusters in an incremental fashion without discarding the present solution, adding or deleting a size variant is as simple as stopping the backward pass early or executing it for one more iteration. We tested the efficacy of our approach by simulating actual single-count shipments that were transported during a month by Amazon using the proposed box dimensions. Even by just modifying the existing box dimensions and not adding a new size variant, we achieved a reduction of $4.4\%$ in the shipment volume, contributing to a $2.2\%$ decrease in non-utilized air volume. The reduction in shipment volume and air volume improved significantly to $10.3\%$ and $6.1\%$ when we introduced $4$ additional boxes.
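The shipment-volume and air-volume metrics in this abstract can be illustrated with a small sketch. It is not the paper's forward-backward decision-tree method: it only evaluates a given set of box sizes by assigning each product to the smallest-volume box it fits in. The function names, the array layout, and the axis-aligned fitting rule (compare sorted dimensions) are assumptions made for illustration.

```python
import numpy as np

def assign_boxes(products, boxes):
    """Return, per product, the index of the smallest-volume box it fits in.

    products: (N, 3) array of product dimensions.
    boxes:    (K, 3) array of box dimensions.
    Assumption: products may be rotated in axis-aligned fashion, so a product
    fits when its sorted dimensions are component-wise <= the sorted box ones.
    Products that fit in no box get index -1.
    """
    p = np.sort(np.asarray(products, dtype=float), axis=1)
    b = np.sort(np.asarray(boxes, dtype=float), axis=1)
    fits = (p[:, None, :] <= b[None, :, :]).all(axis=2)      # shape (N, K)
    box_volume = b.prod(axis=1)
    cost = np.where(fits, box_volume[None, :], np.inf)       # inf if no fit
    choice = cost.argmin(axis=1)
    choice[~fits.any(axis=1)] = -1
    return choice

def shipment_and_air_volume(products, boxes):
    """Total shipped box volume and total unused (air) volume."""
    products = np.asarray(products, dtype=float)
    boxes = np.asarray(boxes, dtype=float)
    choice = assign_boxes(products, boxes)
    if (choice < 0).any():
        raise ValueError("at least one product fits in no box")
    shipment = boxes.prod(axis=1)[choice].sum()
    air = shipment - products.prod(axis=1).sum()
    return shipment, air
```

Comparing the two returned totals for an existing and a proposed box set gives the kind of percentage reductions the abstract reports.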

Preprint · 2022 · arXiv

Individual Treatment Effect Estimation Through Controlled Neural Network Training in Two Stages

We develop a Causal-Deep Neural Network (CDNN) model trained in two stages to infer causal impact estimates at an individual unit level. Using only the pre-treatment features in stage 1 in the absence of any treatment information, we learn an encoding for the covariates that best represents the outcome. In the $2^{nd}$ stage we further seek to predict the unexplained outcome from stage 1, by introducing the treatment indicator variables alongside the encoded covariates. We prove that even without explicitly computing the treatment residual, our method still satisfies the desirable local Neyman orthogonality, making it robust to small perturbations in the nuisance parameters. Furthermore, by establishing connections with the representation learning approaches, we create a framework from which multiple variants of our algorithm can be derived. We perform initial experiments on the publicly available data sets to compare these variants and get guidance in selecting the best variant of our CDNN method. On evaluating CDNN against the state-of-the-art approaches on three benchmarking datasets, we observe that CDNN is highly competitive and often yields the most accurate individual treatment effect estimates. We highlight the strong merits of CDNN in terms of its extensibility to multiple use cases.

Preprint · 2022 · arXiv

Joint Probability Estimation Using Tensor Decomposition and Dictionaries

In this work, we study non-parametric estimation of joint probabilities of a given set of discrete and continuous random variables from their (empirically estimated) 2D marginals, under the assumption that the joint probability could be decomposed and approximated by a mixture of product densities/mass functions. The problem of estimating the joint probability density function (PDF) using semi-parametric techniques such as Gaussian Mixture Models (GMMs) is widely studied. However, such techniques yield poor results when the underlying densities are mixtures of various other families of distributions such as Laplacian or generalized Gaussian, uniform, Cauchy, etc. Further, GMMs are not the best choice to estimate joint distributions which are hybrid in nature, i.e., some random variables are discrete while others are continuous. We present a novel approach for estimating the PDF using ideas from dictionary representations in signal processing coupled with low-rank tensor decompositions. To the best of our knowledge, this is the first work on estimating joint PDFs employing dictionaries alongside tensor decompositions. We create a dictionary of various families of distributions by inspecting the data, and use it to approximate each decomposed factor of the product in the mixture. Our approach can naturally handle hybrid $N$-dimensional distributions. We test our approach on a variety of synthetic and real datasets to demonstrate its effectiveness in terms of better classification rates and lower error rates, when compared to state-of-the-art estimators.

Preprint · 2020 · arXiv

Think out of the package: Recommending package types for e-commerce shipments

Multiple product attributes like dimensions, weight, fragility, liquid content, etc. determine the package type used by e-commerce companies to ship products. Sub-optimal package types lead to damaged shipments, incurring huge damage-related costs and adversely impacting the company's reputation for safe delivery. Items can be shipped in more protective packages to reduce damage costs; however, this increases the shipment costs due to expensive packaging and higher transportation costs. In this work, we propose a multi-stage approach that trades off between shipment and damage costs for each product, and accurately assigns the optimal package type using a scalable, computationally efficient linear-time algorithm. A simple binary search algorithm is presented to find the hyper-parameter that balances between the shipment and damage costs. Our approach, when applied to choosing package types for Amazon shipments, leads to significant cost savings of tens of millions of dollars in emerging marketplaces, by decreasing both the overall shipment cost and the number of in-transit damages. Our algorithm is live and deployed in the production system, where package types for more than 130,000 products have been modified based on the model's recommendation, realizing a reduction in damage rate of 24%.
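One way to picture a binary search over a trade-off hyper-parameter is sketched below. The abstract does not state the objective or the stopping criterion, so everything here is an assumption: each product picks the package type minimising `ship_cost + lam * damage_cost`, and the search returns (to within `tol`) the smallest `lam` whose total expected damage cost stays within a hypothetical `damage_budget`. The names `choose_packages`, `search_tradeoff`, and `damage_budget` are illustrative only.

```python
import numpy as np

def choose_packages(ship_cost, damage_cost, lam):
    """Per product, pick the package type minimising ship + lam * damage.

    ship_cost, damage_cost: (N, P) arrays for N products and P package types.
    One pass over the table, so the work is linear in N for fixed P.
    """
    return np.argmin(ship_cost + lam * damage_cost, axis=1)

def total_damage(damage_cost, choice):
    """Sum of the damage cost of the chosen package type for each product."""
    return damage_cost[np.arange(len(choice)), choice].sum()

def search_tradeoff(ship_cost, damage_cost, damage_budget, lam_hi=10.0, tol=1e-6):
    """Binary search for the smallest lam meeting the (assumed) damage budget.

    Relies on total damage being non-increasing as lam grows.
    """
    lo, hi = 0.0, lam_hi
    if total_damage(damage_cost, choose_packages(ship_cost, damage_cost, hi)) > damage_budget:
        raise ValueError("budget not reachable within [0, lam_hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        choice = choose_packages(ship_cost, damage_cost, mid)
        if total_damage(damage_cost, choice) <= damage_budget:
            hi = mid          # mid is feasible, try a smaller weight
        else:
            lo = mid          # mid is infeasible, need a larger weight
    return hi, choose_packages(ship_cost, damage_cost, hi)
```

A larger `lam` pushes products toward more protective (and more expensive) packaging, so the returned value marks the point where the assumed budget is just met.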

Preprint · 2016 · arXiv

Error bounds for gradient density estimation computed from a finite sample set using the method of stationary phase

For a twice continuously differentiable function $S$, we define the density function of its gradient (derivative in one dimension) $s = S^{\prime}$ as a random variable transformation of a uniformly distributed random variable using $s$ as the transformation function. Given $N$ values of $S$ sampled at equally spaced locations, we demonstrate using the method of stationary phase that the approximation error between the integral of the scaled, discrete power spectrum of the wave function $ϕ^{D}_τ=\frac{1}{\sqrt{L}}\exp\left(\frac{iS}τ\right)$ and the integral of the true density function of $s$ over an arbitrarily small interval is bounded above by $O(1/N)$ as $N \rightarrow \infty$ ($τ\rightarrow 0$). In addition to its easy implementation and fast computability in $O(N \log N)$ that only requires computing the discrete Fourier transform, our framework for obtaining the derivative density does not involve any parameter selection like the number of histogram bins, width of the histogram bins, width of the kernel parameter, number of mixture components etc. as required by other widely applied methods like histograms and Parzen windows.
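The idea of reading a derivative density off a power spectrum can be tried numerically. The sketch below is a minimal illustration, not the paper's estimator: it samples $S$ on a uniform grid, forms the wave function $\exp(iS/τ)/\sqrt{L}$, and sums the normalised discrete power spectrum over the frequency bins that correspond to a chosen gradient interval. The mapping from DFT frequency $f$ to gradient value $s = 2πτf$ follows from NumPy's DFT convention and is an assumption about scaling; the function name and arguments are hypothetical.

```python
import numpy as np

def gradient_density_mass(S_samples, length, tau, s_lo, s_hi):
    """Estimate P(s_lo <= S'(x) <= s_hi) for x uniform on an interval.

    S_samples: values of S at equally spaced points covering `length`.
    tau:       small free parameter of the wave function.
    Cost is O(N log N) because only one FFT is needed.
    """
    n = S_samples.size
    wave = np.exp(1j * S_samples / tau) / np.sqrt(length)
    power = np.abs(np.fft.fft(wave)) ** 2
    power /= power.sum()                        # total mass equals 1
    freqs = np.fft.fftfreq(n, d=length / n)     # cycles per unit length
    grads = 2.0 * np.pi * tau * freqs           # assumed frequency-to-gradient map
    in_range = (grads >= s_lo) & (grads <= s_hi)
    return power[in_range].sum()

# Example: S(x) = x^2 / 2 on [0, 1) has derivative s = x, uniform on [0, 1),
# so roughly half the mass should fall in [0.25, 0.75].
n = 4096
x = np.arange(n) / n
mass = gradient_density_mass(0.5 * x**2, length=1.0, tau=1e-3, s_lo=0.25, s_hi=0.75)
```

The grid must be fine enough that the largest gradient divided by $2πτ$ stays below the Nyquist frequency; otherwise the spectrum aliases.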

Preprint · 2015 · arXiv

A fast eikonal equation solver using the Schrodinger wave equation

We use a Schrödinger wave equation formalism to solve the eikonal equation. In our framework, a solution to the eikonal equation is obtained in the limit as Planck's constant $\hbar$ (treated as a free parameter) tends to zero of the solution to the corresponding linear Schrödinger equation. The Schrödinger equation corresponding to the eikonal turns out to be a \emph{generalized, screened Poisson equation}. Despite being linear, it does not have a closed-form solution for arbitrary forcing functions. We present two different techniques to solve the screened Poisson equation. In the first approach we use a standard perturbation analysis approach to derive a new algorithm which is guaranteed to converge provided the forcing function is bounded and positive. The perturbation technique requires a sequence of discrete convolutions which can be performed in $O(N\log N)$ using the Fast Fourier Transform (FFT) where $N$ is the number of grid points. In the second method we discretize the linear Laplacian operator by the finite difference method leading to a sparse linear system of equations which can be solved using the plethora of sparse solvers. The eikonal solution is recovered from the exponent of the resultant scalar field. Our approach eliminates the need to explicitly construct viscosity solutions as customary with direct solutions to the eikonal. Since the linear equation is computed for a small but non-zero $\hbar$, the obtained solution is an approximation. Though our solution framework is applicable to the general class of eikonal problems, we detail specifics for the popular vision applications of shape-from-shading, vessel segmentation, and path planning.

Preprint · 2015 · arXiv

A new variational principle for the Euclidean distance function: Linear approach to the non-linear eikonal problem

We present a fast convolution-based technique for computing an approximate, signed Euclidean distance function $S$ on a set of 2D and 3D grid locations. Instead of solving the non-linear, static Hamilton-Jacobi equation ($\|\nabla S\|=1$), our solution stems from first solving for a scalar field $ϕ$ in a linear differential equation and then deriving the solution for $S$ by taking the negative logarithm. In other words, when $S$ and $ϕ$ are related by $ϕ= \exp \left(-\frac{S}τ \right)$ and $ϕ$ satisfies a specific linear differential equation corresponding to the extremum of a variational problem, we obtain the approximate Euclidean distance function $S = -τ\log(ϕ)$ which converges to the true solution in the limit as $τ\rightarrow 0$. This is in sharp contrast to techniques like the fast marching and fast sweeping methods which directly solve the Hamilton-Jacobi equation by the Godunov upwind discretization scheme. Our linear formulation results in a closed-form solution to the approximate Euclidean distance function expressible as a discrete convolution, and hence efficiently computable using the fast Fourier transform (FFT). Our solution also circumvents the need for spatial discretization of the derivative operator. As $τ\rightarrow0$ we show the convergence of our results to the true solution and also bound the error for a given value of $τ$. The differentiability of our solution allows us to compute---using a set of convolutions---the first and second derivatives of the approximate distance function. In order to determine the sign of the distance function (defined to be positive inside a closed region and negative outside), we compute the winding number in 2D and the topological degree in 3D, whose computations can also be performed via fast convolutions. We demonstrate the efficacy of our method through a set of experimental results.
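The relation $S = -τ\log(ϕ)$ can be tried in one dimension with a few lines of NumPy. This is a minimal sketch under stated assumptions, not the paper's 2D/3D method: it computes an unsigned distance only, and it uses the unnormalised kernel $\exp(-|x|/τ)$, which is the 1D free-space Green's function (up to a constant) of the screened-Poisson-type equation $-τ^2 ϕ'' + ϕ = δ$. The paper's exact equation and normalisation may differ, and the function name is hypothetical.

```python
import numpy as np

def approx_distance_1d(source_mask, spacing, tau):
    """Approximate unsigned distance to the marked grid points.

    source_mask: length-n array, 1.0 at source points and 0.0 elsewhere.
    spacing:     grid spacing.
    tau:         small positive parameter; accuracy improves as tau shrinks,
                 but tau must stay large enough that exp(-distance / tau)
                 remains well above floating-point round-off.
    """
    n = source_mask.size
    offsets = np.arange(-(n - 1), n) * spacing
    kernel = np.exp(-np.abs(offsets) / tau)
    size = n + kernel.size - 1                  # length of the linear convolution
    spectrum = np.fft.rfft(source_mask, size) * np.fft.rfft(kernel, size)
    full = np.fft.irfft(spectrum, size)
    phi = full[n - 1 : 2 * n - 1]               # part aligned with the grid
    phi = np.maximum(phi, np.finfo(float).tiny) # guard the logarithm
    return -tau * np.log(phi)

# Example: three source points on [0, 1].
x = np.linspace(0.0, 1.0, 257)
mask = np.zeros_like(x)
mask[[64, 128, 192]] = 1.0
S = approx_distance_1d(mask, x[1] - x[0], tau=0.02)
```

With $K$ source points, the field satisfies $e^{-d/τ} \le ϕ \le K e^{-d/τ}$ for true distance $d$, so this sketch underestimates $d$ by at most $τ\log K$.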

Preprint · 2015 · arXiv

Sensitivity Analysis for additive STDP rule

Spike Timing Dependent Plasticity (STDP) is a Hebbian-like synaptic learning rule. STDP is supported by strong experimental evidence, and it depends on precise input and output spike timings. In this paper we show that under a biologically plausible spiking regime, slight variability in the spike timing leads to a drastically different evolution of the synaptic weights when their dynamics are governed by the additive STDP rule.

Preprint · 2013 · arXiv

An application of the stationary phase method for estimating probability densities of function derivatives

We prove a novel result wherein the density function of the gradients---corresponding to the density function of the derivatives in one dimension---of a thrice differentiable function $S$ (obtained via a random variable transformation of a uniformly distributed random variable) defined on a closed, bounded interval $Ω\subset \mathbb{R}$ is accurately approximated by the normalized power spectrum of $ϕ=\exp(iS/τ)$ as the free parameter $τ\rightarrow 0$. The result is shown using the well-known stationary phase approximation and standard integration techniques and requires proper ordering of limits. Experimental results provide anecdotal visual evidence corroborating the result.

Preprint · 2013 · arXiv

Distance Transform Gradient Density Estimation using the Stationary Phase Approximation

The complex wave representation (CWR) converts unsigned 2D distance transforms into their corresponding wave functions. Here, the distance transform $S(X)$ appears as the phase of the wave function $ϕ(X)$---specifically, $ϕ(X)=\exp(iS(X)/τ)$, where $τ$ is a free parameter. In this work, we prove a novel result using the higher-order stationary phase approximation: we show convergence of the normalized power spectrum (squared magnitude of the Fourier transform) of the wave function to the density function of the distance transform gradients as the free parameter $τ\rightarrow 0$. In colloquial terms, spatial frequencies are gradient histogram bins. Since the distance transform gradients have only orientation information (as their magnitudes are identically equal to one almost everywhere), as $τ\rightarrow 0$, the 2D Fourier transform values mainly lie on the unit circle in the spatial frequency domain. The proof of the result involves standard integration techniques and requires proper ordering of limits. Our mathematical relation indicates that the CWR of distance transforms is an intriguing, new representation.

Preprint · 2013 · arXiv

On the dynamic compressibility of sets

We define a new notion of compressibility of a set of numbers through the dynamics of a polynomial function. We provide approaches to solve the problem by reducing it to the multi-criteria traveling salesman problem through a series of transformations. We then establish computational complexity results by giving some NP-completeness proofs. We also discuss a notion of $ε$ K-compressibility of a set, with regard to lossy compression, and deduce the necessary condition for the given set to be $ε$ K-compressible. Finally, we conclude by providing a list of open problems, solutions to which could extend the applicability of our technique.
