Researcher profile

Amit Goyal

Amit Goyal contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
10 works
0 followers
10 topics
4 close collaborators

Published work

10 published items

Preprint, 2026, arXiv

SAGE: Scalable Automatic Gating Ensemble for Confident Negative Harvesting in Fraud Detection

Music streaming fraud, where bad actors artificially inflate stream counts to manipulate chart rankings and royalty payments, poses a significant threat to streaming services and legitimate content creators. Traditional fraud detection approaches struggle with a critical challenge: many legitimate edge cases, including super-fans and sleep-music sessions, exhibit activity patterns that closely mimic those of coordinated fraud. We present SAGE, a novel counterfactual-aware negative harvesting approach that combines SimHash-based stratified sampling with a modular gating ensemble for confident negative identification from unlabeled data. Our ensemble architecture employs pluggable statistical gates (currently instantiated with Mahalanobis distance and k-NN density) with configurable voting thresholds enabling adaptive precision-recall trade-offs. This addresses the representation bias problem in Positive-Unlabeled learning by ensuring comprehensive coverage of rare behavioral cohorts through floor-constrained sampling. Evaluation demonstrates strong precision and recall on held-out data. The approach generalizes across fraud detection domains, achieving strong performance on both customer-level and artist-level fraud without modification to the core methodology.
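The gating idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-dimensional features, the direction of each gate (a gate "passes" when a point lies far from the known-fraud cloud), the thresholds, and the hypothetical names `mahalanobis_gate`, `knn_gate` and `harvest_negatives`. The SimHash-based stratified sampling step is not covered.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Gate = Callable[[Point], bool]


def mahalanobis_gate(positives: Sequence[Point], threshold: float) -> Gate:
    """Pass a point whose Mahalanobis distance from the positives exceeds threshold.

    Assumes 2-D features, at least three positives, and a non-singular covariance.
    """
    n = len(positives)
    mx = sum(p[0] for p in positives) / n
    my = sum(p[1] for p in positives) / n
    sxx = sum((p[0] - mx) ** 2 for p in positives) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in positives) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in positives) / (n - 1)
    det = sxx * syy - sxy * sxy  # determinant of the 2x2 sample covariance

    def gate(x: Point) -> bool:
        dx, dy = x[0] - mx, x[1] - my
        # Quadratic form d^T S^{-1} d using the closed-form 2x2 inverse.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        return math.sqrt(d2) > threshold

    return gate


def knn_gate(positives: Sequence[Point], k: int, radius: float) -> Gate:
    """Pass a point whose k-th nearest positive is farther away than radius."""

    def gate(x: Point) -> bool:
        dists = sorted(math.dist(x, p) for p in positives)
        return dists[k - 1] > radius

    return gate


def harvest_negatives(
    unlabeled: Sequence[Point], gates: Sequence[Gate], min_votes: int
) -> List[Point]:
    """Keep unlabeled points that at least min_votes gates agree are far from fraud."""
    return [x for x in unlabeled if sum(g(x) for g in gates) >= min_votes]


# Toy data: four known-fraud points and three unlabeled points.
fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
unlabeled = [(0.5, 0.5), (5.0, 5.0), (0.6, 0.4)]
gates = [mahalanobis_gate(fraud, threshold=3.0), knn_gate(fraud, k=2, radius=2.0)]
confident_negatives = harvest_negatives(unlabeled, gates, min_votes=2)
```

Raising `min_votes` trades recall for precision, which is one way to read the "configurable voting thresholds" mentioned above; adding a gate only requires another callable with the same signature.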

Preprint, 2021, arXiv

Phase modulated domain walls and dark solitons for surface gravity waves

We report a theoretical prediction of exact localized solutions for the dynamics of surface gravity waves at the critical point kh = 1.363, modelled by a higher-order nonlinear Schrödinger equation. The model possesses domain walls (kink solitons) and dark solitons modulated through different phase profiles. The parametric domains are delineated for the existence of soliton solutions. The effect of wave parameters on the amplitude of surface gravity waves is discussed. Our work is motivated by Tsitoura et al. [1], on experimental and analytical observation of phase domain walls for deep water surface gravity waves modelled by the nonlinear Schrödinger equation.

Preprint, 2020, arXiv

Chirped Lambert W-kink solitons of the complex cubic-quintic Ginzburg-Landau equation with intrapulse Raman scattering

In this paper, an exact explicit solution for the complex cubic-quintic Ginzburg-Landau equation is obtained using the Lambert W function, also known as the omega function. More pertinently, we term them Lambert W-kink-type solitons, begotten under the influence of intrapulse Raman scattering. Parameter domains are delineated in which these optical solitons exist in the ensuing model. We report the effect of model coefficients on the amplitude of Lambert W-kink solitons, which enables us to control efficiently the pulse intensity and hence their subsequent evolution. Also, moving fronts or optical shock-type solitons are obtained as a byproduct of this model. We explicate the mechanism to control the intensity of these fronts by fine-tuning the spectral filtering or gain parameter. It is shown that the frequency chirp associated with these optical solitons depends on the intensity of the wave and saturates to a constant value as the retarded time approaches its asymptotic value.
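For readers unfamiliar with the Lambert W function named in this abstract: it is defined implicitly by W(z) exp(W(z)) = z. The sketch below evaluates the principal branch for z >= 0 with Newton's method. It only illustrates the special function itself, not the soliton solution from the paper; the function name `lambert_w`, the starting guess and the tolerance are choices made for this example.

```python
import math


def lambert_w(z: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Principal-branch W(z) for z >= 0, i.e. the root w of w * exp(w) = z."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    # log(1 + z) is an upper bound on W(z) for z >= 0, so Newton's method
    # on the convex function w * exp(w) - z descends monotonically to the root.
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


omega = lambert_w(1.0)  # W(1) is the omega constant, about 0.567143
```

In practice a library routine such as `scipy.special.lambertw` would normally be preferred; the hand-written version is shown only to make the defining equation concrete.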

Preprint, 2020, arXiv

Controlled self-similar matter waves in PT-symmetric waveguide

We study the dynamics of a Bose-Einstein condensate coupled to a waveguide with a parity-time symmetric potential in the presence of quadratic-cubic nonlinearity, modelled by the Gross-Pitaevskii equation with an external source. We employ the self-similar technique to obtain matter wave solutions, such as bright, kink-type, rational dark and Lorentzian-type self-similar waves for this model. The dynamical behavior of self-similar matter waves can be controlled through variation of the trapping potential, the external source and the nature of the nonlinearities present in the system.

Preprint, 2016, arXiv

Convex Factorization Machine for Regression

We propose the convex factorization machine (CFM), which is a convex variant of the widely used Factorization Machines (FMs). Specifically, we employ a linear+quadratic model and regularize the linear term with the $\ell_2$-regularizer and the quadratic term with the trace norm regularizer. Then, we formulate the CFM optimization as a semidefinite programming problem and propose an efficient optimization procedure with Hazan's algorithm. A key advantage of CFM over existing FMs is that it can find a globally optimal solution, while FMs may get a poor locally optimal solution since the objective function of FMs is non-convex. In addition, the proposed algorithm is simple yet effective and can be implemented easily. Finally, CFM is a general factorization method and can also be used for other factorization problems, including multi-view matrix factorization and tensor completion problems. Through synthetic and MovieLens datasets, we first show that the proposed CFM achieves results competitive to FMs. Furthermore, in a toxicogenomics prediction task, we show that CFM outperforms a state-of-the-art tensor factorization method.
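As a rough guide to the formulation this abstract describes, a linear+quadratic model with the two regularizers can be written as below. The squared loss, the symbols $w_0$, $w$, $W$, $\lambda_1$, $\lambda_2$, the restriction of the quadratic term to pairs $l < l'$, and the penalized (rather than constrained) form are all assumptions made for illustration; the paper's exact parameterization may differ.

```latex
% Linear + quadratic prediction for a feature vector x in R^d
f(x) = w_0 + w^\top x + \sum_{l < l'} W_{l l'} \, x_l \, x_{l'}

% Regularized fit over n training pairs (x_i, y_i)
\min_{w_0,\, w,\, W} \;
  \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2
  + \lambda_1 \lVert w \rVert_2^2
  + \lambda_2 \lVert W \rVert_{\mathrm{tr}}
```

Because $f$ is linear in $(w_0, w, W)$ and both regularizers are convex, this objective is convex in the parameters. A standard FM instead factorizes the interaction weights as inner products of latent vectors, which makes its objective non-convex in those vectors.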

Preprint, 2015, arXiv

Viral Marketing Meets Social Advertising: Ad Allocation with Minimum Regret

In this paper, we study the problem of allocating ads to users through the viral-marketing lens. Advertisers approach the host with a budget in return for the marketing campaign service provided by the host. We show that allocation that takes into account the propensity of ads for viral propagation can achieve significantly better performance. However, uncontrolled virality could be undesirable for the host as it creates room for exploitation by the advertisers: hoping to tap uncontrolled virality, an advertiser might declare a lower budget for its marketing campaign, aiming at the same large outcome with a smaller cost. This creates a challenging trade-off: on the one hand, the host aims at leveraging virality and the network effect to improve advertising efficacy, while on the other hand the host wants to avoid giving away free service due to uncontrolled virality. We formalize this as the problem of ad allocation with minimum regret, which we show is NP-hard and inapproximable w.r.t. any factor. However, we devise an algorithm that provides approximation guarantees w.r.t. the total budget of all advertisers. We develop a scalable version of our approximation algorithm, which we extensively test on four real-world data sets, confirming that our algorithm delivers high quality solutions, is scalable, and significantly outperforms several natural baselines.

Preprint, 2014, arXiv

Few-cycle optical solitary waves in cascaded-quadratic-cubic-quintic nonlinear media

We study the propagation of few-cycle optical solitary waves in nonlinear media under the combined action of quadratic, cubic and quintic nonlinearities in a large phase-mismatched second-harmonic generation (SHG) process. Exact bright and dark soliton solutions to the nonlinear evolution equation for cascaded quadratic media beyond the slowly varying envelope approximation are reported. The analytical solutions obtained are verified through numerical simulations.

Preprint, 2013, arXiv

Validating Network Value of Influencers by means of Explanations

Recently, there has been significant interest in social influence analysis. One of the central problems in this area is the problem of identifying influencers, such that by convincing these users to perform a certain action (like buying a new product), a large number of other users get influenced to follow the action. The client of such an application is a marketer who would target these influencers for marketing a given new product, say by providing free samples or discounts. It is natural that before committing resources for targeting an influencer the marketer would be interested in validating the influence (or network value) of influencers returned. This requires digging deeper into such analytical questions as: who are their followers, on what actions (or products) they are influential, etc. However, the current approaches to identifying influencers largely work as a black box in this respect. The goal of this paper is to open up the black box, address these questions and provide informative and crisp explanations for validating the network value of influencers. We formulate the problem of providing explanations (called PROXI) as a discrete optimization problem of feature selection. We show that PROXI is not only NP-hard to solve exactly, but also NP-hard to approximate within any reasonable factor. Nevertheless, we show interesting properties of the objective function and develop an intuitive greedy heuristic. We perform detailed experimental analysis on two real-world datasets, Twitter and Flixster, and show that our approach is useful in generating concise and insightful explanations of the influence distribution of users and that our greedy algorithm is effective and efficient with respect to several baselines.

Preprint, 2011, arXiv

A Data-Based Approach to Social Influence Maximization

Influence maximization is the problem of finding a set of users in a social network, such that by targeting this set, one maximizes the expected spread of influence in the network. Most of the literature on this topic has focused exclusively on the social graph, overlooking historical data, i.e., traces of past action propagations. In this paper, we study influence maximization from a novel data-based perspective. In particular, we introduce a new model, which we call credit distribution, that directly leverages available propagation traces to learn how influence flows in the network and uses this to estimate expected influence spread. Our approach also learns the different levels of influenceability of users, and it is time-aware in the sense that it takes the temporal nature of influence into account. We show that influence maximization under the credit distribution model is NP-hard and that the function that defines expected spread under our model is submodular. Based on these, we develop an approximation algorithm for solving the influence maximization problem that at once enjoys high accuracy compared to the standard approach, while being several orders of magnitude faster and more scalable.
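Submodularity matters here because, for a monotone submodular objective under a size budget, the standard greedy algorithm is known to achieve a (1 - 1/e) approximation. The sketch below shows that generic greedy loop. It does not implement the credit distribution model: the `coverage_spread` function and the `REACH` table are hypothetical stand-ins for a learned spread estimate, and the function names are chosen for this example.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Set

SpreadFn = Callable[[FrozenSet[str]], float]

# Stand-in data: which users each candidate seed would reach (hypothetical).
REACH: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def coverage_spread(seeds: FrozenSet[str]) -> float:
    """Number of distinct users reached; a monotone submodular toy objective."""
    reached: Set[int] = set()
    for s in seeds:
        reached |= REACH[s]
    return float(len(reached))


def greedy_seeds(candidates: Sequence[str], spread: SpreadFn, k: int) -> List[str]:
    """Pick up to k seeds, each time adding the one with the largest marginal gain."""
    seeds: List[str] = []
    current = spread(frozenset())
    for _ in range(k):
        best, best_gain = None, 0.0
        for u in candidates:
            if u in seeds:
                continue
            gain = spread(frozenset(seeds + [u])) - current
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:  # no candidate adds any spread
            break
        seeds.append(best)
        current += best_gain
    return seeds


chosen = greedy_seeds(list(REACH), coverage_spread, k=2)
```

Swapping `coverage_spread` for any other monotone submodular spread estimate leaves the loop unchanged, which is why the choice of spread model and the selection procedure can be discussed separately.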

Preprint, 2011, arXiv

Approximation Analysis of Influence Spread in Social Networks

In the context of influence propagation in a social graph, we can identify three orthogonal dimensions - the number of seed nodes activated at the beginning (known as budget), the expected number of activated nodes at the end of the propagation (known as expected spread or coverage), and the time taken for the propagation. We can constrain one or two of these and try to optimize the third. In their seminal paper, Kempe et al. constrained the budget, left time unconstrained, and maximized the coverage: this problem is known as Influence Maximization. In this paper, we study alternative optimization problems which are naturally motivated by resource and time constraints on viral marketing campaigns. In the first problem, termed Minimum Target Set Selection (or MINTSS for short), a coverage threshold n is given and the task is to find the minimum size seed set such that by activating it, at least n nodes are eventually activated in the expected sense. In the second problem, termed MINTIME, a coverage threshold n and a budget threshold k are given, and the task is to find a seed set of size at most k such that by activating it, at least n nodes are activated, in the minimum possible time. Both these problems are NP-hard, which motivates our interest in their approximation. For MINTSS, we develop a simple greedy algorithm and show that it provides a bicriteria approximation. We also establish a generic hardness result suggesting that improving it is likely to be hard. For MINTIME, we show that even bicriteria and tricriteria approximations are hard under several conditions. However, if we allow the budget to be boosted by a logarithmic factor and allow the coverage to fall short, then the problem can be solved exactly in PTIME. Finally, we show the value of the approximation algorithms, by comparing them against various heuristics.
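The MINTSS setting can be sketched as a greedy loop that keeps adding seeds until a coverage target is met. This is a minimal illustration and not the paper's algorithm or analysis: the `shortfall` parameter is an assumed way of expressing the bicriteria relaxation (accepting coverage slightly below the threshold), and `toy_spread` with its `AUDIENCE` table is a hypothetical stand-in for an expected-spread estimate.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Set

# Stand-in data: users activated by each candidate seed (hypothetical).
AUDIENCE: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def toy_spread(seeds: FrozenSet[str]) -> float:
    """Count of distinct activated users; stands in for expected spread."""
    active: Set[int] = set()
    for s in seeds:
        active |= AUDIENCE[s]
    return float(len(active))


def greedy_min_target_set(
    candidates: Sequence[str],
    spread: Callable[[FrozenSet[str]], float],
    threshold: float,
    shortfall: float = 0.0,
) -> List[str]:
    """Add seeds greedily until spread >= threshold - shortfall."""
    seeds: List[str] = []
    current = spread(frozenset())
    while current < threshold - shortfall:
        best, best_gain = None, 0.0
        for u in candidates:
            if u in seeds:
                continue
            gain = spread(frozenset(seeds + [u])) - current
            if gain > best_gain:
                best, best_gain = u, gain
        if best is None:
            raise ValueError("coverage target cannot be reached")
        seeds.append(best)
        current += best_gain
    return seeds


exact = greedy_min_target_set(list(AUDIENCE), toy_spread, threshold=7)
relaxed = greedy_min_target_set(list(AUDIENCE), toy_spread, threshold=7, shortfall=3)
```

On this toy data, relaxing the target lets the loop stop with a smaller seed set, which is the trade-off a bicriteria guarantee formalizes.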