Source author record

Vwani Roychowdhury

Vwani Roychowdhury appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computation and Language, Computer Vision, Machine Learning, quant-ph, Social and Information Networks, Artificial Intelligence, Computational Complexity, cond-mat.dis-nn, cond-mat.stat-mech, Multimedia, nlin.AO, physics.data-an

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators


Preprint, 2026, arXiv

An Exterior Method for Nonnegative Matrix Factorization

Nonnegative matrix factorization (NMF) seeks a low-rank approximation $X \approx UV^T$ with nonnegative factors and is commonly solved using interior methods that enforce feasibility throughout optimization. We show that such constraint-driven approaches can impede progress in the nonconvex landscape, leading to slow convergence or convergence to suboptimal stationary points. We propose an exterior framework for NMF (eNMF) that separates low-rank approximation from nonnegativity enforcement. Our method initializes from the optimal unconstrained factorization and introduces a rotation procedure that maps unconstrained factors to an exterior point closest to the nonnegative orthant. This viewpoint yields an algorithmic framework in which simple iterative updates converge to KKT-satisfying stationary points on the boundary of the positive orthant. The exterior formulation also enables a geometric interpretation of NMF solutions, clarifying equivalence classes of factorizations under permutation and orthogonal transformations. An intriguing numerical result, involving 400 NMF experiments across both real and synthetic datasets, shows that in 99% of cases, different algorithms tend to converge towards equivalent factor matrices. We benchmark eNMF against 9 state-of-the-art NMF algorithms with 9 initialization schemes across 3 real-world and 2 synthetic datasets. eNMF consistently outperforms all 81 competitors, achieving up to 30% lower reconstruction error under equal-time settings and up to 150% speedup under equal-error settings. The downstream experiments further demonstrate substantial performance gains in audio processing and recommendation tasks, corroborating the practical benefits of the proposed exterior optimization framework. Code is available at https://github.com/roychowdhuryresearch/eNMF
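The role of orthogonal transformations mentioned in this abstract can be illustrated with a small sketch. It is not the eNMF algorithm; it only checks the general identity $(UR)(VR)^T = UV^T$ for an orthogonal $R$, and shows that a rotation can move factors out of the nonnegative orthant without changing the product. The matrices `U` and `V`, the 45-degree angle, and all helper names are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def rotation(theta):
    """2x2 rotation matrix, a simple example of an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def count_negative(a, tol=1e-12):
    """Number of entries that are negative beyond a small tolerance."""
    return sum(1 for row in a for x in row if x < -tol)

# Hypothetical rank-2 nonnegative factors (not taken from the paper).
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[2.0, 0.0], [0.0, 3.0]]

R = rotation(math.pi / 4)
U_rot = matmul(U, R)
V_rot = matmul(V, R)

X = matmul(U, transpose(V))
X_rot = matmul(U_rot, transpose(V_rot))

# The product is unchanged, but the rotated factors are no longer nonnegative.
max_diff = max(abs(x - y) for rx, ry in zip(X, X_rot) for x, y in zip(rx, ry))
print("max |X - X_rot| =", max_diff)
print("negative entries in U, U_rot:", count_negative(U), count_negative(U_rot))
```

Because every orthogonal $R$ gives the same reconstruction, an unconstrained low-rank factorization is only defined up to such a transformation, which is why a search over rotations for a point near the nonnegative orthant is meaningful.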

Preprint, 2026, arXiv

Can Multimodal Large Language Models Understand Pathologic Movements? A Pilot Study on Seizure Semiology

Multimodal Large Language Models (MLLMs) have demonstrated robust capabilities in recognizing everyday human activities, yet their potential for analyzing clinically significant involuntary movements in neurological disorders remains largely unexplored. This pilot study evaluates the capability of MLLMs for automated recognition of pathological movements in seizure videos. We assessed the zero-shot performance of state-of-the-art MLLMs on 20 ILAE-defined semiological features across 90 clinical seizure recordings. MLLMs outperformed fine-tuned Convolutional Neural Network (CNN) and Vision Transformer (ViT) baseline models on 13 of 18 features without task-specific training, demonstrating particular strength in recognizing salient postural and contextual features while struggling with subtle, high-frequency movements. Feature-targeted signal enhancement (facial cropping, pose estimation, audio denoising) improved performance on 10 of 20 features. Expert evaluation showed that 94.3 percent of MLLM-generated explanations for correctly predicted cases achieved at least 60 percent faithfulness scores, aligning with epileptologist reasoning. These findings demonstrate the potential of adapting general-purpose MLLMs for specialized clinical video analysis through targeted preprocessing strategies, offering a path toward interpretable, efficient diagnostic assistance. Our code is publicly available at https://github.com/LinaZhangUCLA/PathMotionMLLM.

Preprint, 2026, arXiv

Vicsek Model Meets DBSCAN: Cluster Phases in the Vicsek Model

The Vicsek model, which was originally proposed to explain the dynamics of bird flocking, exhibits a phase transition with respect to the absolute value of the mean velocity. Although clusters of agents can be easily observed via numerical simulations of the Vicsek model, qualitative studies are lacking. We study the clustering structure of the Vicsek model by applying DBSCAN, a recently-introduced clustering algorithm, and report that the Vicsek model shows a phase transition with respect to the number of clusters: from O(N) to O(1), with N being the number of agents, when increasing the magnitude of noise for a fixed radius that specifies the interaction of the Vicsek model. We also report that the combination of the order parameter proposed by Vicsek et al. and the number of clusters defines at least four phases of the Vicsek model.
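The two measurements discussed in this abstract, the velocity order parameter and the DBSCAN cluster count, can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' code: it uses the standard textbook Vicsek update (average heading of neighbours within `radius`, plus uniform noise of width `eta`) in a periodic square box, and a straightforward DBSCAN with hypothetical parameters `eps` and `min_pts`.

```python
import math
import random

def periodic_dist(p, q, box):
    """Distance between two points in a periodic square box of side `box`."""
    dx = abs(p[0] - q[0]); dx = min(dx, box - dx)
    dy = abs(p[1] - q[1]); dy = min(dy, box - dy)
    return math.hypot(dx, dy)

def vicsek_step(pos, theta, box, radius, speed, eta, rng):
    """One synchronous update: align with neighbours, add noise, then move."""
    new_theta = []
    for p in pos:
        sx = sy = 0.0
        for q, t in zip(pos, theta):
            if periodic_dist(p, q, box) <= radius:  # includes the agent itself
                sx += math.cos(t)
                sy += math.sin(t)
        noise = eta * (rng.random() - 0.5)  # uniform in [-eta/2, eta/2)
        new_theta.append(math.atan2(sy, sx) + noise)
    new_pos = [((x + speed * math.cos(t)) % box, (y + speed * math.sin(t)) % box)
               for (x, y), t in zip(pos, new_theta)]
    return new_pos, new_theta

def order_parameter(theta):
    """Absolute value of the mean unit velocity (1 = fully aligned)."""
    n = len(theta)
    return math.hypot(sum(math.cos(t) for t in theta),
                      sum(math.sin(t) for t in theta)) / n

def dbscan_cluster_count(pos, box, eps, min_pts):
    """Number of DBSCAN clusters; min_pts counts the point itself."""
    n = len(pos)
    neighbors = [[j for j in range(n) if periodic_dist(pos[i], pos[j], box) <= eps]
                 for i in range(n)]
    labels = [None] * n  # None = unvisited, -1 = noise, >= 0 = cluster id
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster
        stack = list(neighbors[i])
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand the cluster
                stack.extend(neighbors[j])
        cluster += 1
    return cluster

# Illustrative run with arbitrary parameters.
rng = random.Random(0)
box, n_agents = 10.0, 100
pos = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n_agents)]
theta = [rng.uniform(-math.pi, math.pi) for _ in range(n_agents)]
for _ in range(50):
    pos, theta = vicsek_step(pos, theta, box, radius=1.0, speed=0.1, eta=0.5, rng=rng)
print(order_parameter(theta), dbscan_cluster_count(pos, box, eps=0.5, min_pts=3))
```

Sweeping `eta` while recording both numbers is one way to trace out phases of the kind the abstract describes; the specific values above are placeholders.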

Preprint, 2022, arXiv

Action-conditioned On-demand Motion Generation

We propose a novel framework, On-Demand MOtion Generation (ODMO), for generating realistic and diverse long-term 3D human motion sequences conditioned only on action types with an additional capability of customization. ODMO shows improvements over SOTA approaches on all traditional motion evaluation metrics when evaluated on three public datasets (HumanAct12, UESTC, and MoCap). Furthermore, we provide both qualitative evaluations and quantitative metrics demonstrating several first-known customization capabilities afforded by our framework, including mode discovery, interpolation, and trajectory customization. These capabilities significantly widen the spectrum of potential applications of such motion generation models. The novel on-demand generative capabilities are enabled by innovations in both the encoder and decoder architectures: (i) Encoder: Utilizing contrastive learning in low-dimensional latent space to create a hierarchical embedding of motion sequences, where not only the codes of different action types form different groups, but within an action type, codes of similar inherent patterns (motion styles) cluster together, making them readily discoverable; (ii) Decoder: Using a hierarchical decoding strategy where the motion trajectory is reconstructed first and then used to reconstruct the whole motion sequence. Such an architecture enables effective trajectory control. Our code is released on the GitHub page: https://github.com/roychowdhuryresearch/ODMO

Preprint, 2022, arXiv

Diverse Imitation Learning via Self-Organizing Generative Models

Imitation learning is the task of replicating expert policy from demonstrations, without access to a reward function. This task becomes particularly challenging when the expert exhibits a mixture of behaviors. Prior work has introduced latent variables to model variations of the expert policy. However, our experiments show that the existing works do not exhibit appropriate imitation of individual modes. To tackle this problem, we adopt an encoder-free generative model for behavior cloning (BC) to accurately distinguish and imitate different modes. Then, we integrate it with GAIL to make the learning robust towards compounding errors at unseen states. We show that our method significantly outperforms the state of the art across multiple experiments.

Preprint, 2022, arXiv

Which side are you on? Insider-Outsider classification in conspiracy-theoretic social media

Social media is a breeding ground for threat narratives and related conspiracy theories. In these, an outside group threatens the integrity of an inside group, leading to the emergence of sharply defined group identities: Insiders (agents with whom the authors identify) and Outsiders (agents who threaten the insiders). Inferring the members of these groups constitutes a challenging new NLP task: (i) Information is distributed over many poorly-constructed posts; (ii) Threats and threat agents are highly contextual, with the same post potentially having multiple agents assigned to membership in either group; (iii) An agent's identity is often implicit and transitive; and (iv) Phrases used to imply Outsider status often do not follow common negative sentiment patterns. To address these challenges, we define a novel Insider-Outsider classification task. Because we are not aware of any appropriate existing datasets or attendant models, we introduce a labeled dataset (CT5K) and design a model (NP2IO) to address this task. NP2IO leverages pretrained language modeling to classify Insiders and Outsiders. NP2IO is shown to be robust, generalizing to noun phrases not seen during training, and exceeding the performance of non-trivial baseline models by 20%.

Preprint, 2020, arXiv

An Automated Pipeline for Character and Relationship Extraction from Readers' Literary Book Reviews on Goodreads.com

Reader reviews of literary fiction on social media, especially those in persistent, dedicated forums, create and are in turn driven by underlying narrative frameworks. In their comments about a novel, readers generally include only a subset of characters and their relationships, thus offering a limited perspective on that work. Yet in aggregate, these reviews capture an underlying narrative framework comprised of different actants (people, places, things), their roles, and interactions that we label the "consensus narrative framework". We represent this framework in the form of an actant-relationship story graph. Extracting this graph is a challenging computational problem, which we pose as a latent graphical model estimation problem. Posts and reviews are viewed as samples of subgraphs/networks of the hidden narrative framework. Inspired by the qualitative narrative theory of Greimas, we formulate a graphical generative Machine Learning (ML) model where nodes represent actants, and multi-edges and self-loops among nodes capture context-specific relationships. We develop a pipeline of interlocking automated methods to extract key actants and their relationships, and apply it to thousands of reviews and comments posted on Goodreads.com. We manually derive the ground truth narrative framework from SparkNotes, and then use word embedding tools to compare relationships in ground truth networks with our extracted networks. We find that our automated methodology generates highly accurate consensus narrative frameworks: for our four target novels, with approximately 2900 reviews per novel, we report average coverage/recall of important relationships of over 80% and an average edge detection rate of over 89%. These extracted narrative frameworks can generate insight into how people (or classes of people) read and how they recount what they have read to others.
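The story-graph representation described in this abstract (actants as nodes, with multi-edges and self-loops for relationships, and each review contributing a partial subgraph) can be sketched as a simple aggregation step. This is only an illustration of the data structure, not the authors' extraction pipeline: the function name, the triple format `(source, relation, target)`, and the placeholder actants `"A"` and `"B"` are all assumptions.

```python
from collections import defaultdict

def aggregate_story_graph(posts):
    """Merge per-post relationship triples into one multigraph.

    `posts` is a list of posts, each a list of (source, relation, target)
    triples. The result maps each (source, target) node pair to a dict of
    relation labels and how many times each was observed. Several labels on
    one pair act as multi-edges; source == target is a self-loop.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for triples in posts:
        for source, relation, target in triples:
            graph[(source, target)][relation] += 1
    return {pair: dict(relations) for pair, relations in graph.items()}

# Hypothetical triples, each inner list standing for one review's partial view.
posts = [
    [("A", "mentors", "B")],
    [("A", "mentors", "B"), ("A", "betrays", "B")],
    [("B", "doubts", "B")],
]
graph = aggregate_story_graph(posts)
for pair, relations in graph.items():
    print(pair, relations)
```

No single post contains the whole graph here, yet the aggregate recovers both relationships between the two actants and the self-loop, which is the sense in which reviews act as samples of a hidden framework.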

Preprint, 2020, arXiv

An automated pipeline for the discovery of conspiracy and conspiracy theory narrative frameworks: Bridgegate, Pizzagate and storytelling on the web

Although a great deal of attention has been paid to how conspiracy theories circulate on social media and their factual counterpart conspiracies, there has been little computational work done on describing their narrative structures. We present an automated pipeline for the discovery and description of the generative narrative frameworks of conspiracy theories on social media, and actual conspiracies reported in the news media. We base this work on two separate repositories of posts and news articles describing the well-known conspiracy theory Pizzagate from 2016, and the New Jersey conspiracy Bridgegate from 2013. We formulate a graphical generative machine learning model where nodes represent actors/actants, and multi-edges and self-loops among nodes capture context-specific relationships. Posts and news items are viewed as samples of subgraphs of the hidden narrative network. The problem of reconstructing the underlying structure is posed as a latent model estimation problem. We automatically extract and aggregate the actants and their relationships from the posts and articles. We capture context specific actants and interactant relationships by developing a system of supernodes and subnodes. We use these to construct a network, which constitutes the underlying narrative framework. We show how the Pizzagate framework relies on the conspiracy theorists' interpretation of "hidden knowledge" to link otherwise unlinked domains of human interaction, and hypothesize that this multi-domain focus is an important feature of conspiracy theories. While Pizzagate relies on the alignment of multiple domains, Bridgegate remains firmly rooted in the single domain of New Jersey politics. We hypothesize that the narrative framework of a conspiracy theory might stabilize quickly in contrast to the narrative framework of an actual one, which may develop more slowly as revelations come to light.

Preprint, 2020, arXiv

Conspiracy in the Time of Corona: Automatic detection of Covid-19 Conspiracy Theories in Social Media and the News

Rumors and conspiracy theories thrive in environments of low confidence and low trust. Consequently, it is not surprising that ones related to the Covid-19 pandemic are proliferating given the lack of any authoritative scientific consensus on the virus, its spread and containment, or on the long term social and economic ramifications of the pandemic. Among the stories currently circulating are ones suggesting that the 5G network activates the virus, that the pandemic is a hoax perpetrated by a global cabal, that the virus is a bio-weapon released deliberately by the Chinese, or that Bill Gates is using it as cover to launch a global surveillance regime. While some may be quick to dismiss these stories as having little impact on real-world behavior, recent events including the destruction of property, racially fueled attacks against Asian Americans, and demonstrations espousing resistance to public health orders countermand such conclusions. Inspired by narrative theory, we crawl social media sites and news reports and, through the application of automated machine-learning methods, discover the underlying narrative frameworks supporting the generation of these stories. We show how the various narrative frameworks fueling rumors and conspiracy theories rely on the alignment of otherwise disparate domains of knowledge, and consider how they attach to the broader reporting on the pandemic. These alignments and attachments, which can be monitored in near real-time, may be useful for identifying areas in the news that are particularly vulnerable to reinterpretation by conspiracy theorists. Understanding the dynamics of storytelling on social media and the narrative frameworks that provide the generative basis for these stories may also be helpful for devising methods to disrupt their spread.

Preprint, 2011, arXiv

How much of quantum mechanics is really needed to defy Extended Church-Turing Thesis?

This paper has been withdrawn by the author as one of the coauthors needs institutional permission.

Preprint, 2005, arXiv

Eigenvalue Estimation of Differential Operators

We demonstrate how linear differential operators could be emulated by a quantum processor, should one ever be built, using the Abrams-Lloyd algorithm. Given a linear differential operator of order $2S$, acting on functions $\psi(x_1, x_2, \ldots, x_D)$ with $D$ arguments, the computational cost required to estimate a low order eigenvalue to accuracy $\Theta(1/N^2)$ is $\Theta((2(S+1)(1+1/\nu)+D)\log N)$ qubits and $O(N^{2(S+1)(1+1/\nu)} (D \log N)^c)$ gate operations, where $N$ is the number of points to which each argument is discretized, and $\nu$ and $c$ are implementation-dependent constants of $O(1)$. Optimal classical methods require $\Theta(N^D)$ bits and $\Omega(N^D)$ gate operations to perform the same eigenvalue estimation. The Abrams-Lloyd algorithm thereby leads to exponential reduction in memory and polynomial reduction in gate operations, provided the domain has sufficiently large dimension $D > 2(S+1)(1+1/\nu)$. In the case of Schrödinger's equation, ground state energy estimation of two or more particles can in principle be performed with fewer quantum mechanical gates than classical gates.
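The cost comparison in this abstract reduces to comparing two exponents of N, which a short sketch makes concrete. The functions below only evaluate the stated scalings and ignore the polylogarithmic factor (D log N)^c and all hidden constants. The function names are hypothetical, the value nu = 4 is an arbitrary placeholder because the abstract only says nu is a constant of order 1, and the base-2 logarithm is an assumption.

```python
import math

def quantum_gate_exponent(S, nu):
    """Exponent of N in the quantum gate count: 2(S+1)(1+1/nu)."""
    return 2 * (S + 1) * (1 + 1 / nu)

def classical_gate_exponent(D):
    """Exponent of N in the classical lower bound Omega(N^D)."""
    return D

def quantum_has_fewer_gates(S, nu, D):
    """The abstract's condition D > 2(S+1)(1+1/nu), ignoring polylog factors."""
    return classical_gate_exponent(D) > quantum_gate_exponent(S, nu)

def qubit_scaling(S, nu, D, N):
    """Qubit count up to a constant factor: (2(S+1)(1+1/nu) + D) * log2(N)."""
    return (quantum_gate_exponent(S, nu) + D) * math.log2(N)

# Second-order operator (2S = 2, so S = 1) with an assumed nu = 4.
S, nu = 1, 4.0
for D in (3, 6):  # e.g. one or two particles in three dimensions
    print(D, quantum_gate_exponent(S, nu), quantum_has_fewer_gates(S, nu, D))

# Memory comparison for D = 6 and N = 1024 grid points per argument.
print(qubit_scaling(S, nu, 6, 1024), "qubits vs", 1024 ** 6, "classical bits")
```

With these placeholder values the quantum exponent is 5, so the condition fails for D = 3 and holds for D = 6; a different nu would shift that threshold.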

Preprint, 2005, arXiv

Multiple Scale-Free Structures in Complex Ad-Hoc Networks

This paper develops a framework for analyzing and designing dynamic networks comprising different classes of nodes that coexist and interact in one shared environment. We consider ad hoc (i.e., nodes can leave the network unannounced, and no node has any global knowledge about the class identities of other nodes) preferentially grown networks, where different classes of nodes are characterized by different sets of local parameters used in the stochastic dynamics that all nodes in the network execute. We show that multiple scale-free structures, one within each class of nodes, and with tunable power-law exponents (as determined by the sets of parameters characterizing each class) emerge naturally in our model. Moreover, the coexistence of the scale-free structures of the different classes of nodes can be captured by succinct phase diagrams, which show a rich set of structures, including stable regions where different classes coexist in heavy-tailed and light-tailed states, and sharp phase transitions. Finally, we show how the dynamics formulated in this paper will serve as an essential part of ad hoc networking protocols, which can lead to the formation of robust and efficiently searchable networks (including the well-known Peer-To-Peer (P2P) networks) even under very dynamic conditions.
