Source author record

Sanguthevar Rajasekaran

Sanguthevar Rajasekaran appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Data Structures and Algorithms; Computational Engineering, Finance, and Science; Computation and Language; Cryptography and Security; Distributed, Parallel, and Cluster Computing; Machine Learning; Quantitative Methods; Computational Complexity; Emerging Technologies; Genomics; math.OC

Catalog footprint

What is connected

17 works

11 topics

4 close collaborators


preprint, 2023, arXiv

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit. However, under the trending pretrain-and-finetune paradigm, we postulate a counter-traditional hypothesis, that is: pruning increases the risk of overfitting when performed at the fine-tuning phase. In this paper, we aim to address the overfitting problem and improve pruning performance via progressive knowledge distillation with error-bound properties. We show for the first time that reducing the risk of overfitting can help the effectiveness of pruning under the pretrain-and-finetune paradigm. Ablation studies and experiments on the GLUE benchmark show that our method outperforms the leading competitors across different tasks.

preprint, 2022, arXiv

A Secure and Efficient Federated Learning Framework for NLP

In this work, we consider the problem of designing secure and efficient federated learning (FL) frameworks. Existing solutions either involve a trusted aggregator or require heavyweight cryptographic primitives, which degrades performance significantly. Moreover, many existing secure FL designs work only under the restrictive assumption that no client drops out of the training protocol. To tackle these problems, we propose SEFL, a secure and efficient FL framework that (1) eliminates the need for trusted entities; (2) achieves similar or even better model accuracy compared with existing FL designs; and (3) is resilient to client dropouts. Through extensive experimental studies on natural language processing (NLP) tasks, we demonstrate that SEFL achieves accuracy comparable to existing FL solutions, and that the proposed pruning technique can improve runtime performance by up to 13.7x.

preprint, 2020, arXiv

MSPP: A Highly Efficient and Scalable Algorithm for Mining Similar Pairs of Points

The closest pair of points problem or closest pair problem (CPP) is an important problem in computational geometry where we have to find the pair of points, from a set of points in a metric space, with the smallest distance between them. This problem arises in a number of applications, including clustering, graph partitioning, image processing, pattern identification, and intrusion detection. For example, in air-traffic control, we must monitor aircraft that come too close together, since this may indicate a possible collision. Numerous algorithms have been presented for solving the CPP. The algorithms that are employed in practice have a worst case quadratic run time complexity. In this article we present an elegant approximation algorithm for the CPP called MSPP: Mining Similar Pairs of Points. It is faster than the best currently known algorithms while maintaining very good accuracy. The proposed algorithm also detects a set of closely similar pairs of points in Euclidean and Pearson metric spaces and can be adapted to numerous real world applications, such as clustering, dimension reduction, and constructing and analyzing gene/transcript co-expression networks, among others.

preprint, 2020, arXiv

SAPAG: A Self-Adaptive Privacy Attack From Gradients

Distributed learning such as federated learning or collaborative learning enables model training on decentralized data from users and only collects local gradients, so that data is processed close to its sources for data privacy. Keeping the training data decentralized is meant to address the privacy concerns around sensitive data. Recent studies show that a third party can reconstruct the true training data in the distributed machine learning system through the publicly-shared gradients. However, existing reconstruction attack frameworks lack generalizability across different Deep Neural Network (DNN) architectures and different weight initialization distributions, and can only succeed in the early training phase. To address these limitations, in this paper, we propose a more general privacy attack from gradients, SAPAG, which uses a Gaussian kernel based on gradient difference as a distance measure. Our experiments demonstrate that SAPAG can reconstruct the training data on different DNNs with different weight initializations and on DNNs in any training phase.

preprint, 2016, arXiv

Hybrid-DCA: A Double Asynchronous Approach for Stochastic Dual Coordinate Ascent

In prior works, stochastic dual coordinate ascent (SDCA) has been parallelized in a multi-core environment where the cores communicate through shared memory, or in a multi-processor distributed memory environment where the processors communicate through message passing. In this paper, we propose a hybrid SDCA framework for multi-core clusters, the most common high performance computing environment that consists of multiple nodes each having multiple cores and its own shared memory. We distribute data across nodes where each node solves a local problem in an asynchronous parallel fashion on its cores, and then the local updates are aggregated via an asynchronous across-node update scheme. The proposed double asynchronous method converges to a global solution for $L$-Lipschitz continuous loss functions, and at a linear convergence rate if a smooth convex loss function is used. Extensive empirical comparison has shown that our algorithm scales better than the best known shared-memory methods and runs faster than previous distributed-memory methods. Big datasets, such as one of 280 GB from the LIBSVM repository, cannot be accommodated on a single node and hence cannot be solved by a parallel algorithm. For such a dataset, our hybrid algorithm takes 30 seconds to achieve a duality gap of $10^{-6}$ on 16 nodes each using 8 cores, which is significantly faster than the best known distributed algorithms, such as CoCoA+, that take more than 300 seconds on 16 nodes.

preprint, 2016, arXiv

On pattern matching with k mismatches and few don't cares

We consider the problem of pattern matching with $k$ mismatches, where there can be don't care or wild card characters in the pattern. Specifically, given a pattern $P$ of length $m$ and a text $T$ of length $n$, we want to find all occurrences of $P$ in $T$ that have no more than $k$ mismatches. The pattern can have don't care characters, which match any character. Without don't cares, the best known algorithm for pattern matching with $k$ mismatches has a runtime of $O(n\sqrt{k \log k})$. With don't cares in the pattern, the best deterministic algorithm has a runtime of $O(nk\,\mathrm{polylog}\, m)$. Therefore, there is an important gap between the versions with and without don't cares. In this paper we give an algorithm whose runtime increases with the number of don't cares. We define an island to be a maximal length substring of $P$ that does not contain don't cares. Let $q$ be the number of islands in $P$. We present an algorithm that runs in $O(n\sqrt{k\log m}+n\min\{\sqrt[3]{qk\log^2 m},\sqrt{q\log m}\})$ time. If the number of islands $q$ is $O(k)$ this runtime becomes $O(n\sqrt{k\log m})$, which essentially matches the best known runtime for pattern matching with $k$ mismatches without don't cares. If the number of islands $q$ is $O(k^2)$, this algorithm is asymptotically faster than the previous best algorithm for pattern matching with $k$ mismatches with don't cares in the pattern.
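To make the problem statement concrete, the following is a minimal sketch of the naive $O(nm)$ baseline, not the algorithm from the paper. The function names and the choice of `?` as the don't care character are assumptions made for illustration only.

```python
def count_islands(pattern, wildcard="?"):
    """Count maximal wildcard-free substrings (islands) of the pattern.

    The wildcard character '?' is an illustrative assumption.
    """
    islands = 0
    in_island = False
    for ch in pattern:
        if ch != wildcard:
            if not in_island:
                islands += 1  # a new island starts here
            in_island = True
        else:
            in_island = False
    return islands


def k_mismatch_occurrences(text, pattern, k, wildcard="?"):
    """Naive O(nm) scan: start positions where pattern matches text
    with at most k mismatches; wildcard positions never count as mismatches."""
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if pattern[j] != wildcard and pattern[j] != text[i + j]:
                mismatches += 1
                if mismatches > k:
                    break  # this alignment already exceeds the budget
        if mismatches <= k:
            hits.append(i)
    return hits
```

For example, the pattern `ab?c??de` has three islands (`ab`, `c`, `de`), so $q = 3$ in the notation of the abstract.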

preprint, 2014, arXiv

An error correcting parser for context free grammars that takes less than cubic time

The problem of parsing has been studied extensively for various formal grammars. Given an input string and a grammar, the parsing problem is to check if the input string belongs to the language generated by the grammar. A closely related problem of great importance is one where the input is a string ${\cal I}$ and a grammar $G$, and the task is to produce a string ${\cal I}'$ that belongs to the language generated by $G$ such that the distance between ${\cal I}$ and ${\cal I}'$ is the smallest (from among all the strings in the language). Specifically, if ${\cal I}$ is in the language generated by $G$, then the output should be ${\cal I}$. Any parser that solves this version of the problem is called an error correcting parser. In 1972 Aho and Peterson presented a cubic time error correcting parser for context free grammars. Since then this asymptotic time bound has not been improved under the (standard) assumption that the grammar size is a constant. In this paper we present an error correcting parser for context free grammars that runs in $O(T(n))$ time, where $n$ is the length of the input string and $T(n)$ is the time needed to compute the tropical product of two $n\times n$ matrices. We also present an $\frac{n}{M}$-approximation algorithm for the language edit distance problem that has a run time of $O(Mn^ω)$, where $O(n^ω)$ is the time taken to multiply two $n\times n$ matrices. To the best of our knowledge, no approximation algorithms have been proposed for error correcting parsing for general context free grammars.
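The tropical product mentioned above can be illustrated with a short sketch. It assumes the min-plus convention, in which addition is replaced by `min` and multiplication by `+`, and it is the straightforward cubic-time version rather than any faster method the paper relies on. The function name is hypothetical.

```python
def tropical_product(a, b):
    """Min-plus (tropical) product of two matrices given as lists of lists.

    c[i][j] = min over k of (a[i][k] + b[k][j]).
    Straightforward cubic-time version, for illustration only.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


INF = float("inf")  # plays the role of "no path" / additive identity for min
a = [[0, 2], [INF, 1]]
b = [[1, 5], [0, 3]]
c = tropical_product(a, b)
```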

preprint, 2014, arXiv

Efficient Algorithms for the Closest Pair Problem and Applications

The closest pair problem (CPP) is one of the well-studied and fundamental problems in computing. Given a set of points in a metric space, the problem is to identify the pair of closest points. Another closely related problem is the fixed radius nearest neighbors problem (FRNNP). Given a set of points and a radius $R$, the problem is, for every input point $p$, to identify all the other input points that are within a distance of $R$ from $p$. A naive deterministic algorithm can solve these problems in quadratic time. CPP as well as FRNNP play a vital role in computational biology, computational finance, share market analysis, weather prediction, entomology, electrocardiography, N-body simulations, molecular simulations, etc. As a result, any improvements made in solving CPP and FRNNP will have immediate implications for the solution of numerous problems in these domains. We live in an era of big data, and processing these data takes large amounts of time. Speeding up data processing algorithms is thus more essential now than ever before. In this paper we present algorithms for CPP and FRNNP that improve (in theory and/or practice) the best-known algorithms reported in the literature for CPP and FRNNP. These algorithms also improve the best-known algorithms for related applications including time series motif mining and the two locus problem in Genome Wide Association Studies (GWAS).
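The naive quadratic-time baseline mentioned in the abstract can be sketched as follows. This is only the brute-force reference point, not the paper's algorithms. It assumes Euclidean distance and points given as coordinate tuples, and the function names are hypothetical.

```python
import math
from itertools import combinations


def closest_pair(points):
    """Brute-force CPP: examine all pairs and return the closest one."""
    return min(combinations(points, 2), key=lambda pq: math.dist(pq[0], pq[1]))


def fixed_radius_neighbors(points, radius):
    """Brute-force FRNNP: for each point index, list the indices of all
    other points within the given radius (inclusive)."""
    neighbors = {i: [] for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= radius:
            neighbors[i].append(j)
            neighbors[j].append(i)
    return neighbors


points = [(0, 0), (5, 5), (1, 0), (9, 9)]
pair = closest_pair(points)
near = fixed_radius_neighbors(points, 1.5)
```

Both functions examine every pair of points once, which is where the quadratic running time comes from.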

preprint, 2013, arXiv

An Elegant Algorithm for the Construction of Suffix Arrays

The suffix array is a data structure that finds numerous applications in string processing problems for both linguistic texts and biological data. It was introduced as a memory-efficient alternative to suffix trees. The suffix array consists of the sorted suffixes of a string. There are several linear time suffix array construction algorithms (SACAs) known in the literature. However, one of the fastest algorithms in practice has a worst case run time of $O(n^2)$. The problem of designing practically and theoretically efficient techniques remains open. In this paper we present an elegant algorithm for suffix array construction which takes linear time with high probability; the probability is on the space of all possible inputs. Our algorithm is one of the simplest of the known SACAs and it opens up a new dimension of suffix array construction that has not been explored until now. Our algorithm is easily parallelizable. We offer parallel implementations on various parallel models of computing. We prove a lemma on the $\ell$-mers of a random string which might find independent applications. We also present another algorithm that utilizes the above algorithm. This algorithm is called RadixSA and has a worst case run time of $O(n\log{n})$. RadixSA introduces an idea that may find independent applications as a speedup technique for other SACAs. An empirical comparison of RadixSA with other algorithms on various datasets reveals that our algorithm is one of the fastest algorithms to date. The C++ source code is freely available at http://www.engr.uconn.edu/~man09004/radixSA.zip
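As a reminder of what a suffix array contains, here is a minimal sketch that builds one by directly sorting suffixes. It is not RadixSA or any linear-time SACA; comparison-sorting whole suffixes can take far longer than linear time in the worst case. The function name is hypothetical.

```python
def naive_suffix_array(s):
    """Return the start positions of the suffixes of s in sorted order.

    Directly sorts the suffixes, so this is only suitable for short strings.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


sa = naive_suffix_array("banana")  # [5, 3, 1, 0, 4, 2]
```

For `banana`, the sorted suffixes are `a`, `ana`, `anana`, `banana`, `na`, `nana`, which start at positions 5, 3, 1, 0, 4 and 2.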

preprint, 2013, arXiv

Efficient Sequential and Parallel Algorithms for Planted Motif Search

Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. This paper presents an exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). This paper also introduces necessary and sufficient conditions for 3 l-mers to have a common d-neighbor.
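The (l,d)-motif search problem stated above can be sketched with an exhaustive search over all candidate strings of length l. This is not PMS8; it enumerates every l-mer over the alphabet and is only practical for very small l. The function names and the default DNA alphabet are assumptions for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, sequence, d):
    """True if some window of the sequence is within d mismatches of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def planted_motifs(sequences, l, d, alphabet="ACGT"):
    """Exhaustive (l,d)-motif search: every l-mer over the alphabet that
    appears in each input sequence with at most d mismatches."""
    motifs = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            motifs.append(candidate)
    return motifs


sequences = ["AACGT", "ACGTT", "GACGA"]
exact = planted_motifs(sequences, 3, 0)
```

The search space grows as $|\Sigma|^l$, which is why instances such as (25,10) need far more refined algorithms than this brute-force sketch.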

preprint, 2013, arXiv

On string matching with k mismatches

In this paper we consider several variants of the pattern matching problem. In particular, we investigate the following problems: 1) Pattern matching with k mismatches; 2) Approximate counting of mismatches; and 3) Pattern matching with mismatches. The distance metric used is the Hamming distance. We present some novel algorithms and techniques for solving these problems. Both deterministic and randomized algorithms are offered. Variants of these problems where there could be wild cards in either the text or the pattern or both are considered. An experimental evaluation of these algorithms is also presented. The source code is available at http://www.engr.uconn.edu/~man09004/kmis.zip.

preprint, 2011, arXiv

An Experimental Comparison of PMSPrune and Other Algorithms for Motif Search

Extracting meaningful patterns from voluminous amounts of biological data is a major challenge. Motifs are biological patterns of great interest to biologists. Many different versions of the motif finding problem have been identified by researchers. Examples include the Planted $(l, d)$ Motif version, those based on position-specific score matrices, etc. A comparative study of the various motif search algorithms is important for several reasons. For example, we could identify the strengths and weaknesses of each. As a result, we might be able to devise hybrids that perform better than the individual components. In this paper we (either directly or indirectly) compare the quality of motifs predicted by PMSprune (an algorithm based on the $(l, d)$ motif model) and 14 other algorithms in terms of seven measures, using well-established benchmark datasets including the one provided by Tompa et al. These comparisons show that the performance of PMSprune is competitive with that of the other 14 algorithms tested. It is observed that both PMSprune and DME (an algorithm based on position-specific score matrices) in general perform better than the 13 algorithms reported in Tompa et al. Subsequently we have compared PMSprune and DME on other benchmark datasets including ChIP-Chip, ChIP-seq, and ABS. Between PMSprune and DME, PMSprune performs better than DME on six measures. DME performs better than PMSprune on one measure (namely, specificity).

preprint, 2011, arXiv

Parallel Algorithms for DNA Probe Placement on Small Oligonucleotide Arrays

Oligonucleotide arrays are used in a wide range of genomic analyses, such as gene expression profiling, comparative genomic hybridization, chromatin immunoprecipitation, SNP detection, etc. During fabrication, the sites of an oligonucleotide array are selectively exposed to light in order to activate oligonucleotides for further synthesis. Optical effects can cause unwanted illumination at masked sites that are adjacent to the sites intentionally exposed to light. This results in synthesis of unforeseen sequences in masked sites and compromises interpretation of experimental data. To reduce such uncertainty, one can exploit freedom in how probes are assigned to array sites. The border length minimization problem (BLMP) seeks a placement of probes that minimizes the sum of border lengths in all masks. In this paper, we propose two parallel algorithms for the BLMP. The proposed parallel algorithms have the local-search paradigm at their core, and are especially developed for the BLMP. The results reported show that, for small microarrays with at most 1156 probes, the proposed parallel algorithms perform better than the best previous algorithms.

preprint, 2010, arXiv

A memory-efficient data structure representing exact-match overlap graphs with application for next generation DNA assembly

An exact-match overlap graph of $n$ given strings of length $\ell$ is an edge-weighted graph in which each vertex is associated with a string and there is an edge $(x,y)$ of weight $ω= \ell - |ov_{max}(x,y)|$ if and only if $ω\leq λ$, where $|ov_{max}(x,y)|$ is the length of $ov_{max}(x,y)$ and $λ$ is a given threshold. In this paper, we show that exact-match overlap graphs can be represented by a compact data structure that can be stored using at most $(2λ-1 )(2\lceil\log n\rceil + \lceil\logλ\rceil)n$ bits with a guarantee that the basic operation of accessing an edge takes $O(\log λ)$ time. Exact-match overlap graphs have been broadly used in the context of DNA assembly and the shortest super string problem, where the number of strings $n$ ranges from a couple of thousand to a couple of billion and the length $\ell$ of the strings ranges from 25 to 1000, depending on the DNA sequencing technology. However, many DNA assemblers using overlap graphs face the major problem of constructing and storing them. In particular, it is impossible for these DNA assemblers to handle the huge amount of data produced by next generation sequencing technologies, where the number of strings $n$ is usually very large, ranging from hundreds of millions to a couple of billion. In fact, to the best of our knowledge there are no DNA assemblers that can handle such a large number of strings. Fortunately, with our compact data structure, the major problem of constructing and storing overlap graphs is practically solved, since it requires only linear time and linear memory. As a result, it opens the door to building a DNA assembler that can handle large-scale datasets efficiently.

preprint, 2010, arXiv

An Efficient Algorithm For Chinese Postman Walk on Bi-directed de Bruijn Graphs

Sequence assembly from short reads is an important problem in biology. It is known that solving the sequence assembly problem exactly on a bi-directed de Bruijn graph or a string graph is intractable. However, finding a Shortest Double stranded DNA string (SDDNA) containing all the k-long words in the reads seems to be a good heuristic to get close to the original genome. This problem is equivalent to finding a cyclic Chinese Postman (CP) walk on the underlying un-weighted bi-directed de Bruijn graph built from the reads. The Chinese Postman walk Problem (CPP) is solved by reducing it to a general bi-directed flow on this graph, which runs in $O(|E|^2 \log^2(|V|))$ time. In this paper we show that the cyclic CPP on bi-directed graphs can be solved without reducing it to bi-directed flow. We present a $\Theta(p(|V| + |E|)\log(|V|) + (d_{max}p)^3)$ time algorithm to solve the cyclic CPP on a weighted bi-directed de Bruijn graph, where $p = \max\{|\{v \mid d_{in}(v) - d_{out}(v) > 0\}|, |\{v \mid d_{in}(v) - d_{out}(v) < 0\}|\}$ and $d_{max} = \max\{|d_{in}(v) - d_{out}(v)|\}$. Our algorithm performs asymptotically better than the bi-directed flow algorithm when the number of imbalanced nodes $p$ is much smaller than the number of nodes in the bi-directed graph. From our experimental results on various datasets, we have noticed that the value of $p/|V|$ lies between 0.08% and 0.13% with 95% probability.
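The two parameters in the running time above, the number of imbalanced nodes p and the maximum imbalance dmax, can be computed from the in-degree and out-degree of each node. The sketch below assumes an ordinary directed multigraph given as a list of (source, target) edges, which simplifies away the bi-directed edge orientations used in the paper. The function name is hypothetical.

```python
from collections import Counter


def imbalance_parameters(edges):
    """Return (p, d_max) for a directed multigraph given as (u, v) edges.

    p     = max(#nodes with in-degree > out-degree,
                #nodes with in-degree < out-degree)
    d_max = largest |in-degree - out-degree| over all nodes
    """
    d_in, d_out = Counter(), Counter()
    for u, v in edges:
        d_out[u] += 1
        d_in[v] += 1
    nodes = set(d_in) | set(d_out)
    diffs = [d_in[v] - d_out[v] for v in nodes]  # Counter gives 0 if absent
    positive = sum(1 for x in diffs if x > 0)
    negative = sum(1 for x in diffs if x < 0)
    p = max(positive, negative)
    d_max = max((abs(x) for x in diffs), default=0)
    return p, d_max


p, d_max = imbalance_parameters([("a", "b"), ("a", "c"), ("b", "c")])
```

In this small example node `a` has two more outgoing than incoming edges and node `c` the reverse, so p is 1 and dmax is 2. A graph in which every node is balanced gives p = 0.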

preprint, 2010, arXiv

Efficient Parallel and Out of Core Algorithms for Constructing Large Bi-directed de Bruijn Graphs

Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories, based on the data structures they employ. The first class uses an overlap/string graph and the second uses a de Bruijn graph. With the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. Jackson et al. (ICPP 2008) gave an $O(n/p)$ time parallel algorithm for this problem. Here $n$ is the size of the input and $p$ is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating $Θ(nΣ)$ messages. In this paper we present a $Θ(n/p)$ time parallel algorithm whose communication complexity is equal to that of parallel sorting and is not sensitive to $Σ$. The generality of our algorithm makes it easy to extend to the out-of-core model, where it has an optimal I/O complexity of $Θ(\frac{n\log(n/B)}{B\log(M/B)})$. We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with that of Jackson et al. (ICPP 2008) reveals that our algorithm is faster. We also provide efficient algorithms for the bi-directed chain compaction problem.

preprint, 2010, arXiv

On the Border Length Minimization Problem (BLMP) on a Square Array

Protein/Peptide microarrays are rapidly gaining momentum in the diagnosis of cancer. High-density and high-throughput peptide arrays are being extensively used to detect tumor biomarkers, examine kinase activity, identify antibodies having low serum titers and locate antibody signatures. Improving the yield of microarray fabrication involves solving a hard combinatorial optimization problem called the Border Length Minimization Problem (BLMP). An important question that remained open for the past seven years is whether the BLMP is tractable or not. We settle this open problem by proving that the BLMP is NP-hard. We also present a hierarchical refinement algorithm which can refine any heuristic solution for the BLMP. We also prove that the TSP+1-threading heuristic is an O(N)-approximation. The hierarchical refinement solver is available as open-source code at http://launchpad.net/blm-solve.
