Source author record

R. G. Ragel

R. G. Ragel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computational Engineering, Finance, and Science; Distributed, Parallel, and Cluster Computing; Cryptography and Security; Hardware Architecture; Computation and Language; Human-Computer Interaction; Information Retrieval; Artificial Intelligence; Computer Vision; cs.CY; Digital Libraries; Performance; Programming Languages; Software Engineering

Catalog footprint

What is connected

22 works

14 topics

4 close collaborators

Preprint, 2014, arXiv

A Feasibility Study on Programmer Specific Instruction Set Processors (PSISPs)

ASIPs are designed to execute the instructions of a particular application domain. ASIP design addresses the major challenges faced by a system on chip, such as size, cost, performance and energy consumption. The higher the number of similar instructions within the domain to be mapped, the lower the energy consumption, the smaller the size and the higher the performance of the ASIP. Thus, designing processors for domains with more similar programs would overcome these issues. This paper investigates whether the domains of programmer specific programs have any significance comparable to application specific program domains and, thus, whether the approach of designing Programmer Specific Instruction Set Processors is worthwhile. We performed the evaluation at the instruction level, using four different measures of program similarity: (1) the existence of each instruction, (2) the frequency of each instruction, (3) patterns of two consecutive instructions and (4) patterns of three consecutive instructions, for application specific and programmer specific programs. We found that although programmer specific instructions show some impact on the similarity measures, the impact is much smaller and therefore insignificant compared to the impact from application specific programs.
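The four similarity measures named in this abstract can be sketched as follows. This is a minimal illustration and not the paper's method: the abstract does not say how each measure is scored, so Jaccard overlap (for existence and consecutive-instruction patterns) and cosine similarity (for frequencies) are assumptions, as are the function names and the toy opcode sequences.

```python
from collections import Counter
from math import sqrt


def ngrams(opcodes, n):
    """Return every run of n consecutive instructions in a program."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def jaccard(items_a, items_b):
    """Set overlap in [0, 1]; an assumed scoring choice, not taken from the paper."""
    a, b = set(items_a), set(items_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(items_a, items_b):
    """Cosine similarity of occurrence counts; also an assumed scoring choice."""
    ca, cb = Counter(items_a), Counter(items_b)
    dot = sum(ca[k] * cb[k] for k in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def similarity_measures(prog_a, prog_b):
    """Compute the four instruction-level measures for two opcode sequences."""
    return {
        "existence": jaccard(prog_a, prog_b),                      # measure (1)
        "frequency": cosine(prog_a, prog_b),                       # measure (2)
        "bigram": jaccard(ngrams(prog_a, 2), ngrams(prog_b, 2)),   # measure (3)
        "trigram": jaccard(ngrams(prog_a, 3), ngrams(prog_b, 3)),  # measure (4)
    }


# Hypothetical opcode traces, for illustration only.
prog_a = ["mov", "add", "mov", "cmp", "jne"]
prog_b = ["mov", "add", "sub", "cmp", "jne"]
scores = similarity_measures(prog_a, prog_b)
```

In this toy pair the two programs share most opcodes but no three-instruction pattern, which shows why the longer patterns are the stricter measures.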

Preprint, 2014, arXiv

A Fuzzy Based Model to Identify Printed Sinhala Characters (ICIAfS14)

Character recognition techniques for printed documents are widely used for the English language. However, systems implemented to recognize Asian languages struggle to increase recognition accuracy. Among other Asian languages (such as Arabic, Tamil, Chinese), Sinhala characters are unique, mainly because they are round in shape. This unique feature makes it a challenge to extend the prevailing techniques to improve recognition of Sinhala characters. Therefore, little attention has been given to improving the accuracy of Sinhala character recognition. A novel method that makes use of this unique feature could be advantageous over other methods. This paper describes the use of a fuzzy inference system to recognize Sinhala characters. Feature extraction is mainly focused on distance and intersection measurements in different directions from the center of the letter, making use of the round shape of the characters. The results showed an overall accuracy of 90.7% for 140 instances of letters tested, much better than similar systems.

Preprint, 2014, arXiv

A Structured Hardware Software Architecture for Peptide Based Diagnosis - Sub-string Matching Problem with Limited Tolerance (ICIAfS14)

The problem of inferring proteins from complex peptide samples in shotgun proteomic workflow sets extreme demands on computational resources. This is exacerbated by the fact that, in general, a given protein cannot be defined by a fixed sequence of amino acids due to the existence of splice variants and isoforms of that protein. Therefore, the problem of protein inference could be considered as one of identifying sequences of amino acids with some limited tolerance. Two problems arise from this: a) due to these variations, the applicability of exact string matching methodologies could be questioned and b) the difficulty of defining a reference sequence for a particular set of proteins that are functionally indistinguishable, but with some variation in features. This paper presents a model-based inference approach that is developed and validated to solve the inference problem. Our approach starts from an examination of the known set of splice variants and isoforms of a target protein to identify the Greatest Common Stable Substring (GCSS) of amino acids and the Substrings Subjects to Limited Variation (SSLV) and their respective locations on the GCSS. Then we define and solve the Sub-string Matching Problem with Limited Tolerance (SMPLT). This approach is validated on identified peptides in a labelled and clustered data set from UNIPROT. Identification of Baylisascaris Procyonis infection was used as an application instance that achieved up to 70 times speedup compared to a software only system. This workflow can be generalised to any inexact multiple pattern matching application by replacing the patterns in a clustered and distributed environment which permits a distance between member strings to account for permitted deviations such as substitutions, insertions and deletions.

Preprint, 2014, arXiv

A Structured Hardware Software Architecture for Peptide Based Diagnosis of Baylisascaris Procyonis Infection (ICIAfS14)

The problem of inferring proteins from complex peptide cocktails (digestion products of biological samples) in shotgun proteomic workflow sets extreme demands on computational resources in respect of the required very high processing throughputs, rapid processing rates and reliability of results. This is exacerbated by the fact that, in general, a given protein cannot be defined by a fixed sequence of amino acids due to the existence of splice variants and isoforms of that protein. Therefore, the problem of protein inference could be considered as one of identifying sequences of amino acids with some limited tolerance. In the current paper a model-based hardware acceleration of a structured and practical inference approach is developed and validated on a mass spectrometry experiment of realistic size. We have achieved 10 times maximum speed-up in the co-designed workflow compared to a similar software-only workflow run on the processor used for co-design.

Preprint, 2014, arXiv

Accelerating motif finding in DNA sequences with multicore CPUs

Motif discovery in DNA sequences is a challenging task in molecular biology. In computational motif discovery, Planted (l, d) motif finding is a widely studied problem and numerous algorithms are available to solve it. Both hardware and software accelerators have been introduced to accelerate motif finding algorithms. However, hardware accelerators such as FPGAs need hardware specialists to design such systems. Software based acceleration methods, on the other hand, are easier to implement than hardware acceleration techniques. Grid computing is one such software based acceleration technique that has been used to accelerate motif finding. However, drawbacks such as network communication delays and the need for fast interconnection between nodes in the grid can limit its usage and scalability. As the use of multicore CPUs to accelerate CPU intensive tasks is becoming increasingly popular and common, we can employ them to accelerate motif finding, and this can be faster than grid based acceleration. In this paper, we explore the use of multicore CPUs to accelerate motif finding. We accelerated the Skip-Brute Force algorithm on multicore CPUs by parallelizing it using the POSIX thread library. Our method yielded an average speed up of 34x on a 32-core processor compared to a speed up of 21x on a grid based implementation of 32 nodes.
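To make the Planted (l, d) problem concrete, here is a minimal sketch of a plain brute-force search: it tries every possible l-mer and keeps those that occur in every sequence with at most d mismatches. It is not the Skip-Brute Force algorithm the paper accelerates, and it is single-threaded; the function names and example sequences are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, sequence, d):
    """True if some window of the sequence is within d mismatches of the candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def planted_motifs(sequences, l, d, alphabet="ACGT"):
    """Return all l-mers that appear in every sequence with at most d mismatches.

    Plain exhaustive search over len(alphabet) ** l candidates, so it is only
    practical for small l; shown here to illustrate the problem statement.
    """
    found = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            found.append(candidate)
    return found


# Hypothetical sequences with ACGT planted exactly in each one.
exact = planted_motifs(["AAACGTAA", "CCACGTCC"], l=4, d=0)
```

Because each candidate is checked independently of the others, the outer loop is the natural place to split work across threads or cores.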

Preprint, 2014, arXiv

Accelerating string matching for bio-computing applications on multi-core CPUs

Huge amounts of data in the form of strings are handled in bio-computing applications, and searching algorithms are used in them quite frequently. Many methods utilizing both software and hardware have been proposed to accelerate the processing of such data. Typical hardware-based acceleration techniques either require special hardware such as general purpose graphics processing units (GPGPUs) or require building new hardware such as an FPGA based design. On the other hand, software-based acceleration techniques are easier since they only require some changes in the software code or the software architecture. Typical software-based techniques make use of computers connected over a network, also known as a network grid, to accelerate the processing. In this paper, we test the hypothesis that multi-core architectures should provide better performance in this kind of computation, although this would still depend on the algorithm selected as well as the programming model being utilized. We present the acceleration of a string-searching algorithm on a multi-core CPU via a POSIX thread based implementation. Our implementation on an 8-core processor (that supports 16 threads) resulted in a 9x throughput improvement compared to a single thread implementation.

Preprint, 2014, arXiv

AntiPlag: Plagiarism Detection on Electronic Submissions of Text Based Assignments

Plagiarism is one of the growing issues in academia and is always a concern in Universities and other academic institutions. The situation is becoming even worse with the availability of ample resources on the web. This paper focuses on creating an effective and fast tool for plagiarism detection for text based electronic assignments. Our plagiarism detection tool named AntiPlag is developed using the tri-gram sequence matching technique. Three sets of text based assignments were tested by AntiPlag and the results were compared against an existing commercial plagiarism detection tool. AntiPlag showed better results in terms of false positives compared to the commercial tool due to the pre-processing steps performed in AntiPlag. In addition, to improve the detection latency, AntiPlag applies a data clustering technique making it four times faster than the commercial tool considered. AntiPlag could be used to isolate plagiarized text based assignments from non-plagiarised assignments easily. Therefore, we present AntiPlag, a fast and effective tool for plagiarism detection on text based electronic assignments.
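A rough sketch of tri-gram matching between two submissions follows. The abstract does not describe AntiPlag's pre-processing or scoring, so the lowercasing, punctuation stripping, word-level tri-grams and the overlap ratio used here are all assumptions, and the function names are hypothetical.

```python
import re


def word_trigrams(text):
    """Lowercase the text, drop punctuation and return its set of word tri-grams."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def trigram_overlap(text_a, text_b):
    """Share of the shorter text's tri-grams that also occur in the other text.

    Returns a value in [0, 1]; texts with fewer than three words score 0.0.
    """
    grams_a, grams_b = word_trigrams(text_a), word_trigrams(text_b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / min(len(grams_a), len(grams_b))


# Hypothetical sentences, for illustration only.
score = trigram_overlap(
    "The quick brown fox jumps over the lazy dog.",
    "A quick brown fox jumps over a sleepy cat.",
)
```

A pair of assignments would be flagged when the score exceeds a chosen threshold; the threshold itself would need tuning on real submissions.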

Preprint, 2014, arXiv

Authorship detection of SMS messages using unigrams

SMS messaging is a popular medium of communication. Because of its popularity and privacy, it could be used for many illegal purposes. Additionally, since they are part of day to day life, SMSes can be used as evidence in many legal disputes. Since a cellular phone might be accessible to people close to the owner, it is important to establish that the sender of the message is indeed the owner of the phone. For this purpose, the straightforward solution seems to be the use of popular stylometric methods. However, in comparison with the data used for stylometry in the literature, SMSes have unusual characteristics that make it hard or impossible to apply these methods in a conventional way. Our target is to come up with a method of authorship detection of SMS messages that could still give a usable accuracy. We argue that, among the methods of author attribution, the best method that could be applied to SMS messages is an n-gram method. To prove our point, we checked two different methods of distribution comparison with varying amounts of training and testing data. We specifically compare how well our algorithms work with a small amount of testing data and a large number of candidate authors (which we believe to be the real world scenario) against controlled tests with fewer authors and selected SMSes with a large number of words. To counter the lack of information in an SMS message, we propose stacking together a few SMSes.

Preprint, 2014, arXiv

Axis2UNO: Web Services Enabled Openoffice.org

Openoffice.org is a popular, free and open source office product. This product is used by millions of people and developed, maintained and extended by thousands of developers worldwide. Playing a dominant role in the web, web services technology is serving millions of people every day. Axis2 is one of the most popular, free and open source web service engines. The framework presented in this paper, Axis2UNO, combines these two technologies and is capable of opening a new era in the office environment. Two other attempts to enhance web services functionality in office products are Excel Web Services and UNO Web Service Proxy. Excel Web Services is combined with Microsoft SharePoint technology and exposes information sharing from a different perspective within the proprietary Microsoft Office products. UNO Web Service Proxy is implemented with the Java Web Services Developer Pack and enables basic web services related functionality in Openoffice.org. However, the work presented here is the first to combine Openoffice.org and Axis2, and we expect it to outperform the other efforts with the community involvement and feature richness in those products.

Preprint, 2014, arXiv

Constant time encryption as a countermeasure against remote cache timing attacks

Rijndael was standardized in 2001 by the National Institute of Standards and Technology as the Advanced Encryption Standard (AES). AES is still being used to encrypt financial, military and even government confidential data. In 2005, Bernstein illustrated a remote cache timing attack on AES using the client-server architecture and thereby proved the existence of a side channel in its software implementation. Over the years, a number of countermeasures have been proposed against cache timing attacks using both hardware and software. Although the software based countermeasures are flexible and easy to deploy, most such countermeasures are vulnerable to statistical analysis. In this paper, we propose a novel software based countermeasure against cache timing attacks, known as constant time encryption, which we believe is secure against statistical analysis. The countermeasure we propose reschedules instructions such that the encryption rounds consume constant time independent of cache hits and misses. Through experiments, we prove that our countermeasure is secure against Bernstein's cache timing attack.

Preprint, 2014, arXiv

Hardware accelerated protein inference framework

Protein inference plays a vital role in proteomics studies. Two major approaches can be used to handle the problem of protein inference: top-down and bottom-up. This paper presents a protein inference framework that uses hardware acceleration to handle the most important step in a bottom-up approach, viz. peptide identification during the assembling process. In our framework, identified peptides and their probabilities are used to predict the most suitable reference protein cluster for a given input amino acid sequence with the probability of identified peptides. The framework is developed on an FPGA where hardware software co-design techniques are used to accelerate the computationally intensive parts of the protein inference process. In the paper, we measure, compare and report the time taken for the protein inference process in our framework against a pure software implementation.

Preprint, 2014, arXiv

Hardware software co-design of the Aho-Corasick algorithm: Scalable for protein identification?

Pattern matching is commonly required in many application areas, and bioinformatics is a major area of interest that requires both exact and approximate pattern matching. Much work has been done in this area, yet there is still significant room for improvement in efficiency, flexibility, and throughput. This paper presents a hardware software co-design of the Aho-Corasick algorithm on the Nios II soft-processor and a study of its scalability for a pattern matching application. A software only approach is used as the baseline for comparing the throughput and the scalability of the hardware software co-design approach. According to the results we obtained, we conclude that the hardware software co-design implementation shows a maximum speed up of 10 times for a pattern size of 1200 peptides compared to the software only implementation. The results also show that the hardware software co-design approach scales well for increasing data size compared to the software only approach.

Preprint, 2014, arXiv

Heterogeneous processor pipeline for a product cipher application

Processing data received as a stream is a task commonly performed by modern embedded devices, in a wide range of applications such as multimedia (encoding/decoding/playing media), networking (switching and routing), digital security, scientific data processing, etc. Such processing normally tends to be calculation intensive and therefore requires significant processing power. Therefore, hardware acceleration methods to increase the performance of such applications constitute an important area of study. In this paper, we present an evaluation of one such method to process streaming data, namely a multi-processor pipeline architecture. The hardware is based on a Multiple-Processor System on Chip (MPSoC), using a data encryption algorithm as a case study. The algorithm is partitioned at a coarse grained level and mapped on to an MPSoC with five processor cores in a pipeline, using specifically configured Xtensa LX3 cores. The system is then selectively optimized by strengthening and pruning the resources of each processor core. The optimized system is evaluated and compared against an optimal single-processor System on Chip (SoC) for the same application. The multiple-processor pipeline system for the data encryption algorithm used was observed to provide significant speed ups, up to 4.45 times that of the single-processor system, which is close to the ideal speed up from a five-stage pipeline.

Preprint, 2014, arXiv

Improving the throughput of the AES algorithm with multicore processors

AES, the Advanced Encryption Standard, can be considered the most widely used modern symmetric key encryption standard. To encrypt/decrypt a file using the AES algorithm, the file must undergo a set of complex computational steps. Therefore, a software implementation of the AES algorithm would be slow and consume a large amount of time to complete. The immense increase of both stored and transferred data in recent years has made this problem even more daunting when the need to encrypt/decrypt such data arises. As a solution to this problem, in this paper, we present an extensive study of enhancing the throughput of the AES encryption algorithm by utilizing state of the art multicore architectures. We take a sequential program that implements the AES algorithm and convert it to run on multicore architectures with minimum effort. We implement two different parallel programs, one with the fork system call in Linux and the other with pthreads, the POSIX standard for threads. We then ran both versions of the parallel program on different multicore architectures and compared and analysed the throughputs between the implementations and among different architectures. The pthreads implementation performed better in all the experiments we conducted, and the best throughput obtained is around 7 Gbps on a 32-core processor (the largest number of cores we had) with the pthreads implementation.

Preprint, 2014, arXiv

Instruction-set Selection for Multi-application based ASIP Design: An Instruction-level Study

Efficiency in embedded systems is paramount to achieve high performance while consuming less area and power. Processors in embedded systems have to be designed carefully to meet such design constraints. Application Specific Instruction set Processors (ASIPs) exploit the nature of applications to design an optimal instruction set. Despite not being general enough to execute any application, ASIPs are highly preferred in the embedded systems industry, where devices are produced to serve a certain application domain or domains (either intra-domain or inter-domain). Typically, ASIPs are designed from a base processor and functionalities are added for applications. This paper studies multi-application ASIPs and their instruction sets, extensively analysing the instructions for inter-domain and intra-domain designs. The metrics analysed are the reusable instructions and the extra cost of adding a certain application. A wide range of applications from various application benchmarks (MiBench, MediaBench and SPEC2006) and domains are analysed for two different architectures (ARM-Thumb and PISA). Our study shows that intra-domain applications contain a larger number of common instructions, whereas inter-domain applications have very few common instructions, regardless of the architecture (and therefore the ISA).

Preprint, 2014, arXiv

LineCAPTCHA Mobile: A User Friendly Replacement for Unfriendly Reverse Turing Tests for Mobile Devices (ICIAfS14)

As smart phones and tablets are becoming ubiquitous and taking over as the primary choice for accessing the Internet worldwide, ensuring a secure gateway to the servers serving such devices becomes essential. CAPTCHAs play an important role in identifying human users on the internet to prevent unauthorized bot attacks. Even though there are numerous CAPTCHA alternatives available today, each alternative has certain drawbacks, making it hard to find a general solution to the need for a CAPTCHA mechanism. With advancing technology and expertise in areas such as AI, cryptography and image processing, the chase between making and breaking CAPTCHAs is now even. This has left humans with a hard time deciphering the CAPTCHA mechanisms. In this paper, we adapt a novel CAPTCHA mechanism named LineCAPTCHA to mobile devices. LineCAPTCHA is a new reverse Turing test based on drawing on top of Bezier curves within noisy backgrounds. The major objective of this paper is to report the implementation and evaluation of LineCAPTCHA on a mobile platform. At the same time we impose certain security standards and security aspects for establishing LineCAPTCHAs which are obtained through extensive measures. Its independence from factors such as fluency in the English language and age, together with its easily understandable nature, favours the usability of LineCAPTCHA. We believe that such independence will support the main target of LineCAPTCHA: user friendliness and usability.

Preprint, 2014, arXiv

Register Spilling for Specific Application Domains in Application Specific Instruction-set Processors

An Application Specific Instruction set Processor (ASIP) is an important component in designing embedded systems. One of the problems in designing an instruction set for such processors is determining the number of registers needed in the processor to optimize the computational time and the cost. The performance of a processor may fall short due to register spilling, which is caused by the lack of available registers in a processor. From the design perspective, avoiding register spilling by choosing an appropriate number of registers for an ASIP will result in processors with great performance and low power consumption. However, as of now, it has not been clearly established how the number of registers needed changes with different application domains. In this paper, we evaluated whether different application domains have any significant effect on register spilling, and therefore on the performance of a processor, so that we could use a different number of registers when building ASIPs for different application domains rather than using a constant set of registers. Such utilization of registers will result in processors with high performance, low cost and low power consumption.

Preprint, 2014, arXiv

Software implementation level countermeasures against the cache timing attack on advanced encryption standard

Advanced Encryption Standard (AES) is a symmetric key encryption algorithm which is extensively used in secure electronic data transmission. Although it was tested and declared secure when introduced, in 2005 a researcher named Bernstein claimed that it is vulnerable to side channel attacks. The cache-based timing attack is the type of side channel attack demonstrated by Bernstein, which uses the timing variation in cache hits and misses. This kind of attack can be prevented by masking the actual timing information from the attacker. Such masking can be performed by altering the original AES software implementation while preserving its semantics. This paper presents possible software implementation level countermeasures against Bernstein's cache timing attack. Two simple software based countermeasures based on the concept of "constant-encryption-time" were demonstrated against the remote cache timing attack with positive outcomes, in which we establish a secured environment for the AES encryption.

Preprint, 2014, arXiv

String Matching with Multicore CPUs: Performing Better with the Aho-Corasick Algorithm

Multiple string matching is the task of locating all the occurrences of a given number of patterns in an arbitrary string. It is used in bio-computing applications where the algorithms are commonly used for retrieval of information such as sequence analysis and gene/protein identification. Extremely large amounts of data in the form of strings have to be processed in such bio-computing applications. Therefore, improving the performance of multiple string matching algorithms is always desirable. Multicore architectures are capable of providing better performance by parallelizing multiple string matching algorithms. The Aho-Corasick algorithm is commonly used for exact multiple string matching. The focus of this paper is the acceleration of the Aho-Corasick algorithm through a multicore CPU based software implementation. Through our implementation and evaluation of results, we prove that our method performs better compared to the state of the art.
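For readers unfamiliar with it, the Aho-Corasick algorithm can be sketched in a few lines. This is a textbook single-threaded version with hypothetical function names, assuming non-empty patterns; it does not reproduce the paper's multicore implementation.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie (goto), failure links and output lists for the patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            queue.append(child)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = build_automaton(patterns)
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


hits = find_all("ushers", ["he", "she", "his", "hers"])
```

One common way to use several cores with such a matcher is to split the text into chunks that overlap by the longest pattern length minus one and scan each chunk independently; whether the paper does this is not stated in the abstract.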

Preprint, 2014, arXiv

Students Behavioural Analysis in an Online Learning Environment Using Data Mining (ICIAfS)

The focus of this research was to use Educational Data Mining (EDM) techniques to conduct a quantitative analysis of students' interaction with an e-learning system through instructor-led non-graded and graded courses. This exercise is useful for establishing a guideline for a series of online short courses for them. The access behaviour of a group of 412 students in an e-learning system was analysed, and the students were grouped into clusters using the K-Means clustering method according to their course access log records. The results showed that more than 40% of the student group are passive online learners in both graded and non-graded learning environments. The results also showed that the difference in the learning environments could change the online access behaviour of a student group. Clustering divided the student population into five access groups based on their course access behaviour. Among these groups, the least access group (NG-41% and G-42%) and the highest access group (NG-9% and G-5%) could be identified very clearly due to their access variation from the rest of the groups.
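The clustering step can be illustrated with a small one-dimensional K-Means sketch over per-student access counts. The abstract does not state which features or initialisation the study used, so the single access-count feature, the deterministic quantile-style initialisation, the function name and the data are all assumptions made for illustration.

```python
def kmeans_1d(values, k, max_iterations=100):
    """Cluster numbers into k groups with Lloyd's algorithm.

    Initial centroids are picked at evenly spaced positions in the sorted data
    (an assumption chosen to keep the sketch deterministic).
    """
    data = sorted(values)
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical course access counts for eight students.
access_counts = [1, 2, 3, 50, 52, 54, 200, 210]
centroids, clusters = kmeans_1d(access_counts, k=3)
```

With these made-up counts the three groups separate into low, medium and high access; the study itself reports five access groups.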

Preprint, 2014, arXiv

Tile optimization for area in FPGA based hardware acceleration of peptide identification

Advances in life sciences over the last few decades have led to the generation of a huge amount of biological data. Computing research has become a vital part in driving biological discovery where analysis and categorization of biological data are involved. String matching algorithms can be applied for protein/gene sequence matching, and with the phenomenal increase in the size of string databases to be analyzed, software implementations of these algorithms seem to have hit a hard limit and hardware acceleration is increasingly being sought. Several hardware platforms such as Field Programmable Gate Arrays (FPGA), Graphics Processing Units (GPU) and Chip Multi Processors (CMP) are being explored. In this paper, we give a comprehensive overview of the literature on hardware acceleration of string matching algorithms. We then take an FPGA hardware exploration and expedite the design time with a design automation technique. Further, our design automation is also optimized for better hardware utilization through optimizing the number of peptides that can be represented in an FPGA tile. The results indicate significant improvements in design time and hardware utilization, which are reported in this paper.

Preprint, 2014, arXiv

User Friendly Line CAPTCHAs

CAPTCHAs or reverse Turing tests are real-time assessments used by programs (or computers) to tell humans and machines apart. This is achieved by assigning and assessing hard AI problems that can be solved easily by humans but not by machines. Applications of such assessments range from stopping spammers from automatically filling online forms to preventing hackers from performing dictionary attacks. Today, the race between makers and breakers of CAPTCHAs is at a juncture where the CAPTCHAs proposed are not even answerable by humans. We consider such CAPTCHAs non user friendly. In this paper, we propose a novel technique for reverse Turing tests, which we call Line CAPTCHAs, that mainly focuses on user friendliness while not compromising the security expected of such a system.
