Researcher profile

S. M. Kamruzzaman

S. M. Kamruzzaman contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
30works
0followers
12topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

30 published item(s)

preprint2010arXiv

A Constructive Algorithm for Feedforward Neural Networks for Medical Diagnostic Reasoning

This research is to search for alternatives to the resolution of complex medical diagnosis where human knowledge should be apprehended in a general fashion. Successful application examples show that human diagnostic capabilities are significantly worse than the neural diagnostic system. Our research describes a constructive neural network algorithm with backpropagation; offer an approach for the incremental construction of nearminimal neural network architectures for pattern classification. The algorithm starts with minimal number of hidden units in the single hidden layer; additional units are added to the hidden layer one at a time to improve the accuracy of the network and to get an optimal size of a neural network. Our algorithm was tested on several benchmarking classification problems including Cancer1, Heart, and Diabetes with good generalization ability.

preprint2010arXiv

A hybrid learning algorithm for text classification

Text classification is the process of classifying documents into predefined categories based on their content. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification that requires fewer documents for training. Instead of using words, word relation i.e association rules from these words is used to derive feature set from preclassified text documents. The concept of Naive Bayes classifier is then used on derived features and finally only a single concept of Genetic Algorithm has been added for final classification. Experimental results show that the classifier build this way is more accurate than the existing text classification systems.

preprint2010arXiv

A System for Smart Home Control of Appliances based on Timer and Speech Interaction

The main objective of this work is to design and construct a microcomputer based system: to control electric appliances such as light, fan, heater, washing machine, motor, TV, etc. The paper discusses two major approaches to control home appliances. The first involves controlling home appliances using timer option. The second approach is to control home appliances using voice command. Moreover, it is also possible to control appliances using Graphical User Interface. The parallel port is used to transfer data from computer to the particular device to be controlled. An interface box is designed to connect the high power loads to the parallel port. This system will play an important role for the elderly and physically disable people to control their home appliances in intuitive and flexible way. We have developed a system, which is able to control eight electric appliances properly in these three modes.

preprint2010arXiv

A Unique 10 Segment Display for Bengali Numerals

Segmented display is widely used for efficient display of alphanumeric characters. English numerals are displayed by 7 segment and 16 segment display. The segment size is uniform in this two display architecture. Display architecture using 8, 10, 11, 18 segments have been proposed for Bengali numerals 0...9 yet no display architecture is designed using segments of uniform size and uniform power consumption. In this paper we have proposed a uniform 10 segment architecture for Bengali numerals. This segment architecture uses segments of uniform size and no bent segment is used.

preprint2010arXiv

An Algorithm to Extract Rules from Artificial Neural Networks for Medical Diagnosis Problems

Artificial neural networks (ANNs) have been successfully applied to solve a variety of classification and function approximation problems. Although ANNs can generally predict better than decision trees for pattern classification problems, ANNs are often regarded as black boxes since their predictions cannot be explained clearly like those of decision trees. This paper presents a new algorithm, called rule extraction from ANNs (REANN), to extract rules from trained ANNs for medical diagnosis problems. A standard three-layer feedforward ANN with four-phase training is the basis of the proposed algorithm. In the first phase, the number of hidden nodes in ANNs is determined automatically by a constructive algorithm. In the second phase, irrelevant connections and input nodes are removed from trained ANNs without sacrificing the predictive accuracy of ANNs. The continuous activation values of the hidden nodes are discretized by using an efficient heuristic clustering algorithm in the third phase. Finally, rules are extracted from compact ANNs by examining the discretized activation values of the hidden nodes. Extensive experimental studies on three benchmark classification problems, i.e. breast cancer, diabetes and lenses, demonstrate that REANN can generate high quality rules from ANNs, which are comparable with other methods in terms of number of rules, average number of conditions for a rule, and predictive accuracy.

preprint2010arXiv

An Efficient Technique for Text Compression

For storing a word or the whole text segment, we need a huge storage space. Typically a character requires 1 Byte for storing it in memory. Compression of the memory is very important for data management. In case of memory requirement compression for text data, lossless memory compression is needed. We are suggesting a lossless memory requirement compression method for text data compression. The proposed compression method will compress the text segment or the text file based on two level approaches firstly reduction and secondly compression. Reduction will be done using a word lookup table not using traditional indexing system, then compression will be done using currently available compression methods. The word lookup table will be a part of the operating system and the reduction will be done by the operating system. According to this method each word will be replaced by an address value. This method can quite effectively reduce the size of persistent memory required for text data. At the end of the first level compression with the use of word lookup table, a binary file containing the addresses will be generated. Since the proposed method does not use any compression algorithm in the first level so this file can be compressed using the popular compression algorithms and finally will provide a great deal of data compression on purely English text data.

preprint2010arXiv

An Energy Efficient Multichannel MAC Protocol for Cognitive Radio Ad Hoc Networks

This paper presents a TDMA based energy efficient cognitive radio multichannel medium access control (MAC) protocol called ECR-MAC for wireless Ad Hoc Networks. ECR-MAC requires only a single half-duplex radio transceiver on each node that integrates the spectrum sensing at physical (PHY) layer and the packet scheduling at MAC layer. In addition to explicit frequency negotiation which is adopted by conventional multichannel MAC protocols, ECR-MAC introduces lightweight explicit time negotiation. This two-dimensional negotiation enables ECR-MAC to exploit the advantage of both multiple channels and TDMA, and achieve aggressive power savings by allowing nodes that are not involved in communication to go into doze mode. The IEEE 802.11 standard allows for the use of multiple channels available at the PHY layer, but its MAC protocol is designed only for a single channel. A single channel MAC protocol does not work well in a multichannel environment, because of the multichannel hidden terminal problem. The proposed energy efficient ECR-MAC protocol allows SUs to identify and use the unused frequency spectrum in a way that constrains the level of interference to the primary users (PUs). Extensive simulation results show that our proposed ECR-MAC protocol successfully exploits multiple channels and significantly improves network performance by using the licensed spectrum band opportunistically and protects QoS provisioning over cognitive radio ad hoc networks.

preprint2010arXiv

Completely Enhanced Cell Phone Keypad

The enhanced frequency based keypad is designed to speed up the typing process. This paper will show that the proposed layout will increase the typing speed and be flexible for thumb. Traditional cell phone keypad is not a scientific keypad from the frequency point of view. Approaches have been explored to speed up the typing process. We found that no manufacturer has considered the frequency of the alphabet. The current architecture does not provide flexibility although the users are accustomed to the currently available multi-tapping keypad. Since the currently available keypad layouts are not best suited for users, this paper will suggest a keypad for cell phone and other cellular device based on the frequency of the alphabet in English language and also with the view of structure of human finger movements to provide a flexible and fast cell phone keypad. It also takes into consideration the key jamming problem that was available in typewriter. At first we identified those keys of cell phone, which are easily reachable and create less pressure on the thumb. Thus the key frequency order is calculated from anatomical point of view. In our proposed layout we arranged the alphabet in the frequent keys based on the frequency of the alphabet.

preprint2010arXiv

CR-MAC: A multichannel MAC protocol for cognitive radio ad hoc networks

This paper proposes a cross-layer based cognitive radio multichannel medium access control (MAC) protocol with TDMA, which integrate the spectrum sensing at physical (PHY) layer and the packet scheduling at MAC layer, for the ad hoc wireless networks. The IEEE 802.11 standard allows for the use of multiple channels available at the PHY layer, but its MAC protocol is designed only for a single channel. A single channel MAC protocol does not work well in a multichannel environment, because of the multichannel hidden terminal problem. Our proposed protocol enables secondary users (SUs) to utilize multiple channels by switching channels dynamically, thus increasing network throughput. In our proposed protocol, each SU is equipped with only one spectrum agile transceiver, but solves the multichannel hidden terminal problem using temporal synchronization. The proposed cognitive radio MAC (CR-MAC) protocol allows SUs to identify and use the unused frequency spectrum in a way that constrains the level of interference to the primary users (PUs). Our scheme improves network throughput significantly, especially when the network is highly congested. The simulation results show that our proposed CR-MAC protocol successfully exploits multiple channels and significantly improves network performance by using the licensed spectrum band opportunistically and protects PUs from interference, even in hidden terminal situations.

preprint2010arXiv

Extracting Symbolic Rules for Medical Diagnosis Problem

Neural networks (NNs) have been successfully applied to solve a variety of application problems involving classification and function approximation. Although backpropagation NNs generally predict better than decision trees do for pattern classification problems, they are often regarded as black boxes, i.e., their predictions cannot be explained as those of decision trees. In many applications, it is desirable to extract knowledge from trained NNs for the users to gain a better understanding of how the networks solve the problems. An algorithm is proposed and implemented to extract symbolic rules for medical diagnosis problem. Empirical study on three benchmarks classification problems, such as breast cancer, diabetes, and lenses demonstrates that the proposed algorithm generates high quality rules from NNs comparable with other methods in terms of number of rules, average number of conditions for a rule, and predictive accuracy.

preprint2010arXiv

Extraction of Symbolic Rules from Artificial Neural Networks

Although backpropagation ANNs generally predict better than decision trees do for pattern classification problems, they are often regarded as black boxes, i.e., their predictions cannot be explained as those of decision trees. In many applications, it is desirable to extract knowledge from trained ANNs for the users to gain a better understanding of how the networks solve the problems. A new rule extraction algorithm, called rule extraction from artificial neural networks (REANN) is proposed and implemented to extract symbolic rules from ANNs. A standard three-layer feedforward ANN is the basis of the algorithm. A four-phase training algorithm is proposed for backpropagation learning. Explicitness of the extracted rules is supported by comparing them to the symbolic rules generated by other methods. Extracted rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and predictive accuracy. Extensive experimental studies on several benchmarks classification problems, such as breast cancer, iris, diabetes, and season classification problems, demonstrate the effectiveness of the proposed approach with good generalization ability.

preprint2010arXiv

Medical diagnosis using neural network

This research is to search for alternatives to the resolution of complex medical diagnosis where human knowledge should be apprehended in a general fashion. Successful application examples show that human diagnostic capabilities are significantly worse than the neural diagnostic system. This paper describes a modified feedforward neural network constructive algorithm (MFNNCA), a new algorithm for medical diagnosis. The new constructive algorithm with backpropagation; offer an approach for the incremental construction of near-minimal neural network architectures for pattern classification. The algorithm starts with minimal number of hidden units in the single hidden layer; additional units are added to the hidden layer one at a time to improve the accuracy of the network and to get an optimal size of a neural network. The MFNNCA was tested on several benchmarking classification problems including the cancer, heart disease and diabetes. Experimental results show that the MFNNCA can produce optimal neural network architecture with good generalization ability.

preprint2010arXiv

Optimal Bangla Keyboard Layout using Association Rule of Data Mining

In this paper we present an optimal Bangla Keyboard Layout, which distributes the load equally on both hands so that maximizing the ease and minimizing the effort. Bangla alphabet has a large number of letters, for this it is difficult to type faster using Bangla keyboard. Our proposed keyboard will maximize the speed of operator as they can type with both hands parallel. Here we use the association rule of data mining to distribute the Bangla characters in the keyboard. First, we analyze the frequencies of data consisting of monograph, digraph and trigraph, which are derived from data wire-house, and then used association rule of data mining to distribute the Bangla characters in the layout. Finally, we propose a Bangla Keyboard Layout. Experimental results on several keyboard layout shows the effectiveness of the proposed approach with better performance.

preprint2010arXiv

Optimal Bangla Keyboard Layout using Data Mining Technique

This paper presents an optimal Bangla Keyboard Layout, which distributes the load equally on both hands so that maximizing the ease and minimizing the effort. Bangla alphabet has a large number of letters, for this it is difficult to type faster using Bangla keyboard. Our proposed keyboard will maximize the speed of operator as they can type with both hands parallel. Here we use the association rule of data mining to distribute the Bangla characters in the keyboard. First, we analyze the frequencies of data consisting of monograph, digraph and trigraph, which are derived from data wire-house, and then used association rule of data mining to distribute the Bangla characters in the layout. Experimental results on several data show the effectiveness of the proposed approach with better performance.

preprint2010arXiv

Pattern Classification using Simplified Neural Networks

In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. This paper presents an approach for classifying patterns from simplified NNs. Although the predictive accuracy of ANNs is often higher than that of other methods or human experts, it is often said that ANNs are practically "black boxes", due to the complexity of the networks. In this paper, we have an attempted to open up these black boxes by reducing the complexity of the network. The factor makes this possible is the pruning algorithm. By eliminating redundant weights, redundant input and hidden units are identified and removed from the network. Using the pruning algorithm, we have been able to prune networks such that only a few input units, hidden units and connections left yield a simplified network. Experimental results on several benchmarks problems in neural networks show the effectiveness of the proposed approach with good generalization ability.

preprint2010arXiv

Performance Analysis of Pulse Shaping Technique for OFDM PAPR Reduction

Orthogonal Frequency Division Multiplexing (OFDM) is an attractive modulation and multiple access techniques for channels with a nonflat frequency response, as it saves the need for complex equalizers. It can offer high quality performance in terms of bandwidth efficiency, robustness against multipath fading and cost-effective implementation. However, its main disadvantage is the high peak-to-average power ratio (PAPR) of the output signal. As a result, a linear behavior of the system over a large dynamic range is needed and therefore the efficiency of the output amplifier is reduced. In this paper, we investigate the effect of some of these sets of time waveforms on the OFDM system performance in terms of Bit Error Rate (BER). We evaluate the system performance in AWGN channels. The obtained results indicate that the reduction in PAPR of the investigated methods is associated with considerable improvement in BER performance of the system, in multipath channels, as compared to conventional OFDM. These promising results indicate that pulse shaping with reduced PAPR is an attractive solution for an OFDM system.

preprint2010arXiv

REx: An Efficient Rule Generator

This paper describes an efficient algorithm REx for generating symbolic rules from artificial neural network (ANN). Classification rules are sought in many areas from automatic knowledge acquisition to data mining and ANN rule extraction. This is because classification rules possess some attractive features. They are explicit, understandable and verifiable by domain experts, and can be modified, extended and passed on as modular knowledge. REx exploits the first order information in the data and finds shortest sufficient conditions for a rule of a class that can differentiate it from patterns of other classes. It can generate concise and perfect rules in the sense that the error rate of the rules is not worse than the inconsistency rate found in the original data. An important feature of rule extraction algorithm, REx, is its recursive nature. They are concise, comprehensible, order insensitive and do not involve any weight values. Extensive experimental studies on several benchmark classification problems, such as breast cancer, iris, season, and golf-playing, demonstrate the effectiveness of the proposed approach with good generalization ability.

preprint2010arXiv

RGANN: An Efficient Algorithm to Extract Rules from ANNs

This paper describes an efficient rule generation algorithm, called rule generation from artificial neural networks (RGANN) to generate symbolic rules from ANNs. Classification rules are sought in many areas from automatic knowledge acquisition to data mining and ANN rule extraction. This is because classification rules possess some attractive features. They are explicit, understandable and verifiable by domain experts, and can be modified, extended and passed on as modular knowledge. A standard three-layer feedforward ANN is the basis of the algorithm. A four-phase training algorithm is proposed for backpropagation learning. Comparing them to the symbolic rules generated by other methods supports explicitness of the generated rules. Generated rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and predictive accuracy. Extensive experimental studies on several benchmarks classification problems, including breast cancer, wine, season, golf-playing, and lenses classification demonstrate the effectiveness of the proposed approach with good generalization ability.

preprint2010arXiv

Rotation Invariant Face Detection Using Wavelet, PCA and Radial Basis Function Networks

This paper introduces a novel method for human face detection with its orientation by using wavelet, principle component analysis (PCA) and redial basis networks. The input image is analyzed by two-dimensional wavelet and a two-dimensional stationary wavelet. The common goals concern are the image clearance and simplification, which are parts of de-noising or compression. We applied an effective procedure to reduce the dimension of the input vectors using PCA. Radial Basis Function (RBF) neural network is then used as a function approximation network to detect where either the input image is contained a face or not and if there is a face exists then tell about its orientation. We will show how RBF can perform well then back-propagation algorithm and give some solution for better regularization of the RBF (GRNN) network. Compared with traditional RBF networks, the proposed network demonstrates better capability of approximation to underlying functions, faster learning speed, better size of network, and high robustness to outliers.

preprint2010arXiv

Rule Extraction using Artificial Neural Networks

Artificial neural networks have been successfully applied to a variety of business application problems involving classification and regression. Although backpropagation neural networks generally predict better than decision trees do for pattern classification problems, they are often regarded as black boxes, i.e., their predictions are not as interpretable as those of decision trees. In many applications, it is desirable to extract knowledge from trained neural networks so that the users can gain a better understanding of the solution. This paper presents an efficient algorithm to extract rules from artificial neural networks. We use two-phase training algorithm for backpropagation learning. In the first phase, the number of hidden nodes of the network is determined automatically in a constructive fashion by adding nodes one after another based on the performance of the network on training data. In the second phase, the number of relevant input units of the network is determined using pruning algorithm. The pruning process attempts to eliminate as many connections as possible from the network. Relevant and irrelevant attributes of the data are distinguished during the training process. Those that are relevant will be kept and others will be automatically discarded. From the simplified networks having small number of connections and nodes we may easily able to extract symbolic rules using the proposed algorithm. Extensive experimental results on several benchmarks problems in neural networks demonstrate the effectiveness of the proposed approach with good generalization ability.

preprint2010arXiv

Smart Bengali Cell Phone Keypad Layout

Nowadays cell phone is the most common communicating used by mass people. SMS based communication is a cheap and popular communication method. It is human tendency to have the opportunity to write SMS in their mother language. Text input in mother language is more flexible when the alphabets of that language are printed on the keypad. Bangla mobile keypad based on phonetics has been proposed earlier. But the keypad is not scientific from frequency and flexibility point of view. Since it is not a feasible solution in this paper we have proposed an efficient Bengali keypad for cell phone and other cellular device. The proposed keypad is based on the frequency of the alphabets in Bengali language and also with the view of structure of human finger movements. We took the two points in count to provide a flexible and fast cell phone keypad.

preprint2010arXiv

Speaker Identification using MFCC-Domain Support Vector Machine

Speech recognition and speaker identification are important for authentication and verification in security purpose, but they are difficult to achieve. Speaker identification methods can be divided into text-independent and text-dependent. This paper presents a technique of text-dependent speaker identification using MFCC-domain support vector machine (SVM). In this work, melfrequency cepstrum coefficients (MFCCs) and their statistical distribution properties are used as features, which will be inputs to the neural network. This work firstly used sequential minimum optimization (SMO) learning technique for SVM that improve performance over traditional techniques Chunking, Osuna. The cepstrum coefficients representing the speaker characteristics of a speech segment are computed by nonlinear filter bank analysis and discrete cosine transform. The speaker identification ability and convergence speed of the SVMs are investigated for different combinations of features. Extensive experimental results on several samples show the effectiveness of the proposed approach.

preprint2010arXiv

Text Classification using Artificial Intelligence

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for classifying text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using artificial intelligence technique that requires fewer documents for training. Instead of using words, word relation i.e. association rules from these words is used to derive feature set from pre-classified text documents. The concept of naïve Bayes classifier is then used on derived features and finally only a single concept of genetic algorithm has been added for final classification. A system based on the proposed algorithm has been implemented and tested. The experimental results show that the proposed system works as a successful text classifier.

preprint2010arXiv

Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm

Text classification is the automated assignment of natural language texts to predefined categories based on their content. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Now a day the demand of text classification is increasing tremendously. Keeping this demand into consideration, new and updated techniques are being developed for the purpose of automated text classification. This paper presents a new algorithm for text classification. Instead of using words, word relation i.e. association rules is used to derive feature set from pre-classified text documents. The concept of Naive Bayes Classifier is then used on derived features and finally a concept of Genetic Algorithm has been added for final classification. A system based on the proposed algorithm has been implemented and tested. The experimental results show that the proposed system works as a successful text classifier.

preprint2010arXiv

Text Classification using Data Mining

Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of using words, word relation i.e. association rules from these words is used to derive feature set from pre-classified text documents. The concept of Naive Bayes classifier is then used on derived features and finally only a single concept of Genetic Algorithm has been added for final classification. A system based on the proposed algorithm has been implemented and tested. The experimental results show that the proposed system works as a successful text classifier.

preprint2010arXiv

Text Classification using the Concept of Association Rule of Data Mining

As the amount of online text increases, the demand for text classification to aid the analysis and management of text is increasing. Text is cheap, but information, in the form of knowing what classes a text belongs to, is expensive. Automatic classification of text can provide this information at low cost, but the classifiers themselves must be built with expensive human effort, or trained from texts which have themselves been manually classified. In this paper we will discuss a procedure of classifying text using the concept of association rule of data mining. Association rule mining technique has been used to derive feature set from pre-classified text documents. Naive Bayes classifier is then used on derived features for final classification.

preprint2010arXiv

The Most Advantageous Bangla Keyboard Layout Using Data Mining Technique

Bangla alphabet has a large number of letters, for this it is complicated to type faster using Bangla keyboard. The proposed keyboard will maximize the speed of operator as they can type with both hands parallel. Association rule of data mining to distribute the Bangla characters in the keyboard is used here. The frequencies of data consisting of monograph, digraph and trigraph are analyzed, which are derived from data wire-house, and then used association rule of data mining to distribute the Bangla characters in the layout. Experimental results on several data show the effectiveness of the proposed approach with better performance. This paper presents an optimal Bangla Keyboard Layout, which distributes the load equally on both hands so that maximizing the ease and minimizing the effort.

preprint2010arXiv

Universal Numeric Segmented Display

Segmentation display plays a vital role to display numerals. But in today's world matrix display is also used in displaying numerals. Because numerals has lots of curve edges which is better supported by matrix display. But as matrix display is costly and complex to implement and also needs more memory, segment display is generally used to display numerals. But as there is yet no proposed compact display architecture to display multiple language numerals at a time, this paper proposes uniform display architecture to display multiple language digits and general mathematical expressions with higher accuracy and simplicity by using a 18-segment display, which is an improvement over the 16 segment display.

preprint2010arXiv

Web Page Categorization Using Artificial Neural Networks

Web page categorization is one of the challenging tasks in the world of ever increasing web technologies. There are many ways of categorization of web pages based on different approach and features. This paper proposes a new dimension in the way of categorization of web pages using artificial neural network (ANN) through extracting the features automatically. Here eight major categories of web pages have been selected for categorization; these are business & economy, education, government, entertainment, sports, news & media, job search, and science. The whole process of the proposed system is done in three successive stages. In the first stage, the features are automatically extracted through analyzing the source of the web pages. The second stage includes fixing the input values of the neural network; all the values remain between 0 and 1. The variations in those values affect the output. Finally the third stage determines the class of a certain web page out of eight predefined classes. This stage is done using back propagation algorithm of artificial neural network. The proposed concept will facilitate web mining, retrievals of information from the web and also the search engines.