Biological-based Semi-supervised Clustering Algorithm to Improve Gene Function Prediction [ Full-Text ]
Shahreen Kasim, Safaai Deris, Razib M. Othman and Rathiah Hashim
Analysis of simultaneous clustering of gene expression with biological knowledge has become an important technique and standard practice for presenting a proper interpretation of the data and its underlying biology. However, common clustering algorithms do not provide a comprehensive approach that looks into the three categories of annotation: biological process, molecular function, and cellular component, and they have not been tested with different functional annotation database formats. Furthermore, traditional clustering algorithms use random initialization, which causes inconsistent cluster generation, and they are unable to determine the number of clusters involved. In this paper, we present a novel computational framework called CluFA (Clustering Functional Annotation) for semi-supervised clustering of gene expression data. The framework consists of three stages: (i) preparation of Gene Ontology (GO) datasets, functional annotation databases, and testing datasets; (ii) fuzzy c-means clustering to find the optimal clusters; and (iii) analysis of the computational evaluation and biological validation of the results obtained. By combining the three GO term categories (biological process, molecular function, and cellular component) with functional annotation databases (the Saccharomyces Genome Database (SGD), the Yeast Database at the Munich Information Centre for Protein Sequences (MIPS), and Entrez), CluFA is able to determine the number of clusters and reduce random initialization. In addition, CluFA is more comprehensive in its capability to predict the functions of unknown genes. We tested our new computational framework for semi-supervised clustering of yeast gene expression data based on multiple functional annotation databases. Experimental results show that 76 clusters were identified via the GO slim dataset.
By applying the SGD, Entrez, and MIPS functional annotation databases to reduce random initialization, performance on both computational evaluation and biological validation was improved. Through the use of comprehensive GO term categories, the lowest compactness and separation values were achieved. Therefore, from this experiment, we can conclude that CluFA improved gene function prediction through the utilization of GO and gene expression values with the fuzzy c-means clustering algorithm, cross-referencing it with the latest SGD annotation.
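As an illustration of the clustering stage, a minimal one-dimensional fuzzy c-means sketch is shown below. It is illustrative only: the deterministic spread initialization and the data points are hypothetical, not CluFA's annotation-guided seeding.

```python
def fuzzy_c_means(points, c=2, m=2.0, iters=50):
    """Minimal 1-D fuzzy c-means sketch (not CluFA itself).
    m is the fuzzifier; m = 2 is the common default."""
    # Deterministic initialization: spread the starting centers
    # across the data range, for reproducibility.
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    for _ in range(iters):
        # Membership update: u[i][j] = 1 / sum_k (d_ij / d_kj)^(2/(m-1)).
        u = []
        for ci in centers:
            row = []
            for x in points:
                d = abs(x - ci) or 1e-12
                row.append(1.0 / sum((d / (abs(x - ck) or 1e-12)) ** (2 / (m - 1))
                                     for ck in centers))
            u.append(row)
        # Center update: membership-weighted mean of the points.
        centers = [sum((row[j] ** m) * points[j] for j in range(len(points))) /
                   sum(row[j] ** m for j in range(len(points)))
                   for row in u]
    return sorted(centers)

# Two well-separated hypothetical expression values.
centers = fuzzy_c_means([1.0, 1.2, 0.9, 8.0, 8.3, 7.9])
```

With well-separated data the centers converge to approximately the two group means, regardless of the fuzzifier setting.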
An Effective Framework to Preserve the Data Privacy By Innovative Rotation Technique [ Full-Text ]
P. Kamakshi and A. Vinaya Babu
Due to rapid developments in hardware, software and networking technology, there has been tremendous growth in the amount of data collected, stored and shared between different organizations. Data collected from heterogeneous sources such as medical, financial, library, telephone, and shopping records can be stored in a central repository called a data warehouse. The primary challenge is how to utilize such data for competitive business advantage. The data mining process analyzes such data from different perspectives and summarizes it into useful information that can be used to increase revenue, reduce cost and recommend better resolutions for the growth of an organization. Data mining tools find correlations or patterns among large relational databases and analyze the data from many different dimensions or angles. Data mining is seen as an increasingly important tool by modern business to transform data into business intelligence, giving an informational advantage in domains such as marketing, weather forecasting, fraud detection and scientific research. A very significant aspect to be considered during the data mining process is that the data collected from heterogeneous sources also contains sensitive information. The patterns extracted by data mining operations may reveal this sensitive information. While data mining is a technology with a large number of advantages, the main threat to be addressed is privacy. The main anxiety of people is that their confidential information may be disclosed without their knowledge and misused behind the scenes. Hence data mining activities are forced to take actions to protect the privacy of individuals. In this paper we propose an architecture which utilizes the significant features of perturbation and rotation techniques. We analyze the problems with the perturbation technique and propose a method that provides better protection of sensitive information.
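The rotation idea at the heart of such an architecture can be sketched in a few lines; this is a minimal illustration on hypothetical two-dimensional records, not the authors' exact method:

```python
import math

def rotate_records(records, theta):
    """Rotation perturbation sketch: rotate each (x, y) record by theta.
    Pairwise Euclidean distances are preserved, so distance-based mining
    (clustering, nearest-neighbour classification) still works, while
    the published attribute values differ from the originals."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in records]

data = [(3.0, 4.0), (0.0, 0.0)]            # hypothetical sensitive records
perturbed = rotate_records(data, math.radians(37))
```

The distance between the two records (5.0 here) is unchanged after rotation, which is exactly what lets mining results survive the perturbation.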
Application of MCDM model for assessing suitability of JIT Manufacturing [ Full-Text ]
A. V. R. Ramanbharath, B. S. P. Manjunaathan and C. Vijaya Ramnath
In the present scenario, all companies concentrate on producing quality products at low cost. The success of an industry depends on its product's quality, cost and delivery time. To achieve these factors, the maintenance policy adopted in the industry should be effective and easy to implement. So nowadays all manufacturers are trying to implement new manufacturing methods for their production processes. In this paper, an attempt has been made to find the suitability of just in time (JIT) manufacturing in a leading steering manufacturing company in India. Even though many Multi Criteria Decision Making (MCDM) models such as AHP, ANP and PVA are available, a Fuzzy Based (FB) model is necessary to assess suitability by considering important factors and simulating them with data given by experts in those fields. This paper mainly focuses on the modeling of a 'Fuzzy Based Simulation' for finding the suitability of JIT by considering the following important factors: quality, cost, and delivery time.
Arabic Document Classification: A Comparative Study [ Full-Text ]
M. E. Abd El-Monsef, M. Amin , El-Sayed Atlam and O. El-Barbary
Text classification has been widely used to assist users with the discovery of useful information from the Internet. However, traditional classification methods are based on a word representation that only accounts for term frequency in the documents and ignores important relationships between key terms and fields. This paper considers specific words related to a field, called Field Association (FA) words, together with their ranks (levels). Moreover, it presents a Java software system for classifying Arabic text using keywords, FA words and compound FA words. Furthermore, a comparative study of keywords, FA words and compound FA words on Arabic text was carried out using experimental results generated by our software. The presented methods are evaluated through simulation results over 1819 files and 16 super fields.
On Data Sanitization in Constructing Database from Class Diagram [ Full-Text ]
Mohd Zainuri Saringat, Rosziati Ibrahim, Noraini Ibrahim and Tutut Herawan
Data is very expensive and must be free of errors because it is used to support decision making. Ambiguous data decreases reliability and increases computer storage requirements. Furthermore, data is used to produce executive information for decision support. Even though data is valuable, not all of it must be stored in the database. A badly designed database leads to ambiguity and incorrect information. However, it is very hard to ensure that only useful data is stored in the database. In this paper, we propose a data sanitization process as an alternative method for deciding which data is suitable to keep in the database system. The technique reduces data redundancy so that only the agreed-upon data is stored in the database. It is based on object-oriented design, and user interfaces are used as the guideline for constructing the structure of the database. In this research, we are concerned with removing duplicate storage in the database; data that is never retrieved by users is removed as well. This increases the efficiency of database system usage and shortens the route for accessing information in the database, which is believed to have the further advantage of improving access speed.
Software Testing Approach for Detection and Correction of Design Defects in Object Oriented Software
Dinesh Kumar Saini, Lingaraj A. Hadimani and Nirmal Gupta
The presence of design defects in object-oriented software can have a severe impact on software quality. The detection and correction of design defects is an important issue for cost-effective maintenance. In this work we propose an automatic detection technique which uses design patterns as a reference for good design to detect design defects in existing software designs. We also propose a correction technique which can refactor the code to meet the design specifications using the concept of class slicing. This technique can be applied to any code in which classes are excessively coupled together, thereby failing to meet good design specifications for object-oriented software.
Transferring Voice using SMS over GSM Network [ Full-Text ]
M. Fahad Khan and Saira Beg
This paper presents a methodology for transmitting voice via SMS (Short Message Service) over a GSM network. SMS contents are usually text based and limited to 140 bytes. SMS supports national and international roaming, and it is also supported by other telecommunication technologies such as TDMA (Time Division Multiple Access) and CDMA (Code Division Multiple Access). It can be sent and received simultaneously with other services. Such features make it favorable for this methodology. For this purpose, an application was developed using the J2ME platform, which is supported by most mobile phones worldwide. The algorithm was tested on a Nokia N95 running the Symbian Operating System (OS).
A Case Study of the Development of Document Management System in Oil and Gas Company [ Full-Text ]
Mohd Hilmi Hasan and Daliainie Mat Saaid
Information resources require proper management that can ensure they are always accessible in any condition. This can be made possible through libraries that archive all the information documents. The objective of this paper is to present a case study of the development of a document management system in an oil and gas company. The system was developed in a three-tier web-based architecture and provides several functionalities, namely viewing and borrowing documents and drawings of gas pipelines. The system is also able to rank documents based on a fuzzy algorithm and provides e-mail notification to users in the registration and borrowing features. Two types of tests were performed to evaluate the system, namely functional and user acceptance tests. The latter was performed by a group of expert users from the company being studied in this research. The system was successfully developed and is believed to improve the company's document and drawing management. The study implies potential time savings, as users may now view document details and borrow documents online. In addition, the study also implies greater effectiveness in managing the documents and drawings. For future work, it is proposed that more security measures be implemented to ensure the reliability of the system. Moreover, a notification feature to inform administrators of any requests made by users is also proposed to enhance the system.
Techniques developed in Artificial Intelligence from the standpoint of their application in software Engineering [ Full-Text ]
A. Sharmila Dhana Joy and R. Dhanapal
The software development process is very complex and is primarily a human activity. Programming, in software development, requires the use of different types of knowledge: about the problem domain and about the programming domain. It also requires many different steps to combine these types of knowledge into one final solution. This paper studies the techniques developed in artificial intelligence (AI) from the standpoint of their application in software engineering. In particular, it focuses on techniques developed (or being developed) in artificial intelligence that can be deployed to solve problems associated with software engineering processes. The paper presents a comparative study between software development and expert system development. It also highlights the absence of risk management strategies or risk management phases in AI-based systems.
Secure Heterogeneous Cluster-based Newscast Protocol [ Full-Text ]
I. Kazmi, S. Aslam and M. Y. Javed
The Heterogeneous Cluster-based Newscast Protocol (HCNP) is a gossip-based p2p overlay network protocol. It has an inherent capability for handling heterogeneous resources, and it exploits the autonomy of nodes by sharing each node's cache with other nodes. This makes it prone to different types of cache attacks manipulating protocol-related information. The current research is a first step towards implementing public key encryption with gossip-based protocols, particularly HCNP, to make them secure from attacks by malicious nodes. It not only introduces the concept of p2p overlay architecture-specific security but also defines a new version of RSA with dynamic key assignment capability, which gives more efficient results than the original RSA for HCNP. Moreover, it provides a platform that can be used with other gossip-based protocols. This research has shown that RSA can be modified to give better performance with gossip-based protocols, although it is computationally quite complex.
Diffusion and Adoption of Technology and Innovation in Developing Countries – New Critical Impact Evidence [ Full-Text ]
This paper discusses the diffusion and adoption of technology (particularly the Internet) as well as innovation in developing countries. It analyzes the exploding effect of the Internet and social networking in developing countries and provides evidence of the critical impact of this effect. The study focuses on the strategic use of technology and how it can be used to bring about major social and political changes in developing countries. The study also proposes a new theoretical framework for the diffusion and adoption of technology and innovation in developing countries.
Proposed Quality Evaluation Framework to Incorporate Quality Aspects in Web Warehouse Creation
Umm-e-Mariya Shah, Maqbool Uddin Shaikh, Azra Shamim and Yasir Mehmood
A web warehouse is a read-only repository maintained on the web to effectively handle relevant data. It is a system comprised of various subsystems and processes, and it supports organizations in decision making. The quality of the data stored in a web warehouse can affect the quality of the decisions made. For valuable decision making, it is necessary to consider quality aspects in the design and modeling of a web warehouse. Thus data quality is one of the most important issues of the web warehousing system. Quality must be incorporated at different stages of web warehousing system development, and it is necessary to enhance the existing data warehousing system to increase data quality. This results in the storage of high-quality data in the repository and efficient decision making. In this paper a Quality Evaluation Framework is proposed, keeping in view the quality dimensions associated with the different phases of a web warehouse. Furthermore, the proposed framework is validated empirically with the help of quantitative analysis.
An Extended Method for Order Reduction of Large Scale Systems [ Full-Text ]
In this paper, an effective procedure for determining the reduced-order model of higher-order linear time-invariant dynamic systems is discussed. The numerator and denominator polynomials of the reduced-order model are obtained by redefining the time moments of the original high-order system, and the method is extended to systems having repeated poles. The proposed method has been verified using typical numerical examples.
Fingerprint Recognition Using Extended Fuzzy Hypersphere Neural Network [ Full-Text ]
M. H. Kondekar, U. V. Kulkarni and S. S. Chowhan
Presently, personal identification using fingerprints is one of the most reliable and popular biometric recognition methods. An accurate and consistent classification algorithm can significantly reduce fingerprint matching time. This paper describes the Extended Fuzzy Hypersphere Neural Network (EFHSNN) with its learning algorithm, which is an extension of the Fuzzy Hypersphere Neural Network (FHSNN). The EFHSNN uses the Manhattan distance instead of the Euclidean distance. Experimental results on the PolyU HRF fingerprint database show that EFHSNN is superior, yielding a 100% recognition rate along with lower training and recall times.
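The distance substitution described above is simple to state in code; a minimal sketch with hypothetical feature vectors:

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Manhattan (city-block, L1) distance: only additions and absolute
    # values, no squaring or square root, hence cheaper per comparison.
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)   # hypothetical feature vectors
d_e, d_m = euclidean(p, q), manhattan(p, q)
```

For these vectors the Euclidean distance is 5.0 and the Manhattan distance is 7.0; the two metrics induce different neighbourhoods, which is why swapping one for the other changes a classifier's behaviour.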
Coating RUP-Project Management over SOA-Project Management [ Full-Text ]
Sheikh Muhammad Saqib, Muhammad Ahmad Jan and Shakeel Ahmad
SOA plays a vital role in the development of service-oriented computing. In every type of computing, project management is a very necessary and strong practice. Due to the huge scope of SOA, its project management can sometimes become feeble. SOA can follow traditional approaches to project management, but risk handling can be lost. Risk exploration and handling are very influential in agile methodologies such as RUP. Here the authors investigate how, by applying RUP project management (RUP-PM) to SOA projects, these projects can be carried out with a high percentage of RUP-PM work.
Enhanced Encryption Methods [ Full-Text ]
Khalaf F. Khatatneh, Mohammad Hjouj Btoush and Qutyba A. Al-Tallaq
Modern society has a significant interest in keeping information secure. Fields such as commerce, the military, and simple personal communication all have a need to keep their data unreadable by unauthorized people. This paper explores the concept of data encryption. The history of encryption is presented from its origins up through the modern age. Finally, an in-depth analysis and description of the mathematics that make encryption work is presented, followed by a view of the application created to implement the encryption methods.
A new Testing approach using Cuckoo Search to achieve Multi-Objective Genetic Algorithm [ Full-Text ]
Kavita Choudhary and G. N. Purohit
Software testing is the process of executing a program with the intent of finding errors. It can also be stated as the process of validating and verifying that a software program meets the requirements that guided its design and development and works as expected. The verification and validation of software through dynamic testing is an area of software engineering where progress towards automation has been slow. Software systems should be reliable and accurate; to achieve this objective, complete testing is required. Automated software testing can significantly reduce the cost of developing software. This paper presents Cuckoo Search, an optimization algorithm introduced in 2009, for the generation of test cases. Cuckoo Search is one of the evolutionary algorithms used here to achieve a multi-objective genetic algorithm.
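A minimal single-objective cuckoo-search sketch is given below; it is illustrative only — Gaussian steps stand in for the usual Lévy flights, and the sphere function is a hypothetical objective rather than a test-case fitness:

```python
import random

def cuckoo_search(f, dim=2, n=15, pa=0.25, iters=500):
    """Minimal single-objective cuckoo-search sketch.
    Each nest holds one candidate solution; a fraction pa of the
    worst nests is abandoned and rebuilt every generation."""
    random.seed(1)
    nests = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    for _ in range(iters):
        # Lay one cuckoo egg near a random nest; it displaces a random
        # host egg only if its fitness is better.
        egg = [x + random.gauss(0, 0.5) for x in random.choice(nests)]
        j = random.randrange(n)
        if f(egg) < f(nests[j]):
            nests[j] = egg
        # Keep the best nests, abandon and rebuild the worst pa-fraction.
        nests.sort(key=f)
        for k in range(int(n * (1 - pa)), n):
            nests[k] = [random.uniform(-5, 5) for _ in range(dim)]
    return min(nests, key=f)

sphere = lambda v: sum(x * x for x in v)   # hypothetical objective
best = cuckoo_search(sphere)
```

In the test-generation setting, a candidate solution would encode a test input and the objective would score coverage, but that mapping is the paper's contribution and is not reproduced here.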
Securing DNS Using Elliptical Curve Cryptography: An Overview [ Full-Text ]
M. Junaid Arshad and M. Abrar
The vital importance of the domain name server demands that it be robust against attacks. Due to the absence of any good security model for DNS data, the domain name space has always been a good target for attackers, so there has been a real need for a security protocol that could provide the utmost security to DNS. This work is a step towards achieving that goal. Elliptic curve cryptography is one of the most proficient methods of encryption. This model uses elliptic curve cryptography to provide confidentiality, integrity and availability of DNS data and DNS resources. DNS queries and responses are encrypted to secure them against DNS cache poisoning, DNS data spoofing, DoS and some other attacks.
Towards the development of Biological Viruses Community Ontology (BVCO) [ Full-Text ]
Sheikh Kashif Raffat, Muhammad Shahab Siddiqui, Zubair A. Shaikh and Abdul Rahman Memon
This century is said to be the century of the biological and information sciences due to the amount of research conducted in these fields. Viruses (biological and non-biological) have received a lot of attention all over the world due to their massive effect on different living communities. Viruses such as bird flu, dengue and late blight of potato have had a drastic impact on human, animal and plant communities. All these problems generated a need to classify these virus communities in some formal form. Ontology plays a vital role in classifying different biological information in a controlled and appropriate manner. Therefore, a Biological Viruses Community Ontology (BVCO) is proposed which covers the virus communities that exist in different species. Viruses belong to different viral communities according to their hosts, such as vertebrates, invertebrates, plants and unicellular organisms. The proposed ontology is developed using the principles of Open Biomedical Ontologies (OBO) and will be available in the OBO format. It can be explored using OBO-Edit or any other OBO-supporting browser. To develop BVCO we used the taxonomy of viruses developed by the International Committee on Taxonomy of Viruses (ICTV).
Strategies of Domain Decomposition to Partition Mesh-Based Applications onto Computational Grids
Beatriz Otero and Marisa Gil
In this paper, we evaluate strategies of domain decomposition in a Grid environment for solving mesh-based applications. We compare the balanced distribution strategy with unbalanced distribution strategies. While the former is a common strategy in homogeneous computing environments (e.g. parallel computers), it presents some problems due to communication latency in Grid environments. Unbalanced decomposition strategies consist of assigning less workload to the processors responsible for sending updates outside the host. The results obtained in Grid environments show that unbalanced distribution strategies improve the expected execution time of mesh-based applications by up to 53%. However, this is not true when the number of processors devoted to communication exceeds the number of processors devoted to calculation in the host. To solve this problem we propose a new unbalanced distribution strategy that improves the expected execution time by up to 43%. We analyze the influence of the communication patterns on execution times using the Dimemas simulator.
Framework for Case Based Object Oriented Expert Warehouse to Enhance Knowledge management process for executive Decision Making [ Full-Text ]
Rizwana Irfan, Azra Shamim and Madiha Kazmi
Knowledge management is the key to the success of any organization. The knowledge management process enables decision makers to fully utilize existing knowledge for future decision making. It improves organizational efficiency by minimizing loss and risk. Moreover, it provides well-informed decisions and pre-planned streamlined operations that can lead to greater productivity, increased profit and competitive advantage. In this paper the authors propose a new architecture for knowledge management with expert decision-making capability, based on the efficient extraction, transformation and storage of data and knowledge coming from heterogeneous and distributed environments. The proposed framework provides a qualitative approach for enhancing the knowledge management process with the help of object-oriented techniques and inference techniques for making accurate and reliable business decisions.
A Dynamic-Management of QoS in Distributed Multimedia Systems [ Full-Text ]
Bechir Alaya, Claude Duvallet, Bruno Sadeg and Faiez Gargouri
One of the current challenges in multimedia systems is to ensure efficient data transmission between the server and clients. These systems must guarantee users a certain quality of service (QoS) by ensuring data accessibility whatever the hardware and network conditions are. They must also guarantee information consistency, particularly the respect of temporal constraints, in order to obtain a smooth presentation of scenes. In this paper, we propose an architecture obtained by exploiting similarities between real-time database systems (RTDBSs) and multimedia systems. Then, we define a method, which we name (m,k)-frame, that allows the number of frames sent to users to be controlled at any moment by selectively discarding some frames when needed. Finally, we carry out simulations whose results show the better performance of our approach, which consists of adapting the QoS to the real conditions, compared both to a previously proposed method (the R-(m,k)-firm method) and to the original (m,k)-firm method. This adaptation is done according to the system load, which may become heavy due to network congestion, i.e., the dynamic arrival of clients.
LMS and RLS Channel Estimation Algorithms for LTE-Advanced [ Full-Text ]
Saqib Saleem and Qamar-ul-Islam
For the increased data rates and reduced latency of 4G radio communication standards, the ITU made proposals for LTE-Advanced in 2009. To achieve the Release-10 targets set by the 3rd Generation Partnership Project (3GPP), channel state information at the transmitter is a prerequisite. In this paper, an analysis of the Least Mean Square (LMS) and Recursive Least Square (RLS) channel estimation techniques, using a priori channel statistics, is carried out for different numbers of Channel Impulse Response (CIR) samples and channel taps for the LTE-Advanced system. The parameters involved in these adaptive filters are also optimized. MATLAB simulations are used to compare their performance, in terms of Mean Square Error and Symbol Error Rate, and their complexity in terms of computational time.
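The LMS update at the core of such an estimator can be sketched as follows; the channel taps, step size and training signal here are hypothetical, not the paper's LTE-Advanced settings:

```python
import random

def lms_estimate(x, d, taps=3, mu=0.05, epochs=20):
    """LMS sketch: adapt FIR weights w so that the filtered known
    input x approximates the received signal d."""
    w = [0.0] * taps
    for _ in range(epochs):
        for n in range(taps - 1, len(x)):
            window = x[n - taps + 1:n + 1][::-1]    # newest sample first
            y = sum(wi * xi for wi, xi in zip(w, window))
            e = d[n] - y                            # a priori error
            # Steepest-descent step along the input vector.
            w = [wi + mu * e * xi for wi, xi in zip(w, window)]
    return w

random.seed(0)
h = [0.8, 0.4, -0.2]                                # hypothetical channel
x = [random.uniform(-1, 1) for _ in range(200)]     # known training signal
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]                        # noiseless received signal
w = lms_estimate(x, d)                              # w converges toward h
```

In a noiseless setting like this, LMS drives the weight error to zero; with noise, the trade-off between step size mu and steady-state error is exactly the parameter optimization the abstract refers to.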
Transform-Based Channel Estimation Techniques for LTE-Advanced [ Full-Text ]
Saqib Saleem and Qamar-ul-Islam
For wireless broadband services, the 3rd Generation Partnership Project (3GPP) has proposed LTE-Advanced as the next-generation mobile standard in Release 10 and beyond. For high data rates and spectral efficiency, channel state information is required at the transmitter side. In this paper the performance and complexity of three time-domain transform-based channel estimators, Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT) and Windowed-DFT, are compared, with the channel length expressed in terms of CIR samples and the number of multi-paths. The performance is evaluated in terms of Mean Square Error (MSE) and Symbol Error Rate (SER), while complexity is determined in terms of computational time. MATLAB Monte-Carlo simulations are used to optimize these algorithms.
Modified PCA based Image Fusion and its Quality Measure [ Full-Text ]
Amit Kumar Sen, Subhadip Mukherjee and Amlan Chakrabarti
Image fusion is an emerging area of research in image processing and computer vision. This paper proposes an algorithm based on a revised version of the traditional principal component analysis (PCA) technique which overcomes the shortcomings of the traditional PCA-based algorithm. The algorithm is applied to fusing benchmark images, and the results are compared with those of traditional PCA-based fusion in terms of image quality. The results show that the proposed algorithm produces a fused image of better quality than the traditional PCA-based technique.
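For reference, the traditional PCA fusion being revised derives its weights from the leading eigenvector of the 2x2 covariance matrix of the two source images; a minimal sketch on hypothetical 2x2 images (the paper's modified version is not reproduced here):

```python
import math

def pca_fusion_weights(img1, img2):
    """Traditional PCA fusion sketch: normalised components of the
    leading eigenvector of the 2x2 image covariance matrix."""
    a = [p for row in img1 for p in row]
    b = [p for row in img2 for p in row]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    caa = sum((x - ma) ** 2 for x in a) / n
    cbb = sum((y - mb) ** 2 for y in b) / n
    cab = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    # Larger eigenvalue of [[caa, cab], [cab, cbb]] and its eigenvector.
    lam = (caa + cbb + math.sqrt((caa - cbb) ** 2 + 4 * cab * cab)) / 2
    v = (cab, lam - caa) if cab else (1.0, 0.0)
    w1 = v[0] / (v[0] + v[1])
    return w1, 1.0 - w1

img1 = [[10, 20], [30, 40]]               # hypothetical source images
img2 = [[12, 18], [33, 39]]
w1, w2 = pca_fusion_weights(img1, img2)
fused = [[w1 * p + w2 * q for p, q in zip(r1, r2)]
         for r1, r2 in zip(img1, img2)]   # pixel-wise weighted fusion
```

The weights sum to one, so the fused image stays in the source intensity range; the shortcoming the paper addresses is that a single global weight pair ignores local image structure.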
A New Digital Watermarking Algorithm Using Combination of Least Significant Bit (LSB) and Inverse Bit
Abdullah Bamatraf, Rosziati Ibrahim and Mohd. Najib Mohd. Salleh
In this paper, we introduce a new digital watermarking algorithm using the least significant bit (LSB). The LSB is used because of its small effect on the image. The new algorithm uses the LSB by inverting the binary values of the watermark text and shifting the watermark according to the odd or even pixel coordinates of the image before embedding the watermark. The proposed algorithm is flexible with respect to the length of the watermark text: if the length of the watermark text is more than ((MxN)/8)-2, the algorithm embeds the remainder of the watermark text in the second LSB. We compare our proposed algorithm with the 1-LSB algorithm and Lee's algorithm using the peak signal-to-noise ratio (PSNR). The new algorithm improves the quality of the watermarked image. We also attacked the watermarked image by cropping it and adding noise, and obtained good results as well.
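The plain LSB embedding that the algorithm builds on can be sketched in a few lines; this omits the paper's inverse-bit and coordinate-shifting steps and uses hypothetical pixel values:

```python
def embed_lsb(pixels, bits):
    """Plain LSB embedding sketch: write one watermark bit into the
    least significant bit of each 8-bit pixel value."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)] + pixels[len(bits):]

def extract_lsb(pixels, n):
    """Read the first n watermark bits back out."""
    return [p & 1 for p in pixels[:n]]

cover = [137, 200, 54, 91, 76]            # hypothetical grayscale pixels
marked = embed_lsb(cover, [1, 0, 1, 1])   # embed four watermark bits
```

Each pixel changes by at most one intensity level, which is why LSB embedding has such a small visual effect and a high PSNR.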
A Hybrid Approach to the Diagnosis of Tuberculosis by Cascading Clustering and Classification
Asha T., S. Natarajan and K. N. B. Murthy
In this paper, a methodology for the automated detection and classification of tuberculosis (TB) is presented. Tuberculosis is a disease caused by mycobacteria which spreads through the air and easily attacks bodies with low immunity. Our methodology is based on clustering and classification and classifies TB into two categories: Pulmonary Tuberculosis (PTB) and retroviral PTB (RPTB), i.e., PTB with Human Immunodeficiency Virus (HIV) infection. Initially, K-means clustering is used to group the TB data into two clusters and assign classes to the clusters. Subsequently, multiple different classification algorithms are trained on the result set to build the final classifier model based on K-fold cross validation. The methodology is evaluated using 700 raw TB records obtained from a city hospital. The best accuracy obtained was 98.7%, from the support vector machine (SVM), compared to the other classifiers. The proposed approach helps doctors in their diagnosis decisions and also in their treatment planning procedures for the different categories.
Improving the Performance of Quicksort for Average Case Through a Modified Diminishing Increment Sorting [ Full-Text ]
Oyelami M. Olufemi and Akinyemi I. Olawole
Quicksort is an algorithm that is most suitable for average-case scenarios. It has been refined to the extent that it is the sorting algorithm of choice in a wide variety of practical sorting applications. The most efficient refinement of Quicksort for the average case is the Median-of-Three Sort. There are, however, some average-case scenarios in which the Median-of-Three Sort algorithm is not so efficient. This paper presents an Improved Median-of-Three Sort which uses a modified diminishing increment sorting for better performance in these situations. The results of implementing this approach and comparing it experimentally with Quicksort and Median-of-Three Sort show that the Improved Median-of-Three Sort is more efficient in these scenarios.
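For reference, the Median-of-Three Sort that the paper improves upon can be sketched as follows; this is the plain refinement, without the proposed diminishing-increment modification:

```python
def median_of_three_quicksort(a, lo=0, hi=None):
    """Quicksort with median-of-three pivot selection: the pivot is the
    median of the first, middle and last elements, which avoids the
    worst case on already-sorted input. Sorts a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    mid = (lo + hi) // 2
    # Sort a[lo], a[mid], a[hi] so the median lands at a[mid].
    for i, j in ((lo, mid), (lo, hi), (mid, hi)):
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    a[mid], a[hi - 1] = a[hi - 1], a[mid]   # stash the pivot next to the end
    pivot, i = a[hi - 1], lo
    for j in range(lo, hi - 1):             # partition around the pivot
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi - 1] = a[hi - 1], a[i]       # put the pivot in its final slot
    median_of_three_quicksort(a, lo, i - 1)
    median_of_three_quicksort(a, i + 1, hi)
    return a

result = median_of_three_quicksort([5, 2, 9, 1, 5, 6])
```

Because the sampled median is rarely an extreme element, sorted and reverse-sorted inputs no longer trigger quadratic behaviour, which is the property the paper's modification extends to further average-case patterns.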
A Novel Genetic Algorithm for Dynamic Economic Dispatch of Power Generation [ Full-Text ]
This paper presents a new and efficient method for solving the dynamic economic dispatch (DED) problem. The main goal of this problem is to find the optimal combination of power outputs over a certain period of time while satisfying all system equality and inequality constraints. The proposed framework is based on a new genetic algorithm with meiosis-specific features that provides efficient global and local search characteristics. The feasibility and validity of the proposed approach are evaluated through numerical simulation of a five-generator system, and the results are compared with solutions obtained from the literature. The simulation results reveal the superiority of the proposed technique in solving the DED problem.
Watermarking Ancient Documents Based on the Selection of the Best Base of Wavelet Packets and a Convolutional ECC [ Full-Text ]
Mohamed Neji Maatouk, Anis Kricha and Najoua Essoukri Ben Amara
A digital library of ancient documents makes information accessible to everyone via the web and permits conserving, preserving and enhancing the value of cultural and scientific heritage. Nevertheless, in digital form these documents are threatened with being hacked, modified or even diffused illegally; as a consequence, we risk losing the intellectual property of these documents. To curb such fraud, watermarking represents a promising method of protecting these images. In this context, our work is part of the effort to protect ancient documents. In this paper, we propose a method of watermarking ancient documents based on the wavelet packet transform and on a convolutional error-correcting code. The insertion is performed in the coefficients of maximum amplitude in the best-basis decomposition, according to an entropy criterion. This method demonstrates noticeable signature invisibility and robustness against signal-processing attacks (noise, filtering and compression), as a first contribution to watermarking ancient documents.