UTHM Institutional Repository

An improved algorithm for iris classification by using support vector machine and binary random machine learning

Kamarulzalis, Ahmad Haadzal (2018) An improved algorithm for iris classification by using support vector machine and binary random machine learning. Masters thesis, Universiti Tun Hussein Onn Malaysia.

Abstract

In machine learning, three branches of learning can be used in classification procedures for data mining: supervised learning, unsupervised learning and reinforcement learning. This study focuses on supervised learning and seeks to classify the Iris dataset into its three species (setosa, versicolor and virginica), so that the predicted classes mimic those of the actual dataset. The methods used are Support Vector Machine (SVM) with four different kernel functions (Linear, Radial Basis, Sigmoid and Polynomial), Random Forest (RF), k-Nearest Neighbors (k-NN) and Random Nearest Neighbors (RNN). The first objective of this study is to develop an improved algorithm for classification. The new algorithm combines the ideas of the k-NN algorithm and the ensemble concept. The second objective is to carry out supervised and binary ensemble machine learning techniques for classification, using RF and RNN, which share the same ensemble concept. The last objective is to identify the best model for classification procedures. Performance measurement tools, namely overall accuracy, kappa, average sensitivity, average specificity, average precision, average detection rate, average prevalence and misclassification error rate (MER), were computed from the confusion matrix output during data analysis to assess the average and individual performance of each classifier. In addition, performance visualizations such as the Stacked Bar Plot, Fourfold Plot, Receiver Operating Characteristic (ROC) Curve and Lollipop Chart were used to present each output more clearly. RNN achieved the highest accuracy, 98.67%, with a misclassification error rate (MER) of only 1.33%, compared with the other classifiers. Therefore, RNN is preferable for supervised learning classification procedures.
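
As a rough illustration of how several of the measures named in the abstract follow from a confusion matrix, the sketch below computes overall accuracy, MER, kappa, and the class-averaged sensitivity, specificity and precision for a three-class problem. The function name `confusion_metrics` and the example matrix are assumptions made for illustration only. The matrix is invented (148 of 150 samples on the diagonal) and is not the confusion matrix reported in the thesis, whose evaluation protocol is not stated in the abstract.

```python
def confusion_metrics(cm):
    """Summary measures from a square confusion matrix.

    cm[i][j] is the number of samples whose actual class is i and whose
    predicted class is j. Assumes every class is predicted at least once,
    so no denominator below is zero.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]                        # actual counts
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts

    accuracy = sum(cm[i][i] for i in range(k)) / n
    mer = 1.0 - accuracy  # misclassification error rate

    # Cohen's kappa: agreement beyond what the marginals alone would give.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected)

    sens, spec, prec = [], [], []
    for i in range(k):  # one-vs-rest counts for class i
        tp = cm[i][i]
        fn = row_tot[i] - tp
        fp = col_tot[i] - tp
        tn = n - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
        prec.append(tp / (tp + fp))

    return {
        "accuracy": accuracy,
        "mer": mer,
        "kappa": kappa,
        "avg_sensitivity": sum(sens) / k,
        "avg_specificity": sum(spec) / k,
        "avg_precision": sum(prec) / k,
    }


# Hypothetical matrix (rows: actual setosa, versicolor, virginica).
# Invented for illustration; NOT taken from the thesis.
example_cm = [
    [50, 0, 0],
    [0, 48, 2],
    [0, 0, 50],
]

for name, value in confusion_metrics(example_cm).items():
    print(f"{name}: {value:.4f}")
```

With this invented matrix, accuracy is 148/150 (about 98.67%) and MER is 2/150 (about 1.33%), which shows how the two figures quoted in the abstract are complements of each other.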

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > QA297 Numerical analysis. Analysis
Divisions: Faculty of Applied Science and Technology > Department of Mathematics and Statistic
Depositing User: Sabarina Che Mat
Date Deposited: 02 Feb 2020 04:02
Last Modified: 02 Feb 2020 04:02
URI: http://eprints.uthm.edu.my/id/eprint/12094