An improved algorithm for iris classification by using support vector machine and binary random machine learning

Kamarulzalis, Ahmad Haadzal (2018) An improved algorithm for iris classification by using support vector machine and binary random machine learning. Masters thesis, Universiti Tun Hussein Onn Malaysia.


In machine learning, three branches of learning can be used in classification procedures for data mining: supervised learning, unsupervised learning and reinforcement learning. This study focuses on supervised learning and seeks to classify the Iris dataset into its three species (setosa, versicolor and virginica), so that the predicted labels match the actual dataset, using Support Vector Machine (SVM) with four different kernel functions (linear, radial basis, sigmoid and polynomial), Random Forest (RF), k-Nearest Neighbors (k-NN) and Random Nearest Neighbors (RNN). The first objective of this study is to develop an improved algorithm for classification. The new algorithm combines the ideas of the k-NN algorithm and the ensemble concept. The second objective is to apply supervised and binary ensemble machine learning techniques to classification. This is done using RF and RNN, which share the same ensemble concept. The last objective is to identify the best model for classification. Performance measures, namely overall accuracy, kappa, average sensitivity, average specificity, average precision, average detection rate, average prevalence and misclassification error rate (MER), were computed from the confusion matrix of each classifier to assess both average and individual performance. In addition, performance visualizations such as the stacked bar plot, fourfold plot, Receiver Operating Characteristic (ROC) curve and lollipop chart are used to present each output more clearly. RNN achieved the highest accuracy, 98.67%, with a misclassification error rate (MER) of only 1.33%, compared with the other classifiers. Therefore, RNN is preferable for supervised classification.
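The performance measures listed in the abstract can all be derived from a multi-class confusion matrix. The following is a minimal sketch in plain Python, not the thesis's own code. The function name `confusion_metrics` and the example matrix are hypothetical. The per-class definitions follow the usual one-vs-rest convention, which is assumed here rather than taken from the thesis. The sketch also assumes every class appears at least once among both the true and the predicted labels.

```python
from typing import Dict, List


def confusion_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Summarise a square confusion matrix (rows = true class, columns = predicted class).

    Assumption: per-class measures are computed one-vs-rest and then averaged
    over classes; this is a common convention, not confirmed by the thesis.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]                        # true counts per class
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts per class

    accuracy = sum(cm[i][i] for i in range(k)) / n
    # Cohen's kappa: agreement beyond what the marginal totals would give by chance.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    sens, spec, prec, det, prev = [], [], [], [], []
    for i in range(k):
        tp = cm[i][i]
        fn = row_tot[i] - tp
        fp = col_tot[i] - tp
        tn = n - tp - fn - fp
        sens.append(tp / (tp + fn))   # sensitivity (recall)
        spec.append(tn / (tn + fp))   # specificity
        prec.append(tp / (tp + fp))   # precision
        det.append(tp / n)            # detection rate
        prev.append((tp + fn) / n)    # prevalence

    def mean(xs: List[float]) -> float:
        return sum(xs) / len(xs)

    return {
        "accuracy": accuracy,
        "mer": 1 - accuracy,          # misclassification error rate
        "kappa": kappa,
        "avg_sensitivity": mean(sens),
        "avg_specificity": mean(spec),
        "avg_precision": mean(prec),
        "avg_detection_rate": mean(det),
        "avg_prevalence": mean(prev),
    }


if __name__ == "__main__":
    # Hypothetical matrix for illustration only; NOT the thesis's reported output.
    example = [
        [50, 0, 0],   # setosa
        [0, 49, 1],   # versicolor
        [0, 1, 49],   # virginica
    ]
    for name, value in confusion_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

With this illustrative matrix, 148 of 150 observations are classified correctly, so the accuracy is 148/150 (about 98.67%) and the MER is 2/150 (about 1.33%). These values simply show how the two quantities in the abstract complement each other. The actual confusion matrix from the thesis is not reproduced here.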

Item Type: Thesis (Masters)
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics
Divisions: Faculty of Applied Science and Technology > Department of Mathematics and Statistics
Depositing User: Miss Afiqah Faiqah Mohd Hafiz
Date Deposited: 21 Jul 2021 03:09
Last Modified: 21 Jul 2021 03:09
