Kamarulzalis, Ahmad Haadzal (2018) An improved algorithm for iris classification by using support vector machine and binary random machine learning. Masters thesis, Universiti Tun Hussein Onn Malaysia.
Abstract
In machine learning, there are three types of learning that can be used in classification procedures for data mining: supervised learning, unsupervised learning and reinforcement learning. This study focuses on supervised learning and seeks to classify the Iris dataset into its three species (setosa, versicolor and virginica) so that the predicted classes reproduce the actual ones. The methods used are Support Vector Machine (SVM) with four kernel functions (linear, radial basis, sigmoid and polynomial), Random Forest (RF), k-Nearest Neighbors (k-NN) and Random Nearest Neighbors (RNN). The first objective of this study is to develop an improved algorithm for classification; the new algorithm combines ideas from the k-NN algorithm with the ensemble concept. The second objective is to carry out supervised, binary ensemble machine learning for classification, using RF and RNN, which share the same ensemble concept. The last objective is to identify the best model for classification. Performance measures, namely overall accuracy, kappa, average sensitivity, average specificity, average precision, average detection rate, average prevalence and misclassification error rate (MER), were computed from the confusion matrix values produced during data analysis, to assess both the average and the individual performance of each classifier. In addition, performance visualizations (stacked bar plot, fourfold plot, Receiver Operating Characteristic (ROC) curve and lollipop chart) are used to present each output more clearly. RNN has the highest accuracy, 98.67%, with a MER of just 1.33%, compared with the other classifiers. Therefore, RNN is preferable for supervised classification procedures.
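The performance measures named in the abstract can all be derived from a multi-class confusion matrix. The sketch below is an illustration only, not the thesis's code. It assumes the common one-vs-rest definitions (detection rate as TP/n, prevalence as (TP+FN)/n) and macro-averaging across the three species. The function name `confusion_metrics` and the matrix `example_cm` are hypothetical: the matrix is made up for demonstration, with 148 of 150 samples on the diagonal so that its accuracy works out to 98.67%, and it is not the confusion matrix reported in the thesis.

```python
from typing import Dict, List


def confusion_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged metrics from a square confusion matrix.

    Assumed convention: cm[i][j] counts samples whose true class is i
    and whose predicted class is j. Every class must appear at least once
    as a true label and as a prediction, otherwise a division by zero occurs.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    true_tot = [sum(cm[i]) for i in range(k)]                       # row sums
    pred_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # column sums

    accuracy = sum(cm[c][c] for c in range(k)) / n
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(true_tot[c] * pred_tot[c] for c in range(k)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)

    sens, spec, prec, det, prev = [], [], [], [], []
    for c in range(k):  # treat class c as "positive", all others as "negative"
        tp = cm[c][c]
        fn = true_tot[c] - tp
        fp = pred_tot[c] - tp
        tn = n - tp - fn - fp
        sens.append(tp / (tp + fn))
        spec.append(tn / (tn + fp))
        prec.append(tp / (tp + fp))
        det.append(tp / n)
        prev.append((tp + fn) / n)

    def mean(values: List[float]) -> float:
        return sum(values) / len(values)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "avg_sensitivity": mean(sens),
        "avg_specificity": mean(spec),
        "avg_precision": mean(prec),
        "avg_detection_rate": mean(det),
        "avg_prevalence": mean(prev),
        "mer": 1 - accuracy,  # misclassification error rate
    }


# Hypothetical matrix (rows: true setosa, versicolor, virginica).
example_cm = [
    [50, 0, 0],
    [0, 48, 2],
    [0, 0, 50],
]
metrics = confusion_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With this made-up matrix the accuracy is 148/150 and the MER is 2/150, which shows how an accuracy of 98.67% and a MER of 1.33% are two views of the same count of misclassified samples.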
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics |
Divisions: | Faculty of Applied Science and Technology > Department of Mathematics and Statistics |
Depositing User: | Miss Afiqah Faiqah Mohd Hafiz |
Date Deposited: | 21 Jul 2021 03:09 |
Last Modified: | 21 Jul 2021 03:09 |
URI: | http://eprints.uthm.edu.my/id/eprint/295 |