UTHM Institutional Repository

Maximum total attribute relative of soft-set theory for efficient categorical data clustering

Mamat, Rabiei (2014) Maximum total attribute relative of soft-set theory for efficient categorical data clustering. PhD thesis, Universiti Tun Hussein Onn Malaysia.




Clustering a set of categorical data into homogeneous classes is a fundamental operation in data mining. A number of clustering algorithms have been proposed and have made important contributions to clustering, especially of categorical data. Unfortunately, most of these techniques are not designed to address the uncertainty inherent in categorical data, and handling that uncertainty is not an easy task. One way of handling uncertainty in categorical data clustering is to identify the partition attribute in the information system. With this approach, however, the computational cost is still a major issue and the resulting clusters are still dubious. This thesis therefore discusses the concept of attribute relative, which is based on soft-set theory, and introduces an alternative technique for partition attribute selection for use in categorical data clustering. The technique, called Maximum Total Attribute Relative (MTAR), is able to determine the partition attribute of a categorical information system at the category level without compromising computational cost, while at the same time enhancing the legitimacy of the resulting clusters. Experiments on sixteen (16) UCI-MLR benchmark datasets demonstrate the potential of MTAR to achieve lower computational time, with improvements of up to 90% compared to TR, MMR, MDA and NSS. The experiments also show that the objects in the clusters produced by MTAR have obvious similarities, and that the generated clusters have better object coverage while cluster validity increases by up to 23% in terms of entropy compared to MDA.

Item Type: Thesis (PhD)
Subjects: Q Science > QA Mathematics > QA273 Probabilities. Mathematical statistics
Depositing User: Normajihan Abd. Rahman
Date Deposited: 30 May 2016 08:04
Last Modified: 30 May 2016 08:04
URI: http://eprints.uthm.edu.my/id/eprint/8056
