Rohidin, Dede (2023) An improved associative classification model using fuzzy parameterized soft set-based decision for text classification. Doctoral thesis, Universiti Tun Hussein Onn Malaysia.
Text: 24p DEDE ROHIDIN.pdf (1MB)
Text (Copyright Declaration): DEDE ROHIDIN COPYRIGHT DECLARATION.pdf (820kB), restricted to repository staff only
Text (Full Text): DEDE ROHIDIN WATERMARK.pdf (5MB), restricted to registered users only
Abstract
Text classification is applicable in various problem domains, including marketing, security, and biomedicine. One promising text classifier is the well-known associative classification approach. However, existing associative classification approaches still have limitations, especially when text classification produces too many rules. Some of the rules generated from textual data may be irrelevant or redundant, resulting in low performance on imbalanced and class-overlapping data. Therefore, this research proposes an improved associative classification approach that enhances the performance and efficiency of text classification by removing irrelevant rules, reducing redundant rules, and handling imbalance and class overlap in textual data. The proposed approach consists of three stages: pre-processing, fuzzification, and classification. In the classification stage, this study integrates principles of fuzzy soft set theory into associative rules; the resulting method is referred to as Class-Based Fuzzy Soft Associative (CBFSA). The experiments used the 20 Newsgroups (balanced) and Reuters-21578 (imbalanced) datasets to evaluate the proposed model. The results show that CBFSA succeeds in removing irrelevant rules and reducing redundant ones. The CBFSA classifier applies fewer rules than Classification Based on Associations (CBA) and Classification based on Predictive Association Rules (CPAR). CBFSA also handles imbalanced and class-overlapping data successfully, and it achieves higher performance and runs faster than CBA and CPAR. Meanwhile, compared with several non-associative classifiers, CBFSA improves the F1-measure by between 6% and 32%. The processing time of CBFSA is shorter than that of RNN and CNN but slightly longer than that of Decision Tree, k-NN, Naïve Bayes, Rocchio, Bagging, and Boosting.
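The three stages named in the abstract (pre-processing, fuzzification, classification) can be illustrated with a toy sketch. This is not the CBFSA method from the thesis: it performs no association rule mining, and every function name, the max-count fuzzification, and the fuzzy Jaccard similarity are assumptions chosen only to show how term weights in [0, 1] might feed a class-based fuzzy soft set comparison.

```python
from collections import Counter, defaultdict


def preprocess(text):
    """Stage 1 (assumed): lowercase, split on whitespace, drop tokens of 1-2 characters."""
    return [tok for tok in text.lower().split() if len(tok) > 2]


def fuzzify(tokens):
    """Stage 2 (assumed): turn term counts into membership degrees in [0, 1]
    by dividing each count by the largest count in the document."""
    counts = Counter(tokens)
    if not counts:
        return {}
    peak = max(counts.values())
    return {term: n / peak for term, n in counts.items()}


def train(docs):
    """Build one fuzzy soft set per class: the mean membership degree of each
    term over all documents of that class (absent terms count as 0)."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = Counter()
    for text, label in docs:
        sizes[label] += 1
        for term, mu in fuzzify(preprocess(text)).items():
            sums[label][term] += mu
    return {
        label: {term: total / sizes[label] for term, total in terms.items()}
        for label, terms in sums.items()
    }


def similarity(a, b):
    """Fuzzy Jaccard similarity: sum of minima divided by sum of maxima."""
    terms = set(a) | set(b)
    num = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    den = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return num / den if den else 0.0


def classify(model, text):
    """Stage 3 (assumed): pick the class whose fuzzy soft set is most similar."""
    doc = fuzzify(preprocess(text))
    return max(model, key=lambda label: similarity(model[label], doc))


# Hypothetical toy corpus, not data from the thesis.
training_docs = [
    ("the stock market fell sharply today", "finance"),
    ("investors sold stock and bonds", "finance"),
    ("the team won the football match", "sport"),
    ("football players train every day", "sport"),
]
model = train(training_docs)
```

With this toy model, `classify(model, "stock prices and bonds")` selects the `finance` class because the query shares no terms with the `sport` documents. A real associative classifier would instead mine class association rules and prune them, which this sketch deliberately leaves out.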
Item Type: | Thesis (Doctoral) |
---|---|
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Computer Science and Information Technology > Department of Information Security |
Depositing User: | Mrs. Sabarina Che Mat |
Date Deposited: | 13 May 2024 07:05 |
Last Modified: | 13 May 2024 07:05 |
URI: | http://eprints.uthm.edu.my/id/eprint/10825 |