Text categorization based on fuzzy soft set theory

Handaga, Bana and Mat Deris, Mustafa (2012) Text categorization based on fuzzy soft set theory. In: ICCSA'12: Proceedings of the 12th International Conference on Computational Science and Its Applications, 18-21 June 2012, Bahia, Brazil.

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1007/978-3-642-31128-4_25

Abstract

In this paper, we propose a new method for text categorization based on fuzzy soft set theory, called the fuzzy soft set classifier (FSSC). We use a fuzzy soft set representation derived from the bag-of-words representation and define each term as a distinct word in the set of words of the document collection. The FSSC categorizes each document using the fuzzy c-means formula for classification and uses fuzzy soft set similarity to measure the distance between two documents. We perform experiments on the standard Reuters-21578 dataset using three kinds of weighting (boolean, term frequency, and term frequency-inverse document frequency) to compare the performance of the FSSC with four other classifiers: kNN, Bayesian, Rocchio, and SVM. We use precision, recall, F-measure, return-size, and running time for performance evaluation. The results show that there is no absolute winner. The FSSC has lower precision, recall, and F-measure than SVM and kNN, but it runs faster than both. Compared with the Bayesian and Rocchio classifiers, the FSSC runs more slowly but has higher precision and F-measure.
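
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the per-document normalisation to [0, 1], the use of a mean vector as each class centre, and the similarity measure 1 - sum|a - b| / sum(a + b) are all assumptions chosen for illustration, and every function name below is hypothetical.

```python
import math
from collections import Counter, defaultdict


def membership(tokens, vocab, scheme, df, n_docs):
    """Turn a tokenised document into membership degrees in [0, 1].

    Assumption: weights are scaled by the document's largest weight so
    they can be read as fuzzy membership values.
    """
    tf = Counter(tokens)
    if scheme == "boolean":
        row = [1.0 if tf[t] else 0.0 for t in vocab]
    elif scheme == "tf":
        row = [float(tf[t]) for t in vocab]
    elif scheme == "tfidf":
        # df[t] >= 1 for every vocabulary term, since vocab comes from training data
        row = [tf[t] * math.log(n_docs / df[t]) for t in vocab]
    else:
        raise ValueError(f"unknown weighting scheme: {scheme}")
    peak = max(row, default=0.0)
    return [w / peak if peak > 0 else 0.0 for w in row]


def similarity(a, b):
    """Assumed similarity: 1 - sum|a_i - b_i| / sum(a_i + b_i)."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        return 1.0  # two all-zero vectors are treated as identical
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / total


def train(labelled_docs, scheme="tf"):
    """Build one centre per class as the mean membership vector (an assumption)."""
    docs = [tokens for tokens, _ in labelled_docs]
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    grouped = defaultdict(list)
    for tokens, label in labelled_docs:
        grouped[label].append(membership(tokens, vocab, scheme, df, len(docs)))
    centres = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }
    return {"vocab": vocab, "df": df, "n": len(docs),
            "scheme": scheme, "centres": centres}


def classify(model, tokens):
    """Assign the class whose centre is most similar to the document."""
    vec = membership(tokens, model["vocab"], model["scheme"],
                     model["df"], model["n"])
    return max(model["centres"],
               key=lambda c: similarity(vec, model["centres"][c]))


# Toy data, invented for illustration only (not taken from Reuters-21578).
TOY = [
    (["oil", "price", "oil"], "energy"),
    (["oil", "barrel"], "energy"),
    (["wheat", "crop"], "grain"),
    (["wheat", "export", "crop"], "grain"),
]

if __name__ == "__main__":
    for scheme in ("boolean", "tf", "tfidf"):
        model = train(TOY, scheme)
        print(scheme, classify(model, ["oil", "barrel", "price"]))
```

Because classification compares a document against one centre per class rather than against every training document, a scheme of this shape needs fewer comparisons than kNN, which is in line with the running-time claim in the abstract; the sketch itself makes no claim about accuracy.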

Item Type: Conference or Workshop Item (Paper)
Uncontrolled Keywords: bag-of-words; fuzzy soft set theory; text classification
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: Faculty of Science Computer and Information Technology > Department of Software Engineering
ID Code: 3585
Deposited By: Normajihan Abd. Rahman
Deposited On: 15 Apr 2013 13:31
Last Modified: 15 Apr 2013 13:31
