Performance Analysis of NBMU & SMO Classification Techniques used for Textual Information

By:

Contributor(s): Department of Computer System Engineering

Material type: Text
Publication details: Nawabshah: QUEST, 2016.
Description: 60 p.


Holdings
Item type	Current library	Status	Barcode
Thesis and Dissertation	Research Section	Available	MP/26-276


ABSTRACT

There is a huge amount of data in various formats on the internet. Classifying millions of text documents manually is impractical because it demands considerable time and resources, so automatic text classification is widely used to organize text. In this research, two classification techniques, Naive Bayes Multinomial Updateable (NBMU) and Sequential Minimal Optimization (SMO), were applied to a dataset. With the NBMU classifier, the combination of the Rainbow Stopword remover (R) and the Snowball Stemmer (SS) yielded the maximum accuracy (83%) while taking nominal time (0.07 sec) compared with the other combinations of stemmers and stopword removers. The SMO classifier reached its highest accuracy (80%) with three different combinations of stopword removers and stemmers: Wordsfromfile Stopword and Lovins Stemmer (WFF_LS), Regexpfromfile Stopword and Lovins Stemmer (REFF_LS), and Regexpfromfile Stopword and Snowball Stemmer (REFF_SS). However, the time SMO took to build the model was significantly higher (500 to 1000 times higher). Based on these results, the R and SS combination of stopword remover and stemmer, respectively, performs best in terms of accuracy among the selected combinations for the NBMU classifier. Across the combinations, the accuracy of the SMO classifier is higher on average than that of the NBMU classifier, while the NBMU classifier takes significantly less time. Nevertheless, SMO is suggested for text classification: its overall accuracy is significantly higher, and because text classification is a difficult task, the difference in time between NBMU and SMO becomes negligible and is compensated by the better accuracy.
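The pipeline the abstract describes (stopword removal, stemming, classification, then measuring accuracy and model-building time) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the thesis's actual setup: the stopword list, the suffix-stripping rules, the class name UpdateableMultinomialNB and the toy documents are all hypothetical stand-ins, and the stemmer here is far simpler than Snowball or Lovins. It covers only a Naive Bayes style classifier with incremental updates; SMO is not shown.

```python
import math
import time
from collections import Counter, defaultdict

# Hypothetical stopword list and suffix rules, for illustration only;
# they are NOT the Rainbow list or the Snowball/Lovins algorithms.
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in", "on"}
SUFFIXES = ("ing", "ed", "s")


def preprocess(text):
    """Lowercase, drop stopwords, and strip at most one suffix per token."""
    tokens = []
    for word in text.lower().split():
        if word in STOPWORDS:
            continue
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


class UpdateableMultinomialNB:
    """Multinomial Naive Bayes whose counts are updated one document at a time."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()

    def update(self, text, label):
        tokens = preprocess(text)
        self.class_docs[label] += 1
        self.word_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, text):
        tokens = preprocess(text)
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                # Laplace (add-one) smoothing for unseen tokens
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this sketch.
TRAIN = [
    ("the team scored in the final match", "sport"),
    ("players trained for the match", "sport"),
    ("the bank raised interest rates", "finance"),
    ("investors traded shares on the market", "finance"),
]
TEST = [
    ("the players scored", "sport"),
    ("the market rates", "finance"),
]

start = time.perf_counter()
model = UpdateableMultinomialNB()
for text, label in TRAIN:
    model.update(text, label)
build_seconds = time.perf_counter() - start

predictions = [model.predict(text) for text, _ in TEST]
accuracy = sum(p == y for p, (_, y) in zip(predictions, TEST)) / len(TEST)
print(f"accuracy={accuracy:.2f} build_time={build_seconds:.6f}s")
```

Swapping in a different stopword set or suffix list and re-running the timing and accuracy measurement gives a rough analogue of comparing stopword remover and stemmer combinations, though on data this small the numbers carry no meaning.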
