A stylometry approach for blind linguistic steganalysis model against translation-based steganography

Mohd Lokman, Syiham (2023) A stylometry approach for blind linguistic steganalysis model against translation-based steganography. Masters thesis, Universiti Tun Hussein Onn Malaysia.

Abstract

Steganography is the art of hiding information in ways that prevent the detection of a secret message. In Translation-based Steganography (TBS), secret messages are encoded in the “noise” produced by the automated translation of natural language text. The adversarial technique used to extract the secret message is called steganalysis, which can be categorized into two types: targeted and blind. While targeted steganalysis is designed to attack a specific embedding algorithm, blind steganalysis uses features extracted or selected from the medium to detect anomalies indicating that secret data may have been embedded within it. However, the accuracy of blind steganalysis algorithms depends heavily on the features selected from the input data, especially when attacking the embedding techniques used in TBS. This thesis explores the potential of using stylometry, or linguistic style, to improve the representation of word-distribution characteristics when distinguishing stego text from cover text in TBS. This is because all translated text in TBS has an intrinsic structural style that can be used to improve the performance of a blind steganalysis model. The proposed stylometry-based blind steganalysis model consists of two stages: stylometric feature selection and classification. The stylometric features selected from a set of cover texts are categorized into two groups, lexical and syntactic features, before being fed into the model, which uses a Support Vector Machine (SVM) as the classifier. The performance of the stylometry-based blind steganalysis model is then evaluated in terms of false rate, missing rate and accuracy rate, and compared against three other standard classifiers in steganalysis: Naive Bayes (NB), k-Nearest Neighbor (k-NN), and Decision Tree (J48). The results showed that the stylometric features improve a blind steganalysis model by giving higher detection performance. Meanwhile, SVM is the best classifier for stego text detection, with significantly low processing time.
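
As a rough illustration of the two stages and the evaluation measures named in the abstract, the following Python sketch computes a few common lexical stylometric features and the three detection rates. It is not the thesis's implementation: the feature set, the FUNCTION_WORDS list, and the function names are hypothetical, and the definitions used here (false rate as the share of cover texts flagged as stego, missing rate as the share of stego texts that go undetected) are assumptions, since the abstract does not define them. The classifier itself is omitted; any SVM implementation could consume the feature values.

```python
import re
from collections import Counter

# Hypothetical function-word list; the abstract does not enumerate its features.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "a", "is", "that")


def lexical_features(text):
    """Return a few illustrative lexical stylometric features for one text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {"avg_word_len": 0.0, "type_token_ratio": 0.0,
                "function_word_ratio": 0.0}
    counts = Counter(words)
    return {
        # Mean number of characters per word.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Vocabulary richness: distinct words divided by total words.
        "type_token_ratio": len(counts) / len(words),
        # Share of words that belong to the function-word list.
        "function_word_ratio": sum(counts[w] for w in FUNCTION_WORDS) / len(words),
    }


def detection_rates(y_true, y_pred):
    """Compute false rate, missing rate and accuracy (1 = stego, 0 = cover).

    Assumed definitions: false rate = FP / (FP + TN),
    missing rate = FN / (FN + TP), accuracy = (TP + TN) / total.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "false_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "missing_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


if __name__ == "__main__":
    print(lexical_features("The cat sat on the mat"))
    # Three stego texts and five cover texts with made-up predictions.
    print(detection_rates([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1]))
```

Syntactic features (for example part-of-speech patterns) would require a tagger and are left out of this sketch to keep it dependency-free.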

Item Type: Thesis (Masters)
Subjects: T Technology > T Technology (General)
Depositing User: Mrs. Sabarina Che Mat
Date Deposited: 20 May 2024 01:36
Last Modified: 20 May 2024 01:36
URI: http://eprints.uthm.edu.my/id/eprint/10995
