UTHM Institutional Repository

A hybrid semantic similarity feature-based to support multiple ontologies

Omar, Nurul Aswa (2017) A hybrid semantic similarity feature-based to support multiple ontologies. PhD thesis, Universiti Tun Hussein Onn Malaysia.




Semantic similarity between concepts, words, and terms is of great importance in many applications dealing with textual data, such as Natural Language Processing (NLP). Semantic similarity is defined as the closeness of two concepts, based on the likeness of their meaning. Ontology-based measures are increasingly used to compute it, owing to their efficiency, scalability, lack of constraints and the availability of large ontologies. However, ontology-based semantic similarity is hampered by its dependence on the overall scope and level of detail of the background ontology. When only one ontology is exploited, this leads to insufficient knowledge, missing terms and inaccuracy. This limitation can be overcome by exploiting multiple ontologies. Semantic similarity with multiple ontologies potentially leads to better accuracy because the similarity of terms missing from one ontology can be calculated from the combination of multiple knowledge sources. This research develops a taxonomy of semantic similarity that contributes to understanding the current approaches, issues and data involved. It aims to propose and evaluate ontological features for semantic similarity with multiple ontologies. Additionally, it aims to develop and evaluate a feature-based mechanism (Hyb-TvX) for measuring semantic similarity with multiple ontologies, which can improve the accuracy of the similarity. This research used two benchmark datasets of biomedical concepts, from Pedersen and Hliaoutakis. Similarity values, correlation and p-values were used to evaluate the relationship between concept pairs across multiple ontologies. The findings indicate that using the semantic relationships of concepts (hypernym, hyponym, sister term and meronym) can improve the baseline method by up to 75%. In addition, the Hyb-TvX mechanism produces the highest correlation value of the three methods compared, 0.759, and this correlation is statistically significant.
Finally, the ability to discover similar concepts with multiple ontologies could also be exploited in domains other than biomedicine in future research.
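
As an illustration of the general idea of feature-based similarity over multiple ontologies, the following is a minimal sketch. It is not the Hyb-TvX mechanism, whose formula is not given in this abstract. It assumes a generic Tversky-style ratio model over sets of related terms (hypernyms, hyponyms, sister terms and meronyms merged into one feature set per concept) and assumes that scores from several ontologies are combined by taking the maximum. The toy ontologies `ONTO_A` and `ONTO_B`, the function names and the weights `alpha` and `beta` are all hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical representation: each ontology maps a concept to the set of
# terms related to it (hypernyms, hyponyms, sister terms, meronyms).
Ontology = Dict[str, Set[str]]

# Toy data invented for illustration only; not taken from the thesis.
ONTO_A: Ontology = {
    "heart": {"organ", "body part", "atrium", "ventricle"},
    "myocardium": {"muscle", "tissue", "body part"},
}
ONTO_B: Ontology = {
    "heart": {"organ", "cardiac structure", "heart tissue", "ventricle"},
    "myocardium": {"muscle", "heart tissue", "cardiac structure"},
    "kidney": {"organ", "renal structure"},  # concept missing from ONTO_A
}


def tversky_ratio(a: Set[str], b: Set[str],
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky-style ratio: shared features over shared plus weighted
    distinctive features. Returns 0.0 when both sets are empty."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def multi_ontology_similarity(c1: str, c2: str,
                              ontologies: List[Ontology]) -> Optional[float]:
    """Score the pair in every ontology that contains both concepts and
    keep the best score (an assumed combination rule). Returns None when
    no ontology covers the pair."""
    scores = [
        tversky_ratio(onto[c1], onto[c2])
        for onto in ontologies
        if c1 in onto and c2 in onto
    ]
    return max(scores) if scores else None


if __name__ == "__main__":
    both = [ONTO_A, ONTO_B]
    print(multi_ontology_similarity("heart", "myocardium", both))
    # "kidney" is absent from ONTO_A, so only ONTO_B contributes here.
    print(multi_ontology_similarity("heart", "kidney", both))
```

In this sketch, a pair such as "heart" and "kidney" cannot be scored with `ONTO_A` alone but is still scored when `ONTO_B` is added, which is the missing-term situation the abstract describes.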

Item Type: Thesis (PhD)
Subjects: Q Science > QA Mathematics > QA76 Computer software
Depositing User: Mr. Mohammad Shaifulrip Ithnin
Date Deposited: 24 Jun 2018 01:34
Last Modified: 24 Jun 2018 01:34
URI: http://eprints.uthm.edu.my/id/eprint/10184
