Zaman, Gohar and Mahdin, Hairulnizam and Hussain, Khalid and Rahman, Atta-ur- (2020) Information extraction from semi and unstructured data sources: a systematic literature review. ICIC Express Letters, 14 (6). pp. 593-603. ISSN 1881-803X
Abstract
Millions of structured, semi-structured and unstructured documents are produced around the globe on a daily basis. Sources of such documents include individuals as well as research societies such as IEEE, Elsevier, Springer and Wiley, through which large numbers of scientific documents are published. These documents are a huge resource of scientific knowledge for research communities and interested users around the world. However, due to their massive volume and varying formats, search engines have difficulty indexing them, which makes information retrieval inefficient, tedious and time-consuming. Information extraction from such documents is among the most active areas of research in data/text mining. As the number of such documents keeps growing rapidly, more sophisticated information extraction techniques are necessary. This research reviews and summarizes existing state-of-the-art techniques in information extraction to highlight their limitations. From these limitations, a research gap is formulated for researchers in the information extraction domain.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Information extraction; Semi structured; Unstructured documents; Digital libraries; Retrieval |
Subjects: | T Technology > T Technology (General) |
Divisions: | Faculty of Computer Science and Information Technology > Department of Information Security |
Depositing User: | Mr. Shahrul Ahmad Bakri |
Date Deposited: | 01 Mar 2022 01:14 |
Last Modified: | 01 Mar 2022 01:14 |
URI: | http://eprints.uthm.edu.my/id/eprint/6551 |