ROMJIST Volume 23, No. 1, 2020, pp. 55-68
Rajesh MAHULE, Ranjana VYAS, Om Prakash VYAS
Towards Knowledge Discovery in interlinked heterogeneous datasets of LOD cloud
ABSTRACT: In recent years, a huge volume of data has been published on the web as Linked Open Data (LOD). Using this interlinked collection of heterogeneous data with classical data mining methods is a substantial challenge, because those methods require input in the form of a propositional Feature Vector Table (FVT). To overcome this hurdle, this paper proposes a framework inspired by the Link Traversal Based Query Execution (LTBQE) paradigm. The framework dynamically extracts relevant features and builds an FVT from a set of interlinked RDF datasets in a local environment. The article also introduces a Content-Based similarity measure to evaluate the generated FVT. In addition, two representative data mining tasks are performed to evaluate the framework empirically; the results show that the generated FVT assists in learning from heterogeneous LOD datasets. The evaluation revealed some interesting patterns and also suggests an appropriate distance measure for handling the dimensionality of set-valued data attributes.
KEYWORDS: Linked Open Data, Semantic Web, Data mining, Knowledge discovery, Feature vector generation
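The idea of turning interlinked RDF data into a propositional table can be illustrated with a minimal Python sketch. It is not the authors' framework: the toy triples, the `ex:` identifiers, the `build_fvt` and `jaccard_distance` helpers, and the two-hop limit are all hypothetical, and Jaccard distance is used only as a common choice for set-valued attributes, since the abstract does not name the measure it recommends.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("ex:film1", "ex:director", "ex:personA"),
    ("ex:film1", "ex:genre", "Drama"),
    ("ex:film1", "ex:genre", "Crime"),
    ("ex:film2", "ex:director", "ex:personB"),
    ("ex:film2", "ex:genre", "Drama"),
    ("ex:personA", "ex:birthPlace", "ex:cityX"),
    ("ex:personB", "ex:birthPlace", "ex:cityY"),
]


def build_fvt(triples, instances, max_hops=2):
    """Build one row per instance by following links up to max_hops.

    Each feature is a predicate path; each value is the set of objects
    reached along that path, so attributes are set-valued.
    """
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    table = {}
    for inst in instances:
        row = defaultdict(set)
        frontier = [(inst, "")]  # (node, predicate path that led to it)
        for _ in range(max_hops):
            next_frontier = []
            for node, path in frontier:
                for p, o in by_subject.get(node, []):
                    feature = f"{path}/{p}" if path else p
                    row[feature].add(o)
                    # Objects that are not subjects (e.g. literals) simply
                    # yield no further triples on the next hop.
                    next_frontier.append((o, feature))
            frontier = next_frontier
        table[inst] = dict(row)
    return table


def jaccard_distance(a, b):
    """Jaccard distance between two sets; two empty sets count as identical."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


fvt = build_fvt(TRIPLES, ["ex:film1", "ex:film2"])
genre_distance = jaccard_distance(
    fvt["ex:film1"]["ex:genre"], fvt["ex:film2"]["ex:genre"]
)
```

In this sketch the second hop adds the feature `ex:director/ex:birthPlace` to each film's row, which is the kind of attribute that only becomes available by traversing links into another dataset.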