An Extensible Approach for Materialized Big Data Integration in Distributed Computation Environments

Author(s): Vladimir Sazontev, Sergey Stupnikov
Published: Proc. 2019 Ivannikov Memorial Workshop (IVMEM), pp. 33-38. IEEE, 2019
Abstract:
The modern IT world requires data integration systems to handle large numbers of heterogeneous data sources. Such systems should perform not only data extraction but also schema alignment, entity resolution, and data fusion. In the big data setting, a number of methods address various aspects of integration to make such systems more automatic and less user-dependent. This work proposes an extensible approach to developing a data integration system that performs materialized integration of heterogeneous sources in a distributed computation environment. A prototype of the system, implementing advanced methods for big data integration, has been developed. The system is applied in the e-commerce domain.
Download: [ https://ieeexplore.ieee.org/document/8880743 ]

Supported by Synthesis Group