Why integrate data?
In practical terms, data integration means blending data from different sources so that it becomes more useful and valuable than it was before. Blending multi-source data also helps identify discrepancies and gaps. Data integration combines technical and business processes.
Integration does not necessarily mean moving data from one repository to another, or from multiple repositories into one. The purpose is to make data comprehensive, error-free, and more usable.
Some methods bring data together into an integrated view, while other techniques bring data together physically to create an integrated version. Both can be considered types of data integration; the difference is whether the data was physically moved or manipulated. Below are the common data integration approaches:
- Data consolidation
- Data propagation
- Data virtualization
- Data federation
- Data warehousing (Mastered Data)
Reliable Inc. will access your end-to-end data infrastructure, virtually integrate multi-source data sets, and clean and synchronize them. Our techniques address both structured and unstructured data assets. We will apply one approach or a combination of approaches, depending on the need.
#1 Data Consolidation
Data consolidation physically brings data together from several separate systems, creating a version of the consolidated data in one data store. Often the goal of data consolidation is to reduce the number of data storage locations. Extract, transform, and load (ETL) technology supports data consolidation.
ETL pulls data from sources, transforms it into an understandable format, and then transfers it to another database or data warehouse. The ETL process cleans, filters, and transforms the data, and applies business rules before the data populates the target store.
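The extract, transform, and load steps can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `CRM_ROWS` and `BILLING_ROWS` lists stand in for real source systems, the `customers` table is an assumed target, and the "drop rows with no email" rule is only an example of a business rule.

```python
import sqlite3

# Hypothetical source records, standing in for rows pulled from two systems.
CRM_ROWS = [
    {"id": "1", "email": " Ada@Example.com ", "country": "uk"},
    {"id": "2", "email": "", "country": "US"},
]
BILLING_ROWS = [
    {"id": "3", "email": "grace@example.com", "country": "us"},
]


def extract():
    """Pull raw rows from every source (here, in-memory lists)."""
    return CRM_ROWS + BILLING_ROWS


def transform(rows):
    """Clean, filter, and normalize rows before loading."""
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if not email:  # example business rule: drop rows with no email
            continue
        cleaned.append((int(row["id"]), email, row["country"].upper()))
    return cleaned


def load(rows, conn):
    """Write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, email TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three steps in order and return the consolidated rows."""
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT id, email, country FROM customers ORDER BY id"
    ).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` consolidates both sources into a single in-memory store. A production pipeline would typically add error handling, incremental loads, and logging.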
#2 Data Propagation
Data propagation is the use of applications to copy data from one location to another. It is event-driven and can be done synchronously or asynchronously. Most synchronous data propagation supports two-way data exchange between the source and the target. Enterprise application integration (EAI) and enterprise data replication (EDR) technologies support data propagation.
EAI integrates application systems for the exchange of messages and transactions. It is often used for real-time business transaction processing. Integration platform as a service (iPaaS) is a modern approach to EAI.
EDR typically transfers large amounts of data between databases rather than between applications. Database triggers and logs are used to capture and disseminate data changes between the source and remote databases.
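As an illustration of trigger-based change capture, here is a small sketch using SQLite from Python. It is not how any particular EDR product works; the `accounts` and `change_log` tables, the trigger names, and the `propagate` helper are all assumptions made for the example. It shows asynchronous propagation: changes are logged when they happen and replayed to the replica later.

```python
import sqlite3


def make_source():
    """Create a source database whose triggers log every change."""
    src = sqlite3.connect(":memory:")
    src.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
        CREATE TABLE change_log (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            account_id INTEGER,
            balance INTEGER
        );
        CREATE TRIGGER accounts_ins AFTER INSERT ON accounts
        BEGIN
            INSERT INTO change_log (account_id, balance)
            VALUES (NEW.id, NEW.balance);
        END;
        CREATE TRIGGER accounts_upd AFTER UPDATE ON accounts
        BEGIN
            INSERT INTO change_log (account_id, balance)
            VALUES (NEW.id, NEW.balance);
        END;
    """)
    return src


def make_replica():
    """Create the remote database that receives propagated changes."""
    rep = sqlite3.connect(":memory:")
    rep.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    return rep


def propagate(src, rep, last_seq=0):
    """Replay change events newer than last_seq; return the new high-water mark."""
    events = src.execute(
        "SELECT seq, account_id, balance FROM change_log "
        "WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    for seq, account_id, balance in events:
        rep.execute(
            "INSERT OR REPLACE INTO accounts (id, balance) VALUES (?, ?)",
            (account_id, balance),
        )
        last_seq = seq
    rep.commit()
    return last_seq
```

Passing the returned sequence number back into the next `propagate` call means only new changes are copied. A synchronous variant would apply each change to the replica as part of the originating transaction instead of replaying a log afterwards.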
#3 Data Virtualization
Virtualization uses an interface to provide a near real-time, unified view of data from disparate sources with different data models. Data can be viewed in one location but is not stored in that single location. Data virtualization retrieves and interprets data, but does not require uniform formatting or a single point of access.
#4 Data Federation
Federation is technically a form of data virtualization. It uses a virtual database and creates a common data model for heterogeneous data from different systems. Data is brought together and viewable from a single point of access. Enterprise information integration (EII) is a technology that supports data federation. It uses data abstraction to provide a unified view of data from different sources. That data can then be presented or analyzed in new ways through applications. Data virtualization and federation are good workarounds for situations where data consolidation is cost-prohibitive or would cause too many security and compliance issues.
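The idea of a common data model over heterogeneous sources can be sketched in a few lines of Python. This is a toy illustration, not an EII product: the `FederatedCustomers` class, the `contacts` table, and the `billing` record layout are all hypothetical. The point is that the unified view is computed at query time and nothing is copied into a central store.

```python
import sqlite3


def make_sources():
    """Build two hypothetical sources with different data models."""
    # Source 1: a relational table.
    crm = sqlite3.connect(":memory:")
    crm.execute("CREATE TABLE contacts (contact_id INTEGER, full_name TEXT)")
    crm.execute("INSERT INTO contacts VALUES (1, 'Ada Lovelace')")
    # Source 2: nested records, standing in for an API or document store.
    billing = [{"custId": 2, "name": {"first": "Grace", "last": "Hopper"}}]
    return crm, billing


class FederatedCustomers:
    """Single point of access; the data itself stays in the sources."""

    def __init__(self, crm, billing):
        self.crm = crm
        self.billing = billing

    def all(self):
        """Query each source now and map its rows to the common model."""
        rows = [
            {"id": cid, "name": name, "source": "crm"}
            for cid, name in self.crm.execute(
                "SELECT contact_id, full_name FROM contacts"
            )
        ]
        rows += [
            {
                "id": rec["custId"],
                "name": f"{rec['name']['first']} {rec['name']['last']}",
                "source": "billing",
            }
            for rec in self.billing
        ]
        return sorted(rows, key=lambda row: row["id"])
```

Because `all()` reads the sources on every call, a row added to either source appears in the next result without any synchronization step. The trade-off is that every query pays the cost of reaching each underlying system.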
#5 Data Warehousing
Warehousing is included in this list because it is a commonly used term, although its meaning is more generic than that of the other methods mentioned above. Data warehouses are storage repositories for data. When the term "data warehousing" is used, however, it implies the cleansing, reformatting, and storage of data, which is essentially data integration.