Data reconciliation for big data environments

Publication date: 10/08/2023
Source: WIPO "BigData"
Apparatus and methods for reconciling data in a big data environment are provided. Methods may receive a first data set and a second data set for reconciliation. Methods may identify a first set of metadata associated with the first data set and a second set of metadata associated with the second data set. Methods may include a data reconciliation algorithm. The algorithm may compare the first set of metadata to the second set of metadata to obtain a subset of data within the first data set and a subset of data within the second data set that are joinable. Methods may dynamically construct one or more SQL queries to identify any discrepancies between the first data set and the second data set, and may then execute those queries.
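The flow described in the abstract (compare metadata, find the joinable subset, build SQL dynamically, execute it) can be illustrated with a minimal sketch. This is not the method claimed in the filing, only one possible reading of it. It assumes that "metadata" means column names and declared types, that "joinable" means a column appears in both data sets with the same type, and that both data sets are tables in one SQLite database. The function names, the `record_key` and `issue` column labels, and the `ledger_a`/`ledger_b` tables are hypothetical.

```python
import sqlite3


def column_metadata(conn, table):
    """Return {column name: declared type} for a table (the 'metadata' here)."""
    # Table names are interpolated directly, so they are assumed to be trusted.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2].upper() for row in rows}


def joinable_columns(meta_first, meta_second):
    """Columns present in both data sets with the same declared type."""
    return sorted(
        name for name, col_type in meta_first.items()
        if meta_second.get(name) == col_type
    )


def build_discrepancy_query(first, second, key, columns):
    """Construct one SQL statement listing the keys that fail to reconcile."""
    compared = [c for c in columns if c != key]
    # IS NOT is SQLite's NULL-safe inequality; "0" means nothing to compare.
    differs = " OR ".join(f'a."{c}" IS NOT b."{c}"' for c in compared) or "0"
    on = f'a."{key}" = b."{key}"'
    return f"""
        SELECT a."{key}" AS record_key, 'mismatch' AS issue
        FROM "{first}" AS a JOIN "{second}" AS b ON {on}
        WHERE ({differs})
        UNION ALL
        SELECT a."{key}", 'missing_in_second'
        FROM "{first}" AS a LEFT JOIN "{second}" AS b ON {on}
        WHERE b."{key}" IS NULL
        UNION ALL
        SELECT b."{key}", 'missing_in_first'
        FROM "{second}" AS b LEFT JOIN "{first}" AS a ON {on}
        WHERE a."{key}" IS NULL
        ORDER BY 1, 2
    """


def reconcile(conn, first, second, key):
    """Compare metadata, build the query from the joinable columns, run it."""
    shared = joinable_columns(
        column_metadata(conn, first), column_metadata(conn, second)
    )
    if key not in shared:
        raise ValueError(f"key column {key!r} is not joinable")
    query = build_discrepancy_query(first, second, key, shared)
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ledger_a (id INTEGER, name TEXT, amount REAL);
        CREATE TABLE ledger_b (id INTEGER, name TEXT, amount REAL, note TEXT);
        INSERT INTO ledger_a VALUES (1, 'alice', 10.0), (2, 'bob', 20.0),
                                    (3, 'carol', 30.0);
        INSERT INTO ledger_b VALUES (1, 'alice', 10.0, NULL),
                                    (2, 'bob', 25.0, 'adjusted'),
                                    (4, 'dave', 40.0, NULL);
    """)
    for record_key, issue in reconcile(conn, "ledger_a", "ledger_b", "id"):
        print(record_key, issue)
```

In this sketch the `note` column exists only in the second table, so the metadata comparison leaves it out of the generated query. The query itself is assembled from whatever columns turn out to be joinable rather than being written by hand, which is the sense in which it is constructed dynamically.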