|
Efficient Techniques for Online Record Linkage
Keywords: decision tree, data heterogeneity
Abstract: Matching records that refer to the same entity across databases is an increasingly important part of many data mining projects, because data from multiple sources often needs to be matched in order to enrich it or improve its quality. Record linkage is the computation of the associations among records of multiple databases. It arises in contexts such as the integration of those databases, online interactions and negotiations, and many others. Matching data from heterogeneous data sources remains a hard problem: a large organization must resolve several types of heterogeneity, especially non-uniformity. Statistical record linkage techniques can resolve this problem, but they cause a communication bottleneck in a distributed environment. A matching tree is used to reduce this communication overhead while giving the same matching decision as the conventional linkage technique.
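
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the matching tree construction itself: it assumes a simple weight-and-threshold linkage rule, and the attribute names, weights, threshold, and the `fetch` callback are all hypothetical. The conventional rule transfers every remote attribute before deciding, whereas the sequential version requests one attribute at a time and stops once the remaining attributes can no longer change the outcome, so both reach the same decision.

```python
# Hypothetical per-attribute weights: (weight if values agree, weight if they differ).
# Each "agree" weight is assumed to be >= its "differ" weight.
WEIGHTS = {
    "surname": (4.0, -3.0),
    "birth_year": (3.0, -2.0),
    "zip": (2.0, -1.0),
    "first_name": (1.0, -1.0),
}
THRESHOLD = 3.0  # hypothetical: a total weight >= THRESHOLD means "match"


def full_decision(local, remote):
    """Conventional rule: transfer every remote attribute, then sum the weights.

    Returns (is_match, number_of_attributes_transferred).
    """
    score = sum(
        agree if local[attr] == remote[attr] else differ
        for attr, (agree, differ) in WEIGHTS.items()
    )
    return score >= THRESHOLD, len(WEIGHTS)


def sequential_decision(local, fetch):
    """Tree-like rule: request one remote attribute at a time via fetch(attr)
    and stop as soon as the unseen attributes cannot change the decision.

    Returns (is_match, number_of_attributes_transferred).
    """
    attrs = list(WEIGHTS)
    score = 0.0
    for i, attr in enumerate(attrs):
        agree, differ = WEIGHTS[attr]
        score += agree if local[attr] == fetch(attr) else differ
        rest = attrs[i + 1:]
        best = score + sum(WEIGHTS[a][0] for a in rest)   # all remaining agree
        worst = score + sum(WEIGHTS[a][1] for a in rest)  # all remaining differ
        if worst >= THRESHOLD:
            return True, i + 1   # match is already guaranteed
        if best < THRESHOLD:
            return False, i + 1  # match is already impossible
    return score >= THRESHOLD, len(attrs)


# Usage: the remote record is simulated by a local dict lookup.
local = {"surname": "Smith", "birth_year": 1980, "zip": "10001", "first_name": "Ann"}
remote = {"surname": "Smith", "birth_year": 1980, "zip": "94110", "first_name": "Anne"}
print(full_decision(local, remote))
print(sequential_decision(local, remote.__getitem__))
```

In this sketch the order in which attributes are requested is fixed by the dictionary order; a real matching tree would choose the next attribute based on the values seen so far.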
|