Clustering is a sub-area of data mining that groups similar data records together. We therefore propose applying this technique to the detection of approximately duplicate data records.
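
To make the idea concrete, the following is a minimal sketch of how clustering might surface approximately duplicate records. The text above does not specify a clustering algorithm or a similarity measure, so everything here is an assumption for illustration: the greedy single-pass grouping, the character-based similarity from Python's standard `difflib` module, the 0.8 threshold, the helper names `similarity` and `cluster_records`, and the sample records.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score in [0, 1] for two records after light normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cluster_records(records, threshold=0.8):
    """Greedy single-pass clustering (an assumed, simplified scheme).

    Each record joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for record in records:
        for cluster in clusters:
            if similarity(record, cluster[0]) >= threshold:
                cluster.append(record)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([record])
    return clusters


# Hypothetical sample data, not taken from the source.
records = [
    "John Smith, 12 Baker Street, London",
    "Jon Smith, 12 Baker St., London",
    "Alice Wong, 4 Elm Road, Leeds",
    "Alice Wong, 4 Elm Rd, Leeds",
]

clusters = cluster_records(records)

# Clusters with more than one member are candidate approximate duplicates.
duplicates = [c for c in clusters if len(c) > 1]
```

Records that land in the same cluster are only candidates; a real system would likely confirm them with a stricter field-by-field comparison. The threshold trades recall against false matches and would need tuning on the actual data.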
