In this post I will show you how to add a duplicate related list to the Lead object. The code can be modified to find duplicates on most objects.
These are the tabs you will create. For illustrative purposes, I have put them next to the Lead tab.

Find Dups is a Visualforce tab for the Visualforce page Find Duplicates.

This Visualforce page has two buttons. The Delete All Records button deletes only the records in the Unique and Dup tables; it will not delete your leads. The Find Duplicates button looks for exact matches in the Lead object using Company Name, Street, City, State and ZIP. You can modify the Apex code to customize which fields are compared.
When the batch Apex job has finished, click on the Unique tab to see a list of unique leads together with the number of duplicates found for each unique record.

If you click on the hyperlinked Unique Record ID, it will pull up the detail page and show the related Dups list.


Now click on one of the Duplicate Record IDs to pull up the detail page of the Dup record:

To see the actual Lead record this dup is related to, click the Lead Look Up field:
Under the Dups related list on the Lead detail page are the matching dups for the lead you just saw earlier…


Here is the Schema Builder of the objects:

To create this app:
First create two custom objects called Unique and Dup by following the instructions below. Dup is a child of Unique, so you will need to create a Master-Detail relationship between these two objects. Dup also has a lookup relationship to Lead, which creates the Dup related list on the Lead detail page.
Create an object called Unique with API Name Unique__c.
When creating the object, add related lists, notes and attachments and create a tab.
The record name should be Unique Record ID, formatted as Auto Number with the display format U-{00000000} and starting at 1.
Custom Fields
Lead Name, Lead_Name__c, Text (255)
Number of Duplicates, Number_of_Duplicates__c, Roll-Up Summary (COUNT Dup)
Unique Lead ID, Unique_Lead_ID__c, Text (18)
Unique Record, Unique_Record__c, Text (255) (Unique Case Insensitive)
Create an object called Dup with API Name Dup__c
When creating the object, add related lists, notes and attachments and create a tab.
The record name should be Duplicate Record ID, formatted as Auto Number with the display format D-{00000000} and starting at 1.
Custom Fields
Duplicate Lead ID, Duplicate_Lead_ID__c, Text (18)
Duplicate Match, Duplicate_Match__c, Text (255)
Lead Look Up, LeadName__c, Lookup (Lead)
Unique Lead ID, Unique_Lead_ID__c, Text (18)
Unique Record ID, Unique_Record_ID__c, Master-Detail (Unique)
For the Unique object List View:
Select the following fields in the order they are displayed: Record ID, Unique Record ID, Number of Duplicates, Unique Lead ID, Lead Name, Unique Record
For the Dup object List View:
Select the following fields in the order they are displayed: Record ID, Unique Record ID, Duplicate Record ID, Unique Lead ID, Duplicate Lead ID, Lead Look Up, Duplicate Match
Edit the Lead page layout and modify the Dup related list to show the fields in the following order: Record ID, Unique Record ID, Unique Lead ID, Duplicate Record ID, Duplicate Lead ID and Duplicate Match.
Create the following Apex classes:
public class FindDup {
    // Controller extension for the Visualforce page; its methods are the button actions
    public FindDup(ApexPages.StandardController controller) {
    }

    public void FindDup() {
        BatchFindDups2 obj = new BatchFindDups2();
        Database.executeBatch(obj, 200); // a batch size of 200 helps avoid the 'Apex CPU time limit exceeded' error
        System.debug(obj);
    }

    public void DeleteAllRecords() {
        // Deleting Unique records also deletes their child Dup records (master-detail)
        BatchMassDeleteUnique objU = new BatchMassDeleteUnique();
        Database.executeBatch(objU, 2000);
        System.debug(objU);
    }
}
global class BatchMassDeleteUnique implements Database.Batchable<sObject>, Database.Stateful {
    // Batch class that deletes the Unique and Dup records
    global final String query;

    global BatchMassDeleteUnique() {
        query = 'SELECT Id FROM Unique__c';
    }

    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator(query);
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope) {
        Database.delete(scope, false); // false allows partial deletes
    }

    global void finish(Database.BatchableContext BC) {
    }
}
global class BatchFindDups2 implements Database.Batchable<sObject>, Database.Stateful {
    // Batch class that looks for unique and duplicate leads
    // and stores them in the Unique and Dup objects
    global final String query;
    global Map<String, Id> uniqueRecord; // match key -> unique Lead ID; kept across batches by Database.Stateful

    global BatchFindDups2() {
        uniqueRecord = new Map<String, Id>();
        query = 'SELECT Id, Name, Company, Street, City, State, PostalCode FROM Lead ORDER BY Company';
    }

    global Database.QueryLocator start(Database.BatchableContext BC) {
        return Database.getQueryLocator(query);
    }

    global void execute(Database.BatchableContext BC, List<Lead> scope) {
        String fields;
        Map<Id, Id> duplicateRecord = new Map<Id, Id>(); // duplicate Lead ID -> unique Lead ID
        List<Unique__c> uniqueInsert = new List<Unique__c>();
        List<Dup__c> duplicateInsert = new List<Dup__c>();
        try {
            for (Lead l : scope) {
                fields = l.Company + '|' + l.Street + '|' + l.City + '|' + l.State + '|' + l.PostalCode; // combine fields with pipe | separator
                if (uniqueRecord.containsKey(fields)) { // key already seen, so this lead is a duplicate
                    duplicateRecord.put(l.Id, uniqueRecord.get(fields));
                } else {
                    uniqueRecord.put(fields, l.Id); // first lead with this key is the unique record
                    Unique__c uni = new Unique__c();
                    uni.Unique_Lead_ID__c = l.Id;
                    uni.Unique_Record__c = fields;
                    uni.Lead_Name__c = l.Name;
                    uniqueInsert.add(uni);
                }
            }
            Database.insert(uniqueInsert, false); // false allows partial inserts

            // Look up the Unique records by Lead ID rather than by the IDs just inserted:
            // a duplicate in this batch can match a unique lead stored by an earlier batch
            Set<String> uniqueLeadIds = new Set<String>();
            for (Id leadId : duplicateRecord.values()) {
                uniqueLeadIds.add(String.valueOf(leadId));
            }
            Map<String, Unique__c> uniqueByLeadId = new Map<String, Unique__c>();
            for (Unique__c u : [SELECT Id, Lead_Name__c, Unique_Lead_ID__c, Unique_Record__c
                                FROM Unique__c WHERE Unique_Lead_ID__c IN :uniqueLeadIds]) {
                uniqueByLeadId.put(u.Unique_Lead_ID__c, u);
            }

            for (Id dupId : duplicateRecord.keySet()) {
                Id uniqueId = duplicateRecord.get(dupId);
                Unique__c u = uniqueByLeadId.get(String.valueOf(uniqueId));
                if (u == null) {
                    continue; // the Unique record was not saved, so there is no parent for this Dup
                }
                Dup__c dup = new Dup__c();
                dup.Unique_Record_ID__c = u.Id; // master-detail parent
                dup.LeadName__c = uniqueId;     // the lookup stores the Lead ID; the UI displays the lead's name
                dup.Unique_Lead_ID__c = uniqueId;
                dup.Duplicate_Lead_ID__c = dupId;
                dup.Duplicate_Match__c = u.Unique_Record__c;
                duplicateInsert.add(dup);
            }
            insert duplicateInsert;
        } catch (Exception e) {
            System.debug(LoggingLevel.ERROR, 'BatchFindDups2 failed: ' + e.getMessage()); // log the error instead of silently swallowing it
        }
    }

    global void finish(Database.BatchableContext BC) {
    }
}
Create a Visualforce page called FindDup and add a tab for it:
<apex:page standardController="Lead" extensions="FindDup" sidebar="false" showHeader="true">
    <apex:form>
        <p>Find Duplicates in Leads Object</p>
        <br/>
        <p>Combine Company Name and Address</p>
        <br/>
        <apex:commandButton action="{!DeleteAllRecords}" value="Delete All Records" id="deleteAll"/>
        <apex:commandButton action="{!FindDup}" value="Find Duplicates" id="findDup"/>
    </apex:form>
</apex:page>
Create a custom report type called Dup based off the custom object Dup and save it under the Lead category.
So how does it work? The Visualforce page has two buttons, Delete All Records and Find Duplicates, and each one calls an action method in the Apex controller extension class FindDup. When deleting records from the Unique and Dup tables, the maximum batch size of 2,000 records is used.
To find duplicates, the FindDup class runs the BatchFindDups2 global batch class. The class implements Database.Stateful because the batch needs to remember certain variables, here the uniqueRecord map, from one batch to the next.
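To see what Database.Stateful does in isolation, here is a minimal sketch. CountLeadsBatch is a hypothetical class and not part of this app; it only keeps a running count across batches. Without Database.Stateful, the instance variable would go back to its initial value for every batch.

```apex
// Hypothetical example class, shown only to illustrate Database.Stateful
global class CountLeadsBatch implements Database.Batchable<sObject>, Database.Stateful {
    // Instance variable: survives between execute() calls only because
    // the class implements Database.Stateful
    global Integer leadsSeen = 0;

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Lead');
    }

    global void execute(Database.BatchableContext bc, List<Lead> scope) {
        leadsSeen += scope.size(); // add this batch to the running total
    }

    global void finish(Database.BatchableContext bc) {
        System.debug('Leads processed: ' + leadsSeen);
    }
}
```

The uniqueRecord map in BatchFindDups2 relies on the same behavior, which is why a duplicate can still be recognized when its matching lead was processed in an earlier batch.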
To minimize heap, stack and Apex timeout errors, batch size is limited to 200 records. The global batch class queries the Lead object with the following SOQL:
query = 'SELECT ID, Name, Company, Street, City, State, PostalCode FROM Lead ORDER BY Company';
The query orders the leads by Company Name, and the batch job passes them to the execute method 200 records at a time. If you want to match on different fields, change this SOQL and the field concatenation in the execute method.
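As a rough sketch of such a change, suppose you wanted to match on Company and Email instead of the address. The LeadMatchKey class, its LEAD_QUERY constant and its build method are hypothetical names for illustration; the point is that the query and the key must list the same fields.

```apex
// Hypothetical helper, not part of the original classes
public class LeadMatchKey {
    // Every field used in build() must also be selected here
    public static final String LEAD_QUERY =
        'SELECT Id, Name, Company, Email FROM Lead ORDER BY Company';

    // Builds the pipe-separated match key for one lead
    public static String build(Lead l) {
        return l.Company + '|' + l.Email;
    }
}
```

For example, LeadMatchKey.build(new Lead(Company = 'Tower Records', Email = 'info@example.com')) returns Tower Records|info@example.com. Keeping the query and the key in one place makes it harder for the two to drift apart.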
Because the start method returns a Database.QueryLocator, the batch job can work through up to 50 million records.
The execute method of this batch concatenates the lead fields, separated by pipes, e.g., Tower Records|210 First Ave|New York City|NY|10021, and puts the result into the uniqueRecord map as the key, with the Lead ID as the value. Because the class implements Database.Stateful, the map keeps its entries from one batch to the next and grows progressively larger as the system works its way through the queried record set.
Each lead added to the uniqueRecord map is also inserted into the Unique table. A lead whose key is already in the map is a duplicate: it goes into the duplicateRecord map, with the duplicate's Lead ID as the key and the unique Lead ID as the value. This is how the app matches leads with their duplicates. The entries in the duplicateRecord map are then inserted into the Dup table.
To illustrate:
uniqueRecord map: Unique Key: Tower Records|210 First Ave|New York City|NY|10021; Value: 00167ujK891245ghUv (Unique Lead ID)
duplicateRecord map: Unique Key: 00189ujK8916Q5ghUv (Duplicate Lead ID); Value: 00167ujK891245ghUv (Unique Lead ID)
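The two maps can be tried out in anonymous Apex without touching any records. This is only a sketch with made-up data: the labels Lead A, Lead B and Lead C stand in for real Lead IDs, and the keys are typed in by hand rather than built from Lead fields.

```apex
// Hypothetical in-memory data: one match key per lead, in query order
List<String> labels = new List<String>{ 'Lead A', 'Lead B', 'Lead C' };
List<String> keys = new List<String>{
    'Tower Records|210 First Ave|New York City|NY|10021',
    'Acme|1 Main St|Boston|MA|02108',
    'Tower Records|210 First Ave|New York City|NY|10021'
};

Map<String, String> uniqueRecord = new Map<String, String>();    // match key -> first lead seen
Map<String, String> duplicateRecord = new Map<String, String>(); // duplicate lead -> unique lead

for (Integer i = 0; i < keys.size(); i++) {
    if (uniqueRecord.containsKey(keys[i])) {
        // Key already seen: record this lead as a duplicate of the first one
        duplicateRecord.put(labels[i], uniqueRecord.get(keys[i]));
    } else {
        // First time this key appears: this lead is the unique record
        uniqueRecord.put(keys[i], labels[i]);
    }
}

System.debug(uniqueRecord.size());           // 2 unique keys
System.debug(duplicateRecord.get('Lead C')); // Lead A
```

Lead C has the same key as Lead A, so it ends up in duplicateRecord pointing back to Lead A, while Lead B stays unique.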
To connect the Dup records to the Lead object as a related list, we need the record IDs of the Unique records that were inserted, as each one holds its unique Lead ID. A query retrieves these records, and each one's unique Lead ID is compared with the unique Lead ID recorded for the duplicate. When they match, the Dup record is linked to that Unique record and the unique Lead ID is saved in the Lead Look Up field of the Dup table.
The end result is that the Unique table contains only unique leads and the Dup table contains only duplicate leads. The two are in a parent-child relationship, and the Dup table also has a lookup relationship to the Lead object. Neither object stores the full lead record, only the concatenated matching fields and the Lead IDs.
The Unique table has a roll-up summary field so it can count the number of child duplicate “leads.”
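If you prefer to check the results from the Developer Console rather than the Unique tab, a query along these lines should list the unique leads that have at least one duplicate. It is a sketch that assumes the objects and field API names were created exactly as listed earlier in this post.

```apex
// Unique records that have one or more child Dup records, most duplicates first
List<Unique__c> withDups = [
    SELECT Name, Lead_Name__c, Number_of_Duplicates__c
    FROM Unique__c
    WHERE Number_of_Duplicates__c > 0
    ORDER BY Number_of_Duplicates__c DESC
];

for (Unique__c u : withDups) {
    System.debug(u.Name + ' ' + u.Lead_Name__c + ': ' + u.Number_of_Duplicates__c + ' duplicate(s)');
}
```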
Remember the custom report type Dup you created earlier? You can use it to create a cross-object report of all Leads with Dups. You will need to add the filter “Dup ID not equal to null”. This is shown below:

I have not load tested this app. If you try it on large record sets and it causes Apex timeout errors, try reducing the batch size to 150 or even 100.
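When experimenting with batch sizes, it helps to see how a run went. One way, sketched here as anonymous Apex, is to keep the job ID that Database.executeBatch returns and look it up in AsyncApexJob. The batch size of 150 is just the example value from above.

```apex
// Start the batch with a smaller batch size and keep the job ID
Id jobId = Database.executeBatch(new BatchFindDups2(), 150);

// Run this query again later to follow the job's progress
AsyncApexJob job = [
    SELECT Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
    FROM AsyncApexJob
    WHERE Id = :jobId
];
System.debug(job.Status + ': ' + job.JobItemsProcessed + ' of ' + job.TotalJobItems
    + ' batches, ' + job.NumberOfErrors + ' with errors');
```

NumberOfErrors only counts batches that failed with an unhandled exception, so anything caught by a try/catch inside the execute method will not show up there.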
This is not a fast method to find duplicates compared to some apps out there but it is simple enough that anyone can create it and modify it for their own needs. Whether you merge or delete the dups is entirely up to you – this app just shows you how to find them and add them as a related list.