SharePoint 2013/2016: How to Find Duplicate Records in SharePoint List

During one of my assignments I have come across a situation where we need to fix data issues in SharePoint Lists.

One of the issues that we found was presence of duplicate data. In order to fix that problem in hand I had developed a Powershell Script to find out duplicate data based on a specific or a group of columns.

For the sake of demo, I have added a SharePoint List with some duplicate records in it as shown below:

1

Now let’s look into the code to understand implementation details-

In Step 1 we are getting references of the Site and Web where the SharePoint List resides

In Step 2 we are splitting the list of columns based on which we want to find out the duplicate data

We can see there are two input variables “ColumnToValidate” and “ColumnToDisplay”. “ColumnToValidate” provides columns based on which duplicity needs to be checked while “ColumnToDisplay” contains the list of columns that needs to be the part of data export.

In Step 3 we are creating the export folder that will hold the CSV files exported with duplicate records

In Step 4 we are creating the list object that will give the handle on the list which needs to be validated

2

In Step 5 we are getting list of Items from SharePoint List and grouping them based on the validation columns

In Step 6 we are creating the directory for export files

In Step 7 we are exporting all the groups which is having item count greater than 1 (this logic identifies the duplicate items)

3

That is all for the code.

Now we will see the variation in outputs depending on the columns specified for duplicacy check

In Step 8 we specify the validation and display columns, for the first execution we will check duplicate values in “Title” column

In Step 9 we are calling the “Get-DuplicateListItems” function to find the duplicate values

4

After the function executed successfully we can see the following output.

In Step 10 we can see the output of this excution and can see 6 items found which duplicate in Title Column

5

In Step 11 we can see the CSV file that is exported by the execution considering “Title” Column to be validated.

6

In Step 12 we can see the output file and can notice duplicate values in “Title” Column

7

In Step 13 we have changed the list of columns to be validated. In this second execution I have added another column “Role”.

Now the list will be validated for duplicity based on the combination of “Title & Role” Columns

8

In Step 14 we can see the output of this excution and can see 4 items found which duplicate in “Title & Role” Columns

9

In Step 15 we can see the CSV file that is exported by the execution considering “Title & Role” Column to be validated

10

In Step 16 we can see the output file and can notice duplicate values in “Title & Role” Column

11

In Step 17 we have changed the list of columns to be validated. In this second execution I have added another column “Location”.

Now the list will be validated for duplicity based on the combination of “Title & Role & Location” Columns

12

In Step 18 we can see the output of this excution and can see 2 items found which duplicate in “Title & Role & Location” Columns

13

In Step 19 we can see the CSV file that is exported by the execution considering “Title & Role & Location” Column to be validated

14

In Step 20 we can see the output file and can notice duplicate values in “Title & Role & Location” Column

15

This is a very simple technique that can be used to fix one of the issues with SharePoint List data.

Hope you find it helpful.

Advertisements

2 thoughts on “SharePoint 2013/2016: How to Find Duplicate Records in SharePoint List

  1. Why go through all those contortions? Just export the list to Excel and manipulate the data there, where it’s designed to be manipulated.

    _____________________________________________________________________
    Scot Witt
    Business Analyst-Lead Consultant II
    Workforce Technology Solutions- User Productivity Solutions Engineering
    Knowledge Management

    • Scot, this process has been designed to remove all manual steps.

      This solution was the part of a big picture with lot of moving parts running based on inputs of one process serving output for another.

      The framework that I wrote for that business problem was responsible to find and fix various types of data issues with multiple SharePoint Lists without any manual intervention at all.

      I believe that’s we consider as Process Automation. 🙂

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s