Deleting duplicate records from a table

Note:  The examples in this post are written and tested in SQL Server 2008.

The example below explains how to delete duplicate records from a table in a SQL Server database.

First create a table and insert some duplicate records into it.


--Creating a table for duplicate records

CREATE TABLE StudentDetails(Id int Primary key,RollNum int,Name varchar(20))

--Inserting Records

Insert into StudentDetails values(1,10,'Krishna')

Insert into StudentDetails values(2,11,'Raju')

Insert into StudentDetails values(3,10,'Krishna')

Insert into StudentDetails values(4,12,'Jagadish')

Insert into StudentDetails values(5,10,'Krishna')

In the above table, three records share RollNum = 10 and Name = 'Krishna', so two of them are duplicates. The following query removes the extra copies and keeps one row from each group.

--Deleting duplicate records

delete DuplicateRecordTable from

(

      select row_number() over

      (partition by RollNum,Name order by RollNum)as DuplicateCount ,

      * from StudentDetails

) as DuplicateRecordTable

where DuplicateCount>1

The resulting table now has only 3 records.

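Note: In the above query, the columns listed after the 'partition by' keyword must be the columns that hold the duplicate values. row_number() restarts at 1 for each combination of those columns, so every row numbered higher than 1 is an extra copy.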
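Before deleting anything, it can help to preview which rows would be removed. The sketch below is one possible way to do that, run before the delete; the CTE name NumberedRows is only an illustrative choice, and ordering by Id (instead of RollNum) is an assumption made here so that the row with the lowest Id in each group is the one numbered 1.

```sql
-- Preview only: nothing is deleted by this statement.
-- NumberedRows is a hypothetical name chosen for this sketch.
WITH NumberedRows AS
(
    SELECT Id, RollNum, Name,
           ROW_NUMBER() OVER (PARTITION BY RollNum, Name
                              ORDER BY Id) AS DuplicateCount
    FROM StudentDetails
)
SELECT Id, RollNum, Name, DuplicateCount
FROM NumberedRows
WHERE DuplicateCount > 1;   -- rows that the delete would remove
```

Run against the five sample rows before the delete, this should list the rows with Id 3 and Id 5. Run after the delete, it should return no rows.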

Another example:

Another way of deleting duplicate records from a table which doesn’t have a primary key is explained below.

First create a table without any primary key and insert some duplicate records into it.


--Creating a table for duplicate records

CREATE TABLE StudentDtls(RollNum int,Name varchar(20))

--Inserting Records

Insert into StudentDtls values(10,'Krishna')

Insert into StudentDtls values(11,'Raju')

Insert into StudentDtls values(10,'Krishna')

Insert into StudentDtls values(12,'Jagadish')

Insert into StudentDtls values(10,'Krishna')

Observe that no primary key is created in this example. Now, to delete the duplicate records, execute the following queries.

--Deleting duplicate records

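--Copy one row per duplicated (RollNum, Name) group into a holding table
SELECT RollNum, Name
      INTO DuplicateRecordTable
      FROM StudentDtls
      GROUP BY RollNum, Name
      HAVING COUNT(*) > 1;

--Delete every copy of those rows, matching on both columns so that
--rows sharing only the RollNum are left alone
DELETE s
  FROM StudentDtls AS s
  WHERE EXISTS (SELECT 1
         FROM DuplicateRecordTable AS d
         WHERE d.RollNum = s.RollNum
           AND d.Name = s.Name);

--Put a single copy of each duplicated row back
INSERT INTO StudentDtls (RollNum, Name)
  SELECT RollNum, Name
     FROM DuplicateRecordTable;

--Remove the holding table
DROP TABLE DuplicateRecordTable;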

Now the resulting table will have only three unique records.
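As an alternative to the holding-table approach, the row_number() technique from the first example should also work on a table without a primary key, because the delete goes through the numbered result set rather than through a key. The following is a minimal sketch under that assumption; the CTE name NumberedRows is hypothetical, and the check query at the end is an addition for verification only.

```sql
-- If this is not the first statement in the batch, the statement
-- before it must end with a semicolon.
WITH NumberedRows AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY RollNum, Name
                              ORDER BY RollNum) AS DuplicateCount
    FROM StudentDtls
)
DELETE FROM NumberedRows      -- deletes the underlying StudentDtls rows
WHERE DuplicateCount > 1;     -- keep the row numbered 1 in each group

-- Check: list any (RollNum, Name) group that still has more than one row.
SELECT RollNum, Name, COUNT(*) AS Occurrences
FROM StudentDtls
GROUP BY RollNum, Name
HAVING COUNT(*) > 1;
```

Because the copies within a group are identical, it does not matter which one is numbered 1. After the delete, the check query should return no rows.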
