LINUX DATA MANIPULATION

How Can I Eliminate Duplicates in a Linux File?

The uniq command reads the input file and compares adjacent lines. Any line that is the same as the one immediately before it is discarded, so each run of identical adjacent lines collapses to a single copy. The result is printed on the screen (standard output); the input file itself is not changed.
Let's say you're a publisher with an inventory of all your books in the my.books file shown here:

Atopic Dermatitis for Dummies
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days

To remove all the duplicates from the list of books, use this command:

uniq my.books
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days
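
Because uniq only prints its result, you need to redirect the output if you want to keep the cleaned-up list. Here is a minimal sketch that recreates the sample file in a scratch directory and saves the result; the output name my.books.unique is just an illustrative choice, not something the lesson prescribes.

```shell
# Recreate the sample inventory from this lesson in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > my.books <<'EOF'
Atopic Dermatitis for Dummies
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days
EOF

# uniq prints to standard output; redirect it to keep the result.
# my.books.unique is a hypothetical output filename.
uniq my.books > my.books.unique

# The original file is untouched; the new file holds one copy of each title.
cat my.books.unique
```

Don't redirect the output onto the input file itself (uniq my.books > my.books). The shell empties the file before uniq gets a chance to read it.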

If you want to print only the book titles that are not duplicated (to find out which books you have one copy of), add the -u flag, like this:

uniq -u my.books
Learn Nasal Endoscopy in 21 Days

Conversely, you might want to see only the titles that are duplicated, leaving out those that appear only once. If so, add the -d flag, which prints one copy of each duplicated title:

uniq -d my.books
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed

Now let's take inventory. To summarize the list of books and add a count of the number of times each one appears in the list, add the -c flag, like this:

uniq -c my.books
2 Atopic Dermatitis for Dummies
3 Chronic Rhinitis Unleashed
1 Learn Nasal Endoscopy in 21 Days
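
Once you have counts, a common next step is to rank the titles by how many copies you hold. The sketch below is one way to do that, assuming the same sample data; it pipes the counts through sort -rn (reverse numeric order). Depending on your version of uniq, the counts may be padded with leading spaces, which sort -n handles without trouble.

```shell
# Recreate the sample inventory from this lesson in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/my.books" <<'EOF'
Atopic Dermatitis for Dummies
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days
EOF

# sort groups identical titles, uniq -c counts each group,
# and sort -rn orders the result by count, largest first.
ranked=$(sort "$workdir/my.books" | uniq -c | sort -rn)
printf '%s\n' "$ranked"

# The first field of the first line is the highest count.
top_count=$(printf '%s\n' "$ranked" | awk 'NR == 1 { print $1 }')
echo "Most copies of a single title: $top_count"
```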

Note that the uniq command does not sort the input file, and it only detects duplicates that sit on adjacent lines. If identical lines may be scattered through the file, use the sort command to prepare the data first and pipe the result to uniq, as in: sort my.books | uniq
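
To see why sorting matters, here is a small sketch using a shuffled copy of the book list. The file name shuffled.books is made up for this illustration. On its own, uniq removes nothing from it because no two matching titles are next to each other.

```shell
# A shuffled version of the inventory: no two adjacent lines match.
workdir=$(mktemp -d)
cat > "$workdir/shuffled.books" <<'EOF'
Chronic Rhinitis Unleashed
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
EOF

# uniq alone finds no adjacent duplicates, so all 6 lines come back.
unsorted_result=$(uniq "$workdir/shuffled.books")

# Sorting first groups identical titles together, so uniq can collapse them.
sorted_result=$(sort "$workdir/shuffled.books" | uniq)

# sort -u produces the same de-duplicated list in a single step.
sort_u_result=$(sort -u "$workdir/shuffled.books")

printf '%s\n' "$sorted_result"
```

If all you need is the de-duplicated list, sort -u is the shorter route. The sort | uniq form is still worth knowing because it lets you add the -u, -d, or -c flags to uniq.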

Here's a recap of the flags you can use with the uniq command:

-u Print only lines that appear once in the input file.

-d Print only the lines that appear more than once in the input file.

-c Precede each output line with a count of the number of times it was found.
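
One way to check your understanding of -u and -d: every distinct title is printed by exactly one of them, so their line counts should add up to the line count of plain uniq. The following sketch, again using the sample inventory, confirms that.

```shell
# Recreate the sample inventory from this lesson in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/my.books" <<'EOF'
Atopic Dermatitis for Dummies
Atopic Dermatitis for Dummies
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Chronic Rhinitis Unleashed
Learn Nasal Endoscopy in 21 Days
EOF

# Count the lines each form of the command prints.
all_count=$(uniq "$workdir/my.books" | wc -l)
once_count=$(uniq -u "$workdir/my.books" | wc -l)
dup_count=$(uniq -d "$workdir/my.books" | wc -l)

# Every distinct title is either unrepeated (-u) or repeated (-d), never both.
echo "distinct: $((all_count)), once: $((once_count)), repeated: $((dup_count))"
```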

Previous Lesson: Sorting Data
Next Lesson: Selecting Columns

Copyright © by Bob Rankin
All rights reserved - Redistribution is allowed only with permission.