LINUX CLASSES - DATA MANIPULATION

Linux Sort Command

How Can I Sort Linux Files?

The sort command sorts a file according to fields, the individual pieces of data on each line. By default, sort assumes that the fields are just words separated by blanks, but you can specify an alternative field delimiter if you want (such as a comma or colon). Output from sort is printed to the screen, unless you redirect it to a file. If you had a file like the one shown below, containing information on people who contributed to your presidential reelection campaign, you might want to sort it by last name, donation amount, or location. (Using a text editor, enter those three lines into a file and save it with donors.data as the file name.)

Bay Ching 500000 China
Jack Arta 250000 Indonesia
Cruella Lumper 725000 Malaysia

Let's take this sample donors file and sort it. The following shows the command to sort the file on the second field (last name) and the output from the command:

sort +1 -2 donors.data
Jack Arta 250000 Indonesia
Bay Ching 500000 China
Cruella Lumper 725000 Malaysia
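
On many current systems the +m -n key notation is obsolete, and sort may treat +1 as a file name rather than a sort key. The following is a minimal sketch of the same sort written with the -k option, assuming a sort that supports -k (POSIX and GNU sort do). The scratch directory and the LC_ALL=C setting are additions made here for a repeatable result; they are not part of the original example.

```shell
# Recreate the sample file in a scratch directory (an assumption for this sketch).
workdir=$(mktemp -d)
cat > "$workdir/donors.data" <<'EOF'
Bay Ching 500000 China
Jack Arta 250000 Indonesia
Cruella Lumper 725000 Malaysia
EOF

# -k 2,2 starts the key at field 2 and ends it at field 2 (the last name),
# which corresponds to the older "+1 -2".
by_last_name=$(LC_ALL=C sort -k 2,2 "$workdir/donors.data")
printf '%s\n' "$by_last_name"
```

If your sort still accepts the older notation, both forms should give the same order for this file.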

The syntax of the sort command is pretty strange, but if you study the following examples, you should be able to adapt one of them for your own use. The general form of the sort command is

sort <flags> <sort fields> <file name>

The most common flags are as follows:

-f Treat lowercase letters as uppercase when comparing (so "Bill" and "bill" are treated the same).
-r Sort in reverse order (so "Z" starts the list instead of "A").
-n Sort a column in numerical order.
-tx Use x as the field delimiter (replace x with a comma or other character).
-u Suppress all but one line in each set of lines with equal sort fields (so if you sort on a field containing last names, only one "Smith" will appear even if there are several).
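
To see what the -n, -r, and -f flags change, here is a small sketch on made-up input (the numbers and names are invented for illustration). LC_ALL=C is set only so the plain, non-numeric ordering is the same on every system.

```shell
# Sample input invented for this sketch.
numbers=$(printf '10\n9\n100\n')
names=$(printf 'bill\nZed\nAdam\n')

# Without -n, lines compare character by character, so "100" sorts before "9".
plain=$(printf '%s\n' "$numbers" | LC_ALL=C sort)
# With -n, the lines compare as numbers.
numeric=$(printf '%s\n' "$numbers" | LC_ALL=C sort -n)
# -r reverses whatever order was requested.
numeric_reversed=$(printf '%s\n' "$numbers" | LC_ALL=C sort -n -r)
# -f ignores case when comparing; the output keeps its original case.
folded=$(printf '%s\n' "$names" | LC_ALL=C sort -f)

# Unquoted expansions below join each result onto one line for display.
echo "plain:            "$plain
echo "numeric:          "$numeric
echo "numeric reversed: "$numeric_reversed
echo "folded case:      "$folded
```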

Specify the sort keys like this:

+m Start at the first character of the (m+1)th field.
-n End at the last character of the nth field (if -n is omitted, the key runs to the end of the line). Here n is a field number; this is not the -n flag listed above.
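
For simple keys like these, +m -n appears to correspond to -k (m+1),n in the newer notation: the start number goes up by one and the end number stays the same. The sketch below captures that rule in a small shell function. The name old_key_to_k is hypothetical and exists only for this illustration; it is not part of sort and does not handle character offsets such as +1.3.

```shell
# old_key_to_k is a hypothetical helper written for this sketch, not part of sort.
# Usage: old_key_to_k +m [-n]
old_key_to_k() {
    start=${1#+}          # strip the leading "+" to get m
    if [ -n "${2:-}" ]; then
        end=${2#-}        # strip the leading "-" to get n
        printf '%s %d,%d\n' -k "$((start + 1))" "$end"
    else
        # No end given: the key runs to the end of the line.
        printf '%s %d\n' -k "$((start + 1))"
    fi
}

old_key_to_k +1 -2    # prints: -k 2,2
old_key_to_k +2 -3    # prints: -k 3,3
old_key_to_k +2       # prints: -k 3
```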

Looks weird, huh? Let's look at a few more examples with the sample company.data file shown here, and you'll get the hang of it. (Each line of the file contains four fields: first name, last name, serial number, and department name.)

Jan Itorre 406378 Sales
Jim Nasium 031762 Marketing
Mel Ancholie 636496 Research
Ed Jucacion 396082 Sales

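To sort the file on the third field (serial number) in reverse order and save the results in sorted.data, use this command (the lines after it show the resulting contents of sorted.data, since the redirect means nothing is printed to the screen):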

sort -r +2 -3 company.data > sorted.data
Mel Ancholie 636496 Research
Jan Itorre 406378 Sales
Ed Jucacion 396082 Sales
Jim Nasium 031762 Marketing
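
Here is a sketch of the same reverse sort using the -k key option, which newer versions of sort expect. It assumes a scratch directory created with mktemp, and sets LC_ALL=C for a repeatable order; neither is part of the original command.

```shell
# Recreate the sample file in a scratch directory (an assumption for this sketch).
workdir=$(mktemp -d)
cat > "$workdir/company.data" <<'EOF'
Jan Itorre 406378 Sales
Jim Nasium 031762 Marketing
Mel Ancholie 636496 Research
Ed Jucacion 396082 Sales
EOF

# Sort on field 3 only, in reverse. Because of the redirect,
# nothing appears on the screen at this point.
LC_ALL=C sort -r -k 3,3 "$workdir/company.data" > "$workdir/sorted.data"

# Display the saved result.
cat "$workdir/sorted.data"
```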

Now let's look at a situation where the fields are separated by colons instead of spaces. In this case, we will use the -t: flag to tell the sort command how to find the fields on each line. Let's start with this version of the company.data file:

Itorre, Jan:406378:Sales
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Jucacion, Ed:396082:Sales

To sort the file on the second field (serial number), use this command:

sort -t: +1 -2 company.data
Nasium, Jim:031762:Marketing
Jucacion, Ed:396082:Sales
Itorre, Jan:406378:Sales
Ancholie, Mel:636496:Research
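
The colon-delimited sort can be sketched with the -k option as well. As before, the scratch directory and LC_ALL=C are assumptions added for a self-contained, repeatable run.

```shell
# Recreate the colon-delimited sample file (scratch location is an assumption).
workdir=$(mktemp -d)
cat > "$workdir/company.data" <<'EOF'
Itorre, Jan:406378:Sales
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Jucacion, Ed:396082:Sales
EOF

# -t: makes the colon the field delimiter, so "Itorre, Jan" is one field
# and the serial number is field 2.
by_serial=$(LC_ALL=C sort -t: -k 2,2 "$workdir/company.data")
printf '%s\n' "$by_serial"
```

Note that with -t: the space inside "Itorre, Jan" no longer splits the name into two fields.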

To sort the file on the third field (department name) and suppress the duplicates, use this command:

sort -t: -u +2 -3 company.data
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Itorre, Jan:406378:Sales

Note that the line for Ed Jucacion did not print, because he's in Sales, and we asked the command (with the -u flag) to suppress lines that were the same in the sort field.
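
The sketch below shows why one Sales line disappears: with -u, sort compares only the sort key (here the department), so two lines with the same department count as duplicates even though the rest of each line differs. It uses the -k form of the key and the same assumed scratch setup as a stand-in for your own file. Which of the two Sales lines survives may depend on your version of sort, so the sketch checks only the departments.

```shell
# Recreate the colon-delimited sample file (scratch location is an assumption).
workdir=$(mktemp -d)
cat > "$workdir/company.data" <<'EOF'
Itorre, Jan:406378:Sales
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Jucacion, Ed:396082:Sales
EOF

# Sort on the department (field 3) and keep one line per distinct department.
unique_depts=$(LC_ALL=C sort -t: -u -k 3,3 "$workdir/company.data")
printf '%s\n' "$unique_depts"

# Showing only the key column makes the effect easier to see:
# four input lines, three distinct departments, three output lines.
printf '%s\n' "$unique_depts" | cut -d: -f3
```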

There are lots of fancy (and a few obscure) things you can do with the sort command. If you need to do any sorting that's not quite as straightforward as these examples, try the man sort command for more information.


Comments - most recent first
(Please feel free to answer questions posted by others!)

ashwin     (14 Sep 2014, 16:35)
Itorre, Jan:406378:Sales
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Jucacion, Ed:396082:Sales

To sort the file on the third field (department name) and suppress the duplicates, use this command:

sort -t: -u +2 -3 company.data
Nasium, Jim:031762:Marketing
Ancholie, Mel:636496:Research
Itorre, Jan:406378:Sales

Note that the line for Ed Jucacion did not print, because he's in Sales, and we asked the command (with the -u flag) to suppress lines that were the same in the sort field.

Can you please explain clearly the reason for not printing Ed Jucacion's details?


Sunil     (22 Nov 2012, 10:23)
Thanks for the tutorial, helped me a lot
Peter Lordan     (11 Oct 2012, 15:43)
I haven't found any examples on the web in which
the columns are aligned and may have no delimiter. I handled this by telling sort that the
^ character (which never shows up in my output) was the delimiter and used the . to include a
byte count with the -k option:

cat my_log|sort -n -t^ --key 1.12,1.17

So, this uses bytes 12-17 of each line as the
key, regardless of what else is in the line. There
may be a better way to do it, but I thought I'd
start the discussion with this.
sonia     (25 Sep 2012, 05:23)
Hi,
I have this file with a number of sequence of format
>string1
data
>string100
data
>string10
.....
>string5
...
>string67
......

the dots represent data.
I wanted to get the sequences arranged in ascending order like
>string1
data
>string5
data
>string10
.....
>string67

I used the sort -n filename command, but it didn't work.
Could someone help me?
Thanks
Monika     (16 Aug 2012, 06:05)
I want to sort an XML file through a shell script. First I want to sort by the top attribute, then by the left attribute. Please reply.
Neeta     (27 Feb 2012, 23:11)
Hi,
I have a file which i need to sort based on 1st field and then assign same number at the beginning of each line with same sort key i.e same first field.
Thanks
David     (24 Feb 2012, 13:47)
These examples all use +<field_number> -<field_number> key notation, but newer sort requires -k notation (where all field numbers rise by 1, i.e., not cardinal offset but ordinal).
David     (24 Feb 2012, 13:44)
Many commands like join and comm need the old original sort's binary or native order, so export LC_ALL=C before sort for join, comm or just for speed.
David     (24 Feb 2012, 13:41)
To get the old, fastest, binary order comparisons, you need to export LC_ALL=C first.
gautam baldev     (11 Feb 2012, 09:21)
Hi Guys,

You can all simply use this syntax as well:

sort -k <field1,field2> file.txt

example:-


[redhat@localhost Desktop]$ cat sort.dat
Jan Itorre 406378 Sales
Jim Nasium 031762 Marketing
Mel Ancholie 636496 Research
Ed Jucacion 396082 Sales

o/p
----------------------------

[redhat@localhost Desktop]$ sort -k 2 sort.dat

Mel Ancholie 636496 Research
Jan Itorre 406378 Sales
Ed Jucacion 396082 Sales
Jim Nasium 031762 Marketing
[redhat@localhost Desktop]$ sort -r -k 2 sort.dat
Jim Nasium 031762 Marketing
Ed Jucacion 396082 Sales
Jan Itorre 406378 Sales
Mel Ancholie 636496 Research

[redhat@localhost Desktop]$ sort -r -k 2,3 sort.dat
Jim Nasium 031762 Marketing
Ed Jucacion 396082 Sales
Jan Itorre 406378 Sales
Mel Ancholie 636496 Research
------------------------------------

i think it would be beneficial for all of you.
Urigiough     (16 Dec 2011, 04:58)
Haha that's ridiculous. No way
semna     (03 Dec 2011, 05:27)
Hi every one,
I have a file like this:
000 1558 1221 9 110 1 chr2scalefinal.txt
000 1558 215 2 130 0 chr28asfinal.txt
000 1558 329 3 136 2 chr22asfinal.txt
000 1558 329 3 228 3 chr22scalefinal.txt
000 1558 329 3 279 3 chr22tmidfinal.txt
000 1558 329 4 104 0 chr15scalefinal.txt
000 1558 357 1 126 0 chr13asfinal.txt
000 1558 782 4 159 0 chr3asfinal.txt
000 2682 1221 22 110 3 chr2scalefinal.txt
000 2682 215 4 130 3 chr28asfinal.txt
000 2682 329 5 104 1 chr15scalefinal.txt
000 2682 329 8 136 5 chr22asfinal.txt
000 2682 329 8 228 6 chr22scalefinal.txt
000 2682 329 8 279 6 chr22tmidfinal.txt
000 2682 357 4 126 2 chr13asfinal.txt
000 2682 782 8 159 1 chr3asfinal.txt

and I want to sort based on column 7, which should keep the same order based on the second column (in this case 1558 and 2682). Any suggestion will be appreciated.
Majed     (16 Nov 2011, 09:27)
i wouldn't have understood it if not for the comments in the comment section about +1 -2 and +2 -3. At first, i thought they are ascending and must follow each other so i tried +3 -4 :)
Fabho     (11 Sep 2011, 21:06)
Thanks a lot. Easy and simple guide. Thank you very much
Bob Rankin     (25 Jul 2011, 10:58)
@Sagar - Quite right, thanks! Fixed now...
Sagar Patel     (25 Jul 2011, 02:47)
There's a little correction here:
sort -t: -u +2 company.data

Above should be like:
sort -t: -u +2 -3 company.data

No need to publish this comment :)
mohit sahjwala     (05 May 2011, 03:25)
source code for creating two files and sorting them
Jon     (18 Feb 2011, 14:02)
FYI, Sort (as of AT&T SVID R5) did *not* ignore the leading "_" in the sort... I'd habitually used the _ to prefix directories before "ls" from the days of USG3, Version 6 and 7... Back in those days sort was simply based on ASCII values.

And yes, it is annoying when over time the default behavior changes... ;-)
Bob Rankin     (10 Feb 2011, 14:01)
@Larry - I think it's pretty clear... I said "The following shows the command to sort the file on the second field (last name)." Later I go on to explain in detail (with examples) the sort syntax.

If I may say so, that's NOT very typical of other sites that leave you on your own to figure it out. :-)
Larry Stewart     (10 Feb 2011, 12:52)
sort +1 -2 donors.data:
I believe that this command basically says, "Sort by the 2nd field and then by the 1st field in the donors.data file." I realize that individuals reading this forum shouldn't be totally helpless but I don't understand why you couldn't have included a full explanation of the command. The explanation as it stands is incomplete. This is typical of the info provided in websites of this type.
Abhijit Das     (28 Nov 2010, 15:06)
tell me the code of merge sort.
Bob Rankin     (15 Sep 2010, 07:21)
@PARTHA SEN - You can count records with "wc -l"
PARTHA SEN     (15 Sep 2010, 06:59)
Is there any command to see the number of records in the sorted file at the time of sorting?
ashwin     (13 Sep 2010, 10:05)
Role of +2:
+2 as Indicated by author is (m)+1 = (2)+1 = 3rd field. In the first example, author is sorting on the 2nd field, hence he used (1)+1 =2nd field.
Partha Sen     (13 Sep 2010, 02:17)
I want to sort a file under Linux where one field is ascending and one field is descending. Please advise me on the sort command.
Bob Rankin     (17 Aug 2010, 09:56)
Yogesh, Once again, they are the sort keys, which are explained above.
yogesh     (17 Aug 2010, 09:19)
sort -r +2 -3 company.data > sorted.data

I wonder what is role of +2 here and

+1 in : sort +1 -2 donors.data



I have this same query.
moomin     (15 Jul 2010, 14:01)
sort -t'f' -k2 -rn test.txt
blabla.ref110.f0
blabla.ref102.f0
blabla.ref11.f0

moomin     (15 Jul 2010, 13:55)
I find the -k option easier to use

if you wanna sort guid of your existing grps
i.e
sort -t: -k3 -n /etc/group
or -rn for reverse order
John Hopkins     (09 Jul 2010, 02:05)
I just can't "sort" out the following problem.
I got a list of filenames looking like this:

blabla.ref102.f0
blabla.ref11.f0
blabla.ref110.f0

and I need to sort it by the ref-numbers.
There is no way to get proper fields. Anything I can do about that?
umar ayaz     (16 Jun 2010, 05:29)
Good for understanding
McSort     (14 Jun 2010, 15:47)
to sort multiple files, you can merge data using '-m' flag.

Example:

> sort -m file1.data file2.data file3.data
Bob Rankin     (10 Jun 2010, 08:58)
@checkerbum - Believe it or not, that's the expected result! I didn't believe it either, until I tried it and looked up the specs for the sort command. The trick is to set the LC_ALL environment variable before the sort command.

export LC_ALL=C

Then run your sort command.
checkerbum     (09 Jun 2010, 11:52)
sort ignores some characters while sorting.
The following list is considered by sort to be in order. the '_' is ignored.
I have tried all the switches, -d, -g, -i, -n. none of them gives the desired results.
Is there another sort utility that works as one would expect?

en_aud_sw_digo
enb_m0_digo
enb_mchrg_digo_chrg
en_cmp_vkp_digo
gilberto dos santos alves     (31 Mar 2010, 15:07)
Bob, could you please add one more example: sorting a file like this one? The file has 3 fields: 1 = ID of title, 2 = ID of editor, 3 = title of book/article. Note that the fields are separated by spaces, and the problem is that the 3rd field contains spaces. I think if you show this sample we will all understand +m -n. Thank you.
=====file start======
66 365 ACCENT, DIALECT AND THE SCHOOL
1454 5436 A COURSE IN MODERN LINGUISTICS
1 30 A COURSE OF PHONETICS
67 370 ACQUIRING LANGUAGE IN CONVERS.
68 375 ACTANTS ET ACTIONS DANS L'EXPRESSION D'UNE REGLE DE JEU
69 377 ACTES DU PREMIER CONGRES INTERNATIONAL DE LINGUISTES
1020 5002 ACTES DU PREMIER CONGRES INTERNATIONAL DE LINGUISTES
70 378 ACTION GESTURE AND SYMBOLI THE EMERGENCE OF LANGUAGE
=====file end======
gilberto dos santos alves     (31 Mar 2010, 14:57)
The default sort key is the entire line of the file.
hari     (27 Mar 2010, 12:11)
what is the default sort key
Bob Rankin     (17 Mar 2010, 06:17)
I suppose you could launch both sort commands in the background and let them run at the same time...
Gouled     (17 Mar 2010, 04:21)
Could I sort 2 files (i.e., file1 & file2) simultaneously, or would I have to do each separately?
thank you
interesting article
Bob Rankin     (01 Mar 2010, 06:45)
They are the sort keys. Read the article again, they are explained above.
mauludi     (28 Feb 2010, 23:46)
sort -r +2 -3 company.data > sorted.data

I wonder what is role of +2 here and

+1 in : sort +1 -2 donors.data

thank you in advance
