I have a CSV file like this:

    fish,4
    cat,1
    elephant,1
    tree,2
    dog,8
    car,10

`awk -F',' '{print length($1),$0}' file.csv | sort -k1nr | cut -d' ' -f 2-`

will sort the file by word length, for all words appearing in the first column:

    elephant,1
    fish,4
    tree,2
    cat,1
    dog,8
    car,10

`sort -t, -k+2 -n -r file.csv`

will sort the file from greatest to least according to the number appearing in the second column:

    car,10
    dog,8
    fish,4
    tree,2
    elephant,1
    cat,1

How can I use these two commands together so that the CSV file is first sorted by word length, according to the words in the first column, and any rows whose first-column words have equal length are then sorted by the number in the second column, from greatest to least? The resulting output would look like this:

    elephant,1
    fish,4
    tree,2
    car,10
    dog,8
    cat,1

How can these two sorting methods be used together?

If you are using gawk, you can use its `asort` function to do the sorting, so no other utility has to be called. You can try something like this:

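    awk -F, '
    # Comparator for asort(): longer first field first, then larger second field first.
    function cmp(i1, v1, i2, v2,    a1, a2, l1, l2) {
      split(v1, a1); split(v2, a2)
      l1 = length(a1[1]); l2 = length(a2[1])
      return l1 > l2 ? -1 : l1 < l2 ? 1 : a1[2] > a2[2] ? -1 : a1[2] < a2[2]
    }
    { a[n++] = $0 }
    # asort() renumbers the array from 1 to n; walk it by index to keep the sorted order.
    END { n = asort(a, a, "cmp"); for (i = 1; i <= n; i++) print a[i] }' infile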

Output:

    elephant,1
    fish,4
    tree,2
    car,10
    dog,8
    cat,1

This script first reads all the lines into the array `a`, then sorts it with the comparison function `cmp`. The only trick is that a comparison such as `a > b` evaluates to the usual 1 or 0 for `true` or `false`, so the last branch of `cmp` can return `a1[2] < a2[2]` directly.
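A quick way to see that behaviour in isolation is the sketch below. The function name `show_cmp` is only a hypothetical wrapper for illustration; the awk part is the point.

```shell
#!/bin/sh
# show_cmp is a hypothetical helper name used only for this illustration.
# It stores the results of a true and a false comparison and prints both.
show_cmp() {
  awk 'BEGIN { t = (2 > 1); f = (1 > 2); print t, f }'
}

show_cmp   # prints: 1 0
```

Because the result is an ordinary number, it can be returned from a comparator as is.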

A slightly shorter version in Perl:

    perl -F, -ane 'push @a, [@F];
      END { for $i (sort { length($b->[0]) <=> length($a->[0]) or $b->[1] <=> $a->[1] } @a)
            { printf "%s,%d\n", @$i } }' infile

This is not 100% correct, as `$F[1]` still contains the trailing `\n`, yet `printf` handles it properly.
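If you would rather stay with the tools from the question, the two commands can also be combined into a single decorate, sort, undecorate pipeline. This is only a sketch: the function name `sort_len_then_num` and the temporary file are assumptions made for illustration. It prepends the word length as an extra comma-separated field, sorts on that field and then on the original number (now field 3), both numerically in reverse, and finally cuts the length off again.

```shell
#!/bin/sh
# sort_len_then_num is a hypothetical helper name, not part of any tool.
# $1 is the path to a two-column CSV file (word,number).
sort_len_then_num() {
  awk -F, '{ print length($1) "," $0 }' "$1" |   # decorate: len,word,number
    sort -t, -k1,1nr -k3,3nr |                   # length desc, then number desc
    cut -d, -f2-                                 # undecorate: drop the length
}

# Sample data from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
fish,4
cat,1
elephant,1
tree,2
dog,8
car,10
EOF

sort_len_then_num "$tmp"
rm -f "$tmp"
```

Restricting each key to a single field (`-k1,1` and `-k3,3`) matters here: a bare `-k1` would extend the key to the end of the line, and the second key would never get a clean tie to break.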