Linux CSV Command Line Tool XSV

In this post, I will introduce xsv, an excellent command line tool for parsing and manipulating CSV files. You can get the tool from GitHub: https://github.com/BurntSushi/xsv


Let us begin the tutorial.


If you are on CentOS, the following installation commands will work.


yum install cargo

cargo install xsv


By default, cargo installs the binary under ~/.cargo/bin, so when you install as root it ends up at /root/.cargo/bin/xsv


If your shell cannot find xsv, either set an alias or add the install directory to your bash PATH.


alias xsv=/root/.cargo/bin/xsv
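If you prefer the PATH route over an alias, here is a minimal sketch. It assumes cargo used its default install directory under your home directory; adjust the path if your setup differs.

```shell
# Assumption: xsv was installed into cargo's default directory, $HOME/.cargo/bin.
export PATH="$HOME/.cargo/bin:$PATH"

# Check whether the shell can now find the binary.
command -v xsv || echo "xsv not found on PATH"
```

To make this permanent, the export line can go into your ~/.bashrc.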

As an example, I will use Apple stock data that I downloaded from Yahoo Finance.


OK, let us first look at our data.
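Here is a minimal sketch of that first look. I am using a small stand-in file, sample.csv, whose columns and values are made up for illustration; swap in AAPL.csv to see the real data.

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Print the header line and the first data row.
head -2 sample.csv
```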



Read a CSV file using xsv


OK, now let us try the "xsv table" command. It shows the data in a nicely aligned tabular form.
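A quick sketch of "xsv table", again on a made-up stand-in file (sample.csv is hypothetical; use AAPL.csv for the real thing):

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Align the columns for easier reading.
xsv table sample.csv
```

Since xsv commands read from standard input as well, something like "head -2 sample.csv | xsv table" should work too.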

Instead of the Linux 'head -2' command, we can also use the "xsv slice" command. It takes an index number to display a particular row. Let us say we want to print the 2nd row. Note that the index starts from 0, so the 2nd row is the data at index 1.
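A small sketch of slicing out the 2nd row, using the -i (--index) flag of "xsv slice". The file sample.csv and its values are made up for illustration.

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Index 1 is the 2nd data row; the header line is printed along with it.
xsv slice -i 1 sample.csv
```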


We can also flatten the data with the "xsv flatten" command, which prints each record as a list of field names and values.
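For example, on a made-up stand-in file (sample.csv is hypothetical, not the real Apple data):

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# One "field value" line per column, with a separator line between records.
xsv flatten sample.csv
```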

We can also print only the headers of the CSV file using the "xsv headers" command.
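A minimal sketch, assuming a stand-in file with three made-up columns (sample.csv is hypothetical):

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# List the column names, one per line, each with its position.
xsv headers sample.csv
```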

Count number of rows using "xsv count"

cat AAPL.csv | xsv count

252


Concatenate csv files using "xsv cat"

We can also concatenate CSV files either by rows or by columns. The command below appends the rows of the two given files.


xsv cat rows AAPL.csv AAPL.csv | xsv count

504


Print CSV stats using "xsv stats"

Another very useful command is "xsv stats". It reports summary statistics for each column of the CSV file, such as its type, minimum, maximum and mean.
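Here is a sketch on a made-up stand-in file (sample.csv and its values are hypothetical). The output of "xsv stats" is itself CSV, so piping it through "xsv table" makes it easier to read.

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# One output row per input column, aligned for reading.
xsv stats sample.csv | xsv table
```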


How to search in csv file using "xsv search"

"xsv search" is very useful, especially if you want to search in very large CSV files. Let us say we want to display the data for August 2020 only. We can do that by searching the Date column for that month.
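A minimal sketch of such a search. It assumes the Date column uses the YYYY-MM-DD format, and it runs on a made-up stand-in file (sample.csv is hypothetical).

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Keep only the rows whose Date starts with 2020-08.
xsv search -s Date '^2020-08' sample.csv
```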



You can also search with regular expressions. To search the data for the months of July and August, use the following:


xsv search -s Date '2020-0[7-8]' AAPL.csv | xsv count

27


Sometimes we just want to look at a sample of the data taken from random locations. Use "xsv sample" for that.
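For instance, to pull 2 random rows from a made-up stand-in file (sample.csv is hypothetical; the sample size is just an example):

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Print the header plus 2 randomly chosen rows; the rows differ between runs.
xsv sample 2 sample.csv
```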



Select columns of csv file using "xsv select"


Another very useful feature of the xsv utility is the "select" command. Let us say we want to print just the columns "Date" and "Close".
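A sketch of that selection, run on a made-up stand-in file (sample.csv and its values are hypothetical):

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Keep only the Date and Close columns, in that order.
xsv select Date,Close sample.csv
```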



How to sort a CSV file using "xsv sort"


We can also sort the CSV file by column name. Let us sort by the column "High".
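Here is a sketch on a made-up stand-in file (sample.csv is hypothetical). One thing to watch: as far as I know, "xsv sort" compares values as text unless you pass -N (--numeric), so for a price column the numeric flag is usually what you want.

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Text comparison: "10.5" sorts before "9.5".
xsv sort -s High sample.csv

# Numeric comparison: 9.5 sorts before 10.5.
xsv sort -N -s High sample.csv
```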



Notice that by default "xsv sort" sorts in ascending order. Let us reverse the order by using the --reverse flag.
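A minimal sketch of the reversed sort, on a made-up stand-in file (sample.csv is hypothetical). I have added -N here so the "High" values are compared as numbers rather than text.

```shell
# Hypothetical stand-in for AAPL.csv; the columns and values are made up.
cat > sample.csv <<'EOF'
Date,High,Close
2020-07-31,9.5,9.0
2020-08-03,11.5,11.0
2020-08-04,10.5,10.0
EOF

# Highest "High" first.
xsv sort -N --reverse -s High sample.csv
```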

How to handle large csv files in Linux using "xsv index"


Last but not least, handling large CSV files can be very slow without an index. With xsv we can create an index, which can speed up CSV handling on the command line.


xsv index AAPL.csv

The above command creates an index file that xsv uses to speed up operations on the CSV file.


ls AAPL.csv*

AAPL.csv AAPL.csv.idx

As we can see above, "xsv index" has created an index file "AAPL.csv.idx", which xsv will use for any subsequent commands we run on that file.


Wrap Up: 


xsv is a great tool for performing a lot of basic operations on CSV files at the Linux command line. You can perform joins using xsv too. Check out the GitHub link above for more information.
