Cut Command (Linux)

cut [-b] [-c] [-f list] [-n] [-d delim] [-s] [file]

Tips:

[1]  -d and -f

Option -d specifies a single-character delimiter that serves as the field separator, and option -f specifies the list of fields to include in the output. For example, -d: -f5- uses a colon as the delimiter and selects fields five through the end of the line. Option -d is only meaningful together with option -f.
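A minimal sketch of -d and -f working together. The passwd-style record below is a made-up sample line, not taken from a real system.

```shell
# Hypothetical colon-separated record; the values are illustrative only.
line='root:x:0:0:root:/root:/bin/bash'

# -d: makes the colon the delimiter; -f5- selects field 5 through the last field.
from_five=$(printf '%s\n' "$line" | cut -d: -f5-)
echo "$from_five"     # root:/root:/bin/bash

# A field list can also name individual fields: here fields 1 and 7.
user_shell=$(printf '%s\n' "$line" | cut -d: -f1,7)
echo "$user_shell"    # root:/bin/bash
```

Note that the selected fields are printed joined by the same delimiter that was used to split them.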

[2]

Extraction of line segments can typically be done by bytes (-b), characters (-c), or fields (-f) separated by a delimiter (-d; the tab character by default). A range must be provided in each case, consisting of one of N, N-M, N- (N to the end of the line), or -M (beginning of the line to M), where N and M are counted from 1 (there is no zeroth value). Since version 6, an error is thrown if you include a zeroth value. Prior to this, the value was ignored and assumed to be 1.
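The four range forms can be tried on a short sample string. This is only a sketch: the string 'abcdefgh' and the variable names are chosen for illustration, and -c is used so that each position is one visible character.

```shell
text='abcdefgh'

single=$(printf '%s\n' "$text" | cut -c3)       # N   : only the 3rd character
span=$(printf '%s\n' "$text" | cut -c2-4)       # N-M : characters 2 through 4
tail_part=$(printf '%s\n' "$text" | cut -c6-)   # N-  : character 6 to end of line
head_part=$(printf '%s\n' "$text" | cut -c-3)   # -M  : start of line through character 3

echo "$single"      # c
echo "$span"        # bcd
echo "$tail_part"   # fgh
echo "$head_part"   # abc
```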

[3]

In the case of CSV files, using the comma (,) as the delimiter can be tricky. CSV files often contain columns with a comma embedded within quotes, and cut does not interpret quoting, so it splits such a column into separate fields.
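The problem is easy to reproduce. The row below is a hypothetical CSV record with a quoted name column, used here only to show how cut miscounts the fields.

```shell
# Hypothetical CSV row: id, quoted name containing a comma, sex.
row='1,"Braund, Mr. Owen",male'

# Logically the name is column 2, but cut stops at the embedded comma.
second=$(printf '%s\n' "$row" | cut -d, -f2)
echo "$second"    # "Braund

# Logically "male" is column 3, but cut returns the rest of the name instead.
third=$(printf '%s\n' "$row" | cut -d, -f3)
echo "$third"     #  Mr. Owen"
```

For data like this, a tool that understands CSV quoting is likely a safer choice than cut.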

[4]  For TAB-delimited files, there are two ways to specify the tab with the -d flag.

(a) Press Ctrl-v and then Tab to type a single literal tab character between the quotes:

cut -f2 -d'   ' infile

(b) Or, in shells such as bash that support $'...' quoting, write it like this:

cut -f2 -d$'\t' infile
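A small sketch on a made-up tab-separated row. It uses a third spelling that is not among the tips above, capturing a tab with printf, as an assumption-labeled alternative for shells without $'...' quoting. It also shows that -d can be left out, since tab is the default delimiter.

```shell
# Build a literal tab portably (alternative to Ctrl-v Tab or $'\t').
tab=$(printf '\t')

# Hypothetical tab-separated row.
row=$(printf 'id\tname\tage')

# Explicit tab delimiter.
second=$(printf '%s\n' "$row" | cut -f2 -d "$tab")
echo "$second"            # name

# Same result with no -d at all, because tab is the default.
default_second=$(printf '%s\n' "$row" | cut -f2)
echo "$default_second"    # name
```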

 

Examples:

[1]

$ head -n 20 ../../datasets/MLdatasets/titanic_train.csv | cut -d "," -f 2,3

 
