


How to stack columns


A simple case. Here's "fileA", a tab-separated table (TSV) with some missing values:

aaa	ggg	mmm	sss	yyy
bbb	hhh	nnn	ttt	zzz
ccc	iii	ooo	uuu	
ddd	jjj		vvv	
	kkk	qqq	www	
fff	lll	rrr	xxx	

The easiest way I know to stack up the columns (column 1 on top of column 2 on top of column 3, etc.) is to first get the number of columns from the first line (I use AWK for that job), then cut out each column in turn and print the results:

cols=$(awk -F"\t" 'NR==1 {print NF}' fileA)
 
for ((i=1;i<="$cols";i++)); do cut -f"$i" fileA; done
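
If the job comes up often, the two commands could be wrapped in a small function. What follows is only a sketch under stated assumptions: the function name "stack_columns" and the two-line sample file are made up for illustration, and the loop is written as a plain "while" so that it also runs in shells without the C-style "for". A missing value comes through as a blank line in the stack:

```shell
# Hypothetical helper: print column 1, then column 2, and so on.
# The body runs in a subshell, so "cols" and "i" stay private.
stack_columns() (
    # Column count is read from the first line of the file
    cols=$(awk -F"\t" 'NR==1 {print NF}' "$1")
    i=1
    while [ "$i" -le "$cols" ]; do
        cut -f"$i" "$1"
        i=$((i + 1))
    done
)

# Made-up sample: 3 columns, with the value in row 2, column 2 missing
sample=$(mktemp)
printf 'aaa\tggg\tmmm\nbbb\t\tnnn\n' > "$sample"

stack_columns "$sample"
```

Because the column count comes from the first line only, the sketch assumes every line of the table has the same number of tab-separated fields.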


Not so simple. Stacking gets a little trickier when adjoining columns need to be stacked together. For example, here's "fileB", again tab-separated, with each group of letters having a corresponding number:

aaa	001	ggg	007	mmm	013	sss	019	yyy	025
bbb	002	hhh	008	nnn	014	ttt	020	zzz	026
ccc	003	iii	009	ooo	015	uuu	021		
ddd	004	jjj	010			vvv	022		
		kkk	011	qqq	017	www	023		
fff	006	lll	012	rrr	018	xxx	024		

A simple approach is to loop through the odd-numbered columns, cutting each one together with the even-numbered column that follows it:

cols=$(awk -F"\t" 'NR==1 {print NF}' fileB)
 
for ((i=1;i<="$cols";i+=2)); \
do cut -f"$i","$(($i+1))" fileB; done
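
The same idea might be generalised to groups of any width by giving "cut" a range of fields instead of a pair. This is a sketch, not a tested tool: the function name "stack_groups", its second argument (the group width) and the sample file are all assumptions made for illustration:

```shell
# Hypothetical helper: stack groups of N adjacent columns.
# Usage: stack_groups FILE N
stack_groups() (
    # Column count is read from the first line of the file
    cols=$(awk -F"\t" 'NR==1 {print NF}' "$1")
    i=1
    while [ "$i" -le "$cols" ]; do
        # cut accepts a field range, e.g. -f3-4 for columns 3 and 4
        cut -f"$i"-"$((i + $2 - 1))" "$1"
        i=$((i + $2))
    done
)

# Made-up sample: 2 letter/number pairs, with the first pair missing in row 2
pairs=$(mktemp)
printf 'aaa\t001\tggg\t007\n\t\thhh\t008\n' > "$pairs"

stack_groups "$pairs" 2
```

A missing pair appears in the stack as a line holding only the tab that separated its two empty fields. With a width of 3, the same function would stack triplets of columns.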


Last update: 2020-11-25
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License