awk - Unexpected output from awk with a multi-character delimiter

For example, 

$ echo "tom+|dick+|and+|harry" | awk -F '+|' '{print $2}'

|dick

or

$ echo "tom | dick | and | harry" | awk -F ' | ' '{print $2}'
|

In the examples above, one might expect the output to be dick. However, both awk outputs look unexpected.

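The reason is that awk -F (at least in recent versions on Linux) treats a delimiter of more than one character as a regular expression, not as a literal string. In the delimiters above:

+ means match the character before the + one or more times
| means the OR operator

As a result, ' | ' is read as "a space OR a space", so every single space splits the line and $2 is |. Likewise, '+|' does not match the literal text +|; the output shows the line being split at + alone, which leaves |dick in $2.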
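To see how a regex delimiter really splits a line, it can help to print every field with its number. The sketch below is only an illustration: show_fields is a hypothetical helper name, not something from awk itself, and it assumes a POSIX-style shell and awk.

```shell
# show_fields is a hypothetical helper: it prints each field on its own
# line, prefixed with the field number, so the split is easy to inspect.
show_fields() {
    # $1 is the field separator; the input line is read from stdin
    awk -F "$1" '{ for (i = 1; i <= NF; i++) print i ": " $i }'
}

echo "tom | dick | and | harry" | show_fields ' | '
# 1: tom
# 2: |
# 3: dick
# 4: |
# 5: and
# 6: |
# 7: harry
```

Seven fields instead of four confirms that each single space was treated as a separator.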

If you want the delimiter to be matched literally, you need to escape each regex operator with \\

$ echo "tom+|dick+|and+|harry" | awk -F '\\+\\|' '{print $2}'
dick

or

$ echo "tom | dick | and | harry" | awk -F ' \\| ' '{print $2}'
dick
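As an alternative to backslashes (not part of the original examples), each operator can be placed in a bracket expression, where + and | have no special meaning. A minimal sketch, where the variable names first and second are just for illustration:

```shell
# Inside [...] the characters + and | are matched literally,
# so no backslash escaping is needed.
first=$(echo "tom+|dick+|and+|harry" | awk -F '[+][|]' '{print $2}')
second=$(echo "tom | dick | and | harry" | awk -F ' [|] ' '{print $2}')

echo "$first"     # dick
echo "$second"    # dick
```

This form can be easier to read when the delimiter contains several regex operators.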


Reference

https://www.computerhope.com/jargon/r/regex.htm
