Mastering awk Fields and Separators: A Practical Introduction

Introduction

awk is a powerful command specialized in text processing, frequently used in log analysis and data formatting.

Understanding the concepts of fields and separators is the first hurdle to mastering awk.

Beginners often get stuck here, so it helps to work through each example while picturing how awk actually splits the line.

This article walks through awk field handling step by step, from the default behavior to more advanced usage.

Reference: GNU awk

Behavior and Notes on Default Delimiters (Spaces and Tabs)

Creating the File

cat << 'EOF' > input.txt
apple orange banana
cat	dog	mouse
one  two   three
EOF

Note: To enter a literal tab in the terminal, press Ctrl+V and then the Tab key.

Command

awk '{print $1, $2, $3}' input.txt

Output

apple orange banana
cat dog mouse
one two three

Command

awk '{print NF}' input.txt

Output

3
3
3

Command

awk '{print "["$2"]"}' input.txt

Output

[orange]
[dog]
[two]

How It Works

Item	Description
Default delimiter	Runs of spaces and tabs (leading and trailing blanks are ignored)
Handling of consecutive delimiters	Consecutive spaces and tabs are treated as a single delimiter
Field variables	$1, $2, $3 ...
Number of fields	Obtainable with NF
Specifying delimiter	Can be changed with the -F option

Explanation

By default, awk splits each line on runs of spaces and tabs: any number of consecutive blanks counts as a single delimiter, and leading and trailing blanks are ignored. Fields are therefore split the same way regardless of how many spaces or tabs separate them.
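As a minimal sketch of this behavior, the snippet below feeds awk one made-up line with leading blanks, a run of tabs, and trailing spaces. It then contrasts that with an explicit tab delimiter, where merging no longer applies. The sample strings and variable names are assumptions chosen for illustration.

```shell
# Default splitting: runs of blanks are merged, leading/trailing blanks ignored.
default_split=$(printf '   alpha\t\tbeta   gamma  \n' |
    awk '{ print NF ": [" $1 "] [" $2 "] [" $3 "]" }')
echo "$default_split"   # 3: [alpha] [beta] [gamma]

# Explicit tab delimiter: two adjacent tabs enclose an empty field.
tab_split=$(printf 'a\t\tb\n' | awk -F '\t' '{ print NF }')
echo "$tab_split"       # 3
```

The merging of consecutive delimiters is specific to the default setting; once a single explicit character is given, every occurrence separates two fields.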

Specifying Field Delimiters Using the -F Option

Creating the File

cat << 'EOF' > input.txt
name,age,city
Alice,25,Tokyo
Bob,30,Osaka
Charlie,35,Nagoya
EOF

Command

awk -F ',' '{print $1, $2}' input.txt

Output

name age
Alice 25
Bob 30
Charlie 35

Command

awk -F ',' 'NR > 1 {print $3}' input.txt

Output

Tokyo
Osaka
Nagoya

How It Works

Element	Description
awk	A command that processes text by rows and columns
-F ','	Specifies comma as the field delimiter
$1, $2, $3	Represent the 1st, 2nd, and 3rd columns respectively
NR > 1	Condition to exclude the first row (header)
{print ...}	Process to output the specified fields

Explanation

The -F option changes the delimiter from the command line, which makes it an essential feature for simple CSV-like data. Note that -F ',' splits on every comma, so it does not handle quoted fields that themselves contain commas.
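The limits of plain comma splitting are easy to check. The sketch below uses a hypothetical record (not from input.txt) whose first value contains a quoted comma, and shows that awk counts four fields instead of three.

```shell
# Hypothetical CSV record with a comma inside a quoted value.
record='"Smith, John",25,Tokyo'

field_count=$(printf '%s\n' "$record" | awk -F ',' '{ print NF }')
second=$(printf '%s\n' "$record" | awk -F ',' '{ print $2 }')

echo "$field_count"   # 4, because the quoted comma also splits
echo "[$second]"      # [ John"]
```

For data like this, a dedicated CSV parser is usually a safer choice than -F ','.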

Defining Delimiters Within a Script Using the Built-in Variable FS

Creating the File

cat << 'EOF' > input.txt
apple,banana,orange
dog,cat,bird
EOF

Command

awk 'BEGIN { FS="," } { print $1, $2 }' input.txt

Output

apple banana
dog cat

Command

awk -F',' '{ print $1, $3 }' input.txt

Output

apple orange
dog bird

How It Works

Element	Description
FS	Built-in variable that defines the field delimiter
BEGIN	A block executed only once before input processing
$1, $2	Data of the 1st and 2nd columns after splitting
-F	Option to specify the delimiter from the command line

Explanation

When FS is set, awk splits each line on the specified delimiter. You can set it in a BEGIN block, so that it takes effect before the first line is read, or from the command line with the -F option.
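For comparison, here is a small sketch that sets the same delimiter in three ways: with -F, with -v FS=, and inside BEGIN. The colon-delimited record is invented for illustration; all three commands should print the same two fields.

```shell
data='root:x:0:0'   # made-up colon-delimited record

via_option=$(printf '%s\n' "$data" | awk -F ':' '{ print $1, $3 }')
via_assign=$(printf '%s\n' "$data" | awk -v FS=':' '{ print $1, $3 }')
via_begin=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":" } { print $1, $3 }')

echo "$via_option"   # root 0
echo "$via_assign"   # root 0
echo "$via_begin"    # root 0
```

Setting FS in BEGIN keeps the delimiter inside the script itself, which is convenient when the awk program is stored in a file.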

Using Multiple Different Characters (Comma, Tab, Semicolon, etc.) as Delimiters Simultaneously

Creating the File

cat << 'EOF' > input.txt
apple,orange;grape	banana
dog;cat,bird	fish
EOF

Note: To enter a literal tab in the terminal, press Ctrl+V and then the Tab key.

Command

awk -F '[,;\t]' '{print $1, $2, $3, $4}' input.txt

Output

apple orange grape banana
dog cat bird fish

How It Works

Element	Description
-F	Specifies the field delimiter (Field Separator)
[,;\t]	A regex character class (specifying comma, semicolon, and tab simultaneously)
$1, $2...	References each field after splitting
awk	A command that processes text on a per-field basis

Explanation

In awk, using a regular expression with -F allows you to handle multiple delimiters simultaneously. Using a character class [] is the simplest and most practical approach.
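One thing to watch for with a bare character class: each delimiter character separates fields on its own, so two adjacent delimiters produce an empty field between them. The sketch below uses a made-up line to show this.

```shell
# A comma immediately followed by a semicolon yields an empty second field.
result=$(printf 'apple,;grape\n' |
    awk -F '[,;]' '{ print NF ": [" $1 "] [" $2 "] [" $3 "]" }')
echo "$result"   # 3: [apple] [] [grape]
```

If empty fields are not wanted, appending + to the class makes a run of delimiters count as one.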

Flexible Field Splitting Techniques Using Regular Expressions

Creating the File

cat << 'EOF' > input.txt
name:John, age=25; city Tokyo
name:Alice, age=30; city Osaka
name:Bob, age=22; city Nagoya
EOF

Command

awk -F '[:,=; ]+' '{print $2, $4, $6}' input.txt

Output

John 25 Tokyo
Alice 30 Osaka
Bob 22 Nagoya

Command

awk -F '[,;]' '{print $1 "|" $2 "|" $3}' input.txt

Output

name:John| age=25| city Tokyo
name:Alice| age=30| city Osaka
name:Bob| age=22| city Nagoya

How It Works

Element	Description
-F	Specifies the field delimiter
[:,=; ]+	A regex that treats colon, comma, equals, semicolon, and space together as a delimiter
$1, $2 ...	References fields after splitting
[,;]	Splits on comma or semicolon only (used in the second command)
+	Matches one or more consecutive delimiter characters, so a run is treated as a single delimiter

Explanation

By using regular expressions as delimiters in awk, you can flexibly split data in multiple formats. This is extremely effective for parsing complex logs and mixed-format data.
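A regular expression delimiter can also absorb the whitespace around a separator. As a sketch, the pattern ' *, *' below matches a comma together with any spaces on either side, so the fields come out already trimmed. The input line is an assumption made up for this example.

```shell
# Commas with inconsistent spacing around them.
result=$(printf 'Tokyo ,  Osaka,Nagoya\n' |
    awk -F ' *, *' '{ print "[" $1 "] [" $2 "] [" $3 "]" }')
echo "$result"   # [Tokyo] [Osaka] [Nagoya]
```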

Using the Built-in Variable OFS to Control Output Delimiters

Creating the File

cat << 'EOF' > input.txt
apple orange banana
dog cat mouse
EOF

Command

awk '{print $1, $2, $3}' input.txt

Output

apple orange banana
dog cat mouse

Command

awk 'BEGIN {OFS=","} {print $1, $2, $3}' input.txt

Output

apple,orange,banana
dog,cat,mouse

Command

awk 'BEGIN {OFS="\t"} {print $1, $2, $3}' input.txt

Output

apple	orange	banana
dog	cat	mouse

How It Works

Element	Description
FS	Input field delimiter (default is a single space, which means runs of spaces and tabs)
OFS	Output field delimiter (used when printing)
$1, $2...	Field (column) references
BEGIN	A block executed only once before input processing
print	Outputs its comma-separated arguments joined by OFS

Explanation

The output delimiter is controlled by OFS, which print inserts between arguments that are separated by commas.
Combined with FS, it lets you control both the input and the output format.
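A related detail is that OFS does not change $0 by itself. awk rebuilds $0 with OFS only after a field has been assigned, which is why the idiom $1 = $1 is often used. The following is a minimal sketch of the difference.

```shell
# Printing $0 untouched: the original spacing is kept.
unchanged=$(echo 'apple orange banana' |
    awk 'BEGIN { OFS = "," } { print $0 }')

# Assigning any field forces awk to rebuild $0 using OFS.
rebuilt=$(echo 'apple orange banana' |
    awk 'BEGIN { OFS = "," } { $1 = $1; print $0 }')

echo "$unchanged"   # apple orange banana
echo "$rebuilt"     # apple,orange,banana
```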

Splitting and Processing One Character at a Time Using an Empty String as a Delimiter

Creating the File

cat << 'EOF' > input.txt
Hello
EOF

Command

awk -v FS="" '{ for(i=1;i<=NF;i++) print $i }' input.txt

Output

H
e
l
l
o

How It Works

Element	Description
FS=""	Sets the delimiter to an empty string (splits on each character)
NF	Number of fields (= number of characters)
$i	The i-th character
for loop	Iterates through one character at a time

Explanation

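Specifying FS="" causes each character to be treated as a field in GNU awk and several other implementations, which makes it easy to process data one character at a time. POSIX leaves the behavior of an empty FS unspecified, so the result may differ on other awk implementations.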
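Where an empty FS cannot be relied on, one possible alternative is to walk the line with length and substr, both of which are standard awk functions. The sketch below prints each character of a sample word on its own line.

```shell
# Character-by-character loop that does not depend on FS.
chars=$(printf 'Hello\n' |
    awk '{ for (i = 1; i <= length($0); i++) print substr($0, i, 1) }')
echo "$chars"
```

This prints the same five lines as the FS="" version.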

Settings for Handling Fixed-width Data

Creating the File

cat << 'EOF' > input.txt
John      25Engineer 
Alice     30Designer 
Bob       22Student  
EOF

Command

awk '{name=substr($0,1,10); age=substr($0,11,2); job=substr($0,13,9); print name, age, job}' input.txt

Output

John       25 Engineer
Alice      30 Designer
Bob        22 Student

Command

awk '{print $1, $2}' input.txt

Output

John 25Engineer
Alice 30Designer
Bob 22Student

How It Works

Item	Description
substr($0,1,10)	Characters 1 through 10 (name)
substr($0,11,2)	Characters 11 through 12 (age)
substr($0,13,9)	Characters 13 through 21 (job)
$0	The entire line as a string
Default splitting	Uses whitespace as the field separator, so the adjacent age and job columns run together

Explanation

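substr extracts a substring by start position and length, so each column can be cut at its fixed offset. For fixed-width data, working by character position is the basic approach; default whitespace splitting fails when adjacent columns have no space between them, as the second command shows.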
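Fields cut with substr keep their padding. As a sketch, the example below strips the trailing spaces with sub before printing the values as comma-separated output. The records are generated with printf using the same widths as input.txt (10, 2, and 9 characters); the variable names are chosen for illustration.

```shell
# Build two fixed-width records; printf reuses the format for each triple.
records=$(printf '%-10s%2s%-9s\n' John 25 Engineer Bob 22 Student)

result=$(printf '%s\n' "$records" | awk '{
    name = substr($0, 1, 10);  sub(/ +$/, "", name)   # strip right padding
    age  = substr($0, 11, 2)
    job  = substr($0, 13, 9);  sub(/ +$/, "", job)
    print name "," age "," job
}')
echo "$result"
# John,25,Engineer
# Bob,22,Student
```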

Extracting and Formatting Specific Columns from System Logs

Creating the File

cat << 'EOF' > input.txt
2026-05-01 INFO user1 login
2026-05-01 ERROR user2 failed
2026-05-02 INFO user3 logout
EOF

Command

awk '{print $1, $3}' input.txt

Output

2026-05-01 user1
2026-05-01 user2
2026-05-02 user3

Command

awk -F ' ' '{print $2, $4}' input.txt

Output

INFO login
ERROR failed
INFO logout

How It Works

Element	Description
awk	A text processing tool
-F	Specifies the field delimiter (field separator); -F ' ' is the same as the default whitespace splitting
$1, $2...	Represents each column (field)
print	Outputs the specified columns

Explanation

awk splits columns using a delimiter and can extract only the needed fields.
This is extremely effective for extracting specific columns in log analysis.
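Field references can also appear in the condition, not only in the print statement. The sketch below reuses the three sample log lines to select ERROR entries and to count INFO entries; the variable names are assumptions for this example.

```shell
log='2026-05-01 INFO user1 login
2026-05-01 ERROR user2 failed
2026-05-02 INFO user3 logout'

# Print date, user, and action only for lines whose 2nd field is ERROR.
errors=$(printf '%s\n' "$log" | awk '$2 == "ERROR" { print $1, $3, $4 }')

# Count INFO lines; n + 0 prints 0 instead of an empty string if none match.
info_count=$(printf '%s\n' "$log" | awk '$2 == "INFO" { n++ } END { print n + 0 }')

echo "$errors"       # 2026-05-01 user2 failed
echo "$info_count"   # 2
```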

Summary: Key Points for Using Fields and Separators in awk

Understanding fields and separators is essential to mastering awk.

By appropriately using -F, FS, and OFS, you can flexibly handle everything from data splitting to formatting.

Furthermore, combining them with regular expressions makes it possible to handle complex, real-world data.

For beginners, it is important to first understand the default behavior and then gradually progress to more advanced usage.

Other Articles on awk Beyond Field Separators

The article linked below covers the awk command more broadly.

Refer to it if you want a comprehensive overview.

Mastering the awk Command
