
Mastering awk Fields and Separators: A Practical Introduction

updated: 2026/05/05 created: 2026/05/04

Introduction

awk is a powerful command specialized in text processing, frequently used in log analysis and data formatting.

Understanding the concepts of fields and separators is the first hurdle to mastering awk.

Beginners often get stuck here, so it helps to run each command and watch how awk actually splits the input.

This article walks through fields and separators step by step, from the default behavior to more advanced usage.

Reference: GNU awk

Behavior and Notes on Default Delimiters (Spaces and Tabs)

Creating the File

cat << 'EOF' > input.txt
apple orange banana
cat dog mouse
one two three
EOF

Note: To type a literal tab in the terminal, press Ctrl+v and then the Tab key.

Command

awk '{print $1, $2, $3}' input.txt

Output

apple orange banana
cat dog mouse
one two three

Command

awk '{print NF}' input.txt

Output

3
3
3

Command

awk '{print "["$2"]"}' input.txt

Output

[orange]
[dog]
[two]

How It Works

Item | Description
Default delimiter | Spaces and tabs
Handling of consecutive delimiters | A run of consecutive spaces or tabs is treated as a single delimiter; leading and trailing whitespace is ignored
Field variables | $1, $2, $3 ...
Number of fields | Obtainable with NF
Specifying delimiter | Can be changed with the -F option

Explanation

By default awk splits on whitespace: any run of spaces and tabs counts as a single delimiter, and leading or trailing whitespace is ignored. Fields are therefore split correctly no matter how many spaces or tabs separate them.
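A quick way to check the default behavior is to feed awk a line with leading spaces and a mix of spaces and tabs. The sketch below uses made-up sample strings (not the article's input.txt) and hypothetical variable names. It also contrasts the default with -F '[ ]', a one-character bracket expression that makes every single space a delimiter.

```shell
# Default splitting: leading/trailing whitespace is ignored and
# any run of spaces and tabs counts as one delimiter.
default_split=$(printf '  a \t b\tc  \n' | awk '{print NF, $1, $3}')
echo "$default_split"    # 3 a c

# A bracket expression matching exactly one space turns off that
# special case: "a  b" (two spaces) now has an empty middle field.
single_space=$(printf 'a  b\n' | awk -F '[ ]' '{print NF}')
echo "$single_space"     # 3
```

The second command is only there for contrast; for ordinary whitespace-separated text the default is usually what you want.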

Specifying Field Delimiters Using the -F Option

Creating the File

cat << 'EOF' > input.txt
name,age,city
Alice,25,Tokyo
Bob,30,Osaka
Charlie,35,Nagoya
EOF

Command

awk -F ',' '{print $1, $2}' input.txt

Output

name age
Alice 25
Bob 30
Charlie 35

Command

awk -F ',' 'NR > 1 {print $3}' input.txt

Output

Tokyo
Osaka
Nagoya

How It Works

Element | Description
awk | A command that processes text by rows and columns
-F ',' | Specifies comma as the field delimiter
$1, $2, $3 | Represent the 1st, 2nd, and 3rd columns respectively
NR > 1 | Condition to exclude the first row (header)
{print ...} | Process to output the specified fields

Explanation

The -F option lets you change the delimiter from the command line. It is an essential basic feature, especially for processing simple CSV-like data.
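Once the columns are split, a field can also be used in the condition, not only in print. The following is a minimal sketch that reuses the same sample rows through a shell variable (the variable names are illustrative). It also shows one caveat: -F ',' splits on every comma and has no notion of CSV quoting.

```shell
# Sample data mirroring the rows above (header plus three records).
csv='name,age,city
Alice,25,Tokyo
Bob,30,Osaka
Charlie,35,Nagoya'

# Skip the header, keep rows whose second column is at least 30.
over_30=$(printf '%s\n' "$csv" | awk -F ',' 'NR > 1 && $2 >= 30 {print $1}')
echo "$over_30"      # Bob and Charlie, one per line

# Caveat: a comma inside quotes still splits the field.
quoted_nf=$(printf '"Smith, John",42\n' | awk -F ',' '{print NF}')
echo "$quoted_nf"    # 3, not 2
```

For data with quoted fields, a dedicated CSV parser is usually the safer choice.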

Defining Delimiters Within a Script Using the Built-in Variable FS

Creating the File

cat << 'EOF' > input.txt
apple,banana,orange
dog,cat,bird
EOF

Command

awk 'BEGIN { FS="," } { print $1, $2 }' input.txt

Output

apple banana
dog cat

Command

awk -F',' '{ print $1, $3 }' input.txt

Output

apple orange
dog bird

How It Works

Element | Description
FS | Built-in variable that defines the field delimiter
BEGIN | A block executed only once before input processing
$1, $2 | Data of the 1st and 2nd columns after splitting
-F | Option to specify the delimiter from the command line

Explanation

By setting FS, awk splits each line using the specified delimiter. The delimiter can be flexibly changed either within BEGIN or with the -F option.
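As a small check, the sketch below sets the same delimiter three ways: with -F, with a -v FS=... assignment, and inside BEGIN. All three take effect before the first line is read, so they should produce identical output. The sample line and variable names are only for illustration.

```shell
line='apple,banana,orange'

# Three ways to set the field separator before any input is read.
via_F=$(printf '%s\n' "$line" | awk -F ',' '{print $1, $2}')
via_v=$(printf '%s\n' "$line" | awk -v FS=',' '{print $1, $2}')
via_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "," } {print $1, $2}')

echo "$via_F"        # apple banana
echo "$via_v"        # apple banana
echo "$via_begin"    # apple banana
```

-F is convenient for one-liners, while BEGIN { FS = ... } keeps the setting inside the script when the awk program lives in its own file.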

Using Multiple Different Characters (Comma, Tab, Semicolon, etc.) as Delimiters Simultaneously

Creating the File

cat << 'EOF' > input.txt
apple,orange;grape	banana
dog;cat,bird	fish
EOF

Note: The gap before banana and before fish must be a tab character. To type a literal tab in the terminal, press Ctrl+v and then the Tab key.

Command

awk -F '[,;\t]' '{print $1, $2, $3, $4}' input.txt

Output

apple orange grape banana
dog cat bird fish

How It Works

Element | Description
-F | Specifies the field delimiter (Field Separator)
[,;\t] | A regex character class (specifying comma, semicolon, and tab simultaneously)
$1, $2... | References each field after splitting
awk | A command that processes text on a per-field basis

Explanation

In awk, using a regular expression with -F allows you to handle multiple delimiters simultaneously. Using a character class [] is the simplest and most practical approach.
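One detail worth checking: a character class matches exactly one character, so two delimiters in a row produce an empty field between them. The sketch below uses a made-up line, a,;b, to compare the plain class with the same class followed by +, which matches a whole run of delimiters.

```shell
# 'a,;b' has a comma immediately followed by a semicolon.
without_plus=$(printf 'a,;b\n' | awk -F '[,;]' '{print NF}')
with_plus=$(printf 'a,;b\n' | awk -F '[,;]+' '{print NF}')

echo "$without_plus"   # 3 fields: "a", "" and "b"
echo "$with_plus"      # 2 fields: "a" and "b"
```

Which form is right depends on the data: keep the plain class when an empty column is meaningful, and add + when repeated delimiters are just noise.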

Flexible Field Splitting Techniques Using Regular Expressions

Creating the File

cat << 'EOF' > input.txt
name:John, age=25; city Tokyo
name:Alice, age=30; city Osaka
name:Bob, age=22; city Nagoya
EOF

Command

awk -F '[:,=; ]+' '{print $2, $4, $6}' input.txt

Output

John 25 Tokyo
Alice 30 Osaka
Bob 22 Nagoya

Command

awk -F '[,;]' '{print $1 "|" $2 "|" $3}' input.txt

Output

name:John| age=25| city Tokyo
name:Alice| age=30| city Osaka
name:Bob| age=22| city Nagoya

How It Works

Element | Description
-F | Specifies the field delimiter
[:,=; ]+ | A regex that treats any run of colons, commas, equals signs, semicolons, and spaces as one delimiter
$1, $2 ... | References fields after splitting
[,;] | Splits on a single comma or semicolon
+ | Matches one or more of the preceding class, so consecutive delimiters are treated as one

Explanation

By using regular expressions as delimiters in awk, you can flexibly split data in multiple formats. This is extremely effective for parsing complex logs and mixed-format data.
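With the first regex above, the fields alternate between keys and values ($1 is name, $2 is John, and so on). As a rough sketch, a loop over NF can pair them up without hard-coding field numbers; the output format key=value is just an illustrative choice.

```shell
line='name:John, age=25; city Tokyo'

# Fields after splitting: name John age 25 city Tokyo (NF is 6).
# Walk them two at a time and print each key with its value.
pairs=$(printf '%s\n' "$line" |
  awk -F '[:,=; ]+' '{ for (i = 1; i < NF; i += 2) print $i "=" $(i + 1) }')
echo "$pairs"
```

This assumes every record really does alternate key and value; a line with a missing value would shift the pairs.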

Using the Built-in Variable OFS to Control Output Delimiters

Creating the File

cat << 'EOF' > input.txt
apple orange banana
dog cat mouse
EOF

Command

awk '{print $1, $2, $3}' input.txt

Output

apple orange banana
dog cat mouse

Command

awk 'BEGIN {OFS=","} {print $1, $2, $3}' input.txt

Output

apple,orange,banana
dog,cat,mouse

Command

awk 'BEGIN {OFS="\t"} {print $1, $2, $3}' input.txt

Output

apple	orange	banana
dog	cat	mouse

How It Works

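Element | Description
FS | Input field delimiter (default is a single space, which means splitting on runs of spaces and tabs)
OFS | Output field delimiter (used when printing; default is a single space)
$1, $2... | Field (column) references
BEGIN | A block executed only once before input processing
print | Outputs its comma-separated arguments joined by OFS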

Explanation

The output delimiter is controlled by OFS. print inserts it between comma-separated arguments, so print $1, $2 uses OFS while print $1 $2 simply concatenates the two fields.
Combined with FS, this lets you control both the input and the output format.
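OFS also comes into play when awk rebuilds the whole line. The sketch below, using a made-up three-word line, shows three cases: print with no arguments leaves $0 as it was read, assigning to a field (the common $1 = $1 idiom) makes awk rebuild $0 with OFS, and print without commas ignores OFS entirely.

```shell
# print with no arguments prints $0, which is still the original line.
unchanged=$(printf 'a b c\n' | awk 'BEGIN { OFS = "," } { print }')

# Assigning to any field makes awk rebuild $0 using OFS.
rebuilt=$(printf 'a b c\n' | awk 'BEGIN { OFS = "," } { $1 = $1; print }')

# Without commas, print concatenates its arguments; OFS is not used.
concat=$(printf 'a b c\n' | awk 'BEGIN { OFS = "," } { print $1 $2 }')

echo "$unchanged"   # a b c
echo "$rebuilt"     # a,b,c
echo "$concat"      # ab
```

The $1 = $1 form is handy when every column should be kept and only the delimiter changes.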

Splitting and Processing One Character at a Time Using an Empty String as a Delimiter

Creating the File

cat << 'EOF' > input.txt
Hello
EOF

Command

awk -v FS="" '{ for(i=1;i<=NF;i++) print $i }' input.txt

Output

H
e
l
l
o

How It Works

Element | Description
FS="" | Sets the delimiter to an empty string (splits on each character)
NF | Number of fields (= number of characters)
$i | The i-th character
for loop | Iterates through one character at a time

Explanation

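In gawk and most other current awk implementations, FS="" makes each character its own field, which makes character-by-character processing easy. POSIX leaves the behavior of an empty FS unspecified, so this is not guaranteed to be portable.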
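For an awk that does not support an empty FS, the same result can be had with length and substr alone. This is a sketch of that variant, not a replacement for the command above; the variable name is illustrative.

```shell
# Walk the line by character position instead of relying on FS="".
chars=$(printf 'Hello\n' |
  awk '{ for (i = 1; i <= length($0); i++) print substr($0, i, 1) }')
echo "$chars"
```

It prints one character per line, like the FS="" version, and leaves field splitting untouched, so $1, $2 and NF still describe the whitespace-separated words.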

Settings for Handling Fixed-width Data

Creating the File

cat << 'EOF' > input.txt
John      25Engineer
Alice     30Designer
Bob       22Student
EOF

Command

awk '{name=substr($0,1,10); age=substr($0,11,2); job=substr($0,13,9); print name, age, job}' input.txt

Output

John       25 Engineer
Alice      30 Designer
Bob        22 Student

Command

awk '{print $1, $2}' input.txt

Output

John 25Engineer
Alice 30Designer
Bob 22Student

How It Works

Item | Description
substr($0,1,10) | Characters 1 through 10 (name)
substr($0,11,2) | Characters 11 through 12 (age)
substr($0,13,9) | Characters 13 through 21 (job)
$0 | The entire line as a string
Standard awk | Processes using spaces as the field separator

Explanation

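substr($0, start, length) extracts a fixed range of characters from the line. Because fixed-width columns are defined by position rather than by a delimiter, default whitespace splitting (the second command) merges the age and job columns, so the basic approach is to work by character position.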
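The name extracted by substr still carries its padding, which is why the first output above has extra spaces. A possible follow-up, sketched here under the same 10/2/rest column layout, strips the padding with sub and emits a comma-separated line. The record is built with the shell's printf so the padding is exact; the variable names are illustrative.

```shell
# Build one fixed-width record: name padded to 10 columns, age in 2.
record=$(printf '%-10s%-2s%s' 'John' '25' 'Engineer')

csv_line=$(printf '%s\n' "$record" | awk '{
  name = substr($0, 1, 10)
  sub(/ +$/, "", name)        # strip the trailing padding
  age = substr($0, 11, 2)
  job = substr($0, 13)        # no length given: rest of the line
  print name "," age "," job
}')
echo "$csv_line"    # John,25,Engineer
```

Omitting the third argument of substr for the last column avoids having to know its maximum width.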

Extracting and Formatting Specific Columns from System Logs

Creating the File

cat << 'EOF' > input.txt
2026-05-01 INFO user1 login
2026-05-01 ERROR user2 failed
2026-05-02 INFO user3 logout
EOF

Command

awk '{print $1, $3}' input.txt

Output

2026-05-01 user1
2026-05-01 user2
2026-05-02 user3

Command

awk -F ' ' '{print $2, $4}' input.txt

Output

INFO login
ERROR failed
INFO logout

How It Works

Element | Description
awk | A text processing tool
-F | Specifies the field delimiter (field separator); -F ' ' selects the same whitespace splitting as the default
$1, $2... | Represents each column (field)
print | Outputs the specified columns

Explanation

awk splits columns using a delimiter and can extract only the needed fields.
This is extremely effective for extracting specific columns in log analysis.
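In practice, column extraction is often combined with a condition on one of the fields. The sketch below reuses the three sample log lines through a shell variable (names are illustrative), keeps only the ERROR entries, and uses $NF to pick the last column regardless of how many columns a line has.

```shell
log='2026-05-01 INFO user1 login
2026-05-01 ERROR user2 failed
2026-05-02 INFO user3 logout'

# Keep only lines whose second field is ERROR; print date, user, action.
errors=$(printf '%s\n' "$log" | awk '$2 == "ERROR" {print $1, $3, $4}')
echo "$errors"     # 2026-05-01 user2 failed

# $NF is the last field, whatever the number of columns.
actions=$(printf '%s\n' "$log" | awk '{print $NF}')
echo "$actions"
```

Real system logs often contain messages with spaces, so the last "column" may need to be rebuilt from several fields; this example assumes one word per column, as in the sample file.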

Summary: Key Points for Using Fields and Separators in awk

Understanding fields and separators is essential to mastering awk.

By appropriately using -F, FS, and OFS, you can flexibly handle everything from data splitting to formatting.

Furthermore, combining regular expressions makes it possible to handle complex, real-world data.

For beginners, it is important to first understand the default behavior and then gradually progress to more advanced usage.


©︎ 2025-2026 running terminal commands