Introduction
The awk split function is a convenient feature that splits a string by a specified delimiter and stores the results in an array.
It can be used in a wide range of situations, including log analysis, CSV processing, and configuration file inspection.
This article explains the awk split function from the basics to practical usage, including common pitfalls for beginners.
Reference: GNU awk
Basic Syntax for Splitting Strings with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,orange,banana
EOF
Command
awk '{
split($0, fruits, ",")
print fruits[1]
print fruits[2]
print fruits[3]
}' input.txt
Output
apple
orange
banana
How It Works
| Item | Description |
|---|---|
| Role of split | Splits a string by a delimiter |
| 1st argument | The string to split |
| 2nd argument | The array to store the split values |
| 3rd argument | The delimiter |
| Array access | Retrieve values like fruits[1] |
Explanation
Using awk's split function, you can easily split CSV-style strings.
Because the results are stored in an array, it also works well in combination with loop processing.
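split is not tied to $0: it accepts any string expression. The following is a minimal sketch (the function name first_item is a hypothetical helper, not part of awk) that splits a literal string inside a BEGIN block, so no input file is needed. It also shows that indexes start at 1, so element 0 is never filled.

```shell
# Hypothetical helper: split a literal string without reading any input.
first_item() {
  awk 'BEGIN {
    split("apple,orange,banana", fruits, ",")
    # split fills fruits[1]..fruits[3]; fruits[0] stays empty
    print fruits[1], "zero=[" fruits[0] "]"
  }'
}

first_item   # prints: apple zero=[]
```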
How to Store Split Results in an Array with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,banana,orange
EOF
Command
awk '{
n = split($0, fruits, ",")
for (i = 1; i <= n; i++) {
print "fruits[" i "] = " fruits[i]
}
}' input.txt
Output
fruits[1] = apple
fruits[2] = banana
fruits[3] = orange
How It Works
| Item | Description |
|---|---|
| split function | Splits a string by a specified delimiter |
| 1st argument | The string to split ($0) |
| 2nd argument | The array to store the results (fruits) |
| 3rd argument | The delimiter (",") |
| Return value | The number of elements after splitting (n) |
Explanation
Using awk's split function, you can easily store strings into an array.
It is commonly used for processing delimited data such as CSV formats and log analysis.
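One detail matters when the same array is reused: split deletes every existing element of the target array before storing the new pieces. The sketch below (clears_array is a hypothetical name) splits a four-item string and then a two-item string into the same array to show that nothing from the first call is left over.

```shell
# Hypothetical helper: show that split empties the array before refilling it.
clears_array() {
  awk 'BEGIN {
    split("a,b,c,d", arr, ",")    # arr now has 4 elements
    n = split("x,y", arr, ",")    # arr is emptied, then refilled with 2
    print n, "[" arr[1] "]", "[" arr[2] "]", "[" arr[3] "]"
  }'
}

clears_array   # prints: 2 [x] [y] []
```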
How to Use the Return Value of awk’s split Function
Create File
cat << 'EOF' > input.txt
apple orange banana grape
EOF
Command
awk '{
count = split($0, fruits, " ")
print "Count:", count
for (i = 1; i <= count; i++) {
print i ":" fruits[i]
}
}' input.txt
Output
Count: 4
1:apple
2:orange
3:banana
4:grape
How It Works
| Item | Description |
|---|---|
| split function | Splits a string by a delimiter |
| Return value | Returns the number of elements after splitting |
| fruits array | Array storing the split values |
| " " | Special case: splits on runs of spaces, tabs, and newlines, and ignores leading and trailing whitespace |
| count variable | Holds the return value of split |
Explanation
Using the return value of split, you can accurately obtain the number of array elements.
Combining it with loop processing allows you to handle split data flexibly.
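The return value is also useful for edge cases. As a small sketch (count_cases is a hypothetical name), the three calls below show what split returns for an empty string, for a string that does not contain the delimiter, and for a string that ends with the delimiter.

```shell
# Hypothetical helper: return values of split for three edge cases.
count_cases() {
  awk 'BEGIN {
    print split("", a, ",")        # empty string: returns 0
    print split("apple", a, ",")   # delimiter not found: returns 1
    print split("a,b,", a, ",")    # trailing delimiter: returns 3, a[3] is empty
  }'
}

count_cases
```

Checking for a return value of 0 before reading arr[1] is a simple way to guard against empty input.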
Behavior When the 3rd Argument of awk’s split Function Is Omitted
Create File
cat << 'EOF' > input.txt
apple orange grape
dog cat mouse
red blue green
EOF
Command
awk '{
n = split($0, arr)
printf "count=%d ", n
for (i = 1; i <= n; i++) {
printf "[%s] ", arr[i]
}
print ""
}' input.txt
Output
count=3 [apple] [orange] [grape]
count=3 [dog] [cat] [mouse]
count=3 [red] [blue] [green]
How It Works
| Item | Description |
|---|---|
| Function | split(string, array) |
| When 3rd argument is omitted | Uses the value of FS as the delimiter |
| FS in this case | The default value, a single space " ", which splits on runs of whitespace |
| Return value | Number of elements after splitting |
| Storage | Stored sequentially from arr[1] |
Explanation
When the 3rd argument of split is omitted, awk's field separator FS is automatically used.
With the default FS (a single space), any run of spaces, tabs, or newlines acts as one delimiter, and leading and trailing whitespace is ignored.
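Because the two-argument form follows FS, changing FS changes how split behaves. The sketch below (split_with_fs is a hypothetical name, and the sample line is made up) sets FS to a colon with -F and then calls split without a third argument.

```shell
# Hypothetical helper: two-argument split picks up the FS set by -F.
split_with_fs() {
  printf 'root:x:0:0\n' | awk -F: '{
    n = split($0, arr)    # no 3rd argument, so FS (":") is the delimiter
    print n, arr[1], arr[3]
  }'
}

split_with_fs   # prints: 4 root 0
```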
How to Split a Space-Delimited String with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple orange grape
EOF
Command
awk '{
split($0, arr, " ")
print arr[1]
print arr[2]
print arr[3]
}' input.txt
Output
apple
orange
grape
How It Works
| Item | Description |
|---|---|
| split function | split($0, arr, " ") |
| $0 | The entire input line |
| arr | Array to store the split values |
| " " | Special case: splits on runs of spaces, tabs, and newlines, and ignores leading and trailing whitespace |
Explanation
Using awk's split function, you can easily split a space-delimited string into an array.
After splitting, each element can be accessed by its array index, such as arr[1].
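The string " " is a special case rather than a literal single space. To illustrate the difference, the sketch below (space_vs_regex is a hypothetical name) splits the same padded line once with " " and once with the regular expression /[ ]/, which treats every individual space as a delimiter.

```shell
# Hypothetical helper: compare " " with a literal single-space regex.
space_vs_regex() {
  printf '  apple  orange grape \n' | awk '{
    a = split($0, x, " ")     # runs of whitespace, edges ignored
    b = split($0, y, /[ ]/)   # each space is a delimiter, empty pieces kept
    print a, b
  }'
}

space_vs_regex   # prints: 3 7
```

The second call returns 7 because the six spaces in the line produce seven pieces, four of which are empty.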
How to Split Comma-Delimited CSV with awk’s split Function
Create File
cat << 'EOF' > input.txt
name,age,city
Alice,25,Tokyo
Bob,30,Osaka
Charlie,22,Nagoya
EOF
Command
awk -F',' '{
split($0, data, ",")
print "Name=" data[1] ", Age=" data[2] ", City=" data[3]
}' input.txt
Output
Name=name, Age=age, City=city
Name=Alice, Age=25, City=Tokyo
Name=Bob, Age=30, City=Osaka
Name=Charlie, Age=22, City=Nagoya
How It Works
| Item | Description |
|---|---|
| Delimiter specification | -F',' sets the field separator FS to a comma; split() does not rely on it here because "," is passed explicitly |
| split function | split($0, data, ",") |
| $0 | Represents the entire line |
| data[1] | Value of the 1st column |
| data[2] | Value of the 2nd column |
| data[3] | Value of the 3rd column |
Explanation
With split, each CSV column becomes an array element, so comma-delimited data can be extracted with very little code.
Note that split does not understand CSV quoting: a quoted field that contains a comma is split into two elements.
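The output above also formats the header row as if it were data. One common way to avoid that, shown here as a sketch (skip_header is a hypothetical name), is to add the pattern NR > 1 so the action runs only from the second line onward.

```shell
# Hypothetical helper: skip the header row with the NR > 1 pattern.
skip_header() {
  printf 'name,age,city\nAlice,25,Tokyo\nBob,30,Osaka\n' | awk 'NR > 1 {
    split($0, data, ",")
    print "Name=" data[1] ", Age=" data[2] ", City=" data[3]
  }'
}

skip_header
```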
How to Split Tab-Delimited Data with awk’s split Function
Create File
printf 'id\tname\tscore\n' > input.txt
printf '1\tAlice\t80\n' >> input.txt
printf '2\tBob\t92\n' >> input.txt
printf '3\tCarol\t75\n' >> input.txt
Command
awk '{
split($0, data, "\t")
print "ID=" data[1] ", NAME=" data[2] ", SCORE=" data[3]
}' input.txt
Output
ID=id, NAME=name, SCORE=score
ID=1, NAME=Alice, SCORE=80
ID=2, NAME=Bob, SCORE=92
ID=3, NAME=Carol, SCORE=75
How It Works
| Item | Description |
|---|---|
| Command | split($0, data, "\t") |
| $0 | Represents the entire line |
| data | Array to store the split values |
| "\t" | Specifies tab as delimiter |
| data[1] | Value of the 1st column |
| data[2] | Value of the 2nd column |
| data[3] | Value of the 3rd column |
Explanation
Using awk's split function, tab-delimited data can be easily split into an array.
awk split is frequently used for TSV file and log data analysis.
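Splitting on "\t" rather than on whitespace matters when a field itself contains spaces. As a sketch (tab_fields is a hypothetical name, and the sample record is made up), the line below has a two-word name in its second column, and splitting on tab keeps it in one piece.

```shell
# Hypothetical helper: a tab delimiter keeps spaces inside a field intact.
tab_fields() {
  printf '1\tAlice Smith\t80\n' | awk '{
    n = split($0, data, "\t")
    print n ": [" data[2] "]"
  }'
}

tab_fields   # prints: 3: [Alice Smith]
```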
How to Split a String Using Multiple Delimiters with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,orange;grape
EOF
Command
awk '
{
n = split($0, arr, /[,;]/)
for (i = 1; i <= n; i++) {
print arr[i]
}
}
' input.txt
Output
apple
orange
grape
How It Works
| Item | Description |
|---|---|
| split function | Splits a string into an array |
| 1st argument | The string to split |
| 2nd argument | Array to store the results |
| 3rd argument | Regular expression for the delimiter |
| [,;] | Split on either , or ; |
| arr[i] | Retrieve each split element |
Explanation
By specifying a regular expression as the 3rd argument of split, multiple delimiters can be handled together.
Using awk split allows you to efficiently process CSV and log data.
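The delimiter does not have to be written as a /.../ literal. A string longer than one character is interpreted as a regular expression, so the sketch below (string_regex_sep is a hypothetical name) passes "[,;]" as a string and gets the same result. This form is convenient when the pattern is held in a variable.

```shell
# Hypothetical helper: a multi-character string delimiter is used as a regex.
string_regex_sep() {
  printf 'apple,orange;grape\n' | awk '{
    sep = "[,;]"
    n = split($0, arr, sep)
    print n, arr[1], arr[2], arr[3]
  }'
}

string_regex_sep   # prints: 3 apple orange grape
```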
How to Split a String Using a Regular Expression with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,,orange;;;grape:melon
EOF
Command
awk '{
n = split($0, arr, /[,;:]+/)
for (i = 1; i <= n; i++) {
print arr[i]
}
}' input.txt
Output
apple
orange
grape
melon
How It Works
| Item | Description |
|---|---|
| Role of split | Splits a string into an array using split() |
| Regular expression | /[,;:]+/ specifies , ; : together as delimiters |
| Meaning of + | Treats consecutive delimiters as one |
| Array storage | Split values are stored from arr[1] onward |
| Return value | split() returns the number of elements |
Explanation
With awk's split, specifying a regular expression as the 3rd argument enables flexible string splitting.
Being able to process multiple types of delimiters together makes it useful for log analysis and CSV processing.
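To see what the + actually changes, the sketch below (plus_vs_no_plus is a hypothetical name) splits the same string with and without it. Without +, the two adjacent commas produce an empty element between them.

```shell
# Hypothetical helper: compare /[,;:]/ with /[,;:]+/ on doubled delimiters.
plus_vs_no_plus() {
  printf 'apple,,orange\n' | awk '{
    a = split($0, x, /[,;:]/)    # each delimiter splits: apple, "", orange
    b = split($0, y, /[,;:]+/)   # a run of delimiters splits once
    print a, "[" x[2] "]", b, "[" y[2] "]"
  }'
}

plus_vs_no_plus   # prints: 3 [] 2 [orange]
```

Which form is appropriate depends on whether empty fields carry meaning in the data.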
How to Loop Through an Array with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,banana,orange
EOF
Command
awk '{
n = split($0, fruits, ",")
for (i = 1; i <= n; i++) {
print "fruits[" i "] = " fruits[i]
}
}' input.txt
Output
fruits[1] = apple
fruits[2] = banana
fruits[3] = orange
How It Works
| Item | Description |
|---|---|
| split function | Splits a string by delimiter and stores in an array |
| 1st argument | The string to split |
| 2nd argument | Destination array |
| 3rd argument | The delimiter |
| Return value | Number of elements after splitting |
| Loop processing | for loop iterates from 1 to element count |
Explanation
Using awk's split function, strings can be easily converted into arrays.
Using the return value (the element count) as the loop bound lets you iterate over exactly the elements that split created.
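An index-based loop also gives control over the order, which for (key in array) does not guarantee. As a sketch (reverse_items is a hypothetical name), the loop below starts at n and counts down to print the elements in reverse.

```shell
# Hypothetical helper: walk the array from the last element to the first.
reverse_items() {
  printf 'apple,banana,orange\n' | awk '{
    n = split($0, fruits, ",")
    for (i = n; i >= 1; i--) {
      print fruits[i]
    }
  }'
}

reverse_items
```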
How to Combine awk’s split Function with a for Loop
Create File
cat << 'EOF' > input.txt
apple,banana,orange
grape,melon,peach
EOF
Command
awk '{
n = split($0, fruits, ",")
for (i = 1; i <= n; i++) {
print "Element" i ":" fruits[i]
}
}' input.txt
Output
Element1:apple
Element2:banana
Element3:orange
Element1:grape
Element2:melon
Element3:peach
How It Works
| Item | Description |
|---|---|
| split function | Splits a string and stores it in an array |
| Return value n | Returns the number of split elements |
| $0 | Represents the entire input line |
| fruits | Array generated by split |
| for (i = 1; i <= n; i++) | Iterates sequentially for the number of elements |
| fruits[i] | Retrieves the value in the array |
Explanation
Storing the return value of split into variable n allows precise management of the array element count.
Using n in a for loop ensures that loop processing runs the exact number of times needed.
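The same loop shape works for calculations as well as printing. The sketch below (sum_line is a hypothetical name, and the numbers are made-up sample data) adds up the values on each line, regardless of how many there are.

```shell
# Hypothetical helper: sum a variable number of comma-separated values per line.
sum_line() {
  printf '10,20,30\n1,2,3,4\n' | awk '{
    n = split($0, nums, ",")
    total = 0                      # reset for every input line
    for (i = 1; i <= n; i++) {
      total += nums[i]
    }
    print total
  }'
}

sum_line
```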
How to Extract a Specific Element with awk’s split Function
Create File
cat << 'EOF' > input.txt
apple,banana,orange,grape
EOF
Command
awk -F',' '{
split($0, fruits, ",")
print fruits[2]
}' input.txt
Output
banana
How It Works
| Item | Description |
|---|---|
| split($0, fruits, ",") | Splits entire line by , and stores in array fruits |
| fruits[2] | Retrieves the 2nd element |
| -F',' | Specifies , as the input field separator |
| print | Outputs the specified element |
Explanation
Using split allows strings to be treated as arrays, making it easy to retrieve specific elements. It is a convenient method frequently used for CSV data processing and log analysis.
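When the position is not fixed, the return value can serve as an index. As a sketch (last_item is a hypothetical name), storing the count in n makes fruits[n] the last element, whatever the length of the line.

```shell
# Hypothetical helper: fetch the last element by using the count as the index.
last_item() {
  printf 'apple,banana,orange,grape\n' | awk '{
    n = split($0, fruits, ",")
    if (n > 0) {
      print fruits[n]
    }
  }'
}

last_item   # prints: grape
```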
How to Split Keys and Values with awk’s split Function
Create File
cat << 'EOF' > input.txt
name:Tanaka age:30 city:Tokyo
EOF
Command
awk '{
for (i = 1; i <= NF; i++) {
split($i, data, ":")
print "Key=" data[1] ", Value=" data[2]
}
}' input.txt
Output
Key=name, Value=Tanaka
Key=age, Value=30
Key=city, Value=Tokyo
How It Works
| How It Works | Command |
|---|---|
| Split each field on ":" | split($i, data, ":") |
| Key goes to data[1], value to data[2] | print "Key=" data[1] ", Value=" data[2] |
| Process each field sequentially with a for loop | for (i = 1; i <= NF; i++) |
Explanation
Using awk's split function, strings can be easily split by a delimiter.
It is commonly used for key-value data analysis and log processing.
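If the values need to be looked up by key later, the pairs can be copied into an associative array. The following is a sketch under that assumption (lookup_city and the array name map are hypothetical). It uses split("", map) as a portable way to empty the array at the start of each line.

```shell
# Hypothetical helper: load key:value pairs into an associative array.
lookup_city() {
  printf 'name:Tanaka age:30 city:Tokyo\n' | awk '{
    split("", map)                 # clear pairs left from the previous line
    for (i = 1; i <= NF; i++) {
      split($i, kv, ":")
      map[kv[1]] = kv[2]
    }
    print map["city"], map["name"]
  }'
}

lookup_city   # prints: Tokyo Tanaka
```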
How to Re-Split a Specific CSV Column with awk’s split Function
Create File
cat << 'EOF' > input.txt
id,name,data
1,Alice,red|blue|green
2,Bob,yellow|white
3,Carol,black|pink|orange
EOF
Command
awk -F',' '{
split($3, arr, "|")
print $1, $2, arr[1], arr[2], arr[3]
}' input.txt
Output
id name data
1 Alice red blue green
2 Bob yellow white
3 Carol black pink orange
How It Works
| Item | Description |
|---|---|
| Delimiter | -F',' reads the CSV with comma as separator |
| split function | split($3, arr, "|") |
| Target column | $3 specifies the 3rd column |
| Array storage | Refer to split results via arr[1], arr[2], etc. |
Explanation
Using awk's split function, a specific CSV column can be further split by another delimiter.
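Because the results are stored in an array, you can pick out only the elements you need. When a row has fewer than three values, the missing elements are empty strings, so print still emits the separating spaces for them at the end of the line.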
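When the number of values in the column varies from row to row, looping up to the return value avoids referring to elements that do not exist. The sketch below (join_colors is a hypothetical name) builds each output line piece by piece, so no trailing spaces appear for shorter rows.

```shell
# Hypothetical helper: re-split column 3 and append only the pieces that exist.
join_colors() {
  printf '1,Alice,red|blue|green\n2,Bob,yellow|white\n' | awk -F',' '{
    n = split($3, arr, "|")
    line = $1 " " $2
    for (i = 1; i <= n; i++) {
      line = line " " arr[i]
    }
    print line
  }'
}

join_colors
```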
How to Combine awk’s split Function with gsub for String Processing
Create File
cat << 'EOF' > input.txt
apple,orange,banana
grape,,melon
lemon,lime
EOF
Command
awk '{
n = split($0, arr, ",")
for (i = 1; i <= n; i++) {
gsub(/^ +| +$/, "", arr[i])
if (arr[i] == "")
arr[i] = "EMPTY"
printf "[%s]", arr[i]
}
print ""
}' input.txt
Output
[apple][orange][banana]
[grape][EMPTY][melon]
[lemon][lime]
How It Works
| Process | Description |
|---|---|
| split($0, arr, ",") | Splits the string by comma into an array |
| gsub(/^ +| +$/, "", arr[i]) | Removes leading and trailing whitespace |
| arr[i] == "" | Checks for an empty string |
| printf "[%s]", arr[i] | Outputs the processed string formatted |
Explanation
Processing split elements with gsub enables flexible string handling.
It is a technique commonly used for formatting CSV-style data and log processing.
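The sample input above contains no padding, so the trimming step has nothing to remove there. As a sketch with padded input (trim_items is a hypothetical name, and the pattern is widened to [ \t] to cover tabs as well), the effect of gsub becomes visible.

```shell
# Hypothetical helper: trim spaces and tabs around each split element.
trim_items() {
  printf ' apple , orange ,banana \n' | awk '{
    n = split($0, arr, ",")
    for (i = 1; i <= n; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", arr[i])
      printf "[%s]", arr[i]
    }
    print ""
  }'
}

trim_items   # prints: [apple][orange][banana]
```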
How to Combine awk’s split Function with substr
Create File
cat << 'EOF' > input.txt
apple,banana,grape
orange,melon,lemon
EOF
Command
awk -F',' '{
split($0, arr, ",")
print substr(arr[2], 1, 3)
}' input.txt
Output
ban
mel
How It Works
| Item | Description |
|---|---|
| split | Splits the string by delimiter , into an array |
| arr[2] | Retrieves the 2nd element after splitting |
| substr | Extracts a portion of a string |
| substr(arr[2],1,3) | Retrieves the first 3 characters of the 2nd element |
Explanation
Processing a split string further with substr enables flexible extraction of specific portions.
It is a commonly used combination for CSV text processing and log analysis.
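substr can also be used to rebuild each element rather than only shorten it. The sketch below (capitalize_items is a hypothetical name) combines it with toupper to capitalize the first letter of every element and then joins the results with commas again.

```shell
# Hypothetical helper: capitalize the first letter of each split element.
capitalize_items() {
  printf 'apple,banana,grape\n' | awk '{
    n = split($0, arr, ",")
    for (i = 1; i <= n; i++) {
      word = toupper(substr(arr[i], 1, 1)) substr(arr[i], 2)
      # print a comma after every element except the last
      printf "%s%s", word, (i < n ? "," : "\n")
    }
  }'
}

capitalize_items   # prints: Apple,Banana,Grape
```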
How to Analyze Log Files with awk’s split Function
Create File
cat << 'EOF' > input.txt
2026-05-14,INFO,user=alice,ip=192.168.1.10
2026-05-14,ERROR,user=bob,ip=192.168.1.20
2026-05-14,WARN,user=charlie,ip=192.168.1.30
EOF
Command
awk -F',' '{
split($3, user, "=")
split($4, ip, "=")
printf "USER:%s IP:%s STATUS:%s\n", user[2], ip[2], $2
}' input.txt
Output
USER:alice IP:192.168.1.10 STATUS:INFO
USER:bob IP:192.168.1.20 STATUS:ERROR
USER:charlie IP:192.168.1.30 STATUS:WARN
How It Works
| How It Works | Command |
|---|---|
| Specify comma delimiter with -F',' | awk -F',' '{print $1}' input.txt |
| Store = delimited values into array with split() | awk -F',' '{split($3,a,"="); print a[2]}' input.txt |
| Format and display with printf | awk -F',' '{printf "%s %s\n",$2,$3}' input.txt |
Explanation
Using awk's split, values in CSV-formatted logs can be flexibly decomposed.
Splitting first on commas with -F and then on = with split handles both levels of delimiter in a few lines, which is enough for practical log analysis.
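The same idea extends to other fields of the log line. As a sketch (log_month is a hypothetical name), the date in the first column can be split on "-" to obtain the year and month, for example to group entries by month.

```shell
# Hypothetical helper: break the date column into year, month, and day.
log_month() {
  printf '2026-05-14,ERROR,user=bob,ip=192.168.1.20\n' | awk -F',' '{
    split($1, d, "-")        # d[1]=year, d[2]=month, d[3]=day
    split($3, user, "=")
    print d[1] "/" d[2], $2, user[2]
  }'
}

log_month   # prints: 2026/05 ERROR bob
```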
How to Analyze Configuration Files with awk’s split Function
Create File
cat << 'EOF' > input.txt
server=app01
port=8080
timeout=30
log_level=debug
EOF
Command
awk -F= '{
split($0, config, "=")
printf "key=%s value=%s\n", config[1], config[2]
}' input.txt
Output
key=server value=app01
key=port value=8080
key=timeout value=30
key=log_level value=debug
How It Works
| Item | Description |
|---|---|
| -F= | Sets = as the field separator FS; split() does not rely on it here because "=" is passed explicitly |
| split($0, config, "=") | Splits the entire line by = and stores in an array |
| config[1] | Retrieves the key name |
| config[2] | Retrieves the value |
| printf | Formats and outputs the result |
Explanation
Using awk's split function, configuration file format data can be easily parsed.
Because keys and values are handled separately, the same approach applies to log analysis and environment variable inspection. Note that a value containing = is cut short: config[2] holds only the text up to the next =.
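For values that may themselves contain =, one alternative is to cut the line at the first = only. The sketch below swaps split for index and substr to do that. The function name first_equals and the sample URL are hypothetical.

```shell
# Hypothetical helper: separate key and value at the first "=" only.
first_equals() {
  printf 'url=http://example.com/?a=1&b=2\n' | awk '{
    pos = index($0, "=")           # position of the first "=", 0 if none
    if (pos > 0) {
      key = substr($0, 1, pos - 1)
      value = substr($0, pos + 1)  # everything after the first "="
      printf "key=%s value=%s\n", key, value
    }
  }'
}

first_equals   # prints: key=url value=http://example.com/?a=1&b=2
```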
How to Batch Process Multiple Files with awk’s split Function
Create File
cat << 'EOF' > input.txt
sales_2024.csv:Tokyo,120,150,180
sales_2025.csv:Osaka,200,210,250
report_01.csv:Nagoya,90,100,110
EOF
Create File
cat << 'EOF' > run.sh
#!/bin/sh
awk -F ':' '
{
n = split($2, data, ",")
printf "File: %s\n", $1
printf "Count: %d\n", n
for (i = 1; i <= n; i++) {
printf " data[%d] = %s\n", i, data[i]
}
print ""
}
' input.txt
EOF
Command
chmod +x run.sh
./run.sh
Output
File: sales_2024.csv
Count: 4
data[1] = Tokyo
data[2] = 120
data[3] = 150
data[4] = 180
File: sales_2025.csv
Count: 4
data[1] = Osaka
data[2] = 200
data[3] = 210
data[4] = 250
File: report_01.csv
Count: 4
data[1] = Nagoya
data[2] = 90
data[3] = 100
data[4] = 110
How It Works
| Item | Description |
|---|---|
| split() | Splits a string and stores it in an array |
| n = split(...) | Retrieves the number of split elements |
| for (i = 1; i <= n; i++) | Processes sequentially using return value n |
| data[i] | Array element after split |
| Batch processing | Enables continuous processing of multiple data entries |
Explanation
Using the return value of awk split, array size can be accurately controlled.
Because it loops dynamically rather than a fixed number of times, it is well-suited for processing variable-length data.
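The example above reads file names from a column of a single input file. When the data really is spread across several files, awk can be given all of them at once, and the built-in FILENAME variable identifies the file being read. The following is a sketch under stated assumptions: batch_demo, the temporary directory, and the file contents are hypothetical, and mktemp -d is assumed to be available. It applies split to FILENAME as well, to drop the directory part.

```shell
# Hypothetical helper: process two files in one awk run and label each record.
batch_demo() {
  dir=$(mktemp -d) || return 1
  printf 'Tokyo,120,150\n' > "$dir/sales_2024.csv"
  printf 'Osaka,200,210,250\n' > "$dir/sales_2025.csv"
  awk '{
    p = split(FILENAME, path, "/")   # path[p] is the name without directories
    n = split($0, data, ",")
    printf "%s: %d fields, first=%s\n", path[p], n, data[1]
  }' "$dir/sales_2024.csv" "$dir/sales_2025.csv"
  rm -rf "$dir"
}

batch_demo
```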
Summary of Key Points for Using awk’s split Function
The awk split function is an important feature that can streamline string operations.
By combining it with arrays and for loops, you can flexibly process diverse data such as CSV files, logs, and configuration files.
In particular, mastering splitting via regular expressions and combining it with gsub and substr greatly expands its practical applicability.
It is recommended to start with basic processing such as space-delimited or CSV splitting, and gradually expand to more advanced patterns.
