Introduction
When it comes to manipulating strings on the command line, sed is an incredibly powerful option. Whether you are replacing file content or extracting specific lines, combining it with regular expressions allows you to complete complex tasks in a single line.
This article provides a step-by-step guide from basic usage to advanced techniques using regular expressions. It is structured to be accessible even for those unfamiliar with the command line, highlighting common pitfalls for beginners.
Reference: sed, a stream editor - GNU Official Documentation
How sed and Regex Replace the First Character with “H”
The primary command we will examine is sed 's/./H/' input.txt. Its parts are:
| Element | Description |
|---|---|
| sed | The stream editor. Processes files or standard input line by line. |
| 's/./H/' | The substitution command. Format: s/search_pattern/replacement_string/. |
| . | A regex metacharacter matching any single character. |
| H | The string to replace the match with. |
| input.txt | The target file for processing. |
The s command without flags only replaces the first match in each line. Consequently, only the very first character of every line changes to "H".
Before Execution
First, create input.txt using the following command. Note that for lines containing tabs, you may need to press Ctrl+v then Tab in your terminal.
cat << 'EOF' > input.txt
hello.
world.

hello world.
	hello.
1hello 2world.
EOF
input.txt:
hello.
world.

hello world.
	hello.
1hello 2world.
After Execution
Output:
Hello.
Horld.

Hello world.
Hhello.
Hhello 2world.
Empty lines remain unchanged because there is no character to match. In Line 5, the leading tab character is replaced by "H". In Line 6, the "1" is the first character, so it becomes "H".
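As an alternative to typing the tab by hand, the same input can be produced with printf, which interprets \t itself. This is a minimal sketch assuming the file contents described above (a blank line 3 and a leading tab on line 5); demo.txt is a placeholder name chosen so that an existing input.txt is not overwritten.

```shell
# Build the sample file with printf so the tab on line 5 is unambiguous.
# demo.txt is a hypothetical file name used only for this sketch.
printf 'hello.\nworld.\n\nhello world.\n\thello.\n1hello 2world.\n' > demo.txt

# Replace the first character of every non-empty line with "H".
sed 's/./H/' demo.txt
```

The blank line passes through untouched because . has nothing to match on it.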

Differences Between GNU sed and BSD sed
macOS typically comes with the BSD version of sed, while Linux usually features the GNU version. Their behaviors can differ.
| Item | GNU sed | BSD sed |
|---|---|---|
| -i option (In-place) | sed -i 's/a/b/' file | sed -i '' 's/a/b/' file (an extension argument is required; '' means no backup file) |
| \t (Tab) | Supported in regex | May not be supported |
| \+, \? | Supported in BRE | Often not supported |
| -E option | Enables Extended Regex (ERE) | Also used for ERE |
Unless otherwise noted, commands in this article are written for the BSD version.
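One way to sidestep the -i difference is to attach a backup suffix directly to the option, a form that both GNU sed and BSD sed accept. The sketch below assumes a throwaway file named inplace_demo.txt (a hypothetical name).

```shell
# inplace_demo.txt is a hypothetical file name; any writable text file works.
printf 'a cat\n' > inplace_demo.txt

# The suffix is attached to -i with no space in between.
# The original content is kept as inplace_demo.txt.bak.
sed -i.bak 's/a/b/' inplace_demo.txt

cat inplace_demo.txt       # edited file
cat inplace_demo.txt.bak   # untouched backup
```

The backup can be deleted afterwards if it is not needed.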
Replacing All Occurrences of hello with HI
| Element | Description |
|---|---|
| s | Substitute command. |
| hello | The string to search for. |
| HI | The replacement string. |
| g | The global flag. Replaces all matches within a line. |
Without the g flag, only the first "hello" per line is replaced. With it, every instance—including those in Line 6—is changed.
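The effect of the g flag is easiest to see on a line with several matches. A small sketch, using a made-up input line rather than input.txt:

```shell
# Without g: only the first match on the line changes.
printf 'hello hello hello\n' | sed 's/hello/HI/'

# With g: every match on the line changes.
printf 'hello hello hello\n' | sed 's/hello/HI/g'
```

The first command prints HI hello hello and the second prints HI HI HI.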
Extracting Only Lines That Contain hello
| Element | Description |
|---|---|
| -n | Suppresses default output. |
| /hello/ | The pattern to match. |
| p | The print command. Explicitly outputs the matched line. |
By default, sed outputs every line. Combining -n to suppress output with p to explicitly print matched lines achieves grep-like extraction behavior.
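The interaction between -n and p can be sketched as follows; the three input lines are illustrative.

```shell
# With -n, only lines matching /hello/ are printed.
printf 'hello.\nworld.\nhello world.\n' | sed -n '/hello/p'

# Without -n, matching lines appear twice: once from p, once from the default output.
printf 'hello.\nworld.\nhello world.\n' | sed '/hello/p'
```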
Extracting a Column Using Backreferences
| Element | Description |
|---|---|
| \(hello\) | Grouping. The matched content can be referenced as \1. |
| .* | Matches any sequence of characters (0 or more). |
| \1 | References the string matched by the first group. |
| p | Outputs only if the substitution was successful. |
By combining regular expression grouping with backreferences, it is possible to extract only a portion of a line — effectively pulling out a specific column.
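A command consistent with the table might look like the first line below. The second command is an additional illustration (not taken from this article) that pulls out the second space-separated field; both are sketches on made-up input.

```shell
# Keep only the captured "hello"; lines without a match print nothing.
printf 'hello world.\nworld.\n1hello 2world.\n' | sed -n 's/.*\(hello\).*/\1/p'

# Illustrative variant: capture the second space-separated field.
printf 'hello world.\nworld.\n1hello 2world.\n' | sed -n 's/^[^ ]* \([^ ]*\).*/\1/p'
```

Because of -n and the p flag, the line world. produces no output in either case: the first pattern finds no hello in it, and the second finds no space.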
Swapping the Order of hello and world Using Backreferences
| Element | Description |
|---|---|
| \(hello\) | First group (referenced as \1) |
| \(world\) | Second group (referenced as \2) |
| \2 \1 | Outputs the groups in reversed order |
Backreferences are numbered in order as \1, \2, and so on. On lines matching hello world, the substitution is applied and the output becomes world hello.
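The swap can be sketched like this, on two illustrative lines:

```shell
# Lines containing "hello world" get the two words swapped; others pass through.
printf 'hello world.\nhello.\n' | sed 's/\(hello\) \(world\)/\2 \1/'
```

The first line becomes world hello. and the second is printed unchanged.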
Removing Consecutive Blank Lines
| Element | Description |
|---|---|
| N | Reads the next line into the pattern space and appends it |
| s/^\n// | Deletes the leading newline character |
By default, sed processes one line at a time. The N command reads the next line as well, allowing two lines to be processed together. Without N, the pattern space does not contain a newline character, so the match fails. This is why N is necessary when removing consecutive blank lines.
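The sketch below first shows that \n only becomes matchable after N, and then a widely used idiom for collapsing runs of blank lines into one. The idiom is a swapped-in technique and not necessarily the exact command this section was built around.

```shell
# N appends the next line, so the pattern space holds "a<newline>b" and \n can match.
printf 'a\nb\n' | sed 'N;s/\n/+/'

# Common idiom: on a blank line, append the next line; if that one is blank too,
# D drops the first and restarts the cycle, so a run of blank lines collapses to one.
printf 'hello.\n\n\n\nworld.\n' | sed '/^$/N;/\n$/D'
```

Behavior at the very end of the input (a trailing blank line) can differ between implementations, so the example avoids that case.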
Replacing Spaces with Underscores
| Element | Description |
|---|---|
| " " | A half-width (ASCII) space |
| _ | The replacement string |
| g | Replaces all matches on each line |
Although it looks straightforward, a common mistake is confusing the half-width (ASCII, U+0020) space with the full-width (U+3000) space. A pattern containing a half-width space does not match a full-width one, so always check which whitespace character the file actually contains.
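A minimal sketch of the substitution on an illustrative line:

```shell
# Every ASCII space on the line becomes an underscore.
printf 'hello big world\n' | sed 's/ /_/g'
```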
Replacing Tab Characters with the String SPACE
| Element | Description |
|---|---|
| \t | Escape sequence representing a tab character |
| SPACE | The replacement string |
In BSD sed, \t may not work inside regular expressions. In that case, use $'\t' (ANSI-C quoting, available in shells such as bash and zsh) or embed an actual tab character by pressing Ctrl+v followed by Tab. Since input.txt contains a tab on line 5, this command can be used to verify the behavior.
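Another option that avoids relying on \t inside the regex is to let printf produce the tab and pass it to sed through a shell variable. The variable name tab is an arbitrary choice for this sketch.

```shell
# Store a literal tab character in a variable (printf interprets \t).
tab=$(printf '\t')

# Double quotes let the shell insert the tab into the sed script.
printf '\thello.\nworld.\n' | sed "s/${tab}/SPACE/g"
```

Since the tab reaches sed as a literal character, this form does not depend on how a given sed interprets \t.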
Escaping the Dot (.) and Replacing It with DOT
| Element | Description |
|---|---|
| \. | An escaped dot. Matches a literal . character |
| DOT | The replacement string |
In regular expressions, . is a metacharacter meaning "any single character." To match a literal dot, it must be escaped as \.. Forgetting the escape causes every character to match, resulting in unintended replacements.
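The difference between the escaped and unescaped dot can be sketched as:

```shell
# Escaped: only the literal dot is replaced.
printf 'hello world.\n' | sed 's/\./DOT/g'

# Unescaped: every character on the line is replaced.
printf 'hello world.\n' | sed 's/./DOT/g'
```

The first command prints hello worldDOT; the second prints DOT once for each of the 12 characters.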
Replacing Digits with #
| Element | Description |
|---|---|
| [0-9] | A character class matching any single digit from 0 to 9 |
| # | The replacement string |
[0-9] is a bracket expression that matches a single digit. sed does not treat \d as a digit class (neither BSD sed nor GNU sed does), so [0-9] is the safe and reliable approach.
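A short sketch on two illustrative lines:

```shell
# Each digit is replaced individually; other characters are untouched.
printf '1hello 2world.\n2024-04-16\n' | sed 's/[0-9]/#/g'
```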
Dynamic Substitution Using Shell Variables and $1
Wrapping the sed command in double quotes allows shell variables to be expanded.
| Element | Description |
|---|---|
| "s/hello/$var/" | Double quotes. The shell expands $var to its value before sed runs |
| 's/hello/$var/' | Single quotes. $var is passed to sed as a literal string |
Variables are not expanded inside single quotes. In shell scripts, $1 (the first positional argument) is commonly used. For example, sed "s/hello/$1/" input.txt allows the replacement string to be passed dynamically when the script is run.
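The quoting difference, and the $1 pattern, can be sketched as follows. The function name replace_hello is hypothetical; inside a function, $1 is the function's first argument, which mirrors how a script sees its first positional argument.

```shell
var="HI"

# Double quotes: the shell substitutes HI before sed sees the script.
printf 'hello world\n' | sed "s/hello/$var/"

# Single quotes: sed receives the four characters $var as the replacement.
printf 'hello world\n' | sed 's/hello/$var/'

# Hypothetical helper: the replacement is taken from the first argument.
replace_hello() {
  sed "s/hello/$1/"
}
printf 'hello world\n' | replace_hello BYE
```

This sketch assumes the value contains no / or & characters, which have special meaning in the s command.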
Greedy and Non-Greedy Matching
| Element | Description |
|---|---|
| h.*o | Matches the longest possible string starting with h and ending with o (greedy matching) |
By default, .* in sed performs greedy (longest) matching. When applied to hello world, h.*o matches as far right as possible — not stopping at the first o but continuing to the last one in the line.
Non-greedy (shortest) matching is not supported in BSD sed. Even in GNU sed, a workaround such as [^o]* is required. If non-greedy matching is essential, consider using Perl or Python instead.
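Greedy matching and the negated-class workaround can be compared side by side. X is an arbitrary replacement used for illustration.

```shell
# Greedy: h.*o runs to the last o on the line ("hello wo").
printf 'hello world\n' | sed 's/h.*o/X/'

# Workaround: [^o]* cannot cross an o, so the match stops at the first one ("hello").
printf 'hello world\n' | sed 's/h[^o]*o/X/'
```

The first command prints Xrld and the second prints X world.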
Replacing Only on a Specific Line
| Element | Description |
|---|---|
| 4 | Address specifier targeting only line 4 |
| s/hello/HI/ | Substitution command |
Address specifiers allow processing to be limited to a specific line number or lines matching a regular expression. A range such as 2,4s/hello/HI/ is also supported.
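The three address forms can be sketched on a small numbered file. addr_demo.txt is a hypothetical file name used only here.

```shell
# Five lines, each "hello N", so the affected lines are easy to spot.
printf 'hello 1\nhello 2\nhello 3\nhello 4\nhello 5\n' > addr_demo.txt

sed '4s/hello/HI/' addr_demo.txt      # line 4 only
sed '2,4s/hello/HI/' addr_demo.txt    # lines 2 through 4
sed '/3$/s/hello/HI/' addr_demo.txt   # lines matching a regex (here: ending in 3)
```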
About BRE and ERE
sed supports two types of regular expressions: BRE (Basic Regular Expressions) and ERE (Extended Regular Expressions).
BRE example:
In BRE, grouping requires \( and \).
ERE example:
With the -E option, ERE is enabled and grouping can be done with just ( and ). Syntax that requires \( in BRE can be written more cleanly in ERE.
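The same substitution written both ways might look like this; the bracketed replacement is just an illustration.

```shell
# BRE: grouping needs backslashes.
printf 'hello world\n' | sed 's/\(hello\)/[\1]/'

# ERE (-E): plain parentheses group; the backreference is still \1.
printf 'hello world\n' | sed -E 's/(hello)/[\1]/'
```

Both commands print [hello] world.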
Replacing hello or world with X Using Extended Regular Expressions
| Element | Description |
|---|---|
| -E | Enables extended regular expressions |
| (hello\|world) | Matches either hello or world |
| X | The replacement string |
In ERE, | works as alternation without escaping.
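A minimal sketch of alternation with -E, on an illustrative line:

```shell
# Either word is replaced; g applies the substitution to every match on the line.
printf 'hello world.\n' | sed -E 's/(hello|world)/X/g'
```

This prints X X. because both alternatives occur on the line.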
Quick Reference: Common Regular Expressions in sed
| Expression | Meaning | Example |
|---|---|---|
| . | Any single character | s/./X/ |
| * | Zero or more repetitions of the preceding element | s/el*/X/ |
| ^ | Beginning of line | s/^/> / |
| $ | End of line | s/$/ end/ |
| [abc] | Any one of a, b, or c | s/[abc]/*/g |
| [^abc] | Any character except a, b, or c | s/[^abc]//g |
| [0-9] | Any single digit | s/[0-9]/#/g |
| \(…\) | Grouping (BRE) | s/\(hello\)/[\1]/ |
| \1 | Backreference | s/\(hello\) \(world\)/\2\1/ |
| \. | Literal dot | s/\./,/g |
| \t | Tab character (GNU sed) | s/\t/ /g |
| + | One or more repetitions (ERE) | s/[0-9]+/#/g |
| ? | Zero or one occurrence (ERE) | s/e?l/X/g |
| \| | Alternation (ERE; typed as a bare pipe in the command, used with -E) | s/hello\|world/X/g |
Reverse Lookup: Find the Command for What You Want to Do
Example 1: Add a string at the beginning of each line
cat << 'EOF' > input.txt
hello.
world.
EOF
^ matches the beginning of a line, and > is inserted there.
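A command matching this description could be written as:

```shell
# ^ matches the empty position at the start of each line; "> " is inserted there.
printf 'hello.\nworld.\n' | sed 's/^/> /'
```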
Example 2: Remove trailing spaces from each line
cat << 'EOF' > input.txt
hello
world
EOF
$ matches the end of a line, and any spaces immediately before it are removed.
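A sketch of the trailing-space removal follows. The second expression appends a | marker purely so the result is visible; it is an illustrative addition and not part of the task itself.

```shell
# First -e strips spaces before the end of the line;
# second -e appends | so it is visible that nothing is left before it.
printf 'hello   \nworld \n' | sed -e 's/ *$//' -e 's/$/|/'
```

Without the first expression, the marker would appear after the trailing spaces instead of directly after the word.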
Example 3: Delete blank lines
cat << 'EOF' > input.txt
hello.
world.
EOF
^$ matches a line where the beginning and end are adjacent — in other words, an empty line. The d command deletes those lines.
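A command matching this description, run on illustrative input that contains two blank lines:

```shell
# Lines that are completely empty are deleted; all others are printed as usual.
printf 'hello.\n\nworld.\n\n' | sed '/^$/d'
```

Lines that contain only spaces or tabs are not empty in this sense and are kept.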
Common Pitfalls When Commands Don’t Work as Expected
Full-width space deletion has no effect
cat << 'EOF' > sample.txt
hello world
EOF
This command looks like it removes a full-width space, but if the character was converted to a half-width space during copy-paste or terminal encoding, the pattern will not match and nothing will change. Use cat -A (GNU coreutils), od -c, or a similar tool to verify the actual characters in the file.
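The mismatch can be reproduced without depending on copy-paste by building the full-width space from its bytes. This sketch assumes UTF-8 encoding, where U+3000 is the byte sequence E3 80 80; fw and fw_demo.txt are hypothetical names.

```shell
# Build a full-width space from octal escapes so this script stays pure ASCII.
fw=$(printf '\343\200\200')
printf 'hello%sworld\n' "$fw" > fw_demo.txt

sed 's/ //g' fw_demo.txt        # ASCII space pattern: the line is unchanged
sed "s/${fw}//g" fw_demo.txt    # full-width pattern: prints helloworld

# od -c shows the raw bytes, which makes the kind of space visible.
od -c fw_demo.txt
```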
HTML tag removal deletes the entire line
cat << 'EOF' > sample.txt
<p>hello</p> and <span>world</span>
EOF
Because .* is greedy, it matches from the first < all the way to the last > on the line, removing everything in between. To limit matching to individual tags, use [^>]* instead of .*.
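The two patterns can be compared on the sample line. The variable name line is an arbitrary choice for this sketch.

```shell
line='<p>hello</p> and <span>world</span>'

# Greedy: one match spans the whole line, so an empty line is printed.
printf '%s\n' "$line" | sed 's/<.*>//g'

# Per tag: [^>]* stops at the first >, so only the tags are removed.
printf '%s\n' "$line" | sed 's/<[^>]*>//g'
```

The second command prints hello and world.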
Backslash errors in date format conversion
cat << 'EOF' > sample.txt
2024-04-16
EOF
In BSD sed, \+ (one or more repetitions) may not be supported. In that case, rewrite it as [0-9][0-9]*, or switch to extended regular expressions using the -E option.
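Both rewrites can be sketched as below. The target format (slashes instead of hyphens) is an assumption made for illustration, since the exact conversion is not stated here.

```shell
# BRE without \+: "one or more digits" is spelled [0-9][0-9]*.
printf '2024-04-16\n' | sed 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)-\([0-9][0-9]*\)/\1\/\2\/\3/'

# ERE with -E: + and ( ) need no backslashes.
printf '2024-04-16\n' | sed -E 's/([0-9]+)-([0-9]+)-([0-9]+)/\1\/\2\/\3/'
```

Both commands print 2024/04/16.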
Broaden Your String Processing Skills with sed and Regular Expressions
sed may look simple at first glance, but combined with regular expressions it handles a wide range of tasks — substitution, extraction, formatting, and more. Beginners often find the differences between BRE and ERE, or the behavioral gaps between BSD and GNU sed, confusing at first. Running the commands in this article on actual files is the best way to build real understanding. Start small, experiment freely, and gradually expand your range of applications.
