Usage of the sed, tr, uniq, and awk shell commands.

The Optimizer
5 min read · Jul 23, 2023


Hola!

There are many data-manipulation shell commands available out there; here are a few that I have used and found interesting.

  1. sed (Stream Editor): sed is a powerful stream editor used for text manipulation. It takes input from a file, a pipeline, or standard input and applies specified text transformation operations to each line of the input. The transformations are based on commands provided to sed. Common operations include searching, replacing, inserting, deleting, and more. sed is particularly useful for editing large amounts of text or automating text processing tasks.
  2. tr (Translate): tr is a command-line utility used for character translation or deletion. It takes input from a file or standard input, processes it, and produces the output to standard output or to a file. tr is commonly used to replace or remove specific characters in a text stream. It can also be used to squeeze multiple consecutive occurrences of a character into a single instance, and it can be helpful for tasks like converting between character sets or transliterating text.
  3. uniq (Unique): uniq is a command-line tool used to filter out repeated lines. It reads input from a file or standard input and writes only the unique lines to standard output. If duplicate lines are adjacent in the input, uniq ensures that only one copy of such lines appears in the output. To work effectively, uniq expects its input to be sorted (or at least grouped), since it only compares adjacent lines for uniqueness.
  4. awk: awk is a versatile and powerful text-processing tool and programming language. It is designed for data extraction and reporting tasks. awk operates on a line-by-line basis, processing input from a file or standard input. It allows users to define patterns and actions, where patterns specify which lines to process, and actions define what operations to perform on those lines. awk excels at handling structured text, where data is organized in columns. It is commonly used for tasks like parsing log files, generating reports, and performing various data manipulations.
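To make the descriptions above concrete, here is a quick one-liner sketch of each command on toy input (the strings piped in are made up for illustration):

```shell
# sed: substitute "world" with "there" on each line
echo "hello world" | sed 's/world/there/'

# tr: translate lower case to upper case, then squeeze repeated characters
echo "aaabbb" | tr 'a-z' 'A-Z' | tr -s 'A-Z'     # AAABBB -> AB

# uniq: drop ADJACENT duplicate lines (input must be sorted/grouped)
printf 'a\na\nb\n' | uniq

# awk: print the second whitespace-separated field of each line
echo "one two three" | awk '{print $2}'
```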

Example:
I was given an assignment to achieve the following in a single shell command.
Input:

double double toil and trouble
fire burn and cauldron bubble bubble
tomorrow and tomorrow and tomorrow
creeps in this this petty pace from day toto day
to the last syllable of recorded time time

Expected response:

double toil and trouble
fire burn and cauldron bubble
tomorrow and tomorrow and tomorrow
creeps in this petty pace from day toto day
to the last syllable of recorded time

I arrived at the following combination of commands through trial and error.

sed -z 's/\n/ , /g' input.txt | tr " " "\n" | uniq  | tr "\n" " " | sed -z 's/ , /\n/g'

How does it work?
Step-1:

The following command replaces each newline with “ , ”, which acts as a placeholder for the original line breaks. (The -z option is a GNU sed extension that treats the input as NUL-separated rather than line-separated, which is what allows the substitution to match \n characters.)

sed -z 's/\n/ , /g' input.txt

Expected Response for the above command:

double double toil and trouble , fire burn and cauldron bubble bubble , tomorrow and tomorrow and tomorrow , creeps in this this petty pace from day toto day , to the last syllable of recorded time time ,

Step-2:
The output of step 1 is then piped into tr, which replaces each “ ” (space) with a newline so that every word appears on its own line.

sed -z 's/\n/ , /g' input.txt | tr " " "\n"

Expected Response for the above command:

double
double
toil
and
trouble
,
fire
<----------->
day
,
to
the
last
syllable
of
recorded
time
time
,

Step-3:
The output of step 2 is then piped into uniq, which removes the CONSECUTIVE duplicate words.

sed -z 's/\n/ , /g' input.txt | tr " " "\n" | uniq

Expected Response for the above command:

double
toil
and
trouble
,
fire
burn
and
cauldron
bubble
,
tomorrow
<---------->
syllable
of
recorded
time
,

Step-4:
The output of step 3 is then piped into tr, which replaces each newline with “ ” (a space), joining all the words back into a single line.

sed -z 's/\n/ , /g' input.txt | tr " " "\n" | uniq | tr "\n" " "

Expected Response for the above command:

double toil and trouble , fire burn and cauldron bubble , tomorrow and tomorrow and tomorrow , creeps in this petty pace from day toto day , to the last syllable of recorded time , 

Step-5:
The output of step 4 is then piped into sed, which replaces each “ , ” placeholder with a newline, restoring the original line structure of the input.

sed -z 's/\n/ , /g' input.txt | tr " " "\n" | uniq | tr "\n" " " | sed -z 's/ , /\n/g'

Expected Response for the above command:

double toil and trouble
fire burn and cauldron bubble
tomorrow and tomorrow and tomorrow
creeps in this petty pace from day toto day
to the last syllable of recorded time
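For comparison, the same consecutive-duplicate removal can also be done in a single awk command that walks the fields of each line. This is a sketch of an alternative approach, not the command from the assignment:

```shell
# For each line, keep a field only if it differs from the previous field,
# then print the rebuilt line. Line boundaries are preserved automatically.
awk '{
    out = $1
    for (i = 2; i <= NF; i++)
        if ($i != $(i - 1)) out = out " " $i
    print out
}' input.txt
```

Because awk processes one line at a time, no newline placeholder is needed here.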

I haven’t used awk in the above example, but it is also a powerful CLI utility for filtering output. Here is an example that uses awk with the top command to monitor CPU usage in real time:

top -b -d 1 | awk 'NR > 7 {print "PID: "$1", CPU%: "$9}'

Explanation:

  • top -b -d 1: The top command displays real-time information about system processes. The -b option runs top in batch mode so that its output can be piped to another command, and -d 1 refreshes the output every 1 second to provide real-time updates.
  • awk 'NR > 7 {print "PID: "$1", CPU%: "$9}': awk processes the output of top. The NR > 7 condition skips the first 7 header lines of top output, as they contain non-process-related information (the exact number of header lines can vary between top versions). For each subsequent line (representing a process), it prints the process ID (1st field) and the CPU usage percentage (9th field).
Expected output:
PID: 1234, CPU%: 12.3
PID: 5678, CPU%: 9.8
PID: 9101, CPU%: 5.6
...
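awk is equally handy for aggregating a column rather than just printing it. As a further sketch, the snippet below sums the %CPU column reported by ps (a simpler, pipe-friendly alternative to top); NR > 1 skips the header line:

```shell
# Sum the %CPU column across all processes and print the total
ps -eo pcpu | awk 'NR > 1 { total += $1 } END { printf "total CPU%%: %.1f\n", total }'
```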

More shell scripts are available here:
https://github.com/infinite8loop/shell

That’s it! Thanks for reading, and feel free to hit me up on LinkedIn for any AWS/DevOps-related discussions.

Happy scripting!!


Written by The Optimizer

Cloud & DevOps👨‍💻 | AWS☁️| K8s⚔️| Terraform🏗️ | CI/CD🚀| Open Source 🐧 | Versatile DevOps engineers. Well-versed with DevOps Tools and Cloud Services.
