Usage of the sed, tr, uniq, and awk shell commands.
Hello!
There are many data-manipulation shell commands available out there; here are a few that I have used and found interesting, with quick examples after the descriptions below.
- sed (Stream Editor): sed is a powerful stream editor used for text manipulation. It takes input from a file, a pipeline, or standard input and applies the specified transformations to each line. The transformations are driven by commands supplied to sed; common operations include searching, replacing, inserting, and deleting text. sed is particularly useful for editing large amounts of text or automating text-processing tasks.
- tr (Translate): tr is a command-line utility for character translation or deletion. It reads from standard input (often redirected from a file), processes the stream, and writes the result to standard output. tr is commonly used to replace or remove specific characters in a text stream. It can also squeeze multiple consecutive occurrences of a character into a single instance, which makes it handy for tasks like converting between character sets or transliterating text.
- uniq (Unique): uniq is a command-line tool that filters out repeated lines. It reads from a file or standard input and writes only the unique lines to standard output, keeping a single copy of any run of adjacent duplicate lines. To work effectively, uniq expects its input to be sorted, because it only compares adjacent lines for uniqueness.
- awk: awk is a versatile and powerful text-processing tool and programming language designed for data extraction and reporting. It operates line by line, processing input from a file or standard input. Users define patterns and actions: patterns specify which lines to process, and actions define what to do with those lines. awk excels at handling structured text where data is organized in columns, and it is commonly used for parsing log files, generating reports, and performing all kinds of data manipulation.
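Before the worked example, here are a few quick one-liners, one per command, to make the descriptions concrete. These are illustrative sketches of my own; the file name notes.txt is just a hypothetical placeholder:
# sed: replace every occurrence of "foo" with "bar" on each line
sed 's/foo/bar/g' notes.txt
# tr: convert lowercase letters to uppercase, then squeeze runs of spaces
tr 'a-z' 'A-Z' < notes.txt | tr -s ' '
# uniq: sort first so duplicate lines become adjacent, then drop repeats
sort notes.txt | uniq
# awk: print the second whitespace-separated column of every line
awk '{print $2}' notes.txt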
Example:
I was given an assignment to achieve the following in a single shell command.
Input:
double double toil and trouble
fire burn and cauldron bubble bubble
tomorrow and tomorrow and tomorrow
creeps in this this petty pace from day toto day
to the last syllable of recorded time time
Expected response:
double toil and trouble
fire burn and cauldron bubble
tomorrow and tomorrow and tomorrow
creeps in this petty pace from day toto day
to the last syllable of recorded time
Through trial and error, I arrived at the following combination of commands to achieve this.
sed -z 's/\n/ , /g' duplicate.txt | tr " " "\n" | uniq | tr "\n" " " | sed -z 's/ , /\n/g'
How does it work?
Step-1:
The following command replaces each newline with “ , ”, which acts as a placeholder for the original line breaks. The -z option makes sed treat the whole input as a single record (null-separated), so the \n characters can actually be matched and replaced.
sed -z 's/\n/ , /g' duplicate.txt
Expected Response for the above command:
double double toil and trouble , fire burn and cauldron bubble bubble , tomorrow and tomorrow and tomorrow , creeps in this this petty pace from day toto day , to the last syllable of recorded time time ,
Step-2:
The output of step 1 is then piped into tr, which replaces each space with a newline so that every word (and each “,” placeholder) lands on its own line.
sed -z 's/\n/ , /g' duplicate.txt | tr " " "\n"
Expected Response for the above command:
double
double
toil
and
trouble
,
fire
<----------->
day
,
to
the
last
syllable
of
recorded
time
time
,
Step-3:
The output of step 2 is then piped into uniq, which removes the CONSECUTIVE duplicate words. Since each word is now on its own line, adjacent duplicate lines are exactly the repeated words.
sed -z 's/\n/ , /g' duplicate.txt | tr " " "\n" | uniq
Expected Response for the above command:
double
toil
and
trouble
,
fire
burn
and
cauldron
bubble
,
tomorrow
<---------->
syllable
of
recorded
time
,
Step-4:
The output of step 3 is then piped into tr again, which replaces each newline with a space to join all the words back into a single line.
sed -z 's/\n/ , /g' duplicate.txt | tr " " "\n" | uniq | tr "\n" " "
Expected Response for the above command:
double toil and trouble , fire burn and cauldron bubble , tomorrow and tomorrow and tomorrow , creeps in this petty pace from day toto day , to the last syllable of recorded time ,
Step-5:
Finally, the output of step 4 is piped into sed, which replaces each “ , ” placeholder with a newline, restoring the original line structure of the input.
sed -z 's/\n/ , /g' duplicate.txt | tr " " "\n" | uniq | tr "\n" " " | sed -z 's/ , /\n/g'
Expected Response for the above command:
double toil and trouble
fire burn and cauldron bubble
tomorrow and tomorrow and tomorrow
creeps in this petty pace from day toto day
to the last syllable of recorded time
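As an aside, the same consecutive-duplicate removal can be expressed as a single awk program. The sketch below is my own per-line alternative, not the assignment's pipeline; it assumes duplicates never span a line boundary, which holds for this input:
# walk the fields of each line, keep a field only when it differs from the previous one
awk '{ out = ""; prev = ""; for (i = 1; i <= NF; i++) { if ($i != prev) out = out (out == "" ? "" : " ") $i; prev = $i }; print out }' duplicate.txt
It rebuilds each line from the de-duplicated fields and produces the same expected response as the five-step pipeline.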
I didn’t use awk in the pipeline above, but it’s also a powerful CLI utility for filtering command output. Here is an example of using awk to monitor CPU usage in near real time with the top command:
top -b -d 1 | awk 'NR > 7 {print "PID: "$1", CPU%: "$9}'
Explanation:
top -b -d 1: The top command displays real-time information about system processes. The -b option runs top in batch mode, which is needed when piping its output to another command, and the -d 1 option refreshes the output every second.
awk 'NR > 7 {print "PID: "$1", CPU%: "$9}': awk processes the output of top. The NR > 7 condition skips the first 7 header lines of the top output, as they contain non-process information (the system summary and column headers). For each subsequent line, representing a process, it prints the process ID (PID, field 1) and the CPU usage percentage (field 9). Note that top reprints its header on every refresh, so NR > 7 only cleanly skips the header of the first iteration; for a quick monitor this is usually good enough.
Expected output:
PID: 1234, CPU%: 12.3
PID: 5678, CPU%: 9.8
PID: 9101, CPU%: 5.6
...
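As a usage variation, the awk pattern can also filter numerically. Here is a hedged sketch of my own that only prints processes using more than 50% CPU (the threshold is an arbitrary choice):
# $9+0 forces awk to compare the field as a number rather than a string
top -b -d 1 | awk 'NR > 7 && $9+0 > 50 {print "PID: "$1", CPU%: "$9}'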
More shell scripts are available here:
https://github.com/infinite8loop/shell
That’s it, folks. Thanks for reading, and feel free to hit me up on LinkedIn for any AWS/DevOps-related discussions.
Happy scripting!!