Linux Mastery: From Zero to Hero

Power User Tools

Task 1
Standard Streams

In Linux, everything is treated as a file—not just text documents, but also hardware devices and even the input and output of commands. A stream is a continuous flow of data from one of these "files" to another.

When a program runs, the Linux kernel gives it three standard streams to communicate with the terminal (and the world). Understanding these is the key to unlocking the power of the shell.

Standard Input (stdin)

  • Definition: The standard input is a continuous stream of bytes from which a command receives its input data. It is referenced by file descriptor 0.
  • Default Source: By default, standard input receives data from the keyboard. When you type characters, they are placed in the standard input stream and directed to the Linux command.
  • Redirection: You can redirect standard input to receive data from a file instead of the keyboard using the less-than sign (<) operator.

Example:

cat < myletter

This command reads input from myletter instead of the keyboard.

  • Commands: Commands like cat, sort, uniq, and wc can accept input from stdin. The read command is also used to read a single line from standard input.
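A quick, self-contained illustration (the file name fruits.txt is made up for this sketch): create a small file, then feed it to sort through standard input rather than as an argument.

```shell
# Create a small sample file (hypothetical name for this demo)
printf 'banana\napple\ncherry\n' > fruits.txt

# sort reads the data through standard input, not as a filename argument
sort < fruits.txt   # prints apple, banana, cherry (one per line)
```

The result is identical to `sort fruits.txt`; the difference is that with `<`, the shell opens the file and sort never sees the filename at all.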

Standard Output (stdout)

  • Definition: The standard output is a data stream where a command or program places its output. It is referenced by file descriptor 1.
  • Default Destination: The default destination for standard output is the screen (terminal).
  • Redirection (Overwrite): You can redirect standard output to a file using the greater-than sign (>).

Example:

ls > filenames.txt

This saves the list of files into filenames.txt. If the file already exists, it will be overwritten.

  • Redirection (Append): To add output to the end of a file without overwriting it, use the double greater-than (>>).

Example:

cat myletter >> alletters

This appends the contents of myletter to the file alletters.

  • Force Overwriting: In the C shell, >! forces overwriting a file even if the noclobber option is set. In the Korn shell and Bash, the equivalent operator is >|.
  • Commands: Commands such as ls, cat, echo, sort, uniq, wc, head, tail, and tee send their output to stdout.
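A minimal sketch in bash (the file name notes.txt is illustrative) showing noclobber blocking a plain > and >| overriding it:

```shell
set -o noclobber                  # make > refuse to overwrite existing files

echo "first" > notes.txt          # creates the file; succeeds
echo "second" > notes.txt 2>/dev/null \
  || echo "overwrite blocked"     # noclobber rejects the second plain >

echo "second" >| notes.txt        # >| forces the overwrite (bash/ksh syntax)
cat notes.txt                     # prints: second
```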

Standard Error (stderr)

  • Definition: Standard error is a separate output stream used only for error messages. It is referenced by file descriptor 2.
  • Default Destination: Like standard output, error messages are usually displayed on the screen.
  • Redirection (Overwrite): You can redirect standard error using 2>.

Example:

cat myintro 2> error_log.txt

This saves error messages to error_log.txt.

  • Redirection (Append): To append error messages instead of overwriting, use 2>>.
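The file names below are created on the spot so that both streams produce output; the key pattern for merging the two streams into one file is 2>&1 (point file descriptor 2 at wherever descriptor 1 is going):

```shell
echo "hello" > exists.txt         # a real file; missing.txt does not exist

# stdout and stderr go to different files
cat exists.txt missing.txt > out.txt 2> err.txt || true   # cat exits nonzero
cat out.txt                       # prints: hello (the normal output)
cat err.txt                       # the "No such file or directory" error

# Merge both streams into one file: redirect stdout first, then 2>&1
cat exists.txt missing.txt > combined.txt 2>&1 || true
```

Order matters: `2>&1 > combined.txt` would send stderr to the terminal, because fd 2 is duplicated before fd 1 is redirected.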

Standard Streams Summary

Stream Name               File Descriptor  Description                                Default Target
Standard Input (stdin)    0                The source of input for a program.         The keyboard
Standard Output (stdout)  1                Where a program sends its normal output.   The terminal screen
Standard Error (stderr)   2                Where a program sends its error messages.  The terminal screen

Task 2
Filtering and Processing Text

In Linux, grep, cut, sort, uniq, and wc are essential filter commands used for text processing and are often combined using pipes.

Filters take input, modify it, and then output the altered data. Most filter commands accept one or more filenames as arguments, and if no filename is specified, they read from standard input.
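That fallback to standard input is easy to see with wc (sample.txt is a throwaway example file):

```shell
printf 'one\ntwo\n' > sample.txt

wc -l sample.txt     # filename argument: prints the count and the name
wc -l < sample.txt   # reads stdin: prints just 2, wc never sees a filename
```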

1. grep (Global Regular Expression Print)

The premier tool for filtering lines of text based on a pattern.

# Basic syntax
grep [options] pattern [file]

# Search for the word "error" in a log file (case-sensitive)
grep "error" /var/log/syslog

# Search for "error" (case-insensitive)
grep -i "error" /var/log/syslog

# Search for lines that do NOT contain "success"
grep -v "success" output.log

# Count how many lines contain the pattern
grep -c "GET" webserver.log

# Show the line number of matching lines
grep -n "username" /etc/passwd
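Putting a few of these flags together on a tiny fabricated log (the file and its contents are invented for this sketch):

```shell
printf 'GET /home\nPOST /login\nget /admin\nGET /home\n' > webserver.log

grep -c "GET" webserver.log    # prints 2  (case-sensitive: skips "get")
grep -ci "GET" webserver.log   # prints 3  (also matches the lowercase line)
grep -vn "GET" webserver.log   # lines NOT containing "GET", with line numbers
```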

2. cut

Used to extract specific columns or fields from a structured text file (like CSV files or /etc/passwd).

# Basic syntax
cut [options] [file]

# Extract the first field from a CSV file
cut -d ',' -f 1 employees.csv

# Extract fields 1 and 3 from /etc/passwd
cut -d ':' -f 1,3 /etc/passwd

# Extract the first 5 characters from every line
cut -c 1-5 data.txt
  • -d : Sets the delimiter (default is tab).
  • -f : Specifies which field(s) to extract.
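A self-contained sketch (the CSV contents are made up) showing both single- and multi-field extraction:

```shell
printf 'alice,sales,3000\nbob,eng,2500\n' > employees.csv

cut -d ',' -f 1 employees.csv     # prints: alice, then bob
cut -d ',' -f 1,3 employees.csv   # prints: alice,3000 and bob,2500
```

Note that the selected fields are re-joined with the same delimiter in the output.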

3. sort (Sort Lines of Text)

The sort command orders lines alphabetically or numerically.

# Sort alphabetically
sort names.txt

# Sort in reverse order
sort -r names.txt

# Sort numerically
sort -n numbers.txt

# Sort by the 3rd column numerically (-t sets the field separator)
sort -t ',' -n -k 3 data.csv
  • -r : Reverse sorting order.
  • -n : Numeric sort.
  • -k : Sort by a specific column.
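The difference between the default (lexicographic) order and -n is the classic gotcha, shown here with invented sample data:

```shell
printf '10\n2\n33\n' > numbers.txt

sort numbers.txt      # lexicographic: 10, 2, 33 ("1" sorts before "2")
sort -n numbers.txt   # numeric:       2, 10, 33
sort -rn numbers.txt  # numeric, reversed: 33, 10, 2
```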

4. uniq (Report or Omit Repeated Lines)

The uniq command removes or finds duplicate lines. Important: It only removes adjacent duplicates, so the input is usually sorted first.

# Remove duplicate lines
sort log.txt | uniq

# Count occurrences of each line
sort log.txt | uniq -c

# Show only duplicated lines
sort log.txt | uniq -d
  • -c : Show count of each line.
  • -d : Show only duplicate lines.
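The adjacency rule is easy to see with a small fabricated log:

```shell
printf 'error\nok\nerror\n' > log.txt

uniq log.txt            # nothing removed: the "error" lines are not adjacent
sort log.txt | uniq     # sorted first, so duplicates collapse: error, ok
sort log.txt | uniq -c  # counts: 2 error, 1 ok
```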

5. wc (Word Count)

The wc command prints counts of lines, words, and bytes (use -m for characters).

# Count lines, words, and bytes
wc story.txt

# Count only the number of lines
wc -l < /etc/passwd

# Count entries (files and directories) in the current directory
ls -1 | wc -l
  • -l : Count lines.
  • -w : Count words.
  • -c : Count bytes.
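These filters are most useful chained together with pipes. A sketch of a classic "most requested paths" pipeline over a made-up access log:

```shell
printf 'GET /a\nGET /b\nPOST /a\nGET /a\n' > access.log

# Keep GET lines, extract the path, count duplicates, busiest first
grep "GET" access.log | cut -d ' ' -f 2 | sort | uniq -c | sort -rn
```

Each stage does one small job; the sort before uniq -c is required because uniq only collapses adjacent lines.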

Task 3
Finding Files

You know a file is on your system, but you don't remember where. Manually searching through directories is inefficient. Linux provides two powerful tools for this: the robust, real-time find and the lightning-fast locate.

find

The find command is one of the most powerful tools in the Linux arsenal.

Basic Syntax:

find [starting/path] [options] [expression] [action]

1. Finding Files by Name

# Find a file named "config.txt" anywhere under the /home directory
find /home -name "config.txt"

# Case-insensitive search
find /home -iname "config.txt"

# Use wildcards (* and ?)
find /var -name "*.log"
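To experiment safely, build a small throwaway tree (the names here are invented) so the results are predictable:

```shell
# A tiny directory tree for testing
mkdir -p demo/sub
touch demo/app.log demo/sub/deep.log demo/notes.txt

find demo -name "*.log"   # matches both .log files, at any depth
```

Note the quotes around "*.log": without them the shell would expand the wildcard before find ever sees it.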

2. Finding Files by Type

# Find all directories named "bin"
find /usr -type d -name "bin"

# Find all regular files
find /home -type f

# Find all symbolic links
find /usr -type l

3. Finding Files by Size

# Files larger than 100 MB
find / -size +100M

# Files smaller than 100 KB (find rounds sizes up to the unit,
# so -size -1k would match only empty files)
find /home -size -100k

# Files exactly 1024 bytes
find /tmp -size 1024c
  • + → Larger than
  • - → Smaller than
  • No sign → Exact size
  • Units: c (bytes), k (KB), M (MB), G (GB)
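A verifiable sketch using truncate (a GNU coreutils tool) to create files of known sizes in a scratch directory:

```shell
mkdir -p sizedemo
truncate -s 2M sizedemo/big.bin    # creates a sparse 2 MB file
truncate -s 100 sizedemo/tiny.txt  # a 100-byte file

find sizedemo -size +1M            # prints: sizedemo/big.bin
find sizedemo -size -100k -type f  # prints: sizedemo/tiny.txt
```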

4. Finding Files by Owner / Permissions

# Files owned by user "www-data"
find /var -user www-data

# Files with SUID permission
find /usr -perm /4000

# Files executable by everyone
find . -perm /o=x
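A self-contained demo of the permission test (the scratch directory and files are invented for this sketch):

```shell
mkdir -p permdemo
touch permdemo/script.sh permdemo/data.txt
chmod 755 permdemo/script.sh   # rwxr-xr-x: executable by everyone
chmod 644 permdemo/data.txt    # rw-r--r--: not executable

find permdemo -type f -perm /o=x   # prints: permdemo/script.sh
```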

5. Taking Action on Found Files (Using -exec)

The real power of find is automating tasks on the files it finds.

The {} placeholder represents the found file, and \; marks the end of the command.

# Delete all ".tmp" files (fast, built-in action)
find /tmp -name "*.tmp" -delete

# Equivalent deletion using rm via -exec
find /tmp -name "*.tmp" -exec rm {} \;

# Change ownership of HTML files
find /var/www -name "*.html" -exec chown alice {} \;

# Display found files with details
find /etc -name "*.conf" -exec ls -l {} \;
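A safe end-to-end demo of -exec on a scratch directory (all names invented), so nothing outside it is touched:

```shell
mkdir -p cleanup
touch cleanup/a.tmp cleanup/b.tmp cleanup/keep.txt

find cleanup -name "*.tmp" -exec rm {} \;   # rm runs once per matched file
ls cleanup                                  # prints: keep.txt
```

Always dry-run first by replacing the -exec clause with nothing (just print the matches) before wiring in a destructive command like rm.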

locate

The locate command performs a rapid database search for filenames and displays results instantly.

# Search for files named "passwd"
locate passwd

# Case-insensitive search
locate -i "report2023"

# Limit the number of results
locate -n 5 ".log"

Updating the locate database:

sudo updatedb

If you created a file recently and locate cannot find it, update the database using sudo updatedb.

find vs locate (Quick Comparison)

Feature           find                                        locate
Speed             Slower (searches filesystem in real time)   Very fast (searches a database)
Accuracy          Always accurate and up to date              May be outdated if database isn't updated
Advanced Filters  Yes (size, type, owner, permissions)        Limited filtering
Best Use          Precise file searches                       Quick filename searches