Standard Streams
In Linux, almost everything is treated as a file: not just text documents, but also hardware devices and even the input and output of commands. A stream is a continuous flow of bytes from one of these "files" to another.
When a program runs, it starts with three standard streams already open, normally inherited from the shell that launched it, for communicating with the terminal (and the world). Understanding these is the key to unlocking the power of the shell.
Standard Input (stdin)
- Definition: The standard input is a data stream arranged as a continuous set of bytes from which many Linux commands can receive data. It is referenced by file descriptor 0.
- Default Source: By default, standard input receives data from the keyboard. When you type characters, they are placed in the standard input stream and directed to the Linux command.
- Redirection: You can redirect standard input to receive data from a file instead of the keyboard using the less-than (<) operator.
  Example:
  cat < myletter
  This command reads input from myletter instead of the keyboard.
- Commands: Commands like cat, sort, uniq, and wc can accept input from stdin. The read command is also used to read a single line from standard input.
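As a small sketch of the points above, the following reads a file through standard input in two ways. The file name myletter is reused from the example, but its contents and the temporary directory are made up for illustration.

```shell
# Work in a temporary directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
printf 'Dear reader,\nThis is line two.\n' > "$workdir/myletter"

# wc receives the file on stdin, so it prints a count with no filename.
line_count=$(wc -l < "$workdir/myletter")

# read consumes a single line from stdin and stores it in a variable.
IFS= read -r first_line < "$workdir/myletter"

echo "lines: $line_count"
echo "first line: $first_line"

rm -r "$workdir"
```

Because wc never sees a filename here, its output is just the number, which makes it convenient to capture in a variable.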
Standard Output (stdout)
- Definition: The standard output is a data stream where a command or program places its output. It is referenced by file descriptor 1.
- Default Destination: The default destination for standard output is the screen (terminal).
- Redirection (Overwrite): You can redirect standard output to a file using the greater-than sign (>).
  Example:
  ls > filenames.txt
  This saves the list of files into filenames.txt.
  If the file already exists, it will be overwritten.
- Redirection (Append): To add output to the end of a file without overwriting it, use the double greater-than (>>).
  Example:
  cat myletter >> alletters
  This appends the contents of myletter to the file alletters.
- Force Overwriting: In the C shell, >! forces overwriting a file even if the noclobber option is set. In Bash and the Korn shell, the equivalent operator is >|.
- Commands: Commands such as ls, cat, echo, sort, uniq, wc, head, tail, and tee send their output to stdout.
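The difference between overwriting, appending, and forced overwriting can be sketched as follows. This assumes Bash or another POSIX-style shell (so the force operator is >|, not the C shell's >!), and the file contents are made up for illustration.

```shell
workdir=$(mktemp -d)
out="$workdir/filenames.txt"

echo "first"  > "$out"    # creates the file
echo "second" > "$out"    # overwrites: the file now holds only "second"
echo "third"  >> "$out"   # appends: the file now holds "second" and "third"
after_append=$(cat "$out")

# With noclobber set, a plain > refuses to overwrite an existing file.
set -o noclobber
if ( echo "fourth" > "$out" ) 2>/dev/null; then
    clobber_blocked=no
else
    clobber_blocked=yes
fi
echo "forced" >| "$out"   # >| overrides noclobber
set +o noclobber
after_force=$(cat "$out")

echo "after append: $after_append"
echo "plain > blocked by noclobber: $clobber_blocked"
echo "after forced overwrite: $after_force"

rm -r "$workdir"
```

The failing redirection runs in a subshell only so that its error message can be discarded without affecting the rest of the script.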
Standard Error (stderr)
- Definition: Standard error is a separate output stream used only for error messages. It is referenced by file descriptor 2.
- Default Destination: Like standard output, error messages are usually displayed on the screen.
- Redirection (Overwrite): You can redirect standard error using 2>.
  Example:
  cat myintro 2> error_log.txt
  This saves error messages to error_log.txt.
- Redirection (Append): To append error messages instead of overwriting, use 2>>.
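To see that the two output streams really are independent, here is a minimal sketch. The report function is a hypothetical helper, and the 2>&1 form (which points stderr at wherever stdout currently goes) is an addition not covered in the list above.

```shell
workdir=$(mktemp -d)

# Hypothetical helper that writes one line to each output stream.
report() {
    echo "normal output"        # goes to stdout (fd 1)
    echo "error message" >&2    # goes to stderr (fd 2)
}

# Send each stream to its own file.
report > "$workdir/out.txt" 2> "$workdir/err.txt"

# 2>&1 duplicates stdout onto stderr, so redirect stdout first
# and then point stderr at it to capture both in one file.
report > "$workdir/both.txt" 2>&1

out_only=$(cat "$workdir/out.txt")
err_only=$(cat "$workdir/err.txt")
both_lines=$(wc -l < "$workdir/both.txt")

echo "stdout file: $out_only"
echo "stderr file: $err_only"
echo "lines in combined file: $both_lines"

rm -r "$workdir"
```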
Standard Streams Summary
| Stream Name | File Descriptor | Description | Default Target |
|---|---|---|---|
| Standard Input (stdin) | 0 | The source of input for a program. | The keyboard |
| Standard Output (stdout) | 1 | Where a program sends its normal output. | The terminal screen |
| Standard Error (stderr) | 2 | Where a program sends its error messages. | The terminal screen |
Filtering and Processing Text
In Linux, grep, cut, sort, uniq, and wc are essential filter commands used for text processing and are often combined using pipes.
Filters take input, modify it, and then output the altered data. Most filter commands accept one or more filenames as arguments, and if no filename is specified, they read from standard input.
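A rough sketch of how these filters chain together: each command reads the previous command's standard output on its own standard input. The sample records (name:uid:shell) are made up for illustration.

```shell
# Made-up sample data: name:uid:shell
sample='alice:1001:/bin/bash
bob:1002:/bin/zsh
carol:1003:/bin/bash
dave:1004:/usr/sbin/nologin
erin:1005:/bin/bash'

# grep drops nologin accounts, cut keeps the shell field, sort groups
# identical shells together, uniq -c counts each group, and the final
# sort puts the most common shell first.
shell_counts=$(printf '%s\n' "$sample" \
    | grep -v "nologin" \
    | cut -d ':' -f 3 \
    | sort \
    | uniq -c \
    | sort -rn)

echo "$shell_counts"

# wc -l reads the same data from stdin and counts the records.
total=$(printf '%s\n' "$sample" | wc -l)
echo "total records: $total"
```

None of the commands is given a filename, so every stage works purely on stdin and stdout.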
1. grep (Global Regular Expression Print)
The premier tool for filtering lines of text based on a pattern.
# Basic syntax
grep [options] pattern [file]
# Search for the word "error" in a log file (case-sensitive)
grep "error" /var/log/syslog
# Search for "error" (case-insensitive)
grep -i "error" /var/log/syslog
# Search for lines that do NOT contain "success"
grep -v "success" output.log
# Count how many lines contain the pattern
grep -c "GET" webserver.log
# Show the line number of matching lines
grep -n "username" /etc/passwd
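The options above can be tried without touching real log files. This sketch feeds a made-up five-line log to grep on standard input.

```shell
# Made-up log lines used only for this illustration.
log='INFO service started
ERROR disk full
info retry scheduled
error timeout
INFO success'

# Case-sensitive: only the lowercase "error timeout" line matches.
case_sensitive=$(printf '%s\n' "$log" | grep -c "error")

# -i also matches "ERROR disk full".
case_insensitive=$(printf '%s\n' "$log" | grep -c -i "error")

# -v inverts the match: every line except "INFO success".
without_success=$(printf '%s\n' "$log" | grep -c -v "success")

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$log" | grep -n "ERROR")

echo "$case_sensitive $case_insensitive $without_success"
echo "$numbered"
```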
2. cut
Used to extract specific columns or fields from a structured text file (like CSV files or /etc/passwd).
# Basic syntax
cut [options] [file]
# Extract the first field from a CSV file
cut -d ',' -f 1 employees.csv
# Extract fields 1 and 3 from /etc/passwd
cut -d ':' -f 1,3 /etc/passwd
# Extract the first 5 characters from every line
cut -c 1-5 data.txt
- -d: Sets the delimiter (default is tab).
- -f: Specifies which field(s) to extract.
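A short sketch of field and character selection, using two made-up records in the /etc/passwd layout. The open-ended range 6- (field 6 through the last field) is an extra form not listed above.

```shell
# Made-up colon-delimited records in /etc/passwd style.
records='alice:x:1001:1001:Alice:/home/alice:/bin/bash
bob:x:1002:1002:Bob:/home/bob:/bin/zsh'

# Fields 1 and 3: login name and user ID.
names_and_uids=$(printf '%s\n' "$records" | cut -d ':' -f 1,3)

# Characters 1 through 3 of every line.
first_three_chars=$(printf '%s\n' "$records" | cut -c 1-3)

# Field 6 through the end of the line.
from_field_six=$(printf '%s\n' "$records" | cut -d ':' -f 6-)

echo "$names_and_uids"
echo "$first_three_chars"
echo "$from_field_six"
```

Note that cut keeps the original delimiter between the fields it prints.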
3. sort (Sort Lines of Text)
The sort command orders lines alphabetically or numerically.
# Sort alphabetically
sort names.txt
# Sort in reverse order
sort -r names.txt
# Sort numerically
sort -n numbers.txt
# Sort a comma-separated file by the 3rd column numerically
sort -t ',' -n -k 3 data.csv
- -r: Reverse sorting order.
- -n: Numeric sort.
- -k: Sort by a specific column.
- -t: Sets the field separator (by default, fields are separated by whitespace).
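The difference between alphabetical and numeric ordering is easiest to see on a few numbers. In this sketch the data is made up, LC_ALL=C is set only to make the alphabetical order predictable across systems, and -k 3,3 restricts the key to the third field alone.

```shell
numbers='10
9
100
2'

# Alphabetical comparison works character by character, so "100" sorts before "2".
lexical=$(printf '%s\n' "$numbers" | LC_ALL=C sort)

# -n compares the values as numbers.
numeric=$(printf '%s\n' "$numbers" | sort -n)

# Made-up comma-separated records: name,department,salary
rows='alice,eng,5200
bob,ops,950
carol,eng,12000'

# -t sets the field separator; -k 3,3 uses only the third field as the key.
by_salary=$(printf '%s\n' "$rows" | sort -t ',' -k 3,3 -n)

echo "$lexical"
echo "$numeric"
echo "$by_salary"
```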
4. uniq (Report or Omit Repeated Lines)
The uniq command removes or finds duplicate lines.
Important: It only removes adjacent duplicates, so the input is usually sorted first.
# Remove duplicate lines
sort log.txt | uniq
# Count occurrences of each line
sort log.txt | uniq -c
# Show only duplicated lines
sort log.txt | uniq -d
- -c: Show count of each line.
- -d: Show only duplicate lines.
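The adjacency rule is worth seeing once. In this sketch (sample lines made up for illustration), uniq on unsorted input leaves repeats behind, while sorting first removes them all.

```shell
lines='apple
banana
apple
apple
banana'

# Without sorting, only the back-to-back "apple" pair is collapsed.
unsorted=$(printf '%s\n' "$lines" | uniq)

# Sorting first makes every duplicate adjacent, so uniq removes them all.
sorted_unique=$(printf '%s\n' "$lines" | sort | uniq)

# -d lists each line that occurs more than once.
duplicates=$(printf '%s\n' "$lines" | sort | uniq -d)

echo "unsorted:" $unsorted
echo "sorted:" $sorted_unique
echo "duplicated:" $duplicates
```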
5. wc (Word Count)
The wc command prints counts of lines, words, and bytes.
# Count lines, words, and bytes
wc story.txt
# Count only the number of lines
wc -l < /etc/passwd
# Count the entries in the current directory (hidden files are not included)
ls -1 | wc -l
- -l: Count lines.
- -w: Count words.
- -c: Count bytes.
Finding Files
You know a file is on your system, but you don't remember where. Manually searching through directories is inefficient.
Linux provides two powerful tools for this: the robust, real-time find and the lightning-fast locate.
find
The find command is one of the most powerful tools in the Linux arsenal.
Basic Syntax:
find [starting/path] [options] [expression] -action
1. Finding Files by Name
# Find a file named "config.txt" anywhere under the /home directory
find /home -name "config.txt"
# Case-insensitive search
find /home -iname "config.txt"
# Use wildcards (* and ?)
find /var -name "*.log"
2. Finding Files by Type
# Find all directories named "bin"
find /usr -type d -name "bin"
# Find all regular files
find /home -type f
# Find all symbolic links
find /usr -type l
3. Finding Files by Size
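# Files larger than 100 MiB
find / -size +100M
# Files smaller than 1 KiB (1024 bytes)
find /home -size -1024c
# Files exactly 1024 bytes
find /tmp -size 1024c
- + → Larger than
- - → Smaller than
- No sign → Exact size
- Units: c (bytes), k (KiB), M (MiB), G (GiB)
Sizes are rounded up to whole units before they are compared, so -size -1k matches only empty files. Use a byte count such as -1024c when you need an exact threshold.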
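How the unit affects a size comparison can be checked on a few throwaway files. This is a sketch assuming GNU or BSD find (the k suffix is not in POSIX), and the file names and sizes are made up for illustration.

```shell
workdir=$(mktemp -d)

# Create an empty file, a 500-byte file, and a 2048-byte file.
: > "$workdir/empty.dat"
printf '%500s' '' > "$workdir/small.dat"
printf '%2048s' '' > "$workdir/large.dat"

# Sizes are rounded up to whole units before comparing, so 500 bytes
# counts as 1k and "-size -1k" matches only the empty file.
under_1k_units=$(find "$workdir" -type f -size -1k | wc -l)

# Comparing in bytes (c) avoids the rounding.
under_1024_bytes=$(find "$workdir" -type f -size -1024c | wc -l)

# 2048 bytes is 2k, which is larger than 1k.
over_1k=$(find "$workdir" -type f -size +1k | wc -l)

echo "$under_1k_units $under_1024_bytes $over_1k"

rm -r "$workdir"
```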
4. Finding Files by Owner / Permissions
# Files owned by user "www-data"
find /var -user www-data
# Files with the SUID bit set
find /usr -perm /4000
# Files that "others" (users who are neither the owner nor in the group) may execute
find . -perm /o=x
5. Taking Action on Found Files (Using -exec)
The real power of find is automating tasks on the files it finds.
The {} placeholder represents the found file, and \; marks the end of the command.
# Delete all ".tmp" files
find /tmp -name "*.tmp" -delete
# Same result using -exec with rm
find /tmp -name "*.tmp" -exec rm {} \;
# Change ownership of HTML files
find /var/www -name "*.html" -exec chown alice {} \;
# Display found files with details
find /etc -name "*.conf" -exec ls -l {} \;
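A cautious way to practice -exec is on throwaway files. The sketch below uses made-up file names in a temporary directory. It also shows the + terminator, an alternative to \; not covered above, which passes many found paths to a single invocation of the command.

```shell
workdir=$(mktemp -d)
touch "$workdir/a.tmp" "$workdir/b.tmp" "$workdir/c.tmp" "$workdir/keep.txt"

# "\;" runs the command once per file: echo is invoked three times,
# producing three output lines.
per_file=$(find "$workdir" -name "*.tmp" -exec echo {} \; | wc -l)

# "+" batches the found paths into one invocation: echo runs once
# and prints all three paths on a single line.
batched=$(find "$workdir" -name "*.tmp" -exec echo {} + | wc -l)

# Preview the matches with -print, then delete the same set.
find "$workdir" -name "*.tmp" -print
find "$workdir" -name "*.tmp" -exec rm {} +
remaining=$(ls "$workdir")

echo "per-file runs: $per_file, batched runs: $batched, left: $remaining"

rm -r "$workdir"
```

Running the same expression with -print first is a simple habit that shows exactly which files a destructive action would touch.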
locate
The locate command performs a rapid database search for filenames and displays results instantly.
# Search for paths containing "passwd"
locate passwd
# Case-insensitive search
locate -i "report2023"
# Limit the number of results
locate -n 5 ".log"
Updating the locate database:
sudo updatedb
If you created a file recently and locate cannot find it, update the database using sudo updatedb.
find vs locate (Quick Comparison)
| Feature | find | locate |
|---|---|---|
| Speed | Slower (searches filesystem in real-time) | Very fast (searches a database) |
| Accuracy | Always accurate and up-to-date | May be outdated if database isn't updated |
| Advanced Filters | Yes (size, type, owner, permissions) | Limited filtering |
| Best Use | Precise file searches | Quick filename searches |