Ever wondered whether you could do some quick text processing in a bash script? For example, you could analyze a text, or sort the output of a command and present it in a more meaningful way.
The good thing about working this way is that you can search for a single query in a mammoth document. Say you are looking for a specific name in a document and need to return every line that name appears in; without a filter program, you would have a hard time doing that.
Before we go into the program that can carry out the aforementioned operation, let's understand what a filter is all about...
A filter is a *nix or GNU/Linux command-line program that has the following properties:
- It takes its input from standard input (by default, your keyboard) or from the contents of one or more files. The input has to come from somewhere, be it the keyboard, a file, or the output of another program.
- It performs some processing on that data. The kind of processing is what separates one filter program from another.
- It produces output based on that input: you give it an input, you specify what you want done with it, and it gives you an output based on the operation it performs.
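As a rough sketch of those three input sources, the snippet below feeds the same three lines to wc -l from a file argument, from redirected standard input, and from a pipe. The temporary file and the variable names are made up for this illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# 1. Input from a file named as an argument (wc also prints the file name).
from_file=$(wc -l "$sample")

# 2. Input from standard input, redirected from the same file.
from_stdin=$(wc -l < "$sample")

# 3. Input from the output of another program, through a pipe.
from_pipe=$(printf 'alpha\nbeta\ngamma\n' | wc -l)

echo "file argument : $from_file"
echo "standard input: $from_stdin"
echo "pipe          : $from_pipe"

rm -f "$sample"
```

In all three cases the processing is the same; only the place the data comes from changes.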
This might get a bit confusing, as you might think that all GNU/Linux programs are filters, which kind of makes sense since they all produce an output. This is not entirely true, and to clarify what I am saying, I'll show you an example of what a filter looks like and what it doesn't look like.
wc is a filter. The processing it performs is counting the lines, words, and characters (bytes, by default) from standard input or from one or more files. The output it produces is the numbers that represent those counts. Making sense?
Let's take an example:
devsrealm@server:~/bin$ wc data.txt
7 58 305 data.txt
devsrealm@server:~/bin$
The above prints the number of lines, words, and characters in the file "data.txt".
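If you only need one of the three numbers, wc accepts the -l, -w, and -c options. Here is a minimal sketch; the sample file is a stand-in created for the example, not the "data.txt" shown above.

```shell
# Hypothetical two-line sample file for this demonstration.
sample=$(mktemp)
printf 'one two three\nfour five\n' > "$sample"

# Reading from standard input keeps the file name out of the output.
lines=$(wc -l < "$sample")   # 2 lines
words=$(wc -w < "$sample")   # 5 words
bytes=$(wc -c < "$sample")   # 24 bytes (-c counts bytes, not characters)

echo "lines=$lines words=$words bytes=$bytes"

rm -f "$sample"
```

For plain ASCII text, the byte count and the character count are the same.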
To run it on everything in a given directory, you do:
wc *
devsrealm@server:~/bin$ wc *
wc: backup: Is a directory
0 0 0 backup
wc: book: Is a directory
0 0 0 book
wc: cat,: Is a directory
0 0 0 cat,
0 0 0 ch1.sh
7 58 305 data.txt
71 450 2864 dr_optimization.sh
25 78 482 entrance.sh
2 2 18 example1.sh
You can see how it filters and gives the lines, words, and characters of each file. For a directory, it prints an "Is a directory" error and reports zero counts.
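If the "Is a directory" noise gets in the way, one possible approach is to test each entry before handing it to wc. This is only a sketch; the directory and file names are invented for the demonstration.

```shell
# Hypothetical working directory with one sub-directory and two files.
workdir=$(mktemp -d)
mkdir "$workdir/backup"
printf 'hello world\n' > "$workdir/data.txt"
printf 'a\nb\n' > "$workdir/notes.txt"

# Run wc only on regular files; directories are skipped silently.
report=$(
    cd "$workdir" || exit 1
    for entry in *; do
        if [ -f "$entry" ]; then
            wc "$entry"
        fi
    done
)

printf '%s\n' "$report"

rm -rf "$workdir"
```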
Now, if you type wc on its own, you'll see it waits for input. Type any words or characters, hit Enter, rinse and repeat, and once you are done, press CTRL + D to indicate the end of input. wc then processes everything you typed and gives you the counts, e.g.:
devsrealm@server:~/bin$ wc
Things in Life Aren't That Black and White
Use your Brain With Whatver You Do
You Only Have A Time In Life
Use It Well
Gracias 4 26 126
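Inside a script there is nobody to press CTRL + D, so a here-document is one way to hand wc the same kind of typed input. The sketch below uses two short lines of sample text; the variable name is arbitrary.

```shell
# The here-document plays the role of the keyboard;
# the closing EOF marker plays the role of CTRL + D.
counts=$(wc <<'EOF'
Use It Well
Gracias
EOF
)

# Prints the line, word, and byte counts: 2, 4, and 20.
echo "$counts"
```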
Having said that, let's take a look at programs that are not filters. In fact, most programs on GNU/Linux aren't filters, e.g.:
ls - This is not a filter because it reads no input. The only thing ls does is list files and directories. You can't even pipe anything to ls; if you try, it completely ignores whatever you piped to it.
Again, you can't pipe a program's output to ls, but you can pipe ls output to another program. The only thing ls does is produce output, nothing more.
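A quick way to convince yourself is to compare what ls prints with and without something piped into it. The directory and file names below are made up for the experiment.

```shell
# Hypothetical directory holding two empty files.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

# ls never reads standard input, so the piped text changes nothing.
with_pipe=$(echo "this text is ignored" | ls "$workdir")
without_pipe=$(ls "$workdir")

if [ "$with_pipe" = "$without_pipe" ]; then
    echo "ls ignored the piped input"
fi

# The other direction works: ls output becomes the input of a filter.
file_count=$(ls "$workdir" | wc -l)
echo "entries: $file_count"

rm -rf "$workdir"
```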
A good way to know whether a program is a filter is to ask yourself whether you can pipe information from another program into it and have it use that information in some meaningful way. If the answer is yes, it is a filter; otherwise, it isn't.
To summarize, a filter is a program or text processing tool used to process the data produced by other programs or the data in files.
Here is a list of major filters that are used regularly in shell scripting:
Filter | Processing performed |
cat | Displays whatever input it takes |
more | Similar to cat, but with pagination |
grep | Displays the lines of its input that contain a certain pattern |
wc | Counts lines, words, and characters |
sort | Sorts lines of text |
tee | Duplicates its input, writing it to files and to the screen |
sed | Basic editing of a text stream |
awk | General-purpose text processing, so almost anything |
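To give a feel for how these filters chain together, here is a small sketch that finds every line mentioning a name, which is the kind of search described at the start. The file "visits.txt", its contents, and the name "mary" are all invented for the example.

```shell
# Hypothetical input file, built with cat and a here-document.
workdir=$(mktemp -d)
log="$workdir/visits.txt"
cat > "$log" <<'EOF'
mary  login
james login
mary  upload
peter login
mary  logout
EOF

# grep keeps only the lines that mention "mary";
# sed capitalizes the name; sort orders the lines;
# tee saves a copy to a file while passing the data on;
# wc -l counts the lines that are left.
matches=$(grep 'mary' "$log" | sed 's/mary/Mary/' | sort | tee "$workdir/mary.txt" | wc -l)

# awk prints the second whitespace-separated field of each saved line.
actions=$(awk '{ print $2 }' "$workdir/mary.txt")

echo "matching lines: $matches"
echo "$actions"

rm -rf "$workdir"
```

Each program in the chain reads what the previous one wrote, which is exactly the pipe test for a filter described above.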