Pattern Search & Processing Commands:
Several commands help extract exactly the data you need from large volumes of text, and scripts rely on them to filter and shape their output. The main ones are:
awk
sed
grep
sort
diff
AWK:
Awk is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions on those lines.
The name awk comes from the initials of its developers: Aho, Weinberger, and Kernighan.
Syntax: awk options 'selection_criteria {action}' input-file > output-file
$ cat > employee.txt
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000
$ awk '/manager/ {print}' employee.txt
ajay manager account 45000
varun manager sales 50000
amit manager account 47000
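The selection criterion does not have to be a regular expression; a comparison on a field works as well. The following is a minimal sketch, assuming the employee.txt layout shown above (name, designation, department, salary). The scratch directory and the variable names are assumptions made for this illustration.

```shell
# Recreate the sample data in a scratch directory (assumed location, not part of the original example).
workdir=$(mktemp -d)
cat > "$workdir/employee.txt" <<'EOF'
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000
EOF

# Pattern only: print is the default action, so {print} can be omitted.
managers=$(awk '/manager/' "$workdir/employee.txt")

# Comparison as the selection criterion: name ($1) and salary ($4) where the salary exceeds 45000.
high_paid=$(awk '$4 > 45000 {print $1, $4}' "$workdir/employee.txt")

echo "$managers"
echo "$high_paid"
```

The second command prints only the first and fourth fields, which shows that the action can reshape a line rather than just echo it.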
Built-in variables in AWK:
NR: The NR variable keeps a running count of the number of input records read so far. Remember that records are usually lines. Awk performs the pattern/action statements once for each record in a file.
NF: The NF variable holds the number of fields in the current input record.
FS: The FS variable contains the field separator used to divide the input line into fields. The default is “white space”, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
RS: The RS variable stores the current record separator. Since an input line is the input record by default, the default record separator is a newline.
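The four variables above can be sketched on a few lines of throwaway input. The sample strings and variable names below are hypothetical and exist only for this illustration.

```shell
# Three records with 4, 2 and 3 fields respectively (hypothetical data).
sample=$(printf 'ajay manager account 45000\nsunil clerk\nvarun manager sales\n')

# NR is the current record number, NF the number of fields in that record.
counts=$(printf '%s\n' "$sample" | awk '{print NR, NF}')

# FS assigned in BEGIN: split on ":" instead of white space and print the second field.
second=$(printf 'a:b:c\nd:e:f\n' | awk 'BEGIN {FS = ":"} {print $2}')

# RS assigned in BEGIN: ";" separates records, so a single input line yields three records.
records=$(printf 'x;y;z' | awk 'BEGIN {RS = ";"} {print NR ": " $0}')

echo "$counts"
echo "$second"
echo "$records"
```

Setting FS or RS inside BEGIN matters because BEGIN runs before the first record is read; an assignment made in an ordinary action would only take effect from the next record onward.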
GREP:
The grep filter searches a file for a particular pattern of characters and displays all lines that contain that pattern.
Syntax:
grep [options] pattern [files]
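As a rough illustration of this syntax, the sketch below runs grep with a few commonly used options (-c, -i, -n, -v, -w). The file notes.txt and its contents are hypothetical.

```shell
# Small hypothetical input file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/notes.txt" <<'EOF'
Unix is great os.
unix is opensource.
learn operating system.
unixes differ in detail.
EOF

# -c: count matching lines. Matching is case-sensitive, so "Unix" on line 1 is not counted.
count=$(grep -c 'unix' "$workdir/notes.txt")

# -i ignores case, -n prefixes each matched line with its line number.
numbered=$(grep -i -n 'unix' "$workdir/notes.txt")

# -v inverts the match: only lines that do not contain the pattern.
others=$(grep -i -v 'unix' "$workdir/notes.txt")

# -w matches whole words only, so "unixes" is skipped.
words=$(grep -w 'unix' "$workdir/notes.txt")

echo "$count"
echo "$numbered"
echo "$others"
echo "$words"
```

Options can be combined freely, as the -i -n and -i -v calls show.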
Options Description
-c : Prints only a count of the lines that match the pattern.
-h : Displays the matched lines without the filename prefix.
-i : Ignores case when matching.
-l : Displays only the names of the files that contain a match.
-n : Displays the matched lines along with their line numbers.
-v : Prints all the lines that do not match the pattern.
-e exp : Specifies exp as a pattern. Can be used multiple times.
-f file : Takes patterns from file, one per line.
-E : Treats the pattern as an extended regular expression (ERE).
-w : Matches whole words only.
-o : Prints only the matched parts of a matching line, with each such part on a separate output line.
-A n : Prints the matched line and n lines after it.
-B n : Prints the matched line and n lines before it.
-C n : Prints the matched line and n lines before and after it.
SED:
SED is a powerful stream editor for text. It can insert, delete, search, and replace.
The sed command supports regular expressions, which allow it to perform complex pattern matching.
Syntax: sed OPTIONS... [SCRIPT] [INPUTFILE...]
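Substitution is the most common use, but the SCRIPT part of the syntax can also delete, print, or insert lines. The following is a minimal sketch on three hypothetical input lines, using the d, p, and i commands with a line-number or pattern address; the variable names are assumptions for this illustration.

```shell
# Hypothetical three-line input.
text=$(printf 'first\nsecond\nthird\n')

# Deletion: remove line 2.
deleted=$(printf '%s\n' "$text" | sed '2d')

# Search: -n suppresses automatic printing, p prints only lines matching /th/.
printed=$(printf '%s\n' "$text" | sed -n '/th/p')

# Insertion: i\ followed by a newline places the given text before line 2.
inserted=$(printf '%s\n' "$text" | sed '2i\
inserted line')

echo "$deleted"
echo "$printed"
echo "$inserted"
```

None of these commands changes the input; sed writes the edited stream to standard output.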
$ cat > testfile.txt
unix is great os. unix is opensource. unix is free os.
learn operating system.
unix linux which one you choose.
unix is easy to learn.unix is a multiuser os.Learn unix .unix is a powerful.
Replacing or substituting string:
By default, sed replaces only the first occurrence of the pattern on each line.
$ sed 's/unix/linux/' testfile.txt
Output:
linux is great os. unix is opensource. unix is free os.
learn operating system.
linux linux which one you choose.
linux is easy to learn.unix is a multiuser os.Learn unix .unix is a powerful.
Options in Sed:
Replacing all the occurrences of the pattern in a line (g flag):
$ sed 's/unix/linux/g' testfile.txt
Replacing from the nth occurrence to all later occurrences in a line (here n = 3):
$ sed 's/unix/linux/3g' testfile.txt
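To compare the substitution flags side by side, here is a small sketch on a single line taken from testfile.txt. The variable names are illustrative. It uses a bare number flag, which replaces only the nth occurrence. Combining a number with g, as in 3g, has behavior defined by GNU sed and may not work the same way in other sed implementations.

```shell
line='unix is great os. unix is opensource. unix is free os.'

# No flag: only the first occurrence on the line is replaced.
first=$(printf '%s\n' "$line" | sed 's/unix/linux/')

# g flag: every occurrence on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/unix/linux/g')

# Number flag: only the second occurrence is replaced.
second_only=$(printf '%s\n' "$line" | sed 's/unix/linux/2')

echo "$first"
echo "$all"
echo "$second_only"
```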
Parenthesize the first character of each word:
$ echo "Welcome To The Test Stuff" | sed 's/\(\b[A-Z]\)/(\1)/g'
Replacing string on a specific range of line numbers (here lines 1 to 3):
$ sed '1,3 s/unix/linux/' testfile.txt
SORT:
The sort command sorts a file, arranging its records in a particular order. By default, sort compares lines as text, assuming the contents are ASCII. Options such as -n make it sort numerically instead.
Syntax: sort [options] [files]
$ cat > testfile.txt
Dipesh
Akshay
Rahul
Aditya
Mohit
Command:
$ sort testfile.txt
Output:
Aditya
Akshay
Dipesh
Mohit
Rahul
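The difference between the default text ordering and numeric ordering is easiest to see on numbers of different lengths. The following sketch uses hypothetical data and illustrative variable names; it relies on the -n (numeric), -r (reverse), and -k (sort key) options.

```shell
# Text comparison goes character by character, so "100" sorts before "25" and "9".
as_text=$(printf '9\n100\n25\n' | sort)

# -n compares by numeric value instead.
as_numbers=$(printf '9\n100\n25\n' | sort -n)

# Hypothetical name/score pairs.
data=$(printf 'Dipesh 9\nAkshay 100\nRahul 25\n')

# -k2,2n: sort on the second field only, numerically.
by_score=$(printf '%s\n' "$data" | sort -k2,2n)

# -k2,2nr: same key, numeric and reversed (highest score first).
by_score_desc=$(printf '%s\n' "$data" | sort -k2,2nr)

echo "$as_text"
echo "$as_numbers"
echo "$by_score"
echo "$by_score_desc"
```

The n and r modifiers are attached to the key itself here because a key that carries its own ordering modifiers does not inherit the global ordering options.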
DIFF:
diff is short for difference. This command displays the differences between two files by comparing them line by line.
The important thing to remember is that diff describes the differences as instructions for making the two files identical: it tells you how to change the first file so that it matches the second file. The instructions use these special symbols:
a: add
c: change
d: delete
Syntax: diff [options] File1 File2
$ cat File1.txt
Gujarat
Uttar Pradesh
Kolkata
Bihar
Jammu and Kashmir
$ cat File2.txt
Tamil Nadu
Gujarat
Andhra Pradesh
Bihar
Uttar pradesh
$ diff File1.txt File2.txt
0a1
> Tamil Nadu
2,3c3
< Uttar Pradesh
< Kolkata
---
> Andhra Pradesh
5c5
< Jammu and Kashmir
---
> Uttar pradesh
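Each instruction line in this output reads as: line numbers in the first file, then a, c, or d, then line numbers in the second file. Lines starting with < come from the first file and lines starting with > come from the second. The sketch below recreates the two files in a scratch directory (an assumption for this illustration) and extracts just the instruction lines; it also shows that diff exits with status 1 when the files differ and 0 when they are identical.

```shell
# Recreate File1.txt and File2.txt in a scratch directory.
workdir=$(mktemp -d)
printf 'Gujarat\nUttar Pradesh\nKolkata\nBihar\nJammu and Kashmir\n' > "$workdir/File1.txt"
printf 'Tamil Nadu\nGujarat\nAndhra Pradesh\nBihar\nUttar pradesh\n' > "$workdir/File2.txt"

# Capture the output and the exit status (1 means the files differ).
status=0
changes=$(diff "$workdir/File1.txt" "$workdir/File2.txt") || status=$?

# Instruction lines are the ones that start with a line number.
instructions=$(printf '%s\n' "$changes" | grep '^[0-9]')

# Comparing a file with itself produces no output and exit status 0.
same_status=0
diff "$workdir/File1.txt" "$workdir/File1.txt" > /dev/null || same_status=$?

echo "$changes"
echo "exit status: $status"
echo "$instructions"
```

Here 0a1 means "after line 0 of the first file, add line 1 of the second", 2,3c3 means "change lines 2 to 3 of the first file into line 3 of the second", and 5c5 means "change line 5 into line 5 of the second file".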