Basics of Shell Scripting - Part 5

Pattern Search & Processing Commands:

Shell scripts often need to pull exactly the right data out of a large amount of text. The following commands search, filter, and process text so that a script can produce the output it needs:

  1. awk

  2. sed

  3. grep

  4. sort

  5. diff

AWK: Awk is used mainly for pattern scanning and processing. It scans one or more files for lines that match the specified patterns and performs the associated action on each matching line.

  1. Awk is named after its developers: Aho, Weinberger, and Kernighan.
    Syntax:

    1.      awk options 'selection_criteria {action}' input-file > output-file
      
       $ cat > employee.txt
       ajay manager account 45000
       sunil clerk account 25000
       varun manager sales 50000
       amit manager account 47000
       tarun peon sales 15000
       deepak clerk sales 23000
       sunil peon sales 13000
       satvik director purchase 80000
      
       $ awk '/manager/ {print}' employee.txt 
      
       ajay manager account 45000
       varun manager sales 50000
       amit manager account 47000
      

      Built-in Variables in AWK:

      NR: NR holds the number of input records read so far. Records are usually lines. Awk runs its pattern/action statements once for each record in a file.

      NF: NF holds the number of fields in the current input record.

      FS: FS holds the field separator used to split each input line into fields. The default is whitespace, meaning space and tab characters. FS can be reassigned to another character (typically in a BEGIN block) to change the field separator.

      RS: RS holds the current record separator. Because an input line is the input record by default, the default record separator is a newline.
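The built-in variables above can be tried out with a short sketch. The temporary file, its three rows (copied from the employee.txt sample), and the variable names `numbered`, `salaries`, `second`, and `records` are illustrative assumptions, not part of the original example.

```shell
# Illustrative data: the first three rows of the employee.txt sample
emp=$(mktemp)
cat > "$emp" <<'EOF'
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
EOF

# NR is the current record number, NF the number of fields in that record
numbered=$(awk '{ print NR, NF, $1 }' "$emp")
echo "$numbered"      # 1 4 ajay / 2 4 sunil / 3 4 varun (one per line)

# $NF is the last field of each record (here, the salary column)
salaries=$(awk '{ print $NF }' "$emp")
echo "$salaries"      # 45000 / 25000 / 50000 (one per line)

# FS: reassign the field separator in BEGIN to split colon-delimited input
second=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":" } { print $2 }')
echo "$second"        # b

# RS: treat ';' instead of newline as the record separator
records=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$records"       # 1: a / 2: b / 3: c (one per line)

rm -f "$emp"
```

Setting FS and RS inside BEGIN matters because BEGIN runs before the first record is read, so the new separators already apply to the first line.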

GREP:

  1. The grep filter searches a file for a particular pattern of characters and displays all lines that contain that pattern.

  2. Syntax:

     grep [options] pattern [files]
    

    Options and their descriptions:
    -c : Prints only a count of the lines that match the pattern.
    -h : Displays the matched lines, but not the filenames.
    -i : Ignores case when matching.
    -l : Displays only the names of the files that contain a match.
    -n : Displays the matched lines along with their line numbers.
    -v : Prints all the lines that do not match the pattern.
    -e exp : Specifies the expression to match. Can be used multiple times.
    -f file : Takes patterns from a file, one per line.
    -E : Treats the pattern as an extended regular expression (ERE).
    -w : Matches whole words only.
    -o : Prints only the matched parts of a matching line, with each such part on a separate output line.
    -A n : Prints the matched line and the n lines after it.
    -B n : Prints the matched line and the n lines before it.
    -C n : Prints the matched line and the n lines both before and after it.
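A few of the options above can be seen in action with the following sketch. The three-line sample file and the variable names are made up for illustration. The -A option is assumed to be available, as it is in GNU and BSD grep; it is not part of POSIX grep.

```shell
# Hypothetical sample file used only for this illustration
f=$(mktemp)
cat > "$f" <<'EOF'
Unix is great
learn unix
linux and unixware
EOF

count=$(grep -c 'unix' "$f")            # case-sensitive: lines 2 and 3
count_i=$(grep -ci 'unix' "$f")         # -i also matches "Unix" on line 1
numbered=$(grep -n 'learn' "$f")        # prefixes the match with its line number
inverted=$(grep -v 'unix' "$f")         # lines that do NOT contain "unix"
whole=$(grep -w 'unix' "$f")            # "unixware" is not a whole-word match
either=$(grep -e 'great' -e 'learn' "$f")   # lines matching either expression
after=$(grep -A 1 'great' "$f")         # the match plus one line after it

echo "$count"       # 2
echo "$count_i"     # 3
echo "$numbered"    # 2:learn unix
echo "$inverted"    # Unix is great
echo "$whole"       # learn unix

rm -f "$f"
```

Options can be combined, as with `-ci` above, which counts matches while ignoring case.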

    SED:

    Sed is a powerful stream editor for text. It can insert, delete, search for, and replace text.

    The sed command supports regular expressions, which allow it to perform complex pattern matching.
    Syntax:

     sed OPTIONS... [SCRIPT] [INPUTFILE...]
    
     $ cat > testfile.txt
     unix is great os. unix is opensource. unix is free os.
     learn operating system.
     unix linux which one you choose.
     unix is easy to learn.unix is a multiuser os.Learn unix .unix is a powerful.
    

    Replacing or substituting a string (by default, only the first occurrence on each line is replaced):

    Command and output:

     $ sed 's/unix/linux/' testfile.txt
     linux is great os. unix is opensource. unix is free os.
     learn operating system.
     linux linux which one you choose.
     linux is easy to learn.unix is a multiuser os.Learn unix .unix is a powerful.
    

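    Options in Sed:
    Replacing all occurrences of the pattern in a line:
    $ sed 's/unix/linux/g' testfile.txt
    Replacing from the nth occurrence through the last occurrence in a line (here, from the 3rd onward; this is GNU sed behavior, and POSIX leaves the combination of a number with g unspecified):
    $ sed 's/unix/linux/3g' testfile.txt
    Parenthesizing the first character of each word (\b, a word boundary, is a GNU sed extension):
    $ echo "Welcome To The Test Stuff" | sed 's/\(\b[A-Z]\)/(\1)/g'
    Replacing a string only within a range of lines (here, lines 1 through 3):
    $ sed '1,3 s/unix/linux/' testfile.txt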
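To make the effect of the substitution flags concrete, here is a small sketch that runs on the first line of testfile.txt. It also shows deletion with the `d` command, which the text mentions but does not demonstrate. The variable names are illustrative, and only flags that behave the same in GNU and BSD sed are used.

```shell
# First line of testfile.txt, held in a variable for the illustration
line='unix is great os. unix is opensource. unix is free os.'

# g flag: replace every occurrence on the line
all=$(printf '%s\n' "$line" | sed 's/unix/linux/g')
echo "$all"      # linux is great os. linux is opensource. linux is free os.

# numeric flag: replace only the 2nd occurrence on the line
second=$(printf '%s\n' "$line" | sed 's/unix/linux/2')
echo "$second"   # unix is great os. linux is opensource. unix is free os.

# line address + d command: delete line 2 of the input
deleted=$(printf 'one\ntwo\nthree\n' | sed '2d')
echo "$deleted"  # one / three (one per line)

# line range + substitution: only lines 1 through 2 are edited
ranged=$(printf 'unix\nunix\nunix\n' | sed '1,2 s/unix/linux/')
echo "$ranged"   # linux / linux / unix (one per line)
```

In every case sed writes the edited text to standard output and leaves its input unchanged.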

    SORT:

    The sort command sorts a file, arranging its records (lines) in a particular order. By default, sort treats the contents as text and compares lines character by character (plain ASCII order in the C locale). Options such as -n make it sort numerically instead.
    Syntax:

     $ sort testfile.txt
    
     $ cat > testfile.txt
     Dipesh
     Akshay
     Rahul
     Aditya
     Mohit
    
     Command:
     $ sort testfile.txt
    
     Output:
     Aditya
     Akshay
     Dipesh
     Mohit
     Rahul
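The difference between the default text ordering and numeric ordering is easiest to see on numbers. The sketch below uses made-up input, and `LC_ALL=C` is set on the default sort only to make its character-by-character ordering predictable across systems.

```shell
# Hypothetical numeric input for the illustration
nums=$(printf '10\n9\n100\n2\n')

# Default (text) order: "10" and "100" sort before "2" because '1' < '2'
text_order=$(printf '%s\n' "$nums" | LC_ALL=C sort)
echo "$text_order"     # 10 / 100 / 2 / 9 (one per line)

# -n compares the lines as numbers
numeric=$(printf '%s\n' "$nums" | sort -n)
echo "$numeric"        # 2 / 9 / 10 / 100 (one per line)

# -r reverses the order; combined with -n it sorts from largest to smallest
descending=$(printf '%s\n' "$nums" | sort -nr)
echo "$descending"     # 100 / 10 / 9 / 2 (one per line)

# -k selects the sort key: here, field 2 compared numerically
by_salary=$(printf 'ajay 45000\nsunil 25000\nvarun 50000\n' | sort -k2,2n)
echo "$by_salary"      # sunil 25000 / ajay 45000 / varun 50000 (one per line)
```

The `-k2,2n` form restricts the key to the second field, so the names in the first field do not influence the order.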
    

    DIFF:

    diff stands for difference. This command compares files line by line and displays the differences between them.
    The important thing to remember is that diff does not simply list the differing lines. Using a few special symbols, it prints instructions for changing the first file so that it matches the second file.

    Special symbols are:
    a: add
    c: change
    d: delete
    Syntax :

     diff [options] File1 File2
    
     $ cat File1.txt
     Gujarat
     Uttar Pradesh
     Kolkata
     Bihar
     Jammu and Kashmir
    
     $ cat File2.txt
     Tamil Nadu
     Gujarat
     Andhra Pradesh
     Bihar
     Uttar pradesh
    
     $ diff File1.txt File2.txt
     0a1
     > Tamil Nadu
     2,3c3
     < Uttar Pradesh
     < Kolkata
     ---
     > Andhra Pradesh
     5c5
     < Jammu and Kashmir
     ---
     > Uttar pradesh
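The a, c, and d instructions can be read as follows: the numbers before the letter refer to lines in the first file, and the numbers after it refer to lines in the second file. Lines prefixed with `<` come from the first file and lines prefixed with `>` come from the second. The sketch below uses two small made-up files (their contents and the variable names are illustrative) and compares them in both directions so that all three instructions appear.

```shell
# Two hypothetical files for the illustration
old=$(mktemp)
new=$(mktemp)
printf 'apple\nbanana\ncherry\n' > "$old"
printf 'apple\nblueberry\ncherry\ndate\n' > "$new"

# diff exits with status 1 when the files differ, so "|| true" keeps
# the script going even under "set -e"
forward=$(diff "$old" "$new" || true)
echo "$forward"
# 2c2          line 2 of the first file must change to match line 2 of the second
# < banana
# ---
# > blueberry
# 3a4          after line 3 of the first file, add line 4 of the second
# > date

# Swapping the arguments turns the "add" into a "delete"
backward=$(diff "$new" "$old" || true)
echo "$backward"
# 2c2
# < blueberry
# ---
# > banana
# 4d3          delete line 4 of the first file
# < date

rm -f "$old" "$new"
```

Because diff returns exit status 0 for identical files and 1 for differing files, a script can also use it directly in an `if` statement to test whether two files match.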