Splitting strings with IFS
Today I want to discuss splitting strings into tokens or “words”. I previously discussed how to do this with the <a href="http://bashcurescancer.com/reading-a-file-line-by-line.html">IFS variable</a> and promised a more in-depth discussion. Today, I will make the case for why you should use IFS to split strings rather than a subshell combined with <a class="zem_slink freebase/en/awk" title="AWK" rel="homepage" href="http://cm.bell-labs.com/cm/cs/awkbook/index.html">awk</a> or cut.
I wrote this script, which reads the /etc/passwd file line by line and prints the <a class="zem_slink freebase/en/user" title="User (computing)" rel="wikipedia" href="http://en.wikipedia.org/wiki/User_%28computing%29">username</a> of any user that has a UID greater than 10 and the shell /sbin/nologin. Each test function performs this task 10 times to lengthen the test:
[root@sandbox ~]# cat ifs-test.sh
#!/bin/bash

split_words_cut() {
    # execute 10 times
    for i in {0..9}
    do
        while read -r line
        do
            # get uid
            id=$(echo "$line" | cut -d: -f3)
            if [[ $id -gt 10 ]]
            then
                # get shell
                shell=$(echo "$line" | cut -d: -f7)
                if [[ '/sbin/nologin' == "$shell" ]]
                then
                    # print username
                    echo "$line" | cut -d: -f1
                fi
            fi
        done < /etc/passwd
    done
}

split_words_awk() {
    # execute 10 times
    for i in {0..9}
    do
        while read -r line
        do
            # get uid
            id=$(echo "$line" | awk -F: '{print $3}')
            if [[ $id -gt 10 ]]
            then
                # get shell
                shell=$(echo "$line" | awk -F: '{print $NF}')
                if [[ '/sbin/nologin' == "$shell" ]]
                then
                    # print username
                    echo "$line" | awk -F: '{print $1}'
                fi
            fi
        done < /etc/passwd
    done
}

split_words_native() {
    # execute 10 times
    for i in {0..9}
    do
        while read -r line
        do
            oldIFS=$IFS
            IFS=:
            # deliberately unquoted so the shell splits $line on IFS
            set -- $line
            IFS=$oldIFS
            # at this point $1 is the username, $3
            # is the uid, and $7 is the shell
            if [[ $3 -gt 10 ]] && [[ '/sbin/nologin' == "$7" ]]
            then
                echo "$1"
            fi
        done < /etc/passwd
    done
}

echo -e "---Cut---"
time split_words_cut >/dev/null
echo -e "\n---Awk---"
time split_words_awk >/dev/null
echo -e "\n---Native---"
time split_words_native >/dev/null
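The native splitting step can also be looked at in isolation. What follows is only a sketch, not part of the benchmark: the helper name passwd_field is made up for illustration, and it assumes a shell that supports local (bash does). Declaring IFS as local limits the change to the function, so nothing has to be saved and restored, and set -f keeps a field such as * from being expanded as a filename pattern.

```shell
# Hypothetical helper (not from the original script):
# print field number $1 of the colon-separated record $2.
passwd_field() {
    local n="$1" record="$2"
    local IFS=:        # scoped to this function, gone again on return
    set -f             # no globbing, so a literal '*' field stays intact
    set -- $record     # unquoted on purpose: split the record on ':'
    set +f             # note: this re-enables globbing unconditionally
    # refuse field numbers outside 1..number-of-fields
    [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || return 1
    shift $((n - 1))   # move the requested field into $1
    printf '%s\n' "$1"
}
```

For example, passwd_field 7 'daemon:x:2:2:daemon:/sbin:/sbin/nologin' prints /sbin/nologin without starting any external process.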
As you can see, using the shell itself is about two <a class="zem_slink freebase/en/order_of_magnitude" title="Order of magnitude" rel="wikipedia" href="http://en.wikipedia.org/wiki/Order_of_magnitude">orders of magnitude</a> faster than piping each line through awk or cut in a subshell:
[root@sandbox ~]# ./ifs-test.sh
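A related variation, offered here as a sketch rather than as something that was benchmarked above, is to let read do the splitting itself. Putting IFS=: in front of read applies it to that one command only, so there is no old value to save and restore. The function name users_with_shell and its three parameters are assumptions made for this example.

```shell
# Hypothetical helper (not part of the timed script):
# $1 = passwd-format file, $2 = UID threshold (exclusive), $3 = shell path.
users_with_shell() {
    local file="$1" min_uid="$2" want_shell="$3"
    local user pass uid gid gecos home shell
    # IFS=: affects only this read, which assigns one field per variable.
    while IFS=: read -r user pass uid gid gecos home shell
    do
        # skip lines whose third field is not a number above the threshold
        [ "$uid" -gt "$min_uid" ] 2>/dev/null || continue
        if [ "$shell" = "$want_shell" ]
        then
            printf '%s\n' "$user"
        fi
    done < "$file"
}
```

Calling users_with_shell /etc/passwd 10 /sbin/nologin should list the same users as the functions in the script, again without forking awk or cut for each line.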
Authored By Amit Agarwal
Amit Agarwal. Linux and photography are my hobbies. Creative Commons Attribution 4.0 International License.