Faster bash operations on files with file descriptors

I was writing a bash script that would do some processing while reading from and writing to files. That seemed pretty simple with

while IFS= read -r line
do
    # process "$line" here
    :
done < file

and then using redirection operators like “>” and “>>” to write to the file. I was done with the script pretty fast. So far so good, but when it went for real-life tests, no one was interested in using it. Why? Simple: it was taking too long. The script was reading about 10K lines and writing about 50 lines, and it was taking more than 10 minutes.

So I sat down to work out what could improve the performance of the script, and one change made the difference. The script was spending a lot of time opening and closing the file. Pretty evident, isn’t it!

When you put “>” or “>>” on an individual command, every such write requires bash to open the file, write to it and close it. We end up doing an unnecessary open and close for each write operation, which is a useless waste of CPU power and time. How to avoid this?

Open a file and get the file descriptor. Keep writing to the file descriptor and close the descriptor after you are done with the file operations.

exec 3> File

echo "" >&3

exec 3>&-
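Putting the read loop and the file descriptor together, a minimal sketch could look like the following. The temporary files and the "seen:" prefix are made up for illustration; only the exec 3>, >&3 and exec 3>&- parts come from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical input and output files, created only for this demo.
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' alpha beta gamma > "$infile"

# Open the output file once on FD 3 (this truncates it).
exec 3> "$outfile"

# Read the input line by line; every echo reuses the already-open FD 3.
while IFS= read -r line
do
    echo "seen: $line" >&3
done < "$infile"

# Close FD 3 once all writes are done.
exec 3>&-

cat "$outfile"
```

The output file is opened exactly once, no matter how many lines the loop writes.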

In the above commands,

 exec 3> File

will open File for writing on file descriptor 3 (FD 3), creating it if it does not exist and truncating it if it does.

Note that when you are working with an FD, you don’t need “>>” on every write: each echo writes at the current position in the file, so successive writes follow one another. If you want to append to an existing file instead of overwriting it, open it with

exec 3>> File
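As a small sketch of the difference between the two modes, the script below opens the same file first in append mode and then in truncate mode. The file name and its contents are hypothetical and exist only for this demo.

```shell
#!/usr/bin/env bash
# Hypothetical log file that already has some content.
logfile=$(mktemp)
echo "old line" > "$logfile"

# Append mode: the existing content is kept, new writes go to the end.
exec 3>> "$logfile"
echo "appended 1" >&3
echo "appended 2" >&3
exec 3>&-
after_append=$(cat "$logfile")

# Truncate mode: the file is emptied when it is opened, but the two
# writes still follow each other because FD 3 keeps its position.
exec 3> "$logfile"
echo "fresh 1" >&3
echo "fresh 2" >&3
exec 3>&-

printf 'After append:\n%s\n\nAfter truncate:\n' "$after_append"
cat "$logfile"
```

In both cases the writes inside the block use plain >&3; only the way the descriptor is opened decides whether the old content survives.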

Then you can write to the FD with redirection and finally close the descriptor with

 exec 3>&-
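If you want to check the effect on your own machine, here is a rough timing sketch. The line count n and the temporary files are arbitrary choices for the demo, and the numbers printed by bash's time keyword will vary with the system and file system, so treat it as an experiment and not as a benchmark.

```shell
#!/usr/bin/env bash
n=2000                # arbitrary number of writes for the demo
slow=$(mktemp)        # written with ">>" on every echo
fast=$(mktemp)        # written through FD 3, opened once

# Slow variant: bash opens and closes the file for every single write.
time for ((i = 0; i < n; i++)); do
    echo "line $i" >> "$slow"
done

# Fast variant: open once, write many times, close once.
exec 3> "$fast"
time for ((i = 0; i < n; i++)); do
    echo "line $i" >&3
done
exec 3>&-

# Both variants should have produced exactly the same file.
cmp -s "$slow" "$fast" && echo "outputs are identical"
```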

HTH
