Perl is faster than bash in some cases.
Some days back, I had to generate some data to be uploaded to a database. As usual, I assumed that bash would be faster, so I wrote the script to create the files in bash. But after 5 hours, only 10% of the data had been generated, which meant the whole run would take around 50 hours. Something did not look right to me, so I asked one of my colleagues. He suggested I run strace on the script.
A quick strace on the PID was shocking, but it made very clear what was happening.
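For reference, attaching strace to an already running process can look roughly like the sketch below. The helper name `trace_pid` is hypothetical and not taken from the original script. It assumes strace is installed and that you are allowed to trace the process (same user or root).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): attach strace to a
# running process and show only file-descriptor-related system calls.
trace_pid() {
    # -f follows child processes, -p attaches to the given PID,
    # -e trace=desc keeps only calls such as open, write, dup2 and close.
    strace -f -e trace=desc -p "$1"
}

# Example: trace_pid 12345   (stop with Ctrl-C; strace then detaches)
```

Filtering with `-e trace=desc` is only a convenience here; a plain `strace -p PID` shows the same calls mixed in with everything else.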
Here's an explanation of what was happening:
We saw that for every write there was:
write(1, "a\n", 2) = 2
dup2(10, 1) = 1
fcntl64(10, F_GETFD) = 0x1 (flags FD_CLOEXEC)
close(10) = 0
We knew that these calls are costly when they are repeated for every single line, and immediately understood what we should do. What was actually happening was that for each echo command the FD was opened, the file was appended to, and then the FD was closed again. This made it very clear why the script was running so slowly. So I quickly ran a test to verify that avoiding this would fix the issue I was facing.
I wrote one bash script and one perl script to test this and ran time on each. Here are the results.
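The pattern described above can be sketched in bash as follows. This is a minimal illustration under assumptions, not the original script: the function names, the output paths, and the line content "a" (borrowed from the strace output) are all hypothetical.

```shell
#!/usr/bin/env bash
# Slow pattern: the redirection is attached to each echo, so the shell
# reopens the file, points fd 1 at it, writes, restores fd 1 and closes
# the file again, once per line.
write_per_line() {
    local out=$1 count=$2 i=0
    : > "$out"                      # start from an empty file
    while [ "$i" -lt "$count" ]; do
        echo "a" >> "$out"
        i=$((i + 1))
    done
}

# Faster pattern: the redirection is attached to the whole loop, so the
# file is opened once before the loop and closed once after it.
write_once() {
    local out=$1 count=$2 i=0
    while [ "$i" -lt "$count" ]; do
        echo "a"
        i=$((i + 1))
    done > "$out"
}
```

Running each function under bash's `time` keyword with a large count, for example `time write_per_line /tmp/slow.txt 100000`, should make the difference visible. The exact numbers will vary from machine to machine.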
time output:
real 0m0.020s
user 0m0.004s
sys 0m0.005s
time output:
real 0m0.035s
user 0m0.001s
sys 0m0.008s
One more test to confirm the result:
time output:
real 0m0.018s
user 0m0.006s
sys 0m0.003s
As you can see, the perl script took much less user time on the CPU. That is because the file was opened only once and closed only after all the output had been written, so the perl script performs far fewer file operations than the equivalent bash script. The time taken by the bash script can be reduced drastically if we open the file once in the bash script as well. So the lesson I learned was this: if there are operations that you can remove from your script, even ones that do not seem to be a serious issue in the beginning, removing them can improve performance greatly.
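One way to "open" a file once in bash is to bind it to a spare file descriptor with `exec`, as in the sketch below. The function name `write_with_fd`, the choice of descriptor 3, and the line content are assumptions made for illustration. The shell may still do some descriptor bookkeeping around each echo, but the file itself is no longer reopened for every line.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: open the output file once on descriptor 3,
# write every line through it, then close it.
write_with_fd() {
    local out=$1 count=$2 i=0
    exec 3> "$out"          # open (and truncate) the file once
    while [ "$i" -lt "$count" ]; do
        echo "a" >&3        # write to the already-open descriptor
        i=$((i + 1))
    done
    exec 3>&-               # close the descriptor when done
}
```

This mirrors the open, print, close sequence that a perl script would typically use, and it leaves standard output free for progress messages.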
Authored By Amit Agarwal
Amit Agarwal, Linux and Photography are my hobbies. Creative Commons Attribution 4.0 International License.