Oct 03 2018
 

I have a Python script that, to oversimplify, reads very large log files and runs a whole bunch of regular expressions on each line. As it had started running inconveniently slowly, I had a look at improving the performance.

The conventional wisdom is that if you are reading a file (or standard input), then the simplest method is almost always the fastest :-

for line in logstream:
    processline(line)

But being stubborn, I looked at possible improvements and came up with :-

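from itertools import islice

# islicecount is the number of lines to pull from the stream per batch;
# it is set elsewhere in the script.
while True:
    # Read up to islicecount lines in one go
    buffer = list(islice(logstream, islicecount))
    if not buffer:
        # An empty batch means the stream is exhausted
        break
    for line in buffer:
        processline(line)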

This code has been updated twice: the first version added a splat to the output, and the second version (which was far more elegant) didn’t work. The version shown above is the final one.

This I benchmarked as being nearly 5% quicker – not bad, but nowhere near enough for my purposes.
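For anyone wanting to repeat the comparison, here is a rough sketch of how the two loops above could be timed against each other. It is not the benchmark I actually ran: the processline stand-in, the sample log line, the batch size of 1000 and the in-memory stream are all assumptions made up for illustration, and the timings will vary from machine to machine.

```python
import io
import time
from itertools import islice


def processline(line, counter):
    # Hypothetical stand-in for the real per-line work (the regex matching)
    counter[0] += len(line)


def plain_loop(stream, counter):
    # The simple version: iterate over the stream directly
    for line in stream:
        processline(line, counter)


def batched_loop(stream, counter, islicecount=1000):
    # The islice version: pull islicecount lines at a time (1000 is assumed)
    while True:
        buffer = list(islice(stream, islicecount))
        if not buffer:
            break
        for line in buffer:
            processline(line, counter)


def time_reader(reader, text):
    # Run one reader over a fresh in-memory stream; return (seconds, characters seen)
    counter = [0]
    start = time.perf_counter()
    reader(io.StringIO(text), counter)
    return time.perf_counter() - start, counter[0]


# Made-up sample data: 100,000 identical log lines
sample = "Oct  3 12:00:00 host daemon[123]: example message\n" * 100_000

plain_seconds, plain_total = time_reader(plain_loop, sample)
batched_seconds, batched_total = time_reader(batched_loop, sample)

print(f"plain:   {plain_seconds:.3f}s")
print(f"batched: {batched_seconds:.3f}s")
```

Both readers should see exactly the same number of characters; only the elapsed time differs, and with a trivial processline the difference may well be lost in the noise.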

The next step was to improve the regular expressions – I read somewhere that .* can be expensive and that [^\s]* was far quicker and often gave the same result. I replaced a number of .* occurrences in the “patterns” file and re-ran the benchmark to find (in a case with lots of regular expressions) the time had dropped nearly 25%.
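To show what "often gave the same result" means in practice, here is a small sketch with a made-up log line and made-up patterns (none of these come from my real "patterns" file). The two patterns agree when the captured field contains no whitespace and disagree when it does, so each replacement needs checking against real data.

```python
import re

# Hypothetical patterns: the same capture written with .* and with [^\s]*
greedy = re.compile(r"user=(.*) action=")
no_space = re.compile(r"user=([^\s]*) action=")  # [^\s]* is equivalent to \S*

# Field without whitespace: both patterns capture the same text
line = "user=alice action=login status=ok"
greedy_value = greedy.search(line).group(1)
no_space_value = no_space.search(line).group(1)
print(greedy_value, no_space_value)

# Field containing a space: only the .* version can match
spaced = "user=alice smith action=login status=ok"
greedy_spaced = greedy.search(spaced).group(1)
no_space_spaced = no_space.search(spaced)
print(repr(greedy_spaced), no_space_spaced)
```

The second case is the one to watch for: [^\s]* cannot cross the space in the field, so the pattern fails to match instead of returning a longer capture.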

The last step was to install Nuitka and use it to compile the Python script into a binary executable. This showed a further 25% drop – a script that started the day taking 15 minutes for one particular run ended the day taking just under 8 minutes.
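As a sanity check on those numbers, the three improvements can be compounded. This sketch assumes each percentage applies to the time left after the previous step and uses the rounded figures quoted above (5%, 25%, 25%), so it is only approximate.

```python
# Starting time for the run, in minutes
start_minutes = 15.0

# Rounded reductions quoted in the post: islice batching, regex changes, Nuitka
reductions = [0.05, 0.25, 0.25]

remaining = start_minutes
for reduction in reductions:
    # Each step removes its share of whatever time is left
    remaining *= 1 - reduction

overall = 1 - remaining / start_minutes
print(f"{remaining:.2f} minutes remaining, {overall:.0%} faster overall")
```

That works out at roughly 8 minutes, which is in line with the measured result given that the percentages were rounded.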

The funny thing is that the optimisation that took the longest and had the biggest effect on the code showed the smallest improvement!
