Product: TIBCO Spotfire®
Comparison of apply() statements and for() loops
The documentation states that apply() statements are faster than for() loops, but a customer is finding situations where the for() loop is faster. Which is correct?
First, note that the S+ programming documentation says apply() statements are 'generally' faster than for() loops; it never says that apply() is always the better choice. The documentation also hints that about 10,000 iterations is the point at which you might start to weigh one method against the other. Memory behavior differs as well: in a for() loop, memory is not released until the loop completes, whereas with apply() memory is released each time the function it calls returns, so you pay the cost of memory being repeatedly used and released. Lastly, apply() converts your object to a matrix, and a matrix stores its data by column, so performing an operation on each row can be slightly slower.
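The column-wise storage mentioned above can be seen in a small sketch. It is written in R-compatible S syntax and is assumed, not verified, to behave the same way in S+. The object names (m, d, dm) are hypothetical, and the as.matrix() call only mimics the conversion that apply() is described as performing.

```r
# Hypothetical 2 x 3 matrix; the values are filled in column by column.
m <- matrix(1:6, nrow = 2, ncol = 3)

# Flattening the matrix shows the storage order: column 1, then 2, then 3.
stored.order <- as.vector(m)

# A row therefore picks every second stored element,
# while a column is one contiguous run.
first.row <- m[1, ]
first.col <- m[, 1]

# A data frame has to become a matrix before it can be sliced by MARGIN;
# as.matrix() is used here to illustrate that conversion step.
d <- data.frame(a = c(1, 2), b = c(3, 4))
dm <- as.matrix(d)
```

Because each row is spread across the stored columns, row-wise work touches non-adjacent elements, which is consistent with the slight slowdown described above.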
The function apply(X, MARGIN, FUN) does not know what sort of output FUN() will produce until it has run FUN() on each slice of X. It then takes some time to analyze the results of all the calls and put them into the appropriate format. If you know that your function produces, for example, a scalar numeric output, you can do better than apply() with a for() loop: you can preallocate the result vector and you know exactly where to put the output of each iteration.
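The preallocation idea can be sketched as follows. This is a minimal illustration in R-compatible S syntax, not a benchmark. The data and the function row.range() are hypothetical, and which version is actually faster depends on your data and S+ version.

```r
# Hypothetical example data: a 1000 x 5 numeric matrix.
set.seed(1)
x <- matrix(rnorm(5000), nrow = 1000, ncol = 5)

# Hypothetical FUN that is known to return one number per slice.
row.range <- function(v) max(v) - min(v)

# apply() version: the shape of the result is only worked out
# after FUN has been called on every row.
res.apply <- apply(x, 1, row.range)

# for() version: because the output is known to be scalar,
# preallocate the result vector and fill it in place.
n <- nrow(x)
res.loop <- numeric(n)
for (i in 1:n) {
  res.loop[i] <- row.range(x[i, ])
}
```

Both versions produce the same vector, so you can time them on your own data (for example, by wrapping each version in a timing call) to see which one wins in your situation.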
One big advantage of the apply() family of functions is not the time savings (although you sometimes get a fair bit of that) but the fact that it can make the code more understandable. An apply() statement is usually easier for a human reader to follow because it requires the author to bundle what happens at each iteration into a function with a specific set of inputs and one output object.
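As a hedged illustration of the readability point, the sketch below computes the same per-column statistic both ways. It uses R-compatible S syntax, and the data and the function coef.var() are hypothetical names chosen for this example.

```r
# Hypothetical 4 x 2 data matrix.
x <- matrix(c(2, 4, 6, 8, 10, 20, 30, 40), nrow = 4, ncol = 2)

# Loop version: the per-column calculation is mixed in with
# the bookkeeping (preallocation, indexing, assignment).
cv.loop <- numeric(ncol(x))
for (j in 1:ncol(x)) {
  v <- x[, j]
  cv.loop[j] <- sqrt(var(v)) / mean(v)
}

# apply() version: the per-column work is bundled into one function
# with a single input (a column) and a single output (a number).
coef.var <- function(v) sqrt(var(v)) / mean(v)
cv.apply <- apply(x, 2, coef.var)
```

In the apply() version a reader can understand coef.var() on its own and then read the apply() call as "do this to every column".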