Matlab is weird

Try running this code in Matlab:

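% Approach 1: one vectorized binornd call
tic
a=binornd(1,0.9,10000,1);
toc
% Approach 2: threshold uniform random numbers
tic
a=rand(10000)<0.9;
toc
% Approach 3: binornd called once per loop iteration
tic
for i=1:10000
    a=binornd(1,0.9);
end
toc
% Approach 4: threshold one uniform random number per loop iteration
tic
for i=1:10000
    a=rand<0.9;
end
toc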

This is the output I get:

Elapsed time is 0.001861 seconds.
Elapsed time is 3.184994 seconds.
Elapsed time is 0.455896 seconds.
Elapsed time is 0.000405 seconds.

This seems very strange. I can see that the second approach is wasteful: a random number between 0 and 1 carries far more information than the single bit the cutoff reduces it to. But why is the final loop so fast? Aren’t loops meant to be slow in Matlab? It’s reassuring that a single vectorized binornd call is faster than the second approach, but even stranger that binornd is then so much slower inside the loop of the third approach!

My conclusion is that Matlab is a very strange platform, and that you should be very careful about assuming that one way of doing something will be the fastest. It also drives home the point that it’s a pretty dodgy environment to use for performance testing of algorithms!
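
One way to make comparisons like this less sensitive to one-off effects is MATLAB's timeit function, which runs a function handle several times and reports a typical time. The sketch below is an illustration, not part of the original experiment. It assumes the Statistics Toolbox is installed (for binornd) and a MATLAB release that includes timeit. Note that calling through an anonymous function adds a little overhead of its own.

```matlab
% Time one vectorized call that draws 10000 Bernoulli(0.9) samples.
tVec = timeit(@() binornd(1, 0.9, 10000, 1));

% Time a single scalar call, to estimate the per-call cost inside a loop.
tScalar = timeit(@() binornd(1, 0.9));

fprintf('Vectorized call:           %g s\n', tVec);
fprintf('One scalar call:           %g s\n', tScalar);
fprintf('10000 scalar calls (est.): %g s\n', 10000 * tScalar);
```

If the estimated cost of 10000 scalar calls is far larger than the single vectorized call, most of the time in the third approach is probably per-call overhead rather than random number generation.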

3 Responses to “Matlab is weird”

  1. Scott Hirsch Says:

    You are coming across some of the subtleties of the JIT (Just in Time Compiler) in MATLAB, along with the overhead associated with calling functions. The JIT silently compiles MATLAB code behind the scenes, leading to much faster performance in certain cases. One thing it is particularly good at is tackling small for loops, which can make for loops as fast as vectorized code. This is what’s happening in your 4th example.

    MATLAB has a bit of overhead associated with making a function call, which is why the third example is much slower than the first (10,000 calls vs. 1). The big remaining question is why does #4 get fast while #3 is much slower? This is because ‘<’ is a “built-in” function (not written in M), which avoids the function calling overhead.
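
    A rough way to see call overhead in isolation is to run the same comparison twice, once inline and once wrapped in a function. This is only a sketch: bern is a hypothetical helper introduced here for illustration, and the overhead of an anonymous function handle is not necessarily the same as that of an M-file function such as binornd.

    ```matlab
    n = 10000;

    % Hypothetical helper, not from the original post: wraps the same
    % comparison in an anonymous function so each iteration adds one call.
    bern = @(p) rand < p;

    tic
    for i = 1:n
        a = rand < 0.9;      % inline: only built-in operations
    end
    tInline = toc;

    tic
    for i = 1:n
        a = bern(0.9);       % same work, plus one function-handle call
    end
    tCall = toc;

    fprintf('Inline: %g s, via handle: %g s\n', tInline, tCall);
    ```

    The exact numbers will depend on the MATLAB release and on how the JIT treats each loop, so treat a single run as indicative only.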

    I agree that care should be taken when using MATLAB to performance test algorithms, as it can be difficult in some cases to distinguish between the system and the algorithm. FWIW, we are very hard at work at not only improving MATLAB performance, but in making it much more predictable.

    • explainaway Says:

      Hi Scott,

      Thanks for your reply – it’s great to know that Mathworks are listening to the community!

      Dave.

  2. Johan Says:

    rand(10000) produces a 10000×10000 matrix of rands…try rand(10000,1)!
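
    A quick check of the sizes involved, followed by the second approach timed with the suggested fix. This is a minimal sketch in the style of the original test; no timings are quoted here because they will differ from machine to machine.

    ```matlab
    % A single size argument n gives an n-by-n matrix.
    disp(size(rand(3)))

    % Two size arguments give the 10000-by-1 column the test intended.
    tic
    a = rand(10000, 1) < 0.9;
    toc

    disp(size(a))
    ```

    With the original rand(10000), the second approach generates 100,000,000 values instead of 10,000, so its timing is not comparable with the other three.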

    The difference between (3) and (4) also doesn’t surprise me, because an operation like binornd() does a whole lot of overhead checks (‘edit binornd’ should show you this). The 4th for-loop isn’t that bad because you’re not calling any M-file functions inside the loop, whereas in (3) every binornd() call goes on to call further functions internally. That’s what grinds the looping to a halt, because of having to read all those called functions from other files, manage memory, etc.
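
    To check which functions are built in and which live in files, exist and which can help. A small sketch, assuming the Statistics Toolbox is on the path so that binornd can be found:

    ```matlab
    % exist(name, 'builtin') returns 5 for a built-in function, 0 otherwise.
    randIsBuiltin    = exist('rand', 'builtin') == 5;
    binorndIsBuiltin = exist('binornd', 'builtin') == 5;

    fprintf('rand built-in:    %d\n', randIsBuiltin);
    fprintf('binornd built-in: %d\n', binorndIsBuiltin);

    % which prints where each function is defined.
    which rand
    which binornd
    ```

    A function that which reports as a file on disk can be opened with edit to see how much work it does before it reaches the random number generator.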

    Basically, if approach 2 really is the fastest of them all, the ordering of the crunchtimes makes sense to me. There’s nothing wrong with Matlab…!
