So we're incrementing here, correct? I'm not sure multi-threading will help at all... Also, for timing, are we just benchmarking how long it takes to count from 0 to the specified upper bound (1 million, 2 million, 5 million, 10 million, 1 billion), incrementing with a step of 1? I could write this in C for the PIC18F452, but it wouldn't set any records here (40 MHz chip). How are you benchmarking things here? Time to exit, or is the program supposed to keep track of this itself?
Anyway, on a 3.2 GHz P4 I'm getting the following with C++ (MS VC++ 2005 Express), no real optimizations:
1M: 2.406sec
2M: 4.937sec
5M: 12.437sec
10M: 24.797sec
1000M: 713.672sec
I'm not sure the function I'm using for timing is giving me accurate measurements. Its resolution isn't very good as far as I can tell: it doesn't like measuring anything that takes much less than a few thousand ICs.
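
For what it's worth, here is a rough sketch of how the program could keep track of the time itself rather than relying on time-to-exit. This is not the code behind the numbers above. It assumes a C++11 compiler, since <chrono> doesn't exist in VC++ 2005, and the helper names count_to and time_count_seconds are made up for illustration. The counter is volatile so an optimizing build can't simply delete the loop.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: counts from 0 up to `limit` in steps of 1.
// `volatile` forces every increment to actually happen, so the
// compiler cannot replace the loop with a constant.
std::uint64_t count_to(std::uint64_t limit) {
    volatile std::uint64_t counter = 0;
    while (counter < limit) {
        counter = counter + 1;
    }
    return counter;
}

// Hypothetical helper: returns the elapsed wall-clock time, in seconds,
// for one counting run. steady_clock is monotonic, so the result is
// not affected by system clock adjustments.
double time_count_seconds(std::uint64_t limit) {
    const auto start = std::chrono::steady_clock::now();
    const std::uint64_t result = count_to(limit);
    const auto stop = std::chrono::steady_clock::now();
    (void)result;  // only the elapsed time matters here
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_count_seconds(1000000) from main and printing the result would give the 1M figure, and so on for the other bounds. For very short runs it probably makes sense to repeat the call several times and average, since a single measurement near the clock's resolution is mostly noise.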