Trying to repeat experiment 100 times but results vary for elapsed time

**IHeartProgramming** · October 21st, 2012, 07:44 PM

Hi, I'm trying to time my sorting methods for my class, and I'm unsure why the elapsed time for each one after the first is wrong, I'm updating the start and end times. I'm currently timing my merge sort.

this is what I do using command lines and save to all the running times to a text file: run_times.txt
For example: >java Insertion_sort >> run_times.txt

When I check the run_times.txt I get for example:
elapsed time: 6000ms

run time: 1000ms

run time: 1020ms

run time: 1080ms

*I know the running time of approximatley 6000ms is correct, but why are the other times not timed properly. I reset start and end times after the inner for loop done inserting 1,000,000 items. I hope it's a silly mistake I made...

public static void main(String[] args)
	{
		int array = create_array(size);
 
		long start = 0;
		long end = 0;
 
		for ( int j=0; i<50; i++ )//run a total of 50 times
		{
			start = System.currentTimeMillis();//restarting time after all 1 000 000 items sorted
 
			for ( int j = 0; j < 1_000_000; ++j )//turn this into a method, don't do more work by reading values from an array
				mergeSort(array, 1_000_000);
 
			end = System.currentTimeMillis();
 
			System.out.println("Run time: " + (end - start) );
	}

**Zaphod_b** · October 21st, 2012, 09:53 PM

Originally Posted by IHeartProgramming

...
When I check the run_times.txt I get for example:
elapsed time: 6000ms

run time: 1000ms

run time: 1020ms

run time: 1080ms

*I know the running time of approximately 6000ms is correct, but why are the other times not timed properly?

I scratched my head over the same phenomenon when I started looking at Java a few months ago. Repeated calls should "even out" minor timing differences due to operating system busyness. At least that's the way that I did things with C++. Time slot granularity was different between Linux and Windows, so "minor" variations amounted to a couple of milliseconds on one system and 50 milliseconds on another. Different Linux kernels gave different granularity. None of this amounted to factors of 2 to 4 for successive loops in the same program.

Well...

See this paper: Interpreting Java Runtimes

Not the part about merge sort hashtable, etc. But the part that discusses how the Java Virtual Machine very helpfully isolates you from the hardware and makes precise timing a little problematic.

Anyhow...

The JVM loads classes as it needs them. (I'm talking about Java 1.6.0_22 here. Older and newer versions of the JVM might actually differ from my observations.)

So...

The first call to whatever routine you are using to sort loads a bunch of class stuff into the JVM. Successive loops with calls to the same class (and underlying classes) have less overhead, so they are faster.

Bottom line: Using times from loops like yours doesn't always tell you what you would like to know.

Cheers!

Z

**helloworld922** · October 21st, 2012, 10:36 PM

Originally Posted by Zaphod_b

I scratched my head over the same phenomenon when I started looking at Java a few months ago. Repeated calls should "even out" minor timing differences due to operating system busyness. At least that's the way that I did things with C++. Time slot granularity was different between Linux and Windows, so "minor" variations amounted to a couple of milliseconds on one system and 50 milliseconds on another. Different Linux kernels gave different granularity. None of this amounted to factors of 2 to 4 for successive loops in the same program.

Well...

See this paper: Interpreting Java Runtimes

Not the part about merge sort hashtable, etc. But the part that discusses how the Java Virtual Machine very helpfully isolates you from the hardware and makes precise timing a little problematic.

Anyhow...

The JVM loads classes as it needs them. (I'm talking about Java 1.6.0_22 here. Older and newer versions of the JVM might actually differ from my observations.)

So...

The first call to whatever routine you are using to sort loads a bunch of class stuff into the JVM. Successive loops with calls to the same class (and underlying classes) have less overhead, so they are faster.

Bottom line: Using times from loops like yours doesn't always tell you what you would like to know.

Cheers!

Z

The reality for modern VM's and computers is actually much more complex, but the same recommendation holds.

Luckily, there are many good resources for benchmarking Java:

How do I write a Correct Micro Benchmark in Java?, in particular the first answer highlights the key factors/steps any good micro-benchmarking code should follow.

The Caliper Project, a benchmarking tool for Java.

VisualVM - From what I've read it's basically the profiling tool separated from the Netbeans IDE

For basic benchmarking, here's the general approach i usually take:

Time the code you want to benchmark several times (I would recommend at least 5 or 10 times, more is better statistically). Optimally you'd want each trial run to be at least 100ms to get rid of any small discrepancies. Use the basic statistical tools such as min/max, standard deviation, mean (average), and median. Sometimes I discard the min/max trials before performing any other analysis such as mean and standard deviation. When benchmarking you want to make sure you have as many background programs closed as possible (not just minimized). The main reason for this is to help get consistent caching, which sometimes is responsible for significant timing differences. If you're standard deviation is rather large, you may want to consider adding more trials, or follow more of the recommendations for the first link I posted about writing a good micro-benchmark in Java.

edit:

Two last consideration I forgot to bring up is to run your code in release mode. This allows the compiler/runtime to make certain optimizations it won't in debug mode. Also make sure you're running all trials on the same computer ideally with the same software installed/running.

**copeg** · October 22nd, 2012, 01:36 PM

To add to the great advice already received, just a note regarding the code posted: for me it is unclear how the sort is operating on the array, but from what is posted I can make a guess you are passing the same array to the sort method - eg that the array becomes sorted on the first loop iteration and from then on each call to the sort is sorting an already sorted array: in other words you may be measuring the best case time for each of your algorithms (I'm presuming you wish to measure average time - I'll leave it up to you to see if doing so impacts your results dramatically).

Thread: Trying to repeat experiment 100 times but results vary for elapsed time

LinkBack

Thread Tools

Display

Trying to repeat experiment 100 times but results vary for elapsed time

Related threads:

Re: Trying to repeat experiment 100 times but results vary for elapsed time

Re: Trying to repeat experiment 100 times but results vary for elapsed time

The Following User Says Thank You to helloworld922 For This Useful Post:

Re: Trying to repeat experiment 100 times but results vary for elapsed time

The Following User Says Thank You to copeg For This Useful Post:

Similar Threads

Simple Repeat.

I'm vary new to java programming and confused about som concepts.

How to repeat a while loop?

How to display the results + repeat to the number entered by user?

Wanting to repeat my program...