Benchmarking Firefox 3.0 Beta 4: Did the Test Follow Good or Best Practice?

by Urs E. Gattiker on 2008/03/16


Everybody is awaiting Firefox 3.0, and many are in the midst of testing the Beta 4 version.
Recently we came across another test, and we began to ask whether the journalist had followed good practice, best practice, a mix of the two, or none at all.
We address this in more detail below and outline why most tests comparing the Firefox beta versions may be of little use to users like you and me.

Not so long ago we pointed out that in some European countries 40% or more of users surf with the Firefox browser. Naturally, we were intrigued when we came across another browser test here:

Firefox 3.0 Beta 4 – Benchmarked by Adrian Kingsley-Hughes

Adrian Kingsley-Hughes used two tests, namely:

the Acid3 standards compliance test, and
the SunSpider JavaScript benchmark (results for Firefox 2.0.0.12)

Acid3 is what the industry describes as a good-practice kind of test: it checks how faithfully a browser implements web standards such as DOM scripting, CSS selectors and ECMAScript, reporting a score out of 100 alongside a reference rendering. This is especially important to web designers and developers, of course. The SunSpider test, in turn, simply measures how fast the browser can run JavaScript, a distant cousin of Java (see FAQ – Java versus JavaScript – the security basics for non-geeks).

Therefore, we tried to repeat some of the tests Adrian Kingsley-Hughes did.

Running the SunSpider JavaScript benchmark on your browser (just click the link, press start, and test your browser right now – it is that simple)
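The SunSpider harness itself copes with run-to-run noise by timing each test several times and reporting the total with an error margin at 95% confidence. As a rough sketch of that idea, here is what such repeated timing looks like in Python; the workload function and the run count of 10 are our own illustrative assumptions, not part of SunSpider.

```python
import statistics
import time

def time_workload(workload, runs=10):
    """Time a workload repeatedly and return the mean in milliseconds
    plus a rough 95% confidence half-interval, SunSpider-style."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append((time.perf_counter() - start) * 1000.0)
    mean = statistics.mean(samples)
    # Normal approximation; for ~10 runs a t critical value (~2.26)
    # would give a slightly wider interval than 1.96.
    half_width = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
    return mean, half_width

# Hypothetical stand-in for a real JavaScript benchmark kernel.
def workload():
    sum(i * i for i in range(200_000))

mean, ci = time_workload(workload)
print(f"total: {mean:.1f} ms +/- {ci:.1f} ms (95% confidence)")
```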

We compared Firefox 2.0.0.12 with K-Meleon 1.1 using the SunSpider JavaScript benchmark on 2008-03-12 and found that the differences were not statistically significant. No surprise. We also ran the benchmark with the Opera browser and with Internet Explorer 6 and 7; the majority of those differences were not statistically significant for us either.
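How would one check such a claim of (no) statistical significance? A minimal sketch, assuming you have recorded the total SunSpider time of several runs per browser, is Welch's two-sample t-test; the timings below are hypothetical placeholders, not our measured results.

```python
from scipy import stats

# Hypothetical SunSpider totals in ms (five runs per browser);
# replace these placeholders with your own measurements.
firefox_ms = [8654, 8702, 8588, 8671, 8630]
kmeleon_ms = [8611, 8745, 8590, 8699, 8655]

# Welch's t-test asks whether the observed difference in means is
# larger than run-to-run noise would produce by chance; it does not
# assume the two browsers have equal variance.
t_stat, p_value = stats.ttest_ind(firefox_ms, kmeleon_ms, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
print("significant" if p_value < 0.05 else "not significant - may be chance")
```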

Accordingly, what can we learn from our tests in comparison to those made by Adrian Kingsley-Hughes? Put simply, we see two main difficulties with this particular benchmarking exercise:

1) Unless the differences shown between Internet Explorer, Opera, Firefox and other browsers are statistically significant – why bother? (Please click on the link, log in as guest, then click this link again and voilà – free access.)

2) The benchmark indicators you choose matter. Testing Firefox without any add-ons (e.g., various security features) or plug-ins may not be realistic, because these are the features that make Firefox so useful for home users. We pointed this out in a comment to Adrian Kingsley-Hughes here: RE: Firefox 3.0 Beta 4 – Benchmarked

Bottom Line

Adrian ran a laboratory-style test in a controlled environment (a basic browser installation with zero add-ons). That is certainly important as a start. Nonetheless, what matters to a business is how these things work in the real world. What happens once users add the features they need, desire or want to the browser? How fast will it be then? Adrian's test does not address this.

As well, unless your benchmark can show that results are statistically different from those of competitor/browser B, any gap may be due to chance. In that case it is dangerous to place any bets on the results, because nobody may be able to repeat them either.

Also of interest from around the Web:
Who is Responsible for Your Benchmarks?
Browser usage varies enormously – ignore Firefox at your peril
Good practice or best practice – what shall it be?

Remember, benchmarking is a very important exercise, but it really depends on what measures or ratios you use. If these are too abstract or theoretical, such as testing Firefox without any plug-ins or add-ons, your results may not be very meaningful. Hence, why benchmark this way?

Benchmarking done carefully helps clarify things, especially what needs to be improved. Done sloppily, it confuses matters and can result in decisions based on the wrong information. Nobody wants this.

Best practice is difficult to achieve. Nevertheless, applying tools improperly is surely the wrong first step, as this comparison of Firefox and other browsers illustrates.

  • http://mhenriday.googlepages.com/ M Henri Day

It is indeed the case that a comparison of FF 2.0.0.12 and FF 3.0 beta 4 without add-ons does not correspond to the practice of most Firefox users – the add-ons, as pointed out in the article, constitute a large part of what makes Firefox so useful.

However, in this particular case, Mr Kingsley-Hughes's choice to compare the two versions without any add-ons is not entirely devoid of sense, as many of the most useful add-ons, such as the Google Toolbar and the del.icio.us bookmarks, are not (yet) available for FF 3 b 4.

    My suggestion to those who would attempt to benchmark 3 b 4 – or 3 b 5 when it comes – against 2.0.0.12 would be to run a couple of the more important add-ons that are available, such as Giorgio Maone’s NoScript or Wladimir Palant’s Adblock Plus, thus providing the measurements with a somewhat more realistic background….

    Henri

  • http://ReguStand.CyTRAP.eu Urs E. Gattiker

    Henri

Thanks for the comment. I agree with you wholeheartedly and should have made this point clearer myself. In particular, Mr Kingsley-Hughes starts his post by pointing out the many security improvements Firefox 3.0 Beta 4 brings for users.

Naturally, this is great news. Unfortunately, he then does not run any tests that address security issues in any way.

You are, of course, spot on in suggesting the NoScript or Adblock Plus add-ons for the test. Testing Firefox 3.0 Beta 4 with such add-ons installed would give us a better idea of whether Firefox still leaves the rest of the pack behind in the dust.

If we were to find differences in such a test, we would then have to make sure that they are statistically significant. Otherwise, why bother if they are merely due to chance (= not statistically significant)?

    Until then, we still await a test that benchmarks a realistic Firefox configuration including the best security add-ons against the rest of the browser pack.

That having been said, Henri, thanks for sharing.

    Urs

