Saturday, August 21, 2004

Lazy Friday night with the Wall Street Journal

I was not planning to write anything more today. But Chavez decided to make a cadena, so I gave up on TV and turned on my computer to check comments. Someone sent me a link to an article in the free section of the Wall Street Journal website. Supposedly that article debunks all the statistical theory advanced by the opposition this week. It might even be the one that Rodriguez of the CNE mentioned. So I read it.

Well, using that article to debunk the opposition theory is a tad daring. I really did not find anything new in an article that is more political than statistical, and I did not feel that it came down hard on the opposition. Mostly adequate comments, considering what a foreign journalist could understand of our crazy situation.

However, I did find this little gem that I want to share with all of you. I have cut and pasted it from what was on the web page tonight at around 9 PM.

Aviel Rubin, a computer-science professor at Johns Hopkins University, said he calculated odds of roughly one in 17 that two of three computers at a voting table would have identical results. That compares to about one in 15 that so far have shown similar results in Venezuela's referendum.

I have really no idea what that means. I mean, I think that there is some important information that the journalist forgot to write down. Because if I go strictly by this paragraph, we can make the following calculation:

Let's assume three machines for three lists of 100 electors each.
I remind the reader that an electoral center was made up of several lists of electors, each with its own machine. That is, you were told which machine to go to.
The probability of reaching a result of, say, 47 in one machine is 1 in 100.
The probability of reaching 47 in another machine is 1 in 100.
How could it be possible that the probability of reaching 47 in three machines at once is 1 in 17?
I would have guessed that the odds would have been 1 in 10,000, but what do I know? Would someone care to explain this to me? Does this "17" come from some law of probability that I do not know?
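For what it is worth, the arithmetic can be checked with a few lines of Python. This is only a sketch under an assumption that is mine and not the article's: each machine's count is equally likely to be any one of n_values numbers, independently of the other machines. The function names are made up for the illustration.

```python
from fractions import Fraction


def p_all_hit_given_value(n_values: int, machines: int) -> Fraction:
    """Chance that `machines` machines all land on one number fixed in advance
    (say 47), ASSUMING each count is uniform over n_values and independent."""
    return Fraction(1, n_values) ** machines


def p_at_least_two_of_three_match(n_values: int) -> Fraction:
    """Chance that at least two of three machines show the same number,
    whatever that number is, under the same uniform assumption."""
    # Complement: the second machine differs from the first, and the third
    # differs from both.
    all_different = Fraction((n_values - 1) * (n_values - 2), n_values ** 2)
    return 1 - all_different


if __name__ == "__main__":
    print(p_all_hit_given_value(100, 2))       # two given machines both at 47
    print(p_all_hit_given_value(100, 3))       # three machines all at 47
    print(p_at_least_two_of_three_match(100))  # any two of three agree
```

Under that crude assumption with 100 possible values, two given machines both landing on 47 is 1 in 10,000 and three is 1 in 1,000,000, while "at least two of three showing the same number, whatever it is" comes to 149 in 5,000, or roughly 1 in 34. Which of these questions the journalist had in mind is exactly what the paragraph does not say.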

I have written to David Luhnow with a copy of this post and if he has anything to add to this message I will be delighted to post it.

PS: note added in proof. While checking the comments section I reread my post and found a typo: it now reads as it should have, instead of the previous "...47 in two machines at once...".

Now I must also add that one could indeed lower these odds by arguing that the only numbers expected to come out are between 35 and 65, which would somewhat lower the 1 in 10,000 result, and this is probably what was behind the 1 in 17.

To which I reply that if we actually take an average of 500 electors for every machine, the expected numbers are between 200 and 300, which would increase the odds again!
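How much the range matters depends on how the counts are spread inside it. A minimal sketch, assuming (my assumption, not anything from the article or the CNE) that each of the 500 electors on a machine votes SI independently with the same probability, so that each machine's SI count is binomial. The 0.4 below is a hypothetical value picked only to have something to compute.

```python
from math import comb


def binomial_pmf(electors: int, p_si: float) -> list[float]:
    """Probability of each possible SI count (0..electors) on one machine."""
    return [
        comb(electors, k) * p_si**k * (1 - p_si) ** (electors - k)
        for k in range(electors + 1)
    ]


def p_at_least_two_of_three_match(electors: int, p_si: float) -> float:
    """Chance that at least two of three independent machines show the
    same SI count, by inclusion-exclusion over the three pairs."""
    pmf = binomial_pmf(electors, p_si)
    s2 = sum(q * q for q in pmf)      # two given machines agree
    s3 = sum(q * q * q for q in pmf)  # all three machines agree
    return 3 * s2 - 2 * s3


if __name__ == "__main__":
    p = p_at_least_two_of_three_match(electors=500, p_si=0.4)
    print(f"about 1 in {1 / p:.1f}")
```

The point of the sketch is only that the answer is driven by how tightly the counts cluster, not by the nominal width of the range. Real machines did not all have 500 electors or the same SI share, so the number it prints should not be read as the expected figure for the referendum.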

Regardless, that same SI pattern should also be found for the NO votes and so far it does not seem to be happening. It should also happen for the number of people that did not vote, and as far as I know nobody has checked into it.

In other words, if I were to find that 20% of the electoral centers have "identical SI" then I should find "identical NO" in, say, 18 to 22% of the centers, and also "identical did not vote" in the same range. But if I find 20% "identical SI" and only 10% "identical NO", then there is a problem. I do remind people that one has to compare center with center and not machine with machine. That is, 100 random machines will not give the same result as 10 random machines from each of 10 different random centers, which is also 100 machines. I mean, they should give the same result if there is no fraud, but if there is a fraud like the one I have explained earlier, a difference will be observed. Every type of rigging will show a specific type of result distribution. Nice Gaussian curves are only found when there is no tampering whatsoever.
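The control idea can be tried out on made-up data. The sketch below simulates centers with no tampering at all and counts, for SI, NO and abstention separately, the share of centers where at least two machines show the same number. Everything in it is hypothetical: three machines per center, 500 electors per machine, and the vote shares 0.3 / 0.4 / 0.3 are placeholders, not referendum data.

```python
import random


def identical_rates(n_centers=1000, machines=3, electors=500,
                    p_si=0.3, p_no=0.4, seed=1):
    """Share of simulated centers where at least two machines report the
    same count, computed separately for SI, NO and abstention."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = {"SI": 0, "NO": 0, "abstain": 0}
    for _ in range(n_centers):
        counts = {"SI": [], "NO": [], "abstain": []}
        for _ in range(machines):
            si = no = 0
            for _ in range(electors):
                u = rng.random()
                if u < p_si:
                    si += 1
                elif u < p_si + p_no:
                    no += 1
            counts["SI"].append(si)
            counts["NO"].append(no)
            counts["abstain"].append(electors - si - no)
        for key, values in counts.items():
            # A repeated value means at least two machines agree.
            if len(set(values)) < len(values):
                hits[key] += 1
    return {key: hits[key] / n_centers for key in hits}


if __name__ == "__main__":
    for key, rate in identical_rates().items():
        print(f"identical {key}: {rate:.1%} of centers")
```

With honest random voting the three rates should land close to one another, though not exactly equal, since each count has its own spread. A rigging scheme that touches only one of the three columns would be expected to push that column's rate away from the other two, which is the comparison described above.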

The wonderful thing about statistics is that it works the same way for each one of the parameters investigated, and thus we have in this electoral exercise a great experiment with built-in positive and negative controls! Any scientist's dream, but perhaps every cheat's nightmare.

Now we just have to wait until the real experts finish their work and can prove that statistically there is a problem. I hope that I have illustrated the problem in a simple way. However I can assure the reader that calculating these statistics is not something done in a few hours!!!! Be patient!
