I am trying to use a hash set to do a Monte Carlo analysis of the Birthday Paradox.
Here is my pseudo code:
Set number of collisions to zero
Loop (10 to100 by 10)
Loop:
Generate a birthday. (use 1-365)
See if it is already in the set. If it is count as a collision and end the loop.
Add to the set.
Print out the number of collisions as a probability P.
The code is compiling and running fine, but I am getting completely wrong results:
After 100000 tests there were 1 occurrences of shared birthdays in a set of 10 people.
Probability : 1.0E-5
After 100000 tests there were 2 occurrences of shared birthdays in a set of 20 people.
Probability : 2.0E-5
After 100000 tests there were 3 occurrences of shared birthdays in a set of 30 people.
Probability : 3.0E-5
After 100000 tests there were 4 occurrences of shared birthdays in a set of 40 people.
Probability : 4.0E-5
After 100000 tests there were 5 occurrences of shared birthdays in a set of 50 people.
Probability : 5.0E-5
After 100000 tests there were 6 occurrences of shared birthdays in a set of 60 people.
Probability : 6.0E-5
After 100000 tests there were 7 occurrences of shared birthdays in a set of 70 people.
Probability : 7.0E-5
After 100000 tests there were 8 occurrences of shared birthdays in a set of 80 people.
Probability : 8.0E-5
After 100000 tests there were 9 occurrences of shared birthdays in a set of 90 people.
Probability : 9.0E-5
After 100000 tests there were 10 occurrences of shared birthdays in a set of 100 people.
Probability : 1.0E-4
When I should be getting:
After 100000 tests there were 11446 occurrences of shared birthdays in a set of 10 people.
Probability: 0.11446
After 100000 tests there were 41031 occurrences of shared birthdays in a set of 20 people.
Probability: 0.41031
After 100000 tests there were 70455 occurrences of shared birthdays in a set of 30 people.
Probability: 0.70455
After 100000 tests there were 89061 occurrences of shared birthdays in a set of 40 people.
Probability: 0.89061
After 100000 tests there were 97043 occurrences of shared birthdays in a set of 50 people.
Probability: 0.97043
After 100000 tests there were 99454 occurrences of shared birthdays in a set of 60 people.
Probability: 0.99454
After 100000 tests there were 99926 occurrences of shared birthdays in a set of 70 people.
Probability: 0.99926
After 100000 tests there were 99986 occurrences of shared birthdays in a set of 80 people.
Probability: 0.99986
After 100000 tests there were 99999 occurrences of shared birthdays in a set of 90 people.
Probability: 0.99999
After 100000 tests there were 100000 occurrences of shared birthdays in a set of 100 people.
Probability: 1.0
I am getting very frustrated and can't seem to figure out what I am doing wrong. I believe it has something to do with my loops. Any help would be greatly appreciated!!
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
public class BirthdayParadox
{
public static void main(String [] args)
{
Random random = new Random();
int collisionCount = 0;
for(int people = 10; people <= 100; people += 10)
{
Set<Integer> birthdays = new HashSet<>(365);
birthdays.add(random.nextInt(365));
for(int runs = 0; runs < 100000; runs++)
{
int randomBirthday = random.nextInt(365);
if(birthdays.contains(randomBirthday))
{
collisionCount++;
break;
}
birthdays.add(randomBirthday);
}
float probability = (float)collisionCount / 100000;
System.out.println("After 100000 tests there were " +
collisionCount + " occurrences of shared " +
" birthdays in a set of " + people + " people.");
System.out.println("Probability : " + probability);
}
}
}