Problem: What is the probability that one specific person has the same PIN digits as the person next to him (assume the PIN has four digits)? "Same digits" means the same set of digits, regardless of order and repetition.

Motivation: The “obvious” answer, 4!/10^4, is wrong.

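As a proof of concept, the following Python program counts the distinct digit sets that four-digit PINs can produce:

    pins = ['{0:04}'.format(i) for i in range(0, 10000)]
    pins_a = set()
    for p in pins:
        # a frozenset ignores both the order and the repetition of digits
        pins_a.add(frozenset(p))
    print(len(pins_a))  # prints 385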

There are 385 distinct digit sets. Therefore, the probability is at least 1/385, or about 0.26%. It is strictly larger than that, because some sets are produced by more PINs than others. **UPDATE follows.**

Explanation:

If your PIN uses only one digit, say 1111, it is much less likely that the other person's PIN uses only that digit (probability 1/10^4), while for, say, 1234 there are many matching PINs (4!, to be exact).
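To make the difference concrete, here is a minimal brute-force sketch. The function names `same_digit_set_count` and `match_probability` are made up for this illustration, and the neighbour's PIN is assumed to be uniformly distributed.

```python
from fractions import Fraction


def same_digit_set_count(pin: str) -> int:
    """Count the four-digit PINs whose set of digits equals that of `pin`."""
    target = set(pin)
    return sum(1 for i in range(10**4) if set(f"{i:04}") == target)


def match_probability(pin: str) -> Fraction:
    """Probability that a uniformly random PIN has the same digit set as `pin`."""
    return Fraction(same_digit_set_count(pin), 10**4)


print(match_probability("1111"))  # 1/10000
print(match_probability("1234"))  # 3/1250, i.e. 24/10000
```

So the answer for one specific person depends heavily on how many distinct digits that person's PIN has.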

From now on, we assume that we pick a PIN according to the uniform distribution.

Let’s count (explanation follows):

PINs that use exactly 1, 2, 3 and 4 distinct digit(s):

- 10
- 630
- 4320
- 5040

Sanity check: Total number is 10000.

Let’s explain how these numbers are produced:

- With only one digit, you can form only 10 different PINs.
- There are $\binom{10}{2} = 45$ ways to choose two digits out of ten (without ordering). Now we replicate them: in total we can have {a, a, b, b}, {a, b, b, b} or {a, a, a, b}, three different ways. Take the first way, {a, a, b, b}: there are $\frac{4!}{2!\,2!} = 6$ possible orderings. For each of the other two (they are similar), we have $\frac{4!}{3!\,1!} = 4$. We count the orderings as 4! for the four positions, divided by the factorial of the number of identical elements in every group, due to symmetries. This gives $45 \cdot (6 + 4 + 4) = 630$ PINs.
- If you understood the previous one, this one is a piece of cake. Again, there are $\binom{10}{3} = 120$ ways to choose three different digits. Now we have three ways of choosing which one of them is replicated. Therefore, the possible orderings are $3 \cdot \frac{4!}{2!} = 36$, which gives $120 \cdot 36 = 4320$ PINs.
- We pick four different digits in $\binom{10}{4} = 210$ ways, and we care about their ordering: $210 \cdot 4! = 5040$ PINs.
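The counting argument above can be checked with a few lines of Python. This is only a sketch: the helper `arrangements` and the dictionaries `per_set` and `totals` are names chosen for this example, and `math.comb` needs Python 3.8 or newer.

```python
from math import comb, factorial


def arrangements(*multiplicities):
    """Orderings of a multiset: n! divided by the factorial of each group size."""
    n = factorial(sum(multiplicities))
    for m in multiplicities:
        n //= factorial(m)
    return n


# Number of PINs that produce one fixed digit set of the given size.
per_set = {
    1: 1,                                              # {a, a, a, a}
    2: arrangements(2, 2) + 2 * arrangements(3, 1),    # aabb, abbb, aaab
    3: 3 * arrangements(2, 1, 1),                      # which digit is doubled
    4: arrangements(1, 1, 1, 1),                       # all digits distinct
}

# Multiply by the number of ways to choose the digits themselves.
totals = {k: comb(10, k) * per_set[k] for k in per_set}

print(per_set)               # {1: 1, 2: 14, 3: 36, 4: 24}
print(totals)                # {1: 10, 2: 630, 3: 4320, 4: 5040}
print(sum(totals.values()))  # 10000
```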

In order to compute the probability, we need to know how many PINs produce a specific set of digits. We have already argued how often a set of size 1, 2, 3 or 4 can occur, so we omit the arguments here; they are similar to what we have already seen.

- A specific set with one digit can be derived from 1 PIN.
- A specific set with two digits can be derived from 14 PINs.
- A specific set with three digits can be derived from 36 PINs.
- A specific set with four digits can be derived from 24 PINs.
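These multiplicities can also be confirmed by brute force. The sketch below groups all 10^4 PINs by their digit set and records, for each set size, which counts occur. `set_counts` and `by_size` are illustrative names.

```python
from collections import Counter

# How many PINs map to each digit set.
set_counts = Counter(frozenset(f"{i:04}") for i in range(10**4))

# For each set size, collect the distinct multiplicities that occur.
by_size = {}
for digits, count in set_counts.items():
    by_size.setdefault(len(digits), set()).add(count)

print(sorted(by_size.items()))
# [(1, {1}), (2, {14}), (3, {36}), (4, {24})]
```

Each size maps to a single value, so every set of a given size is produced by the same number of PINs.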

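Therefore, the total probability is:

$$\frac{10 \cdot 1^2 + 45 \cdot 14^2 + 120 \cdot 36^2 + 210 \cdot 24^2}{10^8} = \frac{285310}{10^8} \approx 0.285\%$$

The denominator 10^8 appears because we now count matching pairs of PINs, and there are 10^8 pairs in total. A set that is produced by k PINs contributes k^2 matching pairs.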
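As a cross-check of the total, the matching pairs can be counted exactly by summing the squared multiplicity of every digit set. This is a sketch under the uniform-PIN assumption, and the variable names are chosen for this example.

```python
from collections import Counter
from fractions import Fraction

# How many PINs map to each digit set.
set_counts = Counter(frozenset(f"{i:04}") for i in range(10**4))

# A set produced by k PINs accounts for k * k ordered matching pairs.
matching_pairs = sum(k * k for k in set_counts.values())
probability = Fraction(matching_pairs, 10**8)

print(matching_pairs)                 # 285310
print(f"{float(probability):.4%}")    # 0.2853%
```

The result is slightly above the 1/385 lower bound, as expected.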

If this analysis does not convince you, here are two Python programs: the first counts the PINs by the number of distinct digits they use, while the second simulates the random process.

    pins = ['{0:04}'.format(i) for i in range(0, 10000)]
    counter1, counter2, counter3, counter4 = 0, 0, 0, 0
    for p in pins:
        p = set(p)
        if len(p) == 1:
            counter1 += 1
        elif len(p) == 2:
            counter2 += 1
        elif len(p) == 3:
            counter3 += 1
        elif len(p) == 4:
            counter4 += 1
    print(counter1, counter2, counter3, counter4)

    from random import randint

    def rand_pin_digits():
        # randint includes both endpoints, so the upper bound must be 9999
        a = '{0:04}'.format(randint(0, 9999))
        a = set(a)
        return a

    match = 0
    for i in range(10**6):
        a = rand_pin_digits()
        b = rand_pin_digits()
        if a == b:
            match += 1
    print(match)

This assumes equal probability for all PINs. With humans, nothing is random. Everything has a psychological aspect… It is called human nature!

Check this:

http://www.cl.cam.ac.uk/~jcb82/doc/BPA12-FC-banking_pin_security.pdf

😉

Thanks Erriko! A very nice and fitting observation! You are absolutely right. The notion of randomness is usually counter-intuitive for us, so we tend to think that sequences like 11111111 are more likely than 01001100.

I hope the fallacy is obvious.

The diagram on page 6 is extremely interesting (worth every penny…)

😉