Probability of the same pin digits

Problem: What is the probability that one specific person has the same PIN digits (assume a four-digit PIN) as the person next to them? Here, "the same digits" means the same set of digits, regardless of order and repetition.

Motivation: The “obvious” answer, 4!/10^4, is wrong.

As a proof of concept, the following Python program counts the distinct sets of digits that a four-digit PIN can have:

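# All four-digit PINs as zero-padded strings: '0000' ... '9999'
pins = ['{0:04}'.format(i) for i in range(0, 10000)]

# Collect the distinct sets of digits; order and repetition are ignored
pins_a = set()
for p in pins:
    pins_a.add(frozenset(p))

print(len(pins_a))  # prints 385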

The program prints 385, the number of distinct digit sets. Therefore, the probability is at least 1/385, or 0.26%. It is strictly larger because some sets are produced by more PINs than others. UPDATE follows.

Explanation:
If your PIN consists of only one digit, say 1111, then it is much less likely that the other person uses only this digit to form their PIN (a probability of only 1/10^4), while for, say, 1234 there are many matching PINs (4! = 24, to be exact).
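To make the contrast concrete, here is a minimal sketch that lists the distinct rearrangements of the two example PINs. The helper name `arrangements` is made up for this illustration. For these two particular PINs, the rearrangements are exactly the PINs that share the same digit set.

```python
from itertools import permutations

def arrangements(pin):
    """Distinct strings obtained by reordering the digits of `pin`."""
    return {''.join(p) for p in permutations(pin)}

print(len(arrangements('1111')))  # 1
print(len(arrangements('1234')))  # 24
```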

From now on, we assume that each person picks a PIN according to the uniform distribution.
Let’s count (explanation follows).
The number of PINs with exactly 1, 2, 3 and 4 distinct digit(s):

  1. 10
  2. {{10}\choose{2}} \cdot \left( \dfrac{4!}{2!2!} + \dfrac{4!}{3!1!} \cdot 2 \right) = 630
  3. {{10}\choose{3}} \cdot 3 \cdot \dfrac{4!}{1!1!2!} = 4320
  4. 10 \cdot 9 \cdot 8 \cdot 7 = 5040

Sanity check: 10 + 630 + 4320 + 5040 = 10000, the total number of PINs.
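The four counts can also be evaluated mechanically. The sketch below assumes Python 3.8 or later for `math.comb`. The variable names are chosen for this illustration only. For two distinct digits it adds the orderings of the aabb pattern to those of the abbb and aaab patterns, and for four distinct digits it uses the ordered selection 10 · 9 · 8 · 7.

```python
from math import comb, factorial

f = factorial

# PINs with exactly 1, 2, 3 and 4 distinct digits
one_digit = comb(10, 1)
two_digits = comb(10, 2) * (f(4) // (f(2) * f(2)) + 2 * (f(4) // (f(3) * f(1))))
three_digits = comb(10, 3) * 3 * (f(4) // (f(1) * f(1) * f(2)))
four_digits = 10 * 9 * 8 * 7

print(one_digit, two_digits, three_digits, four_digits)    # 10 630 4320 5040
print(one_digit + two_digits + three_digits + four_digits)  # 10000
```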

Let’s explain how these numbers are produced:

  1. With only one digit, you can form only 10 different PINs.
  2. There are {{10}\choose{2}} ways to choose two digits out of ten (without ordering). Now, we replicate them to fill four positions. In total we can have either {a, a, b, b} or {a, b, b, b} or {a, a, a, b}: three different ways to replicate them. Take the first way, {a, a, b, b}. There are \dfrac{4!}{2!2!} = 6 possible orderings. For the other two (they are similar), we have \dfrac{4!}{3!1!} \cdot 2 = 8. We count the orderings as 4! for the four positions, divided by the factorial of the size of each group of identical digits, due to symmetries.
  3. If you understood the previous one, this one is a piece of cake. Again, there are {{10}\choose{3}} ways to choose three different digits. Exactly one of them appears twice, and there are three ways to pick which one. For each choice, the number of possible orderings is \dfrac{4!}{1!1!2!} = 12.
  4. We pick four different digits and we care about their ordering: 10 choices for the first position, 9 for the second, 8 for the third and 7 for the fourth.
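The replication patterns used in this explanation can be checked by brute force. The following sketch groups all PINs by their multiplicity pattern, that is, the sorted digit frequencies, so that aabb becomes (2, 2) and abbb or aaab becomes (3, 1). The names `pattern_counts` and `pattern` are assumptions of this illustration.

```python
from collections import Counter

pattern_counts = Counter()
for i in range(10000):
    pin = '{0:04}'.format(i)
    # Sorted digit frequencies, e.g. '1121' -> (3, 1), '1212' -> (2, 2)
    pattern = tuple(sorted(Counter(pin).values(), reverse=True))
    pattern_counts[pattern] += 1

for pattern, count in sorted(pattern_counts.items()):
    print(pattern, count)
```

The (2, 2) and (3, 1) patterns give 45 · 6 = 270 and 45 · 8 = 360 PINs, which add up to the 630 PINs with two distinct digits.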

In order to compute the probability, we need to know how many PINs produce one specific set of digits. We have already counted the PINs whose digit set has size 1, 2, 3 or 4, so we omit the arguments here, since they are similar to what we have already seen.

  1. A specific set with one digit is produced by 1 PIN.
  2. A specific set with two digits is produced by 14 PINs.
  3. A specific set with three digits is produced by 36 PINs.
  4. A specific set with four digits is produced by 24 PINs.
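These per-set counts can be confirmed by tallying how many PINs map to each digit set. This is a minimal sketch. The example sets {1}, {1, 2}, {1, 2, 3} and {1, 2, 3, 4} are arbitrary choices, and any other set of the same size gives the same count.

```python
from collections import Counter

# Number of PINs that produce each distinct digit set
set_counts = Counter(frozenset('{0:04}'.format(i)) for i in range(10000))

for digits in ('1', '12', '123', '1234'):
    print(sorted(digits), set_counts[frozenset(digits)])
```

The four lines end in 1, 14, 36 and 24 respectively.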

Therefore, the total probability is:
\dfrac{1}{10^8}(10 \cdot 1 + 630 \cdot 14 + 4320 \cdot 36 + 5040 \cdot 24) = \dfrac{285310}{10^8} = 0.00285310 \approx 0.29\%

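The sum counts matching ordered pairs of PINs: for example, each of the 630 PINs with two distinct digits matches 14 PINs (itself included). The factor 10^8 appears because there are 10^8 ordered pairs of PINs in total.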
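The same number can be obtained exactly, without simulation, by summing the squared size of every digit-set class. The sketch below uses `fractions.Fraction` to avoid rounding. The names `set_counts` and `matching_pairs` are assumptions of this illustration.

```python
from collections import Counter
from fractions import Fraction

# Number of PINs that produce each distinct digit set
set_counts = Counter(frozenset('{0:04}'.format(i)) for i in range(10000))

# A set produced by c PINs contributes c * c matching ordered pairs
matching_pairs = sum(c * c for c in set_counts.values())
probability = Fraction(matching_pairs, 10**8)

print(matching_pairs)      # 285310
print(float(probability))  # 0.0028531

# Consistent with the earlier lower bound of 1/385
print(probability >= Fraction(1, len(set_counts)))  # True
```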

If this analysis does not convince you, here are two Python programs: the first counts the PINs by the number of distinct digits they contain, while the second simulates the random process.

pins = ['{0:04}'.format(i) for i in range(0, 10000)]

counter1, counter2, counter3, counter4 = 0, 0, 0, 0
for p in pins:
    p = set(p)
    if len(p) == 1:
        counter1 += 1
    elif len(p) == 2:
        counter2 += 1
    elif len(p) == 3:
        counter3 += 1
    elif len(p) == 4:
        counter4 += 1

print(counter1, counter2, counter3, counter4)

from random import randint

def rand_pin_digits():
    # randint includes both endpoints, so the upper bound must be 9999
    a = '{0:04}'.format(randint(0, 9999))
    a = set(a)
    return a

match = 0
for i in range(10**6):
    a = rand_pin_digits()
    b = rand_pin_digits()

    if a == b:
        match += 1
        
# The observed fraction should be close to 0.00285
print(match, match / 10**6)
3 Comments

  1. Errikos says:

    Assuming equal probability for all pins. With humans, nothing is random. Everything has a psychological aspect… It is called human nature!
    Check this:
    http://www.cl.cam.ac.uk/~jcb82/doc/BPA12-FC-banking_pin_security.pdf

    😉

    • dimle says:

      Thanks Erriko! A very fitting observation! You are absolutely right. The notion of randomness is usually counter-intuitive for us, so we tend to think that sequences like 11111111 are more likely than 01001100.

      I hope the fallacy is obvious.

  2. Errikos says:

    The diagram on page 6 is extremely interesting (worth every penny…)
    😉
