A psychological experiment

I had participated in an experiment. We were told that we are matched to pairs. Only one person in each pair would make a decision. According to this decision, the profits were split. If the total gain is X, then it relies completely on player A how this gain is divided. The other player remains inactive.

I was to make the decision. I thought I should play fair instead of trying to maximize my profit. The first round had passed when I realised how the game was set up.

We were required to fill in our “randomly” chosen role. The game consisted of multiple rounds with variations in each round, but there was only one position to fill in our role. That means that in every round, we had the same role. So, if it was true that our decisions affected others, then half of the people would be inactive throughout the experiment. It is a waste, don’t you think? So, we can assume that everyone had role A and there was no person who was affected by our decision (strictly speaking, this is not the only possible conclusion (finding out about fake rules does not determine the correct ones) but for all practical purposes, it is good enough). They tricked us to be sure that we make our choices while we know we might hurt someone else.

After the first round, I had still to take some decisions. I thought about it. Should I, now, act selfishly, since I can assume that no other person is hurt by my decision? Well, then, I would provide misleading evidence to the researchers who were conducting the experiment, in contrast to my beliefs. Should I, then, act like I didn’t see what they were doing? I proceeded it in this way, but I am now convinced that was not my best strategy.

They also provided some space where we could explain our reasoning. I still don’t know whether I should try to maximize my profit. Nevertheless, I should have pointed out that I find out about the set up of the experiment and then make them aware of my conscious choice, either by accepting a lower payoff and thus, making a bold statement, or by taking the extra money and then explaining that I didn’t think that it was necessary to buy the credibility of my “vote”.

Moral of the story: Full disclosure. How did I forget about it?

Probability of the same pin digits

Problem: What is the probability of one specific person to have the same pin digits (assume the pin is four digits) with the person next to him?

Motivation: The “obvious” approach, that is 4!/10^4 is wrong.

As a proof of concept, the following Python program:

pins = ['{0:04}'.format(i) for i in range(0, 10000)]

pins_a = set()
for p in pins:
    pins_a.add(frozenset(p))

print(len(pins_a))

Therefore, the probability is at least 1/385 or 0.26% because some tuples are produced more often than some other. UPDATE follows.

Explanation:
If your PIN consists of only one number, say 1111, then it is much harder that the other person only uses this digit to form his PIN (only 1/10^4) while for, say 1234, there are many combinations (4! to be exact).

From now on, we assume that we pick a PIN according to the uniform distribution.
Let’s count (explanation follows):
PINs with only 1, 2, 3 and 4 digit(s):

  1. 10
  2. {{10}\choose{2}} \cdot \left( \dfrac{4!}{2!2!} \cdot \dfrac{4!}{3!1!} \cdot 2 \right) = 630
  3. {{10}\choose{3}} \cdot 3 \cdot \dfrac{4!}{1!1!2!} = 4320
  4. 10! \cdot 9! \cdot 8! \cdot 7! = 5040

Sanity check: Total number is 10000.

Let’s explain how these numbers are produced:

  1. With only one digit, you can only form 10 different numbers.
  2. There are {{10}\choose{2}} to choose two number out of ten (without ordering). Now, we replicate them. So, in total we can either have {a, a, b, b} or {a, b, b, b} or {a, a, a, b}. Three different ways to replicate them. Let’s take the first way {a, a, b, b}. There are \dfrac{4!}{2!2!} possible orderings. For the other two (they are similar), we have \dfrac{4!}{3!1!} \cdot 2. The way we count the orderings is 4! for the four possible position divided by the number of the same elements in every group, due to symmetries.
  3. If you understood the previous one, this one is a piece of cake. Again, {{10}\choose{3}} to choose three different number. Now, we have three ways of replicating one of them. Therefore, the possible orderings are \dfrac{4!}{1!1!2!}
  4. We pick four different numbers and we care about their ordering.

In order to compute the probability, we need to know how often a specific set of number can occur. We have already argued how often a set with size 1, 2, 3 or 4 can occur. We omit the arguments since they are similar to what we have already seen.

  1. A specific set with one number can be derived by 1 PIN.
  2. A specific set with two numbers can be derived by 14 PINs.
  3. A specific set with three numbers can be derived by 36 PINs.
  4. A specific set with three numbers can be derived by 24 PINs.

Therefore, the total probability is:
\dfrac{1}{10^8}(10 \cdot 1 + 630 \cdot 14 + 4320 \cdot 36 + 5040 \cdot 24) = 0.00285310 = 0.29\%

Where the factor 10^8 is because now we count matching pairs and there are in total 10^8 pairs.

If this analysis does not convince you, then here are some Python programs, the first one counts the sets with a specific size while the second one emulates the random process.

pins = ['{0:04}'.format(i) for i in range(0, 10000)]

counter1, counter2, counter3, counter4 = 0, 0, 0, 0
for p in pins:
    p = set(p)
    if len(p) == 1:
        counter1 += 1
    elif len(p) == 2:
        counter2 += 1
    elif len(p) == 3:
        counter3 += 1
    elif len(p) == 4:
        counter4 += 1

print(counter1, counter2, counter3, counter4)

from random import randint

def rand_pin_digits():
    a = '{0:04}'.format(randint(0, 10000))
    a = set(a)
    return a

match = 0
for i in range(10**6):
    a = rand_pin_digits()
    b = rand_pin_digits()

    if a == b:
        match += 1
        
print(match)

Filenames may be treated as flags in shell

Be careful when running commands like rm *. A filename may be interpreted as a switch. That happens because the shell tries to interpret the filename before passing it to the program.

For instance, in the following example, we see that rm * is executed differently, depending on the names of the files of the current directory. The (double dash) signifies the end of the switches.

$ mkdir a
$ mkdir b
$ touch a.txt
$ rm *
rm: cannot remove `a’: Is a directory
rm: cannot remove `b’: Is a directory
$ ls
b/ a/
$ touch a.txt
$ touch — -rf
$ ls
b/ a/ -rf a.txt
$ rm *
$ ls
-rf

Incompleteness Theorem

We will prove Gödel’s first incompleteness theorem by using some terms of Computational Theory and the fact that the halting problem is uncomputable.

First of all, some terminology.

We assume that every mathematical statement s can be represented as a bitstring. (Just choose an encoding — for instance ASCII and then convert it to bit representation). Therefore, every mathematical statement s belongs to the set S \subset \{0, 1\}*.

Moreover, we are going to need a definition for what a program is. You can choose your favourite one, as long as we agree that it is a finite sequence of statements, where the statements are simple enough depending on you computational model (you don’t cheat here).

Now, let’s suppose that for each program A and bitstring x (its input), we have the statement “A(x) halts” (A(x) is going to finish its execution at some point). Let’s pick a convenient notation for this statement h_{A, x}. We require h_{A, x} \in S and \neg h_{A, x} \in S (they are mathematical statements). We also assume that every proof p can be represented as bitstring.

We have included explicitly the statements h_{A, x}, \neg h_{A, x} in S because they are going to help us in the proof by contradiction. We will create a machine that decides the halting problem if Gödel first incompleteness Theorem is not true.

Now, we have to agree on some function (our set of axioms) that determines whether a given proof is correct (proves a mathematical statement). We won’t put any limitation on this function. You can pick whatever you like. What we will see is that, no matter what you pick, there are some statements that can not be proven as long as we are consistent. More formally, let V be the function of your choice such that \{0, 1\}* \times \{0, 1\}* \rightarrow \{0, 1\}.

Now, some more assumptions which seem natural. First of all, we assume that all the proofs can be verified mechanically. That means, there is a Turing program V(< s, p >) which decides whether the proof is correct or not. This implies that the program V always halts.

One last thing before the actual proof. We say that statement s can be V-proven in case V(< s, p >) = 1 for some p \in \{0, 1\}*. Setting aside the notation, what we said is that s can be proven if there is a proof for that. Wow! Now, just to be clear, we say that V is inconsistent if we can prove both p and \neg p. Otherwise, it is consistent.

Like the annoying friends who, when you are going to the cinema, spoil to you how this ends, here comes the Gödel first incompleteness Theorem. It states, in summary, what the following proof is going to demonstrate.

Gödel first incompleteness Theorem

Let V be a consistent verification function. Suppose that for any Turing program A and any input x such that A(x) halts, h_{A, x} can be V-proven. Then, there exists a Turing program A and an input x such that A(x) runs forever, and yet \neg h_{A, x} cannot be V-proven.

Proof

We recall that the halting problem can not be decided. Let’s assume that, in contrast to the theorem, if a program A never halts then \neg h_{A, x} can be proven. Then, we will see that we can give a program H which decides the halting problem, which can not be true.

Our program H takes as input the description of program A and its input x encoded as tuple. Then, H simulates A(x). In parallel, it also simulates V(< h_{A,x}, p >), where it enumerates over all possible proofs p for \neg h_{A, x}. Let’s recall at this point that \neg h_{A, x} is the statement that A(x) does not halt.

Let’s see what can happen. If the simulation of A(x) ends at some point, H outputs 1. Otherwise, it will find eventually a proof for \neg h_{A, x} and it will output 0 (remember our assumption at the very beginning of this proof that if A(x) never halts, we will find a proof for that.).

Maybe, we already see intuitively that program H is able to decide whether A halts or not which can not be true. Let’s see that in more detail.

  • A(x) halts. Then h_{A, x} can be V-proven as we already saw. Since V is consistent, we can not also V-prove h_{A, x}. Therefore, H(<τ(Α), x>) outputs 1.
  • A(x) does not halt. Then, we will find eventually the proof p for \neg h_{A, x}. Therefore, H(<τ(Α), x>) outputs 0.

All in all, program H is able to decide whether A halts or not in the particular input x. Nevertheless, we know that the halting problem is not decidable. Therefore, there can not be such a program H which concludes our proof.

I was able to write this proof due to the course of Complexity Theory at ETH and the excellent notes that they provide us.

For an alternative, descriptive proof, you can check wikipedia.

A program that prints itself

We use a string that contains all the following code of the program. Then, we print the preamble and twice this string. We employ some tricks to easily print special characters like quotes.

code = """print('code = ""', end='"')
print(code + '"' + '""')
print(code, end='')
"""
print('code = ""', end='"')
print(code + '"' + '""')
print(code, end='')

Your base vector

English translation:

You can have two base vectors and explore 100 combinations in 2D, or you can try more and have three base vectors and maybe explore 50 combinations in 3D. At the second case, I think you are better off. Even better, if you go higher. What is the point of solving 200 exercises on addition and never learning multiplication? How will you know that you did well at the end?

Greek (original):

Μπορείς να έχεις δυο διανύσματα βάσης και να εξερευνήσεις 100 συνδυασμούς σε 2D, ή μπορείς με περισσότερο κόπο να έχεις τρία διανύσματα βάσης και να εξερευνήσεις ίσως 50 συνδυασμούς σε 3D. Στην δεύτερη περίπτωση νομίζω είσαι πιο κερδισμένος. Ακόμα καλύτερα αν το πας και σε παραπάνω. Τι να το κάνεις να λύσεις 200 ασκήσεις πρόσθεσης και να μη μάθεις ποτέ τον πολλαπλασιασμό; Πώς θα ξέρεις στο τέλος ότι έπραξες σωστά;

Thank you Andreas for reminding me.

How’s life?

The new semester at ETH started today. The exams finished for me on Saturday 4/2. So, I enjoyed almost two weeks of vacation.

The first few days, I was solving problems in Project Euler and I finished reading two literature books. Mostly Harmless (Hitchhiker’s Guide, #5) and Demian.

Then, some friends of mine came to Zurich. I gladly discovered new aspects of the city that I live during the last few months that I was not aware of. Furthermore, having a great company the city was looking even greater! The only misfortune was that all the days that these friends were here, it was really cold. We also visited Luzern.

The last few days were spent on various things. Just some rest, a small trip to Schaffhausen where, according to wikipedia, is the Europe’s largest waterfall and catching up with some administrative things.

Coming to the present day, ETH courses have started again and I am very happy about it. The courses that I attended today were very interesting. The instructors were clear and helpful in understanding the material.

Side note: I put some links at the beginning of this post. When I read another post or article, I usually open them in a new tab and read them after having finished what I was reading, unless they are essential. The other way would be read them and then come back. I have read studies that the back button is one of the most used in browsers. Therefore, some people read the linked webpage right away. The first method is like BFS while the second one is more like DFS. I like more DFS as an algorithm :P (silly preferences!) but I use BFS when browsing. I wonder if someone else had thought about it.