IBM Research | Ponder This | August 2024 Solution

August 2024 - Solution

August 2024 Challenge

August 2024 Solution:

The answers are 205 and 60694.

This problem is a relativly simple case of a Markov chain: Assume for the general case we have n sets with k elements each (for our problem we have n=k=4 for the standard and n=10, k=5 for the bonus).

We have a state Q_{i} for every number 0\le i\le n of sets we already guessed, and we compute the transition probability between states: p_{ij} is the probability of transition from Q_i to Q_j. Denote the expected time to hit Q_n starting from Q_i by h_i, then (as a special case of the general theorem about hitting times in Markov chains) we have the following equations:

h_n = 0
h_i = 1 + \sum_{j=0}^{n} p_{ij} h_j

This is simplified by noting that p_{ij} = 0 for j<i since we cannot "unguess" a set. So we can write:

h_i =1+p_{ii}h_i+\sum_{j=i+1}^n p_{ij}h_j

Leading to the solution

h_i = \frac{1+\sum_{j=i+1}^n p_{ij}h_j}{1-p_{ii}}

We are interested in the expected time to reach Q_n from Q_0, hence we iteratively compute h_n, h_{n-1}, \ldots, h_0 with h_0 being our result.

It remains to compute the transition probabilities p_{ij}, i.e. to solve the combinatorial problem "Given n-i sets to guess, each with k elements, what is the probability of guessing exactly j-i?"

The combinatorial object being counted is the number of ways to partition \{1,2,\ldots, n\cdot k\} into n sets, with no order between the sets. If we do consider the order between sets, this is the standard problem of permutations with repetition (we permute the set designations - the "colors"), which is solved by the formula \frac{(nk)!}{(k!)^n} We divide it by n! to accomodate for the lack of order between sets and obtain the value F(n,k)=\frac{(nk)!}{n!(k!)^n}.

Given at least j correct "guesses" it means there remain n-j sets whose elements are freely distributed, hence there are F(n-j,k) partitions which guess specific j sets. To count the number of guesses where exactly j correct guesses were made we use the inclusion-exclusion principle:

G(n,k,j) = {n\choose j}\sum_{r=0}^{n-j}(-1)^{r}{n-j\choose r}F(n-j-r,k)

Where {n\choose j} is the number of ways to choose the specific j sets correctly guessed, and the sum is a standard use of inclusion-exclusion to find the number of guesses (for the remaining n-j sets) where no correct guess was made.

This gives us the probability of guessing exactly j of the n possible sets: \frac{G(n,k,j)}{F(n,k)} and hence, p_{ij} = \frac{G(n-i,k,j-i)}{F(n-i,k)}, with the above hitting expectation time equations giving the solutions.

The following (inefficient) code immediately solves both the standard and the bonus problems:

from math import factorial, comb


def F(n, k):
    return factorial(n*k) // (factorial(k)**n * factorial(n))


def G(n,k,j):
    return comb(n,j)*sum([(-1)**r*comb(n-j,r)*F(n-j-r,k) for r in range(n-j+1)])


def p(i,j, n, k):
    return G(n-i,k,j-i)/F(n-i,k)


def h(i, n, k):
    return (1+sum([p(i,j,n,k)*h(j,n,k) for j in range(i+1,n)]))/(1-p(i,i,n,k))


print("(4,4) problem:", float(h(0,4,4)))
print("(10,5) problem:", float(h(0,10,5)))