Breaking Down Secretary Problem by@suraj-regmi

Data Scientist, The World Bank -- the views, content here represent my own and not of my employers.

What did you think of when you had a crush on someone? Did you fantasize about marriage with him/her? When you were in some serious relationship, did you plan marriage with your partner? How did the relationship turn out? Some relationships turn into a marriage, and some don’t. Hearing stories of many friends, I see extremely few people being in a relationship (and later marrying) with only one person whole over their life.

So, it is expected that people have experience of some relationships before being finally married to someone for the long term. The decision of marrying someone after a relationship is a binary choice. You either decide to marry or reject to marry. You can’t just keep someone as an option, and explore other partners. Everyone wants the “best” spouse for them, but you can’t really experiment with many spouses in life to find the best one. *Time and tide wait for no man.*

You are more likely to find a fitting partner at some age range than others. So, I would constrain this problem by the number of relationships you can be with all over your life. A man, if never wants to marry, might stay in twenty relationships all over his life on an average. Twenty is just a guess here. I am modeling this problem with such a constraint here.

The choice for someone is binary. The problem is constrained by the number of relationships *(n)*. So, yes, this is a secretary problem given the way I modeled.

My interest here is in finding the strategy which allows me to select the best partner. I can’t keep rejecting the partners hoping to get better finds later as I am constrained by the number of total relationships I can have. I can’t select too early as someone better might be there down the road. So, how can I shape my strategy?

**Problem formulation**

Let’s suppose we start to make a decision about our spouse after exploring *x %* of the relationships.

The probability that we choose the best one is the sum of probabilities: *the first relationship is the best one and we marry the first partner, the second relationship is the best one and we marry the second partner, the third relationship is the best one and we marry the third partner, …, the last relationship is the best one and we marry the last partner*

Let *B_i* be the event of *ith* partner being the best one, and *M_i* be the event of marrying with *ith* partner.

So, the probability is:

*p = P(B_1 & M_1) + P(B_2 & M_2) + … + P(B_n & M_n)*

**Our decision criteria**

According to our decision criteria, let’s suppose we are ready to make decision after *r - 1* relationships.

So,

*x = (r - 1) / n*

We do not make a decision until the *rth* relationship, so the probability terms until *rth* relationship are zero.

Hence,

*p = P(B_r & M_r) + P(B_[r+1] & M_[r+1]) + … + P(B_n & M_n)*

For some *i* such that *r ≤ i ≤ n,*

Using conditional probability:

*P(B_i & M_i) = P(B_i) * P(M_i|B_i)*

As the random variable *B* denoting the position on which the best candidate lies is uniformly distributed, *B_i = 1 / n.*

Now, given that the best candidate lies in *ith* position, the probability of we choosing the *ith* candidate is the probability of second-best lying in first *(r - 1) *elements. So, the probability becomes: *(r - 1) / (i - 1).*

Therefore, the total probability is:

*p = (r - 1) / n * [1 / (r - 1) + 1 / r + 1 / (r + 1) + … + 1 / n]*

The multiplier of the series, *(r - 1) / n*, is the value *x%* we defined in the beginning.

**Transforming infinite discrete sum into integration**

The probability value, *p*, is the discrete sum, whose optimum point depends on *n* and *r*, but let’s see the optimum value of *x* to maximize the probability, *p*, when *n* tends to infinity!

So, the converging probability for us would be:

*Max(p; n*

`→∞`

*)*

So, let’s take the limiting value of *p* as *n*→∞.

The infinite discrete series can be transformed into a continuous form by setting *1 / n* to be *dt* and *n / (r - 1)* to be *1 / t*.

`So, `

*t = (i - 1) / n *

`for i in `

*[r, n]*

`.`

Now, the lower bound for the integral would be:

*t(i = r) = (r - 1) / n = x*

And, the upper bound would be:

*(n - 1) / n.*

So, now, the integration would be:

*p = x * ∫[x, (n - 1) / n] (1 / t) dt *

*p = x * [ln{(n - 1) / n} - lnx]*

Now, using the limit

*n→∞*

,*p; n→∞ = x * [ln{(n - 1) / n} - lnx]; n→∞*

*p; n→∞ = x * [ln(1 - 1/n) - lnx]; n→∞*

*p; n→∞ = - x * lnx*

It is the probability of selecting the best candidate give the percentage *x*.

**Optimization for the probability value**

For the maximum value of the probability, we take the derivative of it with respect to *x* and equal to zero to find the optimum value of *x*.

`So, `

*dp / dx; n→∞ = 0*

*or, - x / x - lnx * 1 = 0*

*or, - 1 - lnx = 0*

*or, lnx = - 1*

*or, x = 1 / e*

So, when the value of *x** *is ** 1 / e** or

And, the probability of selecting the best partner is:

*-x * ln(x) = - 1 / e * ln(1 / e) = 1 / e*

So, if we follow the optimum strategy, we have ** ~36.8%** of selecting the best partner.

While the problem has beautiful mathematics and provides an interesting insight for optimal stopping tasks, it can’t be modeled for many real-life scenarios because of the following reasons.

- It is hard to quantify the competency of the people accurately in a very short time for scenarios like job interviews.
- The problem accounts for selecting only the best candidate. So, that means the last candidate is as worthy as second-best candidate. It is rarely the case in real life.
- The problem assumes the number of total candidates is known beforehand, which is unfortunately not true for many real-life scenarios.
- The problem requires rejection or acceptance instantly for all the candidates. Real-life scenarios don’t require or give such decisions.

However, it is an interesting problem to give insight into optimal strategies and behavior. To the scenarios which can be modeled by the problem (if you care for only the best partner, the marriage problem *may* fit into this model ;) ), the strategy can be effectively and easily applied.

Coming to the initial guess I made about the constraint (*20* relationships in total), the person should make the decision after *7 / 8* relationships. Let’s have more fun by assuming the age range. Suppose the *20* relationships are evenly spread in the age range *(20 - 40).* Then, taking an average of *1* relationship per year, the optimum age to start making decision is around *27/28*.

Wait, is it too early to make the marriage decision? Or does it actually make sense?

What do you think? Do let me know your views in the comment section.

*Ciao!*

*Also published at **https://medium.com/@surajregmi/what-is-secretary-problem-problem-definition-its-mathematics-and-real-life-scenarios-b4a2f2df912c*