PGA Military Tribute at The Greenbrier - 1st Rd: Which GROUP will record a LOWER COMBINED SCORE on Holes 1-2?

12:45 PM
Marc Leishman, Branden Grace and Sungjae Im or Tie
Kevin Na, Bubba Watson and Scott Stallings

Inputs To Solve

##### User Estimates #####

hole_1 = {'3':14,
          '4':49,
          '5':9}

hole_2 = {'3':11,
          '4':47,
          '5':10,
          '6':1}

print("The observed distribution of scores on hole 1 is: %s" % hole_1)
print("The observed distribution of scores on hole 2 is: %s" % hole_2)

The observed distribution of scores on hole 1 is: {'3': 14, '4': 49, '5': 9}
The observed distribution of scores on hole 2 is: {'3': 11, '4': 47, '5': 10, '6': 1}

## Inputs Defined in the Problem

# Column labels read G<golfer><hole>: three golfers on hole 1, then the same three on hole 2
golfers = ['G11','G21','G31','G12','G22','G32']

Method to Solve

[1] Enumerate all the possible combinations of observed scores (scores) and their respective probabilities (probability) for 3 golfers on holes 1 and 2.
[2] Create a distribution of all possible unique combined scores for 3 golfers on holes 1 and 2 and their respective probabilities (holes1_2).
[3] Enumerate all the possible combinations of scores (_2groups_scores) for 2 sets of 3 golfers on holes 1 and 2 and their respective probabilities (_2groups_p).
[4] The probability that group 1’s combined score is lower than or equal to group 2’s score (p_g1) is the sum of all the outcomes where group 1’s score is less than or equal to group 2’s score.
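
As a cross-check on the four steps above, the same quantity can be sketched without pandas by adding independent score distributions one golfer at a time. This is a minimal sketch rather than the method used in this post: the helper names `normalize`, `add_independent` and `group_distribution` are made up for illustration, and it assumes, as the enumeration does, that every golfer's score on each hole is an independent draw from the observed distribution.

```python
from collections import defaultdict

# Observed counts copied from the "User Estimates" cell (integer keys for convenience).
hole_1 = {3: 14, 4: 49, 5: 9}
hole_2 = {3: 11, 4: 47, 5: 10, 6: 1}

def normalize(counts):
    """Turn observed counts into probabilities (hypothetical helper)."""
    total = sum(counts.values())
    return {score: n / total for score, n in counts.items()}

def add_independent(dist_a, dist_b):
    """Distribution of the sum of two independent scores (hypothetical helper)."""
    out = defaultdict(float)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] += pa * pb
    return dict(out)

def group_distribution(holes, n_golfers=3):
    """Steps [1] and [2]: combined score of n_golfers over the given holes."""
    dist = {0: 1.0}
    for counts in holes:
        pmf = normalize(counts)
        for _ in range(n_golfers):
            dist = add_independent(dist, pmf)
    return dist

group_dist = group_distribution([hole_1, hole_2])

# Steps [3] and [4]: both groups share the same distribution and are assumed independent.
p_g1 = sum(pa * pb
           for a, pa in group_dist.items()
           for b, pb in group_dist.items()
           if a <= b)
print(round(p_g1, 3))
```

Building the sum one golfer at a time avoids materialising every six-score combination, which matters little here but keeps the approach usable if more holes or golfers are added.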

import numpy as np
import pandas as pd

# Convert the observed counts on hole 1 into probabilities
t=sum(hole_1.values())
for i in hole_1:
    hole_1[i] = hole_1[i]/t

# Same normalization for hole 2
t=sum(hole_2.values())
for i in hole_2:
    hole_2[i] = hole_2[i]/t

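# Every combination of scores: three golfers on hole 1, then three golfers on hole 2
y = np.array([(a,b,c,d,e,f) for a in hole_1.keys() for b in hole_1.keys() for c in hole_1.keys()
             for d in hole_2.keys() for e in hole_2.keys() for f in hole_2.keys()])

# The dictionary keys are strings, so convert them to integers before summing
y = y.astype(int)
scores = pd.DataFrame(y)
scores.columns = golfers
scores['total_strokes'] = scores.sum(axis=1)

# Matching probabilities, enumerated in the same order as the scores above
x = np.array([(a,b,c,d,e,f) for a in hole_1.values() for b in hole_1.values() for c in hole_1.values()
             for d in hole_2.values() for e in hole_2.values() for f in hole_2.values()])

probability = pd.DataFrame(x)
probability.columns = golfers
# Scores are treated as independent, so each row's probability is the product of its columns
probability['p'] = probability.product(axis=1)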

scores.head()

	G11	G21	G31	G12	G22	G32	total_strokes
0	3	3	3	3	3	3	18
1	3	3	3	3	3	4	19
2	3	3	3	3	3	5	20
3	3	3	3	3	3	6	21
4	3	3	3	3	4	3	19

probability.head()

	G11	G21	G31	G12	G22	G32	p
0	0.194444	0.194444	0.194444	0.15942	0.159420	0.159420	0.000030
1	0.194444	0.194444	0.194444	0.15942	0.159420	0.681159	0.000127
2	0.194444	0.194444	0.194444	0.15942	0.159420	0.144928	0.000027
3	0.194444	0.194444	0.194444	0.15942	0.159420	0.014493	0.000003
4	0.194444	0.194444	0.194444	0.15942	0.681159	0.159420	0.000127
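
The nested comprehensions in step [1] can also be written with `itertools.product`, which keeps each score paired with its probability so the two tables cannot drift out of order. The following is only a sketch of that alternative, using plain Python lists instead of DataFrames; `normalize`, `columns` and `rows` are illustrative names, not part of the original notebook.

```python
from itertools import product

# Observed counts copied from the "User Estimates" cell (integer keys for convenience).
hole_1 = {3: 14, 4: 49, 5: 9}
hole_2 = {3: 11, 4: 47, 5: 10, 6: 1}

def normalize(counts):
    """Turn observed counts into probabilities (hypothetical helper)."""
    total = sum(counts.values())
    return {score: n / total for score, n in counts.items()}

p1, p2 = normalize(hole_1), normalize(hole_2)

# One list of (score, probability) pairs per column: G11, G21, G31, G12, G22, G32
columns = [list(p1.items())] * 3 + [list(p2.items())] * 3

rows = []
for combo in product(*columns):
    total_strokes = sum(score for score, _ in combo)
    p = 1.0
    for _, prob in combo:
        p *= prob  # independent scores, so probabilities multiply
    rows.append((total_strokes, p))

print(len(rows))   # number of enumerated combinations
print(rows[:5])    # same ordering as scores.head() and probability.head()
```

`product` varies the last column fastest, so the rows come out in the same order as the comprehension-based tables.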

## [2]

holes1_2 = {}
for score in set(scores['total_strokes']):
    holes1_2[score] = probability['p'][scores['total_strokes']==score].sum()
holes1_2

{32: 2.754712707008533e-07,
 33: 5.945423108651507e-09,
 18: 2.978635772907547e-05,
 19: 0.000694563705227987,
 20: 0.006873659215583314,
 21: 0.0374549401667416,
 22: 0.12190249186154868,
 23: 0.24075749956727904,
 24: 0.2828410099610215,
 25: 0.19387930907707826,
 26: 0.0844443027779792,
 27: 0.02501146224727144,
 28: 0.00523549293136384,
 29: 0.0007854214409735436,
 30: 8.368763703239537e-05,
 31: 6.091636476953164e-06}

## [3]
golfers = ['group1','group2']

y = np.array([(a,b) for a in holes1_2.keys() for b in holes1_2.keys()])

y = y.astype(int)
_2groups_scores = pd.DataFrame(y)
_2groups_scores.columns = golfers
_2groups_scores['G1_lower'] = _2groups_scores['group1'] <= _2groups_scores['group2']

x = np.array([(a,b) for a in holes1_2.values() for b in holes1_2.values()])

_2groups_p = pd.DataFrame(x)
_2groups_p.columns = golfers
_2groups_p['p'] = _2groups_p.product(axis=1)

_2groups_scores.head()

	group1	group2	G1_lower
0	32	32	True
1	32	33	True
2	32	18	False
3	32	19	False
4	32	20	False

_2groups_p.head()

	group1	group2	p
0	2.754713e-07	2.754713e-07	7.588442e-14
1	2.754713e-07	5.945423e-09	1.637793e-15
2	2.754713e-07	2.978636e-05	8.205286e-12
3	2.754713e-07	6.945637e-04	1.913323e-10
4	2.754713e-07	6.873659e-03	1.893496e-09

## [4]

# Sum the probabilities of every outcome where group 1's total is lower than or tied with group 2's
p_g1 = _2groups_p['p'][_2groups_scores['G1_lower']].sum()
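
Because both groups are drawn from the same distribution, the result of step [4] can be sanity-checked by symmetry: each group is equally likely to be strictly lower, so P(group 1 lower or tied) should equal (1 + P(tie)) / 2. The sketch below checks this using the distribution printed in step [2]; the variable names `p_tie`, `p_lower_or_tie` and `by_symmetry` are illustrative only.

```python
# Distribution of one group's combined score, copied from the step [2] output.
holes1_2 = {
    18: 2.978635772907547e-05, 19: 0.000694563705227987,
    20: 0.006873659215583314,  21: 0.0374549401667416,
    22: 0.12190249186154868,   23: 0.24075749956727904,
    24: 0.2828410099610215,    25: 0.19387930907707826,
    26: 0.0844443027779792,    27: 0.02501146224727144,
    28: 0.00523549293136384,   29: 0.0007854214409735436,
    30: 8.368763703239537e-05, 31: 6.091636476953164e-06,
    32: 2.754712707008533e-07, 33: 5.945423108651507e-09,
}

# Probability that both groups post exactly the same combined score
p_tie = sum(p * p for p in holes1_2.values())

# Direct sum over all pairs where group 1 is lower than or tied with group 2
p_lower_or_tie = sum(pa * pb
                     for a, pa in holes1_2.items()
                     for b, pb in holes1_2.items()
                     if a <= b)

# Symmetry argument: the non-tie probability splits evenly between the groups
by_symmetry = (1 + p_tie) / 2

print(round(p_tie, 3), round(p_lower_or_tie, 3), round(by_symmetry, 3))
```

The two values should agree up to floating-point rounding, and the size of the tie probability explains why the "or Tie" side is favored even though the groups are modeled identically.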

Solution

print("The probability that Marc Leishman, Branden Grace and Sungjae Im's combined score on holes 1 and 2 is lower than or equal to Kevin Na, Bubba Watson and Scott Stallings' combined score is ~%s" % round(p_g1,3))

The probability that Marc Leishman, Branden Grace and Sungjae Im's combined score on holes 1 and 2 is lower than or equal to Kevin Na, Bubba Watson and Scott Stallings' combined score is ~0.6

Info

email: krellabsinc@gmail.com
twitter: @KRELLabs

import sys
print(sys.version)

3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

Posted on 9/12/2019