PGA Tour Championship - 1st Rd (Atlanta, GA): Which PAIR will record a LOWER COMBINED BACK 9 SCORE?
4:20PM
Justin Thomas + Patrick Cantlay or Tie
Brooks Koepka + Patrick Reed
Inputs To Solve
East Lake Golf Club 2018 Course Stats
##### User Estimates #####
hole_10 = {'3':28,
'4':79,
'5':13}
hole_11 = {'2':10,
'3':86,
'4':24}
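hole_12 = {'3':29,
'4':74,
'5':16,
'5':1}  # repeated key '5': Python keeps only the last value (1); '6':1 may have been intended, but the results below reflect the dict as written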
hole_13 = {'3':20,
'4':86,
'5':13,
'6':1}
hole_14 = {'3':17,
'4':73,
'5':27,
'6':3}
hole_15 = {'2':18,
'3':76,
'4':19,
'5':7}
hole_16 = {'3':18,
'4':77,
'5':19,
'6':6}
hole_17 = {'2':1,
'3':28,
'4':75,
'5':16,
'6':4}
hole_18 = {'3':2,
'4':62,
'5':50,
'6':6}
## Inputs Defined in the Problem
golfers = ['1','2','3','4']
Method to Solve
- [1] Enumerate every possible combination of hole-by-hole scores on the 9 holes, with its probability, and collapse them into a distribution of 9-hole totals (pair)
- [2] Enumerate every combination of two 9-hole totals, one per side (scores), and the corresponding joint probabilities (probability)
- [3] The probability that Brooks Koepka + Patrick Reed's score is lower is the sum of the probabilities of all combinations in which their score is strictly lower (p_BK_PR_lower); ties count for Justin Thomas + Patrick Cantlay
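The three steps can be illustrated on a much smaller problem. The sketch below is not the notebook's code: it assumes a hypothetical two-hole course (`toy_holes`, with made-up probabilities) and uses `itertools.product` in place of the NumPy/pandas enumeration, purely to show the shape of the calculation.

```python
from itertools import product
from collections import defaultdict

# Hypothetical two-hole course: keys are scores, values are probabilities.
toy_holes = [
    {3: 0.25, 4: 0.75},
    {4: 0.5, 5: 0.5},
]

# [1] Enumerate every combination of hole scores and accumulate the
#     probability of each total.
total_dist = defaultdict(float)
for combo in product(*(hole.items() for hole in toy_holes)):
    total = sum(score for score, _ in combo)
    prob = 1.0
    for _, p in combo:
        prob *= p
    total_dist[total] += prob

# [2] + [3] Enumerate every (first side, second side) combination of totals
#     and add up the joint probabilities where the second side is strictly lower.
p_second_lower = sum(
    pa * pb
    for (a, pa), (b, pb) in product(total_dist.items(), repeat=2)
    if b < a
)

print(dict(total_dist))
print(p_second_lower)
```

Both sides are drawn from the same distribution and treated as independent, which is also what the full calculation assumes.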
## Imports
import numpy as np
import pandas as pd
## [1]
# Normalize each hole's stroke counts into probabilities (in place)
holes = [hole_10, hole_11, hole_12, hole_13, hole_14,
         hole_15, hole_16, hole_17, hole_18]
for hole in holes:
    t = sum(hole.values())
    for i in hole:
        hole[i] = hole[i]/t
y = np.array([(t,u,v,w,x,z,a,b,c) for t in hole_10.keys() for u in hole_11.keys() for v in hole_12.keys()
for w in hole_13.keys() for x in hole_14.keys() for z in hole_15.keys() for a in hole_16.keys()
for b in hole_17.keys() for c in hole_18.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores['total_strokes'] = scores.sum(axis=1)
z = np.array([(t,u,v,w,x,z,a,b,c) for t in hole_10.values() for u in hole_11.values() for v in hole_12.values()
for w in hole_13.values() for x in hole_14.values() for z in hole_15.values() for a in hole_16.values()
for b in hole_17.values() for c in hole_18.values()])
probability = pd.DataFrame(z)
probability['p'] = probability.product(axis=1)
# Collapse the enumerated combinations into a distribution of 9-hole totals
pair = {}
for s in set(scores['total_strokes']):
    pair[s] = probability['p'][scores['total_strokes']==s].sum()
pair
{24: 3.871561882524583e-10,
25: 3.48701010774946e-08,
26: 1.2427949177842029e-06,
27: 2.3328960825676486e-05,
28: 0.00026474145257614586,
29: 0.001962650964754365,
30: 0.00994404779209208,
31: 0.03533462344702592,
32: 0.08929012231728761,
33: 0.16171129995247704,
34: 0.21160121114358144,
35: 0.2033033961230164,
36: 0.14745682587312717,
37: 0.08316324610279467,
38: 0.03729879672483515,
39: 0.013499330366800424,
40: 0.00397390102762465,
41: 0.0009530365215584436,
42: 0.00018535840704149285,
43: 2.89071618538805e-05,
44: 3.543496246375633e-06,
45: 3.307231429671809e-07,
46: 2.2357504379360294e-08,
47: 1.0061618614175353e-09,
48: 2.5292671081376566e-11,
49: 2.0417973416965354e-13}
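The same distribution can be reached without materializing every combination of hole scores: the distribution of a sum of independent holes is the convolution of the per-hole distributions. The following is a sketch of that alternative, not the method used above. The function name `nine_hole_distribution` and the variable `counts` are made up for illustration, and the `hole_12` entry uses the values Python actually keeps from the dict literal with the repeated `'5'` key.

```python
import numpy as np

# Stroke counts as entered in the inputs. hole_12 is written with the values
# that survive the repeated '5' key in the original dict literal.
counts = [
    {3: 28, 4: 79, 5: 13},              # hole 10
    {2: 10, 3: 86, 4: 24},              # hole 11
    {3: 29, 4: 74, 5: 1},               # hole 12
    {3: 20, 4: 86, 5: 13, 6: 1},        # hole 13
    {3: 17, 4: 73, 5: 27, 6: 3},        # hole 14
    {2: 18, 3: 76, 4: 19, 5: 7},        # hole 15
    {3: 18, 4: 77, 5: 19, 6: 6},        # hole 16
    {2: 1, 3: 28, 4: 75, 5: 16, 6: 4},  # hole 17
    {3: 2, 4: 62, 5: 50, 6: 6},         # hole 18
]

def nine_hole_distribution(holes):
    """Return {total_strokes: probability} by convolving per-hole pmfs."""
    dist = np.array([1.0])  # distribution of an empty sum: total 0 with probability 1
    offset = 0              # lowest possible total so far
    for hole in holes:
        lo, hi = min(hole), max(hole)
        pmf = np.zeros(hi - lo + 1)
        for score, n in hole.items():
            pmf[score - lo] = n
        pmf /= pmf.sum()    # counts -> probabilities
        dist = np.convolve(dist, pmf)
        offset += lo
    return {offset + i: float(p) for i, p in enumerate(dist)}

pair_conv = nine_hole_distribution(counts)
print(min(pair_conv), max(pair_conv), sum(pair_conv.values()))
```

The convolution touches only a few dozen array entries per hole, whereas the enumeration above builds one row for every combination of hole scores.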
## [2]
y = np.array([(a,b) for a in pair.keys() for b in pair.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores['BK_PR_Lower'] = scores[0] < scores[1]
scores.head()
| | 0 | 1 | BK_PR_Lower |
|---|---|---|---|
| 0 | 24 | 24 | False |
| 1 | 24 | 25 | True |
| 2 | 24 | 26 | True |
| 3 | 24 | 27 | True |
| 4 | 24 | 28 | True |
z = np.array([(a,b) for a in pair.values() for b in pair.values()])
probability = pd.DataFrame(z)
probability['p'] = probability.product(axis=1)
probability.head()
| | 0 | 1 | p |
|---|---|---|---|
| 0 | 3.871562e-10 | 3.871562e-10 | 1.498899e-19 |
| 1 | 3.871562e-10 | 3.487010e-08 | 1.350018e-17 |
| 2 | 3.871562e-10 | 1.242795e-06 | 4.811557e-16 |
| 3 | 3.871562e-10 | 2.332896e-05 | 9.031952e-15 |
| 4 | 3.871562e-10 | 2.647415e-04 | 1.024963e-13 |
p_BK_PR_lower = probability['p'][scores['BK_PR_Lower']==True].sum()
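The comparison above draws one total per side from `pair`, and the 24 to 49 range of `pair` corresponds to a single run through the nine holes. If each side's score is instead meant to be the sum of two golfers' back nines, one possible extension is to convolve that distribution with itself before comparing. The sketch below assumes independent golfers who all share the same distribution. The helper names `sum_of_two` and `p_lower` and the distribution `toy_single` are hypothetical and do not come from the notebook.

```python
from itertools import product
from collections import defaultdict

def sum_of_two(dist):
    """Distribution of the sum of two independent draws from `dist`."""
    out = defaultdict(float)
    for (a, pa), (b, pb) in product(dist.items(), repeat=2):
        out[a + b] += pa * pb
    return dict(out)

def p_lower(side_a, side_b):
    """Probability that side_a's total is strictly lower than side_b's."""
    return sum(
        pa * pb
        for (a, pa), (b, pb) in product(side_a.items(), side_b.items())
        if a < b
    )

# Made-up single-golfer distribution, for illustration only.
toy_single = {34: 0.25, 35: 0.5, 36: 0.25}

toy_team = sum_of_two(toy_single)        # combined score of a two-golfer side
p_team_lower = p_lower(toy_team, toy_team)
print(toy_team)
print(p_team_lower)
```

Passing the notebook's `pair` dict to `sum_of_two` in place of `toy_single` would give the combined-score version of the comparison. Summing two scores widens the distribution, so ties become less likely than in the single-draw comparison.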
Solution
print("The probability Brooks Koepka + Patrick Reed's combined score on holes 10-18 is lower is ~%s" % round(p_BK_PR_lower,3))
The probability Brooks Koepka + Patrick Reed's combined score on holes 10-18 is lower is ~0.424
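Because both sides are drawn from the same distribution, the result can be cross-checked with a symmetry argument: P(lower) = P(higher), so P(lower) = (1 - P(tie)) / 2. The sketch below applies that identity to the `pair` values printed earlier, copied here so the block runs on its own. It is a sanity check, not part of the original solution.

```python
# 9-hole total distribution, copied from the printed `pair` output above.
pair = {
    24: 3.871561882524583e-10, 25: 3.48701010774946e-08,
    26: 1.2427949177842029e-06, 27: 2.3328960825676486e-05,
    28: 0.00026474145257614586, 29: 0.001962650964754365,
    30: 0.00994404779209208, 31: 0.03533462344702592,
    32: 0.08929012231728761, 33: 0.16171129995247704,
    34: 0.21160121114358144, 35: 0.2033033961230164,
    36: 0.14745682587312717, 37: 0.08316324610279467,
    38: 0.03729879672483515, 39: 0.013499330366800424,
    40: 0.00397390102762465, 41: 0.0009530365215584436,
    42: 0.00018535840704149285, 43: 2.89071618538805e-05,
    44: 3.543496246375633e-06, 45: 3.307231429671809e-07,
    46: 2.2357504379360294e-08, 47: 1.0061618614175353e-09,
    48: 2.5292671081376566e-11, 49: 2.0417973416965354e-13,
}

# A tie happens when both sides land on the same total.
p_tie = sum(p * p for p in pair.values())

# The remaining probability is split evenly between the two sides.
p_lower_check = (1 - p_tie) / 2
print(round(p_lower_check, 3))
```

This reproduces the ~0.424 figure and shows that the "or Tie" clause accounts for the whole gap between the two sides.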
Info
email: krellabsinc@gmail.com
twitter: @KRELLabs
import sys
print(sys.version)
3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Posted on 8/22/2019