PGA Tour Championship - 2nd Rd (Atlanta, GA): Which PLAYER will card a LOWER COMBINED SCORE on Holes 2-4?
12:53PM
Rickie Fowler (USA)
Kevin Kisner (USA) or Tie
Inputs To Solve
East Lake Golf Club 2019 Course Stats
##### User Estimates #####
hole_2 = {'2':6,
'3':24}
hole_3 = {'3':5,
'4':20,
'5':5}
hole_4 = {'3':5,
'4':17,
'5':8}
## Inputs Defined in the Problem
holes = ['2','3','4']
golfers = ['RF','KK']
Method to Solve
- [1] Enumerate all the possible combinations of observed scores and their respective probabilities for one golfer on Holes 2-4 (holes2_4).
- [2] Enumerate all the possible pairs of combined scores (scores) and their respective probabilities (probability) for Rickie Fowler and Kevin Kisner on Holes 2-4, treating the two golfers as independent draws from the same holes2_4 distribution.
- [3] The probability that Rickie Fowler's score is lower than Kevin Kisner's (p_RF) is the sum of the probabilities of the outcomes in which his combined score is strictly lower; ties count for Kisner.
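The three steps above can also be sketched with only the standard library. This is a minimal sketch rather than the notebook's own code: the names `normalize`, `total_distribution`, and `prob_first_lower` are illustrative, and it assumes, as the notebook does, that the holes are independent and that both golfers draw from the same per-hole distributions.

```python
from collections import defaultdict
from itertools import product

# Score counts per hole, copied from the user estimates above.
hole_counts = [
    {2: 6, 3: 24},        # hole 2
    {3: 5, 4: 20, 5: 5},  # hole 3
    {3: 5, 4: 17, 5: 8},  # hole 4
]


def normalize(counts):
    """Turn raw counts into probabilities that sum to 1."""
    total = sum(counts.values())
    return {score: n / total for score, n in counts.items()}


def total_distribution(holes):
    """Step [1]: distribution of the combined score over independent holes."""
    dist = defaultdict(float)
    for combo in product(*(h.items() for h in holes)):
        total = sum(score for score, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        dist[total] += prob
    return dict(dist)


def prob_first_lower(dist_a, dist_b):
    """Steps [2]-[3]: P(A < B) for independent combined scores A and B."""
    return sum(
        pa * pb
        for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items())
        if a < b
    )


holes_pmf = [normalize(c) for c in hole_counts]
dist = total_distribution(holes_pmf)
p_first = prob_first_lower(dist, dist)
print(round(p_first, 3))  # 0.353
```

Passing two different distributions to `prob_first_lower` would cover the case where the golfers are given separate estimates, which the notebook does not attempt.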
## [1]
import numpy as np
import pandas as pd
# Convert each hole's raw counts into probabilities
t = sum(hole_2.values())
for i in hole_2:
    hole_2[i] = hole_2[i]/t
t = sum(hole_3.values())
for i in hole_3:
    hole_3[i] = hole_3[i]/t
t = sum(hole_4.values())
for i in hole_4:
    hole_4[i] = hole_4[i]/t
y = np.array([(a, b, c) for a in hole_2.keys() for b in hole_3.keys() for c in hole_4.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores.columns = holes
scores['total_strokes'] = scores.sum(axis=1)
z = np.array([(a,b,c) for a in hole_2.values() for b in hole_3.values() for c in hole_4.values()])
probability = pd.DataFrame(z)
probability.columns = holes
probability['p'] = probability.product(axis=1)
# Sum the probabilities of all combinations that share a combined score
holes2_4 = {}
for score in set(scores['total_strokes']):
    holes2_4[score] = probability['p'][scores['total_strokes']==score].sum()
holes2_4
{8: 0.005555555555555555,
9: 0.06333333333333332,
10: 0.2544444444444445,
11: 0.41444444444444445,
12: 0.22666666666666668,
13: 0.035555555555555556}
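As a cross-check on the distribution above, the combined score of independent holes is the convolution of the per-hole probability mass functions. The sketch below assumes each hole's scores are consecutive integers, so an array position maps directly to a score; the array names and `conv_dist` are illustrative and not part of the original notebook.

```python
import numpy as np

# Per-hole probabilities, ordered from each hole's lowest score upward.
h2 = np.array([6, 24]) / 30     # scores 2, 3
h3 = np.array([5, 20, 5]) / 30  # scores 3, 4, 5
h4 = np.array([5, 17, 8]) / 30  # scores 3, 4, 5

# Convolving the three pmfs gives the pmf of the combined score.
pmf = np.convolve(np.convolve(h2, h3), h4)

# Index 0 corresponds to the lowest possible combined score: 2 + 3 + 3.
lowest = 2 + 3 + 3
conv_dist = {lowest + k: float(p) for k, p in enumerate(pmf)}
print(conv_dist)
```

The keys should run from 8 to 13 and the values should agree with `holes2_4` up to floating-point rounding.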
## [2]
y = np.array([(y, z) for y in holes2_4.keys() for z in holes2_4.keys()])
scores = pd.DataFrame(y)
scores.columns = golfers
scores['RF_W'] = scores['RF'] < scores['KK']
scores.head()
| | RF | KK | RF_W |
|---|---|---|---|
| 0 | 8 | 8 | False |
| 1 | 8 | 9 | True |
| 2 | 8 | 10 | True |
| 3 | 8 | 11 | True |
| 4 | 8 | 12 | True |
x = np.array([(y,z) for y in holes2_4.values() for z in holes2_4.values()])
probability = pd.DataFrame(x)
probability.columns = golfers
probability['p'] = probability.product(axis=1)
probability.head()
| | RF | KK | p |
|---|---|---|---|
| 0 | 0.005556 | 0.005556 | 0.000031 |
| 1 | 0.005556 | 0.063333 | 0.000352 |
| 2 | 0.005556 | 0.254444 | 0.001414 |
| 3 | 0.005556 | 0.414444 | 0.002302 |
| 4 | 0.005556 | 0.226667 | 0.001259 |
## [3]
p_RF = probability.loc[scores['RF_W'], 'p'].sum()
Solution
print("The probability that Rickie Fowler's score is lower than Kevin Kisner's is ~%s" % round(p_RF,3))
The probability that Rickie Fowler's score is lower than Kevin Kisner's is ~0.353
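Because both golfers are modeled with the same distribution, the result can be checked by symmetry: P(RF < KK) = P(KK < RF), so each equals (1 - P(tie)) / 2. The sketch below reuses the `holes2_4` values printed earlier; the names `p_tie`, `p_rf_check`, and `p_kk_or_tie` are illustrative. It also gives the probability of the other side of the question, "Kevin Kisner or Tie", as the complement.

```python
# Combined-score distribution for Holes 2-4, copied from the output above.
holes2_4 = {
    8: 0.005555555555555555,
    9: 0.06333333333333332,
    10: 0.2544444444444445,
    11: 0.41444444444444445,
    12: 0.22666666666666668,
    13: 0.035555555555555556,
}

# A tie occurs when both golfers land on the same combined score.
p_tie = sum(p * p for p in holes2_4.values())

# With identical distributions, the non-tie probability splits evenly.
p_rf_check = (1 - p_tie) / 2

# "Kisner or Tie" is the complement of "Fowler strictly lower".
p_kk_or_tie = 1 - p_rf_check

print(round(p_tie, 3), round(p_rf_check, 3), round(p_kk_or_tie, 3))
# 0.293 0.353 0.647
```

The shortcut only holds while the two golfers share one distribution; with separate estimates per golfer, the full enumeration in steps [2] and [3] is still needed.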
Info
email: krellabsinc@gmail.com
twitter: @KRELLabs
import sys
print(sys.version)
3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Posted on 8/23/2019