PGA Tour Championship - 2nd Rd (Atlanta, GA): Which PLAYER will card a LOWER COMBINED SCORE on Holes 2-4?

12:53PM
Rickie Fowler (USA)
Kevin Kisner (USA) or Tie

Inputs To Solve

East Lake Golf Club 2019 Course Stats

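##### User Estimates #####
# Keys are the possible scores on each hole; values are relative weights,
# which step [1] normalizes to probabilities.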

hole_2 = {'2':6,
          '3':24}

hole_3 = {'3':5,
          '4':20,
          '5':5}

hole_4 = {'3':5,
          '4':17,
          '5':8}

## Inputs Defined in the Problem

holes = ['2','3','4']
golfers = ['RF','KK']

Method to Solve

[1] Enumerate every combination of scores on Holes 2-4 for a single golfer, multiply the per-hole probabilities (holes are treated as independent), and sum by total strokes to get the distribution of the combined score (holes2_4).
[2] Enumerate every pair of combined scores (scores) and their joint probabilities (probability) for Rickie Fowler and Kevin Kisner. Both golfers use the same distribution, holes2_4, and are treated as independent.
[3] The probability that Rickie Fowler’s score is lower than Kevin Kisner’s (p_RF) is the sum of the probabilities of the outcomes where his total is strictly lower. Ties count for Kevin Kisner.
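
The three steps can also be sketched with only the standard library. This is a minimal sketch rather than the code used in the cells that follow: it assumes the per-hole weights from the inputs, independence across holes and between golfers, and one shared distribution for both golfers. The names `HOLES`, `total_distribution`, and `p_first_lower` are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Per-hole weights copied from the user estimates (score -> weight).
HOLES = [
    {2: 6, 3: 24},
    {3: 5, 4: 20, 5: 5},
    {3: 5, 4: 17, 5: 8},
]

def total_distribution(holes):
    """Step [1]: distribution of the combined score, assuming independent holes."""
    normalized = []
    for hole in holes:
        weight_sum = sum(hole.values())
        normalized.append({s: w / weight_sum for s, w in hole.items()})
    dist = defaultdict(float)
    # Each combo is one (score, probability) pair per hole.
    for combo in product(*(h.items() for h in normalized)):
        total = sum(score for score, _ in combo)
        p = 1.0
        for _, prob in combo:
            p *= prob
        dist[total] += p
    return dict(dist)

def p_first_lower(dist_a, dist_b):
    """Steps [2]-[3]: P(A < B) for independent totals A and B."""
    return sum(pa * pb
               for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items())
               if a < b)

dist = total_distribution(HOLES)
p_rf = p_first_lower(dist, dist)
print(round(p_rf, 3))
```

Keeping the two functions separate means a different distribution could be passed for each golfer if player-specific estimates were available.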

## [1]

import numpy as np
import pandas as pd

# Normalize each hole's weights in place so they sum to 1.
for hole in (hole_2, hole_3, hole_4):
    t = sum(hole.values())
    for i in hole:
        hole[i] = hole[i] / t

# Every (hole 2, hole 3, hole 4) score combination and its total.
y = np.array([(a, b, c) for a in hole_2.keys() for b in hole_3.keys() for c in hole_4.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores.columns = holes
scores['total_strokes'] = scores.sum(axis=1)

# Matching per-hole probabilities, in the same row order; p is their product.
z = np.array([(a, b, c) for a in hole_2.values() for b in hole_3.values() for c in hole_4.values()])
probability = pd.DataFrame(z)
probability.columns = holes
probability['p'] = probability.product(axis=1)

# Sum p over all combinations that share a total.
holes2_4 = {}
for score in sorted(set(scores['total_strokes'])):
    holes2_4[score] = probability['p'][scores['total_strokes'] == score].sum()
holes2_4

{8: 0.005555555555555555,
 9: 0.06333333333333332,
 10: 0.2544444444444445,
 11: 0.41444444444444445,
 12: 0.22666666666666668,
 13: 0.035555555555555556}
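
As a cross-check on this distribution, the same totals can be obtained by convolving the per-hole probability vectors, since the distribution of a sum of independent variables is the convolution of their distributions. This is a sketch of an alternative technique, not the method used above; the names `hole_2_pmf`, `hole_3_pmf`, `hole_4_pmf`, `total_pmf`, and `check` are hypothetical.

```python
import numpy as np

# Probability vectors ordered by score, built from the user estimates.
hole_2_pmf = np.array([6, 24]) / 30      # scores 2, 3
hole_3_pmf = np.array([5, 20, 5]) / 30   # scores 3, 4, 5
hole_4_pmf = np.array([5, 17, 8]) / 30   # scores 3, 4, 5

# Convolving the vectors gives the distribution of the three-hole total.
total_pmf = np.convolve(np.convolve(hole_2_pmf, hole_3_pmf), hole_4_pmf)

# Index k of total_pmf corresponds to a total of lowest_total + k.
lowest_total = 2 + 3 + 3
check = {lowest_total + k: p for k, p in enumerate(total_pmf)}
print(check)
```

This relies on each hole's possible scores being consecutive integers, which holds for the estimates here.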

## [2]

# All (RF total, KK total) pairs; RF_W flags pairs where Fowler is strictly lower.
pairs = np.array([(rf, kk) for rf in holes2_4.keys() for kk in holes2_4.keys()])
scores = pd.DataFrame(pairs)
scores.columns = golfers
scores['RF_W'] = scores['RF'] < scores['KK']
scores.head()

	RF	KK	RF_W
0	8	8	False
1	8	9	True
2	8	10	True
3	8	11	True
4	8	12	True

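# Probability of each golfer's total, in the same row order as scores; p is the joint probability.
joint = np.array([(rf, kk) for rf in holes2_4.values() for kk in holes2_4.values()])
probability = pd.DataFrame(joint)
probability.columns = golfers
probability['p'] = probability.product(axis=1)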
probability.head()

	RF	KK	p
0	0.005556	0.005556	0.000031
1	0.005556	0.063333	0.000352
2	0.005556	0.254444	0.001414
3	0.005556	0.414444	0.002302
4	0.005556	0.226667	0.001259

## [3]

# Sum the joint probabilities of the pairs where Fowler is strictly lower.
p_RF = probability.loc[scores['RF_W'], 'p'].sum()

Solution

print("The probability that Rickie Fowler's score is lower than Kevin Kisner's is ~%s" % round(p_RF,3))

The probability that Rickie Fowler's score is lower than Kevin Kisner's is ~0.353
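
Because both golfers are given the same score distribution, the result can be checked by symmetry: P(RF lower) equals P(KK lower), so each is half of one minus the tie probability. The sketch below, which assumes the holes2_4 values printed in step [1], also gives the probability of the other side of the prop, "Kevin Kisner or Tie". The names `p_tie`, `p_rf`, and `p_kk_or_tie` are hypothetical.

```python
# Combined-score distribution, copied from the output of step [1].
holes2_4 = {
    8: 0.005555555555555555,
    9: 0.06333333333333332,
    10: 0.2544444444444445,
    11: 0.41444444444444445,
    12: 0.22666666666666668,
    13: 0.035555555555555556,
}

# A tie happens when both independent golfers land on the same total.
p_tie = sum(p * p for p in holes2_4.values())

# Identical distributions make the two "strictly lower" outcomes equally likely.
p_rf = (1 - p_tie) / 2

# "Kevin Kisner or Tie" is the complement of Fowler being strictly lower.
p_kk_or_tie = 1 - p_rf

print(round(p_tie, 3), round(p_rf, 3), round(p_kk_or_tie, 3))
```

The shortcut only holds while both golfers share one distribution. With player-specific estimates, the full pairwise enumeration is needed.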

Info

email: krellabsinc@gmail.com
twitter: @KRELLabs

import sys
print(sys.version)

3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]

Posted on 8/23/2019