12:45 PM
Marc Leishman, Branden Grace and Sungjae Im or Tie
Kevin Na, Bubba Watson and Scott Stallings
The Old White Course Stats
hole_1 = {'3':14,
'4':49,
'5':9}
hole_2 = {'3':11,
'4':47,
'5':10,
'6':1}
print("The observed distribution of scores on hole 1 is: %s" % hole_1)
print("The observed distribution of scores on hole 2 is: %s" % hole_2)
The observed distribution of scores on hole 1 is: {'3': 14, '4': 49, '5': 9}
The observed distribution of scores on hole 2 is: {'3': 11, '4': 47, '5': 10, '6': 1}
golfers = ['G11','G21','G31','G12','G22','G32']
Method to Solve
- [1] Enumerate all possible combinations of observed scores (scores) and their respective probabilities (probability) for 3 golfers on holes 1 and 2.
- [2] Create a distribution of all possible unique combined scores for 3 golfers on holes 1 and 2 and their respective probabilities (holes1_2).
- [3] Enumerate all possible combinations of scores (_2groups_scores) for 2 sets of 3 golfers on holes 1 and 2 and their respective probabilities (_2groups_p).
- [4] The probability that group 1’s combined score is lower than or equal to group 2’s score (p_g1) is the sum of all the outcomes where group 1’s score is less than or equal to group 2’s score.
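The four steps above can also be sketched compactly with convolutions instead of full enumeration. This is a minimal sketch, not the code used in this post: it assumes, as the method does, that every golfer's score on a hole is an independent draw from the observed field distribution. The names `to_pmf`, `group_distribution` and `p_g1_sketch` are made up for illustration.

```python
import numpy as np

# Observed score counts copied from the course stats above.
hole_1_counts = {3: 14, 4: 49, 5: 9}
hole_2_counts = {3: 11, 4: 47, 5: 10, 6: 1}

def to_pmf(counts):
    """Return (lowest score, probability array indexed from that score)."""
    lo, hi = min(counts), max(counts)
    pmf = np.zeros(hi - lo + 1)
    for score, n in counts.items():
        pmf[score - lo] = n
    return lo, pmf / pmf.sum()

def group_distribution(holes, n_golfers=3):
    """Convolve the per-hole PMFs: n_golfers independent scores per hole."""
    offset, dist = 0, np.array([1.0])
    for counts in holes:
        lo, pmf = to_pmf(counts)
        for _ in range(n_golfers):
            dist = np.convolve(dist, pmf)  # distribution of a sum
            offset += lo
    return offset, dist

# dist[k] is the probability that a group totals offset + k strokes.
offset, dist = group_distribution([hole_1_counts, hole_2_counts])

# Both groups share the same distribution, so the offset cancels out.
joint = np.outer(dist, dist)        # joint[i, j] = P(G1 = i, G2 = j)
p_g1_sketch = np.triu(joint).sum()  # cells with i <= j, i.e. G1 <= G2
print(offset, len(dist), round(float(p_g1_sketch), 3))
```

Convolution avoids materialising every score combination, which matters little for two holes but grows quickly if more holes or golfers are added.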
import numpy as np
import pandas as pd
t=sum(hole_1.values())
for i in hole_1:
    hole_1[i] = hole_1[i]/t
t=sum(hole_2.values())
for i in hole_2:
    hole_2[i] = hole_2[i]/t
y = np.array([(a,b,c,d,e,f) for a in hole_1.keys() for b in hole_1.keys() for c in hole_1.keys()
for d in hole_2.keys() for e in hole_2.keys() for f in hole_2.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores.columns = golfers
scores['total_strokes'] = scores.sum(axis=1)
x = np.array([(a,b,c,d,e,f) for a in hole_1.values() for b in hole_1.values() for c in hole_1.values()
for d in hole_2.values() for e in hole_2.values() for f in hole_2.values()])
probability = pd.DataFrame(x)
probability.columns = golfers
probability['p'] = probability.product(axis=1)
scores.head()
|   | G11 | G21 | G31 | G12 | G22 | G32 | total_strokes |
|---|-----|-----|-----|-----|-----|-----|---------------|
| 0 | 3 | 3 | 3 | 3 | 3 | 3 | 18 |
| 1 | 3 | 3 | 3 | 3 | 3 | 4 | 19 |
| 2 | 3 | 3 | 3 | 3 | 3 | 5 | 20 |
| 3 | 3 | 3 | 3 | 3 | 3 | 6 | 21 |
| 4 | 3 | 3 | 3 | 3 | 4 | 3 | 19 |
probability.head()
|   | G11 | G21 | G31 | G12 | G22 | G32 | p |
|---|-----|-----|-----|-----|-----|-----|---|
| 0 | 0.194444 | 0.194444 | 0.194444 | 0.15942 | 0.159420 | 0.159420 | 0.000030 |
| 1 | 0.194444 | 0.194444 | 0.194444 | 0.15942 | 0.159420 | 0.681159 | 0.000127 |
| 2 | 0.194444 | 0.194444 | 0.194444 | 0.15942 | 0.159420 | 0.144928 | 0.000027 |
| 3 | 0.194444 | 0.194444 | 0.194444 | 0.15942 | 0.159420 | 0.014493 | 0.000003 |
| 4 | 0.194444 | 0.194444 | 0.194444 | 0.15942 | 0.681159 | 0.159420 | 0.000127 |
holes1_2 = {}
for score in set(scores['total_strokes']):
    holes1_2[score] = probability['p'][scores['total_strokes']==score].sum()
holes1_2
{32: 2.754712707008533e-07,
33: 5.945423108651507e-09,
18: 2.978635772907547e-05,
19: 0.000694563705227987,
20: 0.006873659215583314,
21: 0.0374549401667416,
22: 0.12190249186154868,
23: 0.24075749956727904,
24: 0.2828410099610215,
25: 0.19387930907707826,
26: 0.0844443027779792,
27: 0.02501146224727144,
28: 0.00523549293136384,
29: 0.0007854214409735436,
30: 8.368763703239537e-05,
31: 6.091636476953164e-06}
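A quick sanity check on the distribution printed above: the probabilities should sum to 1, and by linearity of expectation the mean combined score should equal three times the sum of the two per-hole means. The sketch below copies the printed values by hand (sorted by score); `holes1_2_printed`, `mean_from_dist` and `expected_mean` are names introduced here for illustration.

```python
import math

# Values copied from the holes1_2 output above, sorted by score.
holes1_2_printed = {
    18: 2.978635772907547e-05, 19: 0.000694563705227987,
    20: 0.006873659215583314, 21: 0.0374549401667416,
    22: 0.12190249186154868, 23: 0.24075749956727904,
    24: 0.2828410099610215, 25: 0.19387930907707826,
    26: 0.0844443027779792, 27: 0.02501146224727144,
    28: 0.00523549293136384, 29: 0.0007854214409735436,
    30: 8.368763703239537e-05, 31: 6.091636476953164e-06,
    32: 2.754712707008533e-07, 33: 5.945423108651507e-09,
}

total = sum(holes1_2_printed.values())
mean_from_dist = sum(s * p for s, p in holes1_2_printed.items())

# Per-hole means from the observed counts (72 scores on hole 1, 69 on hole 2).
mean_h1 = (3 * 14 + 4 * 49 + 5 * 9) / 72
mean_h2 = (3 * 11 + 4 * 47 + 5 * 10 + 6 * 1) / 69
expected_mean = 3 * (mean_h1 + mean_h2)  # 3 golfers on each hole

print("sum of probabilities:", total)
print("mean from distribution:", mean_from_dist)
print("mean from hole averages:", expected_mean)
print("consistent:", math.isclose(mean_from_dist, expected_mean, abs_tol=1e-9))
```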
golfers = ['group1','group2']
y = np.array([(a,b) for a in holes1_2.keys() for b in holes1_2.keys()])
y = y.astype(int)
_2groups_scores = pd.DataFrame(y)
_2groups_scores.columns = golfers
_2groups_scores['G1_lower'] = _2groups_scores['group1'] <= _2groups_scores['group2']
x = np.array([(a,b) for a in holes1_2.values() for b in holes1_2.values()])
_2groups_p = pd.DataFrame(x)
_2groups_p.columns = golfers
_2groups_p['p'] = _2groups_p.product(axis=1)
_2groups_scores.head()
|   | group1 | group2 | G1_lower |
|---|--------|--------|----------|
| 0 | 32 | 32 | True |
| 1 | 32 | 33 | True |
| 2 | 32 | 18 | False |
| 3 | 32 | 19 | False |
| 4 | 32 | 20 | False |
_2groups_p.head()
|   | group1 | group2 | p |
|---|--------|--------|---|
| 0 | 2.754713e-07 | 2.754713e-07 | 7.588442e-14 |
| 1 | 2.754713e-07 | 5.945423e-09 | 1.637793e-15 |
| 2 | 2.754713e-07 | 2.978636e-05 | 8.205286e-12 |
| 3 | 2.754713e-07 | 6.945637e-04 | 1.913323e-10 |
| 4 | 2.754713e-07 | 6.873659e-03 | 1.893496e-09 |
p_g1 = _2groups_p['p'][_2groups_scores['G1_lower']==True].sum()
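Because both groups are modelled with the same distribution (holes1_2) and are treated as independent, p_g1 can also be cross-checked with a symmetry argument. The symbols $G_1$, $G_2$ (the two groups' combined scores) and $p_s$ (the value of holes1_2 at score $s$) are notation introduced here, not part of the code:

$$
P(G_1 \le G_2) = P(G_1 < G_2) + P(G_1 = G_2) = \frac{1 - P(G_1 = G_2)}{2} + P(G_1 = G_2) = \frac{1 + P(G_1 = G_2)}{2},
\qquad P(G_1 = G_2) = \sum_s p_s^2
$$

The middle step uses $P(G_1 < G_2) = P(G_1 > G_2)$, which holds only under the identical-distribution assumption. Under this model the result can never fall below 0.5, and the amount by which it exceeds 0.5 is half the tie probability.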
Solution
print("The probability that Marc Leishman, Branden Grace and Sungjae Im's combined score on holes 1 and 2 is lower than or equal to Kevin Na, Bubba Watson and Scott Stallings' combined score is ~%s" % round(p_g1,3))
The probability that Marc Leishman, Branden Grace and Sungjae Im's combined score on holes 1 and 2 is lower than or equal to Kevin Na, Bubba Watson and Scott Stallings' combined score is ~0.6
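The exact enumeration can be cross-checked with a small Monte Carlo simulation. This is a sketch under the same assumption as the method above (every golfer's score on a hole is an independent draw from the observed field distribution). The seed, the number of simulations and the helper `simulate_group_totals` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n_sims = 200_000                # arbitrary; more draws give less noise

# Observed scores and their relative frequencies on each hole.
h1_scores, h1_p = np.array([3, 4, 5]), np.array([14, 49, 9]) / 72
h2_scores, h2_p = np.array([3, 4, 5, 6]), np.array([11, 47, 10, 1]) / 69

def simulate_group_totals(n):
    """Total strokes for one group of 3 golfers over holes 1 and 2."""
    h1 = rng.choice(h1_scores, size=(n, 3), p=h1_p)
    h2 = rng.choice(h2_scores, size=(n, 3), p=h2_p)
    return h1.sum(axis=1) + h2.sum(axis=1)

g1 = simulate_group_totals(n_sims)
g2 = simulate_group_totals(n_sims)

# Share of simulations where group 1 scores lower than or ties group 2.
p_sim = float(np.mean(g1 <= g2))
print("simulated P(group 1 <= group 2):", round(p_sim, 3))
```

The simulated value should land close to the enumerated p_g1, differing only by sampling noise.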
Info
email: krellabsinc@gmail.com
twitter: @KRELLabs
import sys
print(sys.version)
3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Posted on 9/12/2019