10:42AM
D. Schwartzman v. R. Gasquet: 1st Set Aces or Tie
J. Furyk + P. Mickelson: Score on Hole 2
2019 Aces Per Service Games Played
Schwartzman
Gasquet
Medinah CC Course Stats
schwartzman_servwon = .73
gasquet_servwon = .79
Schwartzman_ACEperSERVE = 95/559
Gasquet_ACEperSERVE = 129/260
hole_2 = {'2': 50,
          '3': 313,
          '4': 44,
          '5': 1}
print("The ratio of aces per service game for Schwartzman is %s" % round(Schwartzman_ACEperSERVE,3))
print("The ratio of aces per service game for Gasquet is %s" % round(Gasquet_ACEperSERVE,3))
The ratio of aces per service game for Schwartzman is 0.17
The ratio of aces per service game for Gasquet is 0.496
Method To Solve
- [1] Use Monte Carlo simulation to play 9,999 sets between Schwartzman and Gasquet, using each player's service-games-won percentage to estimate how many service games each player gets and, from that, how many aces are hit.
- [2] Build the distribution of aces observed in one simulated set between the two players (aces_dist).
- [3] Enumerate every combination of hole-2 scores for the two golfers, with its probability (hole_2_dist).
- [4] Enumerate every combination of 1st-set aces and combined hole-2 score.
- [5] The probability that Furyk and Mickelson's combined score is greater than the 1st-set aces (p_golf_higher) is the sum of the probabilities of the outcomes where that holds.
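Steps [3] to [5] can be written as a single sum. This is only a restatement under the assumption that the golf scores and the ace count are independent; the symbols $G$ (combined hole-2 score), $A$ (1st-set aces), and $S_1, S_2$ (each golfer's hole-2 score) are notation introduced here, not in the original code.

```latex
P(G = g) = \sum_{s_1 + s_2 = g} P(S_1 = s_1)\, P(S_2 = s_2),
\qquad
P(G > A) = \sum_{g} \sum_{a} P(G = g)\, P(A = a)\, \mathbf{1}[\, g > a \,]
```

Both golfers draw from the same course-wide hole-2 distribution, and $P(A = a)$ is the empirical frequency from the simulated sets.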
import numpy as np
import pandas as pd
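def sim_game(server,player1,player2):
    # Returns '1' if player1 wins the game, '0' if player2 wins it.
    # player1 / player2 are each player's probability of holding serve.
    if server:
        # player1 is serving
        serve = np.random.choice([1,0],1,p=[player1,1-player1])
        if serve == 1:
            ans = '1'
        else:
            ans = '0'
    else:
        # player2 is serving
        serve = np.random.choice([1,0],1,p=[player2,1-player2])
        if serve == 1:
            ans = '0'
        else:
            ans = '1'
    return ans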
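def set_over(winner):
    # winner is a string with one character per game played ('1' or '0').
    set_over = False
    # 7 games won (7-5)
    if winner.count('1') >= 7:
        set_over = True
    if winner.count('0') >= 7:
        set_over = True
    # 6 games won with the opponent on 4 or fewer
    if winner.count('1') >=6 and winner.count('0') < 5:
        set_over = True
    if winner.count('0') >=6 and winner.count('1') < 5:
        set_over = True
    # 6-6: the simulation stops here and does not play a tiebreak
    if winner.count('0') == 6 and winner.count('1') == 6:
        set_over = True
    return set_over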
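A quick way to sanity-check the stopping rule is to feed it a few hand-written score strings. The sketch below restates the same rule in compact form so that it runs on its own; `set_is_over` is a hypothetical name used only for this check and is not part of the original notebook.

```python
def set_is_over(winner: str) -> bool:
    """Same stopping rule as set_over: one character per game, '1' or '0'."""
    ones, zeros = winner.count('1'), winner.count('0')
    if ones >= 7 or zeros >= 7:          # 7-5
        return True
    if ones >= 6 and zeros < 5:          # 6-0 through 6-4
        return True
    if zeros >= 6 and ones < 5:
        return True
    return ones == 6 and zeros == 6      # 6-6, no tiebreak game is played


checks = {
    '111111': set_is_over('111111'),                  # 6-0
    '6-5': set_is_over('1' * 6 + '0' * 5),            # 6-5, play continues
    '6-6': set_is_over('1' * 6 + '0' * 6),            # 6-6, set stops
    '7-5': set_is_over('1' * 7 + '0' * 5),            # 7-5
}
print(checks)
```

The 6-5 case is the one worth watching: the set must continue there, which is why the rule compares the loser's count against 5 rather than 6.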
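def sim_set(player1,player2):
    winner = ''       # one character per game: '1' = player1 won, '0' = player2 won
    server = True     # True: player1 serves; the ace draw below uses Schwartzman's rate
    set_over_ = False
    aces = 0
    while not set_over_:
        game = sim_game(server,player1,player2)
        winner += str(game)
        # At most one ace is drawn per service game, at the server's ace rate.
        if server:
            a = np.random.choice([1,0],1,p=[Schwartzman_ACEperSERVE,1-Schwartzman_ACEperSERVE])
        else:
            a = np.random.choice([1,0],1,p=[Gasquet_ACEperSERVE,1-Gasquet_ACEperSERVE])
        aces += int(a[0])   # a is a length-1 array; keep aces a plain int
        server = not server
        set_over_ = set_over(winner)
    return aces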
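Because the simulation above is unseeded, each run produces slightly different numbers. The following is a minimal sketch of a seeded variant, assuming the same model (one possible ace per service game, set stops at 6-6); `simulate_set_aces` and its parameter names are hypothetical, and it uses NumPy's `default_rng` generator rather than the `np.random.choice` calls in the original.

```python
import numpy as np


def simulate_set_aces(p_hold_1, p_hold_2, ace_rate_1, ace_rate_2, rng):
    """Simulate one set and return the total aces; player 1 serves first."""
    games_1 = games_2 = aces = 0
    player_1_serving = True
    while True:
        hold = p_hold_1 if player_1_serving else p_hold_2
        ace_rate = ace_rate_1 if player_1_serving else ace_rate_2
        server_holds = rng.random() < hold
        # Player 1 wins the game by holding serve or by breaking player 2.
        if server_holds == player_1_serving:
            games_1 += 1
        else:
            games_2 += 1
        aces += rng.random() < ace_rate      # at most one ace per service game
        player_1_serving = not player_1_serving
        hi, lo = max(games_1, games_2), min(games_1, games_2)
        if hi >= 7 or (hi == 6 and lo < 5) or (hi == 6 and lo == 6):
            return int(aces)


rng = np.random.default_rng(2019)            # arbitrary seed for repeatability
sample = [simulate_set_aces(0.73, 0.79, 95 / 559, 129 / 260, rng)
          for _ in range(9999)]
print(np.bincount(sample) / len(sample))     # empirical P(aces = 0, 1, 2, ...)
```

Passing the generator in explicitly keeps the function free of global state, so two runs with the same seed give the same ace counts.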
iterations = 9999
aces = []
for i in range(iterations):
    # sim_set draws aces at Schwartzman's rate when player1 serves, so Schwartzman is player1
    aces.append(int(sim_set(schwartzman_servwon,gasquet_servwon)))
# Count how often each ace total occurred
aces_dist = {}
for a in aces:
    if a in aces_dist:
        aces_dist[a] += 1
    else:
        aces_dist[a] = 1
# Convert counts to relative frequencies
t=sum(aces_dist.values())
for i in aces_dist:
    aces_dist[i] = aces_dist[i]/t
aces_dist
{6: 0.0502050205020502,
4: 0.21722172217221722,
2: 0.2078207820782078,
3: 0.2611261126112611,
7: 0.0178017801780178,
5: 0.12361236123612361,
1: 0.0962096209620962,
0: 0.0223022302230223,
8: 0.0033003300330033004,
9: 0.00040004000400040005}
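The count-then-normalize step can also be written with `collections.Counter`. This is an equivalent sketch, not the notebook's own code; `empirical_dist` is a hypothetical helper name and the sample list is made up purely for illustration.

```python
from collections import Counter


def empirical_dist(samples):
    """Map each observed value to its relative frequency, sorted by value."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {value: count / total for value, count in sorted(counts.items())}


# Illustrative input only; in the notebook this would be the `aces` list.
example = empirical_dist([3, 4, 3, 2, 3, 5, 4, 3])
print(example)
```

Sorting the keys makes the printed distribution easier to read than the insertion-ordered dictionary shown above.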
# Convert the hole-2 score counts to probabilities
t=sum(hole_2.values())
for i in hole_2:
    hole_2[i] = hole_2[i]/t
# Every pairing of the two golfers' hole-2 scores
y = np.array([(a,b) for a in hole_2.keys() for b in hole_2.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores['total_strokes'] = scores.sum(axis=1)
scores.head()
|   | 0 | 1 | total_strokes |
|---|---|---|---|
| 0 | 2 | 2 | 4 |
| 1 | 2 | 3 | 5 |
| 2 | 2 | 4 | 6 |
| 3 | 2 | 5 | 7 |
| 4 | 3 | 2 | 5 |
z = np.array([(a,b) for a in hole_2.values() for b in hole_2.values()])
probability = pd.DataFrame(z)
probability['p'] = probability.product(axis=1)
probability.head()
|   | 0 | 1 | p |
|---|---|---|---|
| 0 | 0.122549 | 0.122549 | 0.015018 |
| 1 | 0.122549 | 0.767157 | 0.094014 |
| 2 | 0.122549 | 0.107843 | 0.013216 |
| 3 | 0.122549 | 0.002451 | 0.000300 |
| 4 | 0.767157 | 0.122549 | 0.094014 |
# Sum the pair probabilities for each combined stroke total
hole_2_dist = {}
for s in set(scores['total_strokes']):
    hole_2_dist[s] = probability['p'][scores['total_strokes']==s].sum()
hole_2_dist
{4: 0.01501826220684352,
5: 0.1880286428296809,
6: 0.6149617935409457,
7: 0.16606593617839296,
8: 0.015390715109573244,
9: 0.000528642829680892,
10: 6.007304882737409e-06}
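Since the combined score is the sum of two independent draws from the same distribution, the same result can be cross-checked with a discrete convolution. The sketch below assumes the hole-2 scores are consecutive integers (2 through 5, as in the counts above); `hole_2_counts` and `hole_2_dist_check` are names introduced here for the check.

```python
import numpy as np

# Score counts on hole 2, copied from the course stats above.
hole_2_counts = {2: 50, 3: 313, 4: 44, 5: 1}

scores = sorted(hole_2_counts)               # must be consecutive integers
p = np.array([hole_2_counts[s] for s in scores], dtype=float)
p /= p.sum()                                 # single-golfer probabilities

pair = np.convolve(p, p)                     # distribution of the two-golfer sum
totals = range(2 * scores[0], 2 * scores[-1] + 1)
hole_2_dist_check = dict(zip(totals, pair))

for total, prob in hole_2_dist_check.items():
    print(total, prob)
```

`np.convolve` aligns index 0 with the smallest possible total (2 + 2 = 4), which is why the keys start at twice the lowest score.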
y = np.array([(a,b) for a in hole_2_dist.keys() for b in aces_dist.keys()])
y = y.astype(int)
scores = pd.DataFrame(y)
scores['golfers_higher'] = scores[0] > scores[1]
scores.head()
|   | 0 | 1 | golfers_higher |
|---|---|---|---|
| 0 | 4 | 6 | False |
| 1 | 4 | 4 | False |
| 2 | 4 | 2 | True |
| 3 | 4 | 3 | True |
| 4 | 4 | 7 | False |
z = np.array([(a,b) for a in hole_2_dist.values() for b in aces_dist.values()])
probability = pd.DataFrame(z)
probability['p'] = probability.product(axis=1)
probability.head()
|   | 0 | 1 | p |
|---|---|---|---|
| 0 | 0.015018 | 0.050205 | 0.000754 |
| 1 | 0.015018 | 0.217222 | 0.003262 |
| 2 | 0.015018 | 0.207821 | 0.003121 |
| 3 | 0.015018 | 0.261126 | 0.003922 |
| 4 | 0.015018 | 0.017802 | 0.000267 |
p_golf_higher = probability['p'][scores['golfers_higher']==True].sum()
Solution
print("The probability Furyk and Mickelson's score is greater than the 1st Set Aces is ~%s" % round(p_golf_higher,3))
The probability Furyk and Mickelson's score is greater than the 1st Set Aces is ~0.909
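The final figure can be reproduced without pandas by summing over the two printed distributions directly. This is a cross-check sketch that copies the `aces_dist` and `hole_2_dist` values shown above and assumes the two are independent; `p_check` is a name introduced here.

```python
# Distributions copied from the notebook output above.
aces_dist = {
    0: 0.0223022302230223, 1: 0.0962096209620962, 2: 0.2078207820782078,
    3: 0.2611261126112611, 4: 0.21722172217221722, 5: 0.12361236123612361,
    6: 0.0502050205020502, 7: 0.0178017801780178, 8: 0.0033003300330033004,
    9: 0.00040004000400040005,
}
hole_2_dist = {
    4: 0.01501826220684352, 5: 0.1880286428296809, 6: 0.6149617935409457,
    7: 0.16606593617839296, 8: 0.015390715109573244, 9: 0.000528642829680892,
    10: 6.007304882737409e-06,
}

# Sum P(G = g) * P(A = a) over every pair where the golf total is strictly higher.
p_check = sum(
    p_golf * p_aces
    for golf_total, p_golf in hole_2_dist.items()
    for ace_total, p_aces in aces_dist.items()
    if golf_total > ace_total
)
print(round(p_check, 3))
```

The comparison is strict because a tie goes to the aces side of the bet, matching `scores[0] > scores[1]` in the notebook.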
Info
email: krellabsinc@gmail.com
twitter: @KRELLabs
import sys
print(sys.version)
3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)]
Posted on 8/15/2019