Pythagorean Win Expectancy
A few days ago, Tim wrote up his midseason power rankings, which got me thinking about how the teams would finish. Being the stats nerd that I am, I started working on Pythagorean wins for lacrosse.
When most people see Pythagorean, they think of the Pythagorean Theorem (a² + b² = c²). Well, this is sort of the same thing, in that the formula is built from numbers raised to a power. When applying this to sports, we look at the scoring ratio of each team, which is found by simply dividing goals for by goals against. Once the scoring ratio (R) is found, we can start to find which exponent best fits our model.
I collected the goals for, goals against, and actual win percentage for each team in the MLL from 2004-2010 (54 team seasons). Once this was entered into Excel, I found R and then the mathemagic began. Using the formula =(R^(EXP))/(R^(EXP)+1), I started to calculate Pythagorean wins. By plugging in exponents for EXP, starting at 1 and increasing in increments of 0.1, I get a winning percentage based on the team's scoring ratio.
For example, the 2010 Chesapeake Bayhawks had 179 goals for and 166 goals against with an actual win percentage of 0.500 (6-6). The scoring ratio for the Bayhawks would be R=179/166≈1.08. So, starting with 1 as the exponent, we have PythWin=(1.08^1)/(1.08^1+1)=0.519. Our next step is to find the absolute difference between actual win percentage (Win) and PythWin (=abs(Win - PythWin)). This gives AbsDiff=|0.500-0.519|=0.019.
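To double-check the Bayhawks arithmetic outside of Excel, here is a minimal Python sketch of the same formula. The function name `pyth_win` and its parameter names are my own labels for illustration, not anything from the spreadsheet.

```python
def pyth_win(goals_for, goals_against, exp):
    """Pythagorean win percentage: R^exp / (R^exp + 1), where R = GF / GA."""
    r = goals_for / goals_against
    return r**exp / (r**exp + 1)

# 2010 Chesapeake Bayhawks: 179 goals for, 166 against, actual record 6-6
actual = 0.500
expected = pyth_win(179, 166, exp=1)
abs_diff = abs(actual - expected)

print(round(expected, 3))  # 0.519
print(round(abs_diff, 3))  # 0.019
```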
This method is then applied to every team with an exponent of 1. Once all of the absolute differences are found, I take their average to get the mean absolute difference (MAD) for that exponent. For an exponent of 1, MAD=0.122, meaning that PythWin misses the actual win percentage by about 12 percentage points on average. Next, I increase the exponent by 0.1 (to 1.1) and do everything again. As the exponent increases, MAD decreases for a while and then starts increasing again. The exponent where MAD is smallest is the one to use back in the original formula. When EXP=3.7, MAD=0.074, which means that PythWin is only off by about 7 percentage points on average, a lot better than 12.
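The exponent search can be sketched in a few lines of Python. This is only an illustration under some assumptions: the names `mean_abs_diff` and `best_exponent` are hypothetical, the upper bound of 6.0 is an arbitrary cutoff I picked, and instead of stopping when MAD starts to rise, the sketch simply takes the smallest MAD over the whole grid. The sample rows are synthetic placeholders generated from a known exponent of 2.5 to show that the search recovers it; they are not the MLL data.

```python
def pyth_win(r, exp):
    """Pythagorean win percentage for scoring ratio r."""
    return r**exp / (r**exp + 1)

def mean_abs_diff(seasons, exp):
    """seasons: list of (goals_for, goals_against, actual_win_pct) tuples."""
    diffs = [abs(win - pyth_win(gf / ga, exp)) for gf, ga, win in seasons]
    return sum(diffs) / len(diffs)

def best_exponent(seasons, start=1.0, stop=6.0, step=0.1):
    """Try exponents start, start+step, ..., stop and return the lowest-MAD one."""
    n_steps = int(round((stop - start) / step))
    # Build the grid from integer step counts to avoid floating-point drift
    grid = [round(start + i * step, 10) for i in range(n_steps + 1)]
    return min(grid, key=lambda e: mean_abs_diff(seasons, e))

# Synthetic placeholder seasons (NOT real MLL data): win percentages are
# generated from exponent 2.5, so the search should land on 2.5.
placeholder = [(gf, ga, pyth_win(gf / ga, 2.5))
               for gf, ga in [(150, 120), (130, 140), (160, 155), (110, 150)]]

print(best_exponent(placeholder))  # 2.5
```

Swapping `placeholder` for the real 54 team seasons would be the way to reproduce the 3.7 result.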
Now that I have found the equation, we can start looking at how the teams will finish if they keep on at their current pace.
| Team | Goals For | Goals Against | Ratio | PythWin | Projected Record |
| --- | --- | --- | --- | --- | --- |
| Denver Outlaws | 70 | 51 | 1.373 | 0.763 | 9-3 |
| Boston Cannons | 75 | 56 | 1.339 | 0.747 | 9-3 |
| Chesapeake Bayhawks | 64 | 64 | 1.000 | 0.500 | 6-6 |
| Long Island Lizards | 52 | 59 | 0.881 | 0.385 | 5-7 |
| Rochester Rattlers | 44 | 56 | 0.786 | 0.291 | 3-9 |
| Hamilton Nationals | 38 | 57 | 0.667 | 0.182 | 2-10 |
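The table can be reproduced with a short Python sketch. It assumes a 12-game season (which matches the records shown) and that each projected record comes from rounding PythWin times 12 to the nearest whole win; that rounding rule is my guess at how the records were derived, and the `project` helper is a hypothetical name.

```python
EXP = 3.7    # exponent with the lowest MAD, per the post
GAMES = 12   # assumed season length, matching the records in the table

# (team, goals for, goals against) as listed in the table
teams = [
    ("Denver Outlaws", 70, 51),
    ("Boston Cannons", 75, 56),
    ("Chesapeake Bayhawks", 64, 64),
    ("Long Island Lizards", 52, 59),
    ("Rochester Rattlers", 44, 56),
    ("Hamilton Nationals", 38, 57),
]

def project(goals_for, goals_against, exp=EXP, games=GAMES):
    """Return (ratio, pyth_win, wins, losses) for one team."""
    r = goals_for / goals_against
    pct = r**exp / (r**exp + 1)
    wins = round(pct * games)  # assumption: round to the nearest whole win
    return r, pct, wins, games - wins

for name, gf, ga in teams:
    r, pct, wins, losses = project(gf, ga)
    print(f"{name:<20} {r:.3f}  {pct:.3f}  {wins}-{losses}")
```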
Looking at these projections, the results make sense. The Outlaws and Cannons are both very potent teams, while the Rattlers and the Nationals (especially post Grant Jr.) are not.
Note: These records add up to 34-38 rather than 36-36, which is due to the 7% error that I explained earlier. Expect to see some variation from the final record, but not much (unless something happens to stir up the structure/play style of a team).
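For anyone who wants to check the totals, here is a quick Python tally of the projected records from the table. The variable names are just for illustration.

```python
# Projected (wins, losses) for each team, copied from the table
projected = {
    "Denver Outlaws": (9, 3),
    "Boston Cannons": (9, 3),
    "Chesapeake Bayhawks": (6, 6),
    "Long Island Lizards": (5, 7),
    "Rochester Rattlers": (3, 9),
    "Hamilton Nationals": (2, 10),
}

total_wins = sum(w for w, _ in projected.values())
total_losses = sum(l for _, l in projected.values())

print(f"{total_wins}-{total_losses}")  # 34-38
```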