Tuesday, December 4, 2007

Home Cooking in the State of Wisconsin

There’s a quote I’ve always liked from Dale Carnegie: “If you are wrong, admit it quickly and emphatically.”

In a previous post on the MarquetteHoops.com and MUScoop.com boards, I analyzed data on the Wisconsin basketball team with a specific focus on foul disparity. That analysis looked at whether Foul Disparity was significant for UW wins, and to what extent Wisconsin received a “home cooking” advantage when playing at home. My initial conclusion was that Foul Disparity had a statistically significant impact on UW wins, and that UW did indeed have a Foul Disparity advantage at home.

That post was incomplete and some of the conclusions drawn from it were wrong. Saying it emphatically, that post was incomplete and some of the conclusions drawn from it were wrong.

Therefore, I went back to the drawing board and dug through Regression Analysis notes to look at a better approach. I also looked at the data for Marquette at the same time to determine how comparable UW and MU were.

The data:

Data for both MU and UW were collected from espn.com box scores here and here, for seasons going back to 02-03. I pulled the following columns into a spreadsheet:

  • UW/MU Win, UW/MU Fouls, Opponent Fouls, Wisconsin/Marquette Score, Opponent Score, Home/Away, Conference / Non-Conference, Ranked / Unranked Opponent
    • UW/MU Win, Home/Away, Conf/Non-Conf, and Ranked/Unranked were all dummy variables
  • We then calculated additional categories for consideration such as Foul Disparity, Win Margin, and Total Points
  • For Marquette, I looked at data for the full seasons from 02 – present, and then also for the seasons of 02-03, and then 05 – present.
    • For this analysis, I eliminated 03-04 and 04-05 and used only the data for the seasons where Marquette was good, but there was not a significant difference in the results (i.e., the same variables and similar coefficients). If anything, foul disparity should be more significant for these teams.
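The calculated categories above can be sketched in a few lines of Python. This is only an illustration, not the original spreadsheet: the `Game` record, the field names, and the sample box score are all made up, and the sign convention for Foul Disparity (opponent fouls minus team fouls, so a positive number favors the team) is an assumption based on how the results are discussed later in the post.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """One box-score row (hypothetical field names)."""
    team_score: int
    opp_score: int
    team_fouls: int
    opp_fouls: int
    home: bool
    conference: bool
    ranked_opp: bool


def derive(game):
    """Return the dummy variables and calculated categories for one game."""
    return {
        "win": 1 if game.team_score > game.opp_score else 0,
        # Assumed convention: positive means the opponent was whistled more.
        "foul_disparity": game.opp_fouls - game.team_fouls,
        "win_margin": game.team_score - game.opp_score,
        "total_points": game.team_score + game.opp_score,
        "home": int(game.home),
        "conference": int(game.conference),
        "ranked_opp": int(game.ranked_opp),
    }


# Made-up box score, for illustration only.
example = derive(Game(team_score=70, opp_score=62, team_fouls=15,
                      opp_fouls=21, home=True, conference=False,
                      ranked_opp=False))
print(example)
```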

Question #1 – Which variables are statistically significant for a UW or MU win?

Analysis (ignore if you don’t really care)

We started with a regression of a UW/MU win against the variables of UW/MU PF, Opponent PF, UW/MU Score, Foul Disparity, Conf/Non-Conf, Ranked/Unranked, Home/Away. UW/MU Score was included to see if there was some benefit for either team in keeping the score at a low value. Opponent score was not included because all initial regressions with that variable resulted in an equation of “team wins when they score more points than the other team”.

We then looked at the t-stats and p-values to determine which variables were potentially not statistically significant. A partial F-test was run on the suspect variables, and we dropped those that weren’t statistically significant. A second F-test was used to reject the null hypothesis that all coefficients are zero. All the remaining variables have p-values below the significance cutoff. The residuals were plotted against each variable to check that the data were linear and homoskedastic, and the residuals were confirmed as normal.
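For readers who want to see what the partial F-test step looks like, here is a minimal sketch. The function name and the numbers are made up for illustration (they are not the values from this analysis); the formula itself is the standard comparison of a reduced model against the full model using their residual sums of squares.

```python
def partial_f(rss_reduced, rss_full, num_dropped, df_resid_full):
    """Partial F statistic for dropping `num_dropped` variables.

    rss_reduced   -- residual sum of squares without the suspect variables
    rss_full      -- residual sum of squares with every variable included
    df_resid_full -- residual degrees of freedom of the full model
    """
    numerator = (rss_reduced - rss_full) / num_dropped
    denominator = rss_full / df_resid_full
    return numerator / denominator


# Made-up numbers: dropping 3 variables raises the RSS from 20.0 to 21.5,
# and the full model has 150 residual degrees of freedom.
f_stat = partial_f(rss_reduced=21.5, rss_full=20.0,
                   num_dropped=3, df_resid_full=150)
print(round(f_stat, 2))
```

The statistic is then compared against the critical value of an F distribution with (`num_dropped`, `df_resid_full`) degrees of freedom. A small value means the dropped variables added little, so they can be ruled out.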

Regression Equations

  • UW Win = -0.02 (UW PF) + 0.01 (UW Score) + 0.02 (Foul Disparity) + 0.22 (Home/Away)

  • MU Win = 0.01 (MU Score) + 0.04 (Foul Disparity)
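To make the equations concrete, the sketch below plugs a hypothetical game into each one. The function names and the game values are assumptions for illustration. The coefficients are copied from the equations above, and Home/Away is assumed to be coded 1 for home and 0 for away.

```python
def uw_win_index(uw_pf, uw_score, foul_disparity, home):
    """Fitted value of the UW Win equation (home: 1 = home, 0 = away)."""
    return (-0.02 * uw_pf + 0.01 * uw_score
            + 0.02 * foul_disparity + 0.22 * home)


def mu_win_index(mu_score, foul_disparity):
    """Fitted value of the MU Win equation."""
    return 0.01 * mu_score + 0.04 * foul_disparity


# Hypothetical game: 14 fouls, 70 points, foul disparity of +4.
home_index = uw_win_index(uw_pf=14, uw_score=70, foul_disparity=4, home=1)
road_index = uw_win_index(uw_pf=14, uw_score=70, foul_disparity=4, home=0)
mu_index = mu_win_index(mu_score=75, foul_disparity=4)

print(round(home_index, 2), round(road_index, 2), round(mu_index, 2))
```

Because Win is a 0/1 dummy, values closer to 1 look more like a win. The same UW game scores 0.22 higher at home than on the road, and each extra foul of disparity is worth 0.04 for MU versus 0.02 for UW.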

Well, well. Silly me! After all that effort to prove how important the foul disparity was for Wisconsin, I end up with a view that says Foul Disparity is twice as important for Marquette! Not only that, but there isn’t an appreciable benefit for MU from playing at home. In addition, there is no benefit for either team in keeping the score low. Better said, both teams benefit when they score more points (duh). Also, for Wisconsin, total PFs are significant, meaning that UW benefits when its total number of fouls is low. Last, but not least, Wisconsin gets a big benefit from playing at home. Please remember that each coefficient is interpreted holding all other variables constant.

Okay, Foul Disparity is statistically significant for UW as well as for MU. That leads us to…

Question #2 - What are the variables that contribute to Foul Disparity?

Analysis (Again, ignore if you don’t really care. I mostly leave this in to make it look like I know what I’m doing.)

We started with a regression of UW/MU Foul Disparity against the variables of UW/MU Score, Win Margin, Conf/Non-Conf, Ranked/Unranked, Home/Away. We then looked at the t-stats and p-values to determine which variables were potentially not statistically significant. A partial F-test was run on the suspect variables, and we dropped those that weren’t statistically significant. A second F-test was used to reject the null hypothesis that all coefficients are zero. All the remaining variables have p-values below the significance cutoff. The residuals were plotted against each variable to check that the data were linear and homoskedastic, and the residuals were confirmed as normal.

Regression Equations

  • UW Foul Disparity = 4.7 – 0.6 (UW PF) + 0.1 (UW Score) + 1.6 (Non-Conference) + 2.4 (Home)
    • Intercept Std Error = 2.1 ; Non-Conf Std Error = 0.6 ; Home Std Error = 0.7

  • MU Foul Disparity = -3.0 + 2.7 (Home) + 0.14 (Win Margin)
    • Intercept Std Error = 0.7 ; Home Std Error = 1.1
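As a rough illustration, the two equations can be written as small functions and fed hypothetical games. The function names and inputs are made up; the coefficients are the ones printed above, with Non-Conference and Home assumed to be 1/0 dummies.

```python
def uw_foul_disparity(uw_pf, uw_score, non_conference, home):
    """Fitted UW Foul Disparity (dummies: 1 = yes, 0 = no)."""
    return (4.7 - 0.6 * uw_pf + 0.1 * uw_score
            + 1.6 * non_conference + 2.4 * home)


def mu_foul_disparity(home, win_margin):
    """Fitted MU Foul Disparity (home: 1 = home, 0 = away)."""
    return -3.0 + 2.7 * home + 0.14 * win_margin


# Hypothetical games, for illustration only.
uw_home_nonconf = uw_foul_disparity(uw_pf=15, uw_score=70,
                                    non_conference=1, home=1)
mu_home_win = mu_foul_disparity(home=1, win_margin=10)
mu_road_loss = mu_foul_disparity(home=0, win_margin=-10)

print(round(uw_home_nonconf, 1), round(mu_home_win, 1),
      round(mu_road_loss, 1))
```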

What does this mean?

#1 - Starting off by looking at the intercepts, we can interpret this as the “style” difference for each team. Holding every other variable constant, UW would generate a foul disparity of 4.7 and Marquette would be in the hole by three fouls. Also, there’s a fairly wide standard error for Wisconsin, so that disparity based on style could be as low as 2.5 or as high as 6.8 fouls! UW’s defense is not one that is geared to commit a lot of fouls, whereas the aggressive nature of MU’s defense means that we’ll commit more fouls. Sadly, MU fans, when Wisconsin fans say that it’s the style of defense that causes fewer fouls, they’re right.

#2 – Holding every other variable constant, each team gets about the same benefit by playing at home. The coefficient for UW is 2.4 and the coefficient for MU is 2.7. What causes this benefit? Pomeroy (hat tip: Pardner) believes that the advantage a team gets at home is largely a result of the familiarity of one’s surroundings. For Marquette, playing at home somewhat neutralizes our style. For UW, playing at home adds a further benefit in Foul Disparity on top of their style. So much for my “Wisconsin has made a deal with the devil to get a better foul disparity” theory…

#3 - UW gets an additional benefit when they play Non-Conference teams on the order of 1-2 additional fouls. There is no such benefit for Marquette.

#4 – Marquette’s equation includes a Win Margin coefficient, which matches what one would normally expect. In other words, a team that wins by a large amount will tend to have a higher foul disparity. Conversely, if MU loses by a large amount, they also are in the hole on fouls. UW does not have this same trend, however…

#5 – UW’s foul disparity is additionally tracked through the number of points that they score and the total number of fouls they commit. UW loses their style advantage when they commit 8 or more fouls in a game. While this may sound intuitive, the same variables are not significant for MU. Also, as previously mentioned, the total number of UW fouls is statistically significant for a UW win. Therefore, I believe that the UW score and UW PF variables are tracking the same type of phenomenon as Win Margin does for Marquette.
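One way to arrive at the 8-foul figure, assuming it comes from setting the style intercept against the UW PF term alone (with the score and dummy terms left at zero), is:

```latex
4.7 - 0.6\,\mathrm{PF}_{\mathrm{UW}} = 0
\quad\Longrightarrow\quad
\mathrm{PF}_{\mathrm{UW}} = \frac{4.7}{0.6} \approx 7.8
```

So under that assumption the intercept is used up at roughly 8 fouls.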

Conclusion

Both MU and UW rely on foul disparity to some extent to help produce wins. For Marquette it is one of only two significant variables, while UW also gets a home court advantage for wins that MU does not receive.

As much as I’d like to say that UW gets some level of unfair home court foul advantage, it just doesn’t seem to be true. I’m also not that good at manipulating analysis (yet) to prove it’s true. Both MU and UW get a home court advantage, and the key differences appear to be the style of defense (intercept) and the nature of the opponent (non-conference). This may explain why UW ends up with a Foul Disparity of 10 when we play at the Kohl Center. Take a style (4.7), add home court advantage (2.4) and a non-conference opponent (1.6), and that Foul Disparity is already around 9, not far from 10.*

A lot of people are saying that Marquette needs to keep the fouls down in order to win at Wisconsin. Statistical trends would indicate that this is somewhat unlikely. I would argue that Marquette needs to assume that they will get a lot of fouls, as well as more fouls than UW, but then just do everything possible to force Bucky to commit fouls. I haven’t analyzed it, but my intuition is that a faster pace will result in more fouls for each team. More UW PFs are bad for UW, so let’s work the pace. Obviously, lots of attacks during transition, remaining in control when going towards the hoop, and staying smart with open shots will also go a long way. Of course, now I’ve just said that the key to victory is to force the tempo, making me no better than a talking head. Guh. I’ll stick to the math next time.

*Yes, I know that the regression equation assumes all other variables remain constant. I’m just using it to illustrate the point that a 10 foul disparity is not a statistical anomaly.

2 comments:

Anonymous said...

ahhh... I appreciate your effort but your regression analysis technique needs a little refinement.

Because the dependent variable is a binary variable (1 = win / 0 = loss) you need to use a discrete variable estimator for this analysis (e.g., a logit or probit estimator).

When using such an estimator the estimated coefficients indicate the marginal increase in the PROBABILITY of a Wisconsin / Marquette win.

In addition, it is not good form to omit estimated coefficients and p-values. All estimated coefficients, no matter their statistical significance, should be presented to the reader.

If you want to talk about this more drop me a line at nels1069@umn.edu.

-Erik

Henry Sugar said...

Great feedback! Thank you very much and we will contact you offline.
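For anyone curious what the logit approach suggested in the first comment might look like, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy games are made up (they are not the UW or MU data), the fit uses simple gradient ascent rather than a statistics package, and the only predictors are foul disparity and a home dummy.

```python
import math


def sigmoid(z):
    """Map a linear index to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(rows, wins, steps=5000, lr=0.05):
    """Fit a logit model by gradient ascent on the log-likelihood.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    beta = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, wins):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = y - p
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta


def win_probability(beta, x):
    """Predicted probability of a win for one game."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


# Made-up games: (foul_disparity, home) and whether the team won.
games = [(6, 1), (4, 1), (-2, 1), (3, 0), (-5, 0), (-1, 0), (2, 1),
         (-4, 1), (5, 0), (-3, 0), (1, 0), (-1, 1), (3, 1), (-2, 0)]
wins = [1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1]

beta = fit_logit(games, wins)
p_home = win_probability(beta, (4, 1))
p_road = win_probability(beta, (-4, 0))

# Marginal effect of one more foul of disparity at the (4, home) game:
# for a logit this is coefficient * p * (1 - p), so it varies by game.
marginal = beta[1] * p_home * (1 - p_home)
print(beta, p_home, p_road, marginal)
```

Unlike the linear equations in the post, the fitted values here always stay between 0 and 1, and the effect of one extra foul on the win probability depends on where the game already sits.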