Introducing CBCR2

Chapel Bell Curve R2 (CBCR2) is my very first predictive sports analytics model and will be my only model that diverges from R2's philosophy. Unlike other models published on this site, it is unlikely that CBCR2 will provide any significant edge in betting markets (don't worry, we're producing another system for that). However, where other R2 models include non-football variables describing things such as climatic differences, days of rest, travel distances, time zone differences, etc., CBCR2 includes football-related variables only. As such, I believe that CBCR2 is important empirically, as it may better answer the question: which team is best? Also, it is fun!

CBCR2 was developed over the course of last season in the Discord chats hosted by the Chapel Bell Curve podcast, whose team deserves a significant amount of credit for its development. It is built using more traditional linear modeling techniques - variables are hand selected based on theory and evidence and tested using traditional diagnostics (for the most part). As such, CBCR2 compares more closely to systems like Brian Fremeau's FEI and Bill Connelly's SP+.

By now, if you're reading this blog, you've probably heard of Bill Connelly's five factors: 1) efficiency, 2) explosiveness, 3) finishing drives, 4) field position, and 5) turnovers. Our modeling includes each of these five factors (and more), but it suggests that this framework should be updated. Specifically, it suggests that football metrics could be simplified into three categories:

Efficiency - How efficiently your team moves the ball down the field and puts itself in position to score. The most important variable in this category is opportunity rate, or the "Eckel ratio." It is simply the number of drives on which a team moves the ball into the red zone divided by its total number of drives. Other relevant variables include success rates and explosiveness.
Finishing Drives - The first category covers how well a team puts itself in position to score, so it makes sense that the next covers how efficiently a team converts those opportunities into points. In this category, the most important variable is points per opportunity.
Home Runs - Lastly, you want to make sure the model captures the ability of a team to make big plays - particularly big scoring plays. This is captured in our modeling by explosiveness and expected points added (variables which also contribute to the other categories).
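
To make the two headline variables concrete, here is a minimal Python sketch of opportunity rate and points per opportunity as they are described above. The Drive record, its field names, and the sample drive chart are all made up for illustration; this is not CBCR2's actual code or data.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    # Hypothetical per-drive record; field names are assumptions.
    reached_red_zone: bool  # did the offense move the ball into the red zone?
    points: int             # points scored on the drive


def opportunity_rate(drives):
    """Share of a team's drives that reached the red zone."""
    if not drives:
        return 0.0
    opportunities = sum(1 for d in drives if d.reached_red_zone)
    return opportunities / len(drives)


def points_per_opportunity(drives):
    """Average points scored on the drives that reached the red zone."""
    opp_drives = [d for d in drives if d.reached_red_zone]
    if not opp_drives:
        return 0.0
    return sum(d.points for d in opp_drives) / len(opp_drives)


# Made-up drive chart for one team in one game.
drives = [
    Drive(True, 7),
    Drive(False, 0),
    Drive(True, 3),
    Drive(False, 7),  # long touchdown from outside the red zone
    Drive(True, 0),
    Drive(False, 0),
]

print(opportunity_rate(drives))        # 3 of 6 drives reached the red zone
print(points_per_opportunity(drives))  # 10 points across 3 opportunities
```

Note that the long touchdown on the fourth drive counts toward neither number, which is part of why a separate "home runs" category is useful.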

I've excluded field position and turnovers from my classification system, but not from my modeling. They are excluded here because the modeling suggests they contribute much less to explaining performance than the others. The reason? They are subsumed by the other categories. Turnovers and field position help (or hurt) a team's efficiency, its ability to finish drives, and its ability to make big plays.

Unlike my other models, CBCR2 creates an index of relative team strength. The advantage of this type of system is its flexibility. You can use it to compare teams and to predict the probability that one team would beat another if they were to face each other at any time. If you're curious about your team's chances to win a conference or national title, this system is quite convenient.
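
As a rough illustration of how an index can be turned into a win probability, here is one common approach, sketched in Python. It assumes the index is on a point scale and that final margins are roughly normally distributed around the rating difference. Both the normal model and the sigma value are assumptions for illustration, not how CBCR2 itself converts ratings to probabilities.

```python
import math


def win_probability(rating_a, rating_b, sigma=16.0):
    """P(team A beats team B) under a normal model of the final margin.

    sigma is a hypothetical standard deviation of game margins around the
    rating difference; it is an assumption, not a CBCR2 parameter.
    """
    diff = rating_a - rating_b
    return 0.5 * (1.0 + math.erf(diff / (sigma * math.sqrt(2.0))))


# Two made-up ratings: a team rated 10 points above average hosts
# (on a neutral field) a team rated 3 points above average.
print(win_probability(10.0, 3.0))
```

Because the function only needs two ratings, the same call works for any hypothetical matchup, which is the flexibility described above.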

The disadvantage of these index-based systems, compared to my machine learning models that include matchup-specific information, is that they require more judgment on the part of the analyst. Index-based systems involve creating a preseason model and an in-season model. The preseason model contains only information known prior to the season, and the in-season model uses this season's performance. As the season goes on, you gradually blend more of the in-season model into your predictions, eventually fading out the preseason model completely. The machine-learning-based models do not require such judgment by the analyst (me).
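
The blending step might look something like the following Python sketch. The linear schedule and the fade_games cutoff of 8 are hypothetical choices made for illustration; picking that schedule is exactly the kind of analyst judgment described above, and the post does not say which schedule CBCR2 uses.

```python
def blend_weight(games_played, fade_games=8):
    """Weight on the in-season model, rising linearly from 0 to 1.

    fade_games is a hypothetical cutoff: the number of games after which
    the preseason model is faded out completely.
    """
    if fade_games <= 0:
        raise ValueError("fade_games must be positive")
    return min(max(games_played, 0) / fade_games, 1.0)


def blended_rating(preseason, in_season, games_played, fade_games=8):
    """Weighted average of the preseason and in-season ratings."""
    w = blend_weight(games_played, fade_games)
    return (1.0 - w) * preseason + w * in_season


# Made-up ratings: preseason says 10.0, in-season performance says 20.0.
for games in (0, 4, 8, 12):
    print(games, blended_rating(10.0, 20.0, games))
```

With these made-up numbers the blended rating starts at the preseason value, sits halfway between the two after four games, and equals the in-season value from game eight onward.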

Finally, R2 has developed its own Simple Rating System (SRS) that we believe is superior to other simple rating systems available. SRS is essentially opponent-adjusted points. It uses a system-of-equations approach to solve for how many points better a team is than an average team. For more information about SRS, see the resources at Sports Reference.
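
For readers who have not seen SRS before, here is a minimal Python sketch of the textbook version: each team's rating equals its average scoring margin plus the average rating of its opponents. This is not R2's own SRS, whose details are not given here. Rather than solving the system of equations directly, the sketch approximates the solution with a damped iteration, a technique swapped in to keep the example short, and the three-game schedule is made up.

```python
def simple_ratings(games, iterations=200):
    """Textbook SRS: rating = average margin + average opponent rating.

    games is a list of (team_a, team_b, margin) tuples, where margin is
    team_a's points minus team_b's points. Assumes a connected schedule.
    """
    margins = {}
    opponents = {}
    for a, b, margin in games:
        margins.setdefault(a, []).append(margin)
        margins.setdefault(b, []).append(-margin)
        opponents.setdefault(a, []).append(b)
        opponents.setdefault(b, []).append(a)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = {t: 0.0 for t in margins}

    for _ in range(iterations):
        updated = {}
        for team in ratings:
            opps = opponents[team]
            schedule_strength = sum(ratings[o] for o in opps) / len(opps)
            target = avg_margin[team] + schedule_strength
            # Move halfway toward the target; the damping keeps the
            # iteration from oscillating on unusual schedules.
            updated[team] = 0.5 * ratings[team] + 0.5 * target
        # Re-center so the average team is rated exactly zero.
        mean = sum(updated.values()) / len(updated)
        ratings = {t: r - mean for t, r in updated.items()}
    return ratings


# Made-up results: A beat B by 7, B beat C by 3, A beat C by 10.
games = [("A", "B", 7), ("B", "C", 3), ("A", "C", 10)]
for team, rating in sorted(simple_ratings(games).items()):
    print(team, round(rating, 2))
```

In this toy schedule the results are perfectly consistent, so the rating gaps reproduce the game margins: A ends up 7 points above B and 10 above C.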

Big thanks to the whole team that helped me build this system along the way. They are the @ChapelBellCurve team, Nathan Lawrence (@nathanjlawrence), Justin Bray (@TheJustinBray), Dr. Stephen Joiner (@StephenJoiner), Dr. Stephen Chaudoin (stephenchaudoin.com), and Ryan Moore (@ryanmre).
