In Search of a National Champion
By Richard Billingsley
The sophisticated game of college football we know today scarcely resembles that first game played in 1869. Those games were brutal by some accounts, and one-dimensional by today’s standards. In the early years teams just lined up and used brute strength to move the ball forward. Today we have complex offensive and defensive schemes that make the mental part of the game just as important as the physical. But the simplicity of the game in the pioneer era of the sport was not without controversy, such as how to determine a national champion. In 1869 there were two games played: Rutgers beat Princeton 6-4, and in a rematch Princeton beat Rutgers 8-0. So, who do you think should have been crowned as the inaugural national champion? As you can see, things are not always as simple as they seem.
The popularity of college football spread widely in the early 1900s. What began in 1869 with two teams grew to 98 major teams by 1920. The NCAA was founded in 1906 to regulate the sport, and points for scores, size of field, penalties, etc. were all standardized by 1912. But the NCAA failed to address the one issue that burned in the hearts and minds of players, alumni, and fans of all ages: the question of “who is #1”. Perhaps if they had addressed it back then we would not have the controversy we have today. Instead we have a plethora of polls and ranking systems which don’t always agree.
The History of Polls
The first widely recognized college football poll did not originate until 1926. It was a mathematical rating system developed by Frank Dickinson, a professor of economics at the University of Illinois. Later, an onslaught of pollsters came onto the scene, all prepared to crown college football’s best. The list was staggering: Deke Houlgate in 1927; Dick Dunkel in 1929; William Boand in 1930; Paul Williamson in 1932; Edward Litkenhous in 1934; and Richard Poling in 1935. Mathematical systems were considered the “norm” for determining national championships in those days. But all of that changed in 1936 when the Associated Press (AP) began publishing a poll voted on by a national board of sportswriters and broadcasters. Because of its national distribution, the AP poll instantly became gospel. The United Press International (UPI) joined the hoopla in 1950 with a poll voted on by coaches. Their theory, I suppose, was that coaches know more about football than writers and broadcasters.
It was bound to happen sooner or later, but it wasn’t until 1954 that the AP and UPI disagreed on who the national champion should be. The AP chose Ohio State and the UPI favored UCLA. Both teams were undefeated, as was Oklahoma. Ever since that fateful day when the two “biggies” couldn’t agree, the controversy of “Who’s Number One” has raged on from the Golden Dome to the Tiger Den, from the Coliseum to the Swamp, from Happy Valley to Death Valley, and everywhere in between. Eventually everyone got in on the action, including the New York Times, Football News, Sports Illustrated, Sporting News, Sears, and McDonald’s. Heck fire, there are more polls than bowls, and God knows we’ve got more than we need of both. Over the years there have been many fine rating systems developed, and with the advent of the internet you may easily access all of them simply by clicking a button. Check out David Wilson’s Library of College Football Polls at:
In the summer of 2008 college football lost a great pollster, historian and wonderful human being. Herman Matthews died on August 22, 2008 at his home in Middlesboro, Kentucky. Herman was part of the BCS from 1999-2001, but politely resigned as the BCS moved towards a no margin of victory status. The Matthews Grid Ratings were a staple in college football from 1966-2007. He appeared regularly in the Football News and later provided rankings for the Scripps-Howard news service. He was a good friend and will be greatly missed.
The Birth of the Billingsley Report
The first thing I want to say is that my ranking system is not “better” than any other computer system in the BCS. It is certainly unique in its design, and I’m proud to say that it is very widely accepted and praised. ESPN used my computer rankings in their “College Football Encyclopedia,” listing the results right next to the AP and Coaches Polls as “official” National Champions, but that doesn’t make the system itself any better than many you will find. I have tons of respect for my BCS counterparts: Jeff Sagarin, Jeff Anderson, Chris Hester, Wes Colley, Kenneth Massey, and Peter Wolfe, who are all light years ahead of my mathematical skills. But my system is not about mathematical algorithms. It’s about rules created to complement a common sense human response to a football game. The BCS computer pollsters come from different perspectives, and we all believe strongly in our positions, but we all have a healthy respect for one another. I would argue for their right to stand up for their position as much as I would my own. I’m proud to call them my friends. We all have stories to tell; this just happens to be mine.
I became an avid poll watcher in 1966 and very quickly became disillusioned with the AP and UPI polls. At times I wondered if the coaches and sportswriters were even voting in response to the same games I watched, seemingly paying no attention to the strength of an opponent. In 1967 I embarked on the path of creating a mathematical formula. I was 16 years old. My goal was to create a ranking that was unbiased, logical, focused on strength of schedule, and fair to teams with regard to head to head competition. Two years later I had a blueprint in hand, the core of which I still use today. My first ranking was released in 1970. I created what I often refer to as “an improved AP” in the sense that my rankings react like human voters, moving teams up and down based on the most recent week’s performance, but do so without the inherent human biases. My rankings are unlike any other you will encounter because the formula consists of a series of checks and balances with regard to wins, losses, power ratings, and team rankings. I know that sounds like a bunch of mumbo jumbo, so along the way I’ll provide some examples to help explain what that means. There is one thing for sure: you’ll either agree with my methodology and love the Billingsley Report, or you’ll hate it. There seems to be little room in between.
Dynamics of the System
My rankings are, in effect, a “power rating,” and it is possible to derive a projected point spread from them by subtracting ratings, dividing by three, and adding three points to the home team. However, I’m not as concerned about predicting future outcomes as I am about honoring what transpired most recently on the field of play. Let me give you a general example. If #35 Texas Tech beat #10 Texas (regardless of the score, as margin of victory is not a consideration), and both teams have an identical record of 5-1, then my philosophy dictates the Red Raiders should be ranked ahead of Texas in my next poll, regardless of whether the odds are they would win again if they played the next week. The results may not hold true for more than one week, but that’s OK because if a team EARNED that position, they deserve the ranking, regardless of what happens in the next week of play. Ranking winning teams above losing teams is not always possible. It’s not logical to rank 0-6-0 #110 Temple over 5-0-0 #21 Virginia Tech (as in the case of the Owls’ 28-24 win in 1998). I guess in a sense, my rankings are a combination of who the “best team” is, and also who is the “most deserving” team.
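For readers who like to see the arithmetic, the point-spread rule of thumb can be sketched in a few lines of Python. This is only an illustration, not the code behind the Billingsley Report: the function name, its parameters, and the sample ratings are hypothetical, and it assumes the rating difference is divided by three before the three home-field points are added.

```python
def projected_spread(home_rating: float, away_rating: float,
                     home_edge: float = 3.0) -> float:
    """Projected margin from the home team's point of view.

    Positive means the home team is favored. The divisor of three and
    the three-point home edge come from the text; everything else
    (names, sample ratings) is a hypothetical illustration.
    """
    rating_gap = home_rating - away_rating
    return rating_gap / 3.0 + home_edge


# Hypothetical ratings: home team rated 300, visitor rated 285.
print(projected_spread(300.0, 285.0))  # 8.0
```

Swapping the two ratings shows the home edge working the other way: the lower-rated home team is projected to lose by two points rather than five.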
Let’s start from the very beginning and move through the system using the data included, in order of its inclusion in the formula, and then detail each of the components.
#1- Starting position
#2- Accumulating points
#3- Strength of opponent
#4- Instituting deductions for losses
#5- Site of the game
#6- Instituting head to head rules
Starting Position- This is one of the most hotly debated subjects in rankings. Starting position DOES have an impact in rankings, especially in human polls where it is a HUGE advantage, and it does make a slight difference in my rankings. I respect many different points of view here, ranging from creating a pre-season poll based on returning starters and media hype (as in the AP/Coaches), to starting everyone equal (as in some computer polls), to having a starting position based on an average of 3-5 previous seasons (also in some computer models). I believe having a starting position is best; starting everyone equal is not logical to me. We know through observation of past seasons that some teams are stronger than others. No disrespect to the Vandals, but in 2007 Idaho was not as strong a team as Texas. If we know this in advance, to a high degree of accuracy, then ranking Texas and Idaho equal is not only illogical, it is unfair to Texas and completely (in my mind) skews any hope of an accurate strength of schedule. I also do not believe in allowing media hype to propel teams from nowhere into the Top 10. Instead I keep teams in their earned rank positions from the end of one season to the beginning of the next. If a team finishes #10 in 2007, they start #10 in 2008. I do, however, change a team’s RATING to a standard point value that brings teams closer together, preventing an unfair advantage in points from one season to the next. Each year the #1 team starts at 270 points; #2, 269 points; #3, 268 points and so forth all the way to #120 starting at 151 points. This allows a team to easily overcome a lower starting rank simply by winning over higher ranked opponents. This season (2008) you can witness this by looking at Colorado, which started at #64 and in three weeks moved to #27, or Arkansas State, which moved from #106 to #65.
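The starting scale is simple enough to write down directly. The short Python sketch below reproduces the numbers quoted above (270 points for #1, one point less per rank, 151 points for #120); the function and constant names are hypothetical and chosen only for illustration.

```python
TOP_START_RATING = 270  # from the text: the #1 team starts at 270 points
NUM_TEAMS = 120         # from the text: the scale runs down to #120


def starting_rating(final_rank: int) -> int:
    """Starting rating for a team that finished last season at final_rank."""
    if not 1 <= final_rank <= NUM_TEAMS:
        raise ValueError(f"rank must be between 1 and {NUM_TEAMS}")
    # Each step down in rank costs exactly one point.
    return TOP_START_RATING - (final_rank - 1)


print(starting_rating(1), starting_rating(2), starting_rating(120))  # 270 269 151
```

Because neighboring ranks are only one point apart, a single win over a highly ranked opponent can be worth more than the entire head start, which is why a low starting rank is easy to overcome.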
Accumulating Points- My system is the only one I am aware of that uses an “accumulating” value system. It was designed this way to emphasize a team’s most recent game as the AP and Coaches do. As a result, a team only gets credit for playing an opponent ONE TIME. Whatever happens to that opponent from that point forward is “water under the bridge.” The greatest example I could ever use to defend this philosophy came during the 2007 season in a scenario involving Oregon and Southern Cal. After beating #4 USC and #5 Arizona State on successive weekends the Ducks rose to #3 in the Billingsley Report (#3 AP, #3 Coaches). After losing QB Dennis Dixon, Oregon fell to Arizona, UCLA and Oregon State. Every time Oregon lost, USC suffered in some rankings. I don’t agree with that methodology. The Trojans played an Oregon team that was playing some of the best football in the nation during that game. They should not have to suffer because of an injury that happened to Oregon after the fact. In my rankings it did not matter. USC went on to finish #3 in the Billingsley Report, #3 in the AP, and #3 in the Coaches Poll. Each week a team accumulates or “earns points” based on their own record and their opponent’s rating and rank and nothing else. If a team has a bye week, their rating does not change, with one major exception. A special rule is in place (in the head to head section) that allows an undefeated team to ALWAYS be ranked ahead of every opponent they have beaten, and allows any team experiencing a bye week to remain ahead of a team they had just beaten the week before.
Strength of opponent- This is another great topic of discussion. The value placed on the strength of an opponent is (as it should be) the core of most computer rankings. My system is unique in its calculation of strength of schedule, as most models use wins and losses and I do not. I use an opponent’s RANK and RATING instead. Let me give you an example. In the 8th week of 2007 Washington posted a record of 2-4 while playing one of the nation’s most difficult schedules. Army recorded 3-4 while playing a milder schedule. By counting wins and losses (as the NCAA does) as a method of determining strength, Army would be given equal or slightly more value as an opponent. In my system Washington was ranked #44 and Army #109; therefore a team playing the Huskies would receive more than 3 times as much credit. I believe strongly this is a more accurate method of determining opponent strength. Wins and losses do not always tell the whole story.
Instituting deductions for losses- Remaining undefeated is paramount in my system. A team with no losses has, in effect, a “ticket to the top ten” as long as they are playing a reasonable schedule. With no losses a team receives “full earnings” of their “available opponent value”, but each loss creates a percentage of deduction. For instance, if Maryland is 5-0-0 playing a #35 opponent, and North Carolina is 4-1-0 playing a #30 opponent, the Terps will still receive more points that week than the Tar Heels even though North Carolina played a slightly more difficult team because of the penalty the Tar Heels incur from having a loss. However, if Maryland is playing a significantly lower opponent, say #90 Colorado State, then North Carolina, even with one loss, will receive more credit that week than the Terps. Two losses create a larger handicap and so on. The only way for a team to overcome a loss is to beat higher ranked opposition.
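The Maryland and North Carolina comparison can be illustrated with a toy calculation in Python. The actual opponent values and deduction percentages are not given here, so both are assumptions: the sketch pretends an opponent is worth 121 minus its rank and that each loss costs 10% of the available value. It also ignores the opponent’s rating, the site of the game, and the result itself, so it shows only how a loss deduction can outweigh a small difference in opponent strength.

```python
NUM_TEAMS = 120
LOSS_PENALTY = 0.10  # hypothetical: each loss removes 10% of the available value


def opponent_value(opponent_rank: int) -> int:
    """Hypothetical 'available opponent value': better-ranked opponents are worth more."""
    return NUM_TEAMS + 1 - opponent_rank


def weekly_earnings(losses: int, opponent_rank: int) -> float:
    """Points earned this week after the (hypothetical) deduction for losses."""
    retained = max(0.0, 1.0 - LOSS_PENALTY * losses)
    return opponent_value(opponent_rank) * retained


maryland = weekly_earnings(losses=0, opponent_rank=35)        # unbeaten, plays #35
north_carolina = weekly_earnings(losses=1, opponent_rank=30)  # one loss, plays #30
print(maryland > north_carolina)  # True: the loss outweighs the slightly tougher opponent

maryland_weak = weekly_earnings(losses=0, opponent_rank=90)   # unbeaten, plays #90
print(north_carolina > maryland_weak)  # True: a much weaker opponent flips the result
```

Under these made-up numbers a second loss would cut the available value by 20%, matching the idea that each additional loss creates a larger handicap.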
Site of the game- I realize some computers do not take the site of the game into consideration, but I believe it is important. The reward, once again, is slight, but it is still a consideration. I believe that playing at Tennessee in front of 106,000 fans screaming Rocky Top is more difficult than playing in front of 15,000 at Rice Stadium. There are some who say any form of measuring the value of the site of a game is biased, but I disagree. My scale is based on information available to the general public through the NCAA and is evaluated by stadium size and average attendance over a 5 year period. Rice plays in a 72,000 seat stadium but fills only a portion of it, so playing at Rice is not as valuable as playing at some MAC teams who fill their smaller stadiums to capacity.
Instituting head to head rules- The most powerful part of the program states that if certain criteria are met with regard to wins, losses, ratings and rankings, the winner of a game will be ranked ahead of the loser in the next poll. This is guaranteed for one week only. A team must be consistent and continue winning against good opposition to maintain their position from week to week. These rules set me apart from most ranking systems. I realize that by instituting these rules the program basically creates a situation where it is not the best “power rating” system it could be. Winning teams will not always be able to maintain their most recent level of play, but again, I feel if they earned it, they deserve to be ranked higher even if for just one week. East Carolina this season (2008) is a perfect example of that. The Pirates beat Virginia Tech and West Virginia. They earned the right to be ranked ahead of both of those teams at the time. Rising to #8 in the Billingsley Report and losing to North Carolina State the next week makes the system look like a failure, but I defend East Carolina’s right to be ranked in the Top 10 based on what they accomplished on the field. In spite of any issues in the power rating, the system still holds an average of 76% of higher ranked teams beating lower ranked opponents over its 37 year history. If you are seeking a power ranking system that is specifically designed to project future outcomes, check out Jeff Sagarin. His work in that area is about as good as you’ll ever find. We run neck and neck in comparison when not using margin of victory, but what Jeff calls his “predictor” system has a superior winning percentage.
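The head to head idea can be pictured as a final adjustment made after the week’s ratings are sorted. The Python sketch below is a simplified illustration only. The real criteria involve wins, losses, ratings and rankings and are not spelled out here, so the sketch substitutes one hypothetical test: the winner moves directly ahead of the loser only if it has no more losses than the team it beat. Team names other than those mentioned in the text are placeholders.

```python
def apply_head_to_head(ranking: list[str], winner: str, loser: str,
                       losses: dict[str, int]) -> list[str]:
    """Return a new ranking (best team first) after one head to head check.

    Hypothetical criterion: the winner is placed directly ahead of the
    loser only if the winner has no more losses than the loser.
    """
    if ranking.index(winner) < ranking.index(loser):
        return list(ranking)  # winner is already ahead, nothing to do
    if losses[winner] > losses[loser]:
        return list(ranking)  # criterion not met, the order stands
    reordered = [team for team in ranking if team != winner]
    reordered.insert(reordered.index(loser), winner)
    return reordered


# Texas Tech beats Texas and both are 5-1, so Tech moves directly ahead.
print(apply_head_to_head(["Texas", "Team A", "Texas Tech"],
                         "Texas Tech", "Texas",
                         {"Texas": 1, "Team A": 0, "Texas Tech": 1}))
# ['Texas Tech', 'Texas', 'Team A']

# Temple (now 1-6) beats Virginia Tech (now 5-1): the order does not change.
print(apply_head_to_head(["Virginia Tech", "Team B", "Temple"],
                         "Temple", "Virginia Tech",
                         {"Virginia Tech": 1, "Team B": 3, "Temple": 6}))
# ['Virginia Tech', 'Team B', 'Temple']
```

Because the check runs again on each new poll, the promoted team keeps its place only if its own results hold up the following week.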
One final thought before I close. It’s no secret that I’m a big fan of the BCS. I suppose the majority of fans who read this will feel it’s because I’m part of the process. That is not the case. I would be in favor of this format even if I were not a participant because I believe strongly in what the BCS is accomplishing in college football. The mission of the BCS was clear: create a set of rules and match the #1 and #2 teams in a championship game. I’ve been a fan of college football for 50 years. I lived through the days of bowl game participants being determined by “smoke filled back room deals,” sometimes weeks before the regular season was complete. I’ve lived through seasons where the top teams could not be matched in a game because of conference ties to specific bowl games. What a tragedy we could not witness Ohio State/Penn State in 1968, Texas/Penn State in 1969, Georgia Tech/Colorado in 1990, Miami/Washington in 1991, or Nebraska/Michigan in 1997, just to mention a few. Thanks to the BCS we no longer have to deal with “mythical” national championships. The BCS is not standing in the way of a playoff. The regular season in college football is a playoff. This is an evolving sport. At some point we may see a playoff, but in the interim we have some form of championship in place. Controversy would not end even if we had a more suitable format for a playoff, as someone is always going to feel slighted regardless of the number of teams that are involved. The BCS is not perfect, but college football is light years ahead of where we were before 1998.