In Search of a National Champion
By Richard Billingsley
The
sophisticated game of college football we know today scarcely resembles that
first game played in 1869. Those early games were brutal by some accounts, and one-dimensional
by today’s standards. In the early years teams just lined up and used brute
strength to move the ball forward. Today we have complex offensive and
defensive schemes that make the mental part of the game just as important as
the physical. But for all its simplicity, the game in the pioneer era of the sport
was not without controversy, such as how to determine a national champion.
In 1869 only two games were played: Rutgers beat Princeton 6-4, and in a
rematch Princeton beat Rutgers 8-0. So, who do you think should have been
crowned as the inaugural national champion? As you can see, things are not
always as simple as they seem.
The
popularity of college football spread widely in the early 1900s. What began in
1869 with two teams grew to almost 90 major teams by 1920. The NCAA was founded
in 1906 to regulate the sport, and points for scores, size of field,
penalties, etc. were all standardized by 1912. But the NCAA failed to address the
one issue that burned in the hearts and minds of players, alumni, and fans of
all ages: the question of “who is #1.” Perhaps if they had addressed it a
century ago we would not have the controversy we have today. Instead we have a
plethora of polls and ranking systems which don’t always agree.
The History of Polls
The first
widely recognized college football poll did not originate until 1926. It was a
mathematical rating system developed by Frank Dickinson, a professor of
economics at the University of Illinois. Later, an onslaught of pollsters came
onto the scene, all prepared to crown college football’s best. The list was
staggering: 1927, Deke Houlgate; 1929, Dick Dunkel; 1930, William Boand; 1932,
Paul Williamson; 1934, Edward Litkenhous; and in 1935, Richard Poling.
Mathematical systems were considered to be the “norm” for determining national
championships in those days. But all of that changed in 1936 when the
Associated Press (AP) began publishing a poll voted on by a national board of
sportswriters and broadcasters. Because of their national distribution, the AP
poll instantly became gospel. The United Press International (UPI) joined the
hoopla in 1950 with a poll voted on by coaches. Their theory, I suppose, was that coaches know more about
football than writers and broadcasters.
It was bound
to happen sooner or later, but it wasn’t until 1954 that the AP and UPI
disagreed on who the national champion should be. The AP chose Ohio State and
the UPI favored UCLA. Both teams were undefeated, as was Oklahoma. Ever since
that fateful day when the two “biggies” couldn’t agree, the controversy of “Who’s
Number One” has raged on from the Golden Dome to the Tiger Den, from the
Coliseum to the Swamp, from Happy Valley to Death Valley, and everywhere in
between. Eventually everyone got in on the action: the New York Times,
Football News, Sports Illustrated, Sporting News, Sears, and McDonald’s. Heck
fire, there are more polls than bowls, and God knows we’ve got more than we need
of both. Over the years there have been many fine rating systems developed, and
with the advent of the internet you may easily access all of them simply by
clicking a button. Check out David Wilson’s Library of College Football Polls
at:
www.cae.wisc.edu/~dwilson/rsfc/rate/index.html
This past
summer college football lost a great pollster, historian and wonderful human
being. Herman Matthews died on August 22, 2008 at his home in Middlesboro,
Kentucky. Herman was part of the BCS from 1999 to 2001, but politely resigned as
the BCS moved toward a no-margin-of-victory policy. The Matthews Grid Ratings
were a staple in college football from 1966 to 2007. He appeared regularly in the
Football News and later provided rankings for the Scripps-Howard news service.
He was a good friend and will be greatly missed.
The Birth of the Billingsley Report
The first
thing I want to say is that my ranking system is not “better” than any other
computer system in the BCS. It is certainly unique in its design, and I’m proud
to say that it is very widely accepted and praised. ESPN used my computer
rankings in their “College Football Encyclopedia” listing the results right
next to the AP and Coaches Polls as “official” National Champions, but that
doesn’t make the system itself any better than many you will find. I have tons
of respect for my BCS counterparts: Jeff Sagarin, Jeff Anderson, Chris Hester,
Wes Colley, Kenneth Massey, and Peter Wolfe,
all of whom are light years ahead of me in mathematical skill. But my system
is not about mathematical algorithms. It’s about rules created to complement a
common-sense human response to a football game. The BCS computer pollsters come
from different perspectives, and we all believe strongly in our positions, but
we all have a healthy respect for one another. I would defend their right to
stand up for their position as much as I would my own. I’m proud to call them
my friends. We all have stories to tell; this just happens to be mine.
I became an
avid poll watcher in 1966 and very quickly became disillusioned with the AP and
UPI polls. At times I wondered if the coaches and sportswriters were even
voting in response to the same games I watched, seemingly paying no attention
to the strength of an opponent. In 1967 I embarked on the path of creating a
mathematical formula. I was 16 years old. My goal was to create a ranking that
was unbiased, logical, focused on strength of schedule, and fair to teams with
regard to head-to-head competition. Two years later I had a blueprint in hand,
the core of which I still use today. My first ranking was released in 1970. I created
what I often refer to as “an improved AP” in the sense that my rankings react like
human voters, moving teams up and down based on the most recent week’s
performance, but do so without the inherent human biases. My rankings are
unlike any other you will encounter because the formula consists of a series of
checks and balances with regard to wins, losses, power ratings, and team
rankings. I know that sounds like a bunch of mumbo jumbo, so along the way I’ll
provide some examples to help explain what that means. One thing is for
sure: you’ll either agree with my methodology and love the Billingsley Report,
or you’ll hate it. There seems to be little room in between.
Dynamics of the System
My rankings
are, in effect, a “power rating,” and it is possible to derive a projected point
spread from them by subtracting one team’s rating from the other’s, dividing by three, and adding four
points for the home team. However, I’m not as concerned about predicting future
outcomes as I am with honoring what transpired most recently on the field of play. Let
me give you a general example. If #35 Texas Tech beat #10 Texas (regardless of
the score as margin of victory is not a consideration), and both teams have an
identical record of 5-1, then my philosophy dictates the Red Raiders should be
ranked ahead of Texas in my next poll, regardless of whether the odds are they
would win again if they played the next week. The results may not hold true for
more than one week, but that’s OK because if a team EARNED that position, they
deserve the ranking, regardless of what happens in the next week of play.
Ranking the winner above the loser is not always possible, however. It’s not
logical to rank 0-6-0 #110 Temple over 5-0-0 #21 Virginia Tech (as in the
case of the Owls’ 28-24 win in 1998). I guess, in a sense, my rankings are not
only about who the “best team” is, but also about who is the “most deserving”
team.
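The point-spread arithmetic mentioned above can be sketched in a few lines of Python. This is only an illustration of the stated rule (subtract ratings, divide by three, add four for the home team), not the actual Billingsley program. The function name `projected_spread` and the sample ratings are hypothetical, and the sketch assumes the four points are added to the home team's side of the margin.

```python
def projected_spread(home_rating: float, away_rating: float) -> float:
    """Projected margin from the home team's point of view.

    Positive means the home team is favored; negative means the
    visitor is favored. Assumes the four-point home adjustment is
    added to the home side after dividing the rating gap by three.
    """
    rating_gap = home_rating - away_rating
    return rating_gap / 3 + 4


if __name__ == "__main__":
    # Hypothetical ratings: home team 270, visitor 255.
    print(projected_spread(270, 255))  # 9.0
    # Hypothetical ratings: home team 240, visitor 270.
    print(projected_spread(240, 270))  # -6.0
```

In the second call the visitor is rated 30 points higher, so even with the home adjustment the projection favors the visitor by six.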
Let’s start
from the very beginning and move through the system, listing the data in the
order it enters the formula, and then detail each of the components.
#1- Starting position
#2- Accumulating points
#3- Strength of opponent
#4- Instituting deductions for losses
#5- Rewarding defense
#6- Site of the game
#7- Instituting head to head rules
Starting Position- This is one of the most hotly debated
subjects in rankings. Starting position DOES have an impact in rankings,
especially in human polls where it is a HUGE advantage, and it does make a
slight difference in my rankings. I respect many different points of view here,
ranging from creating a pre-season poll based on returning starters and media
hype (as in the AP/Coaches), starting everyone equal (as in some computer
polls), or having a starting position based on an average of 3-5 previous
seasons (also in some computer models). I believe having a starting position is
best, but starting everyone equal is not logical to me. We know through
observation of past seasons that some teams are stronger than others. No
disrespect to the Vandals, but in 2007 Idaho was not as strong a team as Texas.
If we know this in advance, to a high degree of accuracy, then ranking Texas
and Idaho equal is not only illogical, it is unfair to Texas and completely (in
my mind) skews any hope of an accurate strength of schedule. I also do not
believe in allowing media hype to propel teams from nowhere into the Top 10.
Instead, I keep teams in their earned rank positions from the end of one season
to the beginning of the next. If a team finishes #10 in 2007, they start #10 in
2008. I do, however, reset a team’s RATING to a standard point value that brings
teams closer together, preventing an unfair advantage in points from one season
to the next. Each year the #1 team starts at 270 points; #2, 269 points; #3,
268 points and so forth all the way to #120 starting at 151 points. This allows
a team to easily overcome a lower starting rank simply by winning over higher
ranked opponents. This season (2008) you can witness this by looking at
Colorado, which started at #64 and in three weeks moved to #27, or Arkansas
State, which moved from #106 to #65.
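The preseason reset described above follows a simple pattern: one point per rank position, from 270 for #1 down to 151 for #120. A minimal Python sketch of that pattern follows. The function name `starting_rating` and the bounds check are illustrative assumptions, not part of the published system.

```python
def starting_rating(final_rank: int, num_teams: int = 120) -> int:
    """Preseason rating for a team given last season's final rank.

    Follows the stated scale: #1 starts at 270, #2 at 269, and so on
    down to #120 at 151, i.e. one point per rank position.
    """
    if not 1 <= final_rank <= num_teams:
        raise ValueError(f"rank must be between 1 and {num_teams}")
    return 271 - final_rank


if __name__ == "__main__":
    for rank in (1, 2, 3, 10, 120):
        print(rank, starting_rating(rank))
```

Because the whole field is compressed into a 119-point range, the gap between any two starting positions is never larger than what a few wins over higher ranked opponents can make up.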
Accumulating Points- My system is the only one I am
aware of that uses an “accumulating” value system. It was designed this way to
emphasize a team’s most recent game as the AP and Coaches do. As a result, a
team only gets credit for playing an opponent ONE TIME. Whatever happens to
that opponent from that point forward is “water under the bridge.” The greatest
example I could ever use to defend this philosophy came just last year in a
scenario involving Oregon. After beating #4 USC and #5 Arizona State on
successive weekends the Ducks rose to #3 in the Billingsley Report (#3 AP, #3
Coaches). After losing QB Dennis Dixon to injury, Oregon fell to Arizona, UCLA and Oregon
State. Every time Oregon lost, USC and Arizona State suffered in some computer
rankings. I don’t agree with that methodology. The Trojans and Sun Devils
played an Oregon team that was playing some of the best football in the nation
during those games. They should not have to suffer because of an injury that
happened to Oregon after the fact. In my rankings it did not matter. USC went
on to finish #3 in the Billingsley Report, #3 in the AP, and #3 in the Coaches
Poll. Each week a team accumulates or “earns points” based on the situation
surrounding the current week’s opponent and nothing else. If a team is playing
a #89 team, they cannot earn more points than a team with an equal record
playing a #50 opponent, or a #10 opponent etc. If a team has a bye week, their
rating does not change, with one major exception. A special rule is in place
(in the head to head section) that allows an undefeated team to ALWAYS be
ranked ahead of every opponent they have beaten, and allows any team
experiencing a bye week to remain ahead of a team they had just beaten the week
before.
Strength of opponent- This is another great topic of
discussion. The value placed on the strength of an opponent is (as it should
be) the core of most computer rankings. My system is unique in its calculation
of strength of schedule, as most models use wins and losses and I do not. I use
an opponent’s RANK and RATING instead. Let me give you an example. In the 8th
week of 2007 Washington posted a record of 2-4 while playing one of
the nation’s most difficult schedules. Army recorded 3-4 while playing a milder
schedule. By counting wins and losses (as the NCAA and most computers do) as a
method of determining strength, Army would be given equal or slightly more
value as an opponent. In my system Washington was ranked #44 and Army #109;
therefore a team playing the Huskies would receive more than 3 times as much
credit. I believe strongly this is a more accurate method of determining
opponent strength. Wins and losses do not always tell the whole story.
Instituting deductions for losses- Remaining undefeated is paramount
in my system. A team with no losses has, in effect, a “ticket to the top ten”
as long as they are playing a reasonable schedule. With no losses a team
receives “full earnings” of their “available opponent value”, but each loss
creates a percentage of deduction. For instance, if Maryland is 5-0-0 playing a
#35 opponent, and North Carolina is 4-1-0 playing a #30 opponent, the Terps
will still receive more points that week than the Tar Heels even though North
Carolina played a slightly more difficult team, because of the penalty the Tar
Heels incur for having a loss. However, if Maryland is playing a significantly
lower ranked opponent, say #90 Colorado State, then North Carolina, even with one
loss, will receive more credit that week than the Terps. Two losses create a
larger handicap and so on. The only way for a team to overcome a loss is to
beat higher ranked opposition.
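To make the Maryland and North Carolina comparison concrete, here is a toy Python sketch. The actual opponent values and deduction percentages are not published in this article, so everything numeric here is a stated assumption: the opponent value is taken as 121 minus the opponent's rank, and each loss is assumed to cost 10% of the available value. Only the ordering of the results is meant to mirror the example, not the real point totals.

```python
def weekly_points(opponent_rank: int, losses: int,
                  num_teams: int = 120,
                  deduction_per_loss: float = 0.10) -> float:
    """Toy model of points earned for one week.

    ASSUMPTIONS (not from the actual formula):
    - available opponent value = num_teams + 1 - opponent_rank
    - each loss deducts a flat 10% of that value
    """
    opponent_value = num_teams + 1 - opponent_rank
    earning_share = max(0.0, 1.0 - deduction_per_loss * losses)
    return opponent_value * earning_share


if __name__ == "__main__":
    maryland_vs_35 = weekly_points(opponent_rank=35, losses=0)
    carolina_vs_30 = weekly_points(opponent_rank=30, losses=1)
    maryland_vs_90 = weekly_points(opponent_rank=90, losses=0)

    # Undefeated Maryland out-earns one-loss North Carolina
    # when the two opponents are close in rank.
    print(maryland_vs_35 > carolina_vs_30)
    # Against a much lower ranked opponent, Maryland earns less.
    print(maryland_vs_90 < carolina_vs_30)
```

Under these assumed numbers both comparisons come out the way the text describes: a small edge in opponent rank does not offset a loss, but a large one does.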
Rewarding defense- With the spread offenses we have
today, sometimes a 21- or 28-point margin is no longer a guarantee you can “call
off the dogs,” so to speak. Teams seem to be forced to “buffer” their margins
just to ensure victories. That’s just a byproduct of today’s offensive-minded
football schemes. Unfortunately that means there is no way to determine whether
a team is “buffering” or being “unsportsmanlike”. As a result there is no
reward for scoring offense. I am convinced however that great defense wins
championships, so I reward scoring defense. If a team holds an opponent below
10 points they receive a benefit. More is awarded for 7 points or less, more
for 3 or less, and a team is rewarded most if an opponent is shut out. Just
keep in mind, the points available here are very slight in the overall
scheme.
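The scoring-defense thresholds above can be expressed as a simple tier lookup. The Python sketch below uses only the cutoffs named in the text (below 10, 7 or less, 3 or less, shutout). The tier numbers 0 through 4 and the function name `defense_tier` are hypothetical labels, since the size of each bonus is not given here.

```python
def defense_tier(points_allowed: int) -> int:
    """Return a defensive bonus tier from 0 (none) to 4 (shutout).

    Cutoffs follow the text; the tier numbers are illustrative
    labels only, not actual point values.
    """
    if points_allowed < 0:
        raise ValueError("points_allowed cannot be negative")
    if points_allowed == 0:
        return 4  # shutout: largest reward
    if points_allowed <= 3:
        return 3
    if points_allowed <= 7:
        return 2
    if points_allowed < 10:
        return 1  # held below 10 points
    return 0      # 10 or more: no defensive reward


if __name__ == "__main__":
    for allowed in (0, 3, 7, 9, 10, 28):
        print(allowed, defense_tier(allowed))
```

Checking the most restrictive condition first keeps each game in exactly one tier, so a shutout is never also counted as "3 or less."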
Site of the game- I realize some computers do not
take the site of the game into consideration, but I believe it is important.
The reward, once again is slight, but it is still a consideration. I believe
that playing at Tennessee in front of 106,000 fans screaming Rocky Top is more
difficult than playing in front of 15,000 at Rice Stadium. To go one step
further, winning on the road is worth more than losing on the road, and winning
on the road as an underdog is more valuable than winning on the road as a
favorite. For instance, if Georgia is ranked #1 and is hosting #7 Alabama, the Tide
receives one consideration for playing at Georgia, and an even higher
consideration if they win between the Hedges. It’s like rewarding a team for
excelling in the most difficult circumstances imaginable. I’m sure there are
some who would say using the site of a game is biased, but I disagree. My scale
is based on information available to the general public through the NCAA and is
evaluated by stadium size and average attendance over a 5-year period. Rice
plays in a 72,000-seat stadium but fills only a portion of it,
so playing at Rice is not as valuable as playing at some MAC teams that fill
their smaller stadiums to capacity.
Instituting head to head rules- The most powerful part of the
program states that if certain criteria are met with regard to wins, losses,
ratings and rankings, the winner of a game will be ranked ahead of the
loser in the next poll. This is guaranteed for one week only. A team must be
consistent and continue winning against good opposition to maintain their position
from week to week. These rules set me apart from most computer analysts. I
realize that by instituting these rules the program basically creates a
situation where it is not the best “power rating” system it could be. Winning
teams will not always be able to maintain their most recent level of play, but
again, I feel if they earned it, they deserve to be ranked higher even if for
just one week. East Carolina this season (2008) is a perfect example of that.
The Pirates beat Virginia Tech and West Virginia. They earned the right to be
ranked ahead of both of those teams at the time. Rising to #8 in the
Billingsley Report and losing to North Carolina State the next week makes the
system look like a failure, but I defend East Carolina’s right to be ranked in
the Top 10 based on what they accomplished on the field. In spite of any issues
in the power rating, higher ranked teams have still beaten lower ranked opponents
an average of 76% of the time over the system’s 37-year history. If you are
seeking a power ranking system that is specifically designed to project future
outcomes, check out Jeff Sagarin. His work in that area is about as good as
you’ll ever find. We run neck and neck in comparison when not using margin of
victory, but what Jeff calls his “predictor” system has a superior winning
percentage.
One final
thought before I close. It’s no secret that I’m a big fan of the BCS. I suppose
the majority of fans who read this will feel it’s because I’m part of the
process. That is not the case. I would be in favor of this format even if I
were not a participant because I believe strongly in what the BCS is
accomplishing in college football. The mission of the BCS was clear: create a
set of rules and match the #1 and #2 teams in a championship game. I’ve been a
fan of college football for 50 years. I lived through the days of bowl game
participants being determined by “smoke filled back room deals” sometimes weeks
before the regular season was complete. I’ve lived through seasons where the
top teams could not be matched in a game because of conference ties to specific
bowl games. What a tragedy that we could not witness Ohio State/Penn State in
1968, Texas/Penn State in 1969, Georgia Tech/Colorado in 1990,
Miami/Washington in 1991, or Nebraska/Michigan in 1997, just to mention a few.
Thanks to the BCS we no longer have to deal with “mythical” national
championships. The BCS is not standing in the way of a playoff. The regular
season in college football is a playoff. This is an evolving sport. At some
point we may see a playoff but in the interim we have some form of championship
in place. Controversy would not end even if we had a more suitable format for a
playoff as someone is always going to feel slighted regardless of the number of
teams that are involved. The BCS is not perfect, but college football is light
years ahead of where we were before 1998.
Richard Billingsley
October 2008