Visually Analyzing College Football's BCS Rankings
Posted by Wade Tibke on January 31, 2008
The 2007 college football season ended with LSU hoisting the $30,000 Waterford Crystal BCS Championship Trophy.
If you're unfamiliar, BCS (Bowl Championship Series) is the organization that determines which teams play in the major college football bowls – including matching the top two teams against each other for at least a share of the national championship.
Four of the five BCS bowl games this year were blowouts, including a lackluster title game. These matchups were decided using a formula that consists of three equally weighted components: the USA Today Coaches Poll, the Harris Interactive College Football Poll, and an average of six computer rankings. Despite the controversy that surrounds it, there is one thing the BCS does do right – it waits until midseason to release its first rankings rather than judge a team on what it did last year or its weak pre-conference opponents.
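To make the "three equally weighted components" idea concrete, here is a minimal Python sketch. It assumes each component has already been converted to a share between 0 and 1; the function name and the sample numbers are invented for illustration, and the real BCS calculation has details this sketch does not cover.

```python
def bcs_score(coaches_share, harris_share, computer_share):
    """Average three components with equal weight (each assumed to be in 0..1)."""
    components = (coaches_share, harris_share, computer_share)
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each component must be between 0 and 1")
    return sum(components) / len(components)


# Hypothetical shares for one team, not real 2007 data.
example = bcs_score(0.90, 0.80, 0.70)
```

Because the weights are equal, a team that the computers dislike can still rank highly if both human polls favor it, and vice versa.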
Using Tableau I visually analyzed this year's BCS data and three things stood out (Download the Tableau Packaged Workbook - view with Tableau Desktop or free Tableau Reader).
1. It was a wild ride. In the view below, you can see long color streaks as teams moved dramatically through the rankings. As the paths change direction a story is told about late season heroics and failures. You can see that several teams moved all the way to #2 only to lose their next game (highlighted). We even see teams like South Florida climb to #2, South Carolina to #6, and Kentucky to #7, only to be left out of the season-ending polls. South Florida did get a chance to resurrect their season in the Sun Bowl, but lost badly to Oregon, another #2 that took a nose dive through the polls. I guess this could have been called the Overrated Bowl.

2. Never give up the fight. An interesting visual brought to light that several of the top teams were not ranked above #12 when the BCS rankings were first released. Except for Ohio State, the top teams have strong upward end-of-season trends. USC moved all the way from #19 to #2 and Georgia went from #20 to #3.

3. The three BCS inputs are relatively consistent in the way they treat conferences. I did some analysis on how the different polls treated different conferences. The only obvious outlier was how the computer rankings score WAC teams. I’m not sure, but perhaps the computer rankings weigh strength of schedule more heavily than the Harris and USA polls do. I'm guessing, given human nature, that the coaches and panelists voting in the Harris and USA polls tend to give the smaller, underdog schools the benefit of the doubt. In the view below, I isolated the five teams with the largest discrepancy between the computer rankings and the USA Poll (Harris and USA are nearly identical). Most notable was how the computer rankings liked the Big East (South Florida and Connecticut) and were not huge fans of the WAC (Boise State and Hawaii). But considering what Georgia did to Hawaii and then East Carolina taking care of Boise State, maybe the computers do know best. Out of the teams that carried the largest ranking differences, only two ended up in the final USA Rank (they all fell out of the computer rankings).
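Isolating the teams with the largest gap between two rankings is a simple calculation. The sketch below is one way to do it in Python; the function name, team labels, and ranks are placeholders invented for illustration, not the 2007 data.

```python
def largest_discrepancies(computer_rank, usa_rank, top_n=5):
    """Return the teams with the biggest gap between two rankings.

    A positive gap means the computers rank the team better (a lower
    number) than the poll does; a negative gap means the opposite.
    """
    shared = set(computer_rank) & set(usa_rank)
    gaps = {team: usa_rank[team] - computer_rank[team] for team in shared}
    # Sort by size of the gap, largest first; break ties by team name.
    ordered = sorted(gaps.items(), key=lambda item: (-abs(item[1]), item[0]))
    return ordered[:top_n]


# Placeholder teams and ranks, invented for illustration only.
computer = {"Team A": 5, "Team B": 20, "Team C": 10, "Team D": 12}
usa = {"Team A": 12, "Team B": 11, "Team C": 10, "Team D": 14}
top_two = largest_discrepancies(computer, usa, top_n=2)
```

Only teams that appear in both rankings are compared, so a team that drops out of one ranking entirely would need separate handling.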

- Click here to download the Tableau Packaged Workbook file (2007-BCS-Analysis.twbx)
- I used the final USA poll for my final BCS rankings as the BCS does not release a poll after the bowls.
- The six computer rankings used in 2007 include:
Searching for the Holy Grail of Analysis
Posted by Elissa Fink on January 29, 2008
Jock Mackinlay and Chris Stolte recently posted a Tableau Letter I could have used when my primary job was conducting data analysis or managing other analysts. Their Letter "There Is No Single View" suggests that searching for the one perfect view may be a noble cause but the effort is typically futile. Rather than focusing on a single perfect view, people are better served focusing on the process of analysis, which explores a wide range of views to answer questions or present findings.
Early in my career as a marketing analyst, I had projects where I or someone on my team spent weeks manipulating and tuning the data so that we could create the one chart, graph, table or visualization that told the whole story. I was particularly adept at manipulating data in Excel in ways that were probably never intended by Microsoft (and boy, what a love-hate relationship I had with that amazing product!). What would happen is that I would get to the graphic or table and then find out that it didn't really answer the question, was not comprehensible to anyone else, or raised too many other unanswered questions.
Jock and Chris's Letter is helpful for (at least) 2 reasons:
1. It takes 2 common types of views - Treemaps and Dashboards - and explains that while neither is the perfect single view, both are useful in answering certain kinds of questions.
2. It explains why the iterative process of visualization is the key to effective analysis. The easier it is to move through your data, the faster you'll find the insights.
Anyway, check out the Letter directly.
Building Cycle Charts to Look at Trend Data
Posted by Jock Mackinlay on January 15, 2008
The January 2008 newsletter from Perceptual Edge is an excellent description of Cycle Plots, a lesser-known way to look at trend data, by Naomi B. Robbins, Ph.D. On her website, you can find a link that describes how you can create these views in Excel and provides the input data. I think it is great when people give you access to the data that is shown in examples. I used the same data to build a cycle chart in Tableau in a couple of minutes.
Here is the packaged workbook, suitable to use with both Tableau Desktop and Tableau Reader (our no-charge application for reading and interacting with visual analytics). The workbook also includes the two other trends she talks about in her articles. All three views are useful and Tableau makes it easy to switch between them and get many other views as well.
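For readers who want to see the data arrangement behind a cycle plot outside of Tableau or Excel, here is a minimal Python sketch. It assumes the usual cycle plot layout: observations are grouped by their position in the cycle (month, weekday, and so on), each group is ordered chronologically, and a per-group mean serves as a reference line. The function name and sample rows are invented for illustration and are not the newsletter's data.

```python
from collections import defaultdict
from statistics import mean


def cycle_plot_groups(observations):
    """Group (year, period, value) rows by period, ordered by year within each group."""
    groups = defaultdict(list)
    for year, period, value in observations:
        groups[period].append((year, value))

    result = {}
    for period in sorted(groups):
        series = sorted(groups[period])  # chronological order within the period
        result[period] = {
            "series": series,
            "mean": mean(value for _, value in series),  # reference line
        }
    return result


# Invented sample rows: (year, period within the cycle, value).
rows = [
    (2006, 1, 10.0), (2006, 2, 14.0),
    (2007, 1, 12.0), (2007, 2, 18.0),
]
groups = cycle_plot_groups(rows)
```

Plotting each group's series side by side, with its mean drawn across it, gives the small per-period trend lines that distinguish a cycle plot from a single continuous time series.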
Eating Our Own Dog Food: How We Use Fast Analytics for Bug Tracking
Posted by Chris Stolte on January 4, 2008
At Tableau, a key cultural value is to “eat our own dogfood”. Over the last several years, we’ve had an on-going internal project (“Project ALPO”) focused on connecting Tableau to our own operational data, ranging from SalesForce CRM data to Fogbugz defect data, to data from our Cisco router. Not only has this given us the typical benefits of improving the team’s familiarity with the product, getting usability feedback, and finding bugs, but it has also enabled us to reap the benefits of analyzing our business’s core metrics.
As a developer, usability fanatic, and Tableau co-founder, I’ve been fascinated by several aspects of our Project ALPO to use our own products. One of the most interesting elements has been to watch how each employee’s perspective on data and its utility to their day-to-day job changes as they use the product. There are two findings I’ve seen with most new users:
1. People don’t know what questions to ask of their data. Most tools for asking questions of data are difficult to use and require a heavy investment for each new question. As a result, people rarely venture to ask new questions and invest little time thinking about what questions they would ask if they could. People stumble a bit with the possibilities: The blank palette can be intimidating at first. But then, they start asking simple questions (“How many bugs do I have open?”). When the answers come easily, they venture to ask more sophisticated questions evolving into rapid-fire Q&A sessions with their data (“How rapidly are bugs verified and closed after being resolved?”, “What percentage of revenue this quarter is due to new business compared to previous quarters?”). It is an exciting process to witness.
2. Visual analytics? I’ll take a text table, thanks. The transition to visual analytics is incremental. Tableau is really two tools in one: (a) an easy-to-use tool for Q&A with your data and (b) a data visualization tool. Most people start using it for the former and then incrementally venture into using it for the latter. People start by recreating the reports they can already generate, simply gaining the benefits of quicker and easier answers. But almost every user slowly ventures into visual analytics. One day they take their traditional text table and drop a metric on color, highlighting the anomalies. A week later, they tweak the table, transforming it into a tabular bar chart that communicates the same data as before, but new information starts to jump out. Then they create a new sheet with a whole new view of their data, and soon they are experts in the process of visual analytics and seeing their data like never before.
Over the next several weeks, I’ll author several blog posts outlining how we’ve connected Tableau to numerous common data sources. For this post, I’m focusing on a data source that I care deeply about, our defect-tracking system Fogbugz. I’ll follow up soon with a posting about SalesForce.com and our evolving use of visualization in our analytics process.
Visual Analytics for Fogbugz
Fogbugz is a defect management system built on some of the same principles we care about at Tableau: usability and simplicity. Our development group started using it about four years ago when we matured from a couple of guys in a basement with a passion for data into a real software company with thousands of users. Fogbugz stores its data in a fairly intuitive (and well documented!) schema with clearly named primary and foreign keys in one of numerous databases, all supported by Tableau: MySQL, Microsoft Access, or SQL Server. This made it an easy initial target for Project ALPO.
Connecting Tableau to Fogbugz is simple: connect to the database as an appropriate user and model the core schema. I focused on just looking at defects and their evolution over time, so I modeled this schema within Tableau:
Figure 1: Fogbugz schema for the core defect tracking tables (from the Fogbugz Support forum)
After getting connected, the interesting part started: What questions to ask? We often have multiple releases being developed simultaneously – partly due to an expanding product line and partly due to our partnership with Oracle-Hyperion. So, I started with the obvious question: What bugs are open against which release and assigned to whom?
Figure 2: The number of open (Active) bugs broken down by release, to whom the bug is assigned, and the bug priority. The developers are sorted in descending order by the number of bugs they have active.
A simple question and a simple answer – but immediately useful. I could pop this open each morning (as could the developers) and get an immediate sense of where we stood. By the way, I randomized and anonymized the data (can you guess the theme for the people’s names?).
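For readers without Tableau at hand, the same question can be posed directly as a grouped query. The sketch below uses Python's built-in sqlite3 module and a deliberately simplified, hypothetical table: the table name, column names, and rows are invented for illustration and do not reflect the actual Fogbugz schema.

```python
import sqlite3

# Hypothetical, simplified bug table (not the real Fogbugz schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bug (
        id INTEGER PRIMARY KEY,
        release_name TEXT,
        assigned_to TEXT,
        priority INTEGER,
        is_open INTEGER
    );
    INSERT INTO bug VALUES
        (1, '4.0', 'Dev A', 1, 1),
        (2, '4.0', 'Dev A', 2, 1),
        (3, '4.0', 'Dev B', 1, 1),
        (4, '3.5', 'Dev B', 3, 0);
""")


def open_bug_counts(connection):
    """Count open bugs per release, assignee, and priority."""
    query = """
        SELECT release_name, assigned_to, priority, COUNT(*) AS open_bugs
        FROM bug
        WHERE is_open = 1
        GROUP BY release_name, assigned_to, priority
        ORDER BY release_name, assigned_to, priority
    """
    return connection.execute(query).fetchall()


counts = open_bug_counts(conn)
```

Sorting developers by their total number of open bugs, as the view does, would be an additional step on top of these grouped counts.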
Once I asked this obvious question, I have to admit I got stuck for a while. What other questions should or could one ask of a defect management system? I didn’t want to fall into the trap of monitoring useless metrics or using them to evaluate our QA or development team (that can only lead to bad behavior). But I also had the intuition that the data had stories to tell that would help me and the team do our jobs better.
A useful question for me (and typically useful for all data) was: What is changing? For this data, that meant “What new bugs are being filed?”, “What bugs are being closed?”, “What resolutions are most bugs getting?”, or “Who is closing and opening bugs?” That led to a number of views. The following is one nice example of seeing the daily change in bugs:
Figure 3: A view into the day-to-day change in defects showing what bugs are being opened and closed and by whom. Different hues are used to indicate different roles within the Development team: shades of orange are used for members of the QA team and shades of green for developers.
Tableau as a Fogbugz Interface
In many of the views we started authoring, a single mark corresponded to a single bug. That meant we could use Tableau’s ability to associate data-driven hyperlinks with marks to jump directly from outliers on a graph to the bug entry in Fogbugz. Tableau started to be about more than asking questions about the data – it was becoming our interface for the data!
A nice example is the next view. Our process for bug resolution was for the bug to be resolved back to the person who opened it, who would verify the fix; the bug would then be assigned to QA to be verified again and closed. But despite email notifications, people often didn’t notice when a bug was resolved back to them, and bugs would sit resolved but not verified for quite a while.
Then someone on the team authored and shared the following view. It let us quickly see what resolved bugs were assigned to the team, how they were resolved, and how long they had been sitting waiting for verification. And even better, one could simply right-click on any mark and jump to the bug in Fogbugz to start the verification process:
Figure 4: A view of resolved bugs assigned to the person who opened them and awaiting verification. A quick glance shows bugs that have been sitting idle a long time – and which might not be correctly fixed. The color shows the resolution (duplicate, fixed, not reproducible, etc.). Users can right-click on a mark to jump directly to the corresponding bug in Fogbugz.
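The "how long has it been sitting" measure behind a view like this is just a date difference. Here is a minimal Python sketch, assuming each bug is a dictionary with a status and a resolution date; the field names, bug numbers, and dates are invented for illustration.

```python
from datetime import date


def idle_resolved_bugs(bugs, today):
    """Return resolved-but-unverified bugs, longest-waiting first."""
    waiting = [
        {**bug, "days_idle": (today - bug["resolved_on"]).days}
        for bug in bugs
        if bug["status"] == "resolved"
    ]
    return sorted(waiting, key=lambda bug: bug["days_idle"], reverse=True)


# Invented sample records, not real defect data.
sample_bugs = [
    {"id": 101, "status": "resolved", "resolution": "fixed",
     "resolved_on": date(2007, 12, 1)},
    {"id": 102, "status": "closed", "resolution": "fixed",
     "resolved_on": date(2007, 12, 5)},
    {"id": 103, "status": "resolved", "resolution": "duplicate",
     "resolved_on": date(2007, 12, 20)},
]
waiting = idle_resolved_bugs(sample_bugs, today=date(2008, 1, 4))
```

Bugs that are already closed are filtered out, so only the ones still waiting on a verifier appear in the result.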
Time-based behavior
Soon, we became very interested in the time-based behavior of defects during a release as we experimented with different approaches to software development and release management. We wanted to graph the number of open bugs for any day during the release and see how events like stabilization periods were influencing the bugs.
Unfortunately, Fogbugz doesn’t keep historical data, but it was easy to add. We authored a simple stored procedure that ran nightly on our SQL Server instance collecting summary statistics. We were then able to construct the following type of view for each release:
Figure 5: A view of open bugs over time. Views like this are used to compare different approaches used with the team for software development, such as planned stabilizations and phased deliverables. Reference lines are used to indicate key milestones. The view shows a less than ideal sprint release where all of the bugs accumulate until near the end of the release.
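The nightly collection idea can be sketched in a few lines. The example below is not our stored procedure; it is a Python illustration using the built-in sqlite3 module, with hypothetical table and column names, that records one row per release per day so open-bug counts can later be graphed over time.

```python
import sqlite3


def record_daily_snapshot(connection, snapshot_date):
    """Store the current open-bug count per release in a history table."""
    connection.execute("""
        CREATE TABLE IF NOT EXISTS bug_history (
            snapshot_date TEXT,
            release_name TEXT,
            open_bugs INTEGER,
            PRIMARY KEY (snapshot_date, release_name)
        )
    """)
    # INSERT OR REPLACE keeps the job safe to re-run for the same date.
    connection.execute("""
        INSERT OR REPLACE INTO bug_history (snapshot_date, release_name, open_bugs)
        SELECT ?, release_name, COUNT(*)
        FROM bug
        WHERE is_open = 1
        GROUP BY release_name
    """, (snapshot_date,))
    connection.commit()


# Demo with a hypothetical, simplified bug table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bug (id INTEGER PRIMARY KEY, release_name TEXT, is_open INTEGER);
    INSERT INTO bug VALUES (1, '4.0', 1), (2, '4.0', 1), (3, '3.5', 1), (4, '3.5', 0);
""")
record_daily_snapshot(conn, "2008-01-03")
conn.execute("UPDATE bug SET is_open = 0 WHERE id = 1")  # one bug closed the next day
record_daily_snapshot(conn, "2008-01-04")
history = conn.execute(
    "SELECT snapshot_date, release_name, open_bugs FROM bug_history "
    "ORDER BY snapshot_date, release_name"
).fetchall()
```

One limitation of this sketch: a release with no open bugs on a given day gets no row at all, so a chart built from the history table would need to treat missing days as zero.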
The above was a really quick peek at our evolving usage of Tableau to understand and question our Fogbugz data. The team is creating new views every day to answer a wide range of useful questions. I’ll finish up with a screenshot of a dashboard most of the team uses for a quick peek at the state of any release.
Any questions about how we’ve connected Tableau to Fogbugz? Please don’t hesitate to send me an email at (chris at tableausoftware.com).
Figure 6: A dashboard one of our development leads authored to provide multiple perspectives on the state of a release. It shows the recently opened bugs, graphical views of the short- and long-term history of the release, and the current bug count by person.
Three of History's Best Charts Ever According to The Economist
Posted by Elissa Fink on January 1, 2008
Check out the December 22nd 2007-January 4th 2008 issue of The Economist. You probably remember it from the newsstands: it's got Mao in a Santa's hat. It's also got a great article "Worth a thousand words" about what they call "the best charts ever" - and guess what? They are all from the 19th century!
The 3 visualizations that The Economist described as "three of history's best" include...
1. Florence Nightingale's 1858 graphic demonstrating the factors affecting the lives (and death rates) of the British army (which resulted in a graphic type called “Nightingale's Rose” or “Nightingale's Coxcomb”). She showed in a visual graphic that it wasn't wounds killing the highest number of soldiers - it was infections.
2. Charles Joseph Minard's very famous 1869 graphic depicting the Russian campaign of 1812 - Tufte called it “the best statistical graphic ever drawn”. Ever since I first saw this, I've loved it. What a dramatic story it tells.
3. William Playfair's 1821 chart comparing the “weekly wages of a good mechanic” and the “price of a quarter of wheat” over time. He was one of the first people to use data not just to educate but also to persuade and convince.
You can quibble with some of the technicalities of each of these charts, certainly. But what's amazing to me (not a visualization expert but certainly an admirer of those that are) is that the science of visual data analysis, which often feels so new, has its roots so deep in history. It does prove the old adage: a picture is worth a thousand words. And has been for quite a long while.