TigerEye Review – Week 4
Massaging the Outlier Data Point Edition
The bane of all statistical research is the outlier data point. Take the sinking of the Titanic: researchers are confronted with a huge number, 1,500+ passenger-liner lives lost to an iceberg collision, in an era with historically low collision incidents (they peaked in the 1870s–80s, just before the advent of steam power). Passenger-liner losses wouldn’t come anywhere close to that figure until later decades, when the number of lives lost rose with a rapid increase in affordable passenger traffic (NOT counting the WWI German unrestricted U-boat campaign, which is a separate data silo). Yet the story of that event remains fresh, inspiring everything from good books to bad movies to seemingly incessant Internet memes. That one event has stayed amazingly present in our national psyche. Simply mention the word “Titanic” in a conversation a full 105 years later, and everyone listening immediately assumes you mean the proper name and not the adjective.
Likewise, try as you might to avoid such an event in any analysis of data, one or several troubling outlier points remain, gleefully taunting the amateur statistician, fans and professional bean counters alike. But in my experience, you dismiss these singular points outside the comfortable mean at your peril. They still happened, and if you’re serious about the craft, you need to find a way to account for them.
So what do we do with Missouri’s ten-touchdown day against Missouri State and its three total touchdowns in games since then? How do we handle Vanderbilt’s outstanding defensive numbers prior to that brutal chainsaw massacre of an Alabama game? And what do we make of South Carolina’s numbers on both sides of the ball vis-à-vis its W–L record? Or even Auburn’s outstanding yards per play, except for the dismal 1.2 YPP of the Clemson game?
In my mind, you can’t dismiss any of it. It all goes into the analysis as raw numbers for the very simple reason that it occurred. Understand that once you start down the path of selecting data under “valid” or “invalid” criteria, you leave the path of real analysis and start down one of prejudicial distortion. However good your intentions, you’ll never recover objectivity and clear interpretation unless all the data is collected and included in how you view the subject.
What reviewing early data implies is a brief skewing of any analysis in the short term. As you may have noticed in my previous posts, I’ve been careful to use words and phrases like “if this continues” or “if this is accurate” and so on. This is with the understanding that we won’t have enough data to completely understand the season until it ends. In the meantime (no pun intended), what we collect is a slowly normalizing set of data points, outliers included, that will point with increasing certainty to the real interpretation as we approach mid-season and beyond. By season’s end, the outlier will no longer skew the data to the large degree it did in the first few weeks.
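To put a rough number on that normalizing effect, here is a minimal Python sketch. The 72-point opener echoes the kind of outlier game discussed above; the 20-point "typical" game and the 12-game season are made-up placeholder values chosen for illustration, not real results, and `running_means` is my own hypothetical helper.

```python
def running_means(values):
    """Return the average of the first n values, for n = 1..len(values)."""
    means, total = [], 0.0
    for games_played, value in enumerate(values, start=1):
        total += value
        means.append(total / games_played)
    return means


# Hypothetical inputs: one outlier opener, then identical "typical" games.
OUTLIER = 72.0   # assumed outlier score
TYPICAL = 20.0   # assumed typical score (placeholder, not real data)
GAMES = 12       # assumed season length

with_outlier = running_means([OUTLIER] + [TYPICAL] * (GAMES - 1))
baseline = running_means([TYPICAL] * GAMES)

# How far the outlier still inflates the season average after each game.
gaps = [w - b for w, b in zip(with_outlier, baseline)]

for game in (1, 4, 6, 12):
    print(f"after game {game:2d}: average inflated by {gaps[game - 1]:.1f} points")
```

Under these assumptions the inflation is simply 52 divided by the number of games played: 52 points after one game, 13 after four, and a little over 4 by game twelve. The outlier never leaves the data set; it just carries less and less weight.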
So, when is the tipping point? In previous years, I usually didn’t get a strong feel for where things were going until midway through the season and conference play. Right about game six overall, and games three and four in-conference, you can start to give credence to what the analysis is showing. Some early data will be confirmed and verified as valid from the start. Some outlier Titanic-like numbers will be recognized only in retrospect for what they were. The proper way to treat early data is to acknowledge it and wait to see how it pans out. Now that we’ve reached the one-third mark of the season, the next couple of weeks will, hopefully, show what is what going forward.
By the numbers – what a championship-level SEC team should be:
Strong reactions to outliers are often based purely upon perspective
The SEC West
There is movement, and not all of it is upward. Auburn is improving game by game, but those Clemson game numbers will take some time to overcome. A couple more strong showings in the right direction will move us up. Likewise, those early Mississippi State numbers might also be outliers yet to be impacted by better data. Or maybe the Georgia game was the outlier. The same goes for those troubling Alabama red zone issues. They might turn out to be nothing to worry about, but Alabama still has only 11 TD’s in 19 trips to the red zone, against more than just that now 0–2 FSU defense.
But it is enough to drop Alabama’s offense just one tick off of the elite rating. This means that in the entire SEC West, we have only “average” offenses – Alabama and Texas A&M plus five other squads struggling to normalize their numbers in key areas. Auburn is indeed leading all SEC teams in third-down conversions, but our yards per play are dismal, and points per game and TD’s in the red zone haven’t recovered from Week 2.
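The red zone figure quoted above for Alabama, 11 TD’s in 19 trips, works out to a touchdown rate just under 58%. A minimal sketch of that arithmetic follows; the function name `red_zone_td_rate` is my own hypothetical helper and not part of any stats package.

```python
def red_zone_td_rate(touchdowns, trips):
    """Fraction of red zone trips that ended in a touchdown."""
    if trips <= 0:
        raise ValueError("need at least one red zone trip")
    return touchdowns / trips


# Figures taken from the text: 11 TDs in 19 red zone trips.
alabama_rate = red_zone_td_rate(11, 19)
print(f"Alabama red zone TD rate: {alabama_rate:.1%}")
```

The same helper also shows why small samples are shaky: with only a handful of trips, a single extra touchdown or stop moves the rate by many percentage points.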
Defensively, Auburn took a hit with those two garbage-time Mizzou TD’s in the red zone. But again, I cannot exclude them. We’ll have those 2nd-team guys in late in the season in critical games, and to dismiss their performance here would be a mistake if this is a true trend. So in they stay, and Auburn goes down a peg from the top. Likewise, Alabama continues to prove its worth. Vandy never got past midfield; despite all its improvement from last year to this, it was still an Alabama–Vandy game like every other time they’ve met.
As for the rest of the division, who knows? Every other team has issues in some facet of defensive performance, and from the look of it, no team but the top three has anything approaching stability. Performances are seemingly all over the place as the numbers come in.
The SEC East
Stability in the East is a curse, as offensive performance so far has been consistently bad. No one east or north of the state of Alabama is showing anything like progress towards good numbers in any facet of the game. Georgia may score in the red zone, but everything else it is doing is pedestrian. Florida also shows great numbers in the red zone, but that’s only because it made it there just four times in three games. All the rest of its scoring has come on big plays that rival the Prayer at Jordan-Hare and the Kick Six in surprise levels. And 50% of its scores are from its own side of the field in the last minute of play in two games. Amazing, yes, but that just can’t continue to be a thing. Someone at some point is going to start playing these guys hard on defense.
That being said, it might have to wait until the Florida–Georgia classic game. No one else is showing signs of life or ability in the SEC East on defense. All six teams that are not based in Athens have severe numbers issues in the first four games of the season, even if their W–L records may belie it. These teams are not fully functional in all aspects of the game, with yards per play and scoring being the biggest areas of concern across the board.
By these numbers, Georgia should have the division locked up, and its new top-ten ranking is a reflection of this. But that Florida jinx in the Jacksonville game is a hard one to overcome if history is any judge. It remains to be seen if the Gator luck holds out for that game, and if the division race in the East is still on.
The State of the Conference
…is fast leaning to being just the state of Alabama or Georgia. With Auburn on the move offensively and catching momentum, we could be well in the mix for the championship with games against the only two elite teams in the conference in the late-season Amen Corner. The other “maybe” teams of Mississippi State, LSU and TAMU have all shown some disturbing trends that are still surfacing as we progress in the season. You can’t dismiss them all as outliers when they happen this often, game after game.
It’s still early enough for them and Auburn to right the ship and play like champions on both sides of the ball, but the performance slope is steepening as the season progresses. With outliers being what they are, you can never fully count them out. Neither can you count ON them going forward. Getting a 72-point performance out of your offense in conference play is just not that common. Not saying it can’t happen, but there is just less and less impact such a game will have as this data builds over the next two weeks.
No matter what Lady Macbeth might say, keep your hands off this and wait for the numbers to settle out.