# NFL Statistical Analysis

Discussion in 'Math' started by PRS, Sep 29, 2010.

1. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
I took up statistics this year in order to write a computer program to predict the outcomes of football games. To start with I downloaded last year's data and looked it over for any patterns that might suggest possible algorithms with which to make a mathematical prediction.

I noticed that offensive ranking picked New Orleans as the best team in football, with Minnesota and Green Bay also near the top of the list. So I took each team's offensive ranking (1-32), added half of its defensive ranking, and compared the totals for the two teams in each match-up. I also subtracted 5 for home field advantage.

Quality = offense rank + 1/2 × defense rank, minus 5 for the home team. The lower the score, the higher the quality, and these quality scores were used to predict the outcomes of particular match-ups.
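For anyone who wants to try the formula, here is a minimal Python sketch of it. The function names, the tuple layout, and the rule that a tie goes to the home team are assumptions made for illustration, and the ranks in the usage note are made-up placeholders, not real 2009 rankings.

```python
def quality(offense_rank, defense_rank, is_home, home_bonus=5.0):
    """Quality score as described in the post: lower means better.

    offense_rank and defense_rank are league ranks (1 = best, 32 = worst).
    """
    score = offense_rank + 0.5 * defense_rank
    if is_home:
        score -= home_bonus  # home field advantage lowers (improves) the score
    return score


def predict_winner(home, away):
    """Pick the team with the lower quality score.

    home and away are (name, offense_rank, defense_rank) tuples.
    Ties go to the home team (an arbitrary choice for this sketch).
    """
    home_q = quality(home[1], home[2], is_home=True)
    away_q = quality(away[1], away[2], is_home=False)
    return home[0] if home_q <= away_q else away[0]
```

For example, `predict_winner(("Home", 10, 10), ("Away", 8, 8))` compares a home score of 10 with an away score of 12 and picks the home team.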

After 3 games my program is only 55% correct. I think I could do as well just by guessing. It's back to the drawing board.

Total yards seems to be the best way to rank both offenses and defenses. But there's a trick to it: Scoring 30 points against the Seahawks is not the same as scoring 30 points against the Packers. Thus there needs to be a "weighted" total yards.
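One possible way to do the weighting, sketched in Python under assumptions of my own: scale each game's yards by how stingy the opposing defense is compared with the league average. The function name, the input layout, and the scaling rule are all hypothetical, and other weighting schemes are just as plausible.

```python
def opponent_adjusted_yards(games, avg_allowed, league_avg_allowed):
    """Average yards per game, adjusted for the strength of each opponent.

    games: list of (opponent, yards_gained) pairs for one team.
    avg_allowed: dict mapping opponent -> average yards that defense allows.
    league_avg_allowed: league-wide average yards allowed per game.

    Yards gained against a defense that allows fewer yards than average
    are scaled up; yards against a weak defense are scaled down.
    """
    total = 0.0
    for opponent, yards in games:
        total += yards * league_avg_allowed / avg_allowed[opponent]
    return total / len(games)
```

With made-up numbers, 300 yards against a defense that allows 300 per game counts as 350 when the league average is 350, while the same 300 yards against a defense that allows 400 counts as only 262.5.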

This is just one thought. I have others. Does anyone else have any thoughts on this? Another is this: the Seahawks should have a heavier weight for home field advantage than the Chargers because of the home-away stats for each team.
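A rough Python sketch of a per-team home field weight, assuming the only inputs are home and away win-loss records. The names `home_edge` and `home_bonus`, the 5-point base, and the scaling against a league-wide edge are illustrative assumptions, not an established method.

```python
def home_edge(home_wins, home_games, away_wins, away_games):
    """Home win fraction minus away win fraction for one team."""
    return home_wins / home_games - away_wins / away_games


def home_bonus(team_edge, league_edge, base_bonus=5.0):
    """Scale the flat home bonus by how a team's edge compares with the league's.

    Falls back to the flat bonus if the league edge is not positive,
    and never returns a negative bonus.
    """
    if league_edge <= 0:
        return base_bonus
    return max(0.0, base_bonus * team_edge / league_edge)
```

A team that went 6-2 at home and 2-6 on the road has an edge of 0.5. If the league-wide edge were 0.25, its bonus would be 10 instead of the flat 5.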

2. ### Papabravo Expert

Feb 24, 2006
10,143
1,790
You are absolutely wasting your time. There is little or no evidence to suggest that ANY past measure of team performance has ANY predictive power whatsoever.

A team is a dynamic entity with stochastic performance, and it is never the same team twice.

3. ### jpanhalt AAC Fanatic!

Jan 18, 2008
5,692
901
Look up Stein's Paradox. There was a great Scientific American (S.A.) article about it, maybe 3 decades ago or more. I just checked Google and there seem to be several early hits with a similar theme as the S.A. article.

The S.A. article basically begins with the proposition that baseball game outcomes seem random, but in any one year, you know which team to bet on. The same can be said about football, but we all know not to bet on the Cleveland Browns if they are playing a team with a winning record. The Stein estimator helps predict that bit of common sense knowledge.
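For a feel of what the estimator does, here is a minimal Python sketch of the positive-part James-Stein estimator that shrinks each team's observed value toward the grand mean. It assumes every observation has the same known variance `sigma2`, which is a simplification; the function name is hypothetical.

```python
def james_stein(observed, sigma2):
    """Shrink each observed value toward the grand mean.

    observed: list of k observed averages (e.g. early-season win fractions).
    sigma2: assumed common variance of each observed value.
    Uses the positive-part rule, so the shrinkage factor never goes negative.
    """
    k = len(observed)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 values")
    grand_mean = sum(observed) / k
    spread = sum((y - grand_mean) ** 2 for y in observed)
    if spread == 0:
        return list(observed)  # all values identical, nothing to shrink
    c = max(0.0, 1.0 - (k - 3) * sigma2 / spread)
    return [grand_mean + c * (y - grand_mean) for y in observed]
```

With made-up inputs `[0.2, 0.4, 0.6, 0.8]` and `sigma2 = 0.01`, the shrinkage factor is 0.95, so every value moves 5% of the way toward the mean of 0.5. The noisier the observations, the harder they get pulled toward the middle.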

I have moved a couple of times since I last referenced the article, so I cannot find it right now (i.e., three moves = one fire, Mark Twain). If I find it, I will try to post a link or pdf.

John

4. ### davebee Well-Known Member

Oct 22, 2008
539
46
If a person enjoys doing this sort of thing then it's not a waste of time at all.

"Life is a journey, not a destination"

5. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
I'll admit last year's data does not predict this year's outcomes very well. My experience just showed me that. However, if a team was good or bad last year, chances are they will be pretty much the same this year. And this was my reason for using last year's data.

But the low score -- about 55% overall -- that I have earned with my elementary algorithm, only tells me I went about crunching the data in a fallacious way. I intend to improve the algorithm for this coming week's games by using total yards data for both offenses and defenses, and including special teams data.

My interest is really in statistics. Since there are so many variables in sports, I chose to use sports as a means of checking my progress in learning statistical analysis. There are many famous mathematicians who have done the same, and with good reason: games are statistical, and therefore a mathematician should be able to make money with his or her expertise in math!

6. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
Thanks, jpanhalt. I hope you find what you have alluded to. It sounds pertinent. By the way, do you have any ideas on how to weigh the difference between scoring 30 points against Seattle vs scoring 30 points against Green Bay? Not all yardage should be given the same significance, I think.

7. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
I'm just having fun in my own way. I want to learn statistics and test my knowledge at the same time. If I can't predict game outcomes then I'm not really understanding statistics. Right now I'm a D student and I want to improve upon that.

8. ### Georacer Moderator

Nov 25, 2009
5,142
1,266
Why don't you weigh the yards gained according to the rolling average of a team's position in the final classification, over the years you have a log of?

9. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
Do you care to explain that to me a little better, Georacer? What is a rolling average?

10. ### Georacer Moderator

Nov 25, 2009
5,142
1,266
Well, as a first thought, you could evaluate a team by averaging its standings in the final classification of the championship. But since a team's dynamic changes over the years, that would be misleading. Instead, if you accept that a team can do well one year and not nearly as well the next, you can think as follows. Start from year X and calculate the average value as $V_{\text{avg},n} = 0.8 \cdot V_{\text{avg},n-1} + 0.2 \cdot V_n$, where V is the variable you are tracking: score, standing, or whatever. The idea is to take older performance into account but let the newer data take over as time goes by. Tweak the 0.8 and 0.2 constants to shift the weight: a larger "0.8" gives more weight to the past, and a larger "0.2" emphasises recent performance. Don't forget that the two constants must always sum to 1!
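The recurrence above is an exponentially weighted average, and a minimal Python sketch of it could look like this. Seeding the average with the first year's value is an assumption, as are the function and parameter names; `alpha` plays the role of the "0.2" constant and `1 - alpha` the role of the "0.8".

```python
def rolling_average(values, alpha=0.2):
    """Exponentially weighted average of yearly values, oldest first.

    Implements V_avg[n] = (1 - alpha) * V_avg[n-1] + alpha * V[n],
    seeded with the first value. Returns the average after each year.
    """
    if not values:
        return []
    avg = values[0]
    history = [avg]
    for v in values[1:]:
        avg = (1 - alpha) * avg + alpha * v
        history.append(avg)
    return history
```

For example, a standing of 10 followed by a standing of 20 gives an average of 0.8 × 10 + 0.2 × 20 = 12 after the second year. Because the two weights are `1 - alpha` and `alpha`, they sum to 1 automatically.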

11. ### jpanhalt AAC Fanatic!

Jan 18, 2008
5,692
901
Here's the citation: B. Efron and C. Morris, Stein's Paradox in Statistics, Sci. Amer. 236(5):119-127, 1977. That's the May issue. Attached is a pdf of the first page in lieu of an abstract. Scientific American does not have that issue available for sale, but in my experience dealing with the publisher, it takes copyright pretty seriously. I suspect libraries in your area will have bound or other versions for you to view.

John

View attachment 23069

12. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
I get the idea. I think it's just what I wanted. Thanks again. I'll use the old data for two or three years and include this year's with the greatest weight. When I do, I'll let you know what happened.

13. ### PRS Thread Starter Well-Known Member

Aug 24, 2008
989
35
Thanks, jpanhalt. I googled Charles Stein and found the material Efron and Morris used. I'll have to study it before I can use it. When I do, I'll post results.

14. ### sceadwian New Member

Jun 1, 2009
499
37
You have to understand risk factors as well, though. Take injuries: top athletes really put themselves out there, and top performance means top risk. A single injury can flush all your existing data down the toilet if its risk factors aren't taken into account. You can't look only at the whole team in a game such as American football, because single players can have dramatic effects when there's total team synergy. It's impossible to analyze this statistically at present, and living systems that complex can't be fully modeled reliably, though I'm sure you can get close enough to increase your odds. The question is whether you can increase your odds to the point where you can make any money.
Oh yeah, and betting on the NFL is illegal so if you ever did find such a system you'd end up in jail if you couldn't hide it.

15. ### Wendy Moderator

Mar 24, 2008
20,766
2,536
Not in Las Vegas, I believe. Those guys will bet on anything, and do.

16. ### loosewire AAC Fanatic!

Apr 25, 2008
1,584
435
The bookies have a better win-loss record in all the sports. An exciting close game is a bookie win, but fans love sports. Business has now found a chance model that is legal. Plus, the logo you wear makes a lot of difference: you are owned by your school for life if you are a good fan. You notice that you don't see home video of professional sports. The teams own it all. Your kids' education money goes to sports, not the classroom.

17. ### sceadwian New Member

Jun 1, 2009
499
37
Bill, look up the book odds on being able to pick, at the start of the season, the Super Bowl winner at the end of the year =)

Yes, they do know their business, and they'll never explain their statistical analysis methods =)

18. ### sceadwian New Member

Jun 1, 2009
499
37
Last time I checked, betting on sports was still illegal by and large.
I would love to see substantiation that education money goes to sports rather than to the classroom.

19. ### Wendy Moderator

Mar 24, 2008
20,766
2,536
Never said they did, I was responding to a blanket statement. There are a lot of things that are legal in Las Vegas that drive their neighbors nuts. IMO, this is a good thing, morality should not be legislated. But I digress.

Anyone wanting to actually test a theory does have the venue to do so.

20. ### sceadwian New Member

Jun 1, 2009
499
37
I think basically what I was trying to say is that chaotic systems aren't practical to model statistically on short timescales, and what counts as a short timescale is relative to the system itself. Sure, a good system will increase your odds over the baseline, but it's not possible to account for everything, keeping in mind that these are human beings, not chess pieces. No analytical theory will ever be able to produce accurate results from such a chaotic system.

It's like trying to predict the weather: even with huge supercomputers hell-bent on figuring this stuff out, we can at best predict trends, and then you get your Katrinas.