NFL Statistical Analysis

PRS · Sep 29, 2010

I took up statistics this year in order to write a computer program to predict the outcomes of football games. To start with, I downloaded last year's data and looked it over for patterns that might suggest an algorithm for making a mathematical prediction.

I noticed that offensive ranking put New Orleans as the best team in football. Minnesota was right up there, and so was Green Bay, to name the top of the list. So I took each team's offensive ranking (1-32), added half its defensive ranking, and compared the two teams in each match-up. I also subtracted 5 for home field advantage.

Quality = offensive rank + 1/2 × defensive rank, minus 5 if the team is at home. The lower the score, the higher the quality, and these quality scores were used to predict the outcome of each match-up.
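The quality formula above can be written out as a minimal Python sketch. The function names, the tuple layout, and the rule that ties go to the home team are assumptions made for illustration; the post does not say how ties were handled.

```python
def quality(off_rank, def_rank, is_home, home_bonus=5.0):
    """Quality score as described in the post: lower is better.

    off_rank and def_rank are league rankings (1 = best, 32 = worst).
    """
    score = off_rank + 0.5 * def_rank
    if is_home:
        score -= home_bonus  # home field advantage lowers (improves) the score
    return score


def predict_winner(home, away):
    """Pick the team with the lower quality score.

    Each team is a (name, off_rank, def_rank) tuple. Ties go to the
    home team, which is an assumption, not something the post states.
    """
    home_q = quality(home[1], home[2], is_home=True)
    away_q = quality(away[1], away[2], is_home=False)
    return home[0] if home_q <= away_q else away[0]


# Hypothetical rankings, for illustration only.
print(predict_winner(("Home", 10, 10), ("Away", 8, 6)))
```

Keeping the home bonus as a parameter makes it easy to try a different value per team later on.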

After 3 games my program is only 55% correct. I think I could do as well just by guessing. It's back to the drawing board.
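Whether a 55% record is any better than guessing can be checked with a binomial tail probability. The sketch below assumes each pick is an independent coin flip under the "just guessing" hypothesis; the record of 26 correct out of 48 is a made-up example, since the post does not give the actual number of games.

```python
from math import comb


def prob_at_least(correct, games, p=0.5):
    """P(X >= correct) for X ~ Binomial(games, p).

    With p = 0.5 this is the chance that pure guessing does at least
    this well over the given number of games.
    """
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(correct, games + 1)
    )


# Hypothetical record: 26 correct picks out of 48 games (about 54%).
print(prob_at_least(26, 48))
```

A large tail probability means the record is consistent with guessing; a small one would suggest the algorithm is picking up something real.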

Total yards seems to be the best way to rank both offenses and defenses. But there's a trick to it: Scoring 30 points against the Seahawks is not the same as scoring 30 points against the Packers. Thus there needs to be a "weighted" total yards.
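One possible way to weight yardage by opponent strength, sketched here as an assumption rather than a tested method, is to scale the yards gained in a game by how stingy that opponent's defense usually is compared with the league average. All names and numbers below are hypothetical.

```python
def adjusted_yards(yards_gained, opp_avg_allowed, league_avg_allowed):
    """Scale a game's yards by the strength of the defense faced.

    Yards against a defense that allows fewer yards than the league
    average count for more; yards against a weak defense count for less.
    """
    return yards_gained * league_avg_allowed / opp_avg_allowed


def weighted_total_yards(games, league_avg_allowed):
    """Average adjusted yards over a list of (yards_gained, opp_avg_allowed) games."""
    adjusted = [adjusted_yards(y, opp, league_avg_allowed) for y, opp in games]
    return sum(adjusted) / len(adjusted)


# Hypothetical numbers: the same 400 yards against a weak and a strong defense.
print(adjusted_yards(400, 400, 320))  # weak defense: counts for less
print(adjusted_yards(400, 250, 320))  # strong defense: counts for more
```

The same ratio could be applied to points instead of yards, or mirrored for defenses by scaling yards allowed against the opponent's usual offensive output.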

This is just one thought. I have others. Does anyone else have any thoughts on this? Another is this: the Seahawks should get a heavier weight for home field advantage than the Chargers because of each team's home-away stats.

Papabravo · Sep 29, 2010

You are absolutely wasting your time. There is little or no evidence to suggest that ANY past measure of team performance has ANY predictive power whatsoever.

A team is a dynamic entity with stochastic performance, and it is never the same team twice.

jpanhalt · Sep 29, 2010

Look up Stein's Paradox. There was a great Scientific American (S.A.) article about it, maybe 3 decades ago or more. I just checked Google and there seem to be several early hits with a similar theme as the S.A. article.

The S.A. article basically begins with the proposition that baseball game outcomes seem random, but in any one year, you know which team to bet on. The same can be said about football, but we all know not to bet on the Cleveland Browns if they are playing a team with a winning record. The Stein estimator helps predict that bit of common sense knowledge.

I have moved a couple of times since I last referenced the article, so I cannot find it right now (i.e., three moves = one fire, Mark Twain). If I find it, I will try to post a link or pdf.

John

davebee · Sep 29, 2010

If a person enjoys doing this sort of thing then it's not a waste of time at all.

"Life is a journey, not a destination"

PRS · Sep 30, 2010

Papabravo said:
You are absolutely wasting your time. There is little or no evidence to suggest that ANY past measure of team performance has ANY predictive power whatsoever.

A team is a dynamic entity with stochastic performance, and it is never the same team twice.

I'll admit last year's data does not predict this year's outcomes very well. My experience just showed me that. However, if a team was good or bad last year, chances are they will be pretty much the same this year. And this was my reason for using last year's data.

But the low score, about 55% overall, that I have earned with my elementary algorithm only tells me I went about crunching the data in a fallacious way. I intend to improve the algorithm for this coming week's games by using total yards data for both offenses and defenses, and by including special teams data.

My interest is really in statistics. Since there are so many variables in sports, I chose to use sports as a means of checking my progress in learning statistical analysis. There are many famous mathematicians who have done the same, and with good reason: games are statistical, and therefore a mathematician should be able to make money with his or her expertise in math!

PRS · Sep 30, 2010

jpanhalt said:
Look up Stein's Paradox. There was a great Scientific American (S.A.) article about it, maybe 3 decades ago or more. I just checked Google and there seem to be several early hits with a similar theme as the S.A. article.

The S.A. article basically begins with the proposition that baseball game outcomes seem random, but in any one year, you know which team to bet on. The same can be said about football, but we all know not to bet on the Cleveland Browns if they are playing a team with a winning record. The Stein estimator helps predict that bit of common sense knowledge.

I have moved a couple of times since I last referenced the article, so I cannot find it right now (i.e., three moves = one fire, Mark Twain). If I find it, I will try to post a link or pdf.

John

Thanks, jpanhalt. I hope you find what you have alluded to. It sounds pertinent. By the way, do you have any ideas on how to weigh the difference between scoring 30 points against Seattle vs scoring 30 points against Green Bay? Not all yardage should be given the same significance, I think.

PRS · Sep 30, 2010

I'm just having fun in my own way. I want to learn statistics and test my knowledge at the same time. If I can't predict game outcomes then I'm not really understanding statistics. Right now I'm a D student and I want to improve upon that.

Georacer · Sep 30, 2010

Why don't you weigh the scored yards according to the rolling average score of a team in the final classification, over the years you have a log of?

PRS · Sep 30, 2010

Georacer said:
Why don't you weigh the scored yards according to the rolling average score of a team in the final classification, over the years you have a log of?

Do you care to explain that to me a little better, Georacer? What is a rolling average?

Georacer · Sep 30, 2010

Well, at first thought you could evaluate a team by averaging its standings in the final classification of the championship. But as a team's dynamic changes over the years, that would be misleading. Instead, if you accept that a team can do well one year and not so well the next, you can think as follows. Start from year X and calculate the average value as: \(V_{\small{avg\ n}}=0.8 \cdot V_{\small{avg\ n-1}} + 0.2 \cdot V_n\), where V is the tracked variable, be it score, standing or whatever. The idea is to take older performance into account but, as time goes by, let the newer data take over. Tweak the 0.8 and 0.2 constants to shift the weight: a larger "0.8" gives more weight to the past, and a larger "0.2" emphasises recent performance. Don't forget that the two constants must always sum to 1!
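The recurrence above is an exponentially weighted moving average, and it can be sketched in a few lines of Python. Seeding the average with the first year's value is an assumption, since the post only says to "start from year X"; the function name and the sample values are hypothetical.

```python
def rolling_average(values, recent_weight=0.2):
    """Apply V_avg[n] = (1 - w) * V_avg[n-1] + w * V[n] over a sequence.

    values is ordered oldest first. The first value seeds the average
    (an assumption). With the default w = 0.2 the two constants are
    0.8 and 0.2, as in the post, and they always sum to 1.
    """
    if not values:
        raise ValueError("need at least one value")
    avg = float(values[0])
    history = [avg]
    for v in values[1:]:
        avg = (1 - recent_weight) * avg + recent_weight * v
        history.append(avg)
    return history


# Hypothetical final standings for one team over four seasons.
print(rolling_average([10, 20, 20, 5]))
```

Raising `recent_weight` makes the average react faster to the latest season; lowering it lets older seasons dominate for longer.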

jpanhalt · Sep 30, 2010

Here's the citation: B. Efron and C. Morris, Stein's Paradox in Statistics, Sci. Amer. 236(5):119-127, 1977. That's the May issue. Attached is a pdf of the first page in lieu of an abstract. Scientific American does not have that issue available for sale, but in my experience dealing with the publisher, it takes copyright pretty seriously. I suspect libraries in your area will have bound or other versions for you to view.

John

View attachment 23069
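The shrinkage idea behind Stein's paradox can be sketched as follows. This uses the common textbook form of the James-Stein estimator that shrinks each observed average toward the grand mean; it assumes every observation has the same known variance, and clamping the shrinkage factor at zero (the "positive-part" variant) is an added assumption. The win fractions and the variance are made up.

```python
def james_stein(observed, variance):
    """Shrink each observed average toward the grand mean.

    z_i = mean + c * (y_i - mean), with
    c = 1 - (k - 3) * variance / sum((y_i - mean)^2),
    where k is the number of teams and variance is the (assumed known,
    common) variance of each observation. Needs at least 4 values.
    """
    k = len(observed)
    if k < 4:
        raise ValueError("shrinking toward the grand mean needs at least 4 values")
    grand_mean = sum(observed) / k
    spread = sum((y - grand_mean) ** 2 for y in observed)
    if spread == 0:
        return list(observed)  # all values identical, nothing to shrink
    c = max(0.0, 1.0 - (k - 3) * variance / spread)  # clamp: positive-part variant
    return [grand_mean + c * (y - grand_mean) for y in observed]


# Hypothetical early-season win fractions for four teams.
print(james_stein([0.2, 0.4, 0.6, 0.8], variance=0.1))
```

Extreme early records get pulled toward the league average, which matches the common-sense point that a hot or cold start usually overstates how good or bad a team really is.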

PRS · Oct 2, 2010

Georacer said:
Well, at first thought you could evaluate a team by averaging its standings in the final classification of the championship. But as a team's dynamic changes over the years, that would be misleading. Instead, if you accept that a team can do well one year and not so well the next, you can think as follows. Start from year X and calculate the average value as: \(V_{\small{avg\ n}}=0.8 \cdot V_{\small{avg\ n-1}} + 0.2 \cdot V_n\), where V is the tracked variable, be it score, standing or whatever. The idea is to take older performance into account but, as time goes by, let the newer data take over. Tweak the 0.8 and 0.2 constants to shift the weight: a larger "0.8" gives more weight to the past, and a larger "0.2" emphasises recent performance. Don't forget that the two constants must always sum to 1!

I get the idea. I think it's just what I wanted. Thanks again. I'll use the old data for two or three years and give this year's data the greatest weight. When I do, I'll let you know what happened.

PRS · Oct 2, 2010

Thanks, jpanhalt. I googled Charles Stein and found the material Efron and Morris used. I'll have to study it before I can use it. When I do, I'll post results.

sceadwian · Oct 2, 2010

You have to understand risk factors as well, though. Take injuries: top athletes really put themselves out there, and top performance means top risk. A single injury can flush all your existing data down the toilet if its risk factors aren't taken into account. You can't look only at the whole team in a game such as American football, because single players can have dramatic effects when there's total team synergy. It's impossible to analyze this statistically at present, and living systems that complex can't be modeled fully reliably, though I'm sure you can get close enough to increase your odds. The question is whether you can increase your odds to the point where you can make any money.
Oh yeah, and betting on the NFL is illegal so if you ever did find such a system you'd end up in jail if you couldn't hide it.

Wendy · Oct 2, 2010

Not in Las Vegas, I believe. Those guys will bet on anything, and do.

loosewire · Oct 2, 2010

The bookies have a better win-loss record in all sports. An exciting close game is a bookie win, but fans love sports. Business has now found a chance model that is legal, plus the logo you wear makes a lot of difference: you are owned by your school for life if you are a good fan. You notice that you don't see home video of professional sports. The teams own it all. Your kids' education money goes to sports, not the classroom.

sceadwian · Oct 2, 2010

Bill, look up the book odds of being able to determine, at the start of the season, the Super Bowl winner at the end of the year =)

Yes, they do know their business, and they'll never explain their statistical analysis methods =)

sceadwian · Oct 2, 2010

loosewire said:
Business has now found a chance model that is legal, plus the

Last time I checked, betting on sports was still illegal by and large.
I would love to see some substantiation that education monies go to sports rather than to the classroom.

Wendy · Oct 3, 2010

sceadwian said:
Bill, look up the book odds of being able to determine, at the start of the season, the Super Bowl winner at the end of the year =)

Yes, they do know their business, and they'll never explain their statistical analysis methods =)

Never said they did, I was responding to a blanket statement. There are a lot of things that are legal in Las Vegas that drive their neighbors nuts. IMO, this is a good thing, morality should not be legislated. But I digress.

Anyone wanting to actually test a theory does have the venue to do so.

sceadwian · Oct 3, 2010

I think basically what I was trying to say is that chaotic systems aren't practical to model statistically on short timescales, and "short" is relative to the system itself. Sure, a good system will increase your odds over the baseline, but it's not possible to account for everything, keeping in mind these are human beings, not chess pieces. No analytical theory will ever be able to produce accurate results from such a chaotic system.

It's like trying to predict the weather: even with huge supercomputers hell-bent on figuring this stuff out, we can at best predict trends, and then you get your Katrinas.
