Applying statistical methods

dwm042
Probationary Member

Posts: 6

Post by dwm042 on Oct 30, 2011 10:37:46 GMT -6

Guys,

I'm not a coach, but I have a PhD in biochem and a football blog with a serious sabermetrics/analytics flavor. Before my blog was known for that, I wrote articles about football books and football blogs, and high school coaches who blogged found me first. So, to kind of return the favor...

There are two kinds of stats you guys want to be looking at. Yards per carry, adjusted yards per attempt, stats like this are measures of explosiveness. That's right, not consistency, but explosiveness. You don't want to throw away those outliers, you want to add it all in. You're looking to this kind of stat to show you how explosive your team is.

The modern consistency stat is the success rate. That raises a question: what is a successful play? One definition would be: half the yardage needed for a first down on first down, 2/3 of the yardage needed for a first down on second down, and all the yardage needed for a first down on third down. The percentage of successful plays by this criterion is commonly used as a measure of consistency, and DVOA, Football Outsiders' signature stat, is a success rate stat relative to the league average success rate.
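For anyone who wants to run this on their own play-by-play, here is a minimal Python sketch of the down-based definition. The function names and the sample plays are made up for illustration, and treating fourth down like third down is an assumption, since the definition only covers first through third.

```python
def is_successful(down, yards_to_go, yards_gained):
    """Down-based success test: 1/2 of the distance on 1st down,
    2/3 on 2nd down, the full distance on 3rd down.
    Assumption: 4th down is judged like 3rd down."""
    # Integer comparisons avoid floating-point trouble right at the threshold.
    if down == 1:
        return 2 * yards_gained >= yards_to_go
    if down == 2:
        return 3 * yards_gained >= 2 * yards_to_go
    return yards_gained >= yards_to_go


def success_rate(plays):
    """plays is a list of (down, yards_to_go, yards_gained) tuples."""
    if not plays:
        return 0.0
    successes = sum(1 for play in plays if is_successful(*play))
    return successes / len(plays)


# Hypothetical plays, not real game data.
sample_plays = [
    (1, 10, 5),  # success: half of 10
    (2, 5, 2),   # failure: needs 2/3 of 5
    (3, 3, 4),   # success: converted
    (1, 10, 3),  # failure
]
print(f"{success_rate(sample_plays):.0%}")
```

Kneels, spikes, and similar plays would need to be filtered out before calling `success_rate`.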

The best way to define success rates, in my opinion, is the way Brian Burke does on Advanced NFL Stats, but high school coaches really aren't going to have 5-6-7 years of play by play stats for their entire league. But if you did, you could build an expected points database for the league, and from that define success rate as any play that increases your expected points. In the absence of an expected points database, you'll need a simpler success rate definition, such as the one I provided above.

I hope this helps.

David.

codeandfootball.wordpress.com

coachd5085
Executive Member

Posts: 15,834

Post by coachd5085 on Oct 30, 2011 10:59:59 GMT -6

Interesting thoughts. However, as COACHES, not fans, remember we probably have a different underlying rationale for such study. We are looking at the data to make corrections and improve, not for fantasy football or gambling or talk show purposes. There are much greater variances in ability levels between games and between seasons, and at the H.S. level the number one contributor to an explosive play is a difference in athletic ability, not play execution or play selection.

“As to methods there may be a million and then some, but principles are few. The man who grasps principles can successfully select his own methods. The man who tries methods, ignoring principles, is sure to have trouble.” -- Ralph Waldo Emerson apparently would have been a great football coach!

dwm042
Probationary Member

Posts: 6

Post by dwm042 on Oct 30, 2011 12:10:48 GMT -6

Coachd,

No matter what level you play at, it helps to understand the stat you're using in the first place. And it also helps to know that some of the modern methods aren't geared to fantasy football or gambling. One technique becoming common is to use logistic regressions to determine which stats actually are relevant in determining whether a team wins or not. Logistic regressions are much better suited to fitting data whose outcomes are merely ones and zeroes (wins and losses).
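To make the logistic regression idea concrete, here is a bare-bones sketch in plain Python that fits win/loss against a single rate stat by gradient descent. Everything in it is an assumption for illustration: the choice of yards-per-play margin as the input, the made-up game data, and the function names. A real analysis would use many more games and a statistics package.

```python
import math


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(win) = 1 / (1 + exp(-(a + b*x))) by plain gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += p - y
            grad_b += (p - y) * x
        a -= learning_rate * grad_a / n
        b -= learning_rate * grad_b / n
    return a, b


def win_probability(a, b, x):
    """Predicted chance of winning for a given value of the stat."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


# Made-up numbers: yards-per-play margin in each game, and 1 = win, 0 = loss.
ypp_margin = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.3, 0.5, 1.0, 1.5, 2.0]
won = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(ypp_margin, won)
print(f"slope = {b:.2f}")
print(f"P(win) at +1.0 ypp margin = {win_probability(a, b, 1.0):.2f}")
```

A positive slope says the stat moves with winning in this sample. With ten games it says very little beyond that, which is the sample-size caution that applies to any high school data set.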

Yards per carry and net yards per attempt do correlate with winning, much more so than cumulative stats. And that's true, in general, of all the rate stats, relative to cumulative stats.

D-

codeandfootball.wordpress.com

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Oct 30, 2011 13:31:47 GMT -6

Broadly speaking, yes, especially at the NFL level where there's good consistency across teams and seasons. But from watching the games, I know we have a good yards per play. I also know we bust plenty of big plays, but our team loses a lot. So, is it because we can't sustain drives (my hypothesis)? A success rate stat might be good, but defining success is difficult. How about determining what percentage of plays are above the average? And maybe that compared to the mean and median could be significant.

dwm042
Probationary Member

Posts: 6

Post by dwm042 on Oct 30, 2011 14:03:22 GMT -6

cclement,

I know the definition of a success rate is abstract, but the one I defined has some useful properties. For one, a success rate less than 33% represents an issue with your team. A team that runs about 4 yards a carry all the time, give or take a yard, should have a success rate around 67%.
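The 67% figure can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not in the post: every series starts at 1st and 10, the team gains exactly the same yardage on every play, it goes for it on fourth down, and field position is ignored. The function names are hypothetical.

```python
def is_successful(down, yards_to_go, yards_gained):
    """1/2 of the distance on 1st down, 2/3 on 2nd, all of it on 3rd or 4th."""
    if down == 1:
        return 2 * yards_gained >= yards_to_go
    if down == 2:
        return 3 * yards_gained >= 2 * yards_to_go
    return yards_gained >= yards_to_go


def steady_team_success_rate(yards_per_play, num_plays):
    """Success rate for a team that gains the same yardage on every play."""
    down, to_go, successes = 1, 10, 0
    for _ in range(num_plays):
        if is_successful(down, to_go, yards_per_play):
            successes += 1
        if yards_per_play >= to_go or down == 4:
            # First down earned, or the series ended on downs: start over.
            down, to_go = 1, 10
        else:
            down, to_go = down + 1, to_go - yards_per_play
    return successes / num_plays


for yards in (3, 4, 5):
    print(yards, round(steady_team_success_rate(yards, 300), 2))
```

At 4 yards a play the series goes 1st and 10 (fail, needs 5), 2nd and 6 (success, needs 4), 3rd and 2 (success), so two plays in three succeed. At a steady 3 yards only the fourth-down conversion counts, which puts the rate under the 33% warning line.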

% of plays above average doesn't help you in terms of winning; it's a measure of how symmetric your plays are around your average. That's not a distribution I'd expect to be symmetric, especially if you have a knack for big plays.
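A quick illustration of the asymmetry, using Python's standard `statistics` module. The list of gains is invented for the example: mostly short gains plus two long ones.

```python
from statistics import mean, median


def share_above_mean(gains):
    """Fraction of plays that gained more than the average gain."""
    average = mean(gains)
    return sum(1 for g in gains if g > average) / len(gains)


# Hypothetical yards gained per play: mostly short gains, two big plays.
gains = [0, 1, 2, 2, 3, 3, 4, 5, 45, 60]

print("mean:", mean(gains))
print("median:", median(gains))
print("share above mean:", share_above_mean(gains))
```

Two big plays pull the mean to 12.5 while the median stays at 3, so only 20% of plays beat the average. The number describes the shape of the distribution, not how often the offense stays on schedule.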

Analyzing game play in terms of drives, noting how the drives ended will tell you a lot. That kind of analysis is the core of more sophisticated approaches such as expected points curves. How successful are your drives? How many plays do you have in an average drive? How many are ending in scores and how many are not?

First down counts, turnover rates, penalty rates, and time of possession help with analyzing consistency issues as well.

Success rate on third down is also an important measure, and that one has no abstract definition. Either you get the first down (or score) or you don't. Defensively, you either get the stop or you don't.

D-

codeandfootball.wordpress.com

spreadattack
Contributor

Posts: 4,822

Post by spreadattack on Oct 30, 2011 16:38:53 GMT -6

I'm a little more doubtful of the current state of football statistics vis-a-vis actual coaching decision making. I think what Burke and others do is great, but I too have a difficult time applying it to actual decision making, preparation, practice, or gameplanning. I also think the "logistic regressions" are fine and all but you get a ton of noise with them because of the sample sizes and the rather obvious nature of the inputs. I have a feeling at the high school level that if I ran a regression model on wins and losses and used forty times and height/weight I'd find those correlate pretty highly with wins and losses, as opposed to other popular stats or even less popular ones like success rate/etc.

I do think there is some value in the standard deviation/success rate/average stuff based on down and distance for purposes of playcalling. On first down or a gimme down you want to simply maximize your expected returns; on other downs, like all third downs or if you're playing with a lead, consistency of result is probably more important than maximizing your output.

So I'm on the fence about this stuff right now. I think in the long, long run, there's a lot to be gained from statistical analysis. I think what coaches do even now on efficiency metrics is way beyond what was done 30 years ago. As Bill Parcells said, if you're using legal pads and your opponents are using computers, you're going to lose. And it's clearly easier to collect data now than ever before.

But we're not there yet, and I too haven't found too much that would really tangibly translate. I do think the Football Outsiders stuff is very fan/fantasy/etc oriented and really lacks a translatable basis to real football. Yet there's always hope it'll get there.

My website: Smart Football
smartfootball.com

My new book, The Art of Smart Football, is now out:
amzn.to/1fK94FM

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Oct 30, 2011 17:26:30 GMT -6

The biggest difference in HS is the wild variations in scheme and talent, so you can't get good data for stuff like win probability and expected points. That is a good point that on second and short you want to maximize your expected gain, or maybe not even that, but maximize your chances of a big play. However, usually I'm happy with just 4 yards. There just isn't enough good data.

dwm042
Probationary Member

Posts: 6

Post by dwm042 on Nov 1, 2011 11:15:01 GMT -6

The only thing preventing people from making expected points charts and win probabilities of high school games is the availability of play by play data. If, for example, I could log onto the site of the Georgia High School Association or the Georgia High School Football Historian's Association and download the play by play of between 2500 and 3000 high school games (that's about the size of Brian Burke's data set), then I could crank out expected points models for that sample as precise as Brian's are for the NFL.

It's not rocket science, it's mostly about figuring out how to score plays in a game.

Now in terms of whether the abstract models would affect decision making, that's yet to be determined. I'd suspect personal experience to be more valuable than theory, and study of video more informative than anything derived from play by play analysis. That said, if you knew what the odds were of a safety on your 1 yard line, or the odds of scoring a TD from the opponent's 1, that might factor into your thinking, and you need to know those things to calculate an expected points model.
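As a rough sketch of what "scoring plays" might look like in code, the snippet below averages the value of the next score by field-position zone. This is a simplified illustration of the general idea only, not Brian Burke's actual model. The record layout (yards from own goal line, signed points of the next score) and the four sample records are assumptions made up for the example.

```python
from collections import defaultdict


def expected_points_by_zone(plays, zone_size=10):
    """Average next-score value per field zone.

    plays: list of (yards_from_own_goal, next_score_points) tuples, where
    next_score_points is positive if the offense scored next, negative if
    the opponent did, and 0 if nobody scored before the half ended.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for yards_from_own_goal, next_score_points in plays:
        zone = (yards_from_own_goal // zone_size) * zone_size
        totals[zone] += next_score_points
        counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in sorted(counts)}


# Hypothetical records, far too few to mean anything.
plays = [(5, -2), (8, 0), (95, 7), (99, 3)]
print(expected_points_by_zone(plays))
```

With a few thousand games logged this way, the zone averages start to form a curve, and a play can then be judged by whether it moved the offense to a zone with a higher average.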

D-

codeandfootball.wordpress.com

pmeisel
Junior Member

Posts: 324

Post by pmeisel on Nov 2, 2011 21:35:48 GMT -6

"I have a feeling at the high school level that if I ran a regression model on wins and losses and used forty times and height/weight I'd find those correlate pretty highly with wins and losses, as opposed to other popular stats or even less popular ones like success rate/etc."

Post hoc, ergo propter hoc. Spurious correlation. The bane of analysts.

Analyzing data is fun, for certain sick mathematically addicted people like me. The hard part about analysis is trying to isolate what you are really trying to figure out, and just getting to the data that tells you about that. The decision orientation -- choice of plays and strategies -- is where the gold is, and I don't think anyone will find it just from general season statistics.

However, I think looking at overview statistics might prompt asking the kind of questions and pursuing the lines of thinking that might lead you somewhere. I have found that to be true in other fields of endeavor..............

Coach Huey
Administrator

Board Founder & CEO

High School Athletic Director & Head Football Coach (TX)

Posts: 10,782

Post by Coach Huey on Nov 3, 2011 10:13:53 GMT -6

i have no clue what you guys are talking about...

but, when we look at our "success/no-success" we start by looking at total series ... then number of series that resulted in points... we look at those that didn't result in points and classify as "turnover, 3&out, 1st down but punted, turnover on downs"... we then look at plays called that led to a turnover (was there a particular pass play that kept getting picked? was it one player that did most of the fumbling?), so on and so forth.

simple, I know, but it relates more to playcalling.... for us guys that aren't too concerned with how many yards per snap we gained on inside zone to the right for the last 3 years.
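That series-by-series tally is easy to keep in a short script. The sketch below is one possible way to do it. The outcome labels follow the categories named in the post, while the function name and the sample log are invented for illustration.

```python
from collections import Counter


def summarize_series(outcomes):
    """Return {outcome: (count, share of all series)}, most common first."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: (count, count / total) for label, count in counts.most_common()}


# Hypothetical log of how each offensive series ended.
series_log = [
    "points", "3&out", "turnover", "points",
    "1st down but punted", "3&out", "turnover on downs", "points",
]

for label, (count, share) in summarize_series(series_log).items():
    print(f"{label:20s} {count:2d}  {share:.0%}")
```

Adding the play call and ball carrier to each turnover entry would support the follow-up questions about which play or player was involved.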

Yash
Executive Member

wide zone

Posts: 2,101

Post by Yash on Nov 3, 2011 16:01:27 GMT -6

sounds like a job for my janitor to figure out. this thread is too smart for me...

squareup.com/store/coachyash/
Clinic DVDs
www.coachyash.com

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Nov 26, 2011 21:06:05 GMT -6

I ran our success rate using 40% of the needed yardage on 1st down, 50% on 2nd down, and a first down on 3rd or 4th. We had a success rate of 40% on the year. I ignored plays that didn't "fit," like kneels and spikes and give-up runs on 3rd and forever. Now I have some data, but this is kind of meaningless in a vacuum. We won 1 game, and it was the only game where our success rate was better than the opponent's, but they were awful. Has anyone got numbers to compare against?

pmeisel
Junior Member

Posts: 324

Post by pmeisel on Nov 27, 2011 13:09:51 GMT -6

We won 1 game, and it was the only game where our success rate was better than the opponent's

You're on to something. Look at data from some of the more successful teams in your league. If you don't have all their data, maybe use a composite of just the data of the better teams in the game they played you. That will give you something to shoot at.

A success rate of 40% sounds to me like you were 3 and out over half the time. Your gut can tell you that's not good enough, without any math at all.....

jonnyjon
Sophomore Member

cOUrage

Posts: 141

Post by jonnyjon on Nov 28, 2011 16:08:22 GMT -6

I don't think ypp is an overrated stat at all. I've found that it is pretty good at predicting bowl game winners from teams with comparable schedules (comparable conference strength).

your ypp is 5.6 which would be good for 59th in division 1. I would not call that fantastic by any means. I realize much consideration has to be taken in the difference between div 1 and high school ball but this gives me enough info to believe that 5.6 ypp offensively is less than fantastic.

you guys are describing all these made up scenarios about a high ypp but that is usually not how it happens in real life. If you are taking a whole season of data, the ypp for the year should be reasonably solid considering the number of "trials" you had.
I'm sure I can find problems with all the other statistics you guys might choose as well. For example, I've seen teams put up tons of points and have terrible 3rd down percentages.

clement, I'd be willing to bet that you guys gave up more than 5.6ypp if you only won one game.

Time is the only valuable. - Randy Pausch
There is no secret ingredient - Po

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Nov 28, 2011 20:03:31 GMT -6

Well, we gained 5.6 ypp but allowed 7.3, which IS a problem, but 1) I knew that already, 2) the two shouldn't correlate to that great a degree (great O can exist independently of great D), and 3) in fact, I KNOW our defense stinks, because one coach taught the DE's to spill, the other taught them to contain, and nobody taught them the difference. What I want to do is to confirm what my gut tells me, that we don't execute well, but rather rely on our great athletes to bail us out of 3rd and 20.

From what I've seen and the math I've done so far, it certainly seems to point to that, given:

1/2 our plays are for 2 yards or less
10% of our plays are for 10 or more yards
15% of our plays are for 15 or more yards
20% of our plays are for 20 or more yards (weird how that worked out)
Our success rate is only 40%
We had 17 fumbles and 4 interceptions (not all fumbles lost, but almost all of them killed the drive)
We only threw (with some exceptions, mostly by accident) into zones 1, 3, 7, and 9. Hitches, slants, and bombs-away sideline passes.
Our offense, at a glance, actually does look like those extreme scenarios posted above, because our QB insists on going deep to his best friend, and our RB thinks he can outdance entire teams. When it works, it's great, but it isn't often.

BUT, these numbers don't have any value without knowing what a decent team SHOULD look like, so I come to you good gentlemen in the fervent hope that someone has done some similar work so that we might compare notes.
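For anyone wanting to produce the same kind of breakdown from their own play log, here is a small sketch. The thresholds mirror the ones used in this post, while the function name and the ten sample gains are hypothetical.

```python
def share_at_least(gains, threshold):
    """Fraction of plays gaining at least `threshold` yards."""
    return sum(1 for g in gains if g >= threshold) / len(gains)


def share_at_most(gains, threshold):
    """Fraction of plays gaining `threshold` yards or fewer."""
    return sum(1 for g in gains if g <= threshold) / len(gains)


# Hypothetical yards gained on ten plays.
gains = [0, 1, 2, 2, 3, 5, 8, 11, 16, 25]

print(f"2 yards or less: {share_at_most(gains, 2):.0%}")
for threshold in (10, 15, 20):
    print(f"{threshold}+ yards: {share_at_least(gains, threshold):.0%}")
```

One sanity check on any table like this: the share of plays at or above a threshold can only stay level or shrink as the threshold rises, since every 20-yard play is also a 10-yard play.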

jonnyjon
Sophomore Member

cOUrage

Posts: 141

Post by jonnyjon on Nov 28, 2011 22:31:25 GMT -6

how do you have 10% of plays 10 yards or more and 20% of plays 20 yards or more? shouldn't you have 20% + of your plays be 10 yards or more then?

Time is the only valuable. - Randy Pausch
There is no secret ingredient - Po

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Nov 29, 2011 1:31:18 GMT -6

got it backwards, sorry. 20% > 10, 15% > 15, 10% > 20

pmeisel
Junior Member

Posts: 324

Post by pmeisel on Dec 11, 2011 18:36:20 GMT -6

When you say half of your plays are for 2 yards or less -- how many of those are incomplete passes? Or are they excluded from that statistic?

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Dec 11, 2011 18:51:58 GMT -6

The vast majority of those bad plays are incompletions, but just under half of our run plays were for 2 or less, vice just over half for our passes.

pmeisel
Junior Member

Posts: 324

Post by pmeisel on Dec 11, 2011 19:52:52 GMT -6

OK. For your analysis purposes I would split your runs from your passes. They are just not comparable to each other -- even great QBs and receivers have some incompletions, but good running offenses do not get stopped for zero very often.

Do you have similar stats for your opponents, or better, for your league rivals against common opponents? That would give you a good benchmark for what success looks like in your conference. What your opponents do against your defense is one thing, but how do they do against the defenses you play?

A lot of numbers can be mind-numbing. Perhaps focusing on just one or two rivals whose success you want to emulate will be easier to deal with. For example, if you are a run-first option team, compare yourself to other run-first teams. If you throw the ball around a lot, choose a rival who is similar. Comparing Mike Leach's Texas Tech teams to a pound-it-out two-back squad wouldn't tell you much.

Half of your run plays for two or less sounds like a problem to me. The difference between two and three is more than a yard, it's getting in a hole.

Don't give up. Asking the questions is where the real learning is, the math is just arithmetic.

Chris Clement
Executive Member

Posts: 10,463

Post by Chris Clement on Dec 11, 2011 20:51:20 GMT -6

I don't have any third-party games, just our own stuff, and the problem is that our defense also stinks and we're down so much so fast the numbers are bunk. The team has much bigger issues, like two coaches, one who drills block-down-step-down one day and another who wants DE contain.

pmeisel
Junior Member

Posts: 324

Post by pmeisel on Dec 17, 2011 8:02:05 GMT -6

Well, coach, there is still a way to benchmark yourself. You need to get 10 yards in four plays for a first down, and you really want to get them in 3 downs.

To make the math simple, categorize your plays into groups, like Loss, 0-2, 3-5, 5-9. Then assign a probability to each based on your stats, and build what I call a "reverse decision tree" (forgot the proper name long ago).

E.g. it's 1st and ten. You have 4 possible outcomes for starting second down based on the groups above, with a percent assigned to each... then 16 possible outcomes for starting third down... chain calculate the percentages to see how often you achieve a first down.
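The chain calculation can also be done in a few lines of Python instead of by hand. This is a sketch under stated assumptions, not the exact tree described above: each group is represented by a single typical yardage, a fifth "10 or more" group is added so that a one-play first down is possible, and the probabilities are invented. Real numbers would come from your own play log.

```python
def first_down_probability(outcomes, yards_needed=10, plays_left=3):
    """Chance of gaining `yards_needed` within `plays_left` plays.

    outcomes: list of (typical_yards, probability) pairs, one per group,
    with probabilities that sum to 1. Walks every branch of the tree.
    """
    if yards_needed <= 0:
        return 1.0  # first down already earned on this branch
    if plays_left == 0:
        return 0.0  # out of plays on this branch
    return sum(
        probability
        * first_down_probability(outcomes, yards_needed - yards, plays_left - 1)
        for yards, probability in outcomes
    )


# Hypothetical groups: (typical yards for the group, share of plays).
outcomes = [
    (-2, 0.10),  # loss
    (1, 0.40),   # 0-2 yards
    (4, 0.30),   # 3-5 yards
    (7, 0.10),   # longer gains short of 10
    (12, 0.10),  # 10 or more (assumed extra group)
]

print(f"First down within 3 plays: {first_down_probability(outcomes):.1%}")
print(f"First down within 4 plays: {first_down_probability(outcomes, plays_left=4):.1%}")
```

Shifting a few percentage points from one group to another and rerunning shows how much a small improvement in short-gain plays changes the conversion rate.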

Your ultimate goal is to score, but you can't if you don't have the ball, and first downs are what let you keep the ball, so they are your intermediate goal. 10 yards in 3 plays is an easy benchmark to understand, and statistically the approach above isn't too overwhelming for a notepad and hand calculator.

At any level of the game, offenses that average over 4 yards a carry, and over 60% completion, do well because those numbers will sustain drives. No reason to over think it.
