NFL Betting Markets

The landscape of sports entertainment is evolving rapidly, with sports betting emerging as a major player. Once a niche hobby, it has now become mainstream, attracting millions of new enthusiasts. The widespread legalization and accessibility of betting apps are driving this growth, making it easier than ever to participate. Moneylines, spreads, and over/unders have become a key part of nearly every sports broadcast.

These three game-level markets are regarded as some of the most accurate in sports betting, and it is tough to build a strategy that stays profitable against them in the long run. Research suggests the market is nearly efficient, but that the interactions between moneyline, spread, and total create some inefficiencies that can be exploited.

If these three markets have inefficiencies, it is possible that other markets have the same or even more exploitable gaps. The goal here is to use the sportsbooks' data against each other, similar to a +EV strategy. Using the moneyline markets and their implied win percentages, we can simulate the NFL season thousands of times and compare the results against futures markets for team win totals and division winners.

Data Collection and Cleanup

nflfastR is a super useful package for analysis like this, and its schedule data has everything we need to build the simulation. We can calculate the implied probability of each team winning from its moneyline, then remove the vig (the sportsbook's built-in margin) so that the two probabilities sum to 1. The lines that nflfastR has are sourced from DraftKings.

library(dplyr)

# Get 2024 NFL Schedule
games <- nflfastR::fast_scraper_schedules(2024)

# calculate implied probability for each team and remove the vig
games_with_win_probability <- games %>%
  filter(season == 2024) %>%
  mutate(
    implied_prob_home = if_else(home_moneyline > 0, 100 / (home_moneyline + 100), -home_moneyline / (-home_moneyline + 100)),
    implied_prob_away = if_else(away_moneyline > 0, 100 / (away_moneyline + 100), -away_moneyline / (-away_moneyline + 100)),
    total_prob = implied_prob_home + implied_prob_away,
    home_team_win_prob = implied_prob_home / total_prob,
    away_team_win_prob = implied_prob_away / total_prob
  ) %>%
  select(season, week, home_team, away_team, home_team_win_prob, away_team_win_prob)
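
To make the vig removal concrete, here is a minimal base-R sketch of the same arithmetic for a single game. The prices (-150 home, +130 away) are hypothetical and not taken from the schedule data, and `american_to_implied` is a helper name introduced only for this illustration. Dividing by the sum is the same proportional normalization used in the pipeline above; it is one common way to remove the vig, not the only one.

```r
# Convert American moneyline odds to the raw (vig-inclusive) implied probability.
# Helper name is hypothetical, introduced for this sketch only.
american_to_implied <- function(moneyline) {
  ifelse(moneyline > 0,
         100 / (moneyline + 100),
         -moneyline / (-moneyline + 100))
}

# Hypothetical prices for one game (not from the 2024 schedule data).
home_moneyline <- -150
away_moneyline <- 130

implied_home <- american_to_implied(home_moneyline)  # 150 / 250
implied_away <- american_to_implied(away_moneyline)  # 100 / 230

# The raw probabilities sum to more than 1; the excess is the vig.
overround <- implied_home + implied_away

# Proportional normalization so the two probabilities sum to 1.
home_win_prob <- implied_home / overround
away_win_prob <- implied_away / overround

print(round(c(raw_home = implied_home, raw_away = implied_away,
              overround = overround,
              home = home_win_prob, away = away_win_prob), 4))
```

With these prices the raw probabilities are 0.60 and roughly 0.435, which sum to about 1.035, so the home team's de-vigged win probability comes out near 0.58.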

Creating Simulation Model

Another great package for something like this is nflseedR, which can simulate the NFL season thousands of times with one line of code and provide a digestible summary with another. The most complex part is writing the function that tells the simulation how to decide each game.

sim_model <- function(teams, games, week_num, win_probs, ...) {
  # Check if win probabilities are already joined
  if (!"home_team_win_prob" %in% colnames(games)) {
    # Merge win probabilities with the games dataframe only if not already present
    games <- games %>%
      left_join(win_probs, by = c("home_team", "away_team", "week"))
  }
  
  # Update the result based on win probabilities for the current week
  games <- games %>%
    mutate(rand_number = runif(nrow(games))) %>%
    # add playoff logic, not relevant for this model so kept it simple
    mutate(
      home_team_win_prob = ifelse(week > 18, 0.65, home_team_win_prob),
      away_team_win_prob = ifelse(week > 18, 0.35, away_team_win_prob),
      result = case_when(
        !is.na(result) | week != week_num ~ result,
        home_team_win_prob > rand_number ~ 1,
        home_team_win_prob < rand_number ~ -1,
        TRUE ~ 0
      )
    )
  
  # Return the modified games dataframe
  return(list(teams = teams, games = games))
}
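
The core of the function above is a single comparison: draw a uniform random number and give the home team the win when the draw falls below its win probability. The sketch below isolates that step in base R, assuming a hypothetical 60% home win probability and an arbitrary seed, to show that the share of home wins converges on the input probability over many draws.

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible

home_team_win_prob <- 0.60  # hypothetical de-vigged home win probability
n_draws <- 100000

# One uniform draw per simulated game. The home team wins (result = 1) when
# the draw falls below its win probability, mirroring the case_when() above.
rand_number <- runif(n_draws)
result <- ifelse(home_team_win_prob > rand_number, 1, -1)

home_win_rate <- mean(result == 1)
print(home_win_rate)
```

The printed rate should land very close to 0.60; the sign of `result` is all the simulation needs to know who won.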

This analysis is only concerned with the regular season, so the playoff logic is very simple: the home team wins 65% of the time in the playoffs. The next code chunk includes the one line of code that runs the simulations, and the other that produces the summary table.

library(nflseedR)

set.seed(123)

sims <- simulate_nfl(nfl_season = 2024, process_games = sim_model, simulations = 10000, win_probs = games_with_win_probability)
## ℹ 16:38:23 | Loading games data
## ℹ 16:38:23 | Beginning simulation of 10000 seasons in 6 rounds
## ℹ 16:41:09 | Combining simulation data
## ℹ 16:41:09 | Aggregating across simulations
## ℹ 16:41:10 | DONE!
summary(sims)
Simulating the 2024 NFL Season
Summary of 10k Simulations using nflseedR
(Team logos from the rendered table are not reproduced here; each row holds one team per column group.)

AVG.  Make  Win   No.1  Win   Win   No.1  Top-5 | AVG.  Make  Win   No.1  Win   Win   No.1  Top-5
WINS  POST  DIV   Seed  Conf  SB    Pick  Pick  | WINS  POST  DIV   Seed  Conf  SB    Pick  Pick
EAST
9.7   62%   35%   9%    9%    6%    <1%   5%    | 9.8   69%   44%   11%   11%   4%    <1%   4%
9.6   61%   34%   9%    9%    6%    <1%   3%    | 9.6   68%   42%   10%   10%   3%    <1%   3%
9.3   56%   29%   6%    7%    5%    <1%   6%    | 7.1   20%   7%    <1%   2%    <1%   6%    31%
5.7   5%    2%    <1%   <1%   <1%   16%   53%   | 6.9   17%   6%    <1%   1%    <1%   8%    34%
NORTH
10.4  75%   42%   16%   13%   9%    <1%   1%    | 9.9   69%   42%   12%   11%   3%    <1%   3%
10.2  69%   35%   13%   11%   7%    <1%   2%    | 9.4   60%   32%   8%    8%    3%    <1%   4%
8.8   44%   15%   5%    5%    3%    <1%   7%    | 8.7   46%   20%   5%    6%    2%    1%    10%
7.8   27%   8%    2%    3%    2%    2%    15%   | 7.2   21%   7%    <1%   2%    <1%   4%    24%
SOUTH
9.1   56%   42%   6%    8%    5%    <1%   5%    | 9.5   66%   52%   8%    10%   3%    <1%   6%
8.4   39%   26%   3%    5%    3%    2%    12%   | 7.9   36%   21%   2%    4%    2%    2%    16%
8.4   39%   25%   2%    5%    3%    2%    15%   | 7.9   34%   21%   2%    4%    1%    3%    19%
6.6   13%   7%    <1%   1%    1%    9%    38%   | 6.3   12%   6%    <1%   1%    <1%   13%   47%
WEST
11.0  86%   72%   25%   18%   11%   <1%   <1%   | 11.1  87%   68%   30%   19%   7%    <1%   <1%
8.7   41%   18%   4%    5%    3%    2%    12%   | 8.9   51%   19%   6%    6%    2%    <1%   7%
7.1   17%   6%    <1%   1%    <1%   5%    28%   | 7.7   26%   7%    1%    2%    <1%   3%    19%
6.4   9%    3%    <1%   <1%   <1%   10%   42%   | 7.2   19%   5%    <1%   2%    <1%   5%    27%

Comparing Results to Futures Markets

The simulation also makes it easy to see a team's probability of going over X number of wins, which is super helpful when looking at futures markets.

sims$team_wins[11:15,]
## # A tibble: 5 × 4
##   team   wins over_prob under_prob
##   <chr> <dbl>     <dbl>      <dbl>
## 1 ARI     5       0.798     0.0854
## 2 ARI     5.5     0.798     0.202 
## 3 ARI     6       0.626     0.202 
## 4 ARI     6.5     0.626     0.374 
## 5 ARI     7       0.425     0.374
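
The pattern in that output (the over probability is the same at 5 and 5.5, while the under probability jumps) is what you would expect if whole-number lines can push. The base-R sketch below reproduces that behavior on a hypothetical vector of simulated win totals. It is my reading of how the `over_prob` and `under_prob` columns behave, not a copy of nflseedR's internals, and `over_under` is a helper name made up for the illustration.

```r
# Hypothetical simulated win totals for one team across eight seasons.
sim_wins <- c(4, 5, 5, 6, 6, 7, 7, 8)

# Share of simulations landing over, under, or exactly on a given line.
# Helper name is hypothetical, introduced for this sketch only.
over_under <- function(wins, line) {
  c(line = line,
    over_prob = mean(wins > line),
    under_prob = mean(wins < line),
    push_prob = mean(wins == line))
}

whole_line <- over_under(sim_wins, 6)    # exactly 6 wins is a push
half_line <- over_under(sim_wins, 6.5)   # a half-win line cannot push

print(rbind(whole_line, half_line))
```

Moving the line from 6 to 6.5 leaves the over probability unchanged and shifts the push probability into the under, just as in the ARI rows above.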

FanDuel is one place to play these futures markets and check for EV. This CSV contains manually collected futures odds at the time of writing.

FD_futures <- read.csv("FD futures.csv") %>%
  mutate(Team = ifelse(Team == "LAR", "LA", Team))

head(FD_futures,5)
##   X Team Fanduel.Over.Under Fanduel.Over.Odds Fanduel.Under.Odds
## 1 1   SF               11.5               116               -142
## 2 2   KC               11.5              -122                100
## 3 3  PHI               10.5              -104               -118
## 4 4  BUF               10.5               122               -150
## 5 5  DAL               10.5               130               -160
##   FanduelOverProb FanduelUnderProb overWinnings underWinnings
## 1       0.4629630        0.5867769    1.1600000     0.7042254
## 2       0.5495495        0.5000000    0.8196721     1.0000000
## 3       0.5098039        0.5412844    0.9615385     0.8474576
## 4       0.4504505        0.6000000    1.2200000     0.6666667
## 5       0.4347826        0.6153846    1.3000000     0.6250000

It is straightforward to compare the simulation results to the FanDuel futures markets and look for any +EV plays.

FD_futures_w_sim <- FD_futures %>%
  # join futures with team wins
  left_join(sims$team_wins, by = c('Team' = 'team', 'Fanduel.Over.Under' = 'wins')) %>%
  # calculate EV for overs and unders
  mutate(
    overEV = (over_prob * overWinnings) - ((1 - over_prob) * 1),
    underEV = (under_prob * underWinnings) - ((1 - under_prob) * 1)
  )
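
As a worked example of the EV formula, the sketch below recomputes one bet by hand in base R. The inputs (+122 odds, 56.7% over probability) are the rounded figures from the NYG row of the overs output in this post; `american_to_winnings` and `expected_value` are helper names introduced here for illustration. The optional `p_lose` argument is an assumption of mine that goes beyond the original formula: on a whole-number line some simulations push, so the probability of losing is smaller than `1 - p_win`.

```r
# Profit on a winning $1 stake, given American odds.
# Helper names are hypothetical, introduced for this sketch only.
american_to_winnings <- function(odds) {
  ifelse(odds > 0, odds / 100, 100 / -odds)
}

# EV per $1 staked. p_lose defaults to 1 - p_win, which matches the
# overEV / underEV formula above when the line cannot push.
expected_value <- function(p_win, odds, p_lose = 1 - p_win) {
  p_win * american_to_winnings(odds) - p_lose
}

# Rounded figures from the NYG row of the overs output.
nyg_over_ev <- expected_value(p_win = 0.567, odds = 122)
print(round(nyg_over_ev, 3))

# Hypothetical whole-number line where 10% of simulations push.
push_ev <- expected_value(p_win = 0.50, odds = -110, p_lose = 0.40)
print(round(push_ev, 3))
```

The NYG figure comes out at about 0.259 per $1 staked, in line with the 25.9% shown in the overs output. In the hypothetical push case, treating the pushes as losses would turn a slightly positive EV into a negative one, which matters if any futures line is a whole number.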

Checking for +EV Overs on Futures:

library(scales)  # provides percent() and dollar()

FD_futures_w_sim %>%
    filter(overEV > 0) %>%
    select(Team, overEV, Fanduel.Over.Under, Fanduel.Over.Odds, overWinnings, over_prob) %>%
    arrange(desc(overEV)) %>%
    mutate(overEV = percent(round(overEV, 3)),
           over_prob = percent(round(over_prob, 3)),
           overWinnings = dollar(overWinnings))
##   Team overEV Fanduel.Over.Under Fanduel.Over.Odds overWinnings over_prob
## 1  DEN  31.2%                5.5              -102        $0.98     66.2%
## 2  NYG  25.9%                6.5               122        $1.22     56.7%
## 3   NE  17.0%                4.5              -162        $0.62     72.4%
## 4  CAR  14.6%                5.5              -134        $0.75     65.6%
## 5  WAS  14.4%                6.5              -115        $0.87     61.2%
## 6   LV  11.9%                6.5              -122        $0.82     61.5%
## 7  TEN   5.3%                6.5               110        $1.10     50.2%
## 8  MIN   5.3%                7.5               138        $1.38     44.2%
## 9   NO   2.8%                7.5              -130        $0.77     58.1%

Checking for +EV Unders on Futures:

FD_futures_w_sim %>%
    filter(underEV > 0) %>%
    select(Team, underEV, Fanduel.Over.Under, Fanduel.Under.Odds, underWinnings, under_prob) %>%
    arrange(desc(underEV)) %>%
    mutate(underEV = percent(round(underEV, 3)),
           under_prob = percent(round(under_prob, 3)),
           underWinnings = dollar(underWinnings))
##    Team underEV Fanduel.Over.Under Fanduel.Under.Odds underWinnings under_prob
## 1   HOU  26.40%                9.5                122         $1.22      57.0%
## 2   DET  21.10%               10.5               -105         $0.95      62.0%
## 3    KC  18.90%               11.5                100         $1.00      59.5%
## 4   CIN  18.30%               10.5                110         $1.10      56.3%
## 5   PHI  17.30%               10.5               -118         $0.85      63.5%
## 6   BUF  13.00%               10.5               -150         $0.67      67.8%
## 7   ATL  10.70%                9.5                120         $1.20      50.3%
## 8    GB  10.60%                9.5                112         $1.12      52.2%
## 9   BAL   7.70%               10.5                112         $1.12      50.8%
## 10  DAL   7.30%               10.5               -160         $0.62      66.0%
## 11  CHI   6.40%                8.5                130         $1.30      46.2%
## 12  NYJ   6.30%                9.5                132         $1.32      45.8%
## 13  MIA   5.50%                9.5                100         $1.00      52.7%
## 14  JAX   2.30%                8.5               -105         $0.95      52.4%
## 15  LAC   1.10%                8.5                118         $1.18      46.4%
## 16  SEA   0.70%                7.5                112         $1.12      47.5%
## 17  PIT   0.60%                8.5               -168         $0.60      63.1%

Assumptions and Limitations

This +EV model assumes that pre-season moneyline markets are efficient, which is not the case. Moneylines shift a lot over the course of the season as new data becomes available and as players and teams over- or underperform pre-season expectations.

It isn’t a coincidence that all of the high-EV unders are teams the market expects to be good, and all of the high-EV overs are teams the market expects to be bad. Aggregating individual game lines directly to the season level may not be a sound methodology, because the risk the sportsbook is willing to take on, as well as the variability of outcomes, differs between the game level and the season level.

There isn’t sufficient pre-season moneyline data easily available to back-test this methodology.

Takeaways

This model is just one way to analyze the futures win total market. It is by no means perfect, but it is meant as an interesting and different way to find potentially +EV bets.