Three weeks ago, Klay Thompson of the Golden State Warriors scored 60 points in 29 minutes. The concept of the hot hand is simple and intuitive: a basketball player who is shooting well will continue to shoot well, whether because they are in the mood or because something is actually affecting their ability.
But is there a mathematical basis for the "hot hand"?
To examine this phenomenon, we sourced data on every shot taken in one NBA season, 2014-2015. Previous studies were limited to a single basketball team or a single college team. This time, we examine data for the entire league.
The dataset had no columns for a player’s team name, the opposing team, or the date of the game. Instead, a single column called “MATCHUP” held all of this information, e.g. “MAR 04, 2015 - CHA @ BKN.” We therefore extracted the relevant data from this column into three columns: “MATCHUP,” which now holds the date of the game; “TEAM,” the player’s team; and “OPPONENT,” the opposing team in that game.
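As a rough illustration of that extraction, here is a minimal sketch in plain Python. It assumes the format shown in the example string: a date, a dash, then two three-letter team codes. The function name `split_matchup`, the regular expression, the reading of the first code as the player's team, and the handling of a "vs." separator alongside "@" are all assumptions made for illustration, not details taken from the dataset's documentation.

```python
import re
from datetime import date

# Month abbreviations as they appear in the example string (assumed upper case).
MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}

# Assumed layout: "MAR 04, 2015 - CHA @ BKN" (or "vs." instead of "@").
MATCHUP_RE = re.compile(
    r"(?P<mon>[A-Z]{3}) (?P<day>\d{1,2}), (?P<year>\d{4}) - "
    r"(?P<team>[A-Z]{3}) (?:@|vs\.) (?P<opponent>[A-Z]{3})"
)

def split_matchup(matchup):
    """Split one raw MATCHUP string into date, team, and opponent."""
    m = MATCHUP_RE.fullmatch(matchup.strip())
    if m is None:
        raise ValueError(f"unexpected MATCHUP format: {matchup!r}")
    game_date = date(int(m.group("year")), MONTHS[m.group("mon")], int(m.group("day")))
    return {
        "MATCHUP": game_date,               # the column now holds only the date
        "TEAM": m.group("team"),            # assumed: first code is the player's team
        "OPPONENT": m.group("opponent"),
    }

row = split_matchup("MAR 04, 2015 - CHA @ BKN")
print(row["MATCHUP"].isoformat(), row["TEAM"], row["OPPONENT"])
```

Raising an error on an unexpected format, rather than returning empty values, makes it easier to spot rows that do not follow the assumed layout.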
In addition, “SHOT_RESULT” took values of “missed” or “made.” This was encoded into binary values, with 1 for a made shot and 0 for a missed one. The “LOCATION” column was encoded the same way: “A,” an away game, became 0, and “H,” a home game, became 1. The “W” column was converted similarly: “W,” a win, became 1, and “L,” a loss, became 0.
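The encoding described above can be sketched as a set of lookup tables applied to each row. The column names and values come from the text; the `ENCODINGS` table and the `encode_row` helper are hypothetical names used only for this example.

```python
# Lookup tables for the three categorical columns described in the text.
ENCODINGS = {
    "SHOT_RESULT": {"made": 1, "missed": 0},
    "LOCATION": {"H": 1, "A": 0},
    "W": {"W": 1, "L": 0},
}

def encode_row(row):
    """Return a copy of the row with the three columns mapped to 0/1."""
    encoded = dict(row)
    for column, mapping in ENCODINGS.items():
        # A KeyError here signals a value outside the expected categories.
        encoded[column] = mapping[row[column]]
    return encoded

example = {"SHOT_RESULT": "made", "LOCATION": "A", "W": "L"}
print(encode_row(example))
```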
We then checked the data for null values and found that one column, SHOT_CLOCK, had 5567 of them. We imputed these values using linear regression, a reasonable choice because the variable is continuous. The imputed values were reasonable as well, with none above 24 seconds, the maximum.
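To make the idea of regression imputation concrete, here is a minimal sketch with a single predictor, written in plain Python. Which predictors were used for SHOT_CLOCK is not stated here, so the predictor `xs` and both function names are purely hypothetical; the sketch only shows the general technique of fitting a line on the rows where the value is known and predicting it where it is missing.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def impute_with_regression(xs, ys):
    """Fill None entries in ys with predictions from a line fit on known rows."""
    known = [(x, y) for x, y in zip(xs, ys) if y is not None]
    slope, intercept = fit_line([x for x, _ in known], [y for _, y in known])
    return [y if y is not None else intercept + slope * x
            for x, y in zip(xs, ys)]

# Toy data: a hypothetical predictor and a target with one missing value.
predictor = [1, 2, 3, 4]
shot_clock = [2.0, 4.0, None, 8.0]
filled = impute_with_regression(predictor, shot_clock)
print(filled)

# Sanity check in the spirit of the text: nothing should exceed the 24-second maximum.
print(max(filled) <= 24)
```

Observed values are left untouched; only the missing entries are replaced by predictions, which is why a range check on the filled column is a useful final step.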