...This reduces model complexity, training time, and a whole whack load of hyperparameters we don’t have to worry about. Every video will be subsampled down to 40 frames. So a 41-frame video and a 500-frame video will both be reduced to 40 frames, with the 500-frame video essentially being fast-forwarded. We won’t do much preprocessing. A common preprocessing step for video classification is subtracting the mean, but we’ll keep the frames pretty raw from start to finish.