An article in The Seattle Times reported that “an orange used car is least likely to be a lemon.” This discovery surfaced in a competition hosted by Kaggle to predict bad buys among used cars using a labeled dataset.
Of the 72,983 used cars in the dataset, 8,976 (12.3%) were bad buys. Yet of the 415 orange cars, only 34 (8.2%) were bad. The visualization behind the claim was entirely appropriate and accurate, but it was susceptible to the small-sample effect, and so it led to an incorrect conclusion.
This white paper dives into the details and explores techniques, particularly Target Shuffling, to avoid making the same mistake.
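To make the idea concrete before going into detail, here is a minimal sketch of the shuffling step, built only from the four counts quoted above. It is an illustration under stated assumptions, not necessarily the procedure used later in this paper: the function names, the number of trials, and the seed are all hypothetical. It also checks a single color in isolation. A fuller target-shuffling comparison would record the most extreme rate across every color in each shuffle, which needs per-color counts that are not given here.

```python
import random

# Counts quoted in the text above.
N_CARS = 72_983        # used cars in the dataset
N_BAD = 8_976          # bad buys overall
N_ORANGE = 415         # orange cars
N_ORANGE_BAD = 34      # bad buys among orange cars


def shuffled_bad_counts(n_trials=2000, seed=0):
    """Bad-buy counts among the orange cars after shuffling the target.

    Drawing N_ORANGE labels without replacement is equivalent to
    shuffling the whole good/bad column and reading off the orange rows,
    and it is much cheaper than shuffling all 72,983 labels each time.
    n_trials and seed are illustrative choices (assumptions).
    """
    rng = random.Random(seed)
    labels = [1] * N_BAD + [0] * (N_CARS - N_BAD)
    return [sum(rng.sample(labels, N_ORANGE)) for _ in range(n_trials)]


def fraction_at_least_as_extreme(counts, observed=N_ORANGE_BAD):
    """Share of shuffles with as few bad buys as observed, or fewer."""
    return sum(c <= observed for c in counts) / len(counts)


if __name__ == "__main__":
    counts = shuffled_bad_counts()
    print(f"overall bad rate: {N_BAD / N_CARS:.1%}")
    print(f"orange bad rate:  {N_ORANGE_BAD / N_ORANGE:.1%}")
    print("share of shuffles at least as extreme: "
          f"{fraction_at_least_as_extreme(counts):.4f}")
```

Because the shuffled labels have no real relationship to color, the spread of these counts shows how much the bad-buy rate of a 415-car group can move by chance alone. Read on its own, the resulting share understates the role of luck, since orange was singled out only after all colors had been compared.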