Target Shuffling

Target Shuffling is a process for testing the statistical accuracy of data mining results. It is especially useful for identifying false positives, that is, apparent relationships that are really due to chance. Learn more from the creators of the process.
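
As a rough illustration of the idea, the sketch below shuffles the target values many times, re-scores the relationship on each shuffled copy, and reports how often chance alone matches or beats the real result. It is a minimal sketch, not the creators' implementation: the function names (`correlation`, `target_shuffle_p_value`), the choice of correlation as the score, and the default of 1000 shuffles are all assumptions made for this example.

```python
import random


def correlation(xs, ys):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (no zero-variance check).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def target_shuffle_p_value(xs, ys, score=correlation, n_shuffles=1000, seed=0):
    """Estimate how often a shuffled target scores at least as well as the real one.

    `score` is a hypothetical stand-in for whatever model-quality measure
    is being tested; here it defaults to the absolute correlation.
    """
    rng = random.Random(seed)
    observed = abs(score(xs, ys))
    shuffled = list(ys)  # copy so the caller's data is left untouched
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between inputs and target
        if abs(score(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    inputs = list(range(30))
    target = [2 * x + 1 for x in inputs]  # toy data with a real relationship
    print(target_shuffle_p_value(inputs, target))
```

A small returned value suggests the observed result is unlikely to be a false positive; a large one suggests that shuffled, meaningless targets do about as well as the real one.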

The Ten Most Common Data Science Business Mistakes

Practical Text Mining – Applying Analytics and Modeling

In one comprehensive resource, Practical Text Mining and Statistical Analysis for Non-Structured Text Data Applications provides complete coverage of statistical and analytical concepts, techniques, and applications for text mining.

Handbook of Statistical Analysis and Data Mining Applications

The Handbook of Statistical Analysis and Data Mining Applications is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers (both academic and industrial) through all stages of data analysis, model building and implementation.

It is a Mistake to Rely on One Technique

Top 10 Data Science Mistakes eBook

Ensemble Methods in Data Mining

Ensemble methods have been called the most influential development in Data Mining and Machine Learning in the past decade. They combine multiple models into one that is usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges (from investment timing to drug discovery, and fraud detection to recommendation systems) where predictive accuracy is more vital than model interpretability.
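
To make the combining step concrete, here is a minimal sketch of one simple ensemble rule, majority voting, in Python. The labels and the three "models" are contrived toy values invented for this example (each model errs on different cases), and the helper names `majority_vote` and `accuracy` are hypothetical. Real ensembles are not guaranteed to improve this cleanly, since their component models' errors usually overlap.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model prediction lists by taking the most common label per case."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(predicted, truth):
    """Fraction of cases where the predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Contrived toy labels: each model is wrong on two cases, but never the same two.
truth = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 1, 0, 1, 0, 0, 0, 1]  # errs on the last two cases
model_b = [0, 1, 1, 1, 0, 1, 0, 0, 1, 0]  # errs on the first two cases
model_c = [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]  # errs on the third and fourth cases

ensemble = majority_vote([model_a, model_b, model_c])

if __name__ == "__main__":
    for name, preds in [("a", model_a), ("b", model_b), ("c", model_c)]:
        print(name, accuracy(preds, truth))
    print("ensemble", accuracy(ensemble, truth))
```

Because no two toy models make the same mistake, the vote outscores every individual model here; the benefit shrinks as the component errors become more correlated.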

It is a Mistake to Lack Relevant Data

It is a Mistake to Focus on Training Results

It is a Mistake to Ask the Wrong Questions

Are Orange Cars Really not Lemons?