Target Shuffling is a process for testing the statistical validity of data mining results. It is especially useful for identifying false positives, where an apparent relationship in the data arose by chance. Learn more from the creators of the process.
In one comprehensive resource, Practical Text Mining and Statistical Analysis for Non-Structured Text Data Applications provides complete coverage of statistical and analytical concepts, techniques, and applications for text mining.
The Handbook of Statistical Analysis and Data Mining Applications is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers (both academic and industrial) through all stages of data analysis, model building and implementation.
Ensemble methods have been called the most influential development in data mining and machine learning in the last decade. They combine multiple models into one that is usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges (from investment timing to drug discovery, and fraud detection to recommendation systems) where predictive accuracy is more vital than model interpretability.
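To see why a combined model can beat its best component, here is a minimal sketch of the simplest case: majority voting among classifiers that err independently. It is an idealized illustration and is not taken from the book. The function name `majority_vote_accuracy` and the 70% per-model accuracy are assumptions chosen for the example, and real models rarely make fully independent errors, so actual gains are usually smaller.

```python
from math import comb


def majority_vote_accuracy(n_models: int, p: float) -> float:
    """Probability that a strict majority of n_models is correct.

    Assumes (hypothetically) that every model is correct with the same
    probability p and that the models' errors are independent.
    """
    if n_models < 1 or n_models % 2 == 0:
        raise ValueError("use a positive, odd number of models to avoid ties")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")

    # Smallest number of correct votes that forms a strict majority.
    k_min = n_models // 2 + 1

    # Sum the binomial probabilities of k_min, k_min + 1, ..., n_models
    # correct votes.
    return sum(
        comb(n_models, k) * p**k * (1 - p) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )


if __name__ == "__main__":
    # Compare a single model with growing ensembles of equally good models.
    for n in (1, 3, 5, 11):
        print(f"{n:2d} models: {majority_vote_accuracy(n, 0.7):.3f}")
```

Under these assumptions, three models that are each right 70% of the time reach about 78% accuracy when they vote, and the figure keeps rising as more independent models are added. The benefit depends on each model being better than chance: with p below 0.5, voting makes the ensemble worse.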