At Elder Research, we don’t experiment on research projects in a lab; we deploy production-grade operational analytics solutions.
Our teams are trained to apply a disciplined, time-tested approach. It starts with a deep understanding of the client’s organizational context and an accurate framing of the business objectives. We then explore data, build usable models, and validate them against real-world criteria. Finally, we design and deploy powerful, user-friendly analytics solutions that integrate seamlessly into workflows, systems, and processes.
This rigorous design, analysis, and deployment framework is Elder Research’s Agile Data Science process.
Agile Data Science
Analytic modeling is a discovery task. It is difficult to know in advance which algorithms and variables, when combined, will reveal the meaningful and actionable secrets concealed in the data. To mitigate risk, Elder Research follows a rigorous Agile Data Science process in which all aspects of data discovery, modeling, and deployment are anchored on iterative development and frequent customer feedback. Our Agile Data Science process combines best practices for data science, as embodied by the Cross Industry Standard Process for Data Mining (CRISP-DM), with best practices for Agile software development.
This combination allows Elder Research to deliver higher-performing analytical models and solutions in less time and with less risk.
Our Agile Data Science process integrates the following practices:
Frame the Business Objectives
Having clearly defined business objectives is critical to the success of an analytics project. We have consistently achieved breakthrough solutions by combining the business domain expertise of our clients with our consulting expertise in business analysis, systems engineering, and data science modeling. Through this collaboration, we consider and assess the data, business goals, and organizational constraints in order to design and build a reliable and deployable analytics solution.
Explore and Transform Data
Elder Research has developed rigorous data preprocessing, cleansing, de-normalization, and extraction techniques and tools to ensure and improve data quality. With our Agile Data Science process and custom data analysis tools, the work of structuring and transforming the data is spread over a number of iterations rather than concentrated in a single data preparation phase. Careful attention to fatal data flaws, such as “leaks from the future” and survivor bias, ensures the data properly reflects what will be seen during operation.
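As an illustration only, one common guard against “leaks from the future” is to split data by time and to flag any input that was recorded after the moment the model would have been asked to predict. The sketch below is not Elder Research’s tooling; the field names `predict_on` and `feature_recorded_on`, the helper names, and the sample records are all hypothetical.

```python
from datetime import date


def time_ordered_split(records, cutoff):
    """Train only on records scored before `cutoff`; hold out the rest."""
    train = [r for r in records if r["predict_on"] < cutoff]
    holdout = [r for r in records if r["predict_on"] >= cutoff]
    return train, holdout


def find_future_leaks(records):
    """Return records whose input was captured after the prediction date."""
    return [r for r in records if r["feature_recorded_on"] > r["predict_on"]]


# Hypothetical sample data: record 2 uses an input recorded after its
# prediction date, so it would not be available in operation.
records = [
    {"id": 1, "predict_on": date(2023, 1, 15), "feature_recorded_on": date(2023, 1, 10)},
    {"id": 2, "predict_on": date(2023, 3, 1), "feature_recorded_on": date(2023, 3, 20)},
    {"id": 3, "predict_on": date(2023, 6, 1), "feature_recorded_on": date(2023, 5, 28)},
]

train, holdout = time_ordered_split(records, cutoff=date(2023, 4, 1))
leaks = find_future_leaks(records)
print([r["id"] for r in train], [r["id"] for r in holdout], [r["id"] for r in leaks])
```

Splitting on time rather than at random keeps the hold-out data strictly later than the training data, which is closer to how a deployed model encounters new cases.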
Ensemble Modeling Techniques
Our goal in every analytics project is to discover the most practical, highest-performing model with the lowest out-of-sample error; that is, the one that performs best when given new, unseen data. To achieve this goal, we have mastered many modeling techniques, and we study many model options because it is often surprising which algorithms perform best in new situations.
Model Validation
The final analytic model is selected using a rigorous cross-validation process that ensures the model is robust to changes in the data and underlying assumptions. We employ out-of-sample testing and validation to limit over-fit and over-search during model development. Elder Research judges the performance of each model iteration for its ability to accurately predict or classify the target variable in a hold-out sample of data. This helps to ensure the stability of the algorithm when it is deployed into production environments.
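To make the idea of out-of-sample comparison concrete, here is a minimal sketch of k-fold cross-validation that scores two candidate models only on rows they were not trained on. It assumes a single numeric input and a squared-error metric; the two toy models and all function names are illustrative choices, not the models or code used in practice.

```python
import random


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares straight line through the training points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cross_validated_mse(fit, xs, ys, k=5, seed=0):
    """Mean squared error measured only on held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # each row is held out exactly once
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_errors.extend((model(xs[i]) - ys[i]) ** 2 for i in held_out)
    return sum(squared_errors) / len(squared_errors)
```

Calling `cross_validated_mse(fit_line, xs, ys)` and `cross_validated_mse(fit_mean, xs, ys)` on the same data gives two directly comparable out-of-sample scores, since both use the same folds for a given `seed`.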
Target Shuffling is one technique we use for testing the statistical accuracy of our data mining results. It is particularly useful for identifying false positives: cases where two events or variables that occur together are perceived to have a cause-and-effect relationship when the association is merely coincidental. The more variables you have, the easier it becomes to ‘oversearch’ and identify (false) patterns among them, a problem called the ‘vast search effect’.
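The general idea can be sketched as follows: randomly shuffle the target values to break any real link with the inputs, repeat the same search on the shuffled data many times, and count how often chance alone produces a result at least as strong as the one found on the real data. This is a simplified illustration, assuming the “search” is picking the input column with the strongest absolute correlation; the function names and that choice of search are assumptions, not a description of Elder Research’s implementation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def best_abs_correlation(columns, target):
    """The 'search': try every candidate column, keep the strongest |r|."""
    return max(abs(pearson(col, target)) for col in columns)


def target_shuffle_p_value(columns, target, n_shuffles=1000, seed=0):
    """Share of shuffled-target searches that match or beat the real result."""
    rng = random.Random(seed)
    actual = best_abs_correlation(columns, target)
    shuffled = list(target)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any true input-target relationship
        if best_abs_correlation(columns, shuffled) >= actual:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return actual, (at_least_as_good + 1) / (n_shuffles + 1)
```

Because the whole search is repeated on every shuffle, the resulting proportion accounts for how many variables were examined, which is what makes it a check on the vast search effect rather than on a single pre-chosen variable.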
Production Deployment
Our teams start every project with the end in mind. Over many years of tackling complex analytics projects across many sectors and within hundreds of unique organizations, we have learned key lessons that ensure production deployment of valuable analytic solutions.
Our executives engage the C-suite and leaders across the organization to ensure that strategy and business value are realized and communicated. Our data scientists and engineers integrate with IT departments and systems owners to streamline and de-risk technology deployment. Our project and program leaders create and execute change management plans to increase adoption and end-user engagement.
We infuse decades of practical design, implementation, and deployment wisdom into our Agile Data Science process.