Businesses all over the world have been betting big on Big Data, making huge investments in big data storage and processing platforms like Hadoop, Spark, Hydra, and the like. However, a survey conducted by Gartner Inc. in June 2016 found that many big data projects don’t have a tangible return on investment to show. Commenting on this finding, Nick Heudecker, research director at Gartner, stated, “The big issue is not so much big data itself, but rather how it is used.”
The statement is apt because, so far, the focus has been on gathering and storing as much data as possible, treating that work as a separate effort rather than as part of a holistic business plan. Today, data is available in such volume that gathering it is no longer the issue; analyzing it and uncovering actionable insights from it still is. The following two facts seem contradictory, yet both are true, and they deserve our attention.
1. The Hadoop market is forecast to grow at a compound annual growth rate of 58%, surpassing $1 billion by 2020. – Forbes
2. Less than 1% of all data is ever analyzed and used. – Forbes
It’s clear that we have been investing big in big data, but we’re not yet tapping into the true potential of data. It is high time we put this data to use and converted it into actionable insights, so we can make accurate predictions and data-driven, well-informed decisions for the future.
In a recent webinar, “Machine Learning – Road to Citizen Data Scientist,” Scott Trauthen from Alteryx and Sandeep Saini and Rahul Sachdeva from Grazitti Interactive talked about using data for future decisions, as well as developing and deploying a tournament of models in Alteryx. With over 60 attendees, the webinar helped analysts and marketers understand how they can make the most of their data. In this blog post, we’re sharing a brief summary of what was discussed in the webinar.
In the webinar, the trio discussed:
- How to access, prep, blend and analyze all relevant data in a repeatable workflow
- Selecting the right predictive model for your data
- Real world use cases for leveraging predictive analytics and machine learning
Building Predictive Models in Alteryx
The first step to building predictive models in Alteryx is to prepare the data. Here are the steps you need to follow:
- Initial Cleanup
- Join/Append/Union multiple datasets
- Create required data fields
- Validate for correctness of data
- Produce basic summary reports
- Define and create variables
- Eliminate nulls
- Validate through summary reports
Well, it’s not as complex as it sounds. Watch the webinar recording to see how simple and effective the process of data preparation is.
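To make the preparation steps above concrete outside of Alteryx, here is a minimal sketch in plain Python. It is not the Alteryx workflow shown in the webinar; the sample records, the field names (`account_id`, `region`, `amount`, `won`, `is_large`), and the helper functions `prepare` and `summarize` are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample data; field names are illustrative assumptions.
accounts = [
    {"account_id": 1, "region": " East "},
    {"account_id": 2, "region": "west"},
    {"account_id": 3, "region": None},
]
deals = [
    {"account_id": 1, "amount": 1200.0, "won": 1},
    {"account_id": 2, "amount": None, "won": 0},
    {"account_id": 2, "amount": 800.0, "won": 1},
    {"account_id": 3, "amount": 500.0, "won": 0},
]


def prepare(accounts, deals):
    """Clean, join, derive a variable, and drop rows that contain nulls."""
    # Initial cleanup: trim whitespace and normalize capitalization.
    region_by_id = {}
    for account in accounts:
        region = account["region"]
        region_by_id[account["account_id"]] = (
            region.strip().title() if region else None
        )

    rows = []
    for deal in deals:
        # Join: attach the account's region to each deal record.
        row = dict(deal, region=region_by_id.get(deal["account_id"]))
        # Eliminate nulls: skip any row with a missing value.
        if any(value is None for value in row.values()):
            continue
        # Define and create a variable (the 1000 cutoff is arbitrary).
        row["is_large"] = row["amount"] >= 1000
        rows.append(row)
    return rows


def summarize(rows):
    """Basic summary report used to validate the prepared data."""
    if not rows:
        return {"rows": 0, "mean_amount": None, "win_rate": None}
    return {
        "rows": len(rows),
        "mean_amount": mean(r["amount"] for r in rows),
        "win_rate": mean(r["won"] for r in rows),
    }


prepared = prepare(accounts, deals)
summary = summarize(prepared)
print(summary)
```

In this sketch, two of the four deals are dropped because of a missing amount or region, and the summary report is what you would check before moving on to modeling.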
Model Workflow – Leveraging Alteryx
Grazitti has built a number of models exclusively for Alteryx. Below are a couple of examples.
i) This workflow comprises various stages of data processing: input, cleansing, variables, sampling, tournament of models, selecting the best fit, model assessment, validation dataset scoring, and open deals scoring.
ii) This workflow is a prime example of how we can bring disparate systems together to form a single data set enriched with third-party data sets, such as demographic and social data, run this enriched data through a predictive engine built in Alteryx, and provide end users with ready-to-consume custom predictive models.
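The “tournament of models” and “selecting the best fit” stages in the first workflow can be illustrated with a small plain-Python sketch. It assumes the general idea of a tournament: fit several candidate models on the same training sample, score each on a held-out sample, and keep the best scorer. The two toy candidates (`fit_majority`, `fit_threshold`), the synthetic data, and the accuracy metric are hypothetical stand-ins, not the models or tools used in the Grazitti workflows.

```python
# Synthetic data: one numeric feature x and a binary label.
# The label is 1 when x >= 12, with two deliberately noisy records.
NOISY = {3, 16}
data = [(x, (1 if x >= 12 else 0) ^ (1 if x in NOISY else 0))
        for x in range(20)]

# Sampling: alternate records between training and holdout sets.
train = data[0::2]
holdout = data[1::2]


def accuracy(model, rows):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def fit_majority(train_rows):
    """Baseline candidate: always predict the most common training label."""
    ones = sum(y for _, y in train_rows)
    label = 1 if ones * 2 >= len(train_rows) else 0
    return lambda x: label


def fit_threshold(train_rows):
    """Candidate: predict 1 above the cut point with best training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut, _ in train_rows:
        acc = accuracy(lambda x, c=cut: int(x > c), train_rows)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return lambda x: int(x > best_cut)


def tournament(candidates, train_rows, holdout_rows):
    """Fit every candidate on the training set and score it on the holdout."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(train_rows)
        scores[name] = accuracy(model, holdout_rows)
    winner = max(scores, key=scores.get)
    return winner, scores


winner, scores = tournament(
    {"majority": fit_majority, "threshold": fit_threshold},
    train,
    holdout,
)
print(winner, scores)
```

On this toy data the threshold candidate wins with a holdout accuracy of 0.8 against 0.5 for the baseline. Scoring on held-out records rather than on the training sample is the design choice that matters here, since it favors the model that generalizes rather than the one that memorizes.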
To see how these two and other such models work, you may want to watch the webinar recording.
Predictive Analytics & Machine Learning: Successful Use Cases