
There’s an abundance of articles, blogs, and reports these days that talk about the general importance of Big Data. Recently, however, numerous publications about Lean Data, Lean Data Analytics, and Lean approaches to Big Data have caught my attention. In particular, a recent blog post in The Guardian by Matti Keltanen jumped out at me with the headline: “The big data hype may not help you make the right decisions for your business – and there are four reasons why a lean approach makes better sense.”
In my opinion, this perspective is on the right track but a bit too simplistic – Big Data can help businesses understand, build, and communicate about better products. But there are some common mistakes that need to be addressed, with the help of lean data, in order to make the most of the Big Data available to us.
Common mistakes about Big Data
One of the primary questions surrounding Big Data is whether it’s all just hype. Should executives simply continue business as usual and ignore the Big Data challenge?
Let me answer that with two common mistakes we see around Big Data.
The first one is to look at Big Data solely from a technology perspective. Some technology vendors will tell you that to solve the Big Data problem, you just need to acquire advanced analytical tools and data management infrastructure to collect all possible data points and apply maximum computing power to find tiny pieces of “data gold” that may represent breakthrough insights.
The other mistake is to ignore Big Data completely, and instead rely only on intuition and past business experience while using traditional limited data points to make decisions.
The reality is, both of these approaches are counterproductive. The first one is expensive and creates a lot of waste, which may even prevent an organization from finding the truly critical business insights in an ocean of Big Data. The second approach, meanwhile, may create blind spots in understanding the fast-changing business environment, and may keep executives in the dark about changing realities… until it is too late.
Is there a better approach?
There is a growing interest in the “lean” approach to solving the Big Data problem. Indeed, the lean way of dealing with it allows us to avoid or minimize the risks I’ve mentioned – but it must be applied in the right way.
At the beginning of my career, I learned how to apply scientific research not only to astrophysics but also to a wide range of business and technology problems across different situations. What’s interesting is that the scientific and lean approaches both rely on incremental learning and discovery. In science, this supports continuous progress in our understanding of the laws of nature; in business, it helps us glean insights that lead to better decisions.
How to apply the lean data way to solve the Big Data problem for business
There are five core Lean Data Management principles we can apply to solve the Big Data problem:
- Measure and act upon only those data points that are important to your current business situation. If you’re unsure about which data points may have the biggest impact on your business performance, consider those indicators that measure your strategic business objectives, such as:
  - Acquiring new customers
  - Generating more business from existing customers
  - Reducing the cost of customer acquisition
  - Identifying and servicing the customers that generate the most profit
  - Improving customer retention
Admittedly, selecting your optimal data points may not be simple or straightforward. If this is the case, it may be helpful to hire a data scientist who can use data mining algorithms to find which data points have the most meaningful impact on your business.
- Apply an experimental approach to measure the impact of changes to product, operations and marketing. It’s a good idea to make only a limited number of changes at any given time, so that comparing samples of data from before and after the changes is more effective and more conclusive. Applying the basic rules of experimental science, as well as rigour in data interpretation, will help distinguish real insights from the noise.
- Plan effective data collection. You need clean data that does not lead to misleading interpretations. In many cases, relative ratios or relative trends matter most, so try to keep data collection consistent. If you need absolute values, calibrate a small sample of data; you can then recover absolute values from the relative changes over time.
- Avoid collecting and processing data points that do not provide value for decision-making (even if they’re easy to collect!). In my experience, people sometimes measure data points simply because they are readily accessible, and prepare nice, colourful graphic reports that in reality are completely useless. I often notice this happening with web analytics and social media metrics. It makes much more sense to invest some effort in collecting valuable data points – for example through custom tagging, bar codes, or by integrating meaningful data tracking within your applications.
- Select reporting and analytical tools that are simple to use, and provide dashboards that are easily accessible. You want tools that give you a view of the most important data and how it affects your business. Unless you have strong data mining, statistical and data analytical skills – not to mention an excess of time – there is no need to invest in the most expensive and complex data mining or analytical tools. Leave the heavy lifting of complex data processing and advanced predictive modeling to professional analysts and data scientists.
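To make the experimental principle in the list above more concrete, here is a minimal sketch of a before-and-after comparison in Python. It uses a permutation test, which is only one of several reasonable ways to separate a real effect from noise, and it is my own illustration rather than a prescribed method. The function name, the conversion-rate figures, and the one-week window are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(before, after, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = mean(after) - mean(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the labels: if the change had no effect, any split
        # of the pooled data into two groups is equally plausible.
        rng.shuffle(pooled)
        diff = mean(pooled[n_before:]) - mean(pooled[:n_before])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical daily conversion rates (%), one week before and one week
# after a single change. These numbers are invented for illustration.
before = [2.1, 2.4, 2.0, 2.3, 2.2, 2.5, 2.1]
after = [2.6, 2.9, 2.7, 2.5, 3.0, 2.8, 2.7]

diff, p = permutation_test(before, after)
print(f"observed lift: {diff:.2f} points, p-value: {p:.4f}")
```

A small p-value suggests the lift is unlikely to be noise alone, but only if a single change was made during the period and the data was collected consistently, which is why limiting the number of simultaneous changes matters.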
Conclusion
You should not try to solve the problem of Big Data with massive technology investments alone – nor do you want to ignore it completely. A lean approach that simplifies data collection and measurement is the most effective. (The principle commonly attributed to Einstein, “Everything should be made as simple as possible, but not simpler,” is exceptionally fitting here.)
The lean approach to Big Data provides the most significant benefits when we apply lessons from the experimental sciences to business problems and base our solutions on insights found within existing data.