Today, I’d like to comment on (not summarize) a new big data study* that came out on October 16th.
In contrast to the picture that analysts and vendors paint, this study sheds some light on the current level of understanding and use of the much-hyped notion of “Big Data” among business and IT professionals.
It turns out that
- for more than 50% of respondents, big data already starts at 1 terabyte,
- the most frequently named characteristic associated with big data was “a greater scope of information” (18%), while “large volumes of data” came in at only 10% and social media data at 7%,
- more than half of the respondents reported internal data as the primary source of big data in their organization,
which tells me that many have a very pragmatic view of big data, rooted in the fact that data volumes are simply growing, and with them the challenge of managing and analyzing the data in a timely fashion.
The study also reveals how many respondents already have proper big data technologies in pilot:
- a columnar database (~14%)
- Hadoop or NoSQL engines (~12% each)
The key challenge for all these pilots is finding a compelling business case, which might be because these pilots are mostly launched within IT rather than within the business.
There’s still a lot of exciting work to be done to drive innovations that exploit the huge potential of Big Data, “the new oil” as Clive Humby put it. This is explorative work that starts with the challenging business questions, then identifies what resources are needed to answer them, and only then narrows down to the data resources needed, which may already be accessible or may yet have to be acquired and collected. If that data is massive-volume machine-generated or social media data, then most likely new analytic capabilities, true big data capabilities, are needed.
*) The IBM Institute for Business Value and the University of Oxford just published a study of how big data is used today. The study surveyed 1144 business and IT professionals in 95 countries. It can be downloaded here: http://www.ibm.com/2012bigdatastudy