We’re constantly seeing new uses for big data, so much so that it’s no longer even necessary to explain what the term means. But is the use of huge amounts of amorphous data making us better at our work, or is it adding layers of abstraction over the realities of the world?
Certainly there are valid uses for collecting data ad infinitum, but the most common reason I hear is simply ‘why not?’ Data transfer and storage have become so affordable that the cost of a few additional terabytes is worth the gamble that the stored data will eventually be useful. In fact, one of the uses often mentioned for analyzing big data is to discover patterns and answers to questions we haven’t yet thought to ask.
I think we should consider big data in two distinct categories. Category 1 is data that is part of a process, has known values (even if it isn’t structured into tables and fields), and can provide business value within a known period of time. Category 2 is data that is a byproduct of the environment: working documents, social media streams, video clips, and anything else not directly related to completing specific tasks.