Besides his ironic reference to the recent growth of data, Moore quite amusingly draws our attention to something significant for the "data business": we need a new acronym or buzzword every couple of years. CRM, SaaS, ERP, CDI, MDM, Cloud... are you still there?
As I have stated in a previous blog post, I believe that the acronym is never the solution. And I agree with Moore's Law: there is a lot of nonsense going around as far as Big Data is concerned. Having discussed this subject at length with colleagues, customers, partners and competitors, I have come up with my own Law on Big Data:
"The only way to manage Big Data is through a sound and intelligent matching approach."
As organisations try to convert all kinds of data (owned data, social media data, online data, etc.) into competitive advantage, they need a matching approach for data that often lacks metadata descriptions. Traditional matching methods will not do the trick: they are based on atomic string comparison functions (e.g. match-codes, phonetic comparison, Levenshtein distance and n-grams). The drawback of these functions is that they only see strings, not what the strings mean, so they cannot distinguish between apples and oranges: you end up comparing family names with street names.
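To make the apples-and-oranges problem concrete, here is a minimal Python sketch of one such atomic function, Levenshtein distance, applied to two values that have no metadata attached. The function names, the length-based normalisation and the sample values are my own illustrative assumptions, not taken from any particular product.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance to a 0..1 score (assumed convention)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


# Two values without metadata: one is a family name, the other a street name.
family_name = "Baker"
street_name = "Baker St"

score = similarity(family_name, street_name)
print(f"similarity = {score:.3f}")
```

The function reports a fairly high similarity for a person and a street, because it has no notion of what kind of thing each string represents.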
For REAL Big Data management you need an engine that yields high-quality results when matching records distributed over various heterogeneous data sources. That can only be achieved by combining a probabilistic and a deterministic approach. Please check my blog post on the Gartner Summit this year to learn more about this approach.
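As a rough illustration of what combining the two approaches can look like, the sketch below pairs one deterministic rule (identical identifiers) with a probabilistic score in the style of Fellegi-Sunter agreement weights, compared field by field so that names are only compared with names. The field names, the m/u probabilities and the threshold are all hypothetical values chosen for the example; this is not the engine described in this post.

```python
import math

# Hypothetical per-field probabilities (assumptions for this sketch):
#   m = P(fields agree | records describe the same entity)
#   u = P(fields agree | records describe different entities)
FIELD_PARAMS = {
    "family_name": (0.95, 0.01),
    "street": (0.90, 0.05),
    "city": (0.95, 0.20),
}


def deterministic_match(a: dict, b: dict) -> bool:
    """Deterministic rule: the same non-empty customer_id means a match."""
    return bool(a.get("customer_id")) and a.get("customer_id") == b.get("customer_id")


def probabilistic_weight(a: dict, b: dict) -> float:
    """Sum of log-likelihood-ratio weights over like-for-like fields."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        va, vb = a.get(field), b.get(field)
        if not va or not vb:
            continue  # missing value: no evidence either way
        if va.strip().lower() == vb.strip().lower():
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def is_match(a: dict, b: dict, threshold: float = 5.0) -> bool:
    """Match if the rule fires or the probabilistic score is high enough."""
    return deterministic_match(a, b) or probabilistic_weight(a, b) >= threshold


crm = {"customer_id": "C-1", "family_name": "Baker", "street": "Main St", "city": "Leeds"}
erp = {"customer_id": "C-1", "family_name": "Bakker", "street": "Main Street", "city": "Leeds"}
web = {"customer_id": None, "family_name": "baker", "street": "main st", "city": "Leeds"}
other = {"customer_id": None, "family_name": "Smith", "street": "High St", "city": "Leeds"}

print(is_match(crm, erp))    # shared identifier, so the rule decides
print(is_match(crm, web))    # no identifier, so the score decides
print(is_match(crm, other))  # only the city agrees
```

In practice the m and u values would be estimated from the data rather than hard-coded, and exact equality per field would typically be replaced by a fuzzy comparison.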