The Evolutionary Path of Data

This week I was at 500startups meeting the new accelerator class. The subject of data came up once again and I thought I’d finally get to this post which I’ve been meaning to write for a while now.
I have found that data always follows this path:
1. Measure the data – Some mechanism appears which allows us to measure the data. The insertion of a tracking pixel on a web page and the creation of a new type of sensor are just two examples of activating the ability to measure data.
2. Collect the data – Once we can measure the data, we can collect it. So it gets dumped into a database or similar. For example, the tracking pixel’s loads from a server are saved in log files, or we connect a data collection device to the new sensor which takes readings at regular time intervals and saves them.
3. Display the data – After collecting the data, we can display it, usually in its most primitive form, which is often rows and columns of numbers.
4. Visualize the data – Displaying data in raw form is not optimal. Visualizing the data via graphing or through further number crunching makes large data sets easier to interpret than scrolling through endless rows and columns of numbers.
5. Derive insight and action from the data – After we analyze the data, we can determine what happened and then what to do next.
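To make the gap between step 4 and step 5 concrete, here is a minimal sketch of the whole path in Python. Everything in it is an assumption made up for illustration: the log lines, the page names, the function names, and especially the 50% drop rule in `recommend`, which stands in for the domain knowledge a real system would need.

```python
from collections import Counter
from statistics import mean

# Steps 1-2 (measure, collect): hypothetical tracking-pixel log lines
# in a made-up "day page" format.
RAW_LOG = (
    ["2024-01-01 /pricing"] * 4
    + ["2024-01-02 /pricing"] * 4
    + ["2024-01-03 /pricing"] * 1
    + ["2024-01-03 /about"] * 2
)


def collect(lines):
    """Parse raw log lines into (day, page) records."""
    return [tuple(line.split(" ", 1)) for line in lines]


def daily_counts(records, page):
    """Step 3 (display): plain rows of day -> number of views for one page."""
    counts = Counter(day for day, p in records if p == page)
    return dict(sorted(counts.items()))


def visualize(counts):
    """Step 4 (visualize): a crude text bar chart, one '#' per view."""
    return "\n".join(f"{day} {'#' * n}" for day, n in counts.items())


def recommend(counts, drop_threshold=0.5):
    """Step 5 (insight and action): a deliberately naive, assumed rule.

    Flags the latest day if it falls below drop_threshold times the
    average of the earlier days.
    """
    days = list(counts)
    if len(days) < 2:
        return "Not enough data to recommend anything yet."
    *earlier, latest = days
    baseline = mean(counts[d] for d in earlier)
    if counts[latest] < baseline * drop_threshold:
        return (
            f"Investigate: views on {latest} dropped to {counts[latest]} "
            f"from an average of {baseline:g}."
        )
    return "No action needed."


records = collect(RAW_LOG)
counts = daily_counts(records, "/pricing")
print(counts)             # step 3: raw numbers
print(visualize(counts))  # step 4: easier to read
print(recommend(counts))  # step 5: what to do next
```

Notice that the first four functions are generic and took almost no thought to write. The only part that required a decision about what matters is `recommend`, and a real version of it would depend entirely on knowing the business behind the numbers.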
Usually, people get through steps 1 through 4 very quickly. We see what I call the “Mint-ifying” of the data collected. But I contend that this is not good enough. It’s nice to see the data in some easily digestible form, but the next question is always, “What do I do next?”
Almost no one ever gets to step 5. It’s always up to the viewer to take a look and then figure out what to do next. But this is hard. Some people can figure it out and some people cannot. Some people could figure it out if they took the time to go deeper, but they don’t have the time.
That’s why I think the ability to generate insights and recommend what to do next is the holy grail. It can seem exceedingly difficult to create a system that not only displays data but also tells you what to do next.
Some observations here:
1. If you don’t have deep knowledge and experience in the area for which you’re building a data system, you’ll never get there. Either that or you’ll never realize the full potential of the data.
2. If you are not a current user of the data in a real-world application, you’ll never know what another person might want to do with the same set of data.
3. You might want to hire someone to help. The number of people who know how to use real, measurable data and figure out what to do next with it is very, very small. So trying to find someone in your area of operation may be very difficult. Or you will require a lot of time and experimentation to get there, maybe more time than you have.
I’ve met many startups over the years trying to exploit the data they are collecting. But as far as I can tell, virtually none of them have produced systems that can tell you what to do next from the data they display. These days I always ask data startups how and when they are going to generate insight and action, and I look for some glimmer of hope there. This is because I truly believe that if you can produce insight and action, you will have real gold in your business.