Tech & VC 28 Mar 2008 09:39 am
It’s the Data, Stupid
Anand Rajaraman, who is teaching a data mining class at Stanford, wrote up a great example of the power of a superior data asset. Anand instructed his data mining students to break into teams and create entries for the Netflix prize. Here’s what happened:
Different student teams in my class adopted different approaches to the problem, using both published algorithms and novel ideas. Of these, the results from two of the teams illustrate a broader point. Team A came up with a very sophisticated algorithm using the Netflix data. Team B used a very simple algorithm, but they added in additional data beyond the Netflix set: information about movie genres from the Internet Movie Database (IMDB). Guess which team did better?
Team B got much better results, close to the best results on the Netflix leaderboard!
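To make the Team B idea concrete, here is a minimal sketch of a very simple predictor before and after it is given extra genre data. It is not the students' actual method: the ratings, the genre table, and the function names `user_mean` and `genre_aware_mean` are all made up for illustration, and no real Netflix or IMDB data is used.

```python
# Toy, made-up data for illustration only (not Netflix or IMDB data).
ratings = [  # (user, movie, stars)
    ("u1", "m1", 5), ("u1", "m2", 4), ("u1", "m3", 1),
    ("u2", "m1", 2), ("u2", "m3", 5),
]

# Hypothetical "extra" data set: genres per movie, standing in for IMDB.
genres = {
    "m1": {"action"}, "m2": {"action"}, "m3": {"romance"},
    "m4": {"action"}, "m5": {"romance"},
}


def user_mean(user):
    """Simple algorithm, ratings only: predict the user's average rating."""
    stars = [s for u, _, s in ratings if u == user]
    if not stars:
        # Unknown user: fall back to the average over all ratings.
        return sum(s for _, _, s in ratings) / len(ratings)
    return sum(stars) / len(stars)


def genre_aware_mean(user, movie):
    """Same simple algorithm, plus genre data: average only the user's
    ratings of other movies that share a genre with the target movie."""
    target = genres.get(movie, set())
    stars = [
        s for u, m, s in ratings
        if u == user and m != movie and genres.get(m, set()) & target
    ]
    # No genre overlap (or unknown movie): fall back to the plain mean.
    return sum(stars) / len(stars) if stars else user_mean(user)


if __name__ == "__main__":
    for movie in ("m4", "m5"):
        print(movie, round(user_mean("u1"), 2), genre_aware_mean("u1", movie))
```

With ratings alone, the predictor gives user `u1` the same guess for every unseen movie. With the genre table added, the same averaging logic can tell an action movie from a romance, which is the kind of lift extra data can give a simple algorithm.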
We at USV talk about the advantage of more (or better, or proprietary) data all the time. Brad wrote about the subject in his December post on Google’s data asset:
Google has so much more data at their fingertips that even if a startup does a much better job leveraging data to deliver recommendations, Google could potentially provide a better value proposition to the end user with an inferior algorithm powered by more data, sourced from a broader range of services.
Brad’s example makes an important point: data is one of the few remaining sources of defensibility. In the first dot-com boom, you could find defensibility in patents, in out-fundraising your competition, and in proprietary code and algorithms. Now, by contrast, no one respects patents (and they’re too costly to defend), web services are so capital efficient that out-fundraising your competition is just a distraction, and open source code has eroded the advantage of proprietary code and algorithms. The main source of defensibility that remains is your data asset. If you can aggregate more data, license more proprietary data, generate more of your own implicit usage data, or crowdsource more data than your competitors, then you will have a significant and defensible advantage.
James Carville hung a sign in the Clinton campaign headquarters in 1992 that said “It’s the Economy, Stupid” as a constant reminder of what fundamentally mattered in the process of unseating George H. W. Bush. I would love to see some pictures of people with “It’s the Data, Stupid” signs in their startups’ offices.
