Private companies are sitting on some of the most lucrative and socially valuable data in existence. In discussions with CEOs trying to pitch their latest product feature to me, I regularly end up stumbling upon far more interesting news once we start talking about the data they are collecting on their users.
Here’s a primer on how companies can unearth the hidden gems in their data vaults.
Myth Busting
Tech companies tend to be far more rigorous with their data collection than their government counterparts. For instance, online education pioneer Sal Khan learned that public schools had been mislabeling failing students. In a pilot program in California schools where he could rigorously track every single video students watched and problem they attempted, Khan learned that many so-called “failing” students were really just stuck on a single problem.
Once struggling students could re-watch videos on their own, some raced to the head of the class. This was monumentally positive news in the education industry, and it had nothing to do with an official study or new product release.
Companies that stumble upon counterintuitive findings in their data should share them with the press or in a blog post. Khan has yet to run a wide-scale, academically rigorous test of his finding, but it is enough for other educators to start testing it themselves.
Topical Confirmation
Industry news can provide a more intuitive narrative to big trends. For instance, inequality is arguably the number one political topic of the year. There are serious debates about whether inequality is inevitable or whether it’s creating a permanent elite class of highly paid programmers in San Francisco.
Marketing research firm VisionMobile did a nice infographic showing that technologists are suffering from inequality as much as the rest of the country. Its data shows that the “app economy” is monopolistic and that only 1.6 percent of developers had apps earning more than $100,000 each month. Most apps weren’t economically stable.
The data is a nice illustration of the fact that inequality is quite common on the Internet. Networks naturally exaggerate the contributions among members.
Many companies will no doubt witness common trends in their own backyards. This kind of data helps journalists and the public understand how big trends hit home for hardworking citizens.
Create your own ranking
Companies sitting on industry-wide data are often in a great position to create rankings. For instance, salary-database startup Payscale published its own college rankings based on how much graduates earned rather than some nebulous concept of reputation.
On Payscale’s rankings, the expensive Ivy League schools tanked, while technical schools such as Caltech rose to the top. Payscale received tons of press by doing the public a big favor. The most powerful politicians in the country, including President Obama, have been trying to force reluctant colleges to provide gainful-employment data.
It could take years to get this simple request through union opposition and a paralyzed Congress. Payscale had most of the data, which it supplemented with a big survey. It didn’t need anyone’s permission.
There are all kinds of rankings and comparisons on the Web, from entertainment to politics. There’s a good chance that tech companies are sitting on much better data than the established players.
Beware amateurs
If companies want to do this right, they should hire a data scientist or let experts have access to their data. Statistics is a very tricky art. In the old days of Facebook, its amateur government department used to make absurd claims about how Facebook likes could predict the winner of an election.
Eventually, the social network’s team wised up. Facebook started partnering with respected quantitative political scientists and discovered that while it couldn’t predict elections, Facebook could increase voter turnout.
Likewise, the public has had fun with data from Uber, which has tentatively found that the ride-sharing app may be reducing drunk-driving incidents.
A well-trained data scientist or academic can uncover valuable findings in locked-up data. So instead of pitching the press on a new product release, try putting those resources toward discovering something new and valuable.