When It Comes to Investing, the Web Goes to Waste

There is so much data online at investors’ fingertips, but much of it is falling through their fingers.


The Google internet front page logo is shown Friday August 6, 2004. Google Inc.'s initial public offering, the largest ever for an Internet company, may be delayed after California securities regulators began an investigation of the company over unregistered shares it issued. Photographer: Steven Brahms/Bloomberg News


No one likes waste.

The vast majority of information in the world is produced and exists online. Every minute of every day, Google receives 4 million search queries, YouTube users upload 72 hours of video, 204 million e-mails are sent, and Facebook users share 2.46 million pieces of content. The pace of this data creation is only accelerating. The most recent EMC Digital Universe study, done in conjunction with analysis firm IDC, found that the total volume of information worldwide roughly doubles every 18 months.

Yet the investment community largely relies on offline data to inform its decisions — a minute (and time-lagged) fraction of the total data available.

Although much digital information is noise, proper data analysis can reveal signals. Despite the sheer scale of the information on offer, the bulk of online data goes unused by firms, which continue to rely largely on traditional sources of analysis and information. To be fair, the investment community has started to wake up to the power of these data, and major firms have begun building teams devoted to this purpose. Research from IDC shows that 23 percent of the world’s digital information, if captured and analyzed, could be useful to investors. Yet at present only 3 percent is captured, and only 0.5 percent is then analyzed.

It’s also worth noting that much of the interest in mining online data has focused on the possible advantages for those looking to turn a short-term profit. Market-moving news often breaks on Twitter before making waves on mainstream outlets, for example. Although this sort of awareness will always have its place, the buy side’s bread and butter is long-term insight into fundamentals. Many fund managers are paid to make investments across a two- to three-year time frame, rather than a two- to three-hour one. And it is here that the vast volume of nonconventional data online can really shine (see also “Everything You Need to Know About Big Data to Keep Your Job”).

It’s useful to illustrate the sort of insight that can be gained with some recent examples. Take the much-fêted launch of Apple’s latest gizmo, the Apple Watch. We at Eagle Alpha analyzed more than 3 million tweets to identify 3,000 customers who had preordered the device. By tracking what they were saying, we found that the majority had received their watches earlier than expected, suggesting that Apple was taking a worst-case-scenario approach to its quoted shipping dates. Topic analysis also suggested that media reports overplayed battery concerns. Also instructive were the volume and content of Google searches relating to the Apple Watch in the buildup to the launch date. These findings could then be compared with Google searches for the original iPhone and iPad in the buildup to their respective launches.


The point of online data mining is not to supplant traditional analysis but to augment it. To take a macroeconomic example, let’s look at the U.S. labor market. A continued month-on-month fall in the volume of searches for terms such as “employment agencies” and “job search” may be a bullish indicator, as it suggests that fewer people are looking for work. Analyzing the gap between job openings and job seekers on popular job sites such as Indeed.com can also shed valuable light, as can analysis of web traffic to those sites. This analysis can then be extended to individual sectors and geographies. These measures are unlikely to be of much use taken alone, but taken together and combined with conventional data and the requisite analysis, they can create a far more comprehensive picture.
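For readers who want to see how such measures might be combined, here is a minimal Python sketch. It is an illustration only, not Eagle Alpha's method: the function names, the three-month streak, the one-opening-per-seeker threshold, and the sample figures are all hypothetical assumptions chosen for the example.

```python
from typing import Sequence


def consecutive_monthly_declines(volumes: Sequence[float]) -> int:
    """Count month-on-month declines, working back from the latest month.

    `volumes` is a chronological series of monthly search-volume index
    values (hypothetical data, oldest first).
    """
    streak = 0
    # Pair each month with the one before it, newest pair first.
    for prev, curr in zip(reversed(volumes[:-1]), reversed(volumes[1:])):
        if curr < prev:
            streak += 1
        else:
            break
    return streak


def openings_per_seeker(openings: float, seekers: float) -> float:
    """Ratio of job openings to job seekers observed on a job site."""
    if seekers <= 0:
        raise ValueError("seekers must be positive")
    return openings / seekers


def labor_signal(
    search_volumes: Sequence[float],
    openings: float,
    seekers: float,
    min_streak: int = 3,       # assumed threshold, not from the article
    tight_ratio: float = 1.0,  # assumed threshold, not from the article
) -> str:
    """Combine the two measures into a coarse, illustrative label."""
    falling = consecutive_monthly_declines(search_volumes) >= min_streak
    tight = openings_per_seeker(openings, seekers) >= tight_ratio
    if falling and tight:
        return "supportive"
    if falling or tight:
        return "mixed"
    return "not supportive"


# Made-up monthly index values for searches such as "job search".
job_search_index = [100, 104, 98, 95, 91]
print(labor_signal(job_search_index, openings=120, seekers=100))
```

A label of this kind would be one input among many, to be weighed alongside conventional labor-market data rather than traded on in isolation.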

So why are we still in a situation in which only 0.5 percent of web data is being used in this way by the buy side? There are a number of barriers to adoption. Compliance is one. Many firms strictly control access to social media sites. But the biggest barriers are the expertise and tools required to sift through the noise to generate meaningful insights. The increasing demand for data science skills is a major challenge facing the industry. The McKinsey Global Institute, part of consulting firm McKinsey & Co., estimates that by 2018 the U.S. will have a shortage of 190,000 data scientists and 1.5 million research analysts capable of handling big data.

One day, the notion of factoring a mere 0.5 percent of the web’s data into investment decisions will be seen as nothing less than foolhardy. One day, robust analysis of the Net will be par for the course, an expected minimum. But for now, those firms with the savvy to make the first moves on this front will find themselves at a distinct advantage.

Emmett Kilduff is the founder and CEO of Dublin-based Eagle Alpha, a financial data firm that analyzes social media.
