It’s been an #epic 5+ years of tweeting. What started as a micro creek quickly flooded its banks from the exponential downpour of 140-character drips to generate a blue ocean with powerful waves, innovative currents, and trendy metrics (insert additional buzzword/metaphor here). Twitter has successfully progressed.
“What are you doing?” → “What is happening?” → “Follow your interests”
However, when it comes to processing the data, we seem to recognize the potential, but still struggle to navigate these waters.
I think Mark Suster put it well:
“Our goal is to make the enormous volume of real-time information more manageable for the 99% of companies that lack the infrastructure to process these volumes in real time. Think of DataSift as turning the fire-hose into a cost-effective and manageable tap of running water. Or in utility speak, they are transmission and we are last-mile distribution.
And better yet, the company has a product that will turn the stream into a lake. What does that mean? The Twitter stream like most others is ephemeral. If you don’t bottle it as it passes by you it’s gone. DataSift has a product that builds a permanent database for you of just the information you want to capture.
Finally, DataSift has an enormous amount of historical data already stored, we can help you go back and retrieve some older data for analytical purposes.” –@msuster
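To make the “stream into a lake” idea concrete, here is a minimal sketch of bottling an ephemeral stream: it reads JSON messages line by line, keeps only those matching a keyword, and writes them to a permanent table. This is not DataSift’s API. The function name `bottle_stream`, the field names (`id`, `user`, `text`), and the sample messages are all assumptions made up for illustration, and SQLite stands in for whatever database a real product would use.

```python
import json
import sqlite3


def bottle_stream(lines, keywords, conn):
    """Store only the messages whose text mentions one of the keywords.

    Hypothetical helper: `lines` is any iterable of JSON strings standing in
    for a live stream. Returns the number of newly stored rows.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id TEXT PRIMARY KEY, author TEXT, text TEXT)"
    )
    wanted = [k.lower() for k in keywords]
    kept = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        tweet = json.loads(line)
        text = tweet.get("text", "")
        if any(k in text.lower() for k in wanted):
            # INSERT OR IGNORE skips messages already stored under the same id
            cur = conn.execute(
                "INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)",
                (tweet["id"], tweet.get("user", ""), text),
            )
            kept += cur.rowcount
    conn.commit()
    return kept


# Made-up sample input standing in for the real-time stream.
sample_stream = [
    '{"id": "1", "user": "alice", "text": "Trying out DataSift today"}',
    "",
    '{"id": "2", "user": "bob", "text": "Lunch was great"}',
    '{"id": "3", "user": "carol", "text": "Gnip vs DataSift: thoughts?"}',
]

conn = sqlite3.connect(":memory:")
stored = bottle_stream(sample_stream, ["datasift"], conn)
rows = conn.execute("SELECT id FROM tweets ORDER BY id").fetchall()
print(stored, rows)
```

Anything that does not match the filter is simply gone once the loop moves past it, which is the whole point of the metaphor: what you do not bottle, you lose.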
What does that mean? Twitter has so far granted only two companies legitimate rights to re-syndicate firehose data: Gnip and DataSift. Suster has chosen to “double-down” on his investment in the Twitter ecosystem by backing the latter, having already invested in Ad.ly. He believes that DataSift will be able to execute its vision, but the window of time is limited.
I feel like Twitter will eventually acquire both of these “firemen” once they have pushed the envelope far enough, as this clearly seems to be Twitter’s approach to innovation lately (most recently TweetDeck and BackType). Either of these “hose holders” is therefore probably a solid investment opportunity for a VC, but which one will win the hearts of developers by providing the most added value?
Nearly all the major vendors in the social media measurement arena seem to get their Twitter data from Gnip. It will be interesting to see how DataSift is received with its pay-per-use subscription model and processing tiers. Gnip appears to be more expensive with its fixed-cost approach, but pricing is not the deal-breaker for developers. DataSift only supports streaming output in JSON, while Gnip supports JSON, XML, and Activity Streams.
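For a developer weighing the two pricing models, the comparison comes down to a break-even volume. The sketch below shows the arithmetic only: the rates are hypothetical placeholders, not actual Gnip or DataSift prices, and it ignores DataSift’s processing tiers entirely.

```python
def pay_per_use_cost(tweets, price_per_thousand):
    """Monthly cost when billed per thousand tweets consumed."""
    return tweets / 1000 * price_per_thousand


def break_even_volume(price_per_thousand, fixed_monthly_fee):
    """Monthly tweet volume at which both models cost the same."""
    return fixed_monthly_fee / price_per_thousand * 1000


def cheaper_plan(tweets, price_per_thousand, fixed_monthly_fee):
    """Name the cheaper model for a given monthly volume."""
    usage = pay_per_use_cost(tweets, price_per_thousand)
    return "pay-per-use" if usage < fixed_monthly_fee else "fixed"


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
PRICE_PER_THOUSAND = 0.5    # dollars per 1,000 tweets (assumed)
FIXED_MONTHLY_FEE = 2000.0  # dollars per month (assumed)

print(break_even_volume(PRICE_PER_THOUSAND, FIXED_MONTHLY_FEE))
print(cheaper_plan(1_000_000, PRICE_PER_THOUSAND, FIXED_MONTHLY_FEE))
print(cheaper_plan(10_000_000, PRICE_PER_THOUSAND, FIXED_MONTHLY_FEE))
```

Under these assumed rates, a small developer pulling a narrow slice of the stream stays well below the break-even point, while a large measurement vendor consuming most of it would not.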
What are the additional pros and cons for developers to consider moving forward? Will pay-per-use become the dominant business model? As Twitter Lake grows deeper, will these chosen “firemen” be able to help accelerate innovation?