Real-time search engines have proliferated over the last month, with a series of launches from start-ups like Topsy, almost.at and Scoopler. The companies are hoping to edge in on a space that Google co-founder Larry Page has admitted is a weakness for the search giant. And they’re using microblogging and social bookmarking sites as tools to figure out what content is relevant up to the second.

Real-time search is valuable because it lets you know what’s happening right now on any given topic. Companies use it to handle customer service. News junkies use it to follow political events.

I’ve tested nine real-time search offerings by pointing them all at Iran’s disputed elections to compare their results. At the end of this post, I’ve also covered two further contenders that launched in just the last couple of days.

The issue for real-time search is figuring out the right balance between immediacy, popularity and relevance. Stream everything without any filtering, and a search could bring up a lot of irrelevant chatter. Filter too strongly, and a search might omit important trends that have been picked up in the last hour.

All of the real-time start-ups search Twitter, and some have added social bookmarking sites like Digg and Delicious. Because Twitter already has an in-house search engine through its acquisition of Summize last year, the newer start-ups have to differentiate their products by filtering content based on relevancy and popularity. At this time, Twitter search sorts results only by how recently they were published.

Almost.at and Scoopler have designed their user interfaces to favor the most recent content, while Tweetmeme, OneRiot, DailyRT and Topsy focus more on what’s popular.

Google hasn’t confirmed any specific projects in real-time search, although it may only be a matter of time. Co-founder Larry Page said last month at Google’s Zeitgeist conference that the company has fallen behind Twitter.

“People really want to do stuff real time and I think they [Twitter] have done a great job about it,” he said. “We’ve done a relatively poor job of doing things that work on a per second basis.”

Scoopler: Real-Time Search as Channel-Surfing.
Scoopler gives a results page divided into two columns, one with unfiltered live content and the other with results sorted by popularity. It gives a nice balance of what’s happening right at the moment and what content has surfaced to the top via retweets and shares in the last few hours.

“Users should be able to create custom streams and watch all the live video, commentary and content in one place,” said Scoopler co-founder AJ Asver.

Search Results: I get a mix of news stories from the BBC and YouTube videos posted as recently as two hours ago along with a live conversation stream about cell phone blackouts and Iranian authorities using Twitter to spread disinformation.

Sources: Twitter, Digg, Delicious, Flickr, Identica

Funding: Incubated by Y Combinator. Asver founded Scoopler with MIT computer science grad Dilan Jayawardane, whom he met through networking in London, and the pair publicly launched the service last month.

Topsy: Real-Time Search as Social Search.
Topsy scans only Twitter and gives more weight to a source’s authority and how many times a piece of content has been shared. It ranks the influence of individual Twitter accounts by measuring the fraction of their tweets that attract responses and re-tweets from followers. Results from more “influential” Twitterers are ranked higher.
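
The influence idea described above can be illustrated with a minimal Python sketch. This is not Topsy's actual algorithm, which is not detailed here. It assumes hypothetical `Tweet` records carrying reply and retweet counts, scores each account by the fraction of its tweets that drew any response, and sorts results by that score.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record; field names are assumptions for illustration.
    author: str
    text: str
    replies: int = 0
    retweets: int = 0


def influence(tweets):
    """Fraction of an account's tweets that drew at least one reply or retweet."""
    if not tweets:
        return 0.0
    engaged = sum(1 for t in tweets if t.replies > 0 or t.retweets > 0)
    return engaged / len(tweets)


def rank_by_influence(results, history):
    """Order search results so tweets from higher-influence authors come first.

    `history` maps an author name to that author's past tweets (assumed input).
    Authors with no history get a score of 0.0.
    """
    scores = {author: influence(past) for author, past in history.items()}
    return sorted(results, key=lambda t: scores.get(t.author, 0.0), reverse=True)
```

Because Python's sort is stable, tweets from equally influential authors keep their original order, so a recency ordering applied beforehand would survive as a tie-breaker.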

“We imagine search as a process filtered by the social web,” said Rishab Ghosh, Topsy co-founder and vice president of research.

Search Results: When searching “Iran,” I actually get links to competitor Twazzup’s custom Iran search and to Boston’s Big Picture blog. It’s a little less useful than Scoopler for learning about what’s happening right now in Tehran, although I do find a few heavily followed Iranian Twitter accounts that only began posting within the last 24 hours.

Sources: Twitter, although Ghosh says the company may incorporate other sources

Funding: $15 million in equity and debt to date, including $900,000 in seed funding from angel investor and Listbot founder Scott Banister. Other backers include Blue Run Ventures, Ignition Partners, Founders Fund and Western Technology Investment. See VentureBeat’s coverage here.


OneRiot: Hunting for the Most Shared Content.
OneRiot focuses on the actual content users are sharing via Twitter and Digg instead of their tweets and commentary. The company says its search engine factors in 26 different criteria, including a link’s freshness, its domain authority (how reputable the Web site is) and its velocity (the speed at which a link is shared through the community).
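
To make the three named signals concrete, here is a hedged Python sketch of how freshness, domain authority and velocity could be blended into one score. OneRiot's 26 criteria and their weighting are not public, so the decay half-life, the weights and every function name below are assumptions chosen purely for illustration.

```python
import math


def freshness(age_hours, half_life_hours=6.0):
    """Exponential decay: 1.0 for a brand-new link, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def velocity(shares, window_hours):
    """Shares per hour over the observation window."""
    return shares / window_hours if window_hours > 0 else 0.0


def link_score(age_hours, domain_authority, shares, window_hours,
               weights=(0.4, 0.3, 0.3)):
    """Combine the three signals into one score (the weights are made up).

    `domain_authority` is assumed to be pre-normalized to the range 0..1.
    Velocity is squashed with log1p so one viral link cannot swamp the rest.
    """
    w_fresh, w_auth, w_vel = weights
    return (w_fresh * freshness(age_hours)
            + w_auth * domain_authority
            + w_vel * math.log1p(velocity(shares, window_hours)))
```

With these assumed weights, two links from the same site with the same sharing rate differ only in the freshness term, so the newer one ranks higher.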

Tobias Peggs, the company’s general manager, says OneRiot’s advantages are the spam guards the company has built into the engine and its scalability. He says that as Twitter becomes more popular, more users will latch onto trending topics to promote themselves or unrelated content like pornography.

Another useful feature is search within a specific domain name so you can figure out what’s shared most heavily from a certain Web site. (Try it with VentureBeat.)

Search Results: The top results all come from mainstream media sources like MSNBC, The Wall Street Journal and Reuters plus a blog post on Twitter’s decision to postpone maintenance because of protests in Iran. If you want reports by the professional press, this is where to go, but it’s less useful if you want to follow conversation on the ground.

Sources: Twitter, Digg, Delicious.

Funding: Two rounds of funding totalling $20 million from firms including Spark Capital, Appian Ventures and Commonwealth Capital Ventures.

Tweetmeme: Link Searching Via Twitter.
As its name suggests, U.K.-based Tweetmeme only searches Twitter, looking at keywords, article relevance, level of retweets and a tweet’s timestamp.

On the right-hand column, users can narrow searches to a certain time period or level of retweets. You can look for what’s been retweeted over 100 or 500 times.

Tweetmeme is one product from U.K.-based social media company Fav.or.it. (The company also launched a very compelling real-time product called TweetTabs on Tuesday. It is browser-based, and thereby wins serious points for ease of use: it allows you to open several tabs with different Twitter search queries, something you might not be able to do with a desktop client. Clicking on any link provides an overlay with information about the URL, including blog posts. It also provides easy ways to retweet or reply to a tweet.)

Search Results: Like OneRiot, I get a lot of results from professional media sources like CNN and the Guardian, plus a police brutality video that also cropped up on my Scoopler search. When I narrow my search to tweets that have been shared more than 100 or 500 times, things get a little more interesting. I get BoingBoing’s guide to cyber warfare in Iran and an interesting post from FriendFeed showing that their Iranian web traffic may be blocked.

Sources: Twitter.

Funding: 650,000 pounds in two rounds of angel funding.


Almost.at: Event Streaming Through Twitter, YouTube, Flickr.
Almost.at has the most distinctive interface of all the real-time search companies. Its results page is divided into three parts where you can see a live stream of tweets, shared links and photos from an event. Almost.at aims to have people who are actually on location reporting back to the larger community. It does this by having users nominate Twitter accounts that appear to be tweeting live from the event for others to follow.

Developer David Cann says one possible business model for Almost.at is to create special embeds or widgets for event organizers who want to give followers who can’t physically attend a stream of content. He could also offer premium accounts for users who want to track terms outside the three or four events Almost.at follows each day.

Search Results: When I tried Almost.at on Monday, I got recently posted photos and YouTube videos from the protests in Tehran, which was neat. On Tuesday, as microblogging about Iran picked up, I got more extraneous tweets from foreign observers. There are also a few spammers using the #IranElection hashtag to promote their sites in the links column.

Sources: Twitter, YouTube, Flickr.

Funding: The project is self-funded by Cann.


DailyRT: Popularity-Based Real-Time Search.
DailyRT seems similar to Tweetmeme and OneRiot in that its search relies more on retweets and sharing. The results are the most shared tweets on a search term in the last hour, 24 hours, seven days or all time.

Search Results: I get lots of interesting tweets about cyber warfare in Iran and how to help Iranian bloggers, but they are by author Neil Gaiman, Office castmember Rainn Wilson and British actor Stephen Fry.

Sources: Twitter.

Funding: They don’t have any contact links on their Web site, oddly enough. Trying to reach them via Twitter.

Twazzup: Twitter Search Plus Some Extras.
Like many of its competitors, Twazzup focuses on creating a real-time stream of tweets, but it gives a few extras like widely shared links and photos. It also suggests influential Twitterers.

Search Results: Twazzup picked up extra traffic Tuesday with a specially designed Unrest in Iran search. It’s a page showing freshly tweeted comments with a right-hand column of photos, videos and news articles from CNN and The Boston Globe. Twazzup also suggests some Twitter accounts, called “correspondents,” who say they are students or activists in Tehran. This special page is probably the best I’ve seen so far for following what’s happening on the ground, but it will probably be difficult to reproduce this efficiently for all the events and stories people will be interested in.

Sources: Twitter.

Funding: I’ll update with funding info as soon as I have it. [Update: Twazzup is self-funded by the three co-founders — Cyril Moutran, Hugo Hardel and Stephane Philipakis — and the company isn’t actively seeking any funding.]

Friendfeed: Former Googler’s take on real-time.
Friendfeed, which was co-founded by Gmail creator Paul Buchheit, launched a redesign earlier this year with real-time results. The search function ranks results by how recently they’ve had activity such as receiving a comment or a “like.” In advanced search, a user can rank results by the number of comments or “likes” an item has earned, letting popular or heavily shared content surface to the top.
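
The two ranking modes described above can be sketched in a few lines of Python. This is only an illustration under assumed data: the `Item` record and its field names are hypothetical and do not come from Friendfeed's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical record; field names are assumptions for illustration.
    title: str
    last_activity: float  # Unix timestamp of the latest post, comment or "like"
    comments: int = 0
    likes: int = 0


def rank(items, by="recent"):
    """Sort items by most recent activity, or by comment or "like" count."""
    keys = {
        "recent": lambda i: i.last_activity,
        "comments": lambda i: i.comments,
        "likes": lambda i: i.likes,
    }
    if by not in keys:
        raise ValueError(f"unknown ranking mode: {by}")
    return sorted(items, key=keys[by], reverse=True)
```

Keying the default mode on the last activity rather than the original post time is what lets an older item resurface when it receives a new comment or “like.”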

Search Results: Although the results aren’t sorted solely by timestamp, the search produces a lot of tweets that aren’t particularly helpful. I get a lot of results in Persian, which is neat, but I don’t speak Persian, and a link about The Daily Show sending a correspondent to Iran.

Sources: Twitter, Digg, Google Reader, Tumblr, Flickr.

Funding: The company raised a $5 million Series A round from founders Buchheit and Sanjeev Singh and Benchmark Capital. Buchheit said the company isn’t seeking funding.

Twitter: Timestamp only.
Twitter’s search was born out of Summize and only shows the most recent tweets without any filtering for relevancy or popularity. It also lists trending topics, the 10 most widely tweeted terms, in its right-hand column. Twitter users took advantage of this by creating hashtagged terms or phrases like IranElection with a # sign in front of them for easy searching.

Search Results: The #IranElection hashtag was the highest on Twitter’s trending topics list, reflecting widespread interest when I tested it on Tuesday. However, with hundreds of tweets on Iran a minute, the results are very random. For a narrow topic or company search, Twitter search is useful because there are far fewer results. For a top 10 term, there is a great deal of noise and an increasing amount of spam.

Sources: Twitter.

Funding: $55 million in three rounds of funding with investors including Institutional Venture Partners and Benchmark Capital.

Collecta, which launched just three days ago, claims to be the fastest of the real-time search contenders. It pulls its results from all over the Web, not just from Twitter, but also from blog posts, comments, Flickr, news feeds and more. As VentureBeat’s Anthony Ha wrote in his recent coverage of the company, “Collecta has some nice filtering options, so you can just see blog posts, or photos, or you can remove all the updates from Twitter.”

One advantage to Collecta is its use of the XMPP instant messaging protocol (the same technology powering Google’s communication and collaboration tool Wave), which allows it to show information that’s truly in real-time, rather than items that are simply recent.

The company raised a little less than $2 million last year under its old name Stanziq.

Finally, there’s CrowdEye, which launched on Wednesday. It pulls its results from Twitter. It’s privately funded and was co-founded by Ken Moss, who previously founded and ran the Bing search technical team for Microsoft. However, while it provides context to Twitter trends, it doesn’t have full access to the Twitter “firehose” of data, and therefore is at a slight disadvantage to other sites.

It’s Twitter-centric, and so appears to fall for now in the category of “also-ran.” It shows a graph of tweet subject trends over the past three days. It includes a list of related categories, hashtags and common words from tweets about your search term, but displays them as a tag cloud (unfortunately, tag clouds have failed to become very compelling for most people because, well, they’re cloudy, as in inexact). As with Tweetmeme, popular links about your search term are also highlighted.

[Top image from palagems.com]
