The Future Is Now: Semantic Search


Trendwatch - It’s not every day that we witness the birth of a new search engine. But every once in a while a new company comes along with an interesting idea to make one of the most important tools on the Internet much more efficient. Just as Google came virtually out of nowhere in the late 1990s, a new company is now taking on the keyword-based searches we are used to today: Powerset, whose secret weapon is semantic search. Is this the company that has the technology to challenge Google?

Ten years ago, Google was a small start-up, but its founders Larry Page and Sergey Brin had an idea to make the confusing mess of Internet search results more relevant. It almost appeared that Page and Brin had found a way to make search engines much smarter than the Yahoos, Altavistas and Excites of the time. A decade later, this idea, the keyword-based PageRank search, has become the accepted standard. And although it may seem otherwise at first sight, today’s search engines aren’t exactly what one would call smart.

They are still based on so-called robots, automated programs that scan web pages and store copies in a huge database called an index. Data mining algorithms run through the index to establish relations between web pages, calculate page rankings and connect those pages to search terms in a complex way. The better this intelligence, the more relevant the ads that can be displayed on a search results page. Although Google keeps improving its search tools and the backend of its massive index - for example by combining different indexes (text, images, maps, videos, etc.) - the pace of innovation in the search space appears to have stalled. Just like ten years ago, we still enter a search term and end up with a list of web pages. Admittedly, it is more likely today than ten years ago that you will find what you are looking for on the first page of your search results, but further down the list, as relevance decreases, you are confronted with lots of junk. That is why we often have to augment our searches with information from other online databases such as Wikipedia.
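The index-and-rank idea described above can be sketched in a few lines. This is only a toy illustration, not how Google or any real engine works: the page names and texts are made up, and pages are ranked by a plain word count rather than by a link-based ranking such as PageRank.

```python
from collections import defaultdict

# Hypothetical mini "web": page name -> page text (made-up data for illustration).
PAGES = {
    "page_a": "apple releases new iphone",
    "page_b": "iphone review and iphone accessories",
    "page_c": "apple pie recipe",
}


def build_index(pages):
    """Map each word to the pages that contain it, with a per-page count."""
    index = defaultdict(dict)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word][name] = index[word].get(name, 0) + 1
    return index


def search(index, query):
    """Rank pages by how often the query words occur in them."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for name, count in index.get(word, {}).items():
            scores[name] += count
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


INDEX = build_index(PAGES)
print(search(INDEX, "iphone"))
print(search(INDEX, "apple"))
```

Note that the query "apple" returns both the phone page and the pie recipe: the index only knows which words appear on a page, not what the page is about.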

Powerset has a different idea. Its search engine does not simply index every page online; it tries to understand the content. Semantic searching is part of a broader semantic web initiative, described by some as Web 3.0. The semantic web is all about connecting information from various sources and creating meaningful relations between different pieces of online information. As much as Web 2.0 was, and still is, about collaborating and sharing information online (such as in social networks), Web 3.0 (a term that is highly debated at this time) is commonly described as a way of transforming the Web into a huge database and bringing an understanding to the vast amount of information online.

Powerset is in an early beta phase and has limited scope, as its index consists only of information coming from Wikipedia and the Metaweb Technologies database. It uses these two sources, supported by proprietary algorithms and smart search technology licensed from Xerox’s PARC subsidiary, to create a more useful collection of facts related to a search query. Instead of treating pages as a pile of indexed words without meaning, Powerset parses each sentence and extracts its possible meaning. This approach yields more efficient and smarter search results than existing search engines can provide. Of course, it is behind Google in several areas, as its index is tiny compared to what Google has accumulated over the years. But some Powerset searches deliver more useful information than the mess you sometimes get today. Try searching for a popular term such as ’iPhone’ to see what I mean.
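To make the contrast with a word index concrete, here is a deliberately naive sketch of turning sentences into facts. It does not reflect Powerset's proprietary technology: the verb list, the function names and the sample sentence about "Acme" are all hypothetical, and a real system would use a full natural-language parser instead of a single regular expression.

```python
import re

# Hypothetical verbs the toy parser recognises; a real system would rely on a
# full natural-language parser rather than a fixed list.
VERBS = ("founded", "acquired", "released")
PATTERN = re.compile(r"^(.+?) (" + "|".join(VERBS) + r") (.+?)\.?$")


def extract_fact(sentence):
    """Return a (subject, verb, object) triple, or None if nothing matches."""
    match = PATTERN.match(sentence.strip())
    return match.groups() if match else None


def who(facts, verb, obj):
    """Answer 'who <verb> <obj>?' by looking up the stored triples."""
    return [s for (s, v, o) in facts if v == verb and o.lower() == obj.lower()]


# Sample sentences; the last one is made up for illustration.
SENTENCES = [
    "Larry Page founded Google.",
    "Sergey Brin founded Google.",
    "Acme released Widget.",
]

FACTS = [fact for fact in map(extract_fact, SENTENCES) if fact]
print(who(FACTS, "founded", "google"))
```

The lookup answers a question ("who founded Google?") from stored relations instead of returning every page that happens to contain the words "founded" and "Google".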

Powerset has already earned some initial praise, for example from industry analyst Greg Sterling of Sterling Market Intelligence, who thinks that "Powerset created both a better search engine for Wikipedia and a massive ’proof of concept’ for their algorithm and technology." But experts doubt Powerset will be able to extend its relationship with Wikipedia and Metaweb Technologies to thousands of premium content creators online.

It would take an enormous amount of time and manpower to convince content creators to sign up with Powerset and offer their copyrighted content for semantic searching. As a start-up, Powerset has limited cash, and cash is exactly what it needs to refine its engine. Current investors claim they are committed to funding the growth of Powerset and scaling the technology to index 20 billion pages. For now, Powerset does not display ads in search results. And while advertising is part of the company’s business plan as a source of revenue down the road, the company does not plan to use the usual keyword-based advertising. Instead, it plans to match ads to the "meaning" of a search query.

Industry watchers are keeping a close eye on the company, since it may have a technology that could threaten Google’s dominance by delivering on the promise of semantic search. But taking on Google is a gargantuan task for any company, let alone a start-up. The latest ComScore numbers released last week highlight the cut-throat nature of the search business as Google remains seemingly unstoppable. In the US alone, Google’s estimated search share in April jumped to 61.6%, up from 59.8% in March. Yahoo’s search share dropped to 20.4%, Microsoft’s Live Search share is down to 9.1% and AOL is down to 4.6%.

So, if companies such as Microsoft or Yahoo can’t stop Google, what are Powerset’s chances? The difference appears to be that Microsoft and Yahoo are trying to catch up with Google from behind. A young company with a new perspective and an innovative approach may actually have a much better opportunity to change the rules of the game, if it can offer a superior search product that users will love. Keep in mind that existing search players still only talk about semantic search, while they are milking the keyword-based search business model. Powerset, on the other hand, is betting exclusively on semantic search. If it manages to build an infrastructure and a clever index before it runs out of cash or ends up being acquired, it may have a good shot at the search market.

  • Pei-chen
    I think we are still at Web 1.0. Web 2.0, 3.0 all sounds like marketing push.
  • gm0n3y
    They are just buzz words for popular ideas. Web 2.0 is just sites allowing users to post and share content. Its basically about creating online communities (social networking sites, etc) where the users are also the content creators. I'm not sure how exactly a better search engine is called Web 3.0 though.
  • frozenlead
    they aren't calling the engine web 3.0, it's just the idea of a computer understanding and analyzing content, instead of just seeing characters.

    at least, that's what i gathered.