
Exclusive: Microsoft Patents The Search Engine

By - Source: Tom's Hardware US

Microsoft has received a patent covering a search engine platform that uses a "bag-of-words" model and an "essential pages" ranking system to make searches more efficient.

While it is clear that this patent is the foundation for Bing, it is somewhat stunning to see how closely its details overlap with what could also be claimed as the foundation of Google's search engine.

The technology field is littered with silly patents that arguably should never have been awarded in the first place and that do not serve their intended purpose of fostering innovation and protecting the intellectual property of ingenious minds. One recent example may be Microsoft's Eureka! moment, in which the company decided that it should own the general idea of shutting down an operating system.
  
Lately, we are seeing much more activity in patents that cover a wide range of technologies, many of which are critical to the future of IT. For example, I recently pointed out that Google patented cloud browser sync as well as threaded email management, while Microsoft patented GPU-accelerated video encoding and voice search through a search engine. It is especially the patent battle between Microsoft and Google that highlights much of the IP likely to determine the key technologies of the future Internet platform.

Microsoft has now bagged a patent that is labeled as "a search engine platform", which - by its general nature - is sure to raise some interest across the industry. Exactly what search engine has Microsoft patented here?

Patent filings always have a general and a detailed description, and as you dive deeper, this patent gets much more interesting. Microsoft's general claim is:
      
"Systems and methods to perform efficient searching for web content using a search engine are provided. In an illustrative implementation, a computing environment comprises a search engine computing application having an essential pages module operative to execute one or more selected selection algorithms to select content from a cooperating data store. In an illustrative operation, the exemplary search engine executes on a received search query to generate search results. Operatively, the retrieved results can be generated based upon their joint coverage of the submitted search query by deploying a selected sequential forward floating selection (SFFS) algorithm executing on the essential pages module. In the illustrative operation, the SFFS algorithm can operate to iteratively add one and delete one element from the set to improve a coverage score until no further improvement can be attained. The resultant processed search results can be considered essential pages."

As I read through the patent, I learned that what Microsoft describes is a search engine technology that aims to increase the likelihood of finding certain content with fewer mouse clicks. The idea builds on traditional search engine spidering techniques and a ranking system, and it uses secondary information from neighboring search results to retrieve relevant information for re-ranking a search result. In Microsoft's words:

"In addition to relevance, existing practices also consider diversity of Web-search results as an additional factor for ordering documents. A re-ranking technique based on maximum marginal relevance criterion to reduce redundancy from search results as well as presented document summarizations has been considered. Additionally, an affinity ranking scheme to re-rank search results by optimizing diversity and information richness of the topic and query results has been developed. Such practices model the variance of topics in groups of documents.

The herein described systems and methods provide a modeling of the overall knowledge space for a specific query and improving the coverage of this space by a set of documents. In an illustrative implementation a "bag-of-words" model for representing knowledge spaces is provided. Additionally, in the illustrative implementation, a formal notion of coverage over the "bag-of-words" is provided and a simple but systematic algorithm to select documents that maximize coverage is derived to allow relevance to the search topic."
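
The coverage-maximizing selection described here, together with the add-one, delete-one loop from the abstract, can be illustrated with a small sketch. The Python below is a simplified interpretation and not the patented implementation: the function names, the page data, the size budget `k`, and the coverage score (the fraction of query terms found on at least one selected page) are all assumptions made for illustration.

```python
def coverage(selected, pages, query_terms):
    """Fraction of query terms that appear on at least one selected page."""
    if not query_terms:
        return 0.0
    covered = set()
    for name in selected:
        covered |= pages[name] & query_terms
    return len(covered) / len(query_terms)


def select_essential_pages(pages, query_terms, k):
    """Add one page per round and, once the set exceeds k pages, delete one.
    Stop as soon as a round no longer improves the coverage score."""
    selected, best = set(), 0.0
    while True:
        candidates = sorted(set(pages) - selected)
        if not candidates:
            break
        # Add step: the page that raises coverage the most.
        added = max(candidates,
                    key=lambda p: coverage(selected | {p}, pages, query_terms))
        trial = selected | {added}
        # Delete step: if over budget, drop the page whose removal hurts least.
        if len(trial) > k:
            dropped = max(sorted(trial),
                          key=lambda p: coverage(trial - {p}, pages, query_terms))
            trial = trial - {dropped}
        score = coverage(trial, pages, query_terms)
        if score <= best:
            break
        selected, best = trial, score
    return selected, best


# Hypothetical pages, each reduced to the set of terms it contains.
PAGES = {
    "page_a": {"microsoft", "search", "engine", "patent"},
    "page_b": {"microsoft", "search", "bing"},
    "page_c": {"engine", "patent", "google"},
}
QUERY = {"microsoft", "search", "engine", "patent", "bing", "google"}

selected, score = select_essential_pages(PAGES, QUERY, k=2)
print(sorted(selected), score)  # ['page_b', 'page_c'] 1.0
```

In this toy data set a purely greedy pick would hold on to page_a, which covers the most terms on its own. The delete step lets the loop trade it away and reach full coverage with page_b and page_c.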

Microsoft considers a web page as a "bag-of-words" where keywords are filtered, extracted and counted to achieve a certain valuation of that document. The result is basically a document that lists keywords. Microsoft's patented search engine platform relies on a bag-of-words approach in which "a document is processed as a collection of statistics over a set (i.e., bag, of words used in it, without explicit semantic constructions such as sentences, formatting, etc.)." This document based on a bag-of-words provides the foundation for what Microsoft calls "essential pages" that relate to the bag-of-words and are said to eliminate certain less relevant search results from a search query and require a user to perform fewer mouse clicks.
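
As a rough illustration of the bag-of-words idea, the sketch below reduces a piece of text to word counts and discards order, sentences, and formatting. The function name, the tokenizing pattern, and the sample text are assumptions made for this example; the patent does not spell out code.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each word occurs, ignoring order and formatting."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


bag = bag_of_words("Search engines rank pages. Pages that rank well get clicks.")
print(bag["pages"], bag["rank"], bag["search"])  # 2 2 1
```

The resulting counter is the "document that lists keywords": two texts with the same words in a different order produce the same bag.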

The indexing and processing of the bag-of-words is a highly complex process that involves interpreting and processing each word, including identifying the root of the word (word stemming). For example, Microsoft strips word endings and removes words that do not describe context semantics, such as "as," "is," or "be." According to Microsoft, this process will provide more "pertinent search results."
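
The two clean-up steps, dropping words without contextual meaning and stripping word endings, can be sketched as follows. This is a deliberately crude stand-in: the stop-word list and the suffix rules are invented for this example, and production systems typically rely on an established stemmer such as the Porter algorithm rather than a handful of suffixes.

```python
# Illustrative stop-word list; it includes the patent's examples "as", "is", "be".
STOP_WORDS = {"as", "is", "be", "a", "the", "of", "and", "to"}

# Illustrative suffixes, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(words):
    """Drop stop words, then reduce the remaining (lowercase) words to rough stems."""
    return [stem(w) for w in words if w not in STOP_WORDS]


print(normalize(["searching", "is", "ranked", "as", "results", "be", "searches"]))
# ['search', 'rank', 'result', 'search']
```

After this step, "searching" and "searches" count as the same keyword, which is what allows a bag-of-words to match a query regardless of the word form used on the page.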

So, how is this patent different from what Google does today? Microsoft applied for this patent in March 2008, about one year before the company provided a first glimpse of the search engine. The concept comes down to keyword generation, extraction and storage, as well as how those keywords are applied to a search query. The description largely matches what Google has been doing for several years, as well as a keyword practice that basic search engine optimization efforts have followed for several years. And even if Microsoft's patent differs in certain details from Google's approach, it is somewhat surprising that this idea has made it through the U.S. Patent and Trademark Office in the record time of not even three years. Legally, Microsoft may have some leverage against Google, even if it is questionable whether Microsoft would really try to go after Google at this time in a critical technology area such as keyword extraction and interpretation.

Google's lawyers, on the other hand, may want to look at this patent more closely and figure out whether Microsoft has invaded Google territory with it. Somehow I feel that this is not the last we have heard of this patent.

There are 59 Comments.
Top Comments
  • 23
    Trizomik , October 14, 2010 1:54 AM
    They really want us to hate them...
  • 21
    Netherscourge , October 14, 2010 1:49 AM
    Good luck having that patent upheld in court.
  • 20
    drakefyre , October 14, 2010 1:59 AM
    First off, awesome article. It really reads like one of the older Tom's articles.

    Second, wow Microsoft is stupid, as is the whole patent system. This is ridiculous.
Other Comments
  • 16
    zorky9 , October 14, 2010 1:49 AM
    Wow.. Now that's a patent ogre.
  • 15
    bdaonion , October 14, 2010 1:57 AM
    These things keep getting more and more petty...
  • 9
    IzzyCraft , October 14, 2010 1:58 AM
    Netherscourge: "Good luck having that patent upheld in court."

    obviously you never seen some of the frankly ludicrous cases that has flown out of US patient court cases.
    Trizomik: "They really want us to hate them..."

    Who microsoft or google both gobble up tons of patents for things people would think should be allowed for everyone some of which was listed in the article if you read it, and google has moved to gobbling up upstarts that compete with them on some level.

    They are both large corps =p
  • 4
    tntom , October 14, 2010 2:09 AM
    Wow! I'm going patent "Systems and methods to perform efficient mode of transportation for passengers of a powered vehicle." Maybe I'll just patent the most efficient of everything. That wouldn't be to vague would it?
  • 2
    JohnnyLucky , October 14, 2010 2:09 AM
    Looks like they are following in IBM's footsteps.
  • 3
    diablob , October 14, 2010 2:13 AM
    Unpatentable: A*X=B
    Patentable: Puppies x Cuteness = Adorability-Factor

    The more I read about software patents, the more angry I get. When are we going to fix the system and formally pass a law that stops software from being patentable?
  • 0
    verbalizer , October 14, 2010 2:26 AM
    things will only continue to get worse and worse in the future.
    judgment day as in Terminator is coming and MS will be the source.. lol
  • 3
    ispam , October 14, 2010 2:29 AM
    [sarcasm=true;]
    I'm going to patent sex and all of you are going to be very sorry you didn't have this idea earlier when you'll be forced to license it from me.
    [sarcasm=!sarcasm;]
  • 0
    ares1214 , October 14, 2010 2:33 AM
    Wheres the patent on air? Or is Apple going to beat you to that? Water? Maybe google will get that, google>goggles>water, id see that upheld in court. Seriously, i feel like everyday tasks will be patented soon, and all its doing is destroying business.
  • 2
    Gin Fushicho , October 14, 2010 2:35 AM
    Really Microsoft? Really? I don't think they have the right to patent that.

    Pretty soon someone is going to patent the act of sex and plant chips inside of your body that extract money from your account every time you masturbate or have intercourse.
  • 3
    znegval , October 14, 2010 2:37 AM
    Am I the only one who thinks any patent starting with "systems and methods" should be automatically rejected? Perhaps excluding a few ones.
  • 4
    kelemvor4 , October 14, 2010 2:49 AM
    Well. The patent system STILL needs a major overhaul. Things like this seem humorous until they end up costing someone a truckload of money to defend themselves in court.
  • 1
    GNCD , October 14, 2010 2:50 AM
    "Systems and methods" Can they be more vague?
  • 5
    AMD_pitbull , October 14, 2010 2:52 AM
    Same thing I usually say, I blame the government. The courts have to stop this.
  • 0
    brando56894 , October 14, 2010 2:55 AM
    Microsoft needs to go to hell with a this stupid patent BS, they literally want to own everything. If MS disappeared tomorrow I think they world would be a better place.
  • 4
    cronik93 , October 14, 2010 3:06 AM
    Microsoft is a monopoly that needs to be stopped.
  • 1
    loomis86 , October 14, 2010 3:07 AM
    This is what happens when you have lawyers and stock traders running the economy instead of entrepreneurs and industrialists.