Exclusive: Microsoft Patents The Search Engine

Source: Tom's Hardware US | 59 comments

Microsoft has received a patent that covers a search engine platform that is based on a "bag-of-words" and "essential pages" ranking system to make searches more efficient.

While it is clear that this patent is the foundation for Bing, it is somewhat stunning to see how closely its details overlap with what could also be described as the foundation of Google's search engine.

The technology field is littered with silly patents that arguably should never have been awarded in the first place and do not serve their intended purpose of fostering innovation and protecting the intellectual property of ingenious minds. One of those recent patents may have been Microsoft's "Eureka!" moment that the idea of the operating system shutdown should be generally owned by the company.
  
We are lately seeing much more activity in patents that cover a wide range of technologies, many of which are critical to the future of IT. For example, I have recently pointed out that Google patented cloud browser sync as well as threaded email management, while Microsoft patented GPU-accelerated video encoding and voice search through a search engine. It is especially the patent battle between Microsoft and Google that highlights much of the IP likely to determine key technologies of the future Internet platform.

Microsoft has now bagged a patent that is labeled as "a search engine platform", which - by its general nature - is sure to raise some interest across the industry. Exactly what search engine has Microsoft patented here?

Patent filings always have a general and a detailed description, and as you dive deeper, this patent gets much more interesting. Microsoft's general claim is:
      
"Systems and methods to perform efficient searching for web content using a search engine are provided. In an illustrative implementation, a computing environment comprises a search engine computing application having an essential pages module operative to execute one or more selected selection algorithms to select content from a cooperating data store. In an illustrative operation, the exemplary search engine executes on a received search query to generate search results. Operatively, the retrieved results can be generated based upon their joint coverage of the submitted search query by deploying a selected sequential forward floating selection (SFFS) algorithm executing on the essential pages module. In the illustrative operation, the SFFS algorithm can operate to iteratively add one and delete one element from the set to improve a coverage score until no further improvement can be attained. The resultant processed search results can be considered essential pages."

As I read through the patent, I learned that what Microsoft describes is a search engine technology that aims to increase the likelihood of finding certain content with fewer mouse clicks. The idea builds on traditional search engine spidering techniques and a ranking system, and it uses secondary information from neighboring search results to retrieve relevant information for re-ranking a search result. In Microsoft's words:

"In addition to relevance, existing practices also consider diversity of Web-search results as an additional factor for ordering documents. A re-ranking technique based on maximum marginal relevance criterion to reduce redundancy from search results as well as presented document summarizations has been considered. Additionally, an affinity ranking scheme to re-rank search results by optimizing diversity and information richness of the topic and query results has been developed. Such practices model the variance of topics in groups of documents.

The herein described systems and methods provide a modeling of the overall knowledge space for a specific query and improving the coverage of this space by a set of documents. In an illustrative implementation a "bag-of-words" model for representing knowledge spaces is provided. Additionally, in the illustrative implementation, a formal notion of coverage over the "bag-of-words" is provided and a simple but systematic algorithm to select documents that maximize coverage is derived to allow relevance to the search topic."
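To make the coverage idea more concrete, here is a minimal Python sketch. It is not Microsoft's implementation: the coverage score (the share of knowledge-space words found in at least one selected page), the function names, and the sample pages are all assumptions made for illustration. The loop is a simplified add-one/delete-one search in the spirit of the SFFS algorithm mentioned in the abstract, not the full algorithm.

```python
def coverage(selected, docs, space):
    """Share of knowledge-space words found in at least one selected page (assumed metric)."""
    covered = set()
    for name in selected:
        covered |= docs[name] & space
    return len(covered) / len(space)


def select_essential_pages(docs, space, k):
    """Pick k pages by joint coverage: greedy additions, then improving swaps."""
    selected = set()
    # Forward phase: keep adding the page that raises joint coverage the most.
    while len(selected) < min(k, len(docs)):
        best = max(sorted(set(docs) - selected),
                   key=lambda d: coverage(selected | {d}, docs, space))
        selected.add(best)
    # Floating phase: delete one page and add another, but only when the swap
    # strictly improves coverage; stop when no swap helps.
    improved = True
    while improved:
        improved = False
        current = coverage(selected, docs, space)
        for out in sorted(selected):
            for candidate_page in sorted(set(docs) - selected):
                candidate = (selected - {out}) | {candidate_page}
                if coverage(candidate, docs, space) > current:
                    selected, improved = candidate, True
                    break
            if improved:
                break
    return selected


# Hypothetical knowledge space for an ambiguous query, with three made-up pages.
space = {"car", "engine", "speed", "cat", "jungle", "prey"}
docs = {
    "overview": {"car", "engine", "cat", "jungle"},
    "cars": {"car", "engine", "speed"},
    "animals": {"cat", "jungle", "prey"},
}
pages = select_essential_pages(docs, space, 2)
```

In this toy data the greedy phase first grabs the broad "overview" page, and the swap phase then replaces it because the two specialized pages together cover the whole space. That is the kind of joint coverage the abstract appears to describe, as opposed to ranking each page on its own.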

Microsoft treats a web page as a "bag-of-words": keywords are filtered, extracted and counted to arrive at a valuation of that document. The result is essentially a document that lists keywords. Microsoft's patented search engine platform relies on a bag-of-words approach in which "a document is processed as a collection of statistics over a set (i.e., bag) of words used in it, without explicit semantic constructions such as sentences, formatting, etc." This bag-of-words representation provides the foundation for what Microsoft calls "essential pages", which are said to eliminate certain less relevant results from a search query and to require fewer mouse clicks from the user.

Indexing and processing the bag-of-words is a highly complex process that involves interpreting each word, including identifying the root of the word (word stemming). For example, Microsoft removes word endings as well as words that do not carry contextual meaning, such as "as," "is," or "be." According to Microsoft, this process will provide more "pertinent search results."
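As a rough illustration of that preprocessing, the following Python sketch builds a bag-of-words from a sentence. The stop-word list, the suffix list and the function names are hypothetical, and the suffix stripping is deliberately crude; the patent's actual stemming rules are not described in the article.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the article only names "as", "is" and "be".
STOP_WORDS = {"as", "is", "be", "a", "an", "the", "of", "and", "to", "in"}

# Hypothetical suffixes for a deliberately crude stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters of the root."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def bag_of_words(text):
    """Map each stem to its count, ignoring word order, sentences and formatting."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(crude_stem(t) for t in tokens if t not in STOP_WORDS)


bag = bag_of_words("Searching is as easy as a search engine indexing documents")
```

Here "Searching" and "search" collapse into the same stem and are counted together, while "is", "as" and "a" never make it into the bag. A production system would use a proper stemmer and a much larger stop-word list.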

So, how is this patent different from what Google does today? Microsoft applied for this patent in March 2008, about one year before the company provided a first glimpse of Bing. The concept comes down to keyword generation, extraction and storage, as well as the way those keywords are applied to a search query. The description largely covers what Google has been doing for several years, as well as a keyword practice that basic search engine optimization efforts have relied on for several years. And even if Microsoft's patent differs in certain details from Google's approach, it is somewhat surprising that this idea has made it through the U.S. Patent and Trademark Office in the record time of not even three years. Legally, Microsoft may have some leverage against Google, even if it is questionable whether Microsoft would really try to go after Google at this time in a technology area as critical as keyword extraction and interpretation.

Google's lawyers, on the other hand, may want to look at this patent more closely and figure out whether Microsoft has invaded Google territory with it. Somehow I feel that this is not the last time we will hear of this patent.

Other Comments
  • 21
    Netherscourge , October 13, 2010 6:49 PM
    Good luck having that patent upheld in court.
  • 16
    zorky9 , October 13, 2010 6:49 PM
    Wow.. Now that's a patent ogre.
  • 23
    Trizomik , October 13, 2010 6:54 PM
    They really want us to hate them...
  • 15
    bdaonion , October 13, 2010 6:57 PM
    These things keep getting more and more petty...
  • 9
    IzzyCraft , October 13, 2010 6:58 PM
    Netherscourge wrote: "Good luck having that patent upheld in court."

    Obviously you have never seen some of the frankly ludicrous cases that have come out of US patent courts.

    Trizomik wrote: "They really want us to hate them..."

    Who, Microsoft or Google? Both gobble up tons of patents for things people would think should be allowed for everyone, some of which were listed in the article if you read it, and Google has moved to gobbling up upstarts that compete with them on some level.

    They are both large corps =p
  • 20
    drakefyre , October 13, 2010 6:59 PM
    First off, awesome article. It really reads like one of the older Tom's articles.

    Second, wow Microsoft is stupid, as is the whole patent system. This is ridiculous.
  • 4
    tntom , October 13, 2010 7:09 PM
    Wow! I'm going to patent "Systems and methods to perform efficient mode of transportation for passengers of a powered vehicle." Maybe I'll just patent the most efficient of everything. That wouldn't be too vague, would it?
  • 2
    JohnnyLucky , October 13, 2010 7:09 PM
    Looks like they are following in IBM's footsteps.
  • 3
    diablob , October 13, 2010 7:13 PM
    Unpatentable: A*X=B
    Patentable: Puppies x Cuteness = Adorability-Factor

    The more I read about software patents, the more angry I get. When are we going to fix the system and formally pass a law that stops software from being patentable?
  • 0
    verbalizer , October 13, 2010 7:26 PM
    things will only continue to get worse and worse in the future.
    judgment day as in Terminator is coming and MS will be the source.. lol
  • 3
    ispam , October 13, 2010 7:29 PM
    [sarcasm=true;]
    I'm going to patent sex and all of you are going to be very sorry you didn't have this idea earlier when you'll be forced to license it from me.
    [sarcasm=!sarcasm;]
  • 0
    ares1214 , October 13, 2010 7:33 PM
    Wheres the patent on air? Or is Apple going to beat you to that? Water? Maybe google will get that, google>goggles>water, id see that upheld in court. Seriously, i feel like everyday tasks will be patented soon, and all its doing is destroying business.
  • 2
    Gin Fushicho , October 13, 2010 7:35 PM
    Really Microsoft? Really? I don't think they have the right to patent that.

    Pretty soon someone is going to patent the act of sex and plant chips inside of your body that extract money from your account every time you masturbate or have intercourse.
  • 3
    znegval , October 13, 2010 7:37 PM
    Am I the only one who thinks any patent starting with "systems and methods" should be automatically rejected? Perhaps excluding a few ones.
  • 4
    kelemvor4 , October 13, 2010 7:49 PM
    Well. The patent system STILL needs a major overhaul. Things like this seem humorous until they end up costing someone a truckload of money to defend themselves in court.
  • 1
    GNCD , October 13, 2010 7:50 PM
    "Systems and methods" Can they be more vague?
  • 5
    AMD_pitbull , October 13, 2010 7:52 PM
    Same thing I usually say, I blame the government. The courts have to stop this.
  • 0
    brando56894 , October 13, 2010 7:55 PM
    Microsoft needs to go to hell with this stupid patent BS, they literally want to own everything. If MS disappeared tomorrow I think the world would be a better place.
  • 4
    cronik93 , October 13, 2010 8:06 PM
    Microsoft is a monopoly that needs to be stopped.
  • 1
    loomis86 , October 13, 2010 8:07 PM
    This is what happens when you have lawyers and stock traders running the economy instead of entrepreneurs and industrialists.