Google Machine Learning Sentence Compression Algorithms Power Featured Snippets

Dec 1, 2016 • 7:45 am | Filed Under Google Search Engine Optimization

The other day, over at Search Engine Land, I covered a Wired article titled Google's hand-fed AI now gives answers, not just search results.

The article explains that Google is now using "sentence compression algorithms" as of this week in the desktop search results. Sentence compression algorithms are Google's way of extracting the best answer for a query to be displayed in the featured snippets.

Of course, this is not just used for featured snippets but also for Google Home responses, Google Assistant and more, which is why it is important that Google build a better way to get more answers.

Here is a snippet (using my own sentence compression) to pull out the core nugget from this article:

Deep neural nets are pattern recognition systems that can learn to perform specific tasks by analyzing vast amounts of data. In this case, they’ve learned to take a long sentence or paragraph from a relevant page on the web and extract the upshot—the information you’re looking for.

These “sentence compression algorithms” just went live on the desktop incarnation of the search engine. They handle a task that’s pretty simple for humans but has traditionally been quite difficult for machines. They show how deep learning is advancing the art of natural language understanding, the ability to understand and respond to natural human speech. “You need to use neural networks - or at least that is the only way we have found to do it,” Google research product manager David Orr says of the company’s sentence compression work. “We have to use all of the most advanced technology we have.”

To train Google’s artificial Q&A brain, Orr and company also use old news stories, where machines start to see how headlines serve as short summaries of the longer articles that follow. But for now, the company still needs its team of PhD linguists. They not only demonstrate sentence compression, but actually label parts of speech in ways that help neural nets understand how human language works. Spanning about 100 PhD linguists across the globe, the Pygmalion team produces what Orr calls “the gold data,” while the news stories are the “silver.” The silver data is still useful, because there’s so much of it. But the gold data is essential. Linne Ha, who oversees Pygmalion, says the team will continue to grow in the years to come.

This kind of human-assisted AI is called “supervised learning,” and today, it’s just how neural networks operate. Sometimes, companies can crowdsource this work—or it just happens organically. People across the internet have already tagged millions of cats in cat photos, for instance, so that makes it easy to train a neural net that recognizes cats. But in other cases, researchers have no choice but to label the data on their own.
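To make the "supervised learning" idea from the excerpt concrete, here is a toy sketch: from hand-labeled (sentence, compression) pairs — the "gold data" in the article's terms — a program can learn which words tend to survive compression and apply that to a new sentence. The tiny dataset and the keep-rate heuristic below are invented for illustration; Google's actual system uses deep neural networks, not word counts.

```python
# Toy sketch of supervised sentence compression (illustrative only).
# From hand-labeled (sentence, compression) pairs, count how often each
# word is kept, then keep words whose keep-rate clears a threshold.
from collections import defaultdict

# Hypothetical "gold" pairs: full sentence -> human-written compression.
gold = [
    ("the storm will hit the coast on friday", "storm will hit coast friday"),
    ("the mayor will open the bridge on monday", "mayor will open bridge monday"),
]

keep = defaultdict(int)   # times each word survived compression
seen = defaultdict(int)   # times each word appeared in a full sentence
for sentence, compression in gold:
    kept_words = set(compression.split())
    for word in sentence.split():
        seen[word] += 1
        keep[word] += word in kept_words

def compress(sentence, threshold=0.5):
    """Keep words with a high observed keep-rate; keep unseen words."""
    return " ".join(
        w for w in sentence.split()
        if seen[w] == 0 or keep[w] / seen[w] >= threshold
    )

print(compress("the mayor will hit the coast on monday"))
# Function words like "the" and "on" are dropped; content words survive.
```

This only mimics the shape of the task — labeled examples in, a compression rule out — which is the point of the supervised-learning passage above.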

I wonder if any of you have noticed changes to the featured snippets that would corroborate the Wired story that this went live in Google's desktop search this week?

I asked Glenn Gabe, who tracks a large number of featured snippets, and he has noticed no significant changes with them this week.

Forum discussion at Twitter.
