Video Of Inside A Search Quality Meeting

Mar 13, 2012 - 9:20 am

Google published a video snippet of an internal search quality meeting on its blog yesterday. It is pretty amazing to watch, even though everyone in the room knows they are being recorded and it is only a small part of the meeting.

Here is the video:

Who are the main players at this table?

Amit Singhal, Matt Cutts

  • Left - Ben Azose, Search Quality Analyst
  • Middle - Amit Singhal, in charge of search
  • Right - Matt Cutts

Scott Huffman, Pandu Nayak & Paul Haahr

  • Left - Scott Huffman, testing man
  • Middle - Pandu Nayak
  • Right - Paul Haahr, Ranking Lead

Cutts, Gomes & Lars

  • Left - Matt Cutts
  • Middle - Ben Gomes
  • Right - Lars Hellsten, Engineer, Spell Correction Team

I bet some of you can even recognize some of the people on the floor. Hi Tiffany, and I see you, Alon, on the screen from Haifa!

Here is the transcript:

0:02Singhal: Everyone, thank you for setting this up,
0:04and guys, thank you for putting up
0:05with all the inconvenience we are putting you through.
0:09It so happens that this meeting is the heart of what we do.
0:15What we approve, how we run Search.
0:18This is an experiment.
0:20We will see how the tape comes out.
0:23If I look bad, we will not put it out.
0:25[laughter]
0:28Okay?
0:30If Gomes looks bad,
0:31we will put it on the front page.
0:33[laughter]
0:37Huffman: All right. Spell-correcting long queries.
0:40Singhal: Lars.
0:43Lars: So, to keep our latency low,
0:45spelling has always just corrected
0:4810 terms in long queries,
0:50and we decided to use the first 10 terms,
0:55which was sort of arbitrary,
0:57and so this is a change by Euro in Zurich,
0:59who decided that we could be
1:01a little bit more intelligent about this.
1:03And so we're going to pick the two words
1:06that we think are most likely to be misspelled in the query,
1:09and form intervals of five words around each,
1:12so we're still correcting only 10 words.
1:14And this is just a smarter way
1:18of deciding which words to correct.
1:20Gomes: So, your context is the five words
1:22rather than the whole 10 words.
1:24So, you're more likely to find a match.
1:25man: Well, in general, the context is only three words,
1:27because we use trigrams for this thing,
1:29but they correct five words at a time,
1:32rather than simply the first 10 words.
1:35man: So, if you take a look at the mean scores...
1:36man: This is huge.
1:38man: This is very, very positive.
1:39man: We send both fragments to spelling separately,
1:42or is it strung together?
1:43Lars: No, they're sent together.
1:45We have a way of marking which terms we correct,
1:47and which terms we won't correct.
1:48Cutts: But roughly what percentage
1:49of queries have more than 10 terms?
1:51Lars: Not a lot. So...
1:53[laughter]
1:54man: But it is very annoying
1:56when your misspelling is towards the end of a long query,
2:00and you don't-- you don't get it.
2:01And it's so obviously wrong.
2:03Paul: We've seen these where it was pasted quotes
2:04and the last word is mangled.
2:06Singhal: Why would anything ever go wrong with this?
2:08man: Yeah. man: It does, because you try
2:11to correct something late in the query,
2:13and you'll see some examples
2:15where early in the query
2:16there's also a misspelling which you failed to correct.
2:18Singhal: Oh, so, because of your two-word selection,
2:21you end up picking--
2:22if there are more than two misspellings in a query...
2:24man: Or there's a very rare word that makes you believe
2:27that that's a potential misspelling.
2:29'Cause you don't know it's a misspelling.
2:30Gomes: Why wouldn't you apply the misspelling
2:32across the whole query?
2:33The same misspelling, you're saying,
2:35would get corrected in one place, because of context?
2:36man: No, no, no, it's a different
2:38misspelling at the beginning.
2:39The problem is we--if we could just correct the whole thing,
2:42but then you'd pay in cost.
2:44Right, latency and things, so they don't want to do that.
2:46man: It's mostly the latency, right?
2:48Like, why? I don't know, it seems a little like--
2:50We could do, you know, hundreds of--thousands of QPS,
2:53right, why can't we send--
2:55break the query up into multiple chunks,
2:56and then send them all through parallel so that--
2:59so we can correct the entire query, right?
3:00Lars: We could do that, but I think the traffic effect
3:03would just be a really small slice.
3:05Singhal: So...
3:06But why not just do that right?
3:08Mean, like, take overlapping five-word windows,
3:11and send runs of 10-word queries,
3:14as many as you can make out of a query,
3:16and send them all in parallel?
3:18man: Actually 'cause there's only a 0.1% change.
3:22Singhal: And, you know-- And by the way,
3:24in most cases, you'll be pretty much done.
3:26You will cover up to 15-word queries
3:30with just two.
3:32Paul: I think we should certainly launch this.
3:35I think [indistinct] gets points for a clever idea on it,
3:36but I think it is driving the same--
3:38the idea of splitting it.
3:41That's probably more infrastructure work.
3:43man: I'm sorry, I just want to jump back to this problem
3:45with the beginnings of the queries.
3:47So, these situations where--
3:50if you look at the second one in the second block there.
3:52"Int he book 'Julius Caesar,'" et cetera, et cetera, et cetera,
3:57we don't catch--we catch all sorts of misspellings
3:59about Caesar and differences,
4:01but we miss the fact that "int he" should be "in the."
4:05We have another query
4:09about sponsoring a child living in Tenerife,
4:11and we want to figure out
4:14whether "Tenerife" is misspelled,
4:15but we miss the fact that it's "cam" instead of "can."
4:20Gomes: By the way, are you doing this--
4:21but in the course of Suggest--
4:22So, the same thing will work with Suggest?
4:24When we have live-spelling Suggest?
4:27man: I'm sure if-- once you launch this,
4:29Suggest will do the same thing, right?
4:31Gomes: So, Suggest will be actually all from--
4:32man: This is all inside the Spell server,
4:34so there are no multiple calls being made.
4:36It's all embedded inside the Spell server.
4:39Singhal: So, on the sponsor, did we send the context,
4:41left and right?
4:43man: We did.
4:44man: And then why didn't we correct the context?
4:47Lars: Actually, this is sort of
4:48an issue with the current implementation.
4:50If there are--
4:52if there are two intervals that are close enough together,
4:55then we merge them into one,
4:57so what's actually happening is we're correcting, I think,
5:00from "I" to "credit."
5:03Paul: So, we just missed one.
5:05Lars: Yeah, so...
5:06Paul: Picked--We picked slightly the wrong window.
5:09Look, that's gonna happen with any of these.
5:11man: I mean, certainly, the original thing
5:13of picking the first 10 was missing a lot of words.
5:15man: That's right, that's right.
5:17Paul: The averages say this is clearly an improvement.
5:20Cutts: But if this is, like, .01% of queries,
5:23why not just correct--
5:24man: No, it's .1. Not .01, it's .1.
5:27Cutts: But how much, resource-wise--
5:30Paul: I think it's more just the--
5:32the infrastructure work on doing it.
5:34Because you now have to have the Spell servers call out
5:35to other Spell servers.
5:37Cutts: Okay.
5:40man: It seems good.
5:42Gomes: I mean, to a large extent, you will be seeing
5:43those spell corrections happening in Suggest,
5:44because you're going to get that initial window...
5:47Paul: I think a lot of these are just pasted queries, though.
5:49Singhal: This is cut and pasted.
5:51These are cut and paste. No one's typing these.
5:54man: We're seeing a lot of people's--
5:56Paul: "Cam I sponsor--" man: "Cam I sponsor."
5:59Cutts: The Caesar one is--that's a kid just doing his homework.
6:02Paul: "Stein, S. et al amino acid analysis"
6:04is a pasted query.
6:06Cutts: Yeah.
6:08Paul: So, I mean--So... man: Not all of them.
6:10Paul: But not all of these.
6:11Cutts: Like "how long do you have to wait
6:13to wash your hair after a perm?"
6:14man: "Int he book" is almost certainly not.
6:15Singhal: It may hap--
6:17plenty of pastes do all kinds of funky things.
6:19man: And if you look at the wins,
6:20a lot of those are definitely typed queries.
6:22Paul: Okay.
6:23Anyway, I--Look, this is clearly a good change.
6:26man: Maybe a great change.
6:28Paul: Let's give a recommendation to the team
6:29to actually give up the 10-word limits.
6:31Singhal: No, but I want some follow-up
6:32on that recommendation.
6:34Paul: Yeah.
6:35Singhal: So, how are we gonna get that follow-up?
6:37man: Your recommendation is issuing multiple...
6:40Singhal: Just do it. All of it.
6:41Right in that, by chunking.
6:42man: Yeah, I just think we should have some system
6:43that can handle 100-word queries, right?
6:46I don't know. Singhal: Yeah.
6:48man: We shouldn't die on, like, the hardest query.
6:50Paul: Right, but I think we end up doing that on the front end
6:52and not in--
6:53Singhal: No, but I don't--Paul.
6:54Paul: We don't care. Singhal: I'm sorry.
6:56You're defending something that--
6:57you know, the design is not perfect.
6:58Don't defend it.
7:00Okay?
7:01Paul: I think it's fine to do the recommendation,
7:03but I think this is a good step,
7:04and I think Euro gets points
7:05for getting us to look at that again.
7:07Singhal: No, that's fine, but, you know,
7:08I want to make sure that the team comes back,
7:10or we put some kind of exploding deadline
7:12that, you know, we won't do this
7:13if you don't do it right within, say, three months.
7:17Gomes: Your Spell server is being used
7:19for running text in other places too?
7:21man: Now they're being used in--
7:24for red underline also, isn't that right?
7:27Lars: We don't use--
7:28We use the same servers-- Yes and no.
7:30man: But a different set-up?
7:32Lars: For part of--
7:33one of the red underline clients is using them, yeah.
7:35man: So, that must be much longer chunks of text.
7:37man: No, no, but I think they break it up into smaller chunks.
7:40Singhal: Treat this as someone's typing an email.
7:42man: Email, right. That's what I was...
7:43man: And bring in all the red underlines.
7:45Gomes: Right.
7:49Singhal: Okay, we can launch this, but...
7:52man: I mean, remember,
7:53we may still have problems even there
7:55because of context and things, if you break things up.
7:58So, they'll always be--
8:01Gomes: Yeah, treat it as running text, right?
8:03Singhal: Okay. man: Okay.
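
To make the two ideas from the meeting a bit more concrete, here is a rough Python sketch. It is not Google's code, just my reading of the transcript. `select_windows` follows what Lars describes: pick the two most suspicious terms, put a five-word window around each, and merge windows that run into each other. `overlapping_chunks` follows Singhal's suggestion of cutting the whole query into overlapping 10-word runs that could be sent in parallel. The function names, the scoring function, the scores, the query and the exact merge rule (merge when windows overlap or touch) are all assumptions made up for illustration.

```python
from typing import Callable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) positions in the query


def select_windows(terms: List[str], score: Callable[[str], float],
                   num_targets: int = 2, width: int = 5) -> List[Interval]:
    """Pick the most suspicious terms and return the windows to correct."""
    # Rank positions by the (hypothetical) misspelling score, highest first.
    ranked = sorted(range(len(terms)), key=lambda i: score(terms[i]), reverse=True)
    windows: List[Interval] = []
    for target in sorted(ranked[:num_targets]):
        # Centre a window on the target, shifted as needed to stay inside the query.
        start = max(0, min(target - width // 2, len(terms) - width))
        end = min(len(terms), start + width)
        if windows and start <= windows[-1][1]:
            # Assumed merge rule: windows that overlap or touch become one.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def overlapping_chunks(terms: List[str], size: int = 10,
                       overlap: int = 5) -> List[Interval]:
    """Cover the whole query with runs of `size` terms sharing `overlap` terms."""
    if len(terms) <= size:
        return [(0, len(terms))]
    stride = size - overlap
    chunks: List[Interval] = []
    start = 0
    while True:
        end = min(len(terms), start + size)
        chunks.append((start, end))
        if end == len(terms):
            break
        start += stride
    return chunks


# Made-up scores: a real system would use a language model, not a lookup table.
SUSPICION = {"ceasar": 0.9, "ides": 0.6, "int": 0.3}


def toy_score(term: str) -> float:
    return SUSPICION.get(term, 0.0)


# Made-up 16-term query, loosely modelled on the "Int he book" example.
query = ("int he book julius ceasar what does brutus say "
         "about honor and the ides of march").split()

windows = select_windows(query, toy_score)
print(windows)  # [(2, 7), (11, 16)]: "int he" at positions 0-1 is never checked

chunks = overlapping_chunks(query)
print(chunks)   # [(0, 10), (5, 15), (10, 16)]
```

In this toy run the rare word "ides" wins one of the two slots, so the "int he" typo at the start is skipped, which is the kind of miss the group debates. With the chunking approach a 15-word query needs only two runs, matching Singhal's remark, at the price of extra calls.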

Forum discussion at Google+, WebmasterWorld and Matt Cutts Google+.

 
