Tuesday, February 19, 2002

As promised, here's the info on Google from Dane Carlson, who emailed me earlier to answer my many questions about the art and science of Google. (I've summarized below.)

JS: Does anyone know--and can anyone enlighten me on--how Google keeps and compiles its results? See, here's the thing. Lately I've been doing my vanity searches, and I generally get 10 pages of results, but when I click on page 10, it turns out that I really have only 6 pages of unique hits. Which is cool. I'm good with that.

DC: You're seeing the filtered Google results. Google does this to save us time and energy. On page six of the Google results for Jeneane.Sessum http://www.google.com/search?q=Jeneane.Sessum&hl=en there's this little phrase at the bottom: "In order to show you the most relevant results, we have omitted some entries very similar to the 52 already displayed. If you like, you can repeat the search with the omitted results included."

Click that. Now the first 990 results of the 2,610 are available (Google only lets you see 990 results for any search). These are the unfiltered results. Google returns only 52 results at first because it compares where the words Jeneane Sessum appear on each page relative to all of the other content there. So, if you're included in lots of blogrolling lists, but not mentioned elsewhere in the content, Google will realize this and only return one or two pages with that specific template. For example, Saltire only comes up twice, even though it really occurs dozens of times in the expanded results, because your name always appears between Andy Chen and Paul Boutin.
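To make the blogroll example concrete, here is a minimal sketch of that kind of filtering. It is not Google's actual algorithm, which the email doesn't spell out; it just assumes that pages showing the name between the same neighboring words share a template, and keeps at most two pages per template. The function names, the two-word window, and the sample pages are all made up for illustration.

```python
from collections import defaultdict


def context_signature(text, name, width=2):
    """Return the `width` words before and after the first occurrence of `name`."""
    words = text.split()
    name_words = name.split()
    n = len(name_words)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == name_words:
            before = tuple(words[max(0, i - width):i])
            after = tuple(words[i + n:i + n + width])
            return before, after
    return None  # the name does not appear on this page


def filter_similar(pages, name, keep=2, width=2):
    """Keep at most `keep` pages per context signature, preserving order."""
    seen = defaultdict(int)
    kept = []
    for url, text in pages:
        sig = context_signature(text, name, width)
        if sig is None:
            continue
        if seen[sig] < keep:
            kept.append(url)
        seen[sig] += 1
    return kept


# Hypothetical pages: three share a blogroll template, one is real content.
pages = [
    ("saltire/1", "links Andy Chen Jeneane Sessum Paul Boutin more"),
    ("saltire/2", "blogroll Andy Chen Jeneane Sessum Paul Boutin etc"),
    ("saltire/3", "archive Andy Chen Jeneane Sessum Paul Boutin end"),
    ("essay/1", "I was reading Jeneane Sessum this morning and"),
]
print(filter_similar(pages, "Jeneane Sessum"))
```

The three blogroll pages collapse to two, while the page that mentions the name in running text survives on its own.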

JS: But even stranger lately, it seems like I'm losing hits that used to come up. Like, I was in the 2,600s for number of results returned a couple of weeks ago, and now I'm in the 2,500s, even though folks have been linking to allied fervently. Same with Gonzo Engaged.

DC: It's doing well, but not all the hits it has gotten register on Google. Google loves weblogs -- she likes to read from them every day. Reading Gonzo Engaged was last spidered by Google on Sunday, Feb 17, 2002. I know this because of this page: http://www.google.com/search?q=cache:http%3A%2F%2Fgonzoengaged%2Eblogspot%2Ecom%2F. Unfortunately for the non-weblog world, Google usually only shows up once a month or so. Search engine optimization specialists (who haven't quite figured out this secret of weblogging yet) rejoice when Google finally visits their site. Now here's where it gets really neat. Since Google only visits most sites once a month, the Google database is usually only updated every month (every 28 days, actually).
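The cache URL above looks cryptic only because the address inside it is percent-encoded. As a small sketch using Python's standard library, this decodes the query string and splits off the cache: operator. The variable names are just for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The cache URL from the post, without the sentence-ending period.
url = "http://www.google.com/search?q=cache:http%3A%2F%2Fgonzoengaged%2Eblogspot%2Ecom%2F"

# parse_qs percent-decodes the value of the q parameter.
query = parse_qs(urlsplit(url).query)["q"][0]

# Everything before the first colon is the operator; the rest is the page address.
operator, _, target = query.partition(":")

print(operator)  # cache
print(target)    # http://gonzoengaged.blogspot.com/
```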

That update is coming up soon -- this weekend, most likely. After this weekend and the Google dance, you should see quite a few more links to your websites.

JS: So, do some hits/results disappear from Google after a time? Why? What can I do to boost my blog's "findability"? And ultimately, the number of results returned?

DC: All webpages stored in Google have a PR (PageRank) score. The score for each page is somewhere between one and ten. Reading Gonzo is a 6. Dave Winer's Scripting News is an 8. Google sometimes doesn't show pages in your results that have low PR scores. That doesn't mean that they're not there -- just that Google really doesn't think you're that interested in seeing your name on a bunch of PR 1 pages.

JS: Any tips from this search maven novice would be appreciated.

DC: Here's something else you can try. http://www.google.com/search?q=link:http://gonzoengaged.blogspot.com This will show every site (in this database) that's linking to Reading Gonzo.
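If you want to run the same check for other sites, the URL can be assembled rather than typed by hand. This is a rough sketch under a couple of assumptions: the helper name operator_search_url is made up, and it relies only on the link: and cache: forms that appear in this post. Whether Google still answers these queries the same way is not something the email covers.

```python
from urllib.parse import urlencode


def operator_search_url(operator, target):
    """Build a Google search URL of the form q=<operator>:<target>."""
    query = f"{operator}:{target}"
    # urlencode percent-encodes the colon and slashes; the decoded query
    # is the same as in the hand-typed URL from the post.
    return "http://www.google.com/search?" + urlencode({"q": query})


print(operator_search_url("link", "http://gonzoengaged.blogspot.com"))
# http://www.google.com/search?q=link%3Ahttp%3A%2F%2Fgonzoengaged.blogspot.com
```

Passing "cache" instead of "link" gives the cached-copy lookup mentioned earlier in the post.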

I hope that this answers some of your questions, and sparks new ones! I really enjoy reading you. Cheers,


