The CLUAS Archive: 1998 - 2011


(First up - apologies for my absence from this blog in recent weeks. I won't burden you with a long list of protracted excuses; suffice to say I'll be about this place a bit more often. So, moving swiftly along...)

With thousands of its pages indexed by Google, CLUAS today receives a healthy chunk of its traffic from the world's leading search engine. The number of visitors it sends our way can vary greatly from day to day and from week to week, but it is safe to say that we get a minimum of several hundred visitors a day coming from Google. Behind this fact lies plenty of interesting info and observations about how Google sees CLUAS, stuff I have been keeping my eye on for years but which now (cue collective groan) I am going to explore in a series of blog entries...

Casual users of Google wouldn't be aware (nor do they need to be) that Google shares, for free, considerable amounts of information with webmasters about how it sees their website(s). It does this via its Webmaster Tools service, and all you need to do to get this info for your website is prove to Google that you are indeed the owner of the domain name. It then dishes out all sorts of info that any conscientious webmaster would be mad for, like:

  • Search queries that most often returned pages from your site, and which of them were clicked,
  • Which pages on your site have links pointing to them from other sites,
  • The number of pages on your site that Google indexes per day,
  • The average time it takes Google to download a page,
  • Pages that it has trouble accessing.

Exciting stuff, eh?

Anyway, I've been checking in with the Google Webmaster Tools service for well over a year now to keep tabs on how Google is interacting with CLUAS. Last week when I logged in I noticed something unexpected. Google had all of a sudden dramatically reduced the number of CLUAS.com pages it crawls on an average day. It dropped from an average of a thousand pages a day to about 25 a day (see the graph). My first reaction was "WTF?" before calming all the way down.

Number of pages crawled by Google

There are many reasons why Google would suddenly and dramatically reduce the number of pages it crawls on any site: the site might not be updated often enough to merit 'deep crawling', the site might not be receiving enough new links from other sites, or the site might have started using all sorts of frowned-upon practices to deceive search engines. There could be any number of reasons. However, I was reassured when I saw that CLUAS articles are still appearing in Google News within an hour of being published. Somehow I don't think we are in the Google doghouse.

My own feeling is that this has something to do with the fact that, for a 3-week period starting on April 5, CLUAS stopped running Google ads on the site (so that we could run a banner ad campaign for Independent Records). I'm not saying that I think Google went "ahhhh, those CLUAS lads, they stopped running our ads, off with their heads, etc." Here's what I think happened instead. See, when you visit a page with a Google ad, a few things happen in the blink of an eye. Simplifying it greatly: you visit the page, the page tells a Google ad server "there's a visitor on this page", Google grabs the page, checks its content and then serves up an ad relevant to the content on the page. My guess is that once we stopped running Google ads for the 3 weeks, Google during this period - obviously - stopped "grabbing pages" to check content in order to decide what ads it should run on the page. And this is what has made our "pages crawled" stats plummet (background info: Google last year bundled together the task of checking a page for ad content with checking a page for possible inclusion in its search result pages).
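
For the curious, here is my theory as a toy Python sketch. To be clear, this is not how Google actually works under the hood: the function and class names, the "one content check per page per day" rule and the visit data are all made up by me for illustration. The only figures taken from the stats are the rough 1,000 and 25 pages a day, and I have rigged the invented numbers to land on them.

```python
from dataclasses import dataclass


@dataclass
class ToyCrawlStats:
    """Toy counter for page fetches, split by what triggered them."""
    search_fetches: int = 0
    ad_fetches: int = 0

    @property
    def pages_crawled(self) -> int:
        # Under the theory, the "pages crawled" figure bundles both kinds of fetch.
        return self.search_fetches + self.ad_fetches


def simulate_day(visited_pages, ads_enabled, baseline_search_fetches=25):
    """Simulate one day of page fetches under the theory (hypothetical model).

    visited_pages: URLs viewed by visitors that day (invented data).
    ads_enabled: whether the pages carry the ad snippet.
    baseline_search_fetches: assumed ordinary search crawling per day.
    """
    stats = ToyCrawlStats(search_fetches=baseline_search_fetches)
    already_checked = set()
    for url in visited_pages:
        # A page view only triggers a content check if the ad snippet is there,
        # and (assumption) each page is checked at most once per day.
        if ads_enabled and url not in already_checked:
            already_checked.add(url)
            stats.ad_fetches += 1
    return stats


# 5,000 invented page views spread over 975 distinct pages.
visits = [f"/page-{n % 975}" for n in range(5000)]

with_ads = simulate_day(visits, ads_enabled=True)
without_ads = simulate_day(visits, ads_enabled=False)
print(with_ads.pages_crawled, without_ads.pages_crawled)
```

Switch the ads off and the only fetches left are the baseline search ones, which is the kind of cliff-edge drop the graph shows. Whether that is really what happened is another matter.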

Maybe I am wrong and Google just thinks CLUAS is not worth crawling as often as it did before for some other reason. Now that the Google ads are again running across the site I'll soon be able to see if my theory is right. I'll report back in a few weeks with an update on what happens.

Betcha just can't wait.



Posted in: Blogs, Promenade
