Jenstar

AdSense mediapartners bot adding to the Google search index

Ever since Google AdSense launched, there have been rumors and speculation that the AdSense bot (officially known as “Mediapartners-Google/2.1” and unofficially as the “mediabot”) might be feeding some of its information into the regular Google search index. After all, it would be a nice perk of using AdSense if it gave a publisher an easier way to get pages into the natural search results ;)

But no one has ever seemed to have concrete evidence of this happening. I have had several discussions with Matt Cutts over the past few years about this issue, and I have always been assured that the two are completely separate and that Google is always careful that they never cross-contaminate each other. I have looked hard to try to prove Matt wrong, but in the past it has always been to no avail ;)

But on SEO Rockstars this week, Greg Boser (aka WebGuerrilla) mentioned that he had seen mediabot information showing up in the natural search index, and my ears perked up. And Greg has now followed up with this entry detailing what he is seeing.

During last Tuesday’s Rockstar show, I mentioned that I had been working on a project that got a bit messed up due to the fact that Google’s Mediapartner bot was being used to index content for Google’s database. We had set up some 301s for Googlebot, but had neglected to redirect the AdSense bot. The end result was a whole bunch of duplicate content due to the fact that we were serving the AdSense bot the old URL, and Googlebot the new one. Both were getting indexed and added to the cache.
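Greg does not show his actual setup, so the following is only a rough sketch of how a user-agent-based redirect rule can miss the mediabot. The URL map, the `respond` function and the request paths are all hypothetical; only the two crawler names come from the post.

```python
# Hypothetical old-to-new URL map; not taken from Greg's site.
REDIRECTS = {"/old-post.html": "/new-post/"}

# Substrings that identify the two Google crawlers in the User-Agent header.
SEARCH_BOT = "Googlebot"
ADSENSE_BOT = "Mediapartners-Google"


def respond(path, user_agent, bots):
    """Return (status, location), redirecting only the crawlers listed in bots."""
    is_listed = any(bot in user_agent for bot in bots)
    if is_listed and path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 200, path


# A rule that only knows about Googlebot: the mediabot still gets the old URL.
only_googlebot = (SEARCH_BOT,)
googlebot_result = respond("/old-post.html", "Googlebot/2.1", only_googlebot)
mediabot_result = respond("/old-post.html", "Mediapartners-Google/2.1", only_googlebot)

# A rule that covers both crawlers: the mediabot is redirected as well.
both_bots = (SEARCH_BOT, ADSENSE_BOT)
fixed_result = respond("/old-post.html", "Mediapartners-Google/2.1", both_bots)
```

With the first rule the two bots see the same content at two different URLs, which is the duplicate content situation described above; listing both user agents removes the difference.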

Since I am often testing with AdSense, I had a collection of sites on which I had not done any natural search optimization, because I was strictly using specific PPC terms (as a control group) to drive traffic and test placements and ad unit color schemes. None of those sites had any pages in the index as a result of the mediabot.

However, I then checked some established sites. There, the date and time on Google’s cached version of the page are identical to the time the mediabot visited the site. (Cached time is GMT; log time is EST.) Click on each screenshot of the logs/cache info to view the full-sized version.

The first one is taken from this site (JenSense.com). The page can be viewed here and you can view Google’s cached version here. The time matches on the logs and the cached time.

mediabotindex.gif

The following two are from a site whose URL I cannot reveal, but I included them to illustrate that the problem occurs across multiple sites and covers multiple date ranges (JenSense is indexed regularly, so there were no cache dates going back that far). Again, the time on the cached version of the page and the time of the mediabot visit are identical.

mediabotindex1.gif

mediabotindex2.gif

With multiple dates being affected, it doesn’t seem to be just a one-day glitch.
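For anyone who wants to repeat this comparison on their own logs, here is a minimal sketch of the check. It assumes an Apache combined-format access log with a numeric UTC offset in the timestamp; the log line, the cache timestamp and the function names are made up for illustration and are not the entries from the screenshots.

```python
import re
from datetime import datetime, timezone

# Hypothetical log entry (EST is UTC-5) and hypothetical cache timestamp (GMT).
LOG_LINE = (
    '192.0.2.1 - - [01/Jan/2006:21:15:42 -0500] "GET /some-page.html HTTP/1.1" '
    '200 5120 "-" "Mediapartners-Google/2.1"'
)
CACHE_STAMP = "2 Jan 2006 02:15:42"

# Captures the timestamp, the requested path and the last quoted field (user agent).
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "GET (?P<path>\S+) [^"]*".*"(?P<agent>[^"]*)"$'
)


def mediabot_visit_gmt(line):
    """Return (path, visit time in GMT) for a mediabot request, else None."""
    match = LOG_PATTERN.search(line)
    if match is None or "Mediapartners-Google" not in match.group("agent"):
        return None
    local = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("path"), local.astimezone(timezone.utc)


def cache_time_gmt(stamp):
    """Parse a cache timestamp that is already expressed in GMT."""
    parsed = datetime.strptime(stamp, "%d %b %Y %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


visit = mediabot_visit_gmt(LOG_LINE)
times_match = visit is not None and visit[1] == cache_time_gmt(CACHE_STAMP)
```

Converting the log time to GMT first matters because an evening visit in EST lands on the next calendar day in GMT, so the two dates can look different even when the times are identical.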

It is interesting to note that these pages have been visited by the mediabot since this time, but the new visits are not reflected in the cache.

So what does this all mean? First off, the AdSense support site clearly states that the two bots serve completely different purposes and should not affect each other.

Participating in Google AdSense does not affect your site’s rank in Google search results and will not affect the search results we deliver. Google believes strongly in freedom of expression and therefore offers broad access to content across the web. Our search results are unbiased by our relationships with paying advertisers and publishers. We will continue to show search results according to our PageRank technology.

Adding the Google AdSense ad code or AdSense for search code to your site will not queue your pages for crawling by our main index bots. While our bot (starting with ‘Mediapartners-Google’) does crawl content pages for the purpose of targeting ads, this crawl is not associated with our main index crawl.

It is possible that an accidental crossover took place if the AdSense team was keeping cached copies of pages serving AdSense for quality-checking purposes, such as checking whether a publisher is serving the mediabot something different from what Joe Surfer sees when visiting the page.

It does seem that it is only affecting sites that are already indexed, and likely only pages that were already indexed at the time the mediabot took the snapshot that ended up as the cached version in the regular search index. Among the sites I checked that were not already indexed, I could not find any evidence of an indexing boost via the mediabot. However, could it potentially be an option for getting fresher pages in the index? Possibly. But I also found instances where the mediabot had visited the same pages yet not updated the cached version of the page, so there is likely more to the hows and whens of the mediabot updating the cached copy of a page.

But what is potentially more dangerous is the fact that the Google search index is including what the mediabot sees, and not what the Googlebot would see, as noted by Greg.

The content of that post got indexed in a template that we only serve to AdSense. It has no navigation and no comments; just the actual post.

This could have severe consequences for webmasters, such as Greg, who suddenly had a duplicate content issue to clean up. Webmasters usually wouldn’t think to include the mediabot in any special headers or robots.txt instructions they have for the regular Googlebot.
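To illustrate how easy it is to leave the mediabot out, here is a small sketch using Python’s standard `urllib.robotparser`. The robots.txt content and the `/print/` path are hypothetical, and the result only shows how this particular parser reads the file; it is not a statement about how Google’s own crawlers interpret it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that only addresses the regular Googlebot.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /print/
"""


def allowed(robots_txt, user_agent, url):
    """Check whether user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The Googlebot rule blocks Googlebot...
googlebot_ok = allowed(ROBOTS_TXT, "Googlebot", "/print/post.html")

# ...but no rule in the file names the AdSense crawler, so it is not blocked.
mediabot_ok = allowed(ROBOTS_TXT, "Mediapartners-Google", "/print/post.html")
```

Here the page is off limits for Googlebot but not for Mediapartners-Google, so if the mediabot’s copy can reach the index, a page the webmaster meant to keep out could still show up.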

But how much does it actually help from a webmaster perspective? On the surface, it saves on bandwidth for those few who complain about how much bandwidth the various Google bots are using. But as for how it helps in the natural search results, that is something that needs much more testing.

It will be interesting to see what happens with this issue. I must admit I was pretty surprised to finally see evidence of it, because I have periodically hunted for it over the years. But this is clear-cut evidence that yes, the mediabot is sharing info with the Googlebot, and possibly vice versa.

22 comments to AdSense mediapartners bot adding to the Google search index

  • AdSense Mediapartners influencing the results

    In the

  • AdSense: a road into Google’s index?

    Greg Boser noticed a few days ago that Google’s cached copy of one of his blog’s pages was the page as seen by the Google Mediapartner robot, i.e. the AdSense robot. He had, as an experiment, set up a redirect system in which “…

  • av1

    GoogleBot seems to disguise itself occasionally (perhaps to check whether it’s being served the same content as users). Maybe it sometimes disguises itself as the mediabot?

  • Mediapartners-Google pages in SERPs

    I just read on Jens Blog about the fact that pages spidered by the AdSense-Bot (Mediapartners-Google) are getting included in the SERPs as Greg Boser mentions. They both use logfiles to prove it, though this is certainly nothing everybody could check w…

  • Using AdSense to get in the Google SERPS?

    Jen at JenSense writes about the evidence that pages spidered by the AdSense-Bot (Mediapartners-Google) are getting included…

  • You SEO people have too much free time on your hands. . .

  • The Googlebot and AdSense Affair

    For all the SEO people out there, I’m becoming greatly bothered by Google recently. This post is for my brother, who said to me just a little while ago that he bets people running Google AdSense advertisements somehow benefitted on the Google search e…

  • Google indexes my AdSense URLs? Good!

    There’s a big fuss going on in the SEO community right now because pages being crawled by the mediabot (the AdSense crawler) are finding their way into the Google search results index. See AdSense mediapartners bot adding to the Google search index an…

  • me

    If you want more to talk about…. I have a site that Google has indexed EVEN THOUGH EVERY PAGE IS CLEARLY CODED WITH NOINDEX AND GOOGLEBOT IS DISALLOWED IN THE ROBOTS.TXT.

    Seems Google is doing a lot it is not supposed to be!

  • Ken

    Is it against AdSense TOS to feed the mediabot different content from what a regular visitor (or Googlebot) sees?

  • Ken: The AdSense guidelines say not to use cloaking, so, yes. They didn’t specify the mediabot, but they did say that you can’t feed crawlers different content than regular visitors.

  • Right – AdSense guidelines say this, and AdSense guidelines (well, support in this case) say that, and obviously what they’re actually DOING is quite another matter altogether. How come any experienced SEO isn’t one bit surprised? Maybe because this isn’t exactly the first time that Google’s been exerting double standards big time. And maybe, too, it’s because if people really and honestly weren’t into “doing evil”, there’d be no reason to blab about it all the time?

  • I have observed this for my website too. Google seems to use AdSense to track the most popular page on a website.

  • Can AdSense pump up your Google Search Ranking?

    Ever since Google came up with AdSense, there has been speculation about it! Can placing AdSense on your pages pump up the ranking of your website in the Google search engine?
    The answer from Google’s side has been, obviously, a BIG no!
    However, a bl…

  • Matt Cutts Confirms Media Bot Crawling For Big Daddy

    At the Lunch sponsored by Google today Matt Cutts confirmed the recent rumors about media bot results getting into Big Daddy. Matt said it is a bandwidth saving feature to have GoogleBot and MediaBot both contributing to big daddy. Matt also stated th…

  • Google’s Mediapartner bot is indexing

    I have already noticed it myself on my Expertenseite, and other SEOs have had the same experience over the last few days: Google apparently uses the

  • Having AdSense on your site doesn’t help it get indexed better by Google. Or maybe it does?

    GoogleBot and MediaBot are Google’s two spiders, responsible respectively for indexing pages for search and for contextualizing AdSense ads. At least