Are PHP Session ID’s A Cause For Duplicate Content With Google?

June 22nd, 2007
Written By: Adam Sussman


I know certain web applications depend on Session IDs to give each user a unique experience.

You know you’ve caught a case of Session IDs when you’re browsing a site and its URLs have a string of random characters and numbers appended to them. Basically, they are a real eyesore.

But more importantly, from what I understand, Session IDs can create duplicate content issues for your website. You no longer have one page with one URL; you can have thousands of unique URLs pointing to one single page.

Google might crawl your site one day and pick up all your links with one Session ID, and the next time it crawls it picks up a whole new set of links pointing to the same pages because the Session ID has changed. That does suck!

Disabling PHP Session IDs is not that complicated, and there are a variety of tricks that can prevent search engines from picking them up. You can keep it simple and flip a global switch to turn off Session IDs altogether, or target the bots directly.
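
For the global switch, here is a minimal sketch of the kind of thing I mean, assuming a PHP setup where you can call ini_set() before the session starts. It does not switch sessions off; it only tells PHP to carry the Session ID in a cookie and stop appending it to URLs. The three settings are standard PHP session directives, but where this code lives in your own application (a bootstrap file here) is my assumption.

```php
<?php
// Assumption: this runs at the top of a bootstrap file, before any output
// is sent and before session_start() is called.

// Carry the session ID in a cookie.
ini_set('session.use_cookies', '1');

// Ignore any session ID that arrives in the URL.
ini_set('session.use_only_cookies', '1');

// Stop PHP from rewriting links to append the session ID
// (for example ?PHPSESSID=...).
ini_set('session.use_trans_sid', '0');

session_start();
```

The same three directives can go in php.ini instead if you control the server. The trade-off is that a visitor who refuses cookies will not keep a session from page to page.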

A few years ago I launched a site running osCommerce and I had Session IDs enabled. Google did its thing and a few weeks later my results were a mess.

Now word on the street is that Google can handle Session IDs much better than it could a few years ago. So is it worth one’s time to even think about Session IDs? I mean, with all those big brains working at the G-Factory you would think they could decipher Session IDs.

Still, I lean towards being better safe than sorry, and I turn off my Session IDs.

No Cookie, no Washy!

9 Responses to “Are PHP Session ID’s A Cause For Duplicate Content With Google?”

  1. david
    June 22nd, 2007 12:06

    You’re a victim of misinformation and some confusion, Adam. Session IDs are not an either-or with cookies. The session ID contains no information, only the reference to a session. That way you can store lots more information server-side. Problem is, you have to pass that ID around with the user so you can keep track of them. There are 2 ways to do that:

    1) Pass in the URL (like you’re talking about)
    2) Store it in a cookie on the user.

    All you have to do is convert your site to storing the ID in a cookie instead of the URL and your problem is solved. Disabling the session IDs completely isn’t the right solution.

  2. shandyking
    June 22nd, 2007 12:39

    Hey Dave!

    I totally get that sessions can be tracked with cookies instead of the URL and that cookies and sessions are not the same thing.

    “Disabling the session IDs completely isn’t the right solution.”

    What do you think is the right solution?

  3. Brian
    June 22nd, 2007 15:19

    shandy: The right solution is to hold the session IDs in a cookie, rather than the URL.

    A session ID can be stored in the URL, or in a cookie. In the URL, it causes (or used to cause) a mess in the SERPs. In a cookie, it doesn’t matter.

    But you still need a session ID for any type of interactivity; it’s just where you put it that matters.

    These days, I highly doubt that a session ID will cause major dupe issues in Google - after all, they are a bunch of bright people over there, and I think that would have been a problem they tackled early on, in order to improve the quality of the search results.

  4. Josh
    June 24th, 2007 18:22

    In general, what is the feeling on how Google handles URLs with session IDs appended, PHP or otherwise?

  5. Ebay vs. Google - another manic Monday » Shop.org Blog
    June 24th, 2007 18:26

    […] URL extensions cause issues in […]

  6. shandyking
    June 24th, 2007 19:39

    Josh, I am running a test on one of my sites to see how Big G handles them.

  7. Bill Hartzer
    July 5th, 2007 14:35

    Wouldn’t there be a way in the robots.txt file to make sure that they don’t index certain parts of the site?

    I know there are a lot of issues with Microsoft’s ecommerce server, though, as every internal link and every session has a different URL.

  8. Session ID’s and Google: Part 2
    July 20th, 2007 17:43

    […] July 20th, 2007 I wrote a couple weeks ago Are PHP Session ID’s A Cause for Duplicate Content with Google? […]

  9. acnecaregal
    December 12th, 2007 03:20

    I’m also running a test on one of my sites to test this idea of yours. Let’s see how Google handles it.
