Wednesday, January 7, 2009

Image Search Fix

A while back I started to notice a problem with the image hosting for my blog. It was taking longer and longer to get a directory listing of my pictures folder in FileZilla (an FTP program). The cause was the sheer number of images building up in the directory: downloading all the file names simply took too long. I knew it would only get worse as I put more images on my blog.

So I decided to break them out by year. I moved all my images from http://pictures.mastermarf.com/blog/ to http://pictures.mastermarf.com/blog/2008/. That way I can create a new folder for each year. For instance, now I'm uploading any new pictures to http://pictures.mastermarf.com/blog/2009/. It works great. A little slow getting the directory listing towards the end of the year, but manageable.

The only problem was that Google Image Search had already indexed many of the images at their old location. You'll notice when you do a Google image search you have the option to "See full size image". So now I had a lot of people getting a 404 not found error when they tried to view an image result from my page.

I finally fixed the 404 errors people were getting when they clicked through from Google Image Search to the images I had moved into the 2008 folder. I wrote a short PHP script and put it on the 404 error page for my pictures subdomain:

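<?php
// Path that was requested, e.g. "/blog/<file name>"
$uri = $_SERVER['REQUEST_URI'];

// The moved images all start with "08", so "/08" sits at offset 5, right after "/blog".
// === makes sure a "not found" result (false) is never compared loosely to a position.
if (stripos($uri, "/08") === 5)
{
    // Everything after "/blog/" is the file name
    $uriImage = substr($uri, 6);
    // Permanent redirection
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: http://pictures.mastermarf.com/blog/2008/" . $uriImage);
    exit();
}
?>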

It's a 301 redirect to the new address for the images I moved. A 301 redirect basically says to the browser or search engine spider, "I have permanently moved to this new location." I had to tell the server to use PHP version 5, because the stripos() function was added in PHP 5. The script didn't work when the server ran it under version 4.

The script gets the web address that was requested and checks whether "/08" is in the right position to be one of the images I moved. I did this because I only wanted to redirect the moved images. All the file names of the images I moved start with "08". I figured that was easier than keeping a long list of specific images to check for. Also, there are no other file names that start with "08" in that directory, so it worked out.

The rest of the script is a pretty straightforward 301 redirect. I pull just the file name out of the requested web address, and rebuild the new web address right there inside the header() function.
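
For what it's worth, the same check could also be sketched as a small function. This is only a sketch under a few assumptions: the function name moved_image_target() and the sample file names in the comments are made up for illustration, and it assumes every old image lived directly under /blog/. It uses strpos() instead of stripos(), which should sidestep the PHP 5 requirement since strpos() also exists in PHP 4, and it is slightly stricter than my script because it insists on the full "/blog/08" prefix.

```php
<?php
// Hypothetical helper, not part of the original 404 page script.
// Returns the new address for a moved image, or null when the
// request does not look like one of the moved "08..." files.
function moved_image_target($uri)
{
    $prefix  = "/blog/";                                     // old location (assumed)
    $newBase = "http://pictures.mastermarf.com/blog/2008/";  // new location

    // The request must begin with "/blog/08".
    // !== 0 also rejects false, which strpos() returns when nothing is found.
    if (strpos($uri, $prefix . "08") !== 0) {
        return null;
    }

    // Drop the "/blog/" prefix so only the file name is left.
    $file = substr($uri, strlen($prefix));

    return $newBase . $file;
}

// Usage on the 404 page. The isset() guard just keeps the sketch
// quiet when it is run outside a web server.
if (isset($_SERVER['REQUEST_URI'])) {
    $target = moved_image_target($_SERVER['REQUEST_URI']);
    if ($target !== null) {
        // Third argument sets the response code: permanent redirect.
        header("Location: " . $target, true, 301);
        exit();
    }
}
// "/blog/0801.jpg" (made-up name) would map into the 2008 folder;
// "/blog/2008/0801.jpg" is left alone, so there is no redirect loop.
```

Keeping the matching logic in a function makes it easy to try a few addresses by hand before putting it on the live error page.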

I just hope it wasn't too late. I moved the images a while ago, so Google has had time to see the 404 Not Found error and remove the images from their index.

3 comments:

  1. Hi Marf,
    Possibly by regenerating a site map and resubmitting it to Google you might find that the old image links which Google has will get updated quicker.

  2. @ Bunc: From what I've read, Google Image Search finds and indexes the images through the page they're found on. It ignores image files in sitemaps.

    I read that somewhere, but I don't have time right now to find and link to where it said it.

  3. @ Bunc: I found where it says it:

    "Do not include direct image URLs in Sitemaps. Google does not index the image directly; instead, we index the page on which the image appears. Direct image URLs included in Sitemaps won't be indexed."
    - Sitemap Guidelines

