
Cre8asiteforums Internet Marketing
and Conversion Web Design



mod_rewrite and SEO


15 replies to this topic

#1 peter_d

peter_d

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 1914 posts

Posted 30 October 2002 - 04:14 PM

Phil talked a little about this in another thread, but I thought I'd start a new one as this is an interesting topic for those who deal with SEO on large, dynamic sites.

Phil, and anyone with expertise in this area, I'm sure many people, myself included, would be interested to hear your knowledge on this topic.

#2 sanity

sanity

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 6889 posts

Posted 30 October 2002 - 05:37 PM

Yes please!!

#3 Black_Knight

Black_Knight

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 9339 posts

Posted 30 October 2002 - 06:37 PM

Search engines are getting better at indexing and following dynamic URLs, but my experience still says that using mod_rewrite for Apache (or indeed, patching IIS with a special DLL in the case of Windows based servers) to create static-seeming URLs is the most effective method for getting dynamic pages properly indexed and ranked.

You can find Phil's related post on mod_rewrite here.

You can discover more about URL rewriting with IIS at http://www.devasp.co...att/iistips.asp or you can buy ready-made add-ons at sites such as http://www.isapirewrite.com/ , http://www.qwerksoft...cts/iisrewrite/ or http://www.opcode.co...nts/rewrite.asp to name but a few suppliers.

I find that the majority of my larger corporate clients are not using Apache servers, and many have to develop their own url rewriting techniques.

Of course, simply rewriting the URLs is only the start of the process. The dynamic content can be indexed and crawled, but ranking will still come from optimisation. The trick to that is to modify the templates so that Titles and Description meta tags are dynamically written with the page, and will include such things as product names and models, etc.
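A minimal sketch of that template idea, written in Python rather than any particular server language. The function name `build_head`, its fields, and the 155-character limit are hypothetical choices for illustration, not something from the post.

```python
from html import escape


def build_head(product_name: str, model: str, summary: str,
               max_description: int = 155) -> str:
    """Return <title> and meta description markup for one product page."""
    title = f"{product_name} {model}".strip()
    # Collapse runs of whitespace so database text fits on one line.
    description = " ".join(summary.split())
    if len(description) > max_description:
        # Cut at the limit, then back up to the last whole word.
        description = description[:max_description].rsplit(" ", 1)[0]
    return (
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description)}">'
    )
```

Each page built from the template then carries its own product name and model in the head, instead of one site-wide title.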

#4 sanity

sanity

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 6889 posts

Posted 31 October 2002 - 05:09 PM

Thanks for the info Ammon.

Does anyone know if there's any info/tools available for Cold Fusion?

#5 Guest_PhilC_*

Guest_PhilC_*
  • Guests

Posted 31 October 2002 - 07:04 PM

I am not an expert on mod_rewrite. I was aware of it for quite some time but I only began to use it after a phone chat with Ammon, who suggested that I give it a whirl. I did and it doubled my traffic - at least.

mod_rewrite is an Apache module that can be used for just about any url manipulation you care to dream up. It's commonly used to defend against robots, such as email harvesters and site-rippers. It is also used to make the contents of databases available to search engine spiders. This is sometimes necessary because spiders can't fill in forms, so they cannot get the database content pages that surfers get when using some sites.

Using mod_rewrite requires that the server lets you set up and use a .htaccess file on the domain. As a practical example of using it for SEO, I'll explain my .htaccess file (the one that doubled my traffic) and what it does.

RewriteEngine on
RewriteBase /accommodation-counties
RewriteRule ^ rewrite.php3 [T=application/x-httpd-php3]

That's what is in the file.

The first line turns the server's rewrite engine on.

The second line states which directory it applies to. In this case, it is a sub-directory of the domain's root directory. The .htaccess file is actually in that directory. It may be OK if it is in the domain's root - I'm not sure. I'm also not sure if the effect is applied to sub-directories of the RewriteBase directory. You see? I'm not an expert.

The third line tells the RewriteEngine what to do. In this case, the instruction is to redirect every request for a file in this directory to rewrite.php3 - which is also in the directory. The mime-type is also changed by this line to suit the php3 file. There can be many RewriteRule lines, but my system only needs this one.

The way I work it is that I have a link from another page on the site to the index.html page in this directory. The index page is like a top level database map in that it contains links to the <county>.html pages in this directory. But the county pages don't exist. When a request is made for a county page, it is redirected to the rewrite.php3 file.

All file requests to this directory are redirected to rewrite.php3 which examines the requested filename and, if the request is for the index page, it sends the index page. If the request is for a county page, the php script performs a search on the database, compiles an html page from the results, and sends it to the requestor.
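The dispatch logic described above can be sketched roughly as follows. This is Python, not the actual rewrite.php3 script, and the `LISTINGS` dictionary, the county names, and the function names are hypothetical stand-ins for the real database and code.

```python
import posixpath
from html import escape
from typing import Dict, List, Tuple

# Hypothetical stand-in for the database: county name -> accommodation names.
LISTINGS: Dict[str, List[str]] = {
    "exampleshire": ["Harbour Hotel", "Mill House B&B"],
    "sampleshire": ["Station Inn"],
}


def render_index() -> str:
    """Top-level map: one link per county page (the pages are virtual)."""
    links = "".join(
        f'<li><a href="{county}.html">{county.title()}</a></li>'
        for county in sorted(LISTINGS)
    )
    return f"<ul>{links}</ul>"


def render_county(county: str) -> str:
    """Compile a county page from the stored listings."""
    items = "".join(f"<li>{escape(name)}</li>" for name in LISTINGS[county])
    return f"<h1>{county.title()}</h1><ul>{items}</ul>"


def handle_request(request_path: str) -> Tuple[int, str]:
    """Examine the requested filename and return (status, html)."""
    filename = posixpath.basename(request_path)
    if filename in ("", "index.html"):
        return 200, render_index()
    stem, ext = posixpath.splitext(filename)
    if ext == ".html" and stem in LISTINGS:
        return 200, render_county(stem)
    return 404, "<h1>Not found</h1>"
```

Returning a 404 for unknown names is an assumption added here; the post does not say what the script does with a filename it cannot match.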

The compiled county pages contain links to other non-existent pages such as <town in the county>.html. To do this accurately, the php script searches the database again and gets all the names of towns in the county. There are other links placed on all pages but they are specific to my database, so there's no need to go into them. All the pages/links are given the .html extension when the links are compiled, so they look like static pages although they are all dynamic.
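As an illustration of compiling static-looking links from database names, here is a small Python sketch. The underscore-separated filename scheme and the helper names `page_filename` and `town_links` are assumptions, not the script's actual rules.

```python
import re
from html import escape
from typing import Iterable


def page_filename(name: str) -> str:
    """Turn a place name into a static-looking filename."""
    # Lowercase, then replace every run of non-alphanumerics with "_".
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"{slug}.html"


def town_links(towns: Iterable[str]) -> str:
    """Build one anchor per town, each pointing at a virtual .html page."""
    return "\n".join(
        f'<a href="{page_filename(town)}">{escape(town)}</a>' for town in towns
    )
```

The script that answers the request has to reverse the same naming scheme, so both directions should share one function or one set of rules.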

In this way, I provide spiders with a link to the directory's index page. From that page, they get links to county pages and, from those they get links to towns pages and more. All I have in the directory are an index page, a .htaccess file and the rewrite.php3 script but Google currently has 1630 pages in its index from that directory. I just added the "towns" links so they haven't been crawled yet. I've no idea how many thousands of pages Google will have when the crawl has been done again.

All the dynamic pages contain some javascript so that, when a link to any of these pages is clicked in the serps, the surfer gets the full framed site with the expected page displayed in the main frame.

As Ammon pointed out, the dynamic pages need to be optimised if they are to do well in the engines. I use some optimising techniques on the pages and I put the phrases from the pagenames in the Titles, Descriptions, etc.

As I said a couple of times, I'm not an expert on mod_rewrite and maybe my system isn't as streamlined as it could be - but it works.

If you want to see it in action, click http://www.holidays....von_hotels.html. The filename is self-explanatory.

I should point out that you won't see the links on the returned page, but not because you receive different pages to the ones that spiders receive. I approve of cloaking although I've never used it. You do actually get the Rewrite page with links but, as I said, that page pulls the framed site together and a fresh search is done that puts the page you want into the main frame. The fresh (normal) search doesn't go through the RewriteBase directory and is not performed by rewrite.php3.

Phil.

#6 fantomaster

fantomaster

    Unlurked Energy

  • Members
  • 3 posts

Posted 05 November 2002 - 12:46 PM

You'll find a free mod_rewrite tutorial focusing on SEO related issues here:
http://fantomaster.c...tml#mod_rewrite

#7 cre8pc

cre8pc

    Dream Catcher Forums Founder

  • Admin - Top Level
  • 13362 posts

Posted 05 November 2002 - 12:54 PM

Look what the cat dragged in! Greetings Fantomaster and welcome (finally) to Cre8asite. I always enjoy seeing friends drop by. Thanks for the tip ;)

Kim

#8 ricka

ricka

    Honorary Member

  • Members
  • 342 posts

Posted 05 November 2002 - 01:34 PM

I noticed that two dynamic sites I'm working with at the moment have lots of pages in Google’s database. See this and this.

I'm trying to get a handle on how important mod_rewrite and other such techniques still are, considering that Google seems pretty good at spidering dynamic sites these days. What sayest thou, sir Knight and other noble souls?

#9 wildline

wildline

    New To Community

  • Members
  • 1 posts

Posted 05 November 2002 - 08:50 PM

I've been using mod_rewrite with reasonable success for some time now. I write PHP programs that use mySQL databases to generate the pages.

The technique I am using is specifically for search engines.

A listing in a search engine that looks like this:
http://www.mydomain....g-you-want.html will execute a PHP program and that program "sees" the keyword phrase "anything you want".

Those keywords are then searched for in the database and relevant content based on the search term is put in the served page.
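A rough Python sketch of that step, assuming hyphen-separated words in the filename and an in-memory list in place of the mySQL table. The function names and the example.com URL below are hypothetical.

```python
import posixpath
from typing import List, Sequence
from urllib.parse import unquote, urlsplit


def phrase_from_url(url: str) -> str:
    """Recover the keyword phrase that a virtual page's URL encodes."""
    path = unquote(urlsplit(url).path)
    stem, _ext = posixpath.splitext(posixpath.basename(path))
    return " ".join(part for part in stem.split("-") if part)


def find_content(phrase: str, records: Sequence[str]) -> List[str]:
    """Return the records that contain every word of the phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    return [text for text in records if all(w in text.lower() for w in words)]
```

For example, `phrase_from_url("http://example.com/anything-you-want.html")` gives the phrase "anything you want", which is then passed to the search.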

Create a forward link in each page to the next keyword and you will create a stream of virtual pages that can be crawled by any of the major search engines.
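One way to picture the forward links, as a Python sketch: given an ordered list of keyword phrases, each page links to the page for the next phrase. Wrapping from the last phrase back to the first is an assumption made here so the chain has no dead end, and `forward_link` is a hypothetical name.

```python
from html import escape
from typing import Sequence


def forward_link(phrases: Sequence[str], current: str) -> str:
    """Return an anchor to the virtual page for the phrase after `current`."""
    position = phrases.index(current)  # raises ValueError for an unknown phrase
    next_phrase = phrases[(position + 1) % len(phrases)]
    href = "-".join(next_phrase.split()) + ".html"
    return f'<a href="{escape(href)}">{escape(next_phrase)}</a>'
```

A spider that reaches any one page can then follow the chain through every phrase in the list.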

If you want Google to find your "mod_rewrite" site, simply put links to your virtual pages in pages on other domains so Google will find them. The forward links you build in each page will do the rest of the work. As long as each page has significantly different content, Google will eventually index all your virtual pages.

#10 Guest_PhilC_*

Guest_PhilC_*
  • Guests

Posted 05 November 2002 - 10:53 PM

You'll find a free mod_rewrite tutorial focusing on SEO related issues here:
http://fantomaster.c...tml#mod_rewrite

Nice article, Ralph - duly bookmarked.

Nice to see you here too :wave:

Phil.

#11 Guest_PhilC_*

Guest_PhilC_*
  • Guests

Posted 05 November 2002 - 11:04 PM

Hi wildline - and welcome.

Your system is just like mine except that I've allocated a specific directory for the job. It works a treat, doesn't it?

I recently added a little bit and Google has just spidered over 20,000 of the dynamic pages - before today it was 1630. Now I'm curious to see what effect they will have on the PR. It took 2 dances for the 1630 pages to be effective and move the home page from 4 to 5. I wonder if over 20,000 of them is enough to move to 6.

Phil.

#12 cvos

cvos

    Whirl Wind Member

  • Members
  • 93 posts

Posted 06 November 2002 - 01:34 AM

Yes, the benefits of creating easy to use URLs are hard to dismiss. Amazon has done it since day 1. Aside from being search engine friendly, it is end user friendly. I use some form of server side processing on all my sites and have created a nice template that automatically generates title and meta tags from page content.

#13 Black_Knight

Black_Knight

    Honored One Who Served Moderator Alumni

  • Hall Of Fame
  • 9339 posts

Posted 06 November 2002 - 02:50 AM

I'm trying to get a handle on how important mod_rewrite and other such techniques still are, considering that Google seems pretty good at spidering dynamic sites these days. What sayest thou, sir Knight and other noble souls?


No matter how good the search engines become at indexing urls with query strings (and only Google and FAST seem to be making a real effort) there is still a major advantage to static-looking urls: static urls are generally shorter, easier to remember, more logical and users prefer them.

Sites that have switched from query strings to static-seeming URLs generally find that email referrals increase (fewer links break when one user sends them to another), inbound links increase, and generally traffic goes up even beyond the extra power of the search engine listings.

#14 fantomaster

fantomaster

    Unlurked Energy

  • Members
  • 3 posts

Posted 06 November 2002 - 08:40 AM

Sites that have switched from query strings to static-seeming URLs generally find ... fewer links break when one user sends them to another.

Very good point, Ammon. Lots of sites with dynamic content are making use of "Tiny URLs" types of third party services because of just that, but of course from an SEO standpoint this doesn't make a lot of sense as they don't improve link popularity and PR one bit.

The best solution to this issue is undoubtedly mod_rewrite (or its IIS parallel) and every SEO/SEM should at least be aware of it.

#15 ricka

ricka

    Honorary Member

  • Members
  • 342 posts

Posted 06 November 2002 - 11:37 AM

I follow you on the usability advantages of shorter, simpler URLs, but I'm trying to get a handle on how important they are from a spider's perspective. Would it be accurate to summarize that they're essential for all but Google and Fast, and that they might make things easier for those two as well? Or is it more true to say that even Google and Fast will do better with static URLs, despite the fact that they seem to be spidering dynamic ones quite readily? If the latter, I wonder how long it will be before it no longer matters?

#16 fantomaster

fantomaster

    Unlurked Energy

  • Members
  • 3 posts

Posted 06 November 2002 - 12:34 PM

Even those engines (you might want to add AltaVista to your list for what it's worth these days) that will spider dynamic pages with funny characters in their URLs can't always be relied on to do a good, reliable job. Googlebot, for instance, cannot even be trusted to heed the robots.txt convention at all times and under all circumstances. So even if it should work out most of the time, using such URLs you will always run the very real risk of giving spiders the hiccups.

While we have no empirical evidence indicating that such pages will actually be penalized by Google, FAST or AV, the advantages of short, static-looking URLs definitely outweigh any additional effort spent in generating them, the more so as there are no other discernible disadvantages in using them.


