mod_rewriting an entire site

A little while ago, I received an email from Brad Timinski, who asked:

In your mod_rewrite article, you mention that you ‘use mod_rewrite to redirect all pages to one central PHP page’. How exactly do you do that?

Well, in retrospect it isn’t too difficult, but I do remember that when I was trying to work out how to do this myself it did cause a bit of head-scratching. On my site, I decided to use an all-index structure, as that’s how I prefer to do things. It hides the scripting language from the end user better than linking to pages such as “something-bizarre.jsp” would, and it means that if the scripting language used to create the pages ever changed, the names of the pages wouldn’t have to.

In using mod_rewrite to modify an entire website, the following points needed to be addressed:

  • Images and CSS files should not be rewritten.
  • Since the only subdomain used by the site is ‘www’, if the user does not enter it then it should be added automatically and visibly for them.
  • All versions of a webpage should be automatically and visibly rewritten to a single URL. For example, ‘www.example.com/somepage/’, ‘example.com/somepage/’, ‘www.example.com/somepage’ and ‘example.com/somepage’ should all resolve to ‘www.example.com/somepage/’.
  • Once all visible rewriting has been completed, the URL should be invisibly redirected to a master page which is able to interpret the URL which the user requested and serve up the correct content.
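
As a rough illustration of the visible part of these requirements, here is a small Python sketch. It is not part of the site's configuration, and `canonical_url` is a hypothetical name used only for this example. It assumes plain http and treats any path containing a dot as a file that should not gain a trailing slash, which is the same test the trailing-slash condition in the rules relies on.

```python
import re


def canonical_url(host: str, path: str) -> str:
    """Return the single visible URL a request should end up at (illustrative only)."""
    # The only subdomain in use is 'www', so add it when it is missing.
    if not host.startswith("www."):
        host = "www." + host
    # Add a trailing slash to paths that contain no dot and do not already
    # end in a slash (the same pattern as the condition ^/[^\.]+[^/]$).
    if re.fullmatch(r"/[^.]+[^/]", path):
        path += "/"
    return f"http://{host}{path}"


# The four variants of the same page all collapse to one URL.
variants = [
    ("www.example.com", "/somepage/"),
    ("example.com", "/somepage/"),
    ("www.example.com", "/somepage"),
    ("example.com", "/somepage"),
]
results = {canonical_url(host, path) for host, path in variants}
```

After running this, `results` holds a single entry, since every variant maps to the same address.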

The following is what I came up with. Please refer to “mod_rewrite, a beginner’s guide (with examples)” if you need any extra pointers as to what anything means.

###################################################
# Turn the RewriteEngine on.                      #
###################################################

RewriteEngine on

###################################################
# Add a leading www to domain if one is missing.  #
###################################################
# If this rule is used, the rewriting stops here  #
# and then restarts from the beginning with the   #
# new URL                                         #
###################################################

RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

###################################################
# Do not process images or CSS files further      #
###################################################
# No more processing occurs if this rule is       #
# successful                                      #
###################################################

RewriteRule \.(css|jpe?g|gif|png)$ - [L]

###################################################
# Add a trailing slash if needed                  #
###################################################
# If this rule is used, the rewriting stops here  #
# and then restarts from the beginning with the   #
# new URL                                         #
###################################################

RewriteCond %{REQUEST_URI} ^/[^\.]+[^/]$
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [R=301,L]

###################################################
# Rewrite web pages to one master page            #
###################################################
# /somepage/            => master.php             #
#                            ?page=somepage       #
# /somesection/somepage => master.php             #
#                            ?section=somesection #
#                            &page=somepage       #
# /somesection/somesub/somepage/                  #
#                       => master.php             #
#                            ?section=somesection #
#                            &subsection=somesub  #
#                            &page=somepage       #
###################################################
# Variables are accessed in PHP using             #
# $_GET['section'], $_GET['subsection'] and       #
# $_GET['page']                                   #
###################################################
# No more processing occurs if any of these rules #
# are successful                                  #
###################################################

RewriteRule ^([^/\.]+)/?$ /master.php?page=$1 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/?$ /master.php?section=$1&page=$2 [L]
RewriteRule ^([^/\.]+)/([^/\.]+)/([^/\.]+)/?$ /master.php?section=$1&subsection=$2&page=$3 [L]
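
To see how the three final rules carve a path into parameters, the patterns can be tried outside Apache. The sketch below is Python rather than Apache configuration and only approximates what mod_rewrite does. `RULES` and `route` are hypothetical names for this example, and it assumes the per-directory (.htaccess) behaviour where the leading slash is removed before the pattern is matched.

```python
import re
from typing import Dict, Optional

# The same three patterns as the RewriteRules, paired with the query
# parameter names that each captured group is assigned to.
RULES = [
    (re.compile(r"^([^/\.]+)/?$"), ("page",)),
    (re.compile(r"^([^/\.]+)/([^/\.]+)/?$"), ("section", "page")),
    (re.compile(r"^([^/\.]+)/([^/\.]+)/([^/\.]+)/?$"),
     ("section", "subsection", "page")),
]


def route(path: str) -> Optional[Dict[str, str]]:
    """Return the parameters master.php would receive, or None if no rule matches."""
    # Assumption: in .htaccess context the leading slash is stripped first.
    if path.startswith("/"):
        path = path[1:]
    for pattern, names in RULES:
        match = pattern.match(path)
        if match:
            # The first matching rule wins, as with the [L] flag.
            return dict(zip(names, match.groups()))
    return None


one_level = route("/somepage/")
two_levels = route("/somesection/somepage/")
three_levels = route("/somesection/somesub/somepage/")
```

Paths that contain a dot, or that are more than three levels deep, match none of the patterns, so `route` returns `None` for them.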

Comments

  1. by Anonymous on October 26, 2005 10:46 PM

    Thx a lot for this one! It just solved my very problem! Thx again!

  2. by Anonymous on January 4, 2006 03:32 AM

    Great info, thank you.

  3. by Josh Prowse on March 28, 2006 03:44 PM

    Great explanations, but I had one problem. I found that if I was using more than one parameter, I got weird results because of the ampersand (&).

    For example, if my rule was this:

    RewriteRule ^blog/([^/\.]+)/([^/\.]+)/?$ /blog.php?yr=$1&mo=$2 [L]
    

    And I used the URL:

    /blog/2005/10
    

    Then the resulting GET info looked like this:

    [yr] => 2005blog/2005/10mo=10
    

    The ampersand was replaced with the entire original page request! The only way I could find around this was to add a backslash before the ampersand in the rule, so it looked like this:

    RewriteRule ^blog/([^/\.]+)/([^/\.]+)/?$ /blog.php?yr=$1\&mo=$2 [L]
    

    and then everything was coolio.

  4. by Doug on April 13, 2006 04:02 AM

    Great info, thanks!

    I was wondering if anyone knew a way of getting the title from the child doc. For example: I have index.php and program.php, so when I write /program/ it all works out with mod_rewrite in place. But is there any way to get info specific to program.php (like the title) into other parts of the index.php file? I can get it to work but only after the program.php file is loaded in…

    I hope this makes sense, any help appreciated, thanks!

  5. by jalpa on April 18, 2006 12:21 PM

    Does this not work in IIS?

  6. by Neil Crosby on April 18, 2006 07:29 PM

    jalpa: Nope, it’s an Apache-only thing.

  7. by Twan on May 27, 2006 12:54 PM

    Great blog!

  8. by Corsin on June 9, 2006 01:56 PM

    How do you handle this when a real folder exists? Say you have /posts/abc/ and you also have a folder /resume/. How do you access this /resume/ folder?

    Thanks Corsin

  9. by Anonymous on June 14, 2006 12:18 AM

    I am looking for a way to add a trailing “/” onto multiple subdirectories any time they are called. All the examples I have seen are for the root or one directory deep, and when I try to expand them, it does not work. Ideally, I need to traverse 5 levels deep.

    http://localhost/aaa --> http://localhost/aaa/
    http://localhost/aaa/bbb --> http://localhost/aaa/bbb/
    ...and so forth.

    Thanks for your time! - Ryan

  10. by Tapan Bhanot on July 2, 2006 09:31 AM

    Hi,

    Yes, I would also like to know how you deal with question #8 by Corsin. I want to go into the admin folder but I cannot, as the rule in .htaccess redirects the request to the script. Please help.

    Thanks.

  11. by Tapan Bhanot on July 2, 2006 10:08 AM

    Hi Corsin,

    Use the following to achieve what you’re trying to do. The 1st, 2nd and 3rd lines are conditions which first check that the request isn’t for an existing file, directory, or symbolic link.

    If it isn’t, the 4th line sends the request to a file named showcat.php.

    RewriteCond %{REQUESTFILENAME} !-f
    RewriteCond %{REQUESTFILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l
    RewriteRule .* showcat.php [QSA,L]
    

    Hope this helps.

  12. by Miles on July 19, 2006 06:55 PM

    What do I do if I want to have directories instead of dynamic?

  13. by Corsin on July 19, 2006 08:06 PM

    Hello Tapan and thank you

    Just a correction, in case somebody else needs this:

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l
    

    Your file and directory lines were missing the _ in REQUEST_FILENAME :)

    Thank you again — Corsin

  14. by Nicholas on October 18, 2006 10:33 PM

    I too really appreciate an article such as this that clearly explains what is going on.

    I have a question in relation to redirects, as I made some changes to a site as part of some SEO work and the new page links are not getting updated in Google.

    The links indexed by Google included a session id, so there were 4000+ pages indexed for a small dynamic site. I also moved the shop from an old silly directory to one called /shop/.

    I successfully added the appropriate rewrites, however I wanted to ask if this could cause problems with Google. Any link followed by Google, or referred by Google, that contains the session id results in 2 permanent redirects before landing at the correct page. Is this problematic?

  15. by Tony on January 13, 2007 06:13 PM

    Hi, two great articles! I’m having a problem, probably simple, but I just can’t seem to figure out what is happening. I have multiple variables separated by slashes (e.g. pagename/var1/var2/var3/var4/var5/). The problem is that not all these variables are used all the time, so I get stuff like pagename/var1///var4/var5/. The variables are always in the same order, and my rewrite rule for each of the optional variables is ([a-z0-9]*)/([a-z0-9]*). When I look at the rewrite log, it would seem that the multiple slashes are being collapsed. The result is that none of my rules are matched despite the sections being optional (by using the *, I thought that meant 0 to N times?). Google has turned up nothing. Any help would be appreciated!

  16. by Spinner on January 29, 2007 12:55 PM

    Thanks for the tutorial. I have a few questions:-

    1. How can I force some pages to use https?
    2. How can I handle invalid pages? Is it possible to redirect them to the normal 404?

    Thanks

  17. by steve on February 28, 2007 10:13 PM

    I have used the above code for my personal coding project s|og, accessible at my personal site. If it is considered “closed source” please let me know and I will remove it.

  18. by Fredway on March 16, 2007 02:50 AM

    How can I mod_rewrite this one: http://www.firstinkjets.co.uk/index.php?action=terms&sessionid=sMkxiM6uqcMLFFXwv0ix5jPvSsZ7HzPAZ4mh246c1yCF2WKo0AG06lrks5O3xL5U

    please help me on my problem…please

  19. by Fredway on March 16, 2007 04:17 AM

    How do I remove the sessionid using mod_rewrite? Here’s the URL: http://www.firstinkjets.co.uk/index.php?action=terms&sessionid=sMkxiM6uqcMLFFXwv0ix5jPvSsZ7HzPAZ4mh246c1yCF2WKo0AG06lrks5O3xL5U and I want to change it to http://www.firstinkjets.co.uk/action/terms.html

  20. by Pezco on March 27, 2007 10:31 PM

    What about style sheets and images? They will not load unless you specify an absolute URL.

  21. by Kumar on July 10, 2007 05:54 PM

    Thanks for explaining mod_rewrite. I need to do the following: when users go to www.abc.com/owner it should redirect to

    www.abc.com/servlet/Satellite/owner

    www.abc.com/seller to www.abc.com/servlet/Satellite/seller etc.

    I need to use a wildcard since there are 100s of directories that need to be redirected. Is there a way we can hide Satellite/owner from the user’s browser?

    thanks Sag

about wwm

workingwith.me.uk is a resource for web developers created by Neil Crosby, a web developer who lives and works in London, England. More about the site.

Neil Crosby now blogs at The Code Train and also runs NeilCrosby.com, The Ten Word Review and Everything is Rubbish.