Monday, 26 September 2011

Website with subdomains

Given a website, the requirement is to let users create subdomains of their own choice.
The domain name is configured as a vhost on an Apache HTTP Server. The wildcard entry "*.example.com" must be listed in the ServerAlias directive, so that all subdomains are mapped to the same website (see the following lines).

<VirtualHost *:80>
    DocumentRoot "/[...]/example_path/"
    ServerName "www.example.com"
    ServerAlias "www.example.com" "example.com" "*.example.com"

    <Directory "/[...]/example_path">
        Options FollowSymLinks
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>

</VirtualHost>


Restrictions:
- the subdomain names must be at least 6 and at most 63 characters long.
- the subdomain names are restricted to alphanumeric characters only (regular expression: [a-z0-9])
- the subdomain name must begin with a letter (regex: [a-z][a-z0-9]{5,62})
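
The three restrictions can be folded into a single check. Here is a minimal sketch in PHP; the helper name is_valid_subdomain() is hypothetical (it does not appear in the code further down), and the caller is assumed to have lowercased the name already. The repetition count is {5,62}, so that the leading letter plus the rest adds up to 6 to 63 characters:

```php
<?php
// Hypothetical helper (not from the original post): returns true when $name
// satisfies all three restrictions listed above.
// Assumption: $name has already been lowercased by the caller.
function is_valid_subdomain($name)
{
    // One leading letter plus 5 to 62 letters/digits = 6 to 63 characters.
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[a-z][a-z0-9]{5,62}$/D', $name) === 1;
}

var_dump(is_valid_subdomain('abcdef'));   // long enough, starts with a letter
var_dump(is_valid_subdomain('abc'));      // too short
var_dump(is_valid_subdomain('1abcdef'));  // starts with a digit
var_dump(is_valid_subdomain('abc-def'));  // the hyphen is not allowed
```

Only the first call reports true; the other three names each break one of the restrictions.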


Solution:
- create a table which holds the user ids (users are logged in and have a unique integer 'uid' stored in the database) and the subdomain names. We set a unique constraint on the subdomain column, which rules out two users choosing the same subdomain name.

CREATE TABLE `subdomains` (
    `uid` int(11) unsigned NOT NULL,
    `subdomain` varchar(128) NOT NULL,
    PRIMARY KEY (`uid`, `subdomain`),
    UNIQUE KEY `subdomain` (`subdomain`)
);


- create a web form in which the user is allowed to choose a subdomain name.

<form action="choose_subdomain.php" method="POST">
Choose a subdomain: <input type="text" name="subdomain" maxlength="63" value="">
<input type="submit">
</form>

- in choose_subdomain.php we check the form data and, if there are no errors, save the new subdomain name:

if ( count($_POST) )
{
    // normalize the input: strip surrounding whitespace and lowercase it
    $subdomain = strtolower(trim(isset($_POST["subdomain"]) ? $_POST["subdomain"] : ""));
    if ( $subdomain === "" )
    {
        die("you have to choose a subdomain");
    }
    // anchored pattern: one letter, then 5 to 62 letters or digits (6 to 63 characters in total)
    elseif ( !preg_match('/^[a-z][a-z0-9]{5,62}$/D', $subdomain) )
    {
        die("the subdomain is not correctly formatted. It has to start with a letter and be between 6 and 63 characters long. Only letters and numbers are allowed");
    }
    else
    {
        // we can additionally check for the subdomain here
        // e.g. SELECT * FROM subdomains WHERE subdomain = '<value of $subdomain>'
        // if we find any results - then it's an error - the subdomain is already taken
        // otherwise, we can just insert it
        // e.g. INSERT INTO subdomains SET uid = <value of $_SESSION["uid"]>, subdomain = '<value of $subdomain>'
        // now we have a new subdomain registered, we can redirect to it
        // header("Location: http://".$subdomain.".example.com"); exit;
    }
}
else
{
    // just display the form
}
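
The commented-out steps in the else branch could be sketched as follows. This is only an illustration under assumptions: it uses PDO, which is a choice made here and not something the post prescribes, the function name register_subdomain() is hypothetical, and it relies on the unique key on the subdomain column to reject duplicates instead of running a SELECT first (which also avoids a race between the check and the insert).

```php
<?php
// Hypothetical helper: tries to store the subdomain for the given user.
// Returns true on success, false when a key constraint rejects the row
// (for example because the subdomain is already taken).
// Assumption: $pdo is an open PDO connection to the database that holds
// the `subdomains` table.
function register_subdomain(PDO $pdo, $uid, $subdomain)
{
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $stmt = $pdo->prepare('INSERT INTO subdomains (uid, subdomain) VALUES (?, ?)');
    try {
        $stmt->execute(array((int) $uid, strtolower($subdomain)));
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint violation.
        if (substr((string) $e->getCode(), 0, 2) === '23') {
            return false;
        }
        throw $e;
    }
}

// Possible use inside the else branch (session and form handling as above):
// if ( register_subdomain($pdo, $_SESSION["uid"], $subdomain) ) {
//     header("Location: http://".$subdomain.".example.com");
//     exit;
// }
// die("the subdomain is already taken");
```

Letting the database enforce uniqueness keeps the check and the insert in one step, so two users submitting the same name at the same moment cannot both succeed.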



To get it to actually work:
- in the root directory of our website ("/[...]/example_path/", as set in the vhost section of Apache's httpd.conf), we create or edit the .htaccess file.
We have to add the RewriteEngine directive, or change it if it is already there, so that it is set to "on", like this:

RewriteEngine on


Just after this, we have to add a few lines like these:


RewriteCond %{SERVER_NAME} ^([a-z][a-z0-9]{5,62})\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^(.*) http://%1.example.com/subdomain_home.php?old_req=$0 [L]


What do these lines above do?
- The first one checks whether the page was accessed via a subdomain with the format specified above.
RewriteCond %{SERVER_NAME} ^([a-z][a-z0-9]{5,62})\.example\.com$ [NC]
For example, accessing www.example.com/somepath does not match the regular expression ^([a-z][a-z0-9]{5,62})\.example\.com$, which allows only 6 to 63 alphanumeric characters, the first of them a letter, before the ".example.com" string ("www" is too short). The dots are escaped so that they match a literal dot only. [NC] is a flag that makes the match case-insensitive (ABCDEF is matched as well as abcdef).

- The second line requires that the request does not point to an existing file.
RewriteCond %{REQUEST_FILENAME} !-f

This is needed to rule out infinite loops. More info here: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
In practice, the output HTML page will reference images, CSS, JS, or other kinds of files. For example, supposing we have a file "main.css" in our example_path/ directory, our webpage could contain:

<link rel="stylesheet" href="main.css">


When we access the webpage via a subdomain (say "http://abcdef.example.com"), the browser resolves the relative path and requests "http://abcdef.example.com/main.css". When Apache handles this request, the first RewriteCond condition is met. Without the !-f condition, processing would continue down to the rewrite rule, which would translate the request to "http://abcdef.example.com/subdomain_home.php". So we would never get the actual CSS file, but the HTML output of subdomain_home.php instead, which again contains a request for the CSS file, and so on. Requests for existing files therefore must not be rewritten, otherwise we could end up with infinite loops.

- The third line requires that the request does not point to an existing directory (this is probably less common in practice). The reason is the same as above for existing files.
RewriteCond %{REQUEST_FILENAME} !-d

- The fourth line requires that the request does not point to an existing symlink. The reason is the same as above for existing files and directories.
RewriteCond %{REQUEST_FILENAME} !-l

- The fifth and last line does the actual rewrite:
RewriteRule ^(.*) http://%1.example.com/subdomain_home.php?old_req=$0 [L]

This line sends all requests that passed the conditions above to the file subdomain_home.php on the same subdomain that was matched at line 1 ([a-z][a-z0-9]{5,62}), passing the originally requested path along in the old_req parameter. Please note that %1 refers to the first parenthesized group of the RewriteCond pattern (line 1), while $0 refers to the whole text matched by the RewriteRule pattern (line 5). More info on this on Apache's doc pages.
The regular expression "^(.*)" translates to: beginning of the string "^", followed by any character, any number of times, even zero. So the pattern matches both an empty path, as in "http://abcdef.example.com", and any given path, as in "http://abcdef.example.com/example_page.html".
The [L] flag marks this as the last rule applied to the request: it stops the processing of further rules.
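
To make the interplay of the five lines concrete, the decision they encode can be mimicked in plain PHP. This is only a sketch of the logic, not of how mod_rewrite is implemented; the function name would_rewrite() and its parameters are hypothetical, and the host pattern is written with escaped dots and an end anchor.

```php
<?php
// Hypothetical illustration of the rule set's logic (not mod_rewrite itself).
// $host    - the host name of the request, e.g. "abcdef.example.com"
// $path    - the requested path relative to the document root, e.g. "main.css"
// $docroot - the document root directory on disk
// Returns the rewrite target, or null when the request is left untouched.
function would_rewrite($host, $path, $docroot)
{
    // Line 1: the host must be a well-formed user subdomain of example.com.
    if (!preg_match('/^([a-z][a-z0-9]{5,62})\.example\.com$/i', $host, $m)) {
        return null;
    }
    // Lines 2-4: existing files, directories and symlinks are served as they are.
    $file = rtrim($docroot, '/') . '/' . ltrim($path, '/');
    if (is_file($file) || is_dir($file) || is_link($file)) {
        return null;
    }
    // Line 5: everything else goes to subdomain_home.php.
    // $m[1] plays the role of %1, $path the role of $0.
    return 'http://' . $m[1] . '.example.com/subdomain_home.php?old_req=' . $path;
}
```

With a document root that contains main.css, a call such as would_rewrite('abcdef.example.com', 'main.css', $docroot) returns null, while a path that does not exist on disk yields the subdomain_home.php target.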



The subdomain_home.php file, also placed in our website's root directory ("/[...]/example_path/subdomain_home.php"), does whatever is needed based on the requested subdomain. Here is an example:

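// Assumption: $pdo is an already-open PDO connection to the database that holds
// the subdomains table (the mysql_* functions are no longer available from PHP 7 on).
$subdomain_owner = null;
if ( preg_match('/^([a-z][a-z0-9]{5,62})\.example\.com$/i', $_SERVER['SERVER_NAME'], $matches) )
{
    $subdomain = strtolower($matches[1]);
    $stmt = $pdo->prepare("SELECT uid FROM subdomains WHERE subdomain = ?");
    $stmt->execute(array($subdomain));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ( $row ) { $subdomain_owner = $row["uid"]; }
}

if ( $subdomain_owner )
{
    // escape everything that comes from the request before echoing it
    $host = htmlspecialchars($_SERVER['SERVER_NAME']);
    $old_req = htmlspecialchars(isset($_GET["old_req"]) ? $_GET["old_req"] : "");
    echo "Yippiee! You have successfully reached the working subdomain of user ".$subdomain_owner."! <br>\n";
    // old_req arrives without its leading slash, so we add it back
    echo "You have originally requested the page: http://".$host."/".$old_req." <br>\n";
    echo "And you have reached: http://".$host.htmlspecialchars($_SERVER["REQUEST_URI"])." <br>\n";
}
else
{
    die("you have accessed this webpage from a nonexistent subdomain. go away!");
}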



That's it!
Opinions? Thoughts?
