Tuesday, 27 September 2011

Website with subdomains (addition about cookies and sessions)

Oh, and one more thing:
If you want the session or the cookies to work across all of the subdomains, you have to set the domain explicitly for the cookies and for the session cookie.

This is how to do it in PHP. It works for regular domain names, and it falls back to the full server name for IP addresses, localhost, and LAN server names. It does not work for multi-part TLDs (e.g. it works for example.com, but not for example.co.uk, where it would produce ".co.uk"):


if ( strpos($_SERVER["SERVER_NAME"], ".")!==false && preg_match("/([^0-9\.])/i", $_SERVER["SERVER_NAME"]) ) // the server name contains a dot and at least one character that is neither a digit nor a dot (excludes IP's and LAN server names)
{
    $p = preg_split("/\./i", $_SERVER["SERVER_NAME"], -1, PREG_SPLIT_NO_EMPTY); // split the domain name by dots (www, example, com)
    while ( count($p)>2 ) array_shift($p); // drop leading elements until only the last two remain (example and com)
    $COOKIE_PARAMS["dom"] = ".".join(".", $p); // this is what we need: ".example.com"
}
else $COOKIE_PARAMS["dom"] = $_SERVER['SERVER_NAME']; // the full name of the server for IP's and local server names
$COOKIE_PARAMS["path"] = "/"; // the path under which the cookie is available - we'll use this also for the session

session_set_cookie_params (0, $COOKIE_PARAMS["path"], $COOKIE_PARAMS["dom"]);
// lifetime =0 (until browser is closed),
// path=$COOKIE_PARAMS["path"] (path under which the session is available),
// and the important part: domain=$COOKIE_PARAMS["dom"] (".example.com" so it works for all possible subdomains)

setcookie("mycookie", "myvalue", time()+30*60, $COOKIE_PARAMS["path"], $COOKIE_PARAMS["dom"], false); // set a cookie "mycookie" with the value "myvalue", lifetime 30 minutes, available under the path $COOKIE_PARAMS["path"]
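
To make the behaviour easier to see, here is the same logic wrapped in a function and run on a few sample host names. This is only a sketch: the function name cookie_domain_for and the sample hosts are made up for illustration, and the last host shows the multi-part TLD limitation mentioned above.

```php
<?php
// Hypothetical helper (the name is not from the original snippet).
function cookie_domain_for(string $serverName): string
{
    // Names without dots (localhost, LAN names) and dotted-decimal IPs are used as-is.
    if (strpos($serverName, '.') === false || !preg_match('/[^0-9.]/', $serverName)) {
        return $serverName;
    }
    // Split by dots and keep only the last two labels (example and com).
    $parts = preg_split('/\./', $serverName, -1, PREG_SPLIT_NO_EMPTY);
    return '.' . implode('.', array_slice($parts, -2));
}

$hosts = ['www.example.com', 'a.b.example.com', 'localhost', '192.168.1.10', 'www.example.co.uk'];
foreach ($hosts as $host) {
    echo $host, ' => ', cookie_domain_for($host), "\n";
}
```

For www.example.co.uk the function returns ".co.uk", which is not a usable cookie domain, so such domains need separate handling.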



This is it!

Questions? Comments?

Monday, 26 September 2011

Website with subdomains

Given a website, the requirement is to let the users create subdomains of their own choice.
The domain name is configured as a vhost on an Apache HTTP Server. The wildcard *.example.com must be listed in the ServerAlias directive, so that all subdomains are mapped to the same website (see the following lines).

<VirtualHost *:80>
DocumentRoot "/[...]/example_path/"
ServerName "www.example.com"
ServerAlias "www.example.com" "example.com" "*.example.com"

<Directory "/[...]/example_path">
Options FollowSymLinks
AllowOverride All
Order allow,deny
Allow from all
</Directory>

</VirtualHost>


Restrictions:
- the subdomain names must be at least 6 and at most 63 characters long.
- the subdomain names are restricted to alphanumeric characters only (regular expression: [a-z0-9])
- the subdomain name must begin with a letter (regex: [a-z][a-z0-9]{5,62})
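
The restrictions above fit into a single anchored pattern: one letter followed by 5 to 62 letters or digits, which gives 6 to 63 characters in total. A minimal sketch follows; the function name is_valid_subdomain is made up for illustration.

```php
<?php
// Hypothetical helper: checks a subdomain name against the three restrictions.
function is_valid_subdomain(string $name): bool
{
    // ^ and \z anchor the whole string, so stray characters before or after
    // the name are rejected; the i flag accepts upper-case letters too.
    return preg_match('/^[a-z][a-z0-9]{5,62}\z/i', $name) === 1;
}

foreach (['abcdef', 'abcde', '1abcdef', 'abc-def', str_repeat('a', 64)] as $candidate) {
    echo $candidate, ': ', is_valid_subdomain($candidate) ? 'ok' : 'rejected', "\n";
}
```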


Solution:
- create a table which holds the user ids (users are logged in and have a 'uid' integer unique index stored in the database) and the subdomain names. We set a unique constraint on the subdomain column, which rules out two users choosing the same subdomain name:

CREATE TABLE `subdomains` (
`uid` int(11) unsigned NOT NULL,
`subdomain` varchar(128) NOT NULL,
PRIMARY KEY (`uid`, `subdomain`),
UNIQUE KEY `subdomain` (`subdomain`)
);


- create a web form in which the user is allowed to choose a subdomain name.

<form action="choose_subdomain.php" method="POST">
Choose a subdomain: <input type="text" name="subdomain" maxlength="63" value="">
<input type="submit">
</form>

- in choose_subdomain.php, we check the form data and, if there are no errors, save the new subdomain name:

if ( count($_POST) )
{
    $_POST["subdomain"] = isset($_POST["subdomain"]) ? trim($_POST["subdomain"]) : "";
    if ( empty($_POST["subdomain"]) )
    {
        die("you have to choose a subdomain");
    }
    elseif ( !preg_match("/^[a-z][a-z0-9]{5,62}$/i", $_POST["subdomain"]) ) // anchored: a letter followed by 5 to 62 letters or digits, i.e. 6 to 63 characters in total
    {
        die("the subdomain is not correctly formatted. It has to start with a letter and be between 6 and 63 characters long. Only letters and numbers are allowed");
    }
    else
    {
        // we can additionally check whether the subdomain is already taken here
        // e.g. SELECT * FROM subdomains WHERE subdomain='$_POST[subdomain]'
        // if we find any results - then it's an error - the subdomain is already taken
        // otherwise, we can just insert it
        // e.g. INSERT INTO subdomains SET uid=$_SESSION[uid], subdomain='$_POST[subdomain]'
        // now we have a new subdomain registered, we can redirect to it
        // header("Location: http://".$_POST["subdomain"].".example.com");
    }
}
else
{
    // just display the form
}
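
The check-then-insert step described in the comments above can also rely on the unique constraint alone: try the insert and treat a constraint violation as "already taken". Here is a rough sketch using PDO with a prepared statement. To keep it self-contained it assumes an in-memory SQLite database standing in for the MySQL table, and the function name register_subdomain is made up for illustration.

```php
<?php
// Assumption: an in-memory SQLite database replaces the MySQL table from the post.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE subdomains (
    uid INTEGER NOT NULL,
    subdomain VARCHAR(128) NOT NULL UNIQUE,
    PRIMARY KEY (uid, subdomain)
)');

// Hypothetical helper: returns true if the subdomain was registered,
// false if the unique constraint rejected it (someone already has it).
function register_subdomain(PDO $db, int $uid, string $subdomain): bool
{
    $stmt = $db->prepare('INSERT INTO subdomains (uid, subdomain) VALUES (?, ?)');
    try {
        // Lower-casing makes ABCDEF and abcdef count as the same subdomain.
        $stmt->execute([$uid, strtolower($subdomain)]);
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 means an integrity constraint violation.
        if (strpos((string) $e->getCode(), '23') === 0) {
            return false;
        }
        throw $e;
    }
}

var_dump(register_subdomain($db, 1, 'abcdef')); // first user takes the name
var_dump(register_subdomain($db, 2, 'ABCDEF')); // second user is refused
```

Letting the database decide avoids the gap between a separate SELECT and the INSERT, where two users could pick the same name at the same moment.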



To get it to actually work:
- in the root directory of our website ("/[...]/example_path/" - as set in the Apache httpd.conf vhost area), we create or edit the .htaccess file.
We have to add the RewriteEngine directive, or change it if it already exists, so that it is set to "On", like this:

RewriteEngine on


Just after this, we have to add a few lines like these:


RewriteCond %{SERVER_NAME} ^([a-z][a-z0-9]{5,62})\.example\.com$ [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^(.*) http://%1.example.com/subdomain_home.php?old_req=$0 [L]


What do these lines above do?
- The first one checks if the page was accessed via a subdomain with the same format as specified above.
RewriteCond %{SERVER_NAME} ^([a-z][a-z0-9]{5,62})\.example\.com$ [NC]
For example, accessing www.example.com/somepath does not match the regular expression, because "www" has only 3 characters and the pattern requires 6 to 63 alphanumeric characters (starting with a letter) before the ".example.com" string. [NC] is a flag that makes the match case-insensitive (ABCDEF would be matched as well as abcdef).

- The second line requires that the request does not point to an existing file.
RewriteCond %{REQUEST_FILENAME} !-f

This is needed to avoid infinite loops. More info here: http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
In practice, the output HTML page will reference images, CSS, JS, or other kinds of files. For example, supposing we have a file "main.css" in our example_path/ directory, our webpage could contain:

<link rel="stylesheet" href="main.css">


When we access the webpage via a subdomain (say "http://abcdef.example.com"), the browser requests the file by its relative path: "http://abcdef.example.com/main.css". When Apache handles this request, the first RewriteCond condition is met. Without the !-f condition, processing would continue down to the rewrite rule, which would translate the request to "http://%1.example.com/subdomain_home.php". So we would never get the actual CSS file; we would get the HTML output of subdomain_home.php instead, which again references the CSS file, and so on. Requests for existing files therefore must not be rewritten, otherwise we could end up with infinite loops.

- The third line requires that the request does not point to an existing directory (this case is probably less common in practice). The reason is the same as above, for the existing files.
RewriteCond %{REQUEST_FILENAME} !-d

- The fourth line requires that the request does not point to an existing symlink. The reason is the same as above, for the existing files and directories.
RewriteCond %{REQUEST_FILENAME} !-l

- The fifth and last line does the actual rewrite:
RewriteRule ^(.*) http://%1.example.com/subdomain_home.php?old_req=$0 [L]

This line sends all requests, with or without a path, to the file subdomain_home.php on the same subdomain that was matched in line 1 ([a-z][a-z0-9]{5,62}), and passes the originally requested path in the old_req query parameter. Please note that %1 refers to the first captured group of the RewriteCond pattern (line 1), while $0 refers to the whole text matched by the RewriteRule pattern (line 5). More info on this on Apache's doc pages.
The regular expression "^(.*)" translates to: beginning of the string "^", followed by any character, any number of times (even zero). So both requests with an empty path and requests with some path match this rule: "http://abcdef.example.com" - no path specified; or any path given: "http://abcdef.example.com/example_page.html".
The [L] flag marks this as the last rule to be applied to this request. It stops the rule processing, and the rewritten request is then served.
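
To follow how %1 and $0 end up in the target URL, the five lines can be imitated in a few lines of PHP. This is only a sketch of the decision logic, not how Apache evaluates the rules internally; the function name rewrite_target and its parameters are made up for illustration, and the three file tests are collapsed into a single flag.

```php
<?php
// Hypothetical helper imitating the RewriteCond / RewriteRule lines above.
// Returns the rewritten URL, or null when the request is left untouched.
function rewrite_target(string $host, string $path, bool $existsOnDisk): ?string
{
    // Line 1: the host must look like a user subdomain; $m[1] plays the role of %1.
    if (!preg_match('/^([a-z][a-z0-9]{5,62})\.example\.com$/i', $host, $m)) {
        return null;
    }
    // Lines 2-4: existing files, directories and symlinks are served as they are.
    if ($existsOnDisk) {
        return null;
    }
    // Line 5: $path plays the role of $0, the whole text matched by ^(.*).
    return 'http://' . $m[1] . '.example.com/subdomain_home.php?old_req=' . $path;
}

var_dump(rewrite_target('abcdef.example.com', 'example_page.html', false));
var_dump(rewrite_target('abcdef.example.com', 'main.css', true));
var_dump(rewrite_target('www.example.com', 'somepath', false));
```

Only the first call yields a rewritten URL; the existing file and the www host are both left alone.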



The subdomain_home.php file, also placed in our website's root directory ("/[...]/example_path/subdomain_home.php"), will do whatever is needed based on the requested subdomain. Here is an example of it:

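$subdomain_owner = null; // stays null unless the subdomain is found in the database
if ( preg_match("/^([a-z][a-z0-9]{5,62})\.example\.com$/i", $_SERVER['SERVER_NAME'], $matches) )
{
    $subdomain = $matches[1]; // only letters and digits at this point, so it is safe to place inside the query
    // note: the mysql_* functions were removed in PHP 7; use mysqli or PDO on newer versions
    $r = mysql_query("SELECT uid FROM subdomains WHERE subdomain='".$subdomain."'");
    if ( $r && mysql_num_rows($r) ) { $row = mysql_fetch_assoc($r); $subdomain_owner = $row["uid"]; }
}

if ( $subdomain_owner )
{
    $old_req = isset($_GET["old_req"]) ? $_GET["old_req"] : ""; // in .htaccess context the path arrives without its leading slash
    echo "Yippiee! You have successfully reached the working subdomain of user ".$subdomain_owner."! <br>\n";
    echo "You have originally requested the page: http://".$_SERVER['SERVER_NAME']."/".htmlspecialchars($old_req)." <br>\n"; // escape user input before printing it
    echo "And you have reached: http://".$_SERVER['SERVER_NAME'].htmlspecialchars($_SERVER["REQUEST_URI"])." <br>\n";
}
else
{
    die("you have accessed this webpage from a nonexistent subdomain. go away!");
}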



That's it!
Opinions? Thoughts?

Tuesday, 13 September 2011

Regular expression for URL matching

I was searching for a regular expression pattern to match URLs. Some get really complicated, and none really seems to meet all needs.
I've built my own which, at first glance, handles many of the variants met in practice.
Here's the regular expression pattern, and the results of running it on some test strings (some well-formed URLs and some not).

What do you think about it? If anybody tests this, please send some feedback.

Thanks,
Lucian Costin



TESTING WITH RegEX
^http\:\/\/([\w\d\-\.]+)\.([\w\d\-\.]+)($|\/|\?|\#|(\\[\w\d\:\#\@\%\/\;$\~\_\?\\+\-\=\\\.\&\,]+))


does not match http://
does not match http://123
does not match http://www
does not match http://www.
does match http://a.com
does match http://a.com??
does match http://www.a.com/a
does match http://www.a.com/a/b
does match http://www.a.com/a/b/a.php?a
does match http://www.a.com/aa.php?a
does match http://www.a.com/aa.php?a=b
does match http://www.a.com/aa.php?a=b#abcd
does match http://127.027.0.2
does match http://127.027.0.2#be
does match http://127.027.0.2/
does match http://127.027.0.2/a
does match http://127.027.0.2/a.php
does match http://127.027.0.2/a.php?a
does match http://127.027.0.2/?a
does match http://a.com?a
does match http://a.com#me
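
For anybody who wants to repeat the test, here is a small PHP sketch that runs the pattern over a few of the strings above. The pattern is copied as-is from the post; the "/" delimiters, the nowdoc wrapper, and the function name url_matches are additions made for illustration.

```php
<?php
// A nowdoc keeps every backslash of the pattern exactly as written.
$pattern = <<<'RE'
/^http\:\/\/([\w\d\-\.]+)\.([\w\d\-\.]+)($|\/|\?|\#|(\\[\w\d\:\#\@\%\/\;$\~\_\?\\+\-\=\\\.\&\,]+))/
RE;

// Hypothetical helper: true when the pattern matches the given string.
function url_matches(string $pattern, string $url): bool
{
    return preg_match($pattern, $url) === 1;
}

$tests = ['http://', 'http://www.', 'http://a.com', 'http://a.com??', 'http://127.027.0.2#be'];
foreach ($tests as $url) {
    echo url_matches($pattern, $url) ? 'does match ' : 'does not match ', $url, "\n";
}
```

Note that the pattern is not anchored at the end, so once the host is followed by "/", "?" or "#", whatever comes after it is not checked.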