Form and Spelling Validation
URLs can be written in so many ways that it is very difficult to create a function that will validate all of them. The best approach is to decide what type of URLs you would like to accept and validate only those. For instance, if your HTML form collects data for a guestbook, you probably won't accept URLs that start with ftp:// or https:// for the user's homepage. For this example, we will assume that the URL we want to validate is for a user's homepage.
URLs for homepages can still come in many flavors. Consider the following examples:
http://www.somewhere.com/
http://www.somewhere.com
http://somewhere.com
http://me.somewhere.com/
http://somewhere.com/me
http://somewhere.com/~me
http://123.123.123.123/me
...and the list goes on.
Creating a regular expression, or any other mechanism, to validate all these URLs would quickly become a mess. In cases such as this, it is best to keep things simple and accept just about anything. The only rules we will place on our validation routine are that a URL must start with http:// and must include at least one period with other characters on either side of it. Very generic, but the last thing you want to do is make something so restrictive that a user gives up in disgust.
<?php
function checkURL($url)
{
    // http:// first, then a period with characters on both sides
    return preg_match('#^http://.+\..+#i', $url) === 1;
}
?>
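To see how permissive these rules are, the following sketch runs a few of the homepage examples, plus two strings that should fail, through a check of this kind. The function name looksLikeHomepageUrl() and the anchored pattern are assumptions made for illustration; they follow the rules stated above rather than any definitive implementation.

```php
<?php
// Hypothetical helper for illustration only: the URL must begin with http://
// and contain a period with at least one character on each side of it.
function looksLikeHomepageUrl(string $url): bool
{
    return preg_match('#^http://.+\..+#i', $url) === 1;
}

$samples = [
    'http://www.somewhere.com/',
    'http://somewhere.com/~me',
    'http://123.123.123.123/me',
    'ftp://somewhere.com/',   // wrong scheme
    'http://localhost',       // no period anywhere
];

foreach ($samples as $sample) {
    echo $sample, ' => ', looksLikeHomepageUrl($sample) ? 'accepted' : 'rejected', "\n";
}
```

The first three samples are accepted and the last two are rejected. Note that nothing here checks that the host actually exists.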
Sometimes, however, this simple match isn't quite enough. If we were collecting URLs for a link directory, we might want to be sure that we can reach a host before publishing it. In this case, we would first do a simple check like the one above, and then attempt to actually contact the host. We will accomplish this by using PHP's parse_url() function to extract the hostname from the URL and then trying to open a socket to that hostname. It is a common misconception that parse_url() can be used to validate a URL. It cannot: it only splits a string into its components, and it will return parts for partial and invalid URLs as well.
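<?php
function checkURLandConnect($url)
{
    // Cheap syntax check first
    if (!preg_match('#^http://.+\..+#i', $url)) {
        return FALSE;
    }
    // parse_url() can fail or omit the host
    $parts = parse_url($url);
    if (empty($parts['host'])) {
        return FALSE;
    }
    // Try port 80 with a 10-second timeout; @ hides the warning on failure
    $fp = @fsockopen($parts['host'], 80, $errno, $errstr, 10);
    if (!$fp) {
        return FALSE;
    }
    fclose($fp);
    return TRUE;
}
?>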
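The point about parse_url() not being a validator is easy to demonstrate. The short sketch below feeds it a string that is clearly not a URL and then one of the homepage examples; the variable names are illustrative only.

```php
<?php
// parse_url() only splits a string into components; it does not judge validity.
$notAUrl = parse_url('not a url');
var_dump(is_array($notAUrl));        // parts are still returned for a non-URL
var_dump(isset($notAUrl['host']));   // but no host component was found

// For a well-formed homepage URL, the host is available for the socket test.
$homepage = parse_url('http://www.somewhere.com/~me');
echo $homepage['scheme'], ' | ', $homepage['host'], ' | ', $homepage['path'], "\n";
```

This is why the function above runs its own pattern check first and then confirms that a host was actually extracted before trying to connect.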
This additional functionality should be used with care. If a malicious person knew that they could use your script, on your web server, to open connections to another host, they could potentially use it to perform a denial of service attack. One way to reduce this risk is to limit how often the validation is done: track the submitter's IP address and skip the connection test if the same address has already submitted within a certain time frame.
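One way the per-address limit might look is sketched below. The function mayRunConnectCheck(), the in-memory $lastSeen array, and the 60-second default are all assumptions for illustration; a real form handler would typically read the address from $_SERVER['REMOTE_ADDR'] and keep the timestamps in a file, database, or cache so they survive between requests.

```php
<?php
// Hypothetical helper: decides whether a submitter may trigger the connection test.
// $lastSeen maps an IP address to the Unix timestamp of its last allowed check.
function mayRunConnectCheck(array &$lastSeen, string $ip, int $now, int $minInterval = 60): bool
{
    if (isset($lastSeen[$ip]) && ($now - $lastSeen[$ip]) < $minInterval) {
        return false; // same address checked too recently; skip the connection test
    }
    $lastSeen[$ip] = $now; // record this check
    return true;
}

// Timestamps are passed in explicitly so the behaviour is easy to follow.
$lastSeen = [];
var_dump(mayRunConnectCheck($lastSeen, '203.0.113.7', 1000)); // bool(true)
var_dump(mayRunConnectCheck($lastSeen, '203.0.113.7', 1030)); // bool(false)
var_dump(mayRunConnectCheck($lastSeen, '203.0.113.7', 1061)); // bool(true)
```

When the helper returns false, the script can fall back to the simple pattern check alone rather than rejecting the submission outright.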
More By Matt Wade