PHP Strings Primer - Manually Stripping Tags
(Page 28 of 37)
Earlier, we looked at the 'strip_tags()' function, which removes HTML tags from user input. But what would happen if you woke up one day and that function no longer existed? Let's examine how to duplicate some of its functionality with string functions.
<?php
$text = "The <b>word</b> is bolded.<br /> Here is a <a href=\"page.php\">link</a>.";

// Keep going while at least one '<' remains in the text.
while (substr_count($text, '<') != 0) {
    $start = strpos($text, '<');      // position of the first '<'
    $end = strpos($text, '>');        // position of the first '>'
    $length = ($end - $start) + 1;    // length of the tag, including '>'
    $text = substr_replace($text, '', $start, $length);
}

echo $text;
?>
This would output:
The word is bolded. Here is a link.
Now, how exactly does this work? We run a while loop whose condition is that there is at least one occurrence of the less-than symbol in the text. As long as one less-than symbol remains, the loop continues. Within the loop, we use the 'strpos()' function to obtain the position of the first less-than symbol and the first greater-than symbol. These two symbols mark the beginning and end of an HTML tag. We then calculate the length of the tag: the distance between the two positions, plus one so that the closing greater-than symbol is included.
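To make the arithmetic concrete, here is a minimal sketch that traces only the first pass of the loop over the same sample text. The variable '$firstPass' is an illustrative name that does not appear in the original listing.

```php
<?php
// Trace of the first pass of the loop over the sample text.
$text = "The <b>word</b> is bolded.<br /> Here is a <a href=\"page.php\">link</a>.";

$start  = strpos($text, '<');     // 4: "The " occupies positions 0-3
$end    = strpos($text, '>');     // 6: the '>' that closes "<b>"
$length = ($end - $start) + 1;    // 3: the characters '<', 'b' and '>'

// Replace those three characters with an empty string.
// $firstPass is an illustrative name, not part of the original listing.
$firstPass = substr_replace($text, '', $start, $length);

echo $firstPass, "\n";
// The word</b> is bolded.<br /> Here is a <a href="page.php">link</a>.
```

Each further pass repeats the same three steps on the shortened text until no less-than symbol is left.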
After we have gathered these values, we use the 'substr_replace()' function to replace the substring, specified by the start and length values, with an empty string. If there are still less-than symbols remaining, the loop continues. This code has two shortcomings. If there is a less-than symbol with no matching greater-than symbol, the loop can execute forever. And if a greater-than symbol appears before the first less-than symbol, the length calculation goes wrong and we can lose data. We will address the infinite looping problem, but leave the second problem for you to solve with the knowledge of regular expressions you will gain later in your programming career.
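The first shortcoming can be observed without actually hanging the script. The sketch below runs a single pass of the loop body over a short, made-up string that contains an unclosed less-than symbol. The string and the variable '$unchanged' are assumptions chosen for illustration.

```php
<?php
// A short, made-up string with a '<' that is never closed.
$text = "link<.";

$start  = strpos($text, '<');    // 4
$end    = strpos($text, '>');    // false: there is no '>' at all
$length = ($end - $start) + 1;   // false counts as 0, so (0 - 4) + 1 = -3

// A negative length tells substr_replace() where to stop, counted from the
// end of the string. Here that stopping point lies before $start, so
// nothing is removed and the '<' is still there for the next pass.
$unchanged = substr_replace($text, '', $start, $length);

var_dump($end);          // bool(false)
var_dump($length);       // int(-3)
var_dump($unchanged);    // string(6) "link<."
```

Because the text comes back unchanged, the loop condition would stay true on every pass.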
To solve our infinite loop problem, we can simply set the length to one whenever there is no matching greater-than symbol. In that situation '$start' gets a position but '$end' gets 'false', which evaluates to zero in the subtraction. The length therefore works out to one minus '$start', which is zero or negative unless the less-than symbol is the very first character, and 'substr_replace()' does not treat such a value as a count of characters to remove. By checking whether '$end' is identical to 'false' and setting '$length' to one if it is, we remove just the stray less-than symbol and avoid the infinite loop.
<?php
$text = "The <b>word</b> is bolded.<br /> Here is a <a href=\"page.php\">link</a><.";

// Keep going while at least one '<' remains in the text.
while (substr_count($text, '<') != 0) {
    $start = strpos($text, '<');
    $end = strpos($text, '>');

    if ($end === false) {
        // No '>' left: remove only the stray '<'.
        $length = 1;
    } else {
        $length = ($end - $start) + 1;
    }

    $text = substr_replace($text, '', $start, $length);
}

echo $text;
?>
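For readers who want to experiment with the second shortcoming before reaching regular expressions, one possible approach is to pass '$start' as the optional third argument of 'strpos()', so that the search for the greater-than symbol begins at the less-than symbol and ignores anything earlier. The sketch below wraps the loop in a function for easier testing. The function name 'strip_tags_manually' and the sample string are hypothetical, and this is a sketch under those assumptions rather than a replacement for 'strip_tags()'.

```php
<?php
// Hypothetical helper: the name is illustrative and not part of PHP.
function strip_tags_manually(string $text): string
{
    // Keep going while at least one '<' remains in the text.
    while (substr_count($text, '<') != 0) {
        $start = strpos($text, '<');

        // Start searching at $start so a '>' that comes earlier is ignored.
        $end = strpos($text, '>', $start);

        if ($end === false) {
            // No '>' after this '<': remove only the stray '<'.
            $length = 1;
        } else {
            $length = ($end - $start) + 1;
        }

        $text = substr_replace($text, '', $start, $length);
    }

    return $text;
}

echo strip_tags_manually("3 > 2, and <b>bold</b> text<."), "\n";
// 3 > 2, and bold text.
```

A greater-than symbol that stands alone is left in the text, since it is not part of any tag.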
More By Matt Wade