
Regular Expression Catch All Syntax

August 3rd, 2009 | Comments Off | Posted in Regular Expressions
One of the most-used regular expressions in my arsenal is also one of the simplest. It is basically a catch-all that matches everything between two strings of text within a document: (.|\n)*?. The group matches either any character other than a newline (.) or a newline (\n), and the lazy quantifier *? stops at the first occurrence of whatever follows it. For example, it can be used to replace everything between the opening and closing script tags, including the tags themselves, with an empty string. It is very basic and very powerful, so use it with care.
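The catch-all can be tried out with a short Python sketch. The sample markup and the variable names here are made up for illustration, and the exact pattern wrapped around the catch-all (the literal `<script` and `</script>` delimiters) is an assumption about how it would typically be used, not necessarily the original example.

```python
import re

# Hypothetical sample document, used only for illustration.
html = '<p>before</p><script type="text/javascript">\nalert("hi");\n</script><p>after</p>'

# (.|\n)*? lazily matches any run of characters, newlines included,
# so the match ends at the first closing tag rather than the last one.
script_block = re.compile(r"<script(.|\n)*?</script>")

# Replace each script block, tags included, with an empty string.
cleaned = script_block.sub("", html)
print(cleaned)  # <p>before</p><p>after</p>
```

Because the quantifier is lazy, two separate script blocks are removed one at a time and the text between them survives. With a greedy `*` the match would run from the first opening tag to the last closing tag and swallow everything in between.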

Replace Ampersands but Not &amp;

March 19th, 2009 | 1 Comment | Posted in Regular Expressions
We want to replace ampersands with the HTML entity ‘&amp;’. The regexp to match is simple: &, i.e. match one ampersand. Unfortunately this will mess up our text if some of the ampersands have already been turned into HTML entities. So what we really want to say is: replace an ampersand provided it is not followed by ‘amp;’. For this we need the negative lookahead assertion, and our regexp becomes: &(?!amp;). The negative lookahead assertion is introduced with ‘(?!’ and finishes at the ‘)’. It means that the text it contains, ‘amp;’ in our example, must not follow the expression that precedes it. Courtesy of Linux.die.net
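A minimal Python sketch of the negative lookahead in use; the sample string and variable names are invented for illustration.

```python
import re

# Hypothetical input: one bare ampersand, one already-escaped entity,
# and one ampersand inside a word.
text = "Fish & chips &amp; R&D"

# &(?!amp;) matches an ampersand only when it is NOT followed by "amp;".
# The lookahead consumes no characters, so only the "&" itself is replaced.
fixed = re.sub(r"&(?!amp;)", "&amp;", text)
print(fixed)  # Fish &amp; chips &amp; R&amp;D
```

Note that this pattern only protects ‘&amp;’. Other existing entities, such as ‘&lt;’, would still have their ampersand escaped, so the lookahead would need to be widened if the text may contain them.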