
Replace Ampersands but Not &amp

March 19th, 2009 Posted in Regular Expressions
We want to replace ampersands with the HTML entity ‘&amp;’. The regexp to match is simple: &, i.e. match one ampersand. Unfortunately, this will mess up our text if some of the ampersands have already been turned into HTML entities, because each existing ‘&amp;’ would become ‘&amp;amp;’. So what we really want to say is: replace an ampersand provided it is not followed by ‘amp;’. For this we need the negative lookahead assertion, and our regexp becomes: &(?!amp;). The negative lookahead assertion is introduced with ‘(?!’ and finishes at the ‘)’. It means that the text it contains, ‘amp;’ in our example, must not follow the expression that precedes it. The lookahead consumes no characters, so only the ampersand itself is matched and replaced. –Courtesy of Linux.die.net
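As a rough illustration, here is how the pattern might be applied in Python using the standard `re` module. The post itself does not name a language, so the choice of Python and the helper name `escape_ampersands` are assumptions made for this sketch only.

```python
import re

# Match an ampersand only when it is NOT immediately followed by "amp;".
# The (?!...) group is a negative lookahead: it checks what follows
# without consuming any characters.
BARE_AMPERSAND = re.compile(r"&(?!amp;)")


def escape_ampersands(text: str) -> str:
    """Replace bare ampersands with &amp;, leaving existing &amp; untouched.

    Hypothetical helper name, used here for illustration only.
    """
    return BARE_AMPERSAND.sub("&amp;", text)


print(escape_ampersands("Tom & Jerry &amp; friends"))
# prints: Tom &amp; Jerry &amp; friends
```

Note that this pattern only protects ‘&amp;’. Other entities such as ‘&lt;’ would still have their ampersand replaced, giving ‘&amp;lt;’, so text that may already contain other entities would need a broader lookahead.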