PHP preg_replace – some useful regular expressions

April 22, 2009

There are loads of these all over the place, but here are some useful preg_replace examples for text and HTML processing that were hard to find or that I ended up writing myself. Use, praise, embellish or flame as you see fit.

Remove repeated words (case insensitive)

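$text = preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', $text);

‘Keep your your head’ becomes ‘Keep your head’. Use single quotes for the pattern: inside a double-quoted PHP string, \1 is read as an octal escape rather than a backreference.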
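
Here is a minimal sketch of the same idea wrapped in a function, so it is easy to try on a few strings. The function name remove_repeated_words is made up for this example; the only real API used is preg_replace.

```php
<?php
// Hypothetical helper name, not from any library.
function remove_repeated_words($text)
{
	// \b(\w+) captures one whole word. (?:\s+\1\b)+ then matches one or
	// more repeats of that word. With the i flag the backreference is
	// compared case insensitively too.
	return preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', $text);
}

echo remove_repeated_words('Keep your your head'), "\n"; // Keep your head
echo remove_repeated_words('The the the cat'), "\n";     // The cat
echo remove_repeated_words('this is fine'), "\n";        // this is fine
```

The word boundaries matter: without them, ‘this is’ would be treated as a repeated ‘is’.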

Remove repeated punctuation

$text = preg_replace('/\.{2,}/', '.', $text);

‘Keep your head...’ becomes ‘Keep your head.’ The dot is a regex metacharacter, so it has to be escaped; do the same for any other punctuation you adapt this to, such as ? or *.
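
If you want this for arbitrary punctuation, preg_quote can do the escaping for you. This is only a sketch: the function name collapse_repeated is invented here, and it assumes you pass a single punctuation character.

```php
<?php
// Hypothetical helper name, not from any library.
function collapse_repeated($char, $text)
{
	// preg_quote() escapes regex metacharacters such as . ? * and +.
	// Its second argument makes it escape the / delimiter as well.
	$pattern = '/(?:' . preg_quote($char, '/') . '){2,}/';
	return preg_replace($pattern, $char, $text);
}

echo collapse_repeated('.', 'Keep your head...'), "\n"; // Keep your head.
echo collapse_repeated('?', 'Really???'), "\n";         // Really?
```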

Clean up a sentence end that has no trailing space

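$text = preg_replace('/\.(?=\S)/', '. ', $text);

‘Keep your head.Don’t fall apart’ becomes ‘Keep your head. Don’t fall apart’. This uses a lookahead, so the character after the full stop is checked but not consumed. Be aware that it also splits decimals and domain names (‘3.14’ becomes ‘3. 14’).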
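
A stricter variant, as a rough sketch: only add the space when the next character is an ASCII capital letter, which leaves decimals, lower-case domain names and the end of the string alone. The function name and the decision to cover ! and ? as well are my own assumptions, not part of the one-liner above.

```php
<?php
// Hypothetical helper name, not from any library.
function space_after_sentence_end($text)
{
	// ([.!?]) captures the sentence-ending mark.
	// (?=[A-Z]) is a positive lookahead: it requires a capital letter
	// next, but does not include that letter in the match.
	return preg_replace('/([.!?])(?=[A-Z])/', '$1 ', $text);
}

echo space_after_sentence_end("Keep your head.Don't fall apart. Pi is 3.14."), "\n";
// Keep your head. Don't fall apart. Pi is 3.14.
```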

Remove carriage returns, line feeds and tabs

$text = str_replace(array("\r\n", "\r", "\n", "\t"), '', $text);

An oldie but a goodie, and no regex needed.
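
One thing to watch: replacing with an empty string glues the last word of one line to the first word of the next. If that is not what you want, here is a small sketch that swaps each run of those characters for a single space instead. The function name is made up for the example.

```php
<?php
// Hypothetical helper name, not from any library.
function flatten_whitespace($text)
{
	// Replace each run of CR, LF and tab characters with one space,
	// then trim any space left at either end.
	return trim(preg_replace('/[\r\n\t]+/', ' ', $text));
}

echo flatten_whitespace("line one\r\nline two\tend\n"), "\n";
// line one line two end
```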

Get all image urls from an html document

$images = array();
// Capture the value of every src attribute (single or double quoted)
preg_match_all('/\bsrc\s*=\s*["\']([^"\'>]+)/i', $data, $media);
foreach ($media[1] as $url)
{
	// Keep only URLs whose file extension looks like an image
	$extension = strtolower(pathinfo($url, PATHINFO_EXTENSION));
	if (in_array($extension, array('jpg', 'jpeg', 'gif', 'png')))
	{
		$images[] = $url;
	}
}

Puts all the image URLs from the HTML in $data into the $images array.
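
Below is a slightly more defensive sketch of the same approach. It only looks at img tags and ignores query strings when checking the extension. The function name and its $extensions parameter are my own invention; parse_url and pathinfo are the standard PHP functions.

```php
<?php
// Hypothetical helper name, not from any library.
function extract_image_urls($html, $extensions = array('jpg', 'jpeg', 'gif', 'png'))
{
	$images = array();
	// Match src="..." or src='...' inside <img ...> tags only.
	// \s before src stops attributes such as data-src from matching.
	preg_match_all('/<img\b[^>]*?\ssrc\s*=\s*(["\'])(.*?)\1/is', $html, $matches);
	foreach ($matches[2] as $url)
	{
		// Check the path only, so "photo.jpg?size=2" still counts
		$path = parse_url($url, PHP_URL_PATH);
		if (!is_string($path))
		{
			continue;
		}
		$extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
		if (in_array($extension, $extensions, true))
		{
			$images[] = $url;
		}
	}
	return $images;
}

$html = '<img src="a.jpg"><img alt=\'x\' src=\'/p/b.PNG?v=2\'>'
	. '<script src="app.js"></script><img src="logo.svg">';
print_r(extract_image_urls($html));
```

With the sample HTML above this should keep a.jpg and /p/b.PNG?v=2 and skip the script and the SVG. Regexes are still a blunt tool for HTML, so treat this as good enough for scraping tidy markup rather than a parser.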

Strip non printable characters

$text = preg_replace("/[^[:print:]]+/", "", $text);

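Does what it says on the tin. Tabs and line breaks do not count as printable, so they go too, and in the default C locale so does every non-ASCII byte.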
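
If your text is UTF-8 and you want to keep accented characters, line feeds and tabs, here is one possible sketch using Unicode properties instead of the POSIX class. The function name is hypothetical, and the choice to keep only line feeds and tabs (carriage returns are still removed) is an assumption on my part.

```php
<?php
// Hypothetical helper name, not from any library.
function strip_control_chars($text)
{
	// \p{C} is the Unicode "Other" category (control, format, unassigned
	// and similar code points). [^\P{C}\n\t] reads as: anything in \p{C}
	// except line feed and tab.
	// The u flag treats pattern and subject as UTF-8; preg_replace()
	// returns null if $text is not valid UTF-8.
	return preg_replace('/[^\P{C}\n\t]+/u', '', $text);
}

// "caf\xC3\xA9" is "café" in UTF-8; \x07 (bell) and \x00 are control characters
echo strip_control_chars("caf\xC3\xA9\x07 bar\x00\n");
```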

Remove HTML tags

$text = preg_replace
	(
	array(
	// Remove invisible content
	'@<head[^>]*?>.*?</head>@siu',
	'@<style[^>]*?>.*?</style>@siu',
	'@<script[^>]*?>.*?</script>@siu',
	'@<object[^>]*?>.*?</object>@siu',
	'@<embed[^>]*?>.*?</embed>@siu',
	'@<applet[^>]*?>.*?</applet>@siu',
	'@<noframes[^>]*?>.*?</noframes>@siu',
	'@<noscript[^>]*?>.*?</noscript>@siu',
	'@<noembed[^>]*?>.*?</noembed>@siu',
	// Add line breaks before & after blocks
	'@<((br)|(hr))@iu',
	'@</?((address)|(blockquote)|(center)|(del))@iu',
	'@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
	'@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
	'@</?((table)|(th)|(td)|(caption))@iu',
	'@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
	'@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
	'@</?((frameset)|(frame)|(iframe))@iu',),
	array(
	' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
	"\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
	"\n\$0", "\n\$0",),$text
	);
// Remove all remaining tags and comments and return.
$text = strip_tags( $text );

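OK, so strip_tags sort of does this already, but it only removes the tags themselves and leaves the contents of script, style and similar elements behind, which is why those are removed first.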
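
A cut-down sketch to show the difference, using only the script and style patterns. The sample HTML and variable names are made up for the example.

```php
<?php
$html = '<p>Hello</p><script>alert("x");</script><style>p { color: red; }</style>';

// strip_tags() alone drops the tags but keeps the script and style bodies
$naive = strip_tags($html);

// Removing the whole elements first leaves only the visible text
$cleaned = preg_replace(
	array('@<script[^>]*?>.*?</script>@si', '@<style[^>]*?>.*?</style>@si'),
	' ',
	$html
);
$cleaned = trim(strip_tags($cleaned));

echo $naive, "\n";   // still contains the alert() call and the CSS rule
echo $cleaned, "\n"; // Hello
```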
