PHP preg_replace – some useful regular expressions
April 22, 2009
There are loads of these all over the place, but here are some useful preg_replace examples for text and HTML processing that were hard to find or that I ended up writing – use/praise/embellish/flame as you see fit.
Remove repeated words (case insensitive)
$text = preg_replace('/\b(\w+)\s+\1\b/i', '$1', $text);
‘Keep your your head’ becomes ‘Keep your head’
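As a rough sketch of how this could be extended, the variant below collapses a word repeated three or more times and adds the u modifier so that accented words are treated as words. The function name remove_repeated_words is made up for illustration. One thing to watch: inside a double-quoted PHP string "\1" is read as an octal escape rather than a backreference, so the pattern is kept in single quotes.

```php
<?php
// Hypothetical helper: collapse any run of the same word into one occurrence.
function remove_repeated_words(string $text): string
{
    // \b(\w+)       a whole word, captured as group 1
    // (?:\s+\1\b)+  one or more repeats of that word, separated by whitespace
    // i = case insensitive, u = treat pattern and subject as UTF-8
    // preg_replace() returns null on error (e.g. invalid UTF-8), so fall back to the input.
    return preg_replace('/\b(\w+)(?:\s+\1\b)+/iu', '$1', $text) ?? $text;
}

echo remove_repeated_words('Keep your your your head'), "\n"; // Keep your head
echo remove_repeated_words('The the cat sat'), "\n";          // The cat sat
```

Legitimate doubles such as ‘that that’ are collapsed too, so this is best kept for text where repeats are almost always typos.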
Remove repeated punctuation
$text = preg_replace('/\.{2,}/', '.', $text);
‘Keep your head...’ becomes ‘Keep your head.’ Don’t forget to escape regex characters such as the dot.
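The same idea can be stretched to other punctuation marks with a backreference. This is only a sketch: the function name collapse_repeated_punctuation and the choice of marks in the character class are assumptions, so adjust them to taste.

```php
<?php
// Hypothetical helper: collapse runs of the same punctuation mark into one.
function collapse_repeated_punctuation(string $text): string
{
    // ([.,!?;:]) captures a single punctuation mark; \1+ matches its repeats.
    // Inside a character class the dot does not need escaping.
    return preg_replace('/([.,!?;:])\1+/', '$1', $text) ?? $text;
}

echo collapse_repeated_punctuation('Keep your head... Really??!!'), "\n";
// Keep your head. Really?!
```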
Clean up a sentence end that has no trailing space
$text = preg_replace('/\.(?!\s|$)/', '. ', $text);
‘Keep your head.Don’t fall apart’ becomes ‘Keep your head. Don’t fall apart’. This uses a negative lookahead, so the character after the full stop is checked but not consumed.
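A stricter take, offered as a sketch rather than a drop-in replacement, uses a positive lookahead and only adds the space when the full stop is directly followed by a capital letter. That leaves decimals like 3.14 alone, though abbreviations such as U.S.A would still be split. The function name is hypothetical.

```php
<?php
// Hypothetical helper: add a space after a full stop that runs into the next sentence.
function space_after_sentence_end(string $text): string
{
    // (?=[A-Z]) is a positive lookahead: it requires a capital letter next,
    // but does not consume it, so the letter stays in the output.
    return preg_replace('/\.(?=[A-Z])/', '. ', $text) ?? $text;
}

echo space_after_sentence_end("Keep your head.Don't fall apart. Pi is 3.14."), "\n";
// Keep your head. Don't fall apart. Pi is 3.14.
```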
Remove carriage returns, line feeds and tabs
$text = str_replace(array("\r\n", "\r", "\n", "\t"), '', $text);
An oldie but a goodie.
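Because the characters are replaced with an empty string, words that were only separated by a line break or tab end up glued together. If that matters, one possible alternative (a sketch, with a made-up function name) is to collapse every run of whitespace into a single space instead.

```php
<?php
// Hypothetical helper: turn any run of whitespace into one space.
function collapse_whitespace(string $text): string
{
    // \s+ matches spaces, tabs, carriage returns and line feeds.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

$text = "Keep\r\nyour\thead\n";

// The str_replace() approach removes the separators entirely.
echo str_replace(array("\r\n", "\r", "\n", "\t"), '', $text), "\n"; // Keepyourhead

// Collapsing to a single space keeps the words apart.
echo collapse_whitespace($text), "\n"; // Keep your head
```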
Get all image urls from an html document
$images = array();
// Grab every src="..." (or src='...') attribute, prefix included.
preg_match_all('/(img|src)=("|\')[^"\'>]+/i', $data, $media);
// Strip the leading src=" so that only the URL remains.
$urls = preg_replace('/(img|src)("|\'|="|=\')(.*)/i', '$3', $media[0]);
foreach ($urls as $url) {
    $info = pathinfo($url);
    // Keep only URLs with a common image extension, whatever the case.
    if (isset($info['extension']) && in_array(strtolower($info['extension']), array('jpg', 'jpeg', 'gif', 'png'))) {
        $images[] = $url;
    }
}
This puts all the image URLs in the $images array.
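The two regex passes can probably be folded into one by letting a capture group pick out the URL directly. The sketch below assumes the URLs live in quoted src attributes of img tags; the function name extract_image_urls is invented for the example, and parse_url() is used so that a query string does not hide the file extension.

```php
<?php
// Hypothetical helper: collect the src of every <img> tag that looks like an image file.
function extract_image_urls(string $html): array
{
    // Group 1 is the opening quote, group 2 the URL, \1 the matching closing quote.
    preg_match_all('/<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1/is', $html, $matches);

    $images = array();
    foreach ($matches[2] as $url) {
        // Look at the path only, so "dog.png?x=1" still counts as a png.
        $path = (string) parse_url($url, PHP_URL_PATH);
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($ext, array('jpg', 'jpeg', 'gif', 'png'), true)) {
            $images[] = $url;
        }
    }
    return $images;
}

$html = '<p><img alt="x" src="/a/cat.JPG"><img src=\'dog.png?x=1\'><script src="app.js"></script></p>';
print_r(extract_image_urls($html));
// Contains "/a/cat.JPG" and "dog.png?x=1"; the script src is ignored.
```

For messy real-world markup a proper HTML parser is likely to be more reliable than any regex.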
Strip non printable characters
$text = preg_replace("/[^[:print:]]+/", "", $text);
Does what it says on the tin. Note that [:print:] does not include tabs or line breaks, so those are stripped too.
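Without the u modifier the pattern works byte by byte, so in the default locale it will typically also strip the bytes of accented and other multibyte UTF-8 characters. If the goal is only to get rid of control characters, something along these lines may be gentler. It is a sketch with an invented function name, and it deliberately keeps tab, line feed and carriage return.

```php
<?php
// Hypothetical helper: remove ASCII control characters but keep \t, \n and \r.
function strip_control_chars(string $text): string
{
    // \x00-\x08, \x0B, \x0C, \x0E-\x1F and \x7F are the ASCII control codes
    // other than tab (\x09), line feed (\x0A) and carriage return (\x0D).
    // These bytes never occur inside a multibyte UTF-8 sequence.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/', '', $text) ?? $text;
}

$dirty = "Keep\x00 your\x07 head\tcaf\u{E9}\n";
var_dump(strip_control_chars($dirty) === "Keep your head\tcaf\u{E9}\n"); // bool(true)
```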
Remove HTML tags
$text = preg_replace(
    array(
        // Remove invisible content
        '@<head[^>]*?>.*?</head>@siu',
        '@<style[^>]*?>.*?</style>@siu',
        '@<script[^>]*?.*?</script>@siu',
        '@<object[^>]*?.*?</object>@siu',
        '@<embed[^>]*?.*?</embed>@siu',
        '@<applet[^>]*?.*?</applet>@siu',
        '@<noframes[^>]*?.*?</noframes>@siu',
        '@<noscript[^>]*?.*?</noscript>@siu',
        '@<noembed[^>]*?.*?</noembed>@siu',
        // Add line breaks before & after blocks
        '@<((br)|(hr))@iu',
        '@</?((address)|(blockquote)|(center)|(del))@iu',
        '@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
        '@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
        '@</?((table)|(th)|(td)|(caption))@iu',
        '@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
        '@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
        '@</?((frameset)|(frame)|(iframe))@iu',
    ),
    array(
        // One replacement per pattern above: a space for each invisible element...
        ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
        // ...and a newline in front of each block-level tag.
        "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
    ),
    $text
);
// Remove all remaining tags and comments.
$text = strip_tags($text);
OK, so strip_tags sort of does this, but it only removes the tags themselves and leaves the contents of script, style and similar elements behind.
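To see the difference, here is a small sketch with a made-up HTML snippet. It uses cut-down versions of just two of the patterns, so treat it as an illustration rather than the full routine.

```php
<?php
$html = '<p>Hello</p><script>alert("x");</script><style>p { color: red; }</style>';

// strip_tags() drops the tags but keeps the text that was inside them.
echo strip_tags($html), "\n";
// Helloalert("x");p { color: red; }

// Removing the invisible elements first gets rid of the leftover code.
$clean = preg_replace(
    array('@<script[^>]*>.*?</script>@si', '@<style[^>]*>.*?</style>@si'),
    ' ',
    $html
);
echo trim(strip_tags($clean)), "\n";
// Hello
```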