Yesterday I was mainly deploying a new release of PHP code to a Linux server. Not that it’s a problem, but I normally develop PHP web applications on IIS running on Windows Server, just because I know where all the knobs and switches are. VS.PHP from jcx.software is a great add-in for Visual Studio, and with the Zend server-side debugger it makes an awesome development environment. Sure, I could use Eclipse PDT in a similar way for free, but I’m more productive using Visual Studio; rightly or wrongly, the keyboard shortcuts are now part of my DNA.

I never quite feel comfortable using Linux and I’m not sure why. Clearly it’s an excellent OS, and I have several embedded boxes running Linux that I’ve never had to reboot. My WRT54G and NSLU2 always perform flawlessly. However, while on my journey home last night I realised what the problem might be. Linux makes my fingers hurt with all the typing I have to do!

Anyway, tip of the day for me was this little gem. I needed to copy the (massive) previous releases directory into my deployment working area and was happily using

cp -r blah blah

Then I started worrying about whether the permissions were being copied. (Ok, so I’m happy to admit my perpetual virgin status when it comes to Linux.) It was then pointed out to me that buried in the directory structure was a raft of symbolic links, and that using cp would do a deep copy and fill up my server disc space pretty quickly. “Use tar to copy from the current directory to the new one like this”, they said.

tar cBf - . | (cd new_directory && tar xvBf -)

It worked a treat, I feel less like a Linux virgin, but my fingers still hurt.

There are loads of these all over the place, but here are some useful preg_replace examples for text and HTML processing that were hard to find or that I ended up writing – use/praise/embellish/flame as you see fit.

Remove repeated words (case insensitive)

$text = preg_replace('/\b(\w+)\s+\1\b/i', '$1', $text);

‘Keep your your head’ becomes ‘Keep your head’
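
A quick way to try this out, as a sketch only: the function name remove_repeated_words is made up for illustration, and the pattern here is a single-quoted variant with word boundaries that also handles a word repeated more than twice. Single quotes matter because in a double-quoted PHP string "\1" is read as an octal escape, not a backreference.

```php
<?php
// Hypothetical helper: collapse immediately repeated words, ignoring case.
function remove_repeated_words($text)
{
    // \b(\w+) captures a whole word; (?:\s+\1\b)+ matches one or more
    // repeats of that same word. Only the first occurrence is kept.
    return preg_replace('/\b(\w+)(?:\s+\1\b)+/i', '$1', $text);
}

echo remove_repeated_words('Keep your your head'), "\n"; // Keep your head
echo remove_repeated_words('The the THE cat sat'), "\n"; // The cat sat
```

Because of the word boundaries, a fragment such as ‘this is’ is left alone even though ‘is’ appears twice in a row.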

Remove repeated punctuation

$text = preg_replace('/\.+/', '.', $text);

‘Keep your head...’ becomes ‘Keep your head.’ The same pattern works for other punctuation; don’t forget to escape regex metacharacters such as the full stop.

Clean up a sentence end that has no trailing space

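$text = preg_replace('/\.(?=\S)/', '. ', $text);

‘Keep your head.Don’t fall apart’ becomes ‘Keep your head. Don’t fall apart’. This uses lookahead, so the character after the full stop is checked but not replaced, and a full stop at the very end of the text is left alone.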
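
The two punctuation fixes above can be chained. Here is a minimal sketch, assuming plain English text; tidy_sentences is a name I’ve invented for the example, and the lookahead is narrowed to ASCII letters so that numbers like 3.14 survive.

```php
<?php
// Hypothetical helper: tidy full stops in plain text.
function tidy_sentences($text)
{
    // Collapse a run of two or more full stops into one.
    $text = preg_replace('/\.{2,}/', '.', $text);
    // Add a space after a full stop that is immediately followed by a letter.
    // The lookahead (?=[A-Za-z]) peeks at the next character without consuming it.
    return preg_replace('/\.(?=[A-Za-z])/', '. ', $text);
}

echo tidy_sentences('Keep your head...Do not fall apart.'), "\n";
echo tidy_sentences('Pi is roughly 3.14.'), "\n";
```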

Remove carriage returns, line feeds and tabs

$text = str_replace(array("\r\n", "\r", "\n", "\t"), '', $text);

An oldy but goody.

Get all image urls from an html document

$images = array();
// Capture the value of every src attribute, single- or double-quoted.
preg_match_all('/\bsrc\s*=\s*["\']([^"\'>]+)/i', $data, $media);
foreach ($media[1] as $url)
{
	// Keep only URLs whose extension looks like an image.
	$info = pathinfo($url);
	if (isset($info['extension']) &&
		in_array(strtolower($info['extension']), array('jpg', 'jpeg', 'gif', 'png')))
	{
		$images[] = $url;
	}
}
Puts all the image URLs in an array
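
If the DOM extension is available, a parser is a sturdier alternative to regular expressions for this job. What follows is a sketch rather than a drop-in replacement: extract_image_urls is a hypothetical name, and it only looks at img tags, so a src on a script or iframe is ignored.

```php
<?php
// Hypothetical helper: return the src of every <img> with an image-like extension.
function extract_image_urls($html)
{
    $images = array();
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('img') as $img) {
        $url = $img->getAttribute('src');
        $extension = strtolower(pathinfo($url, PATHINFO_EXTENSION));
        if (in_array($extension, array('jpg', 'jpeg', 'gif', 'png'))) {
            $images[] = $url;
        }
    }
    return $images;
}

$html = '<p><img src="/a/photo.JPG"><img src=\'logo.png\'><img src="/track.php?id=1"></p>';
print_r(extract_image_urls($html));
```

The tracking pixel in the sample is skipped because its extension is not in the list.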

Strip non printable characters

$text = preg_replace("/[^[:print:]]+/", "", $text);

Does what it says on the tin

Remove HTML tags

$text = preg_replace
	(
	array(
	// Remove invisible content
	'@<head[^>]*?>.*?</head>@siu',
	'@<style[^>]*?>.*?</style>@siu',
	'@<script[^>]*?>.*?</script>@siu',
	'@<object[^>]*?>.*?</object>@siu',
	'@<embed[^>]*?>.*?</embed>@siu',
	'@<applet[^>]*?>.*?</applet>@siu',
	'@<noframes[^>]*?>.*?</noframes>@siu',
	'@<noscript[^>]*?>.*?</noscript>@siu',
	'@<noembed[^>]*?>.*?</noembed>@siu',
	// Add line breaks before & after blocks
	'@<((br)|(hr))@iu',
	'@</?((address)|(blockquote)|(center)|(del))@iu',
	'@</?((div)|(h[1-6])|(ins)|(isindex)|(p)|(pre))@iu',
	'@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
	'@</?((table)|(th)|(td)|(caption))@iu',
	'@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
	'@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
	'@</?((frameset)|(frame)|(iframe))@iu',),
	// One replacement per pattern above: 9 spaces, then 8 line breaks
	array(
	' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
	"\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
	"\n\$0", "\n\$0",),$text
	);
// Remove all remaining tags and comments and return.
$text = strip_tags( $text );

Ok, so strip_tags sort of does this, but it only removes the tags themselves and leaves the contents of script, style etc. behind.

PHP performance tips

April 22, 2009

It’s Sunday morning, the weather’s terrific and I’m sitting at the desk in the office at home finishing off some project work that should have been done on Friday. Still, it’ll be done in a jiffy and I can take the dog out to the meadow in St Ives and reflect on why I don’t work at home every day rather than waste 4 hours commuting. It’s more productive, it doesn’t seem like work and I just get more time.

Anyway, I can’t talk too much about the project itself, other than it’s a website developed using PHP which involves extensive natural language parsing and processing. It needed to be performant and scalable so, although I’ve used PHP many times before, I thought it would be useful to revisit PHP performance.

Consequently, I built a small performance testing framework so I could quickly evaluate which PHP methods yielded the best results. Over the duration of the development I’ve compiled the following list which I thought I’d share.
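
For anyone wanting to run similar comparisons, a timing harness can be very small. This is a sketch of the general idea, not the framework used on the project; benchmark and the two sample functions are names invented for the example, and wall-clock timings like these vary from run to run.

```php
<?php
// Hypothetical micro-benchmark helper: call $callback $iterations times and
// return the elapsed wall-clock time in seconds.
function benchmark($callback, $iterations = 100000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($callback);
    }
    return microtime(true) - $start;
}

// Two candidate ways of doing the same job.
function with_single_quotes() { $w = 'world'; return 'Hello ' . $w; }
function with_double_quotes() { $w = 'world'; return "Hello $w"; }

$results = array(
    'single quotes' => benchmark('with_single_quotes'),
    'double quotes' => benchmark('with_double_quotes'),
);
asort($results); // fastest first

foreach ($results as $label => $seconds) {
    printf("%-15s %.4f s\n", $label, $seconds);
}
```

Run each comparison several times and on the PHP version you deploy to before drawing conclusions; differences this small are easily swamped by noise.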

  • Use single quotes over double quotes.
  • Use switch over lots of if statements.
  • Avoid testing loop conditionals with function tests every iteration eg. for($i=0;$i<count($x);$i++){…
  • Use foreach for looping collections/arrays.
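  • PHP 4 passes objects by value; PHP 5 passes objects by reference (strictly, by handle). Arrays and scalars are still passed by value in both.
  • Consider using the Singleton pattern when creating complex PHP classes.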
  • Use POST over GET for all values that will wind up in the database for TCP/IP packet performance reasons.
  • Use ctype_alnum, ctype_alpha and ctype_digit over regular expressions to test form value types for performance reasons.
  • Use full file paths in production environment over basename/file_exists/open_basedir to avoid performance hits for the filesystem having to hunt through the file path. Once determined, serialize and/or cache path values in a $_SETTINGS array. $_SETTINGS['cwd'] = getcwd();
  • Use require/include over require_once/include_once to ensure proper opcode caching.
  • Use tmpfile or tempnam for creating temp files/filenames
  • Use a proxy to access web services (XML or JSON) on foreign domains using XMLHTTP to avoid cross-domain errors. eg. wibble.com<–>XMLHTTP<–>wobble.com
  • Use error_reporting(E_ALL); during debug.
  • Set Apache AllowOverride to None so Apache does not look for .htaccess files in every directory it serves.
  • Use a fast fileserver for serving static content (thttpd). static.mydomain.com, dynamic.mydomain.com
  • Serialize application settings like paths into an associative array and cache or serialize that array after first execution.
  • Use PHP output control buffering for page caching of heavily accessed pages
  • Use PDO prepare over native db prepare for statements. PDO::MYSQL_ATTR_DIRECT_QUERY => 1
  • Do NOT use SQL wildcard select. eg. SELECT *
  • Avoid using SQL directive DISTINCT
  • Use database logic (queries, joins, views, procedures) over loopy PHP.
  • Use shortcut syntax for SQL inserts if not using PDO parameters. eg. insert into sometable (field1,field2) values ('x','y'),('p','q');
  • Use Zend – it’s the best PHP library around
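
Three of the tips above are easier to show than to describe. The following is a rough sketch under my own assumptions: is_valid_quantity, render_page_cached, render_home and the cache file location are all invented for illustration, and a real page cache would need proper invalidation.

```php
<?php
// 1. Hoist count() out of the loop condition so it runs once, not every iteration.
$items = array('a', 'b', 'c');
$joined = '';
for ($i = 0, $n = count($items); $i < $n; $i++) {
    $joined .= $items[$i];
}

// 2. ctype_* instead of a regular expression for simple form validation.
//    ctype_digit() expects a string and returns false for an empty one.
function is_valid_quantity($value)
{
    return is_string($value) && ctype_digit($value);
}

// 3. Output buffering as a crude page cache: serve the cached copy while it
//    is younger than $ttl seconds, otherwise render and store a fresh one.
function render_page_cached($cacheFile, $render, $ttl = 60)
{
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);
    }
    ob_start();
    call_user_func($render);      // the page echoes as usual
    $html = ob_get_clean();       // capture what it printed
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}

function render_home() { echo '<h1>Hello</h1>'; }

// tempnam() gives a unique name; remove the empty file so the cache starts cold.
$cacheFile = tempnam(sys_get_temp_dir(), 'page');
unlink($cacheFile);
echo render_page_cached($cacheFile, 'render_home'), "\n";
```

The first call renders the page and writes the cache file; later calls within the TTL read it straight back without running the render function.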

Comments on a postcard please.

A nuSOAP star

November 20, 2008

It’s not often I’m impressed with a new soap star – other than Hollyoaks and there’s only one reason to watch that. However, I recently had to integrate a web service client into the website that displays a list of all the senior level consultants in the company. The website in question is written in PHP, so a PHP SOAP consumer would be needed. And, as usual, I needed to get it done quickly.

There were a few options:

After some investigation and for various reasons I chose nuSOAP which is staggeringly easy to use. Here’s the code I needed to make it work – and there’s not much of it.

<?php
defined('_JEXEC') OR defined('_VALID_MOS') OR die( "Direct Access Is Not Allowed" );
require_once('soap/nusoap.php'); // forward slash works on both Windows and Linux
$proxyhost = isset($_POST['proxyhost']) ? $_POST['proxyhost'] : '';
$proxyport = isset($_POST['proxyport']) ? $_POST['proxyport'] : '';
$proxyusername = isset($_POST['proxyusername']) ? $_POST['proxyusername'] : '';
$proxypassword = isset($_POST['proxypassword']) ? $_POST['proxypassword'] : '';
$useCURL = isset($_POST['usecurl']) ? $_POST['usecurl'] : '0';
$client = new soapclient('http://somewebsite.com/Consultants.asmx?WSDL', true, $proxyhost, $proxyport, $proxyusername, $proxypassword);
$client->setUseCurl($useCURL);
$result = $client->call('GetConsultants', array(), '', '', false, true);
if ($client->fault)
{
   echo '<h2>Fault</h2><pre>';
   print_r($result);
   echo '</pre>';
}
else
{
   $err = $client->getError();
   if ($err)
   {
      echo '<h2>Error</h2><pre>' . $err . '</pre>';
   }
   else
   {
      print('<table class="contentpaneopen">');
      foreach ($result["GetConsultantsResult"]["ConsultantDetail"] as $consultant)
      {
         print('<tr><td>');
         print('<h4>'.$consultant["Firstname"].'&nbsp;'.$consultant["Lastname"].'</h4>');
         print('</td></tr>');
         print('<tr><td>');
         print($consultant["Description"]);
         print('</td></tr>');
      }
      print('</table>');
   }
}
?>

nuSOAP is particularly easy to debug, and it makes it simple to display the SOAP requests and responses. Incidentally, this needed to be incorporated into a Joomla-based site, which normally means having to write a module or plugin. Yet again I was saved by the Jumi plugin, which meant I could just put the PHP directly into an Article. Couldn’t be simpler. You can see it working here http://www.candelamedia.co.uk/index.php/candela-consulting/our-consultants