Linux makes my fingers hurt
May 1, 2009
Yesterday I was mainly deploying a new release of PHP code to a Linux server. Not that it’s a problem, but I normally develop PHP web applications on IIS running on Windows Server just because I know where all the knobs and switches are. VS.PHP from jcx.software is a great add-in for Visual Studio and with the Zend server side debugger it makes an awesome development environment. Sure I could use Eclipse PDT in a similar way for free, but I’m more productive using Visual Studio; rightly or wrongly, the keyboard shortcuts are now part of my DNA.
I never quite feel comfortable using Linux and I’m not sure why. Clearly it’s an excellent OS and I have several embedded boxes running Linux that I’ve never had to reboot. My WRT54G and NSLU2 always perform flawlessly. However, while on my journey home last night I realised what the problem might be. Linux makes my fingers hurt with all the typing I have to do!
Anyway, tip of the day for me was this little gem. I needed to copy the (massive) previous releases directory into my deployment working area and was happily using
cp -r blah blah
Then I started worrying about whether the permissions were being copied. (Ok so I’m happy to admit my perpetual virgin status when it comes to Linux). It was then pointed out to me that buried in the directory structure was a raft of symbolic links, and that using cp would do a deep copy and fill up my server disc space pretty quickly. “Use tar to copy from the current directory to the new one like this”, they said.
tar cBf - . | (cd new_directory && tar xvBf -)
It worked a treat, I feel less like a Linux virgin, but my fingers still hurt.
PHP preg_replace – some useful regular expressions
April 22, 2009
There are loads of these all over the place, but here are some useful preg_replace examples for text and html processing that were hard to find or I ended up writing – use/praise/embellish/flame as you see fit.
Remove repeated words (case insensitive)
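// Single quotes matter: in a double-quoted PHP string "\1" is an octal escape, not a backreference.
$text = preg_replace('/\b(\w+)\s+\1\b/i', '$1', $text);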
‘Keep your your head’ becomes ‘Keep your head’
Remove repeated punctuation
$text = preg_replace('/\.+/', '.', $text);
‘Keep your head...’ becomes ‘Keep your head.’ Don’t forget to escape regex characters.
Clean up a sentence end that has no trailing space
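$text = preg_replace('/\.(?!\s|$)/', '. ', $text);
‘Keep your head.Don’t fall apart’ becomes ‘Keep your head. Don’t fall apart’. This uses a negative lookahead, so a full stop that is already followed by whitespace, or that ends the text, is left alone.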
Remove carriage returns, line feeds and tabs
$text = str_replace(array("\r\n", "\r", "\n", "\t"), '', $text);
An oldy but goody.
Get all image urls from an html document
$images = array();
// Grab every src="..." (or src='...') attribute from the document
preg_match_all('/(img|src)\=(\"|\')[^\"\'\>]+/i', $data, $media);
unset($data);
// Strip the leading src=" so only the URL is left
$data = preg_replace('/(img|src)(\"|\'|\=\"|\=\')(.*)/i', '$3', $media[0]);
foreach ($data as $url)
{
    $info = pathinfo($url);
    if (isset($info['extension']))
    {
        // Compare in lower case so .JPG and .Png are picked up too
        $extension = strtolower($info['extension']);
        if (in_array($extension, array('jpg', 'jpeg', 'gif', 'png')))
            array_push($images, $url);
    }
}
Puts all the image URLs in an array
Strip non printable characters
$text = preg_replace("/[^[:print:]]+/", "", $text);
Does what it says on the tin
Remove HTML tags
$text = preg_replace(
    array(
        // Remove invisible content
        '@<head[^>]*?>.*?</head>@siu',
        '@<style[^>]*?>.*?</style>@siu',
        '@<script[^>]*?.*?</script>@siu',
        '@<object[^>]*?.*?</object>@siu',
        '@<embed[^>]*?.*?</embed>@siu',
        '@<applet[^>]*?.*?</applet>@siu',
        '@<noframes[^>]*?.*?</noframes>@siu',
        '@<noscript[^>]*?.*?</noscript>@siu',
        '@<noembed[^>]*?.*?</noembed>@siu',
        // Add line breaks before & after blocks
        '@<((br)|(hr))@iu',
        '@</?((address)|(blockquote)|(center)|(del))@iu',
        '@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
        '@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
        '@</?((table)|(th)|(td)|(caption))@iu',
        '@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
        '@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
        '@</?((frameset)|(frame)|(iframe))@iu',
    ),
    array(
        ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
        "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
    ),
    $text
);
// Remove all remaining tags and comments and return.
$text = strip_tags($text);
Ok, so strip_tags sort of does this, but it only removes the tags themselves, so the contents of script, style and similar elements are left behind in the text.
PHP performance tips
April 22, 2009
It’s Sunday morning, the weather’s terrific and I’m sitting at the desk in the office at home finishing off some project work that should have been done on Friday. Still, it’ll be done in a jiffy and I can take the dog out to the meadow in St Ives and reflect on why I don’t work at home every day rather than waste 4 hours commuting. It’s more productive, it doesn’t seem like work and I just get more time.
Anyway, I can’t talk too much about the project itself, other than it’s a website developed using PHP which involves extensive natural language parsing and processing. It needed to be performant and scalable so, although I’ve used PHP many times before, I thought it would be useful to revisit PHP performance.
Consequently, I built a small performance testing framework so I could quickly evaluate which PHP methods yielded the best results. Over the duration of the development I’ve compiled the following list which I thought I’d share.
- Use single quotes over double quotes.
- Use switch over lots of if statements.
- Avoid testing loop conditionals with function calls on every iteration, e.g. for($i=0;$i<count($x);$i++){…
- Use foreach for looping collections/arrays.
- In PHP4 objects are passed by value; in PHP5 they are passed by reference.
- Consider using the Singleton pattern when creating complex PHP classes.
- Use POST over GET for all values that will wind up in the database for TCP/IP packet performance reasons.
- Use ctype_alnum, ctype_alpha and ctype_digit over regular expressions to test form value types for performance reasons.
- Use full file paths in production environment over basename/file_exists/open_basedir to avoid performance hits for the filesystem having to hunt through the file path. Once determined, serialize and/or cache path values in a $_SETTINGS array. $_SETTINGS["cwd"]=getcwd();
- Use require/include over require_once/include_once to ensure proper opcode caching.
- Use tmpfile or tempnam for creating temp files/filenames
- Use a proxy to access web services (XML or JSON) on foreign domains using XMLHTTP to avoid cross-domain errors. eg. wibble.com<–>XMLHTTP<–>wobble.com
- Use error_reporting (E_ALL); during debug.
- Set Apache allowoverride to “none” to improve Apache performance in accessing files/directories.
- Use a fast fileserver for serving static content (thttpd). static.mydomain.com, dynamic.mydomain.com
- Serialize application settings like paths into an associative array and cache or serialize that array after first execution.
- Use PHP output control buffering for page caching of heavily accessed pages
- Use PDO prepare over native db prepare for statements. PDO::MYSQL_ATTR_DIRECT_QUERY=>1
- Do NOT use SQL wildcard select. eg. SELECT *
- Avoid using SQL directive DISTINCT
- Use database logic (queries, joins, views, procedures) over loopy PHP.
- Use shortcut syntax for SQL inserts if not using PDO parameters. eg. insert into sometable (field1,field2) values ("x","y"),("p","q");
- Use Zend – it’s the best PHP library around
Comments on a postcard please.
QuickTime, short temper
February 19, 2009
I’m currently setting up a transcode farm using Rhozet CarbonCoder. What a great piece of software this is and a dream to configure and use when compared with FlipFactory. And it’s very cheap!
However, it requires QuickTime to run – no problem I thought – but you try installing and running QuickTime v7.6 on Windows Server 2003 SP1. It installs fine, but absolutely refuses to run, unhelpfully telling you…
QuickTime failed to initialize (error -2096).
A quick search reveals it may be a compatibility issue. There is an article indicating that turning off Compatibility mode would resolve this issue, but Compatibility mode is not turned on for iTunes or QuickTime.
The same issue is discussed in the QuickTime forum, here:
http://discussions.apple.com/thread.jspa?messageID=6400305
I struggled for a couple of hours, frustrated by something seemingly so simple stopping me doing what I needed to do – as usual in a hurry. I lost count of how many times I installed and then uninstalled QuickTime and CarbonCoder. I even resorted to cleaning the registry of any references to QT.
I eventually gave up and decided to go home. As I was walking out I realized I hadn’t actually tried installing an older version. I quickly grabbed v7.2 from the Apple site and – hey presto – it worked.
Now the point of all this really is the Apple support forum driving me mad. Yes, there’s an article posted there with a few replies http://discussions.apple.com/thread.jspa?messageID=6595367 but it’s been archived (so no replies) so now I can’t post my conclusive findings.
On the one hand I think Apple is great compared with Microsoft. On the other I think they are just as bad as each other. I guess the wind is blowing and it changed my mind for the day.
Is that video for real?
November 21, 2008
Extreme makeover: computer science edition
Stanford artificial intelligence researchers have developed software that makes it easy to reach inside an existing video and place a photo on the wall so realistically that it looks like it was there from the beginning. The photo is not pasted on top of the existing video, but embedded in it. It works for videos as well – you can play a video on a wall inside your video. The technology can cheaply do some of the tricks normally performed by expensive commercial editing systems.
A nuSOAP star
November 20, 2008
It’s not often I’m impressed with a new soap star – other than Hollyoaks and there’s only one reason to watch that. However, I recently had to integrate a web service client in the web site that displays a list of all the senior level consultants in the company. The website in question is written using php so a php SOAP consumer would be needed. And, as usual, I needed to get it done quickly.
There were a few options:
After some investigation and for various reasons I chose nuSOAP which is staggeringly easy to use. Here’s the code I needed to make it work – and there’s not much of it.
<?php
defined('_JEXEC') OR defined('_VALID_MOS') OR die( "Direct Access Is Not Allowed" );
// Forward slashes work on both Windows and Linux
require_once('soap/nusoap.php');
$proxyhost = isset($_POST['proxyhost']) ? $_POST['proxyhost'] : '';
$proxyport = isset($_POST['proxyport']) ? $_POST['proxyport'] : '';
$proxyusername = isset($_POST['proxyusername']) ? $_POST['proxyusername'] : '';
$proxypassword = isset($_POST['proxypassword']) ? $_POST['proxypassword'] : '';
$useCURL = isset($_POST['usecurl']) ? $_POST['usecurl'] : '0';
$client = new soapclient('http://somewebsite.com/Consultants.asmx?WSDL', true, $proxyhost, $proxyport, $proxyusername, $proxypassword);
$client->setUseCurl($useCURL);
$result = $client->call('GetConsultants', array(), '', '', false, true);
if ($client->fault)
{
    // The service itself reported a SOAP fault
    echo '<h2>Fault</h2><pre>';
    print_r($result);
    echo '</pre>';
}
else
{
    $err = $client->getError();
    if ($err)
    {
        // Transport or parsing error on the client side
        echo '<h2>Error</h2><pre>' . $err . '</pre>';
    }
    else
    {
        // Fetch the list once rather than re-indexing $result on every pass
        $consultants = $result["GetConsultantsResult"]["ConsultantDetail"];
        $count = sizeof($consultants);
        print('<table class="contentpaneopen">');
        for ($i = 0; $i < $count; $i++)
        {
            print('<tr><td>');
            print('<h4>' . $consultants[$i]["Firstname"] . ' ' . $consultants[$i]["Lastname"] . '</h4>');
            print('</td></tr>');
            print('<tr><td>');
            print($consultants[$i]["Description"]);
            print('</td></tr>');
        }
        print('</table>');
    }
}
?>
nuSOAP is particularly easy to debug, and it makes displaying the SOAP requests and responses simple. Incidentally, this needed to be incorporated into a Joomla based site which normally means having to write a module or plugin. Yet again I was saved by the Jumi plugin which meant I could just put the php directly into an Article. Couldn’t be simpler. You can see it working here http://www.candelamedia.co.uk/index.php/candela-consulting/our-consultants
Go JIRA. King of the Monsters!
October 17, 2008
I’ve been using JIRA for bug, issue and project management for a fair while now – it’s almost industry standard and has an excellent developer community for plugins. I’m forever asked ‘What does JIRA stand for?’ So……
Back in the days when Bugzilla ruled the world and the guys at Atlassian dreamed of building a monster capable of beating it, they realised there was only one option – Godzilla. But calling their wonderful new bug and issue tracking system Godzilla was a bit obvious, so they abbreviated the original Japanese name for Godzilla – Gojira. So, Gojira became JIRA: King of the issue management systems.
So when you get asked what JIRA stands for, just tell them it’s an abbreviation, not an acronym.
Background noise from News
September 5, 2008
Today the Web is mostly angry because they are not sure about Google’s intentions…
Microsoft / Firefox claim “Big boys came, stole our browser”
http://blogoscoped.com/archive/2008-09-01-n47.html
But then it turns out they are in your house, looking at what you do…
http://www.theregister.co.uk/2008/09/03/google_chrome_eula_sucks/
Interesting to read how NBC and Yahoo fought it out during the Olympics
http://www.streamingmedia.com/article.asp?id=10633
http://www.nytimes.com/2008/08/25/sports/olympics/25online.html?_r=1&partner=rssnyt&emc=rss&oref=slogin
http://www.followthemedia.com/sportsmedia/nbcdelay12082008.htm
BBC News compares well, as this shows
http://newteevee.com/2008/08/28/final-tally-olympics-web-and-p2p-numbers/
And the early European summary…
http://www.euractiv.com/en/infosociety/beijing-games-marked-surge-internet-viewers/article-174911
A comprehensive news aggregator…
It isn’t pretty, but has some nice features including the tracking of stories on the graph on this page
Highlights breaking news, developing story trends, etc…
http://press.jrc.it/NewsBrief/clusteredition/en/latest.html
Couple of powerful presentations on the use of photography
http://www.ted.com/index.php/talks/david_griffin_on_how_photography_connects.html
http://www.ted.com/index.php/talks/james_nachtwey_s_searing_pictures_of_war.html
The War comic
This ain’t the Beano…
http://shootingwar.com/
We are addicted…
If you click this link it proves you are an addict
http://www.theinquirer.net/gb/inquirer/news/2008/09/03/britons-officially-addicted
Radio Pop
“Sign up to Radio Pop and we will store your listening to BBC Radio. You can then see graphs, charts and lists of your listening, get recommendations from your friends, share your tastes and browse around to see what other people are hearing right now. “
http://www.radiopop.co.uk
Hidden Radio
My Gran would never find Terry Wogan on this
http://www.hiddenradio.johnvdn.com/
So you want an iPhone but no contract…
Or perhaps you are a criminal who needs the enhanced touch screen functionality only an iPhone can bring without the traceability of pesky contracts
http://business.timesonline.co.uk/tol/business/industry_sectors/technology/article4656323.ece
3D visualisation of the National Theatre…
http://www.nationaltheatre.org.uk/39432/online-tour/discover-online-tour.html
JSON and the Ajaxnauts
September 5, 2008
In a world of BBC News and broadcast media I’ve had a thrilling romp playing with Flex, JSON (and the Ajaxnauts) and revisited JavaScript. I’d forgotten what an awesome language it is when used properly.
Closures particularly are one of its most awesomely powerful features. Once properly understood they tend to roll off the keyboard almost by accident. It makes you wonder how you ever wrote software which doesn’t have them.
Much like my comments about C# thread syntax being easy to use and abuse without realising it, closures are easy to create, but have potentially harmful consequences for the unwary, particularly in IE. To avoid accidentally stepping on a closure landmine it is necessary to understand their mechanism. This depends largely on the role of scope chains in identifier resolution and so on the resolution of property names on objects.
Laurens van den Oever’s solution to stop you wetting yourself when programming for IE is a good one. Incidentally, I needed a good XML to JSON converter and fairly randomly chose Thomas Frank’s to get me going. I intended to revisit this and write my own, but it’s been so good I shan’t bother – it’s been flawless.
Pump me baby one more time
March 31, 2008
Ah the Windows message pump – you just can’t get away from it regardless of the language you write in. I may be developing Windows programs on my Mac now (using Parallels) and I’m using C# for the current project, but necessity demands I write yet another version of the message pump. I’ve blogged about this before for various reasons.
I’m currently developing an application which will playout video on a live-to-air broadcast. Apart from the necessary robustness this demands, another crucial element is the timing and synchronising with a timebase. Generally as GUI developers we don’t have to deal with things this way around. It’s more likely we have to cope with displaying high volumes of data from a stream in a window somewhere. You’re then faced with the usual – do I throttle it, buffer it, lockless queue it type questions. As long as the painting of the screen doesn’t end up in white outs or frozen, partially painted windows the users are happy. I’ve written about this before and it’s all about keeping the message pump, pumping away.
The point here is you’re not particularly concerned that WM_PAINT and WM_TIMER messages, being the amoeba of the Windows message world, get pushed to the back of the message queue, because eventually they will get processed.
However, if you have to provide screen updates with accurate, rock solid countdown timers that don’t stutter and can be absolutely relied upon, you need to tackle things slightly differently. You need accurate timers and timing which is not normally the forte of Win32.
What I needed, working with video, was a very accurate video frame counter and a way to execute some code in the message pump every frame. Since it is PAL this means every 40 ms, and the code shouldn’t take more than 40 ms to execute. If it does then it needs to take account of the overrun.
So the choices for accurate timing are:
GetTickCount()
Well you could use this, but you’d be mad. Although it is documented to return the number of milliseconds since the system was started, the actual precision can range from a few ms up to 55 ms depending on the operating system.
Accuracy: 5 – 55 ms
Resolution: 1 ms
Execution time: 0.1 µs
timeGetTime()
As part of the multimedia system, this is accurate to 1 ms and can be relied upon, but it takes a long time to execute.
Accuracy: 1 ms
Resolution: 1 ms
Execution time: 8.0 µs
Performance counter
The way to go if you want µs accuracy and resolution.
Accuracy: 1 µs or better
Resolution: 1 µs
Execution time: 4.0 µs
Unfortunately, the performance counter is not available on all PCs. If you’re running on old hardware or an old version of Windows this API uses a programmable timer chip (part of the chipset) as the clock. This has a frequency of 1.19318MHz meaning an accuracy of around a µs. More recent chipsets from Intel will give you a frequency of 3.579545MHz.
The frequency only affects the resolution and the API gives you a call to query it, so you can take it into account.
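As a rough sketch of what taking the frequency into account might look like, here is the arithmetic on its own. The helper names ticksPerFrame and ticksToMicroseconds are made up for illustration and are not part of the Windows API; the frequency argument is assumed to be whatever QueryPerformanceFrequency reported on the machine in question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of any Windows API. `frequency` is assumed
// to be the ticks-per-second value reported by QueryPerformanceFrequency.

// Number of counter ticks in one video frame (25 frames per second for PAL).
// Integer division, so any fractional tick is discarded.
inline std::int64_t ticksPerFrame(std::int64_t frequency, std::int64_t framesPerSecond)
{
    assert(frequency > 0 && framesPerSecond > 0);
    return frequency / framesPerSecond;
}

// Convert an elapsed tick count into whole microseconds.
// Whole seconds and the remainder are converted separately so that
// multiplying by 1,000,000 does not overflow for large tick counts.
inline std::int64_t ticksToMicroseconds(std::int64_t ticks, std::int64_t frequency)
{
    assert(frequency > 0);
    const std::int64_t wholeSeconds = ticks / frequency;
    const std::int64_t remainder = ticks % frequency;
    return wholeSeconds * 1000000 + (remainder * 1000000) / frequency;
}
```

With the 1.19318MHz timer chip a PAL frame works out at 47727 ticks, and with the 3.579545MHz clock at 143181 ticks, so the same code copes with either as long as it asks for the frequency first.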
So how can we incorporate this into a message pump and get the accuracy required? A typical message pump looks like this:
MSG msg;
while (GetMessage(&msg, NULL, 0, 0))
{
TranslateMessage( &msg );
DispatchMessage( &msg );
}
The well known problem with this is GetMessage. It won’t return until a message is available to be dealt with, so we’re always waiting for Windows to give us something – obviously no good because we need to do stuff every 40 ms. The classic way to solve this uses a PeekMessage in the message pump to see if there is a message available. If not then you can call whatever code you need to execute. So the alternative, which gives a rock solid 40 ms frame period with a resolution of about 1 µs, looks like this:
MSG msg = {0};            // zeroed so msg.message is defined before the first message arrives
LONGLONG timeNow;
LONGLONG ticksPerFrame;   // performance counter ticks per video frame
LONGLONG timerFrequency;
LONGLONG nextTime = 0;
BOOL okToDoWork = TRUE;
::QueryPerformanceFrequency((LARGE_INTEGER *) &timerFrequency);
ticksPerFrame = timerFrequency / 25; // 25 frames per second, i.e. a 40 ms frame period
::QueryPerformanceCounter((LARGE_INTEGER *) &nextTime);
::PeekMessage(&msg, NULL, 0, 0, PM_NOREMOVE);
while (msg.message != WM_QUIT)
{
    if (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    else
    {
        // No message waiting, so do the per-frame work once per frame
        if (okToDoWork)
        {
            DoSomeWork();
            okToDoWork = FALSE;
        }
        ::QueryPerformanceCounter((LARGE_INTEGER *) &timeNow);
        if (timeNow > nextTime)
        {
            DrawStuff();
            nextTime += ticksPerFrame;
            // Effectively drops a frame if we've taken too long
            // to draw stuff.
            if (nextTime < timeNow)
                nextTime = timeNow + ticksPerFrame;
            okToDoWork = TRUE;
        }
    }
}
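The overrun handling at the bottom of that loop is easier to reason about when pulled out on its own. The following is only a sketch under my own naming: FrameClock and frameDue are hypothetical and not part of Win32, and the times are plain tick counts such as those returned by the performance counter.

```cpp
#include <cstdint>

// Hypothetical helper that isolates the frame deadline logic from the pump.
// All values are in counter ticks.
struct FrameClock
{
    std::int64_t ticksPerFrame; // length of one frame
    std::int64_t nextTime;      // deadline for the next frame

    // Returns true when the deadline has passed and a frame should be drawn.
    // The deadline normally advances by exactly one frame, which keeps the
    // long-term rate steady. If that still leaves the deadline in the past,
    // the overrun was more than a frame, so the missed frames are dropped
    // and the schedule restarts from the current time.
    bool frameDue(std::int64_t timeNow)
    {
        if (timeNow <= nextTime)
            return false;
        nextTime += ticksPerFrame;
        if (nextTime < timeNow)
            nextTime = timeNow + ticksPerFrame;
        return true;
    }
};
```

Advancing from the previous deadline rather than from the current time is what stops small delays accumulating into drift; only a genuine overrun resets the schedule.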