Tech-Foo: 2008

Happy Holidays

This is the last post of 2008 - on reflection it's been a rather good year: there have been several posts on this blog worthy of a read (and several that were pure garbage), I've gotten involved in a few more projects (I hope to unveil a new one fairly soon - it's very exciting). Time will tell what 2009 has in store for us!

I'm going to visit family in the UK, and I don't expect I'll be back home before the new year. So, happy holidays to one and all - even if you've decided to work through them!

Cheers,

Job Opening at Pebble Beach Systems

We have a couple of job openings available here at Pebble Beach Systems, due to expansion of the programming team.

If you know C++, a bit of SQL, think you know how to handle a thread, and think you have a handle on programming in general, feel free to contact me and I can pass your CV and cover letter along to the big boss-man. The best way to contact me is via email: thomir gmail com

We're looking for a graduate and a senior developer. Pebble Beach Systems is a great place to work - very friendly relaxed atmosphere with a great bunch of people.

Cheers,

Teaching Programming mk. 2

I blogged before about what I think we should teach programming students, and almost immediately wished I hadn't. Sometimes I feel that my blog posts are somewhat pointless meanderings through the garbage that inhabits my sleep-deprived brain. At other times I feel that I have contributed something useful to the general public. The post in question is firmly in the former category - but what can I do? I won't start deleting articles as soon as I fall out of favor with them, so I'm hereby correcting my earlier mistakes (at least, attempting to). Illiad Frazer knows how I feel:

The whole point of the previous post was that I felt that most graduate students were under-prepared for work in industry. My main evidence of this is that it seems to take a long time, and more importantly a lot of interviews before one strikes "candidate gold" when recruiting for a new programmer.
I will admit that this could be for many reasons: perhaps our expectations are too high, perhaps we are not paying enough to attract the kind of graduate we're looking for, or perhaps the industry we're in isn't desirable enough to attract the better candidates. The list goes on endlessly - and yet I cannot ignore the fact that most graduates I meet are not up to scratch.

So what prompted this revision of a past article? I happened to read E. W. Dijkstra's article entitled "On the cruelty of really teaching computing science". In it, he postulates that the methods used by most universities are fundamentally flawed when it comes to teaching computer science, and more specifically when teaching computer programming. I'd like to quote part of this article:

...we teach a simple, clean, imperative programming language, with a skip and a multiple assignment as basic statements, with a block structure for local variables, the semicolon as operator for statement composition, a nice alternative construct, a nice repetition and, if so desired, a procedure call. To this we add a minimum of data types, say booleans, integers, characters and strings. The essential thing is that, for whatever we introduce, the corresponding semantics is defined by the proof rules that go with it.

Right from the beginning, and all through the course, we stress that the programmer's task is not just to write down a program, but that his main task is to give a formal proof that the program he proposes meets the equally formal functional specification. While designing proofs and programs hand in hand, the student gets ample opportunity to perfect his manipulative agility with the predicate calculus.

This method of programming - approaching the programming language as a kind of "predicate calculus" - has its advantages. It demands that the students pay attention to the features, rules, regulations and guarantees that the language provides. Whichever language is used (and to a certain extent it does not matter), the rules and regulations of that language are going to dictate the structure of the program. This is similar to the fact that the laws of math dictate the form of any mathematical proof; ignore the laws of the language, and your program (or proof, if you will) no longer makes sense. In the domain of integer mathematics, 2 + 3 will always equal 5. In the domain of C++, local variables are destroyed in the reverse order that they were created in (insert whatever rule of the language you want there).
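To make that C++ rule concrete, here is a minimal sketch. The Tracer class and the destruction_order function are hypothetical names invented for this illustration; the only thing being relied on is the language guarantee itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type: records its own construction and destruction
// in a shared log so the ordering can be inspected afterwards.
class Tracer {
public:
    Tracer(const std::string& name, std::vector<std::string>& log)
        : name_(name), log_(log) {
        log_.push_back("construct " + name_);
    }
    ~Tracer() { log_.push_back("destroy " + name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Creates three locals in one scope and returns the recorded event order.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Tracer a("a", log);
        Tracer b("b", log);
        Tracer c("c", log);
    }  // end of scope: c is destroyed first, then b, then a
    return log;
}
```

Calling destruction_order() yields the three "construct" entries for a, b and c, followed by "destroy c", "destroy b" and "destroy a". A program can be built on that ordering (lock guards and file handles depend on it) precisely because the language guarantees it.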

Consider for a moment my previous post; I listed 11 things which I thought were essential for any programming student to know. Looking back, I notice that the top five items are all specific to C++ (since that's the language I talk in). Is it a coincidence that the five most important things any programming student can know are specific to the language they are using? I think not.

Rather, I believe that to be a great programmer, one must have a deep understanding of the language at hand, and how that language allows you to express logical problems. One must approach a program like a mathematical problem - that is, one must know the rules of the language, and then use those rules to design a proof that conclusively solves the logical problem using the language at hand.

That last point is worth reiterating: Anyone can write a program that appears to solve a problem most of the time. However, for non-trivial problems it becomes much harder to guarantee that the program will solve the problem 100% of the time. As we get further into the "edge cases" of the application logic it becomes less likely such a naive implementation will work correctly. However, a program that has been built from the ground up using the guaranteed behavior of the language can still contain bugs, but it's much more likely that they are logic errors introduced by the programmer, rather than subtle bugs introduced through language misuse.

At this point I must point out that I do not believe that Dijkstra's idea is as good as he makes it sound. He addresses one point - that students should understand the rules of the language, but a "love of the language" is only half the picture. There are also many non-language related skills that come into play. Consider debugging for example; there are formal techniques that can be used to debug certain types of errors. Knowing these techniques, and knowing when to employ them is a powerful aid in any language, and these are skills that should be taught, rather than learned in an ad hoc approach.

So, my top 10 list of things every programming student should know can now be revised into this much shorter form:

Know your language. I don't care what your language is - if you want a job it had better be something that's being used, but you can be a great programmer even if all you know is an out-dated language. Not only do you need to know your language, you need to have a passion for knowing your language - you must actively want to extend your knowledge of the language and how it works, what guarantees it provides and which it doesn't. This knowledge will translate into programs that use the features of the language to create minimal, efficient, well structured and error-free programs.
Be willing to learn new techniques. There are so many useful techniques and skills for a new programmer to have that I cannot list them all here, and course designers cannot possibly include them all in their course material.

That's it - two things. Much better than the self-absorbed tripe I rattled off a few weeks ago. To anyone who actually bothered to read that, I apologize profusely.

All Blacks vs. Wales

I don't normally write about sport, but I thought I'd mention that I'm about to start the long drive to Wales to see the match this weekend. My money's on the All Blacks to take the match, but many people have suggested that it may be a close call.

Will post an update upon my return!

EDIT: Well, I'm back. The Rugby was awesome, possibly the best game I've ever seen. I wanted to post a link to a highlights video on YouTube, but it seems no one has posted one yet. Instead, I'll post a video of the initial stand-off between the teams after the Haka. The atmosphere was incredible - a full house (74 thousand strong) all screaming at the top of their lungs. Good fun. Anyway, you can see it here:

The data: URL scheme

Here's something you may not already know: You can include data directly in an (x)html page, instead of referencing it externally.

For example, most of the time when you want to display an image you would write code like this:

<img src="http://www.blogger.com/some_image.png" alt="some random image" />

Web browsers downloading your HTML source will download the text first, then download any external references, including "some_image.png" (assuming the user has not turned off image downloading). However, there are a few cases where you want to distribute an HTML file with images, but don't want to distribute multiple files. In those cases, the 'data:' URL scheme is what you need.

The scheme is documented in the (very readable) RFC 2397. Essentially, you can include the binary data straight into your HTML code, encoded as base64 text. The example they give in the RFC looks like this:

<img src="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw%20%20%20AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz%20%20%20ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp%20%20%20a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl%20%20%20ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis%20%20%20F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH%20%20%20hhx4dbgYKAAA7" alt="Larry" />

Which equates to this image:

Larry

There are many reasons why you wouldn't want to do this - it increases the size of your HTML file, forcing users to download more before they can see whether your content is what they want (especially if the embedded data is near the beginning of the file). There are also some limitations on the size of data and those limitations vary depending on where this technique is used. Still, it's a useful technique that can be used when you need to embed small amounts of binary data within an HTML file and you don't want to distribute multiple files.
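If you want to generate such a URL yourself, the recipe is just "data:", the MIME type, ";base64," and the base64-encoded bytes. Here's a minimal sketch in C++; the function names base64_encode and make_data_url are my own invention for this example, and it assumes the whole file has already been read into a std::string.

```cpp
#include <cassert>
#include <string>

// Encodes raw bytes using the standard base64 alphabet with '=' padding.
std::string base64_encode(const std::string& bytes) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    std::size_t i = 0;

    // Every full group of 3 input bytes becomes 4 output characters.
    while (i + 2 < bytes.size()) {
        unsigned n = (static_cast<unsigned char>(bytes[i]) << 16) |
                     (static_cast<unsigned char>(bytes[i + 1]) << 8) |
                     static_cast<unsigned char>(bytes[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
        i += 3;
    }

    // A trailing group of 1 or 2 bytes is padded with '='.
    std::size_t rem = bytes.size() - i;
    if (rem == 1) {
        unsigned n = static_cast<unsigned char>(bytes[i]) << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (rem == 2) {
        unsigned n = (static_cast<unsigned char>(bytes[i]) << 16) |
                     (static_cast<unsigned char>(bytes[i + 1]) << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Builds a data: URL suitable for an img src attribute.
std::string make_data_url(const std::string& mime_type,
                          const std::string& bytes) {
    return "data:" + mime_type + ";base64," + base64_encode(bytes);
}
```

For example, make_data_url("text/plain", "Man") produces "data:text/plain;base64,TWFu". Base64 turns every 3 bytes into 4 characters, so the embedded data is roughly a third larger than the original file, which is one more reason to keep this for small images.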

Ten Things to Teach Programming Students

While talking to a friend recently, we began discussing the role of graduates in the industry. My belief is that employers employ graduates and expect them to have the same skill level as their existing, trained employees (I have certainly seen this first-hand). Having been on the "other side" of the problem I appreciate that graduates are rarely fit for the tasks set for them without further training.

This got me thinking: If there were 10 things graduates should know before graduating, what should they be? What short list of skills can graduates teach themselves to become better than their competition (and getting that first job is just that: a competition)? That train of thought spawned the following list:

Ten things programming students should know before graduating:

Inheritance & Composition. In the land of OO, you must know what inheritance does for you. In C++, this means that you must know what public, protected and (rarely used) private inheritance means. If class A is publicly inherited from class B, what does that tell you about the relationship between A and B? What about if the inheritance was protected, rather than public? In a similar vein, what does virtual inheritance do, and when would you want to use it? Sooner or later a graduate programmer will discover a complex case of multiple inheritance, and they need to be able to cope with it in a logical fashion. Knowing the answers to the above questions will help.
Unfortunately, a lot of the time inheritance is over-used. Just because we have access to inheritance, doesn't mean we should use it all the time! Composition can be a useful tool to provide clean code where inheritance would muddy the waters. Composition is such a basic tool that many graduates don't even think of it as a tool. Experience will teach when to use composition and when to use inheritance. Graduates have to know that both can be solutions to the same problem.
Memory Allocation. So many graduates do not understand the importance of cleaning up after yourself. Some do not fully appreciate the difference between creating objects on the stack and on the heap. Some know that but fail to understand how memory can be leaked (exceptions are a frequent cause of memory leaks in novice programmers). Every programmer should know the basic usage of new, new[], delete and delete[], and should know when and how to use them.
Exceptions. Most programmers share a love / hate relationship with exceptions; you gotta know how to catch them, but at the same time you tend to avoid using them yourself. Why? Because exceptions should be... exceptional! There's a reasonably large amount of overhead associated with throwing and catching exceptions. Using exceptions as return values or flow-control constructs are two examples of exception misuse. Exceptions should be thrown only when the user (or programmer) does something so bad that there's no way to easily fix or recover from it. Running out of resources (whether it be memory, disk space, resource IDs or whatever) is a common cause for exceptions to be thrown.
Const correctness. Const correctness is so simple, yet so many programmers just don't bother with it. The big advantage of const-correctness is that it allows the compiler to check your code for you. By designating some methods or objects const you're telling the compiler "I don't want to change this object here". If you do accidentally try to change the object, the compiler will refuse to compile the code.
Threading. Threading is hard. There's no simple way around this fact. Unfortunately, the future of PC hardware seems to be CPUs with many cores. Programs that do not make use of multiple threads have no way to make use of future hardware improvements. Even though libraries like Qt make it ridiculously easy to create threads and pass data between threads, you still need to understand what a thread is, and what you can and cannot do. A very common thing I see in new programmers is a tendency to use inadequate synchronization objects in threads. Repeat after me: "A volatile bool is not a synchronization object!".
Source control. Every programmer on the planet should know how to use at least one version control system. I don't care if it's distributed or not, whether it uses exclusive locks or not, or even if it makes your tea for you. The concepts are the same. Very few professional programmers work alone. Graduates must be able to work in a team - that includes managing their code in a sensible fashion.
Compiler vs Linker. Programmers need to understand that compiling an application is a two step process. Compilation and Linking are two discrete and very different steps. Compiler errors and Linker errors mean very different things, and are resolved in very different ways. Programmers must know what each tool does for them, and how to resolve the most common errors.
Know how to debug. When something goes wrong, you need to know how to fix it. Usually, finding the problem is 90% of the work, fixing it is 5% of the work, and testing it afterwards is another 10%. No, that's not a typo - it does add up to more than 100%, which is why there's a lot of untested code out there! Of course, if you were really good you wouldn't write any bugs in the first place!
Binary Compatibility. This one is for all those programmers that write library code, or code that gets partially patched over time. As you probably already know, shared libraries contain a table of exported symbols. If you change that table so a symbol is no longer available (or its signature changes), code that uses that symbol will no longer work. There's a list of things you can and cannot do while maintaining binary compatibility, and it's very hard not to break those rules, even if you know what you're doing. I've blogged about this before, and linked to the KDE binary compatibility page on techbase - worth a read!
The main method of maintaining binary compatibility is to program to an interface, rather than to an implementation. Once you start paying attention to binary compatibility, you'll quickly realise that it's a very bad idea to export your implementation from a shared library, for one simple reason: If you want to change your implementation you're stuck with the restrictions placed upon you by the need to maintain binary compatibility. If all you export is a pure interface and a means to create it (possibly via a factory method) then you can change the implementation to your heart's content without having to resort to pimpl pointers.
Read the right books. There are a few movers and shakers in the programming industry that it pays to keep an eye on. There are many books worth reading, but I'm going to recommend just two. The first is "Design Patterns: Elements of Reusable Object-Oriented Software", and the second is the "Effective C++" series. Neither are considered to be great bedtime reading, both are considered to be packed from cover to cover with things that will help you out in every-day situations. Any programmer worth his or her salt will own a copy of at least one of these books, if not both. Of course, there are books on UI design and usability, threading, text searching, SQL and database maintenance, networking, hardware IO, optimisation, debugging... the list goes on.
Networking. What's this? An 11th item? That's right: it's in here because it cannot be ignored in most programming tasks. It's getting harder and harder to avoid networking. Most graduates will have to write code that sends data over a network sooner or later, so they'll need to know the difference between UDP, TCP and IP, as well as what the basic network stack looks like (think "Please Do Not Touch Steve's Pet Alligator"), and what each layer does. Being familiar with tools like Wireshark helps here.
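To make the Memory Allocation and Exceptions items above a little more concrete, here's a minimal sketch of how an exception can leak memory, and one way to avoid it. Everything in it (Widget, might_throw, leaky, safe, live_after_failure) is a hypothetical name made up for this example, and the safe version assumes a C++11 compiler for std::unique_ptr.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are currently alive.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Stand-in for any call that can fail part-way through a function.
void might_throw(bool fail) {
    if (fail) throw std::runtime_error("out of resources");
}

// Leaks when might_throw throws: control leaves the function before
// the delete is reached, and nothing else owns the Widget.
void leaky(bool fail) {
    Widget* w = new Widget;
    might_throw(fail);
    delete w;
}

// The unique_ptr destructor runs during stack unwinding, so the Widget
// is released whether or not might_throw throws.
void safe(bool fail) {
    std::unique_ptr<Widget> w(new Widget);
    might_throw(fail);
}

// Calls f(true), swallows the exception, and reports how many Widgets
// are still alive afterwards.
template <typename F>
int live_after_failure(F f) {
    Widget::live = 0;
    try {
        f(true);
    } catch (const std::runtime_error&) {
        // The failure itself is expected; only the leftover count matters.
    }
    return Widget::live;
}
```

Here live_after_failure(leaky) reports 1 leftover Widget, while live_after_failure(safe) reports 0. The point isn't that smart pointers are magic; it's that tying cleanup to a destructor lets the language's own guarantees do the work for you.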

What's not in the list:

You may notice that I haven't included any specific technologies in this list. That's because I firmly believe that it really doesn't matter. Sure, there are some libraries that are better than others (I'd bet my life on a small set of libraries), but the programmer next to me has a different set. I care not one groat whether a graduate knows how to program in .NET, Qt, wxWidgets or anything else - as long as they're willing to learn something new (whatever I'm using on my project).

Which brings me nicely to the conclusion: The single quality I see in all the programmers I admire is a sense of curiosity; a restlessness and a sense of adventure. Our industry is constantly shifting. The best programmers are able to ride the changes and come out better for it.

Is this post horribly self-indulgent and boring? Probably, but it had to be done. Have I forgotten anything? Things you feel should be on the list that are missing? Remember that the point of the exercise is to keep a small list - I could list every programming skill and technology required under the sun, but that would not be very useful would it?

Releases Galore!

It would be negligent of me if I did not point out that today several important software releases were made:

The first is the project formerly known as project greenhouse - now known as Qt Creator. I've blogged about this before. You can now download a technical preview. I'm very excited about this - having played with it in its beta state I can't wait to use it with some of my active projects. Unfortunately I won't be doing any coding this weekend, as I'm off to Switzerland for a long weekend. I'll have to try it out when I get back.

The second big release event today is the Ubuntu family of distributions. That's right, version 8.10 is out now. I'm a kubuntu man myself, so I'll be trying this out after my long weekend as well.

That's all from me - I have several projects in the wings to blog about in the coming weeks, but for now I need to catch some sleep - my taxi arrives at 5:00 AM tomorrow.

Cheers,

VMWare Server 2: Worse Than Failure

OK, so this is hardly breaking news, but I thought I'd share this mini-rant with you now any way.

At work we deal with multiple operating systems (WinXP, Vista, Linux), and multiple programming environments. A few weeks ago I decided to take the plunge and do all my development work inside virtual machines. The advantage of this approach is that it's very fast to switch from one environment to another (much faster than a whole machine reboot).

There are two problems with this approach:

Performance. Programming is a pretty CPU-intensive task. Well, compiling the code is anyway. Compiling our code base takes around an hour on a physical machine with top-of-the-line specs (this makes compiling the Linux kernel seem fast). On a virtual machine, that time doubles. This is an inescapable truth about software virtualization: there will always be some overhead.
Hardware. The applications I program need access to physical hardware, over a variety of interfaces, including RS232, RS422, USB, Parallel, TCP/IP, UDP/IP and a few others besides. Any software virtualization package must be able to forward all these hardware interfaces through to the virtual machine.

In my experience, the only software virtualization package that meets requirement 2 is VMWare. I know I've raved about Virtualbox before, and I was very tempted to use it again now, but it lacks the hardware support I need.

So, VMWare Server it is then. The next question then becomes: which version? I've used version 1.x before, and it fulfilled all my needs at the time. However, I noticed that version 2 is now available. I thought "In software, bigger numbers are better, right?"

After downloading the server, and registering for a free license key, I spent a busy 30 minutes clicking through the ubiquitous license agreements and installation options (does anyone ever read these things?). Everything was installed. I went to fire the application up, only to have it launch Mozilla Firefox. What's going on here? Then it hit me: a wave of fear and horror. What were they thinking?

The management interface for VMWare Server 2 is web based.

That's right... you want to use your virtual machine? You need to run a web browser. The interface is slow, the browser plugin that supports the interface is buggy... I could go on, but some other people have detailed the problems with the software far better than I ever could (Yes, I realise that link is talking about the beta release. Trust me, nothing much has changed).

Maybe I'm missing something, some gold nugget of understanding that would make Server 2 more usable for me, but right now I just don't get it. Why would you decide that the primary interface to a virtual machine should reside inside a browser? Browsers are notorious for implementing different standards, being generally slow, memory-hogging apps that occasionally crash. Who in their right mind would want to use a browser for their virtual machines?

Until then, I'll stick with VMWare server 1, thank you very much.

It's true: Qt Developer Days Rock!

Well, I'm back. I arrived home at 1:30 AM this morning. Qt Software's "Developer Days" conference was simply brilliant. On top of the many technical talks (I'll digest them and use them as inspiration for new posts here over the next few weeks), it was a great chance to meet other Qt developers, and the trolls themselves.

There's a lot of stuff to talk about - more than I can do justice to in one post. I will quickly mention Project Greenhouse, the new IDE that should be released as alpha software in the next few weeks. Details are sketchy, but the following points are probably all correct:

"Project Greenhouse" is the development title - it may change before release.
The IDE is aimed at creating a truly cross-platform development environment.
Uses GCC / GDB to compile / debug code.
Includes nice editor features like code folding, syntax highlighting, auto-completion etc.
It's been touted as a "replacement for Vi/Emacs" - although this is misleading - as I understand it no one is going to create a Vi clone editor in the IDE. It's a replacement in the sense that you will no longer need to use Vi in order to edit your code under Linux.

One point the trolls were very insistent upon was that they're not trying to replace Visual Studio, Eclipse, KDevelop, or XCode. Those IDEs have a great many features, whereas PG intends to deliver a core set of features - just enough to make Qt development a breeze across multiple platforms.

At least, that's my interpretation of the project. There's been lots of speculation across the internet and on the Qt-interest mailing list about the project, so we may have to wait and see.

TT Dev Days: Here I come!

Just thought I'd let you all know that I'll be attending the Trolltech Developer Days in Munich, Germany. I'm not taking my laptop, so I probably won't update this site while I'm there (not that you'd notice a change in activity levels, right?), but I hope to provide a few details once I get back.

Until then, keep coding!

End of the world as we know it?

This is a short post - I am overcome with grief. I have been waiting for Spore to arrive on the Wii, and now I hear it will never arrive.

So many years of anticipation; so much disappointment - DRM issues aside I was still looking forward to this. Now I won't have a chance to sample that sweet candy of gaming goodness.

Slackware Rant

Yes, this is a rant. Pure, unaltered hate and bile to follow:

<rant>

Check out the first sentence on the Slackware "General Info" page:

The Official Release of Slackware Linux by Patrick Volkerding is an advanced Linux operating system, designed with the twin goals of ease of use and stability as top priorities. Including the latest popular software while retaining a sense of tradition, providing simplicity and ease of use alongside flexibility and power, Slackware brings the best of all worlds to the table.

Does anyone using Slackware actually believe this? How is a distribution that has no graphical installer, no network-enabled package management and a severely restrictive list of packages available on the default install easy to use? How is Slackware more advanced than a distribution that provides proper package management services?

I guess when they say "while retaining a sense of tradition" they mean "feels like you just stepped into a time-warp into the Linux distros of yester-year".

I guess it's fine for those of you who want the masochistic rituals of using Linux in the early 1990's (I was there: been there, done that, bought the T-shirt). To the sane members of the human race who want a proper Linux distribution I strongly recommend something else.

</rant>

I think I need more practice at ranting - that was pretty tame really, but I feel better, which is the most important thing.

Cheers,

threads vs. processes

It's been a few weeks since my last post. My excuse is that I've been busy - my job is always busy around this time of year due to the IBC trade show. Thankfully most of the work is now done. In my own time I've been working on a number of projects (to be unveiled shortly, once they're usable).

In the last few weeks we've seen the launch of the Google Chrome browser. I won't discuss it here directly (plenty of other people have reviewed it separately). I will, however, mention the one feature that has converted me away from Firefox and towards Chrome:

Google Chrome has a multi-process architecture, meaning tabs can run in separate processes from each other, and from the main browser process.

What this means in theory is that a misbehaving website cannot bring down the entire browser session - you can just close / kill the offending tab / window.

This is pretty cool, but how does it relate to my own work?

I'm writing an application that makes extensive use of python scripts to customize the input and output of the application. I could have spent a while embedding the python scripts using the python API, and then running those separate scripts in a thread, but instead I chose to run them as separate processes spawned by a central application. The way I see it, this approach has several advantages:

Crashing child processes (in this case python scripts) are unlikely to bring down the entire application - you still need to be careful since there's likely to be communication between child and parent process, and the parent needs to be able to handle corrupted data in that communication. Other than that, if one of my scripts breaks, I can easily carry on; no error handling or cleanup needs to be done.
In my application, I need to repeatedly call the script files - probably thousands of times for a single application run. In a traditional threaded environment the possibility for memory leaks is huge. This way, since each sub-process is short lived, the operating system takes charge of cleaning up any memory not deallocated by the child process.
It's a damn sight easier to program too. No thread synchronization required!

It's not all roses however.

In my experience the main drawback is that it's much harder to pass data between parent and child processes. For simple communication it may be enough to use stdin and stdout, but for anything more complicated you'll want to use some form of proper IPC (sockets, named pipes, DBus etc). Even with proper IPC it's still harder to pass custom data structures between processes, since you'll need some sort of data serialization.
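For the simple stdout case, the parent side can be sketched in a few lines. This is not the code from my application: run_child and ChildResult are hypothetical names for this example, and it assumes a POSIX system where popen and pclose are available.

```cpp
#include <cassert>
#include <stdexcept>
#include <stdio.h>   // popen / pclose are POSIX, declared in <stdio.h>
#include <string>

// What the parent learns about a finished child process.
struct ChildResult {
    std::string output;  // everything the child wrote to stdout
    int status;          // raw value from pclose; 0 means a clean exit
};

// Runs `command` through the shell as a separate process and collects
// its stdout. A crash or non-zero exit in the child shows up as a
// non-zero status here; the parent simply carries on.
ChildResult run_child(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("popen failed for: " + command);
    }

    ChildResult result;
    char buffer[256];
    size_t count;
    // fread copes with arbitrary bytes, not just newline-terminated text.
    while ((count = fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        result.output.append(buffer, count);
    }

    // pclose waits for the child; the OS reclaims the child's memory.
    result.status = pclose(pipe);
    return result;
}
```

In practice the command would be something like "python my_script.py" (a made-up script name), and the parent would check status before trusting output. Anything richer than a blob of text on stdout is where the serialization pain mentioned above kicks in.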

Think this all sounds obvious?

That's because it really is! I can't believe how much simpler this approach is, especially if your requirements are for simple delegation to scripts or sub-processes. If you require more complex interaction you might find this approach more trouble than it's worth.

virtualbox rocks

I know I've said this before, but VirtualBox kicks ass. It has that whole "it just WORKS" feel to it. I realise that repeating yourself is supposedly one of the hallmarks of a bad blog, but please indulge me this once; VirtualBox deserves your attention. The image to your right shows virtualbox running windows XP - I'm installing Qt4.4 in the hope that some of the bugs I mentioned in my previous post have been cleaned up.

I'm also compiling the latest KDE4, which is where a multi-core system would be really nice. Until then, I'll continue to struggle with my old box.

Twitter Submission

I posted previously that I just didn't "get" twitter. I guess this sums up my objections pretty well; unless you have anything interesting to say, stay silent. Do we really need one more idiot blathering on about the most mundane details of his life? I think not.

Even some of my most respected peers seem to be reduced to complete morons when it comes to their twitter feeds.

However, it turns out that the cost of my stubbornness to jump on the twit bandwagon has grown to the point where I am now on twitter. That's right, I'm bowing to peer pressure, and have officially jumped on the twit wagon - You can see my last twitter updates on the right, or you can follow me on twitter.

Now the entire world can join me in my mindless ramblings... of course, if you're reading this then you've already had that pleasure!

Buffalo Routers Rock!

Waaaaay back in January I mentioned that I might build my own wireless router. Well, like most of my projects I got half way then got distracted. The other day I got fed up with the lack of wireless (I needed to get my Wii online, apart from anything else), so I splashed out and bought a Buffalo Air-Station (£45).

This baby comes pre-loaded with the open source dd-wrt firmware. It was dead-easy to set up, and boy, what a difference the open firmware makes! This is easily the best router firmware I have ever seen - it boasts more features than you can shake a stick at. Better yet - it's aimed at people who know what they're doing, and doesn't try to hide its functionality behind restrictive setup "wizards".

All in all, I'm a very happy man with my new wireless network ;)

pointer quiz

One of my pet peeves with regards to C++ is how very few people understand how the delete operator handles NULL pointers. Let's see if you pass the test!

Question: What happens when you do this:


myType *ptr = NULL;
delete ptr;

Well? Your choices are:

a) Crash, bang, boom, your computer is now a very heavy paperweight.
b) Nothing - the delete line ignores the ptr to be deleted.

It turns out that deleting a null pointer is safe. Section 5.3.5/2 of the C++ standard states that:

"In either alternative, if the value of the operand of delete is the
null pointer the operation has no effect."

This has been a pet peeve of mine for a while now. I can't count the number of times I've seen programmers write something like this in class destructors or cleanup methods:


if (pMyPtr)
 delete pMyPtr;

This is stupid for several reasons. For a start, the if statement is redundant - if the pointer is NULL the delete will do nothing. Secondly, the programmer never sets the pointer to NULL after deleting it, which means that if this code were to be called again, the check would pass and the second delete would act on a dangling pointer. That is undefined behaviour, and in practice it usually means a crash or a corrupted heap.

In order to avoid these issues, and avoid angering me if I ever see your code, you should:

Always initialize pointers to NULL if you're not going to set them to something else straight away (i.e.- if the pointer is not always used).

Always set them to NULL after you delete them, especially if there's a chance that the delete can be called twice.
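To make the two rules above concrete, here is a minimal sketch. The counting `myType`, the `deleteAndNull` helper and the `pointerQuizDemo` function are all names I've made up for illustration - they're not from any library, and this is just one way of doing it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type used only for this illustration; it counts live instances.
struct myType
{
    static int liveCount;
    myType()  { ++liveCount; }
    ~myType() { --liveCount; }
};
int myType::liveCount = 0;

// Hypothetical helper: delete the object and reset the caller's pointer,
// so that a second call becomes a harmless delete of a null pointer.
template <typename T>
void deleteAndNull(T *&ptr)
{
    delete ptr;   // no effect when ptr is already NULL
    ptr = NULL;
}

// Walks through the cases from the quiz; returns the number of live objects.
int pointerQuizDemo()
{
    myType *ptr = NULL;      // rule 1: initialise to NULL
    delete ptr;              // safe: deleting a null pointer does nothing

    ptr = new myType;
    assert(myType::liveCount == 1);

    deleteAndNull(ptr);      // destroys the object and nulls the pointer
    deleteAndNull(ptr);      // rule 2 pays off: the second call is safe
    assert(ptr == NULL);

    return myType::liveCount;
}
```

Note that there is no `if (ptr)` check anywhere - the delete expression already handles the NULL case for you.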

piwup: A Picasaweb Image Uploader for Linux

One of my pet peeves has always been that unless you want to run google's picasa application under Linux, the only way to upload photos to your picasaweb account is via a klunky web interface that only allows you to select 5 images at a time. When I come back from a trip I have hundreds of photos, so this gets tiresome very quickly.

There is a kipi plugin that is supposed to be able to do this, but it has not yet hit the Linux distribution I am using, and I'm not about to start compiling plugins from source. Besides, half the fun is in making the application!

This is definitely not a finished application! I got it to the point where I could upload my images in a batch, but it needs more work before it's useful to anyone else. Here's a few sample screen shots:

Selecting images to upload.

Uploading the first image.

The application still has a long way to go. Just some of the things yet to complete are:

Remove hard coded items from the code (account details, service host, album name), and make these configurable via a nice configuration dialog. Make sure password is stored in a secure form - via the KDE wallet perhaps.

Make the GUI half-decent. Originally I just wanted something to work - I need to go back and do it again with a proper menu and image thumbnail support.

Bug fixes too numerous to mention here... this is some rough, cheap and nasty code!

Perhaps, once I get all this done I will attempt to get it officially released into some distros. I think it's a useful application, and the kipi plugin version doesn't seem to be moving along much. Yes, I realize that I'd be better off spending my time improving the kipi plugin, but to be honest I can't be bothered right now - this was a learning experiment for me as much as it was about making an application that solved one of my problems.

The entire application is written in C++ and Qt4. The more I use Qt the more I like it. This application was simplicity itself to make, and I look forward to continued development.

Problem with pimpls

That's "pimpl", not "pimple" - I'm talking about opaque pointers. These beauties help protect your public interfaces from changes in implementation. This useful technique has a few drawbacks that aren't so well publicised. In order to prevent others making the same mistakes I have, I thought I'd outline the general use of the pimpl pointer, and some of its drawbacks:

Here's an example of a first-pass class to encapsulate a user account (I'm making this up on the fly, so bear with me):


class userAccount
{
public:
    // public methods go here
private:
    // private data members:
    unsigned int accountId;
    std::string username;
    std::string realName;
};

This will work just fine, but there's a problem. If you need to change the implementation (say you want to store the user's real name in two fields instead of one), unless you're very careful you will end up changing the size and / or the declaration of the class.

Changing the size of the class is a big problem if you're trying to maintain binary compatibility. Changing the declaration of the class is a problem because (some) compilers will now recompile every file that includes your changed header file, even if the changes make no difference to the binary output.

The solution comes in the form of an opaque pointer, or "pointer to implementation" (which is where we get the charming "pimpl" name from). The idea is that the implementation details are put in a separate class that is forward declared in the header file, and fully declared in the cpp file. Your external interface now only contains a single pointer - you can change the size of your implementation class to your heart's content, and you will never change the size or declaration of your external interface. The class above refactored to use a pimpl pointer looks like this:

Header File:


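// forward declare implementation class:
class userAccountImpl;

class userAccount
{
public:
    userAccount();
    ~userAccount();
    // public methods go here
private:
    // copying is not implemented in this example, so disallow it:
    userAccount(const userAccount &);
    userAccount &operator=(const userAccount &);

    // private data members:
    userAccountImpl *pimpl;
};

Source File:

#include <string>
// (also include the header file shown above)

class userAccountImpl
{
public:
    unsigned int accountId;
    std::string username;
    std::string realName;
};

userAccount::userAccount()
: pimpl(new userAccountImpl)
{
}

userAccount::~userAccount()
{
    // release the implementation object allocated in the constructor
    delete pimpl;
}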

Now any data members can be accessed via the private implementation class. There are several things you can do to extend the example above (using a shared_ptr is a start), but I want to keep things simple for the sake of the example.
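For what it's worth, here's a rough sketch of the same class with a smart pointer owning the implementation. I've used `std::unique_ptr`, which assumes a C++11 compiler, rather than a shared pointer. The accessors, the constructor arguments and the `pimplDemo` function are made up for the example, and I've squeezed the "header" and "source" parts into one file to keep it short.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- what would live in the header file ----
class userAccountImpl;   // forward declaration only

class userAccount
{
public:
    userAccount(unsigned int id, const std::string &username);
    ~userAccount();      // must be defined where userAccountImpl is complete

    // copying is disabled to keep the sketch short
    userAccount(const userAccount &) = delete;
    userAccount &operator=(const userAccount &) = delete;

    // hypothetical accessors, added for illustration
    unsigned int id() const;
    const std::string &name() const;

private:
    std::unique_ptr<userAccountImpl> pimpl;
};

// ---- what would live in the source file ----
class userAccountImpl
{
public:
    unsigned int accountId;
    std::string username;
    std::string realName;
};

userAccount::userAccount(unsigned int id, const std::string &username)
: pimpl(new userAccountImpl)
{
    pimpl->accountId = id;
    pimpl->username = username;
}

// The smart pointer frees the implementation; no manual delete is needed.
userAccount::~userAccount() = default;

unsigned int userAccount::id() const { return pimpl->accountId; }
const std::string &userAccount::name() const { return pimpl->username; }

// Small usage check for the sketch above.
bool pimplDemo()
{
    userAccount account(42, "alice");
    assert(account.id() == 42);
    return account.name() == "alice";
}
```

The one subtlety is the destructor: it is declared in the header but defined in the source file, because the smart pointer can only delete the implementation object once the implementation class is fully declared.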

Until recently, I took this method for granted and used it as often as I could. As so often happens when I learn something new I rush to use it in every possible situation, including ones where it doesn't make sense. The pimpl idiom has a few problems associated with it, which I will outline here:

Your object's memory footprint is now split into more than one place in memory. This may not be a huge problem for 90% of classes, but consider a small utility class that contains only standard data types:
```
class colour
{
public:
    // public methods go here
private:
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};
```
In order to serialise this into a buffer (like a file), you can get away with using a memcpy or similar technique. Since your object's memory footprint is contiguous, copying the entire object into a file is simple (yes yes, I know: there are many reasons why this isn't a good idea, but let's face it - this happens all the time). Once you start using pimpl pointers it gets a bit more difficult. Since your implementation class is private to the .cpp file, the code to do the copying needs to be in the same cpp file (otherwise it doesn't have access to the definition of the implementation class). This is relatively easy to work around, but the trouble doesn't stop there - consider what you need to do to un-serialise an object from a buffer. You can no longer be cheeky and cast the buffer straight to your class like so:
```
colour *pCol = (colour*) pBuffer;
```
Again - I realise that this isn't the best idea in the world, but when you're programming with constraints sometimes this is the best way to do things.

The default new and delete operators are expensive. I never realized just how expensive they can be. In a recent bout of performance testing on some real-time software I saw the default new operator take 55ms to allocate a block of 4B of memory. That's way too slow for the real-time application I was working on, and may be too slow for other applications as well. What's more, the times get worse the more memory you allocate - so using the pimpl pointer may not be a good idea at all, since it adds the overhead of a new call to each constructor, and a delete to each destructor. Ouch!
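The serialisation drawback above can be sketched roughly as follows. The accessors, the `pimplColour` class with its `writeTo` / `readFrom` methods, and the `roundTripDemo` function are all invented for the example, and the static_assert assumes a C++11 compiler. Treat it as an illustration of the problem rather than a recommended design.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Plain-data colour: every byte of state lives inside the object itself.
class colour
{
public:
    colour(unsigned char r, unsigned char g, unsigned char b)
    : red(r), green(g), blue(b) {}
    unsigned char g() const { return green; }
private:
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};

// Copying raw bytes is only well-defined for trivially copyable types.
static_assert(std::is_trivially_copyable<colour>::value,
              "colour must be trivially copyable to memcpy it");

// Pimpl'd colour: the object itself holds nothing but a pointer, so copying
// its bytes would save an address rather than the colour data.
struct colourImpl { unsigned char red, green, blue; };

class pimplColour
{
public:
    pimplColour(unsigned char r, unsigned char g, unsigned char b)
    : pimpl(new colourImpl{r, g, b}) {}
    ~pimplColour() { delete pimpl; }
    pimplColour(const pimplColour &) = delete;
    pimplColour &operator=(const pimplColour &) = delete;

    // Serialisation has to reach through the pointer, and has to live
    // wherever colourImpl is fully declared.
    void writeTo(unsigned char *buffer) const
    { std::memcpy(buffer, pimpl, sizeof(colourImpl)); }
    void readFrom(const unsigned char *buffer)
    { std::memcpy(pimpl, buffer, sizeof(colourImpl)); }

    unsigned char b() const { return pimpl->blue; }
private:
    colourImpl *pimpl;
};

// Round-trips both versions through a byte buffer.
bool roundTripDemo()
{
    unsigned char buffer[sizeof(colour)];
    colour original(10, 20, 30);
    std::memcpy(buffer, &original, sizeof(colour));   // whole object in one go
    colour restored(0, 0, 0);
    std::memcpy(&restored, buffer, sizeof(colour));

    unsigned char pimplBuffer[sizeof(colourImpl)];
    pimplColour source(1, 2, 3);
    source.writeTo(pimplBuffer);                      // explicit method needed
    pimplColour target(0, 0, 0);
    target.readFrom(pimplBuffer);

    return restored.g() == 20 && target.b() == 3;
}
```

Notice that even just constructing a `pimplColour` costs a call to new, which is exactly the second drawback.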

If you're looking for more info on the pimpl idiom, Sutter wrote a good article on the pimpl pointer, and a more recent article that talks about some of the performance issues associated with pimpl pointers. This technique is worth using - it can save many attacks of "code cheese" in the future. Just be careful where you use it, or you may end up with some nasty performance issues you didn't expect!

Apathy, Apples and Understanding

You may notice that it's been a long time since my last post. Truth be told I've been lazy. It's tempting to say that I've been busy, but that's a sugar coating on apathy. For that I apologise! Hopefully I can get back into the habit of regular posting again soon.

Inspiration struck the other day when Jeff Atwood posted an interesting article on his blog. "Dealing With Bad Apples" talks about how single members of a programming team can be difficult, often working against the rest of the team.

Atwood quotes Robert Miesen:

I was part of a team writing an web-based job application and screening
system (a job kiosk the customer called it) and my team and our
customer signed on to implementing this job kiosk using Windows,
Apache, PHP5, and the ZendFramework -- everyone except one of our team
members, who I will refer to as "Joe". Joe kept advocating the use of
JavaScript throughout the technology deliberation phase, even though
the customer made it quite clear that he expected the vast majority of
the job kiosk to be implemented using a server-side technology and all
the validation should be done using server-side technology.
The fact that the customer signed off on this, however, did nothing
to deter Joe from advocating JavaScript -- abrasively. Every time our
project hit a bump in the road, Joe would go off on some tirade on how
much easier our lives would be if we were only writing this job kiosk
in JavaScript. Joe would constantly bicker about how we were all doing
this all wrong because we weren't doing it in JavaScript, not even
bother to learn the technologies we were actually using, and, whenever
fellow teammates would try and gently bring him back into the fold
(usually via email), Joe would just flame the poor guy. At the height
of Joe's pro-JavaScript bigotry, he would regularly belt off comments
like, "Well, if we had only done it in JavaScript," to such an extent
that the team would have been better off if he had just quit (or was
reassigned or fired.)

Jeff then goes on to suggest that perhaps the problem here is a "bad apple" - a team member that is doing more harm than good in a team. He's probably right, but I have a slightly different angle.

Perhaps the real problem here is poor team leadership / management? Without meeting "Joe" personally, I cannot make any accurate assessment of the situation, but it seems to me that perhaps Joe feels undervalued in his team? I say this because I recognize that behavior pattern - in myself.

In any team of programmers, each member will have different backgrounds, strengths and weaknesses. Joe obviously has experience using JavaScript, and feels the need to share his expertise in that field. I'm not saying that this is a good thing, but perhaps the underlying problem is a lack of cohesion and understanding between team members?

So, further to Atwood's list of warning signs for detecting "bad apples", I have a list of actions team leaders could consider taking when dealing with a so-called "bad apple":

Listen to them. Most geeks (I use the term with all possible affection) are reasonable people. If a team member is repeating themselves, perhaps they feel that their point was never seriously considered in the first place? I can't count the number of times I've made contributions in meetings that were ignored, only to hear (usually six weeks later) "hey, we should have done X, what a pity it's too late now...". It always seems petty to point out that I suggested X from the start. Bear with me here - I'm certainly not suggesting that I'm always right - far from it; my point is that you ignore contributions from your team members at your own risk!

Finally, I'm not suggesting that team leaders always act on suggestions from their team members, but listening is a good start.

Once you start listening to bad apples, you may find that some of your team members have strengths you didn't expect. Can you use these strengths in the future? This depends a lot on your business model and workload. From my own experience I can understand that programming code that doesn't interest you week in - week out can be incredibly draining. Perhaps bad-apples can be encouraged to pull together with the team?

Finally - I don't know what the IT job market is like where Jeff lives, but you can't fire programmers and expect to get a replacement any time soon. Jeff writes:

You should never be afraid to remove -- or even fire -- people who do not have the best interests of the team at heart. You can develop skill, but you can't develop a positive attitude.

I say "bollocks" to that - it's incredibly expensive to fire and replace someone. Not only is there the cost of looking for, and hiring someone new, but there's the training overhead, and there's no guarantee that you can find someone with the appropriate skill set any time soon. From where I'm sitting it looks like we have to wait around 6-8 weeks between looking for, and hiring a new programmer. That's almost two months of productivity down the drain! Suggesting that you can't develop a positive attitude in your team-members is incredibly negative and close-minded. I'm certainly glad I'm not on a team with a leader like that!

That said and done, I do hope that the current skill shortage in this country develops a greater appreciation of the worker. I suspect that most companies vastly underestimate the value of their skilled (and unskilled) workers.
Next time you have a problem with someone, consider the massive cost of replacing them, and - more importantly - consider the huge amounts of good work they've done, before you concentrate on the bad.

Masters of the big buildup

The Foo Fighters are the masters of the big buildup. Listen to a track like "Let it Die" from their album "Echoes, Silence, Patience and Grace" - Brilliant!

QtScript: Exposing C++ classes (part two)

This is the second part of my QtScript walkthrough / tutorial. You can read the first part here.

As in the last tutorial, I must add the following disclaimer:

I'm writing these posts as I learn to use the QtScript module. As such, the solution I present here may not be the best / most elegant way of doing things. However, if I find a better solution, I'll certainly post back here in the future with corrections.

If there's any (QtScript-related) topic you'd like me to write about in the future, just leave a message and I'll endeavor to fulfill your request.

Background:

The last post was all about deferring some execution from a C++ class to an ECMAScript. The basic idea was that the C++ class could emit signals, zero or more of which were connected to functions in the script file. The script file set up these connections, so the C++ code doesn't need to know which signals are being overridden, and which aren't.

This is all well and good, but you'll soon find that there's very little you can do in the script file, since the only functionality you have available is the built-in ECMAScript functions. What I'm going to look at today is how to make your own classes, and Qt classes, available for use in the script file. In my example, I'm going to make the QMessageBox class available, but you can choose any other class you want.

Take 1:

First off, I want to be able to write "mb = new QMessageBox;" in my script, and have it work. This is actually pretty simple to achieve. The way I have chosen to do this is with a wrapper class. This wrapper derives from QMessageBox and QScriptable, and contains the magic "qscript_call" method. Here's the code:


// wrapper around QMessageBox:
class Wrapper_QMessageBox: public QMessageBox, protected QScriptable
{
 Q_OBJECT
 public:
 
 Wrapper_QMessageBox(QWidget *parent =0)
 : QMessageBox(parent) {}
 
 public slots:
  QScriptValue qscript_call(QWidget *parent = 0)
  {
   QMessageBox * const iface = new Wrapper_QMessageBox(parent);
   return engine()->newQObject(iface, QScriptEngine::AutoOwnership); 
  }
};

This is the complete wrapper so far. Note that normally I wouldn't write all this code inline, as I firmly believe that inline functions are evil, but for demonstration code I think this makes it more readable. As you can see, we inherit from QMessageBox and from QScriptable. Note the protected inheritance! QScriptable gives us the engine() call we need later on.

The constructor is pretty straightforward - no surprises there!

Then we have a special slot called "qscript_call". This seems to be an undocumented feature in Qt, (try searching for qscript_call on google and you get very few results). Essentially, this slot gets called when our wrapper class is called as a function. I guess this is similar to the meta-method "__call__" in python. All we do in this slot is make a new instance of the wrapper class (remember that our wrapper class is a QMessageBox as well), and return it.

Our wrapper class still isn't available in the scripts however. In order to do that, we need to add this one line of code:


 m_pEngine->globalObject().setProperty("QMessageBox", m_pEngine->newQObject(new Wrapper_QMessageBox, QScriptEngine::AutoOwnership));

This one line adds our wrapper class to the script global object. Once you've done this, you can now create QMessageBox instances from within a script file, like so:


 mb = new QMessageBox();
 mb.text = "Test!";
 mb.exec();

However, there are still a few issues left to resolve:

We can only use the default - empty constructor.
Only properties and slots of the QMessageBox are available from the script. For example, there's no way to set the window title text right now. If we could use the overloaded QMessageBox constructor that took the window title as a parameter then we could just specify it at creation time, but we'll need a way to export non-slot methods as well.

Let's tackle these issues one at a time:

Adding Overloaded Constructors:

This one's pretty simple. All you need to do is add another constructor that takes the additional parameters, and passes them up to the QMessageBox class. Then, you overload the qscript_call method in a similar fashion. Here's what my class looks like now:


// wrapper around QMessageBox:
class Wrapper_QMessageBox: public QMessageBox, protected QScriptable
{
 Q_OBJECT
 public:
 
 Wrapper_QMessageBox(QWidget *parent =0)
 : QMessageBox(parent) {}
 
 Wrapper_QMessageBox(Icon icon, const QString & title, const QString & text, StandardButtons buttons = NoButton, QWidget * parent = 0, Qt::WindowFlags f = Qt::Dialog | Qt::MSWindowsFixedSizeDialogHint)
 : QMessageBox(icon,title,text,buttons,parent,f) {}
 
 public slots:
  QScriptValue qscript_call(QWidget *parent = 0)
  {
   QMessageBox * const iface = new Wrapper_QMessageBox(parent);
   return engine()->newQObject(iface, QScriptEngine::AutoOwnership); 
  }
  QScriptValue qscript_call( Icon icon, const QString & title, const QString & text, StandardButtons buttons = NoButton, QWidget * parent = 0, Qt::WindowFlags f = Qt::Dialog | Qt::MSWindowsFixedSizeDialogHint )
  {
   QMessageBox * const iface = new Wrapper_QMessageBox(icon,title,text,buttons,parent,f);
   return engine()->newQObject(iface, QScriptEngine::AutoOwnership);
  }
};

You can now call this new constructor straight from your script file, just as you could earlier!

Exposing Additional Methods:

Finally, we need a way to expose arbitrary methods that are not slots in our base class. This is pretty simple. We make our own method with the same name, and call into the base class method with the parameters passed to us. Consider my wrapper class, now that I have exposed the setWindowTitle method:


// wrapper around QMessageBox:
class Wrapper_QMessageBox: public QMessageBox, protected QScriptable
{
 Q_OBJECT
 public:
 
 Wrapper_QMessageBox(QWidget *parent =0)
 : QMessageBox(parent) {}
 
 Wrapper_QMessageBox(Icon icon, const QString & title, const QString & text, StandardButtons buttons = NoButton, QWidget * parent = 0, Qt::WindowFlags f = Qt::Dialog | Qt::MSWindowsFixedSizeDialogHint)
 : QMessageBox(icon,title,text,buttons,parent,f) {}
 
 public slots:
  QScriptValue qscript_call(QWidget *parent = 0)
  {
   QMessageBox * const iface = new Wrapper_QMessageBox(parent);
   return engine()->newQObject(iface, QScriptEngine::AutoOwnership); 
  }
  QScriptValue qscript_call( Icon icon, const QString & title, const QString & text, StandardButtons buttons = NoButton, QWidget * parent = 0, Qt::WindowFlags f = Qt::Dialog | Qt::MSWindowsFixedSizeDialogHint )
  {
   QMessageBox * const iface = new Wrapper_QMessageBox(icon,title,text,buttons,parent,f);
   return engine()->newQObject(iface, QScriptEngine::AutoOwnership);
  }
  void setWindowTitle ( const QString & title )
  {
   QMessageBox::setWindowTitle(title);
  }
};

Simple!

Conclusion:

Combined with my earlier post, this should be all you need to make a Qt application extensible through ECMAScript files. I'm doing all this work to write a small game, but the techniques covered should work for all types of projects. If / when I find anything that requires updating, I will endeavour to update these posts. Since I'm still learning QtScript myself, I'm sure these pages will be filled with inaccuracies.

How to make C++ classes available in QtScript

It's been a while since I posted here. What have I been doing? Mostly working, and in my spare time trying to get my head around QtScript. What follows is a short introduction to QtScript. I'm sure that there are points of inaccuracy in this post - this is only to be expected, since I'm just starting out.

What I need for my game is a single C++ class that can defer some of its processing to a script, thereby changing the behavior of the class. For example, I want a "GameObject" class that takes a script filename as a constructor parameter, and defers certain parts of the processing to the script. In the case of the game object, I may want to defer the following functionality to a script:

Graphics loading (so different game objects look different).
AI processing.
Generic event handling.

However, not all scripts will do all of the above - some may only customize a very small amount.

The easiest way to do this is to have signals in the C++ class, that the script can choose to connect to (or not, as the case may be). As the C++ class is executed, it can emit these signals, and the corresponding slots in the script file can be called.
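Before getting to the real Qt code, here is a Qt-free sketch of the design idea, written with plain std::function callbacks instead of Qt signals and slots (it assumes a C++11 compiler). Everything in it (`Signal`, `fire`, this cut-down `GameObject`, `hooksDemo`) is made up for illustration, and a lambda stands in for the script file.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a signal: a list of callbacks to run.
class Signal
{
public:
    void connect(const std::function<void()> &handler)
    { handlers.push_back(handler); }

    // Calls every connected handler; does nothing if none are connected.
    void fire() const
    { for (const auto &handler : handlers) handler(); }

    std::size_t connectionCount() const { return handlers.size(); }
private:
    std::vector<std::function<void()>> handlers;
};

// The object exposes hooks, and the "script" decides which ones to use.
class GameObject
{
public:
    Signal loadGraphics;
    Signal runAi;

    // The script is any callable that wires up the hooks it cares about.
    explicit GameObject(const std::function<void(GameObject &)> &script)
    { script(*this); }

    // The object fires every hook without knowing which are connected.
    void update()
    {
        loadGraphics.fire();
        runAi.fire();
    }
};

// A "script" that customises AI processing only; returns the AI tick count.
int hooksDemo()
{
    int aiTicks = 0;
    GameObject monster([&aiTicks](GameObject &self) {
        self.runAi.connect([&aiTicks]() { ++aiTicks; });
    });

    monster.update();
    monster.update();

    // The graphics hook was never connected, and that's fine.
    assert(monster.loadGraphics.connectionCount() == 0);
    return aiTicks;
}
```

The point is that the object never needs to know which hooks a given script has chosen to use. In the Qt version the signal / slot machinery and the script engine take care of all this bookkeeping for you.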

The Code

The C++ code is relatively simple. Consider the following header file:


#ifndef TESTSCRIPTOBJECT_H
#define TESTSCRIPTOBJECT_H
//
#include <QObject>
#include <QScriptValue>
#include <QScriptable>
#include <QScriptEngine>
#include <QMessageBox>
//

class QScriptEngine;


//
// Test script object class - used to demonstrate the fundamentals of QtScript.
//
class testScriptObject : public QObject
{
Q_OBJECT
public:
 // ctor for this class - pass in the script engine to bind to.
 testScriptObject( QScriptEngine *pEngine, QObject *parent =0);

 // attach ourselves to a script file - this could be done inside the ctor. I
 // have chosen to use a separate method instead.
 void runScript(const QString &strProgram);
 
 signals:
 // test signal - we can emit this, and have some QtScript code run.
 void signal1();
 
public slots:
 // test slot - just displays a message box. 
 void slot1();
 // second test slot - displays the string in a message box.
 void displayMsg(QString strMsg);
 
private:
 // store a pointer to the script engine.
 QScriptEngine *m_pEngine;
 
 // store the "this" object, so we can manually call script functions if we need to.
 QScriptValue m_thisObject;
};
//
#endif // TESTSCRIPTOBJECT_H

This should all be fairly easy stuff. If it's not, you should probably look at the Qt tutorials before going any further.

Before we can use this class in any sensible manner, we need to do two things:

Create a QScriptValue object that represents the "this" pointer. We do this so that we can call a script function, and pass it the C++ class object as "this" (we'll see this later).
Open a script file and execute a "create" function, the contents of which will set up any signal / slot connections we require.

Let's dive into the C++ code, and take a look at the constructor for the above class:


testScriptObject::testScriptObject(  QScriptEngine *pEngine, QObject *parent) 
 : QObject(parent),
 m_pEngine(pEngine)
{
 // create this object in the scripting land:
 m_thisObject = m_pEngine->newQObject(this);
 
}

This is also pretty simple - the only line of any significance is the one where we create "m_thisObject".

Now on to step 2 - running the script file. Let's take a look at the runScript method:


void testScriptObject::runScript(const QString &strAppName)
{
 QFile file(strAppName + ".js");
 
 if (! file.exists())
 {
  QMessageBox::critical(0, "Error", "Could not find program file!");
  return;
 }
 
 if (! file.open(QIODevice::ReadOnly | QIODevice::Text))
 {
  QMessageBox::critical(0, "Error", "Could not open program file!");
  return;
 }
 
 QString strProgram = file.readAll();
 
 // do static check so far of code:
 if (! m_pEngine->canEvaluate(strProgram) )
 {
  QMessageBox::critical(0, "Error", "canEvaluate returned false!");
  return;
 }
 
 // actually do the eval:
 m_pEngine->evaluate(strProgram);
 
 // uncaught exception?
 if (m_pEngine->hasUncaughtException())
 {
  QScriptValue exception = m_pEngine->uncaughtException();
  QMessageBox::critical(0, "Script error", QString("Script threw an uncaught exception: ") + exception.toString());
  return;
 }
 
 QScriptValue createFunc = m_pEngine->evaluate("create");
 
 if (m_pEngine->hasUncaughtException())
 {
  QScriptValue exception = m_pEngine->uncaughtException();
  QMessageBox::critical(0, "Script error", QString("Script threw an uncaught exception while looking for create func: ") + exception.toString());
  return;
 }
 
 if (!createFunc.isFunction())
 {
  QMessageBox::critical(0, "Script Error", "createFunc is not a function!");
  // bail out - there is nothing to call
  return;
 }
 
 createFunc.call(m_thisObject);
 
 if (m_pEngine->hasUncaughtException())
 {
  QScriptValue exception = m_pEngine->uncaughtException();
  QMessageBox::critical(0, "Script error", QString("Script threw an uncaught exception while running create func: ") + exception.toString());
  return;
 }
 // now emit our test signal:
 emit signal1();
}

Again, this is all pretty simple stuff. This method does the following:

Looks for, and opens the script file specified.
Does a static check of the code (makes sure that it's syntactically correct, but does not ensure that the script will run without error).
Evaluates the script file.
Retrieves the function named "create" from the script file.
Executes this function, passing our previously created "m_thisObject" as the "this" object for the script to use. This function then binds our signals to slots in the script file.
Finally, we emit our test signal.

Finally, let's look at the ECMAScript file I'm using:



// set up the game object - the 'this' object will be an actual QObject 
// derived class passed in from the application.
function create()
{
 // test 1 - call a slot in the class:
 this.slot1();
 
 this.signal1.connect(this, testSlot);
}

// create our own slot:
function testSlot()
{
 this.displayMsg("Testing testing.. 123");
}

...and that's it! The "create" method will first call the "slot1" slot in our C++ class, then connect our test signal to a scripted function. When we emit the signal in our C++ class, the "testSlot" function will be called.

This is just a starting point. The more I use the QtScript module, the more I marvel at the possibilities created with this tool. I shall post follow-up articles with more information and techniques as I see fit. In the meantime, go add scripting extension support to your favourite project. With tools this easy to use, there's no excuse not to!
