Happy Holidays

This is the last post of 2008 - on reflection it's been a rather good year: there have been several posts on this blog worthy of a read (and several that were pure garbage), I've gotten involved in a few more projects (I hope to unveil a new one fairly soon - it's very exciting). Time will tell what 2009 has in store for us!

I'm going to visit family in the UK, and I don't expect I'll be back home before the new year. So, happy holidays to one and all - even if you've decided to work through them!


Cheers,

Jop Opening at Pebble Beach Systems

We have a couple of job openings available here at Pebble Beach Systems, due to expansion of the programming team.

If you know C++, a bit of SQL, think you know how to handle a thread, and think you have a handle on programming in general, feel free to contact me and I can pass your CV and cover letter along to the big boss-man. The best way to contact me is via email: thomir gmail com

We're looking for a graduate and a senior developer. Pebble Beach Systems is a great place to work - very friendly relaxed atmosphere with a great bunch of people.

Cheers,

Teaching Programming mk. 2

I blogged before about what I think we should teach programming students, and almost immediately wished I hadn't. Sometimes I feel that my blog posts are somewhat pointless meanderings through the garbage that inhabits my sleep-deprived brain. At other times I feel that I have contributed something useful to the general public. The post in question is firmly in the former category - but what can I do? I won't start deleting articles as soon as I fall out of favor with them, so I'm hereby correcting my earlier mistakes (at least, attempting to). Illiad Frazer knows how I feel:

The whole point of the previous post was that I felt that most graduate students were under-prepared for work in industry. My main evidence of this is that it seems to take a long time, and more importantly a lot of interviews before one strikes "candidate gold" when recruiting for a new programmer.
I will admit that this could be for many reason: perhaps our expectations are too high, perhaps we are not paying enough to attract the kind of graduate we're looking for, or perhaps the industry we're in isn't desirable enough to attract the better candidates. The list goes on endlessly - and yet I cannot ignore the fact that most graduates I meet are not up to scratch.

So what prompted this revision of a past article? I happened to read E. W. Dijkstra's article entitled "On the cruelty of really teaching computing science". In it, he postulates that the methods used by most universities are fundamentally flawed when it comes to teaching computer science, and more specifically when teaching computer programming. I'd like to quote part of this article:

...we teach a simple, clean, imperative programming language, with a skip and a multiple assignment as basic statements, with a block structure for local variables, the semicolon as operator for statement composition, a nice alternative construct, a nice repetition and, if so desired, a procedure call. To this we add a minimum of data types, say booleans, integers, characters and strings. The essential thing is that, for whatever we introduce, the corresponding semantics is defined by the proof rules that go with it.

Right from the beginning, and all through the course, we stress that the programmer's task is not just to write down a program, but that his main task is to give a formal proof that the program he proposes meets the equally formal functional specification. While designing proofs and programs hand in hand, the student gets ample opportunity to perfect his manipulative agility with the predicate calculus.
This method of programming - approaching the programming language as a kind of "predicate calculus' has it's advantages. It demands that the students pay attention to the features, rules, regulations and guarantees that the language provides. Whichever language is used (and to a certain extend it does not matter), the rules and regulations of that language are going to dictate the structure of the program. This is similar to the fact that the laws of math dictate the form of any mathematical proof; ignore the laws of the language, and your program (or proof, if you will) no longer makes sense. In the domain of integer mathematics, 2 + 3 will aways equal 5. In the domain of C++, local variables are destroyed in the reverse order that they were created in (insert whatever rule of the language you want there).

Consider for a moment my previous post; I listed 11 things which I thought were essential for any programming student to know. Looking back, I notice that the top five items are all specific to C++ (since that's the language I talk in). Is it a coincidence that the five most important things any programming student can know are specific to the language they are using? I think not.

Rather, I believe that to be a great programmer, one must have a deep understanding of the language at hand, and how that language allows you to express logical problems. One must approach a program like a mathematical problem - that is, one must know the rules of the language, and then use those rules to design a proof that conclusively solves the logical problem using the language at hand.

That last point is worth reiterating: Anyone can write a program that appears to solve a problem most of the time. However, for non-trivial problems it becomes much harder to guarantee that the program will solve the problem 100% of the time. As we get further into the "edge cases" of the application logic it becomes less likely such a naive implementation will work correctly. However, a program that has been built from the ground up using the guaranteed behavior of the language can still contain bugs, but it's much more likely that they are logic errors introduced by the programmer, rather than subtle bugs introduced through language misuse.

At this point I must point out that I do not believe that Dijkstra's idea is as good as he makes it sound. He addresses one point - that students should understand the rules of the language, but a "love of the language"is only half the picture. There are also many non-language related skills that come in to play. Consider debugging for example; there are formal techniques that can be used to debug certain types of errors. Knowing these techniques, and knowing when to employ them is a powerful aid in any language, and these are skills that should be taught, rather than learned in an ad hoc approach.

So, my revised top 10 list of things every programming student should know can now be revised into this, much shorter form:
  1. Know your language. I don't care what your language is - if you want a job it had better be something that's being used, but you can be a great programmer even if all you know is an out-dated language. Not only do you need to know your language, you need to have a passion for knowing your language - you must actively want to extend your knowledge of the language and how it works, what guarantees it provides and which it doesn't. This knowledge will translate into programs that use the features of the language to create minimal, efficient, well structured and error-free programs.

  2. Be willing to learn new techniques. There are so many useful techniques and skills for a new programmer to have that I cannot list them all here, and course designers cannot possibly include them all in their course material.

That's it - two things. Much better than the self-absorbed tripe I rattled off a few weeks ago. To anyone who actually bothered to read that, I apologize profusely.

All BLacks vs. Wales

I don't normally write about sport, but I thought I'd mention that I'm about to start the long drive to Wales to see the match this weekend. My money's on the All Blacks to take the match, but many people have suggested that it may be a close call.

Will post an update upon my return!

EDIT: Well, I'm back. The Rugby was awesome, possibly the best game I've ever seen. I wanted to post a link to a highlights video on YouTube, but it seems no one has posted one yet. Instead, I'll post a video of the initial stand-off between the teams after the Haka. The atmosphere was incredible - a full house (74 thousand strong) all screaming at the top of their lungs. Good fun. Anyway, you can see it here:

The data: URL scheme

Here's something you may not already know: You can include data directly in an (x)html page, instead of referencing it externally.

For example, most of the time when you want to display an image you would write code like this:

<img src="http://www.blogger.com/some_image.png" alt="some random image" />


Web browsers downloading your HTML source will download the text first, then download any external references, including "some_image.png" (assuming the user has not turned off image downloading).However, there are a few cases where you want to distribute an HTML file with images, but don't want to distribute multiple files. In those cases, the 'data:' URL scheme is what you need.

The scheme is documented in the (very readable) RFC2339. Essentially, you can include the binary data straight into your HTML code. The example they give in the RFC looks like this:


<img src="%20%20%20AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz%20%20%20ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp%20%20%20a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl%20%20%20ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis%20%20%20F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH%20%20%20hhx4dbgYKAAA7" alt="Larry" />


Which equates to this image:

Larry

There are many reasons why you wouldn't want to do this - it increases the size of your HTML file, forcing users to download more before they can see whether your content is what they want (especially if the embedded data is near the beginning of the file). There are also some limitations on the size of data and those limitations vary depending on where this technique is used. Still, it's a useful technique that can be used when you need to embed small amounts of binary data within an HTML file and you don't want to distribute multiple files.

Ten Things to Teach Programming Students

While talking to a friend recently, we began discussing the role of graduates in the industry. My belief is that employers employ graduates and expect them to have the same skill level as their existing, trained employees (I have certainly seen this first-hand). Having been on the "other side" of the problem I appreciate that graduates are rarely fit for the tasks set for them without further training.

This got me thinking: If there were 10 things graduates should know before graduating, what should they be? What short list of skills can graduates teach themselves to become better than their competition (and getting that first job is just that: a competition). That train of thought spawned the following list:

Ten things programming students should know before graduating:
  1. Inheritance & Composition. In the land of OO, you must know what inheritance does for you. In C++, this means that you must know what public, protected and (rarely used) private inheritance means. If class A is publically inherited form class B, that does that tell you about the relationship between A and B? What about if the inheritance was protected, rather than public? In a similar vein, what does virtual inheritnace do, and when would you want to use it? Sooner or later a graduate programmer will discover a complex case of multiple inheritance, and they need to be able to cope with it in a logical fashion. Knowing the answers to the above questions will help.
    Unfortunately, a lot of the time inheritance is over-used. Just because we have access to inheritance, doesn't mean we should use it all the time! Composition can be a useful tool to provide clean code where inheritance would muddy the waters. Composition is such a basic tool that many graduates don't even think of it as a tool. Experience will teach when to use composition and when to use inheritance. Graduates have to know that both can be solutions to the same problem.

  2. Memory Allocation. So many graduates do not understand the importance of cleaning up after yourself. Some do not fully appreciate the difference between creating objects on the stack and on the heap. Some know that but fail to understand how memory can be leaked (exceptions are a frequent cause of memory leaks in novice programmers). Every programmer should know the basic usage of new, new[], delete and delete[], and should know when and how to use them.

  3. Exceptions. Most programmers share a love / hate relationship with exceptions; You gotta know how to catch them, but at the same time you tend to avoid using them yourself. Why? Because exceptions should be .... exceptional! There's a reasonably large amount of overhead associated with throwing and catching exceptions. Using exception as return values or flow-control constructs are two examples of exception mis-use. Exceptions should be thrown only when the user (or programmer) does something so bad that there's no way to easily fix or recover from it. Running out of resources (whether it be memory, disk space, resource Ids or whatever) is a common cause for exceptions to be thrown.

  4. Const correctness. Const correctness is so simple, yet so many programmers just don't bother with it. The big advantage of const-correctness is that it allows the compiler to check your code for you. By designating some methods or objects const you're telling the compiler "I don't want to change this object here". If you do accidentally change the object the compiler will warn you.

  5. Threading. Threading is hard. There's no simple way around this fact. Unfortunately, the future of PC hardware seems to be CPUs with many cores. Programs that do not make use of multiple threads have no way to make use of future hardware improvements. Even though using libraries like Qt that make it ridiculously easy to create threads and pass data between threads, you still need to understand what a thread is, and what you can and cannot do. A very common thing I see in new programmers is a tendency to use inadequate synchronization objects in threads. Repeat after me: "A volatile bool is not a synchronization object!".

  6. Source control. Every programmer on the planet should know how to use at least one version control system. I don't care if it's distributed or not, whether it uses exclusive locks or not, or even if it makes your tea for you. The concepts are the same. Very few professional programmers work alone. Graduates must be able to work in a team - that includes managing their code in a sensible fashion.

  7. Compiler vs Linker. Programmers need to understand that compiling an application is a two step process. Compilation and Linking are two, discreet, and very different steps. Compiler errors and Linker errors mean very different things, and are resolved in very different ways. Programmers must know what each tool does for them, and how to resolve the most common errors.

  8. Know how to debug. When something goes wrong, you need to know how to fix it. Usually, finding the problem is 90% of the work, fixing it is 5% of the work, and testing it afterwards is another 10%. No, that's not a typo - it does add up to more than 100%, which is why there's a lot of untested code out there! Of course, if you were really good you wouldn't write any bugs in the first place!

  9. Binary Compatibility. This one is for all those programmers that write library code, or code that gets partially patched over time. As you probably already know, shared libraries contain a table of exported symbols. If you change that table so a symbol is no longer available (or it's signature changes), code that uses that symbol will no longer work. There's a list of things you can and cannot do while maintaining binary compatability, and it's very hard not to break those rules, even if you know what you're doing. I've blogged about this before, and linked to the KDE binary compatibility page on techbase - worth a read!
    The main method of maintaining binary compatibility is to program to an interface, rather than to an implementation. Once you start paying attention to binary compatibility, you'll quickly realise that it's a very bad idea to export your implementation from a shared library, for one simple reason: If you want to change your implementation you're stuck with the restrictions placed upon you by the need to maintain binary compatibility. If all you export is a pure interface and a means to create it (possibly via a factory method) then you can change the implementation to your heart's content without having to resort to pimpl pointers.

  10. Read the right books. There are a few movers and shakers in the programming industry that it pays to keep an eye on. There are many books worth reading, but I'm going to recommend just two. The first is "Design Patterns: Elements of Reusable Object-Oriented Software", and the second is the "Effective C++" series. Neither are considered to be great bedtime reading, both are considered to be packed from cover to cover with things that will help you out in every-day situations. Any programmer worth his or her salt will own a copy of at least one of these books, if not both. Of course, there are books on UI design and usability, threading, text searching, SQL and database maintenance, networking, hardware IO, optimisation, debugging... the list goes on.

  11. Networking. What's this? An 11th item? That's right: it's in here because it cannot be ignored in most programming tasks. It's getting harder and harder to avoid networking. Most graduates will have to write code that sends data over a network sooner or later, so they'll need to know the difference between UDP and TCP and IP, as well as what the basic network stack looks like (think "Please Do Not Touch Steve's Pet Alligator"), and what each layer does. Being familiar with tools like wireshark helps here.

What's not in the list:

You may notice that I haven't included any specific technologies in this list. That's because I firmly believe that it really doesn't matter. Sure, there are some libraries that are better than others (I'd bet my life on a small set of libraries), but the programmer next to me has a different set. I care not one grote whether a graduate knows how to program in .NET, Qt, wxWidgets or anything else - as long as they're willing to learn something new (whatever I'm using on my project).

Which brings me nicely to the conclusion: The single quality I see in all the programmers I admire is a sense of curiosity; a restlessness and a sense of adventure. Our industry is constantly shifting. The best programmers are able to ride the changes and come out better for it.

Is this post horribly self-indulgent and boring? Probably, but it had to be done. Have I forgotten anything? Things you feel should be on the list that are missing? Remember that the point of the exercise is to keep a small list - I could list every programming skill and technology required under the sun, but that would not be very useful would it?


Releases Galore!

It would be negligent of me if I did not point out that today several important software releases were made:

The first is the project formerly known as project greenhouse - now known as Qt Creator. I've blogged about this before. You can now download a technical preview. I'm very excited about this - having played with it in it's beta state I can't wait to use it with some of my active projects. Unfortunately I won't be doing any coding this weekend, as I'm off to Switzerland for a long weekend. I'll have to try it out when I get back.

The second big release event today is the Ubuntu family of distributions. That's right, version 8.10 is out now. I'm a kubuntu man myself, so I'll be trying this out after my long weekend as well.

That's all from me - I have several projects in the wings to blog about in the coming weeks, but for now I need to catch some sleep - my taxi arrives at 5:00 AM tomorrow.

Cheers,