
Visualising the Ubuntu Package Repository

Like most geeks, I like data. I also like making pretty pictures. This weekend, I found a way to make pretty pictures from some data I had lying around.

The Data: Ubuntu Packages

Ubuntu is made up of thousands of packages. Each package contains a control file that provides some meta-data about the package. For example, here is the control data for the autopilot tool:

Package: python-autopilot
Priority: optional
Section: universe/python
Installed-Size: 1827
Maintainer: Ubuntu Developers 
Original-Maintainer: Thomi Richards 
Architecture: all
Source: autopilot
Version: 1.2daily12.12.10-0ubuntu1
Depends: python (>= 2.7.1-0ubuntu2), python (<< 2.8), gir1.2-gconf-2.0, gir1.2-glib-2.0, gir1.2-gtk-2.0, gir1.2-ibus-1.0, python-compizconfig, python-dbus, python-junitxml, python-qt4, python-qt4-dbus, python-testscenarios, python-testtools, python-xdg, python-xlib, python-zeitgeist
Filename: pool/universe/a/autopilot/python-autopilot_1.2daily12.12.10-0ubuntu1_all.deb
Size: 578972
MD5sum: c36f6bbab8b5ee10053b63b41ad7189a
SHA1: 749cb0df1c94630f2b3f7a4a1cd50357e0bf0e4d
SHA256: 948eeee40ad025bfb84645f68012e6677bc4447784e4214a5512786aa023467c
Description-en: Utility to write and run integration tests easily
 The autopilot engine enables to ease the writing of python tests
 for your application manipulating your inputs like the mouse and
 keyboard. It also provides a lot of utilities linked to the X server
 and detecting applications.
Homepage: https://launchpad.net/autopilot
Description-md5: 1cea8e2d895c31846b8d3482f96a24d4
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu

As you can see, there's a lot of information here. The bit I'm interested in is the 'Depends' line. This lists all the packages that are required in order for this package to work correctly. When you install autopilot, your package manager will install all its dependencies, the dependencies of all those packages, and so on. This is, in my opinion, the best feature of a modern Linux distribution compared with Windows.

Packages and their dependencies form a directed graph. My goal is to make pretty pictures of this graph to see if I can learn anything useful.
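To make the directed-graph idea concrete, here is a minimal sketch that turns a 'Depends' line into edges using nothing but plain dictionaries. The `parse_depends` helper and the trimmed `control` dictionary are illustrative assumptions, not part of the real script, and the snippet is written so it also runs under Python 3:

```python
import re


def parse_depends(depends):
    """Return the package names in a Depends string, dropping version constraints."""
    names = []
    # Dependencies are separated by commas; alternatives by a pipe.
    for part in re.split(r"[,|]", depends):
        part = part.strip()
        if part:
            # "python (>= 2.7.1-0ubuntu2)" -> "python"
            names.append(part.split()[0])
    return names


# A trimmed, illustrative subset of the control data shown above.
control = {
    "python-autopilot": "python (>= 2.7.1-0ubuntu2), python (<< 2.8), python-dbus, python-xlib",
}

# Adjacency list: each package maps to the set of packages it depends on,
# so every edge points from the dependent package to its dependency.
edges = {}
for package, depends in control.items():
    edges[package] = set(parse_depends(depends))

print(sorted(edges["python-autopilot"]))
```

Using a set means the two versioned 'python' entries collapse into a single edge, which is what we want for drawing.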

Process: The Python

First, I wanted to extract the data from the apt package manager and create a graph data structure I could fiddle with. Using the excellent graph-tool library, I came up with this horrible horrible piece of python code:

#!/usr/bin/env python

from re import match, split
from subprocess import check_output
from debian.deb822 import Deb822
from graph_tool.all import *

graph = Graph()
package_nodes = {}
package_info = {}


def get_package_list():
    """Return a list of packages, both installed and uninstalled."""
    output = check_output(['dpkg','-l','*'])
    packages = []
    for line in output.split('\n'):
        parts = line.split()
        if not parts or not match('[uirph][nicurhWt]', parts[0]):
            continue
        packages.append(parts[1])
    return packages


def get_package_info(pkg_name):
    """Get a dict-like object containing information for a specified package."""
    global package_info
    if pkg_name in package_info:
        return package_info.get(pkg_name)
    else:
        try:
            yaml_stream = check_output(['apt-cache','show',pkg_name])
        except:
            print "Unable to find info for package: '%s'" % pkg_name
            package_info[pkg_name] = {}
            return {}
        d = Deb822(yaml_stream)
        package_info[pkg_name] = d
        return d


def get_graph_node_for_package(pkg_name):
    """Given a package name, return the graph node for that package. If the graph
    node does not exist, it is created, and its meta-data filled.

    """
    global graph
    global package_nodes

    if pkg_name not in package_nodes:
        n = graph.add_vertex()
        package_nodes[pkg_name] = n
        # add node properties:
        pkg_info = get_package_info(pkg_name)
        graph.vertex_properties["package-name"][n] = pkg_name
        graph.vertex_properties["installed-size"][n] = int(pkg_info.get('Installed-Size', 0))
        return n
    else:
        return package_nodes.get(pkg_name)


def get_sanitised_depends_list(depends_string):
    """Given a Depends string, return a list of package names. Versions are
    stripped, and alternatives are all shown.

    """
    if depends_string == '':
        return []
    parts = split('[,\|]', depends_string)
    return [ match('(\S*)', p.strip()).groups()[0] for p in parts]

if __name__ == '__main__':
    # Create property lists we need:
    graph.vertex_properties["package-name"] = graph.new_vertex_property("string")
    graph.vertex_properties["installed-size"] = graph.new_vertex_property("int")

    # get list of packages:
    packages = get_package_list()

    # graph_nodes = graph.add_vertex(n=len(packages))
    n = 0
    for pkg_name in packages:
        node = get_graph_node_for_package(pkg_name)
        pkg_info = get_package_info(pkg_name)
        # purely virtual packages won't have a package info object:
        if pkg_info:
            depends = pkg_info.get('Depends', '')
            depends = get_sanitised_depends_list(depends)
            for dependancy in depends:
                graph.add_edge(node, get_graph_node_for_package(dependancy))
        n += 1
        if n % 10 == 0:
            print "%d / %d" % (n, len(packages))

    graph.save('graph.gml')

Yes, I realise this is terrible code. However, I also wrote it in 10 minutes, and I'm not planning on using it for anything serious - this is an experiment!

Running this script gives me a 2.6MB .gml file (it also takes about half an hour - did I mention that the code is terrible?). I can then import this file into gephi, run a layout algorithm over it for the best part of an hour (during which time my laptop starts sounding a lot like a vacuum cleaner), and start making pretty pictures!

The Pretties:

Without further ado - here's the first rendering. This is the entire graph. The node colouring indicates the node degree (the number of edges connected to the node) - blue is low, red is high. Edges are coloured according to their target node.

These images are all rendered small enough to fit on the web page. Click on them to get the full image.

A few things are fairly interesting about this graph. First, there's a definite central node, surrounded by a community of other packages. This isn't that surprising - most things (everything?) rely on the standard C library eventually.
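Finding that central node doesn't actually need a picture. As a rough sketch, counting edges per node does the job. The edge list below is made up for illustration and is not taken from the real graph:

```python
from collections import Counter

# Hypothetical (dependent, dependency) pairs, for illustration only.
edges = [
    ("bash", "libc6"),
    ("python2.7", "libc6"),
    ("python", "python2.7"),
    ("python-dbus", "python"),
    ("python-dbus", "libc6"),
]

# In-degree: how many packages depend on each node.
in_degree = Counter(dependency for _, dependency in edges)

# Total degree: every edge touching the node, which is what the colouring uses.
degree = Counter()
for dependent, dependency in edges:
    degree[dependent] += 1
    degree[dependency] += 1

most_depended_on = in_degree.most_common(1)[0]
print(most_depended_on)
```

In this toy data the C library comes out on top, which matches what the rendering suggests for the real graph.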

The graph has several other distinct communities as well. I've produced a number of images below that show the various communities, along with a short comment.

C++

These two large nodes are libgcc1 (top), and libstdc++ (bottom). As we'll see soon, the bottom-right corner of the graph is dominated by C++ projects.

Qt and KDE

This entire island of nodes is made up of the Qt and KDE libraries. The higher nodes are the Qt libraries (QtCore, QtGui, QtXml etc), and the nodes lower down are KDE libraries (kcalcore4, akonadi, kmime4 etc).

Python

The two large nodes here are 'python' and 'python2.7'. Interestingly, 'python3' is a much smaller community, just above the main python group.

System

Just below the python community there's a large, loosely-connected network of system tools. Notable members of this community include the Linux kernel packages, upstart, netbase, adduser, and many others.

Gnome

This is GNOME. At its core is 'libglib', and it expands out to libgtk, libgdk-pixbuf (along with many other libraries), and from there to various applications that use these libraries (gnome-settings-daemon for example).

Mono

At the very top of the graph, off on an island by themselves, are the mono packages.

Others

The wonderful thing about this graph is that the neighbourhoods are fractal. I've outlined several of the large ones, but looking closer reveals small clusters of related packages. For example: multimedia packages:

This is Just the Beginning...

This is an amazing dataset, and really this is just the beginning. There are a number of things I want to look into, including:

Adding 'Recommends' and 'Suggests' links between packages - with a lower edge weight than Depends.
Colour coding nodes according to which repository section the package can be found in.
Trying to categorise libraries vs applications - do applications end up clustered like libraries do?

I'm open to suggestions however - what do you think I should look into next?

Experimenting with C++ std::make_shared

C++11 is upon us, and one of the more utilitarian changes in the new standard is the inclusion of the new smart pointer types: unique_ptr, shared_ptr and weak_ptr. An interesting related feature is std::make_shared - a function that returns a std::shared_ptr wrapping a type you specify. The documentation promises efficiency gains by using this method. From the documentation:

This function allocates memory for the T object and for the shared_ptr's control block with a single memory allocation. In contrast, the declaration std::shared_ptr<T> p(new T(Args...)) performs two memory allocations, which may incur unnecessary overhead.

I was curious: How much faster is make_shared than using new yourself? Like any good scientist, I decided to verify the claim that make_shared gives better performance than new by itself.

I wrote a small program and tested it. Here's my code:

#include <memory>
#include <string>

class Foo
{
public:
    typedef std::shared_ptr<Foo> Ptr;

    Foo()
    : a(42)
    , b(false)
    , c(12.234)
    , d("FooBarBaz")
    {}

private:
    int a;
    bool b;
    float c;
    std::string d;
};

const int loop_count = 100000000;
int main(int argc, char** argv)
{
    for (int i = 0; i < loop_count; i++)
    {
#ifdef USE_MAKE_SHARED
        Foo::Ptr p = std::make_shared<Foo>();
#else
        Foo::Ptr p = Foo::Ptr(new Foo);
#endif
    }
    return 0;
}

This is pretty simple - we either allocate 100 million pointers using new manually, or we use the new make_shared. I wanted my 'Foo' class to be simple enough to fit into a couple of lines, but contain a number of different types, and at least one complex type. I built both variants of this small application with g++, and used the 'time' utility to measure its execution time. I realise this is a pretty crude measurement, but the results are interesting nonetheless:

My initial results are confusing - it appears as if std::make_shared is slower than using new. Then I realised that I had not enabled any optimisations. Sure enough, adding '-O2' to the g++ command line gave me some more sensible results:

OK, so make_shared only seems to be faster with optimisations turned on, which is interesting in itself. At this point, I started wondering how other compilers would fare. I decided to pick on clang and run exactly the same tests once more:

Once again we see a very similar pattern between the optimised and non-optimised code. We can also see that clang is slightly slower than g++ (although it was significantly faster at compiling). For those of you who want the numbers:

Now I have evidence for convincing people to use make_shared in favor of new!
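For anyone wanting to repeat the experiment, the build matrix (two compilers, with and without '-O2', with and without USE_MAKE_SHARED) can be sketched as below. The exact command lines I used are not shown in this post, so treat everything here as an assumption: the source file name, the output names, and the use of '-std=c++11' (older compilers may need '-std=c++0x' instead).

```python
import itertools


def build_command(compiler, optimise, use_make_shared, source="make_shared_test.cpp"):
    """Return the argument list for one build variant (file names are hypothetical)."""
    cmd = [compiler, "-std=c++11"]
    if optimise:
        cmd.append("-O2")
    if use_make_shared:
        # Selects the std::make_shared branch of the #ifdef in the test program.
        cmd.append("-DUSE_MAKE_SHARED")
    output = "%s_%s_%s" % (
        compiler.replace("+", "p"),
        "O2" if optimise else "O0",
        "make_shared" if use_make_shared else "new",
    )
    return cmd + [source, "-o", output]


# Every combination of compiler, optimisation setting and allocation style.
commands = [
    build_command(compiler, optimise, use_make_shared)
    for compiler, optimise, use_make_shared in itertools.product(
        ["g++", "clang++"], [False, True], [False, True]
    )
]

for cmd in commands:
    print(" ".join(cmd))
```

Each resulting binary can then be run under the 'time' utility, as described above.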

Python GObject Introspection oddness

I recently ported indicator-jenkins to Gtk3 using the python GObject Introspection Repository (gir) bindings. Ted Gould did most of the work, I just cleaned some bits up and made sure everything worked. One issue that puzzled me for a while is that the GObject library changed the way its "notify" signal works between GObject 2 and GObject 3. I've not seen any documentation of this change, so I'll describe it here.

For this example, let's make a very simple class that has a single property:

import gobject

class MyClass(gobject.GObject):
    prop = gobject.property(type=int)

...and a very simple callback function that we want to call whenever the value of 'prop' changes:

def cb(sender, prop):
    print "property '%s' changed on %r." % (prop.name, sender)

Finally, with GObject 2 we can create an instance of 'MyClass' and connect to the 'notify' signal like this:

inst = MyClass()
inst.connect("notify", cb)
inst.prop = 42

When we run this simple program we get the following output:

property 'prop' changed on .

... which is what we expected. However, if we port this code to GObject 3, it should look like this:

from gi.repository import GObject

class MyClass(GObject.GObject):
    prop = GObject.property(type=int)

def cb(sender, prop):
    print "property '%s' changed on %r." % (prop.name, sender)

inst = MyClass()
inst.connect("notify", cb)
inst.prop = 42

However, running this gives an error:

/usr/lib/python2.7/dist-packages/gi/_gobject/propertyhelper.py:171: Warning: g_value_get_object: assertion `G_VALUE_HOLDS_OBJECT (value)' failed
  instance.set_property(self.name, value)
Traceback (most recent call last):
  File "gobject3.py", line 8, in cb
    print "property '%s' changed on %r." % (prop.name, sender)
AttributeError: 'NoneType' object has no attribute 'name'

The 'prop' parameter in the callback is set to None.

There is a solution however - connecting the callback to a more specific notification signal works as expected:

from gi.repository import GObject

class MyClass(GObject.GObject):
    prop = GObject.property(type=int)

def cb(sender, prop):
    print "property '%s' changed on %r." % (prop.name, sender)

inst = MyClass()
inst.connect("notify::prop", cb)
inst.prop = 42

It took me a while to figure this out - hopefully I've saved someone else that work.

How to Compile Unity from Source

These instructions will help you build unity from source. However, there are a few things to consider:

I recommend that you never copy anything you've built locally outside your home directory. Doing so is asking for trouble, especially as we're building the entire desktop shell. If you manage to ruin your system-wide desktop shell you'll be a very sad programmer!
I'm assuming that you're running the precise Ubuntu release (still in alpha at the time of writing, but very usable).
I'm also assuming that you want to build unity from trunk (that is, lp:unity).

Without further ado, let's get to it:

Getting the source code

If you don't already have Bazaar installed, install it now:

sudo apt-get install bzr

You may want to make yourself a folder for the unity code. I tend to do something like this:

mkdir -p ~/code/unity
cd ~/code/unity

Let's grab the code from launchpad:

bzr branch lp:unity trunk

This may take a while. If you prefer to use Bazaar checkouts instead of branches, that's fine too.

Installing Build Dependencies

We need to get the build dependencies for unity. Thankfully, apt-get makes this trivial:

sudo apt-get build-dep unity

Compiling Unity

I have a set of bash functions that makes this step significantly easier. To use them, copy the following bash code into a file in your home directory called ".bash_functions":

function recreate-build-dir()
{
   rm -rf build
   mkdir build
   cd build
}

function remake-autogen-project()
{
    ./autogen.sh --prefix=/home/thomi/staging --enable-debug
    make clean && make && make install
}

function remake-unity()
{
    recreate-build-dir
    cmake .. -DCMAKE_BUILD_TYPE=Debug -DCOMPIZ_PLUGIN_INSTALL_TYPE=local -DCMAKE_INSTALL_PREFIX=/home/thomi/staging/ -DGSETTINGS_LOCALINSTALL=ON
    make  && make install
}

function unity-env
{
 export PATH=~/staging/bin:$PATH
 export XDG_DATA_DIRS=~/.config/compiz-1/gsettings/schemas:~/staging/share:/usr/share:/usr/local/share
 export LD_LIBRARY_PATH=~/staging/lib:${LD_LIBRARY_PATH}
 export LD_RUN_PATH=~/staging/lib:${LD_RUN_PATH}
 export PKG_CONFIG_PATH=~/staging/lib/pkgconfig:${PKG_CONFIG_PATH}
 export PYTHONPATH=~/staging/lib/python2.7/site-packages:$PYTHONPATH
}

Note: You will need to replace all instances of "/home/thomi" with your own home directory path!

Now run this in a terminal:

echo ". ~/.bash_functions" >> ~/.bashrc

This ensures that the next time you open a bash shell the functions listed above will be available to you. To avoid having to close and re-open a terminal, we can read them manually just this once:

. ~/.bash_functions

You should now be able to run:

remake-unity

from the trunk/ directory we created earlier. That's it - you're building unity!

Not so Fast!

Chances are, while trying to build unity, you found that it needed a newer version of one of the several supporting projects than you had installed. At the time of writing, you can't compile unity without first building nux from source. Thankfully, that's pretty easy with the use of the functions you now have set up.
First we get the source code:

mkdir -p ~/code/nux
cd ~/code/nux
bzr branch lp:nux trunk
cd trunk

Then we need to get the build dependencies for nux.

sudo apt-get build-dep nux

Unfortunately there are a few packages missing, so you'll want to install them as well:

sudo apt-get install gnome-common libibus-1.0-dev libgtest-dev google-mock libxtst-dev

Then we use the functions above to build nux:

remake-autogen-project

That's it! You can then go back and build unity - hopefully this time with better success.

Build Notes

You may have noticed that the remake-* scripts do a complete rebuild every time. If you'd prefer to just build the files that have changed since last time, change to the trunk/build/ directory, and run:

make && make install

Running Unity

If you'd like to run the version of unity you've built, rather than the system-wide version, open a terminal and run the following commands:

unity-env
unity --replace &

The first line patches several environment variables such that unity will subsequently be launched from your local staging directory. These environment variables will remain changed until you close the terminal, so you need only run unity-env once.

Sloecode: now with 100% more website

Sloecode now has a real website! It's a rather minimalist affair at present, but it's better than nothing!

Since I last blogged, the setup instructions for the Sloecode server have changed a lot. The new instructions are available on the new site.

Sloecode: Now with Ubuntu Packages!

We're still working towards a sloecode 1.0 release. We're now dangerously close - we have some UI tweaks to complete, and a few other bits and pieces. However, we now have Ubuntu packages for both the client and server components, ready for you to test!

What is Sloecode?

Sloecode is an open source project, hosted on launchpad.net that aims to provide a comprehensive, installable code forge. A "code forge" is a set of tools that help groups of people write software (think sourceforge, launchpad etc). It typically includes a revision control system, and optionally project wikis, bug tracking, issue tracking, feature planning and any other features people might need. However, it's important to note that we're not trying to replicate Launchpad! Launchpad is a great service, and we don't want to compete with it. Instead, we're providing a tool for people who:

Don't want their code to be public - either because it's commercial software or because it's not ready yet. You can use sloecode in your business, college or university without having to release your software to the wider world. You can do this on launchpad, but you must pay for the privilege. The project was born out of the need to have a code forge for an educational environment: putting students' work online is not an option.
Want their code forge system to be hosted locally. For large code bases, the communication times between a local machine and the central launchpad.net servers can become expensive - especially in locations where the Internet connection is poor. Running sloecode is easy (as we'll see soon) and gives you complete control over the system. You are responsible for server maintenance, backup, and general administration.

Compared to Launchpad, we're aiming for a completely different user base.

How does it Work?

Sloecode is made up of two distinct parts:

The Bazaar/SSH smart server is a python-twisted application that understands the SSH protocol, and runs bzr-serve on demand for users. Every time you interact with the server using a Bazaar client application you're actually talking to the smart server. We need this component for several reasons: first, we don't want users to require a system account on the Linux server (which is the usual way of setting up a Bazaar server), but we do still want to use SSH public/private key authentication. This means the SSH server needs to know about our database and be able to retrieve keys from it on demand.

The Sloecode Web App is the front-end for the whole system. It's written in pylons, and makes use of a whole host of other libraries including jinja2 (page templates), formencode (input validation), YUI3 (javascript components), repoze.what and repoze.who (authorisation and authentication), and a whole lot more. The web-app provides control for managers and administrators to create projects and users and assign users to projects (with different levels of access). Regular users can see basic information about their bazaar branches (both personal repositories and project repositories), and manage their SSH keys.

The plan for the future is to add optional components - the key word being optional. One of our design goals is to make sloecode easy to install; this includes minimising install-time dependencies.

The key thing to understand about Sloecode is that we're not writing the VCS ourselves - we're simply making Bazaar easier to install and manage.

Enough already, give me the server!

Sloecode has not yet been released, but you are welcome to participate in testing. To set up the server, you'll first need the sloecode PPA in your sources.list. To achieve this, run the following command:

sudo add-apt-repository ppa:sloecode

Then update your package lists:

sudo apt-get update

You can now install the sloecode server package:

sudo apt-get install sloecode

The server package should install all the package dependencies. However, before you can run the server, you need to set up the database back-end. The default back-end is sqlite, which is fine for testing or for very low-use installations. Sloecode uses sqlalchemy, which allows us to use any number of database back-ends. This, and many other configuration details, can be edited by tweaking this file:

/etc/default/sloecode-production.ini

Once you're happy with the settings, you need to create the database tables. Do this by running:

sudo paster setup-app /etc/default/sloecode-production.ini

This command will read the values present in the configuration file, and create the default database tables for you. You only need to run this command once (to create the tables in your database)!

If you have edited any of the other values, you will want to restart the server:

sudo service sloecode restart

By default, log files are in /var/log/sloecode, Bazaar repositories are created in

/srv/sloecode/bzr-repos

and the default database is a simple SQLite file (which won't handle any kind of heavy workload or concurrency, so you may wish to change that). The RSA public/private key pair the smart server uses to communicate with clients is stored in

/var/sloecode/keys

And the clients?

Client setup is much simpler. Simply follow these steps:

Make sure you have an account created for you in the sloecode web interface. Log in to the web interface with your username and password.
Click on the "Manage SSH Keys" link on your home page. You need to paste your SSH public key in the form provided. This allows your Bazaar client to authenticate with the sloecode server. If you don't have an SSH key, you can generate one by running:
```
ssh-keygen
```
and view the public key by running:
```
cat ~/.ssh/id_rsa.pub
```
All client machines need the 'bzr' and 'bzr-sloecode' packages installed. Assuming you have the sloecode PPA installed (see instructions for doing this in the server section, above), you can run:
```
sudo apt-get install bzr bzr-sloecode
```
Finally, you need to tell the Bazaar sloecode plugin where the sloecode server is. The client plugin looks for an environment variable called "SLOECODE_SERVER", and will complain if it is not found. The easiest way to set this environment variable is to edit your "~/.bashrc" file and add this line to the end:
```
export SLOECODE_SERVER=domain.of.your.server.com
```
The value of this variable should be either a domain name, OR an IP address that points to the sloecode server you wish to use. For example, if the server is installed on the local network you might have the following:
```
export SLOECODE_SERVER=192.168.1.10
```
Note that you must not add any network protocol specification - the sloecode client plugin takes care of that for you. If you are running the sloecode ssh service on a port other than 22, you must add this port to the end of the string, like so:
```
export SLOECODE_SERVER=192.168.1.10:4022
```
Finally, if your username on the sloecode server is different from your local computer username, you need to tell the Bazaar sloecode plugin what username it should use. To do this, run the command:
```
bzr sc-login sloecode_username
```
Where "sloecode_username" is the username you use to log in to the sloecode web interface. For example, on my machine the command is:
```
bzr sc-login thomir
```
If you run the command without a username it will tell you what username is currently set.
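The rules for the SLOECODE_SERVER value described above (a host name or IP address, an optional port, and no protocol prefix) can be sketched as follows. This is only an illustration of the format: `parse_server` is a hypothetical helper, not the plugin's actual code, and it makes no attempt to handle IPv6 addresses.

```python
def parse_server(value, default_port=22):
    """Split 'host' or 'host:port' into (host, port), rejecting protocol prefixes."""
    if "://" in value:
        raise ValueError("do not include a protocol in %r" % value)
    host, separator, port = value.partition(":")
    if not host:
        raise ValueError("missing host in %r" % value)
    # Fall back to the standard SSH port when none is given.
    return host, int(port) if separator else default_port


print(parse_server("domain.of.your.server.com"))
print(parse_server("192.168.1.10:4022"))
```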

That's it! Everything is all set up. Now you can start using the sloecode server.

Bazaar / Sloecode Basics

I won't cover the details of how to use Bazaar. If you need that instruction, look at the official Bazaar documentation. However, here are a few pointers:

Every user on the sloecode server has a personal repository. To push code to your personal repository, do this:
```
bzr push sc:~sloecode_username/branch_name
```
By default personal repositories are created with no branches, so whenever you push or pull from a personal repository you must always specify a branch name!
Personal repositories are private - you are the only one who can read or write to your personal repository. If you need to share your code with someone else, you need to store it in a project repository.
To pull code from a project repository, the command looks like this:
```
bzr pull sc:project_name/branch_name
```
Note that there is no tilde character before the project name. Also note that you need not specify the branch name if you want the special branch called "trunk". For example, if I want a copy of the trunk branch of project "elastic", I'd run the following command:
```
bzr pull sc:elastic
```
However if I want a specific branch called "fix-bug-1234", I'd run the following:
```
bzr pull sc:elastic/fix-bug-1234
```
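The naming rules above (a tilde for personal repositories, which always need a branch name, and an implied "trunk" branch for projects) can be summarised in a short sketch. The `parse_sc_location` function is a hypothetical illustration of the scheme, not how the plugin itself is implemented:

```python
def parse_sc_location(location):
    """Return (kind, name, branch) for an 'sc:' location string."""
    prefix = "sc:"
    if not location.startswith(prefix):
        raise ValueError("not a sloecode location: %r" % location)
    name, _, branch = location[len(prefix):].partition("/")
    if name.startswith("~"):
        # Personal repositories have no default branch.
        if not branch:
            raise ValueError("personal repositories need an explicit branch name")
        return ("personal", name[1:], branch)
    # Project repositories fall back to the special "trunk" branch.
    return ("project", name, branch or "trunk")


print(parse_sc_location("sc:elastic"))
print(parse_sc_location("sc:elastic/fix-bug-1234"))
print(parse_sc_location("sc:~thomir/my-branch"))
```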

Final Thoughts:

We're still writing sloecode - it's an ongoing effort and won't be finished any time soon. We welcome your feedback - the best way to contact us is via the sloecode-developers mailing list, in the #sloecode channel on freenode IRC, or by leaving a comment below. Have we missed anything? Spot any bugs? Have suggestions for improvements/future direction? We'd love to hear from you!

Visual Studio Fail

Perhaps this is a symptom of the underlying operating system, rather than the Visual Studio IDE. In either case it sucks:

I like to keep my files organised in folder hierarchies. Now I'm being forced to use a flat, wide folder tree by my IDE.

Not. Happy. At. All.

Console Hacking

Several news outlets are reporting that the PS3 has been compromised. What strikes me as odd is that most of the time, the people doing the hacking have no interest in piracy (at least, that's their claim). Instead, their motives seem to be towards allowing home-brew app creation. This is a noble goal, and one that will surely become more and more popular, as we start moving away from passive entertainment towards a more participatory model.

My question is this: Why do console manufacturers still struggle to prevent home-brew projects? The time spent trying to prevent these sorts of exploits must be incredible, and so far, none of them have succeeded.

Personally, I'd love a console that allowed me to run whatever code I want on it - imagine the uses! MythTV running on an XBox? Awesome!

Why your python editor sucks

I'm doing a reasonable amount of python-coding work these days. It would help me to have an editor that doesn't suck. My requirements are:

Small & Fast. I'm not after a massive clunky IDE, just an editor with enough smarts to make editing multiple python files easier.
Sensible syntax highlighting.
Understands python indentation, PEP8 style. Specifically, indents with 4 spaces, backspace key can be used to unindent.
Can be integrated with one or more lint checkers. Right now I use a wonderful combination of pep8, pyflakes and pylint. I want the output of these to be integrated with the editor so I can jump to the file & line where the problem exists.
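To show what I mean by that last requirement: the checkers print one problem per line, starting with the file name and line number, so an editor only needs something like the sketch below to turn their output into jump targets. The regular expression and the sample lines are assumptions for illustration; the exact output varies between checkers and versions.

```python
import re

# path:line: message, with an optional column number after the line number.
LINT_LINE = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+):(?:(?P<col>\d+):)?\s*(?P<message>.*)$"
)


def parse_lint_output(text):
    """Return a list of (path, line, column or None, message) tuples."""
    problems = []
    for raw in text.splitlines():
        found = LINT_LINE.match(raw)
        if not found:
            continue  # skip anything that isn't a problem report
        col = found.group("col")
        problems.append((
            found.group("path"),
            int(found.group("line")),
            int(col) if col else None,
            found.group("message"),
        ))
    return problems


# Illustrative output only, not captured from a real run.
sample = """example.py:3:1: E302 expected 2 blank lines, found 1
example.py:7: undefined name 'foo'
not a lint line"""

for problem in parse_lint_output(sample):
    print(problem)
```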

That's it. I don't think I'm asking too much. Here are the editors I've tried, and why they suck:

KATE. I love kate, it's my default text editor for almost everything. However, there is no way to integrate lint checkers. I could write a plugin, but that's yet another distraction from actually doing my work.
Vim. I'm already reasonably skilled with vim, and Alain Lafon's blog post contains some great tips to make vim even better. My problem with vim is simply that it's too cryptic. Sure, I could spend a few years polishing my vim skills, but I want it to just work. Vim goes in the "kind of cool, but too cryptic" basket.
Eric. When you launch eric for the first time it opens the configuration dialog box. It looks like this:
How many options do I really need for an editor? Over-stuffed options dialogs are the first sign of trouble. It gets worse, however: once you dismiss the settings window, the editor looks like this:

Need I say more?
Geany. Looks promising, but no integration into lint checkers.
pida. Integrates with vim or emacs for the editor component. Looks promising, although the user interface is slightly clunky in places. However, pida suffers from exactly the same problems as vim does, although I may end up using it.

There are a few options I have not tried, and probably won't:

Eclipse & pydev. Eclipse is a huge, hulking beast. I want a small, fast, lean editor, not an IDE.
Emacs. Can't be bothered learning another editor. Doesn't look that much different to vim, so what's the point in learning both?
KDevelop. Same reason as Eclipse, above.

I suspect there's a market for a simple python editor that just works. Please! Someone build it!

Visual Studio Exception Woes

Microsoft, in their infinite wisdom, have decided to make programming easier. How? By setting the default behavior for Visual Studio 2010 Ultimate to ignore (i.e. not break on) exceptions thrown from non-user code. Behold the default settings for exceptions in a brand new C# project:

Try as I might, I have not yet discovered a way to change the default for these settings for all projects. How am I supposed to teach students about exception handling when Microsoft are doing their best to get rid of them?

Bah.

Attention all Programmers:

As a user of open source software, I like to try and give something back to the community whenever I can. As a somewhat proficient programmer I can do this more often than most, but one of the most effective ways of giving back for non-programmers is by filing bug reports.

Unfortunately, there are two main issues with this:

Submitting a bug report is often incredibly painful. Most software bug trackers I have seen require an account, which means registering a new username & password (I can't wait for more non-essential services like bug trackers to start using openId), activating my account... all this can take 30 minutes or more. Submitting a bug report should be a fire-and-forget affair, taking 10 minutes tops: any longer and I can't afford to spend my time.

Many bug trackers ask users for information that is hard to obtain, or intimidating to non-programmers. How many users know their CPU architecture? Or distribution? Or even the software version they're using? One way around this is to have the bug-reporting done from within the application on the client machine itself, but still - bug trackers should be as friendly to users as possible. How about posting some simple instructions on how to obtain this information for non-technical users?
Even after navigating the multiple hurdles involved in submitting a bug, you then have to deal with the programmers fielding the bug report. This is where it gets tricky. Many programmers view bug reports as a personal insult to them (perhaps subconsciously). Many programmers will triage bugs that they don't want to fix, giving excuses like "It's like that by design", or simply "Low priority, won't fix".

Here's the thing though: The customer is (nearly) always right.

If a user has taken the time to navigate your awful bug tracking software and submit a bug, it must be a big deal to them. If the matter at hand really is like that "by design", your design is probably screwy. If you won't fix it because it's low priority then you need to stop adding new features, and fix the ones you already have.

Open source software seems to suffer from these problems more than commercial software. I guess it's because we're not trying to extract money from our clients. Can you imagine a professional code shop telling a paying customer "I'm sorry, we're not going to fix that bug you reported, because we intended it to work like that"? Yeah, right.

So how do we fix this for the open source world?

There's no simple answer that I can fathom. It requires programmers to be a bit smarter and have a bit more empathy for the mere mortals who have to use their software. As a programmer, I include myself in this category.

That is all, thank you.

Code Craftsmanship

Dunedin now has a local code craftsmanship group!

Code craftsmanship is like craftsmanship in any other area. It describes the transition between knowing how to write code, and knowing how to write good code. Like most crafts, there's an element of constant iterative learning involved. Working with people who have more experience than you can save you some of those iterations.

This is an excellent opportunity to meet other programmers, and discover the rich, yet hidden IT talents in Dunedin. If you're interested in joining us, check out the (brand new, as yet unfinished) website.

Project Documentation

Why is it that most open source project pages are so terrible at documenting their own project?

I'm not talking about API or technical documentation - I'm talking about telling new visitors to your site what the hell your code is about.

Project authors, here are some handy tips:

On your project front page, right at the top, put a simple explanation of what your code does (or what you hope it will do someday). Remember that your audience may not have the same level of technical experience as you do. Examples (screenshots, code snippets) are a MUST. A picture speaks a thousand words and all that...
Make sure you include the development status of your project. I can't count the number of times I've spent 30 minutes looking at a project only to realize that it's not nearly complete enough to be usable to me. There's no shame in saying "this library is working, but not production ready. It is missing features X, Y, Z"
Inject some enthusiasm! How many boring, dull, dry project descriptions do I have to read through? Most sound like the authors aren't passionate about their product. Sell your project; inject some enthusiasm, and maybe your viewers will become more enthusiastic in the process!

Well, that's my rant for the day. Now I must go update my project documentation...

Design and Implementation

One of the key tenets in good software design is to separate the design of your product from its implementation.

In some industries, this is much harder to do. When designing a physical product, the structural strength & capabilities of the material being used must be taken into account. There's a reason most bridges have large columns of concrete and steel going down into the water below. From a design perspective, it'd be much better to not have these pillars, thereby disturbing the natural environment less and allowing shipping to pass more easily.

Photo by NJScott. An example of design being (partially) dictated by implementation.

Once you start looking for places where the implementation has "bubbled up" to the design, you start seeing them all over the place. For example, my analogue wristwatch has a date ticker. Most date tickers have 31 days, which means manual adjustment is required after a month with fewer than 31 days. I'm prepared to live with this. However, the date ticker on my watch is made up of two independent wheels - and it climbs to 39 before rolling over, which means manual intervention is required every month! What comes after day 39? Day 00, of course!

It's easy to understand why this would be the case - it's much simpler to create a simple counting mechanism that uses two rollers and wraps around at 39 than it is to create one that wraps at the appropriate dates. I have yet to see an analogue wristwatch that accounts for leap-years.

Software engineers have a much easier time; our materials are virtual - ideas, concepts and pixels are much easier to manipulate than concrete and steel. However, there are still limitations imposed on us - for example, data can only be retrieved at a certain speed. Hardware often limits the possibilities open to us as programmers. However, these limitations can often be avoided or disguised. Naive implementations often lead to poor performance. A classic example of this is Microsoft's Notepad application. Notepad will load the entire contents of the file into memory at once, which can take a very long time if the file you are opening is large. What's worse is that it will prevent the user from using the application (Notepad hangs, rendering it unusable) while this loading is happening. For example, opening a 30MB text file takes roughly 10 seconds on this machine. This seems particularly silly when you consider that you can only ever see a single page of the data at a time - why load the whole file when such a small percentage of it is required at any one time? I guess the programmers who wrote Notepad did not intend for this use case, but the point remains valid: an overly-simple implementation led to poor performance.
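To make the "why load the whole file?" question concrete, here is a minimal sketch of the alternative: read only the page that is about to be displayed. I have no idea how Notepad is actually written, so this is purely illustrative; the function name read_page and its parameters are my own invention, and a real editor would also have to cope with line boundaries and text encodings.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Read at most page_size bytes starting at byte offset `offset`.
// Returns fewer bytes (possibly none) at or past the end of the file,
// and an empty string if the file cannot be opened.
// The name and signature are hypothetical, chosen for illustration.
std::string read_page(const std::string& path,
                      std::streamoff offset,
                      std::size_t page_size)
{
    std::ifstream file(path, std::ios::binary);
    if (!file || !file.seekg(offset)) {
        return std::string();
    }

    std::string page(page_size, '\0');
    file.read(&page[0], static_cast<std::streamsize>(page_size));

    // gcount() reports how many bytes the last read actually produced.
    page.resize(static_cast<std::size_t>(file.gcount()));
    return page;
}
```

With something like this, scrolling becomes a matter of calling read_page with a new offset, so the cost of opening a file no longer grows with the size of the file.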

The unfortunate state of affairs is that the general population have been conditioned to accept bad software as the norm. There really is no excuse for software that is slow, crashes, or is unnecessarily hard to use. It's not until you use a truly incredible piece of software that you realise what can be achieved. So what needs to change? Two things:

Developers need to be given the tools we need to make incredible software. These tools are getting better all the time. My personal preference for the Qt frameworks just paid off with the beta release of Qt 4.7 and Qt Creator 2.0. I plan on writing about the new "Quick" framework in the future: I anticipate it making a substantial difference to the way UI designers and developers collaborate on UI design and construction.
Users need to be more discerning and vocal. As an application developer it can be very hard to know what your users think. If you don't get any feedback, are your users happy, or just silent? We need a better way for users to send feedback to developers; it needs to be low-effort, fast and efficient.

Spolsky loses his cool

Today I stumbled across Joel Spolsky's article "The Duct Tape Programmer". Essentially it's a thousand word rant to make this simple point:

A 50%-good solution that people actually have solves more problems and survives longer than a 99% solution that nobody has because it’s in your lab where you’re endlessly polishing the damn thing. Shipping is a feature. A really important feature. Your product must have it.

Of course he's right - however, his post is ten agonising paragraphs wherein he rants about design patterns, extended C++ features such as template classes (wait, they've been around for a while now - can we still call them "extended" features?), and multi-threading (!!!), and finally one succinct paragraph in which he makes his point (most of which I have quoted above). Now don't get me wrong - I am by no means criticising his writing style ("people in glass houses..." and all that) - all I'm suggesting is that someone with Joel's reputation may wish to think a little harder before posting this sort of tripe online, lest he tarnish his otherwise good reputation. Let me give an example:

One principle duct tape programmers understand well is that any kind of coding technique that’s even slightly complicated is going to doom your project. Duct tape programmers tend to avoid C++, templates, multiple inheritance, multithreading, COM, CORBA, and a host of other technologies that are all totally reasonable, when you think long and hard about them, but are, honestly, just a little bit too hard for the human brain.

So Joel Spolsky is seriously suggesting that C++, templates, multiple inheritance and multi-threading are invariably going to "doom your project"? Come on. Multi-threading is critical to the success of many projects - without it, or something similar, a huge portion of applications simply wouldn't exist, or at least would be a lot more complicated. I challenge Joel to write a print spooler as part of an interactive application in a single thread. I challenge Joel to write a tool for scientific analysis that must process lots (gigabytes? exabytes?) of data while maintaining an interactive user interface.

As I mentioned earlier, Joel has a point - however, instead of suggesting that any slightly-complicated technology be banned outright, I'll instead suggest that any slightly complicated technology had better be understood by your programmers before you use it in your project. Don't use multi-threading because it sounds cool, use it because it's the right tool for the job.
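As a rough illustration of the "right tool for the job" case, here is a minimal sketch of handing a slow job to a worker thread so the calling thread stays free to service the user. The BackgroundSum class is hypothetical: summing a vector stands in for spooling a print job or crunching a data set, and a real application would more likely use whatever threading facilities its UI toolkit provides.

```cpp
#include <atomic>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a slow job: sums a vector on a worker
// thread while the owning thread remains free to do other work.
class BackgroundSum
{
public:
    explicit BackgroundSum(std::vector<int> data)
        : data_(std::move(data)), result_(0), done_(false),
          worker_(&BackgroundSum::run, this)
    {
    }

    ~BackgroundSum()
    {
        if (worker_.joinable()) {
            worker_.join();
        }
    }

    // Non-blocking: a UI thread can poll this between handling events.
    bool finished() const { return done_.load(std::memory_order_acquire); }

    // Blocks until the worker has finished, then returns the total.
    long long wait()
    {
        if (worker_.joinable()) {
            worker_.join();
        }
        return result_;
    }

private:
    void run()
    {
        long long total = 0;
        for (int value : data_) {
            total += value;
        }
        result_ = total;
        // Publish the result; pairs with the acquire load in finished().
        done_.store(true, std::memory_order_release);
    }

    std::vector<int> data_;
    long long result_;
    std::atomic<bool> done_;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

The completion flag is a std::atomic<bool>, not a plain or volatile bool, because it is read and written from two different threads. That is exactly the sort of detail your programmers had better understand before you let threads into the project.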

Compiling != Testing

Just a small note, folks, to remind you all that just because your code compiles, it's not guaranteed to work. Writing the code is only 10% of the total effort.

I've neglected this blog for a long time now. Hopefully I'll be back soon, but until then, watch this space!

WiiWare: Innovation and mistakes

I bought a Nintendo Wii earlier this year. I've never actually owned a console before, but have a reasonably strong loyalty to Nintendo. They appear to publish the best games (of course, that's entirely subjective). My game catalogue now includes the following titles:

The Legend of Zelda: Twilight Princess
Super Mario Galaxy
Metroid Prime: Corruption
Wii Sports
Mario Kart
Super Smash Bros Brawl
Star Wars The Force Unleashed (this game barely makes it into the list, I'm thoroughly disgusted with this title, and am considering using it as a coaster)

You may have noticed that I'm not a big fan of the more lighthearted "party" style games out there - I prefer the more focused, single player games. Once I had purchased those titles I began to look for something else, but quickly found that there's not a whole lot of choice out there right now. Most new Wii games tend to be in the "party" category.

Thankfully, Nintendo have launched WiiWare. WiiWare is a collection of titles created by third party developers. There are many different titles to choose from, and each title costs around £10. I ended up purchasing two titles:

These are both splendid games. However, once again, the pool of good games in the WiiWare collection is very limited - the main reason for this as far as I can see is that it's incredibly difficult to get your hands on the tools required to develop games for the Wii. For a start, Nintendo are only selling their development kit to well-established development houses (you need a registered business, proper offices, previously published titles etc.). Their application form states that:

The Application includes a NonDisclosure Agreement (NDA). Once the Application and NDA are
submitted by you, we will email you a copy of the Application and NDA for your records. Please
note that your submission of an Application and NDA does not imply that your company is approved,
or will be approved, as an Authorized Developer for the platforms above.

...
If the Application is approved by Nintendo, we will notify you by email. At this point, your
company will be considered an Authorized Developer for the platform(s) specified. If your company
is approved for Wii, this also includes WiiWare. If approved the appropriate SDKs can be downloaded
from Warioworld, and development kits can be purchased from Nintendo of America.

So first you need to sign an NDA. Then, if you are accepted, you need to purchase the development kit (priced at over $1000 USD). All this makes it incredibly hard for "joe programmer" to start cutting code for the Wii.

I really think Nintendo have missed a trick here; imagine the community that could form behind a free development kit. Think about the success of the Apple AppStore for the iPhone, but with games instead. The Wii is a revolutionary platform, with a unique control interface: surely lowering the barriers to entry can only be a good thing?

There's another side to this as well: The Wii Homebrew team have already done a lot of work reverse engineering the Wii, to the point where there is already an SDK available for use. Is it usable? I haven't tried it myself yet (perhaps when I finish some of my current projects I'll play with it), but there are already a fair number of games available for the homebrew channel: I count more than 70 games listed, as well as a number of utilities, emulators and other bits and pieces.

The free development kit is based on the gcc PPC port, and comes bundled with everything you need to start development. GNU gcc has long been a well established player on the compiler scene, so it's not like we're playing with untested technology here.

Given that many of the secrets of the Wii are out (or are being reverse engineered even as you read this), wouldn't it be prudent for Nintendo to officially welcome third party developers to the fold? More importantly, for other, future consoles, imagine a world where:

The original manufacturer (Nintendo, Microsoft, Sony or whoever) use an open source toolchain from the beginning. I assume that Nintendo have spent a lot of time and money developing their toolchain, which seems a little wasteful to me, when an open source solution already exists. Sure, it may need to be tailored for the Wii, but I'm sure there are plenty of people who would embrace these changes. An open source toolchain lowers development costs, and lowers the barrier to entry for third party developers.
Third party developers are encouraged to write applications themselves, and the cost to entry is kept as low as possible. The manufacturer supplies the hardware, points to a pre-packaged toolchain of open source applications, and provides a development SDK with decent documentation. If all you need to test your games is a copy of the console itself, that would be great. However, why not build an emulator that can run on a standard PC?
The manufacturer provides bug-fixes for the SDK when needed, and creates a community-oriented website for budding developers.
The manufacturer provides a free (or very cheap) means of distributing third party applications via the internet, and offers the option of DRM routines, should the initial authors wish to make use of them.

I believe this setup could bring about a number of beneficial changes to the console gaming market:

An overall increase in the diversity and quality of available games.
A vibrant community of developers who help the manufacturer maintain the platform SDK and development toolchain by submitting bugs, feature requests and other suggestions.
Increased popularity for the platform (I'd buy any platform that offered all of the above).

Unfortunately, I can't see it happening any time soon. It seems to me that the big three console manufacturers are still engrossed in the "proprietary hardware, closed source" paradigm. Still, a guy can dream, right?

Job Opening at Pebble Beach Systems

We have a couple of job openings available here at Pebble Beach Systems, due to expansion of the programming team.

If you know C++, a bit of SQL, think you know how to handle a thread, and think you have a handle on programming in general, feel free to contact me and I can pass your CV and cover letter along to the big boss-man. The best way to contact me is via email: thomir gmail com

We're looking for a graduate and a senior developer. Pebble Beach Systems is a great place to work - very friendly relaxed atmosphere with a great bunch of people.

Cheers,

Teaching Programming mk. 2

I blogged before about what I think we should teach programming students, and almost immediately wished I hadn't. Sometimes I feel that my blog posts are somewhat pointless meanderings through the garbage that inhabits my sleep-deprived brain. At other times I feel that I have contributed something useful to the general public. The post in question is firmly in the former category - but what can I do? I won't start deleting articles as soon as I fall out of favor with them, so I'm hereby correcting my earlier mistakes (at least, attempting to). Illiad Frazer knows how I feel:

The whole point of the previous post was that I felt that most graduate students were under-prepared for work in industry. My main evidence of this is that it seems to take a long time, and more importantly a lot of interviews before one strikes "candidate gold" when recruiting for a new programmer.
I will admit that this could be for many reasons: perhaps our expectations are too high, perhaps we are not paying enough to attract the kind of graduate we're looking for, or perhaps the industry we're in isn't desirable enough to attract the better candidates. The list goes on endlessly - and yet I cannot ignore the fact that most graduates I meet are not up to scratch.

So what prompted this revision of a past article? I happened to read E. W. Dijkstra's article entitled "On the cruelty of really teaching computing science". In it, he postulates that the methods used by most universities are fundamentally flawed when it comes to teaching computer science, and more specifically when teaching computer programming. I'd like to quote part of this article:

...we teach a simple, clean, imperative programming language, with a skip and a multiple assignment as basic statements, with a block structure for local variables, the semicolon as operator for statement composition, a nice alternative construct, a nice repetition and, if so desired, a procedure call. To this we add a minimum of data types, say booleans, integers, characters and strings. The essential thing is that, for whatever we introduce, the corresponding semantics is defined by the proof rules that go with it.

Right from the beginning, and all through the course, we stress that the programmer's task is not just to write down a program, but that his main task is to give a formal proof that the program he proposes meets the equally formal functional specification. While designing proofs and programs hand in hand, the student gets ample opportunity to perfect his manipulative agility with the predicate calculus.

This method of programming - approaching the programming language as a kind of "predicate calculus" - has its advantages. It demands that the students pay attention to the features, rules, regulations and guarantees that the language provides. Whichever language is used (and to a certain extent it does not matter), the rules and regulations of that language are going to dictate the structure of the program. This is similar to the fact that the laws of math dictate the form of any mathematical proof; ignore the laws of the language, and your program (or proof, if you will) no longer makes sense. In the domain of integer mathematics, 2 + 3 will always equal 5. In the domain of C++, local variables are destroyed in the reverse order that they were created in (insert whatever rule of the language you want there).
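The C++ rule quoted above is easy to check for yourself. Here is a minimal sketch; the Tracer class and the destruction_order function are names I have made up for the purpose. Each object records its name when it is destroyed, so the log shows the order in which the destructors ran.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records its name in a shared log when it is destroyed.
class Tracer
{
public:
    Tracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log)
    {
    }

    ~Tracer() { log_.push_back(name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Creates three locals in the order a, b, c and returns the order
// in which their destructors ran.
std::vector<std::string> destruction_order()
{
    std::vector<std::string> log;
    {
        Tracer a("a", log);
        Tracer b("b", log);
        Tracer c("c", log);
    }  // end of scope: c is destroyed first, then b, then a
    return log;
}
```

Because the language guarantees this order, a later object can safely hold a reference to an earlier one: the later object is always destroyed first.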

Consider for a moment my previous post; I listed 11 things which I thought were essential for any programming student to know. Looking back, I notice that the top five items are all specific to C++ (since that's the language I talk in). Is it a coincidence that the five most important things any programming student can know are specific to the language they are using? I think not.

Rather, I believe that to be a great programmer, one must have a deep understanding of the language at hand, and how that language allows you to express logical problems. One must approach a program like a mathematical problem - that is, one must know the rules of the language, and then use those rules to design a proof that conclusively solves the logical problem using the language at hand.

That last point is worth reiterating: Anyone can write a program that appears to solve a problem most of the time. However, for non-trivial problems it becomes much harder to guarantee that the program will solve the problem 100% of the time. As we get further into the "edge cases" of the application logic it becomes less likely such a naive implementation will work correctly. However, a program that has been built from the ground up using the guaranteed behavior of the language can still contain bugs, but it's much more likely that they are logic errors introduced by the programmer, rather than subtle bugs introduced through language misuse.

At this point I must point out that I do not believe that Dijkstra's idea is as good as he makes it sound. He addresses one point - that students should understand the rules of the language, but a "love of the language" is only half the picture. There are also many non-language related skills that come in to play. Consider debugging for example; there are formal techniques that can be used to debug certain types of errors. Knowing these techniques, and knowing when to employ them is a powerful aid in any language, and these are skills that should be taught, rather than learned in an ad hoc approach.

So, my top 10 list of things every programming student should know can now be revised into this, much shorter form:

Know your language. I don't care what your language is - if you want a job it had better be something that's being used, but you can be a great programmer even if all you know is an out-dated language. Not only do you need to know your language, you need to have a passion for knowing your language - you must actively want to extend your knowledge of the language and how it works, what guarantees it provides and which it doesn't. This knowledge will translate into programs that use the features of the language to create minimal, efficient, well structured and error-free programs.
Be willing to learn new techniques. There are so many useful techniques and skills for a new programmer to have that I cannot list them all here, and course designers cannot possibly include them all in their course material.

That's it - two things. Much better than the self-absorbed tripe I rattled off a few weeks ago. To anyone who actually bothered to read that, I apologize profusely.

Ten Things to Teach Programming Students

While talking to a friend recently, we began discussing the role of graduates in the industry. My belief is that employers employ graduates and expect them to have the same skill level as their existing, trained employees (I have certainly seen this first-hand). Having been on the "other side" of the problem I appreciate that graduates are rarely fit for the tasks set for them without further training.

This got me thinking: If there were 10 things graduates should know before graduating, what should they be? What short list of skills can graduates teach themselves to become better than their competition (and getting that first job is just that: a competition)? That train of thought spawned the following list:

Ten things programming students should know before graduating:

Inheritance & Composition. In the land of OO, you must know what inheritance does for you. In C++, this means that you must know what public, protected and (rarely used) private inheritance means. If class A is publicly inherited from class B, what does that tell you about the relationship between A and B? What about if the inheritance was protected, rather than public? In a similar vein, what does virtual inheritance do, and when would you want to use it? Sooner or later a graduate programmer will discover a complex case of multiple inheritance, and they need to be able to cope with it in a logical fashion. Knowing the answers to the above questions will help.
Unfortunately, a lot of the time inheritance is over-used. Just because we have access to inheritance, doesn't mean we should use it all the time! Composition can be a useful tool to provide clean code where inheritance would muddy the waters. Composition is such a basic tool that many graduates don't even think of it as a tool. Experience will teach when to use composition and when to use inheritance. Graduates have to know that both can be solutions to the same problem.
Memory Allocation. So many graduates do not understand the importance of cleaning up after yourself. Some do not fully appreciate the difference between creating objects on the stack and on the heap. Some know that but fail to understand how memory can be leaked (exceptions are a frequent cause of memory leaks in novice programmers). Every programmer should know the basic usage of new, new[], delete and delete[], and should know when and how to use them.
Exceptions. Most programmers share a love / hate relationship with exceptions; You gotta know how to catch them, but at the same time you tend to avoid using them yourself. Why? Because exceptions should be .... exceptional! There's a reasonably large amount of overhead associated with throwing and catching exceptions. Using exceptions as return values or as flow-control constructs are two examples of exception misuse. Exceptions should be thrown only when the user (or programmer) does something so bad that there's no way to easily fix or recover from it. Running out of resources (whether it be memory, disk space, resource IDs or whatever) is a common cause for exceptions to be thrown.
Const correctness. Const correctness is so simple, yet so many programmers just don't bother with it. The big advantage of const-correctness is that it allows the compiler to check your code for you. By designating some methods or objects const you're telling the compiler "I don't want to change this object here". If you do accidentally change the object, the compiler will reject the code.
Threading. Threading is hard. There's no simple way around this fact. Unfortunately, the future of PC hardware seems to be CPUs with many cores. Programs that do not make use of multiple threads have no way to make use of future hardware improvements. Even when using libraries like Qt, which make it ridiculously easy to create threads and pass data between threads, you still need to understand what a thread is, and what you can and cannot do. A very common thing I see in new programmers is a tendency to use inadequate synchronization objects in threads. Repeat after me: "A volatile bool is not a synchronization object!".
Source control. Every programmer on the planet should know how to use at least one version control system. I don't care if it's distributed or not, whether it uses exclusive locks or not, or even if it makes your tea for you. The concepts are the same. Very few professional programmers work alone. Graduates must be able to work in a team - that includes managing their code in a sensible fashion.
Compiler vs Linker. Programmers need to understand that compiling an application is a two step process. Compilation and Linking are two discrete and very different steps. Compiler errors and Linker errors mean very different things, and are resolved in very different ways. Programmers must know what each tool does for them, and how to resolve the most common errors.
Know how to debug. When something goes wrong, you need to know how to fix it. Usually, finding the problem is 90% of the work, fixing it is 5% of the work, and testing it afterwards is another 10%. No, that's not a typo - it does add up to more than 100%, which is why there's a lot of untested code out there! Of course, if you were really good you wouldn't write any bugs in the first place!
Binary Compatibility. This one is for all those programmers that write library code, or code that gets partially patched over time. As you probably already know, shared libraries contain a table of exported symbols. If you change that table so a symbol is no longer available (or its signature changes), code that uses that symbol will no longer work. There's a list of things you can and cannot do while maintaining binary compatibility, and it's very hard not to break those rules, even if you know what you're doing. I've blogged about this before, and linked to the KDE binary compatibility page on techbase - worth a read!
The main method of maintaining binary compatibility is to program to an interface, rather than to an implementation. Once you start paying attention to binary compatibility, you'll quickly realise that it's a very bad idea to export your implementation from a shared library, for one simple reason: If you want to change your implementation you're stuck with the restrictions placed upon you by the need to maintain binary compatibility. If all you export is a pure interface and a means to create it (possibly via a factory method) then you can change the implementation to your heart's content without having to resort to pimpl pointers.
Read the right books. There are a few movers and shakers in the programming industry that it pays to keep an eye on. There are many books worth reading, but I'm going to recommend just two. The first is "Design Patterns: Elements of Reusable Object-Oriented Software", and the second is the "Effective C++" series. Neither is considered to be great bedtime reading, but both are packed from cover to cover with things that will help you out in every-day situations. Any programmer worth his or her salt will own a copy of at least one of these books, if not both. Of course, there are books on UI design and usability, threading, text searching, SQL and database maintenance, networking, hardware IO, optimisation, debugging... the list goes on.
Networking. What's this? An 11th item? That's right: it's in here because it cannot be ignored in most programming tasks. It's getting harder and harder to avoid networking. Most graduates will have to write code that sends data over a network sooner or later, so they'll need to know the difference between UDP and TCP and IP, as well as what the basic network stack looks like (think "Please Do Not Touch Steve's Pet Alligator"), and what each layer does. Being familiar with tools like wireshark helps here.
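To tie the Memory Allocation and Exceptions items together, here is a minimal sketch of how an exception can skip a delete, and two ways of coping. Widget, risky, manual_cleanup and raii_cleanup are all hypothetical names. The second function swaps raw new and delete for std::unique_ptr (C++11), which is not something the list above asks for but is the usual way to get the same guarantee with less code.

```cpp
#include <memory>
#include <stdexcept>

// Counts live Widget objects so that a leak would be observable.
struct Widget
{
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};
int Widget::live = 0;

// Hypothetical operation that always fails.
void risky() { throw std::runtime_error("out of resources"); }

// Manual cleanup: every exit path, including the exceptional one,
// has to remember to delete. Without the catch block the Widget
// would leak whenever risky() throws.
void manual_cleanup()
{
    Widget* w = new Widget;
    try {
        risky();
    } catch (...) {
        delete w;
        throw;  // rethrow so the caller still sees the error
    }
    delete w;
}

// RAII cleanup: unique_ptr's destructor runs during stack unwinding,
// so the Widget is freed on every exit path without extra code.
void raii_cleanup()
{
    std::unique_ptr<Widget> w(new Widget);
    risky();
}
```

Both functions let the exception propagate to the caller and leave Widget::live at zero afterwards. The difference is how much you have to remember to write in order to get there.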

What's not in the list:

You may notice that I haven't included any specific technologies in this list. That's because I firmly believe that it really doesn't matter. Sure, there are some libraries that are better than others (I'd bet my life on a small set of libraries), but the programmer next to me has a different set. I care not one groat whether a graduate knows how to program in .NET, Qt, wxWidgets or anything else - as long as they're willing to learn something new (whatever I'm using on my project).

Which brings me nicely to the conclusion: The single quality I see in all the programmers I admire is a sense of curiosity; a restlessness and a sense of adventure. Our industry is constantly shifting. The best programmers are able to ride the changes and come out better for it.

Is this post horribly self-indulgent and boring? Probably, but it had to be done. Have I forgotten anything? Things you feel should be on the list that are missing? Remember that the point of the exercise is to keep a small list - I could list every programming skill and technology required under the sun, but that would not be very useful would it?