Friday 9 August 2013

The one about static typing

Static Typing

This phrase has been at the centre of a religious war for at least the last 7 years, and there's little sign of an end to the fighting. The recent Javascript Renaissance is what made it relevant to the average Joe programmer.
There was a time in the not-too-distant past, when things like Node.js would have simply been considered dumb, and ignored in favour of a Spring MVC stack or something similar.

Let's talk about compilation (in the commonly talked-about sense).
If you've ever learnt C, C++, Java, C# or any other language with a compiler, you're probably familiar with the concept. A compiler takes the source code that you write, and translates it into a form usable by the target runtime, be that machine code for an x86 processor, or Java bytecode for a JVM.
Compilation is the thing you have to do before you can hit 'Run' and see "Hello, World!".
It's the thing that says "Syntax Error" when you've left out a semicolon.

Besides translating code into simpler code (code with fewer concepts), compilers also implement static type checking: the thing which tells you that you can't call a function expecting a number with a string.
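A contrived C# sketch of the sort of thing the type checker catches for you:

using System;

class TypeCheckDemo
{
 // The compiler knows Double wants an int...
 static int Double(int n) { return n * 2; }

 static void Main()
 {
  Console.WriteLine(Double(21));      // Fine: prints 42
  // Console.WriteLine(Double("21")); // Won't compile: cannot convert from 'string' to 'int'
 }
}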

In the days of old, when computers were slow and abstractions were young, this kind of compiler was a necessary part of making your computer program work. Letting your compiler in on the kind of data you're going to be moving around at every stage was needed, so that the produced machine code (written in terms of memory rather than variables) could allocate the correct amounts of memory to store and move data around.
Given this, you couldn't really avoid making sure the right data types were going to the right places, making this basic form of static type checking an easy win.

Simula 67 was, according to Wikipedia, the first language to directly implement constructs like classes and interfaces, giving the compiler more work to do, and checking even more at compile time.

Smalltalk implemented some similar features, but avoided putting this extra machinery in the compiler. It also did away with a lot of the other work that compilers did, making runtime use of only references to data, rather than copying around the different-sized memory blocks for ints, longs, bools, etc...
In fact, Smalltalk didn't really have a compiler in the normal sense, where you apply it to all of your code, then run the result; rather, you compile code construct-by-construct, and add it to the whole, much like Javascript.

More recently, languages like Haskell, Scala and F# push the boundaries of how much you can check at compile time, by adding richer constructs to be analysed by the compiler. With this, the compiler is not just a tool to produce machine code; it's also used as a platform for analysis.

Static v Dynamic

What we see here are two contrasting ways to write programs, and there are a number of popular languages associated with each side.
In the static camp, we have Java, C and C++ supporting a static type system, and Haskell, Scala, F# and C# (since 4.0) supporting some more advanced features such as variance annotations.
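To give a flavour of what 'variance annotations' buy you, here's a minimal C# 4 sketch; the interesting line is the one assigning a sequence of strings to a sequence of objects:

using System;
using System.Collections.Generic;

class VarianceDemo
{
 static void Main()
 {
  // IEnumerable<out T> is declared covariant in T (C# 4.0+), so a sequence
  // of strings can safely be treated as a sequence of objects...
  IEnumerable<string> strings = new List<string> { "a", "b" };
  IEnumerable<object> objects = strings; // checked at compile time

  // ...but List<T> is invariant, so this would not compile:
  // List<object> wrong = new List<string>();

  foreach (var o in objects)
   Console.WriteLine(o);
 }
}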

There are also the dynamic ones: Javascript, Python, Ruby and Clojure.

One could write certain forms of type-checking into Javascript or any dynamic language if so inclined, though code making use of it would need to be written differently.
Clojure includes a type system as a library that you can use if you want: https://github.com/clojure/core.typed

On the other side, the aforementioned statically-inclined languages, besides Haskell, allow one to use typecasting to pass around any kind of value with no type-checking whatsoever.
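In C#, for example, a couple of casts are enough to smuggle a value past the type checker; the mistake only surfaces when the program runs:

using System;

class CastDemo
{
 static void Main()
 {
  object anything = "definitely not a number";

  // The compiler takes our word for it here; the check is deferred to
  // run-time, where this cast throws an InvalidCastException.
  int number = (int)anything;

  Console.WriteLine(number);
 }
}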

It's just a Feature

Many discussions take place on and off the web as to how crucial these things are to the development of maintainable, well-performing computer programs (arguably the holy grail of software engineering, and the point of everything).
Some systems go even further than type checking, such as Contracts in Eiffel, which seek to check (and sometimes prove) facts not just about which classes are used, but about the possible values which variables and parameters can take, all by providing more annotations for the tools to work with.
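The nearest C# analogue is probably Code Contracts; here's a minimal sketch, assuming the Contracts tooling (the binary rewriter and optional static checker) is set up - without it these calls don't do much:

using System.Diagnostics.Contracts;

class Account
{
 private int balance;

 public void Withdraw(int amount)
 {
  // Facts about *values*, not just types. The optional static checker tries
  // to verify these at build time; the rewriter turns them into run-time checks.
  Contract.Requires(amount > 0);
  Contract.Requires(amount <= balance);
  Contract.Ensures(balance >= 0);

  balance -= amount;
 }
}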

Static typing is lauded by some as an absolute necessity for making good software, while others talk about it as a kind of premature optimisation: more of an irrelevance to the problem domain than a helpful tool, and something that makes code less amenable to refactoring and evolution.

Static type annotations were once a technical necessity, but they no longer are. After much evolution and research by some very clever people, they are now simply part of static type checking: a very advanced, powerful feature in some cases, but no longer a necessity for the reasons it once was.

At the dynamic end of the spectrum, Clojure at its most basic supports minimal checking of the existence of variables, in that you can't write a function which makes use of a variable or function you haven't already declared one way or another; that's all.
Is it as powerful as Haskell's typeclasses? Of course not.
Is it less cool? Arguably.
Is it any less useful? Depends.

To determine if a feature is useful, you need to consider who's going to be using it, what it's going to be used for, and under what conditions.
I'm going to shy away from permuting all the combinations of those that I can think of, and instead ask you to consider the Bash shell:
It's arguably one of the more dynamic languages you can get. You can write literally anything in Bash, and there's no compiler to complain. Everything's a string, parsed into other forms when necessary. If you're using a Linux system (even Android), the chances are that a large part of the system's plumbing is shell script.
With the complete lack of any compile-time checking or compile-time anything-whatsoever, it still somehow works.

Sunday 19 May 2013

Half-technical discussion on using Abstraction, APIs and a bit on Lisp



Why abstract?

Abstraction is about generalisation. It allows us to do more while knowing less. If a baby is given some beans to hold, it may not be long before they discover that transferring a bean from one hand to another means the number of beans in the from-hand goes down, and the number of beans in the to-hand goes up. They may also figure out the process is reversible, and transferring a bean in the opposite direction leaves them equivalent to where they started in terms of per-hand bean-quantities. There's a common abstraction to deal with such situations, known as arithmetic. Arithmetic is very much an abstraction, because most of the time it doesn't deal directly with what you're using it for. Arithmetic works in terms of numbers. Counting or estimation can map these intangible entities onto something real. As long as we know how to count the various different real-life things we might use, everything we know in terms of 'arithmetic' is useful, and can help us solve our problem.

Abstraction is an important part of software, as with any complex system built with or used by limited human brains. Actually creating these abstractions is something I would have no qualms calling an art form. What's different about this art form, however, is that the product has to be actually useful to someone.
Programmers make use of abstractions all the time. A common example would be structured programming, that is, programming with 'functions'.
In most cases, a function is a sequence of operations with an associated lexical context. By lexical context, I refer to the ability to give things names and refer to those things using the same name, but only while inside the particular function.
Functions and lexical contexts aren't tangible, yet a structured programming language is perfectly capable of describing these things. They are an abstraction, and a useful one, born out of a common need to break problems up into simpler parts. Before functions were formalized in structured programming languages, this behaviour would have been implemented using a lower level abstraction, such as machine code to manage groups of related values, which is itself an abstraction built atop electronics in the computer's processor.

Most computer programs are written in terms of functions. Your web browser likely has hundreds of thousands of functions; there will almost certainly be one to calculate the width of each character on this page, for example.

Stacking abstractions

Abstractions are often layered; that is to say, it may be useful to define one abstraction in terms of another. Natural language is as good an example as any:
English-speakers make various sounds, yet the English language doesn't apply meanings directly to these sounds, but instead defines 'words' in terms of these sounds. A major advantage of this is that it makes the English language 'portable'; you can choose to either speak the word "house" audibly, or write it down, and it still has the same meaning.
Grammar sits above still, defined in terms of word types (adjectives, verbs...). These word types are in themselves defined by the words included in them.
Building up abstractions and dissecting them really isn't an exact science; there really is more than one way to skin a cat.

Programmers often build abstractions atop these functions to represent higher-level concepts, such as this text. If creating a web browser, they would likely want to create an abstraction to represent visible things. This would be useful, as there are all kinds of different visible things on a web page, which are also used in similar ways; being drawn on the screen, for example, is something which happens to all visible objects. Being clicked doesn't make sense for all visible objects, meaning that deciding what to do when a mouse clicks the item may not have a place in this particular abstraction.
This abstraction could be built in terms of C# functions by, for example, creating a function named "Draw" for each different kind of visible object, whose job it is to draw the particular thing. This way, other parts of the program wouldn't need to deal with how many curves and lines make up a "Subscribe" button; they would only need to be able to run the appropriate "Draw" function for the visible object.
This is a very simple example, and you would likely want to do more than just draw a visible object. 

You would probably also want to be able to calculate the shape and size of a visible thing's outline before calling its "Draw" function. With this information, you could decide where it should be drawn so it doesn't collide with other drawn things.
We're separating the "How" from the "What". We're allowing one to draw a button or other visible object("What"), without needing to know "How" to draw it. In fact, given the right machinery, it needn't know it's a button at all. Just like we can reason about bean quantities without learning how beans actually work, because counting and estimation operations are really all that's required for the "arithmetic" abstraction.


APIs

API stands for "Application Programming Interface". An API is an abstraction built on top of a programming system, often in terms of functions, used to achieve some end. You could argue that our "visible object" abstraction above was a kind of API and I think you would be right. An interface in the broadest sense is a view through which you can interact with some system or another. Dead simple real life example is a headphone socket (Headphones don't care if they're playing rock or jazz, because that's not the abstraction they are designed to).

Modern programming languages offer many abstractions besides just functions, including:
  • Interface - A set of functions which operate on the same item of data.
  • Array - A sequence of data of a certain kind.

Tower of abstraction

You may have noticed the trend to build new abstractions in terms of old ones.
Interfaces, for example, are based on functions; to use an interface, you need to know what an interface is, and you also need to know all about functions which, in some programming languages like C#, can be quite complex creatures.
Looking at an abstraction like the C# programming language, it's mostly/completely isolated from what's below. It has its own data types, terms and semantics that you can build solutions from.
I think here, the 'purity' of certain abstractions becomes most apparent. 
Unfortunately for the programmers, the domains of the problems they're *actually* going to be solving often have nothing to do with constructs such as strings, classes or network sockets, meaning that you still have a lot of work to do in terms of supporting the solution somewhere above the programming language by building further layers of abstraction.

Visual Basic was a programming system developed by Microsoft which, I would say, focused on bringing the level of abstraction closer to the problem domain than many systems. A major example was the notion of the "Form" - a modular, interactive, graphical container to hold "controls" (buttons, text boxes...). These things were represented in the Win32 API, but the drag-and-drop form designer allowed you to create GUIs not by way of calling "functions" or talking to "objects" via "interfaces", but rather by...well...creating a GUI, in the most direct way imaginable.
I'm not saying that Visual Basic was the best all-round programming system I've ever solved a problem with (far from it), but these kinds of things made it very useful indeed.

HTML is another common example of this kind of thing. You don't call functions and assign event handlers to create DOM nodes representing visible elements; rather, you use HTML to directly create DOM nodes which represent visible elements. The reason there's still another layer there, and no good (tell me if you know one) drag-and-drop designers for the web, is that the web isn't about positioned components on a page; it's about a flexible, reflowable, semantically rich layout which doesn't care too much about screen size. It's not quite as perfect as that last sentence makes it sound, but I do believe we're getting there... :).

These very "pure" abstractions which stand on their own, making use of only constructs which are part of a problem domain are often called  Domain Specific Languages(DSL) .

Domain Specific Languages

There is really no clear definition of what constitutes a DSL; it's very much down to intent, I believe.
Some examples of successful domain specific languages: HTML, SQL, Makefiles and regular expressions.
I'll point out that you can still effectively hide intermediate levels of abstraction without writing your own parsers or using XML; you can indeed embed clean abstractions in programming languages using relatively simple concepts like arrays and meaningful parameter names in function calls as a backbone, and build a little language that way.
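A made-up C# sketch of that idea - a tiny 'menu language' built from nothing but plain data, optional parameters and meaningful argument names:

// Hypothetical types; the point is that the description below reads like a
// little language rather than a sequence of imperative calls.
class MenuItem
{
 public string Label;
 public string Action;
 public MenuItem[] Children = new MenuItem[0];
}

class MenuDsl
{
 static MenuItem Item(string label, string action = null, params MenuItem[] children)
 {
  return new MenuItem { Label = label, Action = action, Children = children };
 }

 static readonly MenuItem FileMenu =
  Item("File",
   children: new[]
   {
    Item("Open", action: "open-document"),
    Item("Save", action: "save-document"),
    Item("Quit", action: "quit")
   });
}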

Waste of time

There are disadvantages to this approach; even if your discipline and methodology and separation of concerns are perfect, jumping through all of those abstraction layers (which you'll have to do at run-time) will have a performance impact, sometimes a major one. Look at the speed of Win32 UIs vs Java Swing UIs - Swing loses...badly.
What you can end up with is basically a poorly performing interpreter.
Sure, there are optimizations you can do to mitigate this, like runtime code generation using things like Javassist, LLVM or .NET's IL-generation classes. But a far simpler approach, which doesn't leave the user waiting for their program to compile the rest of itself (which is essentially what will be happening), is to take care of this abstraction level at the appropriate time: the time when you should be doing all of the mindless work connecting what you've written to something which isn't going to need to change at run-time, the time used to convert abstractions into their run-time form - compile time.
This is why C has macros, why C++ has templates, why Javascript has eval(), and it's one of the best things about Lisp.
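For the curious, here's what that run-time code generation looks like with .NET's IL-generation classes - a toy example rather than a DSL, but the mechanism is the same:

using System;
using System.Reflection.Emit;

class CodeGenDemo
{
 static void Main()
 {
  // Build an int -> int function at run time.
  var method = new DynamicMethod("AddFive", typeof(int), new[] { typeof(int) });
  var il = method.GetILGenerator();

  il.Emit(OpCodes.Ldarg_0);  // push the argument
  il.Emit(OpCodes.Ldc_I4_5); // push the constant 5
  il.Emit(OpCodes.Add);      // add them
  il.Emit(OpCodes.Ret);      // return the result

  var addFive = (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
  Console.WriteLine(addFive(37)); // prints "42"
 }
}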

Lisp for DSLs

Lisp (and Smalltalk, so I read) implementations make this kind of thing very easy, as they don't generally distinguish between compile-time and run-time; your build process consists of running a bunch of code to set up the state (such as code to define functions), then you image the whole runtime, serializing all of its state, and ship it like that.
This means that if you did wish to implement a DSL with data structures, you could process those structures as part of your build, making it far closer to 'source code', which is what it is.

Lisp goes a step further than Smalltalk and ONLY uses data structures for source code, making it very easy to implement your own languages using the available syntactic constructs with your own semantics.
Lisp object notations are normally very lightweight; commonly, list data is represented with something like (1 2 3 4), and a function call takes the form (functionname arg1 arg2).

Clojure, a fairly modern Lisp which is gaining popularity, has a rich set of first-class data structures including lists: (1 2 3), vectors: [1 2 3], sets: #{1 2 3} and maps: {1 "a" 2 "b"}.

These things come together to make a great environment for DSL construction.


That's all I've got to say today. This post has been sitting unfinished in drydock for a while; I like to think it's more understandable as a result of the extra thought.

Tuesday 30 April 2013

Lexical Closures

Today, I found the first confusing bug in the software I've been working on. It had me going for pretty much the whole day, not helped by the fact that my MonoDevelop also had a bug causing it to crash when the IKVM runtime threw a meaningless exception, meaning I couldn't actually get to see any of what was happening.

I've been using a fairly out-of-date MonoDevelop, as it's the one available from the Ubuntu repos. Strangely enough, recent builds on the website are only available for Windows, OS X and SUSE.
Luckily, there doesn't seem to be any proper reason for this, so I basically did what this page told me to and built MD 3 from source. So far, it works a treat and doesn't seem to feature the annoying bug I mentioned. Nice :)

Anyway, the bug. It seems really stupid now, so in order to feel less stupid I'm going to turn this into a post where I explain lexical closures in the context of C#.

So, what is a lexical closure? Why, it's a function object which captures the bindings of free variables (variables which aren't defined in the function or its parameter list).
I'll elaborate:

Delegates

If you've written C# before, you've probably used function objects or delegates, as they call them:

// Declare the delegate type outside a function somewhere:
delegate int OneAdder(int a);

// Instantiate the delegate
OneAdder addOne = delegate(int a) { return a + 1; };

// Use the delegate to print a "4"
Console.WriteLine(addOne(3));


You can also use the lambda syntax to more succinctly create delegates:

OneAdder addOne = (a) => a + 1;

There's more; from .NET 3.5 there are some standard generic delegate types already predeclared:

  • Action<int> : void-returning function with an int parameter
  • Action<int, int> : void-returning function with two int parameters
  • Func<int, string> : string-returning function with an int parameter
  • Func<int, int, string> : string-returning function with two int parameters

As you might intuit, delegates are reference types just like classes and amenable to the same assignment/passing-around semantics. For instance, one could write a function:

static int CallMyDelegate(Func<int, int> func)
{
 return func(5);
}
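You could then call it with any compatible delegate or lambda, for example:

Console.WriteLine(CallMyDelegate(x => x * 2)); // prints "10"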

Lexical Closure

Delegates in C# have 'Lexical closure'. That means that when you call this, passing it 5:

static Func<int, int> MakeAddingFunction(int number)
{
 return delegate(int a) { return a + number; };
}


What you get back is a function which adds 5 to things.
It's called a lexical closure, because the function 'closes over' the 'number' variable (which is in its lexical scope), so it can use it later. One can also say the delegate 'captures the lexical environment'.
Think on that for a while until you understand it. Try it out, even.

This whole delegate thing is really useful as it gives us a nice, succinct way to abstract some operation in terms of its inputs and outputs, so it can be passed to code which can then apply it to some data without understanding the operation. C#'s "Linq" is based on this kind of thing. If you haven't heard of Linq, go and read up on it once you've finished reading this.
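To make the Linq connection concrete, here's a tiny example where the operations we pass in are just delegates written as lambdas:

using System;
using System.Linq;

class LinqDemo
{
 static void Main()
 {
  int[] numbers = { 1, 2, 3, 4, 5, 6 };

  // Where and Select take delegates; they apply our little operations to the
  // data without understanding what the operations mean.
  var result = numbers
   .Where(n => n % 2 == 0) // keep the even ones
   .Select(n => n * 10);   // scale them up

  Console.WriteLine(string.Join(", ", result)); // prints "20, 40, 60"
 }
}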

Peculiarities

Lexical closure is about capturing the *bindings* of variables in scope, not the *values* of them.
The binding is the invisible thing which is given a name and associated with different values over the variable's lifetime.
This means that the delegate has full access to the variable, rather than just receiving some copy of its value.
Consider this code:

static void Main(string[] args)
{
 int someNumber = 3;
 Action<int> addThis = delegate(int a) { someNumber = someNumber + a; };
  
 addThis(10);
 
 Console.WriteLine(someNumber); // This prints "13"
}


That's right - it prints 13! The someNumber variable was modified by addThis.
Whenever you open up a function, a loop, or an if block, or anything with curly braces, that conceptually creates a new 'lexical environment' and a new set of bindings for all variables declared within those braces. In the case of loops, a new environment is created for each iteration, so you can bind to different variables each time round, even though they have the same name:

var setters = new List<Action<int>>();
var getters = new List<Func<int>>();

for(int i = 0; i < 3; i++)
{
 int closedOverVariable = 0;
 setters.Add(delegate(int newValue) { closedOverVariable = newValue; });
 getters.Add(() => closedOverVariable);
}

// Set the first int to 5
setters[0](5);

// Set the second int to 100
setters[1](100);

// Print the value of the first int : "5"
Console.WriteLine(getters[0]());

// Print the value of the second int : "100"
Console.WriteLine(getters[1]());

We just created three ints with no way to access them apart from via the appropriate getter and setter delegates. I always found that kinda spooky; It's like the ghost of a variable...

The Bug

So where did it all go wrong today? Here's a piece of code which demonstrates what I got wrong:

// A nice array of ints
int[] ints = new[] { 1, 2, 3 };

// A list of delegates
var delegates = new List<Func<int>>();

// Lets go through the ints, and make a 
// delegate for each one which returns the corresponding int.
foreach(int i in ints)
{
 delegates.Add (() => i);
}

// Now go through all three delegates, printing the return values.
// It should go "1, 2, 3", right?
Console.WriteLine (delegates[0]());
Console.WriteLine (delegates[1]());
Console.WriteLine (delegates[2]());


WRONG! It will go "3, 3, 3"!

That's because the foreach statement in C# doesn't create a new binding for 'i' at each iteration. It behaves more like 'i' is declared outside the loop and the value reassigned 3 times. Each new delegate we're creating is getting the same binding - accessing the same variable. I find this a little counter-intuitive, as I would expect there to be 3 different bindings created in 3 different lexical environments as we iterate over the ints, just as there would be if 'i' had been declared inside the loop. (For what it's worth, the C# 5 compiler changes exactly this, giving foreach a fresh 'i' per iteration; the compiler I'm using evidently doesn't.)
That last part is actually the way to fix this mess:

int[] ints = new[] { 1, 2, 3 };

var delegates = new List<Func<int>>();

foreach(int i in ints)
{
 // 3 different bindings will be created for 'tmp'
 // in 3 different lexical environments.
 // We're initializing it with the *value* of 'i'
 int tmp = i;

 // Close over the binding of 'tmp', rather than that of 'i'
 delegates.Add (() => tmp);
}

Console.WriteLine (delegates[0]());
Console.WriteLine (delegates[1]());
Console.WriteLine (delegates[2]());



And that's all, folks. Hopefully I've helped someone understand lexical closures better. Maybe I've even managed to steer someone away from falling into this trap.

Happy coding :)

Monday 29 April 2013

More TestVisor Development

Firstly...

Git

I'd like to point out that Git is pretty darn good.
I picked GitHub for the hosting because I liked the look of it and had heard ravings about Git; this is probably one of the best decisions I made for this project.
I've only had extended contact with Mercurial and Subversion in the past and I can definitely appreciate the difference in target markets:
Git is targeted toward people who value a holistic view of their environment; it's more of a toolbox than a black box. Mercurial (or Hg) is more of a nice, extensible black box with a good set of buttons on top.
Any software dev team should really be familiar with at least one of these tools.

IKVM

IKVM seems like a magical piece of software. It's basically a JVM running on CLI/.NET, and it works!
I had a problem - I'm developing in Mono, but I was lusting after Mozilla Rhino, a nice Javascript engine written in Java. After a quick look around, I didn't really find anything that worked as well for my platform, so I went ahead and downloaded Rhino. One call to "ikvmc", and my problem seems well and truly solved! I got a Rhino.dll that I can link into my Mono project, with the worst of the hassle being that I also had to include /usr/lib/ikvm/IKVM.OpenJDK.Core.dll (to use some JVM types like java.lang.Class), which took around a minute to locate and reference.
I've gotta say, it felt weird typing this into a C# project:
return Context.javaToJS(new java.lang.Boolean(func(testParams, testKey)), s1);

TestVisor

It's been an exciting few days for this project. I'm still on holiday, so I've been tinkering with it pretty much nonstop.
I'm hopefully getting some help with the work from a University chum before long.
Here's a little demo of Javascript test plans:



Tuesday 23 April 2013

TestVisor passes its first test

TestVisor

I've been working away on this little project for a while as an exercise in coding something useful in my free time.
The end result will be an ajaxilicious virtual machine hypervisor manager which will be able to run sequences of tests on a virtual machine instance,
making use of snapshotting to deal with real world automated testing in a state-controlled way.

It's a long way from done, but here's a taster:



Here we see it initializing some stuff, powering up a Windows 7 Virtualbox instance, downloading a test, executing the test and uploading the results.

My holiday started today and I've been hacking away on this since the morning; I'm quite pleased with what I've got so far. I've been poking this on and off for a few weeks now, but there's truly no substitute for 'the zone', which I finally managed to get into with this.

I was also pleased to find that I could compile a mono executable on the host linux machine I'm developing on and download it to the Windows 7 VM to run under CLR with 0 problems.

It's on GitHub, but not in an amazing state currently. Largely stubs, test code, commented out test code and some hints of where it's going.

Exciting to finally get another project underway, I can't even remember what my last one was...

Saturday 20 April 2013

Being an OOP Expert

It's tough to be an expert at Object-oriented programming. There are all kinds of things you have to know about, like singletons, visitors, abstract superclasses, methods, factories, facades, the bridge pattern...
If you don't know what these things are, how can you ever hope to produce scalable, maintainable software?

A Job Interview

At the interview for my current job, I was presented with a matrix to fill out. The rows were different skills or technologies, and the columns went from "Don't know about" to "Expert" or something similar.
Amongst others, there was a row for Unit testing, a row for .NET Framework and a row for Object Oriented Programming.
I ticked the "Expert" box for this row only.
Now you're probably thinking how arrogant I am. Some of you, maybe those versed in CLOS multimethods or Haskell's typeclasses, are probably already thinking of how best to blow my mind with a humbling comment.
That's why I'm explaining now, that I'm aware of the number of different technologies and practices which "Object Oriented Programming" can be made to refer to. I'm also aware that those technologies are almost certainly not the ones they mean when they're asking how much I know about OOP.
For the rest of this post, when I'm speaking about OOP, I'm referring to what 90% of the software world knows as OOP; that is, a class-based, stateful object system with information-hiding annotations and subclassing. Java, C# and C++ are the systems I speak of.


I'm not convinced

At the risk of sounding arrogant all the same, I'd say I've written plenty of code in Java and a reasonable amount of C#; I've been to hell and back with this idea of OOP.
I used to be an OOP proponent. A zealot.

Once, I accidentally used a 2D array to represent a matrix for a scholarship project. I was advised by my supervisor to do something more "object oriented" such as represent a row with a class and use a Java List<T> for the columns.
This greatly improved the scalability and modularity of my code, so much so that after the project, my code was bought from me for a large sum of money by a multinational firm, and it currently sits at the core of a complex probabilistic model used to drive the world's cruise missiles. Couldn't have done that with a mere 2D array, could I now?
Ok, I promise I'll stop the sarcasm now.

"More object oriented"...

What I was really being advised to do was to make use of a larger number of the OOP constructs provided by the language. I went ahead and did this, as it seemed like a good idea at the time.
Like any nerd I like shiny things and by god, the result was that.
It was also completely nonsensical to anyone who actually understood how the solution worked.
This is what happens when the solution you want to implement drifts a long way from the terms you're implementing it in.
This is actually the idea behind Domain Specific Languages (DSLs), where the language is designed around concepts involved in solving the problem.

I had an excuse - I was 16 years old and I had just started really playing with OOP, falling into all the traps, spending most of my time figuring out how to represent my solution with sets of classes, and generally going nuts with the whole idea.
This is not a professional or productive way to work. This kind of thinking is not engineering.

Programming

To me, programming is about finding the route from the problem to the logic gates in the processor. It's a game of representations, that is, abstractions. Abstractions are necessary so you can make large problems manageable by putting them into simple terms. As a software engineer, your job is often to design these abstractions and make them useful in as many situations as possible, whether that's by extending a language like Java with a fancy library, or creating a new DSL over a host syntax like XML or even with its own custom parser.
This is a big part of what makes software engineering so hard; there are many ways to look at a particular problem.

Putting aside the awkwardness we know about, you might think representing that matrix as a list of column objects would be a reasonable idea. But just wait until someone wants to build a system on top of it which needs to look at matrices row by row or god forbid, even diagonals.
It's not intractable; you can indeed wrap it up in a "float[] GetMatrixRow(int rowIndex)" function, but that's another layer of abstraction to maintain, another layer between what the compiler reads and your actual intent (bye bye code optimizations), and more code to drown in when you can't figure out where that NaN is coming from.
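Something like this, say (hypothetical names, and assuming the matrix ended up stored as a list of column objects):

using System.Collections.Generic;

// The "object oriented" representation we've ended up with.
class MatrixColumn
{
 public List<float> Values = new List<float>();
}

class Matrix
{
 public List<MatrixColumn> Columns = new List<MatrixColumn>();

 // The extra layer: reassemble a row from the column-oriented storage so that
 // row-by-row consumers can pretend the awkwardness isn't there.
 public float[] GetMatrixRow(int rowIndex)
 {
  var row = new float[Columns.Count];
  for (int col = 0; col < Columns.Count; col++)
   row[col] = Columns[col].Values[rowIndex];
  return row;
 }
}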

Another problem with building a code tower like this is that your actual "intent", whatever that was, is now just some part of this mass of abstraction. As far as the compiler is concerned, it's a lump of code. It'll compile it, but there's not going to be any reasoning about it.
What you're writing is not something which describes your solution particularly well, rather something which after being run will produce correct behaviour. This makes the implementation less valuable, in my opinion.
This all sounds a little extremist, I'm sure, but keep in mind that the problem I posed was an extremely simple one - doing something columnish with a matrix of numbers.

erm... OOP?

Yeah yeah, I know, I've been banging on about domain specific languages and their benefits for a good number of paragraphs now; This was supposed to be about OOP.
OOP represents an extension of what gets commonly called the "imperative" programming model;
The "Do this, now do this to change that, now check if that's changed, do these things if it has" model, which should really be considered a pile of fairly sophisticated syntactic sugar for machine code, as it really isn't all that far departed from the basic Von-Neumann model which is what's actually happening in your computer case.

OOP is an extra layer of sugar up, and a slight move into another dimension - there is an introduction of some extra concepts which aren't just about sequenced operations.

Here's the thing: in order to solve almost any real-world problem, you have to build abstractions atop your processor. Unless your problem is to add 3 numbers together in sequence, for which, you could say, your processor supports hardware acceleration :P.

Some problems are solved by computing linear functions to calculate what direction your cruise missile is travelling in.
Some are solved by running an optimal sequence of dependent steps in the most efficient way, such as a build system (Makefiles are a good example of this).
Some need to keep track of groups of values which evolve and affect one another in predefined ways, such as a particle system.
Wikipedia tells me that OOP came about trying to solve problems similar to the last one.

It's no secret that as long as your machine abstraction (model) is Turing Complete (TC), you can represent any computable function. That goes for OOP too; there's no saying it's impossible to code certain things in an OOP or non-OOP TC language.

The point I'm making is that having the OOP abstraction as part of your language really isn't that applicable and useful by itself, and it's definitely not something which will increase productivity and scalability for most classes of problems.
Unless your problem is to represent the state of synchronous message-receiving groups of values, it won't solve much of your problem either. Yes, I know you can use those constructs to represent a solution to the problem, but you could also probably build a lot of furniture with hammers and nails. If you've got a hammer available, that's great, but it doesn't mean it's time to start either building nail-oriented furniture, or building more appropriate tools by nailing multiple hammers together to create some kind of saw and......
You get my drift; Go and buy more tools. :)

Patterns

So, you've got an OOP language: a language with first-class support for managing groups of values and synchronous messaging between those groups of state.
Great. Now you want to solve your problem with it.
Design patterns to the rescue!
Yes, conceptual adaptors if you will, to convert the model your OOP language gives you into something you can use to solve a common class of problem. There are some patterns which are only a little extra atop the OOP model. Other patterns however, are quite a jump. The Visitor pattern, for instance. Now I'm not going to start whining about the Visitor pattern and how complicated and unreadable it is to implement, because that's not important. What's important is how readable the result is, and in my humble opinion, the result works quite nicely once you look past the boilerplate. This is important to my point, as I can now point out that having that OOP model somewhere in there didn't really help us at all. It didn't bring us any closer to traversing that tree or whatever we used the Visitor pattern for, if anything it got in the way a bit.
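For anyone who hasn't met it, here's roughly the shape of the Visitor pattern in C# - a toy expression tree, boilerplate and all:

using System;

// One Visit method per node type.
interface IExprVisitor
{
 void VisitNumber(Number n);
 void VisitAdd(Add a);
}

abstract class Expr
{
 public abstract void Accept(IExprVisitor visitor);
}

class Number : Expr
{
 public int Value;
 public override void Accept(IExprVisitor visitor) { visitor.VisitNumber(this); }
}

class Add : Expr
{
 public Expr Left, Right;
 public override void Accept(IExprVisitor visitor) { visitor.VisitAdd(this); }
}

// The readable part: one class per traversal.
class Printer : IExprVisitor
{
 public void VisitNumber(Number n) { Console.Write(n.Value); }

 public void VisitAdd(Add a)
 {
  Console.Write("(");
  a.Left.Accept(this);
  Console.Write(" + ");
  a.Right.Accept(this);
  Console.Write(")");
 }
}

class VisitorDemo
{
 static void Main()
 {
  Expr tree = new Add
  {
   Left = new Number { Value = 1 },
   Right = new Add { Left = new Number { Value = 2 }, Right = new Number { Value = 3 } }
  };
  tree.Accept(new Printer()); // prints "(1 + (2 + 3))"
 }
}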

It got in the way a bit

That's what I commonly find with OOP ways of thinking. I've seen some absolutely ghastly code which was written in a way which seemed to say: "Look, I managed to find a use for OOP here, here and here!", in fact I've written a fair amount of it. I'm not saying that OOP is the devil, because it is not. It's simply a model for solving certain classes of problems.
So if OOP isn't going to be the centrepiece, what concepts should a language designer make first-class citizens in a "General Purpose" Programming Language?
You can go for pure functional, which can be a pretty close fit for some problems, but of questionable use for matters of state.
There's the C#/C++ approach of including as much as possible. Clojure is a really good example of this, but the rules are a bit different there as it's a Lisp, and the distinctions between language features and library features tend to disappear.

Wait!

I haven't mentioned a lot to do with OOP, I know. I haven't really talked about encapsulation that much. I haven't talked about subclassing or polymorphism.
How on earth can this be a post about OOP without those things?
My answer to that is that it's not really a post about OOP. It's a post about representing problems, the misunderstood role of OOP as the saviour of software complexity, and why I think I can write that I'm an "Expert" on OOP.
I can write that I'm an "Expert" on OOP, because I know what it is and what it's not. I know when it's appropriate, and I know when it just gets in the way a bit. I also know that it certainly doesn't get the engineer out of having to solve the actual problem!


This turned out longer than I expected; I'm now off to watch today's Doctor Who ep.
'Til next time.

Back again

Yesterday was Friday. I got down on Friday a little too much by sleeping for about 14 hours after work. Here I am on Saturday, sitting in a Starbucks and committing a new post to my long-untouched blog.
I feel like I've got a couple of things to write about, so I should do it properly - with smooth jazz and a large latte from the oddly friendly baristas.

Gi'z a Job?

When I wrote my last post, I was about to embark on a grand tour of England to ask ~10 companies for work. That was around 2 years ago, and it turned out quite well. The first interview went quite well: I did their test and presented myself professionally (or so I thought), but apparently they didn't like me.
As it happens, I only recently figured out how I managed to make a pig's ear of that one. It was like a lover's quarrel - one unnoticed technical misunderstanding led to a nonsensical quip and before long, the interview had spiralled out of control while I, obliviously, was talking out of my arse with such prowess I'm surprised they didn't hire me for a sales position (hurr hurr).
I visited a number of other companies including Altera (great place to work by the looks of it; it would not have been a good fit for me at the time). I ended up at a great little firm in Bristol: Multicom. Not many people, lots of work; I couldn't have asked for a better first job.

'Professional'

I like to think I became a professional in the six months of employment which followed. I learned the ropes, I learned the shape of things to come, I learned how my skills fit into a business and what it's like to blend fun with that which makes money.
Being honest, I went in there almost suspecting that I was going to be nothing more than a drain on resources and a general annoyance. Quite the opposite happened - I gained a huge amount of confidence, and it turns out that employment was what I needed to start enjoying life.
After six months pounding out Java to mine data from all kinds of sources - clean XML APIs, ghastly XML APIs, nonsensical websites and Excel spreadsheets (http://en.wikipedia.org/wiki/Apache_POI is decent for this) - I moved back to Aberdeen for unavoidable family-type reasons (I wasn't actually homesick).

Oops, I learned something

It turns out it was much easier to get a job the second time round. With my CV bulked up with all kinds of experience, I was quickly snapped up and now spend most of my time working on fancy geology software.
I've worked there for about 16 months, now.
I have no regrets regarding my decision to take work instead of another year at university and an honours degree. It suited me, it might not suit you, or someone at another university on a different course.

Things I noticed along the way:

  • Employers are scared of graduates (rightly so, given what a joke some modern CS courses are).
  • It's good to customize your CV for different kinds of work you're willing to do. I personally kept 2 CVs around. One, I like to call my J2EE CV, which boasts of Enterprisey experience and my favourite Java features. My other CV has more of a tinkerer feel to it, talking about how I'm used to a C/UNIX environment and like to play with microcontrollers.
  • Software Engineering is a young profession and as such, there are lots of cults, misinformation, obsessions and unknowns. It's also strangely hard at times.
  • It's good to stuff your CV with buzzwords, so long as they're accurate: Your CV will almost certainly pass under the nose of someone who is looking for them.

Saturday 21 May 2011

Jobs

With the close of my AI BSc in sight, I've started making plans for what's going to happen afterward. I'm not having a ceremonial graduation because I'm a cheapskate, and likely to have been bored to death by it.

Exams are on 24th, 27th and 30th of this month, and I'm actually fairly ready for them.
I've been looking through job offers and telephoning around, have had a couple of phone interviews and been talking with agents. The most promising position so far, is for a python/javascript/django developer in London.
When I started searching for jobs, I anticipated that a move would be necessary, but London is a fair way from Aberdeen.


Onward and upward...

On the 27th, I've got a phone interview with another company who may want me for a Java role, accompanied by a coding test over a collaborative text editor. For that, I'm pretty damned nervous in all honesty.

I've had an interview for the first company already, which went pretty damned well. They've now invited me down for a face-to-face interview on the 31st - a day after my last exam. It's probably going to end up being on a plane or an overnight train.
 


I didn't manage to land the ARM position; I'm not sure why.


Ah, well...


Not much time lately for messing about - I've been studying my arse off for the coming exams, specifically my 'Computer Games AI' course; a lot of the material is shared with my Higher exam from back in secondary school.

There was also a period when it felt likely that I'd get a position in France, so I decided to crash-study French. It was easier than I thought, and I found myself to be potentially very good at languages. Sadly, I've forgotten most of what I learned that day. I'll get back to it at some point.


Lee.

A rant about iPads

I've got a friend whose MacBook Pro recently died without warning.
Apple were willing to replace the dead motherboard for about £419. A fair bit for a 4/5-year-old machine.
Also recently, the same friend acquired an iPad 2 for the price of £500.
He says he's looking for a new Apple laptop to replace his old one, 'To do university work on'.
So, I thought: he just forked out for an iPad 2 - isn't there an app for that, or something?
I'm serious - you buy a computer, but it can't do quite everything that a normal machine can do, so you have to buy one of those, to make up for the deficits of the first one.

If he can't write and compile code and write reports on his iPad 2, what the hell is the point of the unit?

Steve Jobs cites the iPad as a 'missing link'-device, 'bridging the gap between netbooks and smartphones.'
Some iPad proponents accept these deficits, and attempt to justify them with claims fitting the form of: 'But it's not a computer, it's an appliance - it's not meant for serious computer users, it's meant for people who want to do light tasks'.
(For such a discussion, read: http://www.geeksix.com/2010/01/the-really-big-point-that-ipad-haters-are-missing/)

I'm going to summarise the point made by that page in my own language:
"The iPad can't do things, not because it's a 'weaker'-type of computer, but because it's not a computer at all, and as such, it's not supposed to do these things. Also, anyone who still doesn't understand is a stupid-face who is unable to grasp subtlety and good design!".


What Apple have in fact done is take a successful product (the tablet PC, a dubious device for a normal user anyway), remove features that most people don't use, and market it as a separate 'kind' of device.

But the iPad isn't even that good. It omits features that normal people (even novices) DO want, such as an SD card slot, a keyboard and a USB port. I know that these things are available as addons, but the dongles look bad and, when attached, remove some of the device's actual selling points (nice shape, pretty finish and self-contained appearance).

It also seems to actually require the presence of a 'full' computer, running Apple's iTunes application to update iOS.




The argument that the iPad is 'not a computer in the normal sense', presented in the article on http://www.geeksix.com doesn't hold water, because:
  • The purported use of the iPad is exactly the same as the use of a normal desktop/laptop computer most of the time, for many users. We know this is true anyway, or there would be a real lack of user-base.
  • Addons exist which allow the iPad to duplicate much of the functionality of modern PCs, such as an SD card adaptor, USB adaptor, and addon keyboard.
  • There is an online repository of free and paid applications available for the iPad. In its stock state, the iPad is perfectly capable of 'normal-person tasks', such as playing music, social networking, web surfing and email checking. In fact, I'm surprised Apple even have an app store, considering how much it detracts from the iPad being different from laptops/desktops/netbooks/smartphones.

The iPad is not another class of device. It's mostly a normal computer. It's a hardware/software combination which can carry out tasks.
The iPad is designed to create a synthetic market segment; its OS is simply another platform, separate from and incompatible with others already in place such as OS X and Windows.

Many-an-ignorant-user has presented me with the same argument in fact, for Apple's desktops/laptops - that they aren't meant for 'power users', they're 'just for normal people who don't want to know how computers work'. This is more obviously flawed than the similar iPad argument, I believe, due to the larger, more open ecosystem of applications available for OS X, when contrasted with that of iOS.

In fact, I'll bet that if the device survives, we'll start seeing IDEs and other similar tools which aren't for 'normal people', running on some incarnation of iOS, before long.

I think I've been quite good here, in that I've avoided poking the normal holes in Apple products, like 'delicate' or 'overpriced'.
This has been more of an attack on the philosophy/ideals held by Apple and some of its users.

If you disagree, show me your worst in the comments, I look forward to reading them.

'Til next time :)


ADDENDUM:
If you do feel like swapping £500 for something with a touch screen, may I recommend the Dell Inspiron Duo

Wednesday 4 May 2011

A driver

I've always been fascinated by the Linux kernel, mainly how so many drivers and bits get written into it with so little 'real' documentation. Some of it's there, scattered across the web. I've yet to find anything as comforting as a Linux kernel doxygen manual.
The closest thing that's currently up and running is LXR (http://lxr.linux.no) - a sourcecode viewer/search site.

I've been holding off from actual kernel hacking for a long time (this is mainly down to intimidation), but yesterday I finally bit the bullet and started reading source code properly and trying to understand it. And it turns out it's not so bad after all. To prove it, I wrote a simple proc file driver: http://fluffy.bizarrefish.org.uk/sync/fun/lee.c


Toodleoo. I'll have something big to write about soon, I imagine.