Biblical kings and boxplots

If you have read through the biblical books of Kings, you may have been struck by a phrase that repeats for every monarch:

In the Xth year of (king of kingdom B), (name of king) became king of (kingdom A). He reigned N years, and did (evil|good) in the sight of the Lord.

If you’ve read through these books several times, you will probably have noticed that the shorter reigns tend to belong to kings deemed to have done evil, with a record-breaking 3 months each for Jehoahaz and Jehoiachin. Let’s see if there’s any relationship between reign duration and “goodness” of the king. First we prepare the data in a form suitable for analysis:

# kings_text is a placeholder name for the raw data string
kings_text <- c("Name Deeds Reign
                David Good 40
                Solomon Good 40
                Rehoboam Evil 17
                Abijah Evil 3
                Asa Good 41
                Jehoshaphat Good 25
                Jehoram Evil 8
                Ahaziah Evil 1
                Joash Good 40
                Amaziah Good 29
                Azariah Good 52
                Jotham Good 16
                Ahaz Evil 16
                Hezekiah Good 29
                Manasseh Evil 55
                Amon Evil 2
                Josiah Good 31
                Jehoahaz Evil 0.25
                Jehoiakim Evil 11
                Jehoiachin Evil 0.25
                Zedekiah Evil 11")

# stringsAsFactors = TRUE keeps Deeds a factor in R >= 4.0
kings <- read.table(textConnection(kings_text),
                    header = TRUE,
                    row.names = 1,
                    stringsAsFactors = TRUE)

The kings data frame holds one row for each king. The row names are the names of the kings; the column Deeds is a two-level factor telling if their reign was deemed good or evil. (We assume here that Solomon was a good guy in spite of what happened towards the end of his life.) The Reign column records the length of the reign as given in the Bible in years, with fractional values for reigns shorter than one year.

We have 11 evil kings and 10 good kings:

> table(kings$Deeds)

Evil Good 
  11   10 

Here we compute the median reign duration depending on the rating of the deeds:

> with(kings, tapply(Reign, Deeds, median))
Evil Good 
 8.0 35.5 

There’s already a good indication that the length of the reign depends on the deeds. We can now plot the length of the reigns:

Boxplot(Reign ~ Deeds, data = kings)

Duration of biblical reigns, depending on whether the Bible judges the king to have been a just or an evil one

This plot confirms our impression: “evil” kings tend to have shorter reigns than “good” kings, with the obvious exception of Manasseh, the same one of whom it was said:

I’ve arranged for four kinds of punishment: death in battle, the
corpses dropped off by killer dogs, the rest picked clean by vultures,
the bones gnawed by hyenas. They’ll be a sight to see, a sight to
shock the whole world—and all because of Manasseh son of Hezekiah and
all he did in Jerusalem. (Jer 15:3-4)

So what does this all prove? Probably nothing. Plotting data is its own reward. :-)

C++: when delete doesn’t delete

We once spent almost a week chasing after a mysterious memory leak in our application, built on top of the highly regarded eCos real-time operating system. The leak appeared after we had rewritten some of our code in C++ after recognising that C’s object-oriented capabilities were no longer adequate for our needs.

After about half a day we could reproduce the memory leak on the target system with code that essentially looked like this:

Controller* controller = new Controller();
delete controller;

What baffled us most was that running this code in unit tests on the development machines exposed no such memory leak. We routinely run all our unit tests under Valgrind to identify memory usage errors, but in this case there was none. It was very unlikely that the leak was caused by defective code.

What’s more, the leak was almost-but-not-quite consistent. We leaked about 952 bytes on average, but that figure could be as low as 920 or as high as 968. It was always a multiple of 8 bytes. After about 8.5 hours, the system would reboot, presumably because it ran out of memory. We used the mallinfo() function to display the amount of available memory.

After almost a week we found the answer. According to the documentation, the default implementation of the delete operator in eCos is a no-op! I suppose the rationale is that most developers of embedded systems tend to shy away from dynamic memory allocation, and that it is better to reduce the size of the firmware by not providing a (rarely-needed) delete operator.

Except when we need one, of course.

To enable a proper delete operator you simply disable the CYGFUN_INFRA_EMPTY_DELETE_FUNCTIONS option in your eCos configuration file.
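
To make the effect concrete, here is a minimal C++ sketch of what an empty delete does to a new/delete pair. It is not eCos code: the class-specific operators, the two byte counters and the leaked_bytes_after helper are all assumptions made for illustration, standing in for the global operators that eCos provides.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters for illustration; nothing here is eCos code.
static std::size_t g_bytes_allocated = 0;
static std::size_t g_bytes_released = 0;

struct Controller {
    char state[64];

    // Allocates with malloc and records how many bytes were handed out.
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        g_bytes_allocated += size;
        return p;
    }

    // Stand-in for an "empty" delete: the storage is never freed.
    // A working version would call std::free and update g_bytes_released.
    static void operator delete(void*) noexcept {
    }
};

// Runs the new/delete pair repeatedly and returns the bytes still outstanding.
std::size_t leaked_bytes_after(int iterations) {
    const std::size_t before = g_bytes_allocated - g_bytes_released;
    for (int i = 0; i < iterations; ++i) {
        Controller* controller = new Controller();
        delete controller;
    }
    return (g_bytes_allocated - g_bytes_released) - before;
}
```

Calling leaked_bytes_after(10) reports ten Controller objects’ worth of storage still outstanding, even though every new was paired with a delete.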

Going for one-week sprints: a good wrong idea

A few weeks ago, our team held a sprint retrospective (which I unfortunately couldn’t attend) during which it was decided to shorten our sprint length from two weeks to one. The team was right in their decision, but probably for the wrong reasons, and here’s why I think so.

The main driver behind this decision was Neurobat’s involvement with the Aargau Heizt Schlau project: a canton-wide project to measure the efficiency of our system on 50-100 individual houses in the Aargau canton during the 2015-2016 winter. The goal is to have an independent assessment of the energy-savings potential of our product, a replication of our own peer-reviewed investigation. The project is mostly driven by a team in Brugg, with ample support from our R&D team in Meyrin.

Keeping the project on track and on schedule turned out to be extremely challenging. Very soon, urgent support requests began to come at unpredictable times, and we were having trouble keeping our sprint commitments.

The realisation that urgent, random support requests were going to be the norm for this heating season is the main reason why the team decided to experiment with one-week sprints. They have a 5-year long history of sprint retrospectives and I’m convinced they collectively understand the principles underlying the practice of timeboxed iterations. (A practice always proceeds from a principle, and can be modified only when the principle is fully understood.) A less mature team should not have made this decision and should stick with 2-week sprints; but I believe our team was mature enough to carry out this experiment.

Many teams, when they begin with Scrum, will object to the “overhead” introduced by daily standups, sprintly retrospectives and planning meetings. And since they are likely to miss their sprint commitments during the first few months, they are very likely to ask for longer sprints. Resist this temptation.

Mike Cohn tells the story of a team facing exactly this problem: good quality work but systematic overcommitments. He agreed to let the team change the sprint duration, but went against the team’s request for longer sprints. Instead, they went for shorter ones. His rationale against longer sprints is simple:

The team was already pulling too much work into a four-week sprint.
They were, in fact, probably pulling six weeks of work into each
four-week sprint. But, if they had gone to a six-week sprint, they
probably would have pulled eight or nine weeks of work into those!

So if shorter sprints are generally to be preferred over longer ones, why do I think the decision was a solution to the wrong problem? Because I believe that switching to shorter sprints will only perpetuate the root cause of the situation we are in. We decided to go for shorter sprints because urgent support requests were coming in more frequently than ever. How will switching to shorter sprints solve that problem?

I’m reminded of this quote, which I believe came from Mike Cohn’s Succeeding with Agile:

Few organizations are in industries that change so rapidly that they cannot set priorities at the start of a two-week sprint and then leave them alone. Many organizations may think they exist in that environment; they don’t.

If your organisation has trouble planning for more than a week ahead, then do your development team a favour and try, at all costs, to address the underlying problem. Your team members’ productivity should not suffer for the lack of foresight elsewhere in the organisation.

Scrum stories that are juuuust right

One thing has been bugging me for quite some time now as I observe our team at Neurobat. Most stories on our sprint board are being worked on by one developer each, leading to daily scrums where everyone reports on work that is completely independent from that of the others.

Even though we encourage people to pair program, the fact remains that most stories are such that one person can implement them by himself, with the possible exception of testing. (We have a rule that a developer may not write the acceptance tests for his own story, much less execute them.)

That, in turn, leads to a very quiet office. We work in an open-space office where the six of us are in direct line of sight of each other. Yet for most of the time, there is very little chatter as each of us is busy with “his bit”.

Perhaps Mike Cohn summarises the issue best, in his User Stories Applied:

Most user stories should be written such that they need to be worked on by more than one person, such as a user interface designer, programmer, database engineer, and a tester. If most of your stories can be completed by a single person, you should reconsider how your stories are written. Normally, this means they need to be written at a higher level so that work from multiple individuals is included with each.

I’m not a big fan of hyperbole, but this passage was a little bit of a revelation to me. Here we had been faithfully trying hard to break up stories that were too large into teeny-weeny stories that could be implemented in a day or two by a motivated developer; and now I’m being told that there is such a thing as a story that is too small? Talk about being in a Goldilocks-ish fix.

Very well Goldilocks er… I mean Mr Cohn, I’ll bring this up at our next retrospective and we’ll see whether our stories are really too small.

Running CARNOT models under OSX

CARNOT (Conventional And Renewable eNergy systems Optimization Toolbox) is a set of MATLAB & Simulink models for simulating buildings and building systems, e.g. boilers, heat distributors etc. It’s been developed by a collaboration involving several companies and universities and is generally well-regarded. It’s one of several MATLAB toolboxes dedicated to the problem of simulating building physics; other toolboxes with a similar goal include the International Building Physics Toolbox and SIMBAD.

CARNOT is in the process of being moved to a new hosting provider. In the meantime, I’ve recently obtained a copy of this toolbox and here is my experience in getting it to run under OSX.

CARNOT is distributed as a zip file. I decompress it and find what looks like a Simulink top-level model called carnot.slx, and several sub-folders. Very encouragingly, I see there’s an installation guide:

CARNOT root folder

I move the decompressed folder to the folder where I keep all my in-progress projects, and create a symbolic link to it named more simply carnot.

The installation guide is very well written, and the main steps consist of:

  1. Decompressing the toolbox;
  2. Running the init_carnot.m script, which will set up all the paths correctly;
  3. Compiling all the MEX files with the provided script.

When I ran init_carnot.m for the first time on my Mac, it didn’t work and the error message made it very clear that most file paths had been written with the assumption that the toolbox was going to be used on Windows. At this point, I could have done either of two things:

  1. Fix the issues myself as quickly as possible and get on with the work;
  2. Fix the issues myself carefully, making sure that my fixes could then be sent back to the maintainer of the package.

Being currently on a business trip, I felt like I had the leisure to go for option 2. Seeing that this toolbox was going to need some fixing, I made a git repo out of it and added all its files. (I didn’t know at this point which, if any, files were generated. I figured that this was something I could worry about later and remove those files from the repo.)

I fixed the paths problem, which mostly lay in path_carnot.m, called by init_carnot.m. I created a patch from it and sent it to one of the maintainers. The paths were now correctly set up.

Next I ran mex -setup and this ran fine: MATLAB picked up my Xcode installation with the Clang compiler. So the next step in the installation guide was to run MakeMEX.m from the version_manager directory. That ran fine for several .c files until it tried to compile dir2_mex.c. When I opened that file in the editor I saw that it depended on the Windows API. Here I had two options:

  1. Try to understand what this file was doing and try to rewrite it without using the Windows API;
  2. Skip the compilation of this file and hope that the rest of the toolbox would run fine without it.

The problem with option 1 was that I didn’t know at this point whether there were going to be other files with the same problem. I certainly didn’t want to enter that kind of endless loop of fixing file after file that depended on the Windows API. Since the build process had stopped on encountering the first error, I had no way of knowing if there were going to be many other problems.

So it felt more appropriate to find where in the script mex was being called, and wrap that call in a try/catch block, yielding a warning if any file failed to compile. Re-running MakeMEX now compiled all the files correctly except the single dir2_mex.c, which I hoped would not be needed to run any of the simulations I was planning.

Once this was done, I could finally type carnot at the command line and the toolbox would open:

Screenshot of the CARNOT toolbox window

I was immediately drawn to the box that says double click to open examples and that yielded another set of errors, again related to file paths. After fixing those I could open an example model, the example_House_SFH45, click run, and see the simulation running. I was all set and done.

A couple of days after writing the first draft of this article, I learned from one of the main developers that they plan to set up a proper SVN repository for the code, and that the whole toolbox was going to be released under a BSD licence instead of the current LGPL. Until this is done, I’ve pushed my fixes to a public repository on GitHub, to which I have contributed some extra small fixes. But keep in mind that this is in no way the official repository for CARNOT; that will be announced shortly.


The only problem with daily scrums

Over the past five years, our team has attended more than 120 daily standup meetings, carefully following the “canonical” format and having each team member answer the usual questions:

  1. What did you do yesterday?
  2. What will you do today?
  3. Any impediments?


There seems to be one flaw with this format, however. The flaw is that you cannot say what you will do for the day before having heard if anyone else has an impediment.

For example, suppose Alice says her bit, announcing what she intends to do for the day and mentioning no impediments. When Bob’s turn comes and he mentions that he’s having some trouble and could use some help, Alice will have to go back on what she has just said and amend her plans for the day.

At Neurobat we’ve implemented a partially debugged hack around this problem by having two standup rounds. During the first round we do the canonical standup meeting, then the ScrumMaster asks if anyone needs a second round. The goal of this second round is two-fold:

  1. To let anyone say something he may have forgotten about during the first round.
  2. To let anyone amend their plans for the day due to something they may have heard during the first round.

This solution is far from ideal, and sounds annoyingly like a two-pass compiler. But it is, for now, the best approach we have found to deal with what I perceive to be the main drawback to the canonical form of the daily scrum.

How to fix rotation problems with iPhone pictures

When I take a picture with my vertically-held iPhone, here is what happens when I insert it as-is in this blog:

Wrongly rotated iPhone picture

But the picture shows up correctly when I open it in any OSX application, such as Preview. The issue is that when you take a picture with your iPhone, a meta-data tag gets written to the file telling OSX how to rotate the picture when it is displayed. You can see the tag by using the inspector in Preview:

Inspector data for iPhone picture

The offender here is that Orientation tag, which seems to be used only by OSX applications. The best way to fix this is to remove the tag, rotate the picture correctly with Preview, and save it again.

To remove the tag, I recommend using a tool called ExifTool. It’s a neat command-line tool that you can download here. Once downloaded, removing the tag is as simple as this:

$ exiftool -Orientation= filename.jpeg

This replaces filename.jpeg with the same file but with the tag removed, and saves a copy of the original file as filename.jpeg_original. Give it a try, I really recommend it.

Reviewer queue

During a recent sprint retrospective we raised a problem with the way we assign code reviews. Not the formal, whole-team ones, but the regular ones we solicit for each pull request.

The problem was that we tend to select our reviewers based on various subjective criteria, including how much we like the person. I admit I am guilty of this myself. What’s more, during the discussion it became clear that my own help in reviewing code was not asked for as often as it used to be.

At Neurobat we currently have a rule that all pull requests must be reviewed by two other team members (one, if the pull request was paired on). To ensure these reviewers are selected fairly and without subjectivity, we have now introduced a reviewer queue: our names are listed on the main whiteboard and an arrow is drawn, showing who is next in the review queue. When a reviewer is assigned, the arrow moves to the next name.

Neurobat reviewer queue

We’ve had this in place for a couple of sprints now and the results have been very satisfying:

  • people get to review parts of the code they had never seen before
  • people are “forced” to review code written in unfamiliar languages
  • the reviews are more likely to be honest and thorough
  • the review work gets more evenly spread out among the team

An added benefit for myself is that by explicitly putting my name in the review queue, I announce my willingness to participate in the reviewing process as much as anyone else. As a result, I’ve been reviewing much more code these last couple of weeks than ever before.

If you have a problem in the selection of reviewers in your own team, do consider setting up a review queue and let me know whether that works out for you.

How to test for floating point exceptions with CppUTest

Some programmers, when confronted with a problem, think “I know, I’ll
use floating point arithmetic.” Now they have 1.999999999997 problems.
// Tom Scott

Floating point arithmetic is notoriously hard to get right. I consider writing a bug-free, optimally performant numeric library to be approximately as hard as writing a compiler. Fortunately, most programmers don’t need to deal with it, unless their work involves anything to do with science or engineering.

There’s one subject though where I think you need to be a bit more careful. This is about understanding when and why your program will catch floating point exceptions (FPE). Let’s consider a couple of examples.

Consider first this program

public class FPE {
  public static void main(String[] args) {
    int i = 0;
    System.out.println("1 / 0 = " + (1 / i));
  }
}

Compiling it and running it yields:

$ javac
$ java FPE
Exception in thread "main" java.lang.ArithmeticException: / by zero at FPE.main(

In Java, dividing an integer by zero yields an ArithmeticException. Fair enough. What about floating points?

public class FPE2 {
  public static void main(String[] args) {
    double i = 0;
    System.out.println("1 / 0 = " + (1 / i));
  }
}

Now this yields something different:

$ javac
$ java FPE2
1 / 0 = Infinity

I’m not sure I like having such a wildly different behavior. But consider now the same programs in C:

#include <stdio.h>

int main() {
  int i = 0;
  printf("1 / 0 = %d\n", 1 / i);
  return 0;
}

This is the result (under OSX):

$ gcc -o FPE FPE.c
$ ./FPE
Floating point exception: 8

Not exactly the most helpful error message ever, but at least the program crashes. Now the same thing with doubles:

#include <stdio.h>

int main() {
  double i = 0.;
  printf("1 / 0 = %g\n", 1 / i);
  return 0;
}

And here’s the result:

$ gcc -o FPE2 FPE2.c
$ ./FPE2 
1 / 0 = inf

So Java and C behave similarly: dividing an integer by zero crashes the program, but dividing a double by zero does not. I find it rather unsettling that 1 / 0 should behave so differently from 1 / 0.0. I realise now that I had assumed all divisions by zero would be caught at runtime and cause the program to fail. This is, however, simply not true.

Our code at Neurobat includes a fair amount of numeric algorithms, which are decently covered by our unit tests. However, there remained the small possibility that the code could execute “illegal” floating point operations and silently fail.

There is no portable way to force a program to crash when a floating point exception is raised. You need to make sure that floating point exceptions cause a SIGFPE signal to be sent to your program. Only Google can help you here, but for OSX here is how you do it.

What you can do in a portable way is to test if a floating point exception was raised, and I highly recommend that you check for most floating-point exceptions in your unit tests. I say “most”, because you probably don’t need to test for FE_INEXACT. See the manpage for fenv for details.
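
As a minimal illustration of that portable check, the C++ sketch below clears the exception flags, performs one division, and then asks fetestexcept whether the division-by-zero flag was set. The function name is hypothetical, and the volatile variables are only there to discourage the compiler from folding the division at compile time.

```cpp
#include <cfenv>

// Hypothetical helper: reports whether numerator / denominator sets FE_DIVBYZERO.
bool division_sets_divbyzero(double numerator, double denominator) {
    // volatile forces the operands to be read, and the division done, at run time
    volatile double n = numerator;
    volatile double d = denominator;

    std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean slate
    volatile double quotient = n / d;    // no crash, even when d is zero
    (void)quotient;

    // The flag stays set until someone clears it, so it can be inspected later.
    return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Dividing 1.0 by 0.0 this way sets the flag, whereas dividing 1.0 by 2.0 leaves it clear. Because the flags are sticky, a single check at the end of a test run is enough to tell whether any operation along the way misbehaved.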

Here is how we do it in the CppUTest framework. You need to test for exceptions before and after running your unit tests. We use plain assertions because CppUTest doesn’t like that we use its assertions outside of a test run.

#include "CppUTest/CommandLineTestRunner.h"

#include <cassert>
#include <fenv.h>

// Abort if any floating-point exception flag (other than inexact) is set.
void assert_no_fpe_raised(void) {
  assert(0 == fetestexcept(FE_INVALID) && "Invalid floating-point exception raised during tests.");
  assert(0 == fetestexcept(FE_DIVBYZERO) && "Division by zero raised during tests.");
  assert(0 == fetestexcept(FE_OVERFLOW) && "Overflow raised during tests.");
  assert(0 == fetestexcept(FE_UNDERFLOW) && "Underflow raised during tests.");
  // FE_DENORMALOPERAND is not part of standard C; only some platforms define it.
#ifdef FE_DENORMALOPERAND
  assert(0 == fetestexcept(FE_DENORMALOPERAND) && "Denormal operand raised during tests.");
#endif
  assert(0 == fetestexcept(FE_ALL_EXCEPT & ~FE_INEXACT) && "Floating-point exceptions (other than inexact) raised during tests.");
}

int main(int argc, char** argv) {
  int result;
  // No flag should be set before the tests start.
  assert(0 == fetestexcept(FE_ALL_EXCEPT) && "Floating-point exceptions active before tests begin.");
  result = RUN_ALL_TESTS(argc, argv);
  // Check the flags accumulated while the tests ran.
  assert_no_fpe_raised();
  return result;
}

So did we ever catch any bug with this? Indeed we did. We use an off-the-shelf optimisation algorithm that minimises an objective function in an $N$-dimensional space. At each iteration, the algorithm needs to compute the midpoint between two points where the objective function is to be evaluated. It does this by taking the mean of the points’ coordinates, in the naive way: $x' = \frac{x_1 + x_2}{2}$. We found that if $x_1$ or $x_2$ is large enough, their sum can overflow. What’s worse, the program would not terminate or fail in any visible way, but just return rubbish.
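
The overflow is easy to reproduce. The C++ sketch below compares the naive mean with one common alternative that halves each coordinate before adding. This is only one possible rewrite, not necessarily the fix that was applied to the library, and all function names are hypothetical.

```cpp
#include <cfenv>
#include <cfloat>

// Naive mean: the intermediate sum can exceed DBL_MAX.
double naive_midpoint(double x1, double x2) {
    volatile double a = x1, b = x2;   // volatile: keep the arithmetic at run time
    volatile double sum = a + b;      // overflows to infinity for large inputs
    return sum / 2.0;
}

// Halve first, then add: no intermediate value is larger than the inputs.
double halved_midpoint(double x1, double x2) {
    volatile double a = x1, b = x2;
    return a / 2.0 + b / 2.0;
}

// Hypothetical helper: true if calling midpoint(x1, x2) sets FE_OVERFLOW.
bool sets_overflow(double (*midpoint)(double, double), double x1, double x2) {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double result = midpoint(x1, x2);
    (void)result;
    return std::fetestexcept(FE_OVERFLOW) != 0;
}
```

With both arguments equal to DBL_MAX, the naive version sets the overflow flag and returns infinity, while the halved version returns DBL_MAX without setting it. The price of halving first is a possible loss of precision when the inputs are subnormal.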

The bottom line is that if your program does any kind of floating point computation, consider having your unit test framework check for floating point exceptions. It probably won’t do it by default.