
The Gorilla Process Conundrum

November 17, 2014

There is a story that is told to show us how we all behave stupidly in corporations:

  1. Place five gorillas in an outdoor cage in a cold, windswept location.
  2. Suspend a banana in the cage above a ladder.
  3. When any gorilla attempts to use the ladder, soak all five gorillas with a fire-hose.
  4. Repeat until they avoid the ladder. After a short period of avoidance, replace one gorilla.
  5. Note that when the new gorilla attempts to use the ladder, the other gorillas will beat him up. Take the fire-hose away.
  6. Repeat step 4 until all original gorillas have been replaced.
  7. Note that at this point, no gorillas use the ladder, any that try are beaten up, and none of them knows why.

…and aren’t the gorillas all stupid and everything for being unpleasant to each other for no reason.

The cheeky bit in this story is that little bit at the end of step 5: we know the fire-hose has been taken away but the gorillas don’t. People who have more information than others should not feel they are being clever, or the others are being stupid, when the difference is merely who has been told what.

If we keep the fire-hose with intent to use it, then the moral of the story becomes very different: the gorillas as a group are behaving in a way that keeps them dry even if they don’t know why.

Which is one use of process. We can follow process without requiring deep understanding, and that means in practice we can quickly learn ways of working usefully before finding out why it’s better to work that way. Once we know the work and the team well enough, then we can – and should – challenge poorly understood process.

That is, there are of course problems with simply accepting process or traditional practice, but there are also problems with letting every new gorilla climb the ladder to find out for himself: it takes time, and everyone gets wet and cold and miserable.

Many skilful tasks can be reduced to a checklist for routine use. For example, deploying webapps to Weblogic can become a routine task and there is no need (or wish!) for a skilled Weblogic guru to do this simple task when it can be written on a list and given to ‘junior’, cheaper staff, leaving the expert to work on other things.

Having a checklist improves organisational capability and reduces effort. Coders whose attention is on building applications to solve difficult business problems can work from that checklist to deploy their work themselves, without also having to learn the quirks of this particular web container or wait on the expert. The checklist will tend to be written for simplicity and readability, so may contain some redundant or long-way-around methods. As odd problems arise, extra steps may be introduced (such as routinely restarting the container) that are not strictly necessary. It is not the ‘correct’ way to deploy, but a pragmatic one.
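The shape of such a checklist can even be captured directly as code: an ordered list of steps run in sequence, including the not-strictly-necessary ones. This is a hedged sketch only – the step names and state flags below are invented for illustration, not real Weblogic operations.

```python
# A deployment checklist as code: an ordered list of steps executed in
# sequence. All names here are hypothetical illustrations.

def restart_container(state):
    # Not strictly necessary, but added after an odd problem one Tuesday.
    state["restarts"] = state.get("restarts", 0) + 1

def stop_old_app(state):
    state["app_running"] = False

def copy_war(state):
    state["war_deployed"] = True

def start_app(state):
    state["app_running"] = True

# The checklist itself: order matters, understanding is optional.
CHECKLIST = [restart_container, stop_old_app, copy_war, start_app]

def deploy():
    state = {}
    for step in CHECKLIST:
        step(state)
    return state
```

Note that the junior operator running `deploy()` needs no idea why the container restart is in there, which is exactly the point – and exactly the risk.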

The list can be distributed without the expert, which can be useful for quickly gaining capability, or can cause failures where the list is not appropriate and people have referred to it rather than consult an expert about their particular problem. The expert might leave out of boredom once all the challenging setup work has been done, or be ‘leaned out’. In both cases we now have a checklist, or process, without the expertise to monitor or review it. Workarounds are added by ill-informed rumour and the process gains warts and weight. Business needs change and the tools change, and process users attempt to shoe-horn them into the process, because there is no-one around to review it, or even authorise change to it – or worse, there may be people around whose job it is to enforce use of it, without understanding it.

Discarding process at any excuse is also not helpful, as people then continuously relearn tasks, and the mistakes of the past, to a depth that exposes the reasons why the discarded process was the way it was.

So where then is the happy medium? An organisation should allow its processes to be challenged by the users of those processes. It should have, ‘close’ to each process, experts who can carry out technical arbitration when process users want different things from it. And the risk holders must be aware that not changing a process might be more harmful than changing it.


Holidays and Engineers

October 24, 2014

It was holiday season and staff all over the place were going away and coming back a couple of weeks and lots of miles and unfulfilled dreams later. Everyone had forgotten to plan holidays into the schedules, and all sorts of key knowledge was away on the beach getting soaked in bubbly or gin, or high in the mountains getting altitude sickness, or somewhere steamy getting cerebral malaria. Those minions left behind flitted from crisis to crisis, usually head-down in the detail, working through the incremental changes and discussions and events and reasoning that those away were blind to in their holiday-bubble bliss.

So when they come back, feeling like they have lived for years on a different planet, there is always some reconciliation and adjustment.

For returning seniors this is easy. Usually they expect things to continue according to plan, which of course they never have. “Why isn’t it different?!” they roar, and I shamelessly cower shamefully beneath their fury and whine “But sir! Without the tingle tangle of your invigorating lashes, smarmy-smarm, how can we work without you? toady-flinch” and they swear undying punishment and spit fiery acid and depart, mollified by this proof of their motivational powers, to make a new Action Plan and Move the Dates to the Right.

Returning engineers are more tricky. Engineers expect things that ‘belong’ to them – the things they look after, or have created, that are therefore personally theirs – to remain the same. It doesn’t matter if they knew it was a bug-ridden fragile morass of spaghetti code, or if they’d not even finished it, or even if they’d only just started sketching out some ideas. Beware anyone who has touched it, let alone written a replacement! “Why is this different?!” the returning holiday-enhanced minions cry, aghast at the Unexpected Thing that occupies their gaze. “This was working fine!” and “We agreed to do it my way!” they outrage, being free with the meanings of ‘this’, ‘working’, ‘fine’, ‘we’, ‘agreed’ and ‘do it my way’.

The usual response by the bitter, tired and holiday-less minion is personal and tactless: “Your stuff wouldn’t work” to which the outraged minion replies disbelievingly “Oh yes? What was wrong?” followed by a short verbal debug, an outline of why it was stupid to try using it that way, barely veiled assertions on each other’s appalling lack of skills, and further discussion until Huff.

Tact can only add a single step to the start of the sequence, such as the impersonal “We couldn’t get the existing set to work”, to which the returning minion cries “What, my stuff?” and the discussion continues as without tact.

About the only useful mollify is to blame external factors or teams: “We were tasked to provide the outer nodule with more kahoobles so we had to do it this way” at least brings on Straight Huff, possibly with faint “This could have done it” whine, and reduces the shouting.

I’m going on holiday soon. Don’t Touch My Stuff.


“Just Integrated Enough” – Coherence vs Agility

September 27, 2012

Look around any enterprise and we will see duplication, unnecessary redundancy, specialised isolated systems that do similar things to other specialised isolated systems, and gaps between information systems that people have to bridge. “Aha!” we cry, “There is waste here – these things cost us money, and time, and attention. We must Do Something about it.” We set up a central integration committee with the power to set and enforce standards, and so the enterprise’s developers build systems that inter-operate across the enterprise, and so these costs disappear.

Sadly, setting standards is not free. It costs money, but it also costs time, it costs attention, and it costs innovation – just the things we are trying to save. If we are not careful, we can cause more delay, cost and distraction than we save. So when we consider the role of central coordinating boards – because we know we need them, completely ad-hoc chaotic disconnected systems are also harmful – we need to understand these costs so they can be weighed against the costs of the activities we want to reduce. And of course, we must bear in mind that this weighing activity itself costs money, time and attention…

For a central integration board to have the power to force integration it essentially must have the remit to dictate what a project must do; the project is not allowed to complete until it has satisfied the board in some way.

This essentially means the board becomes yet another vetoing ‘touchpoint’ for any proposed change. It means time and effort have to be spent by the board understanding the nature of the integration needs of the new system, and the new system must ‘fit’ into the concepts of the board – which are often far removed from the task at hand. As the board personnel rotate through, certification will pause while the new staff get to grips with the situation. The board not only acts as yet another barrier to change (many enterprises already have ‘too many’ stakeholders with the ability to veto work), it acts as a bottleneck to the changes it finally approves. This doesn’t just take time, it takes brain power that could be used elsewhere.

As an example, consider “bottom up” standards. These are driven by the requirements of the task operators (the people carrying out the work) and their need to communicate. The requirements are collected and collated, and a central standard can then be specified. The system developers can then submit their interfaces to be certified by the central board, and once approved can then implement systems that fulfil those specifications. Even in the best cases, this imposes delays. In the worst cases – where the needs of the task operators change before the previous needs have been fulfilled – it never completes.

Each round-trip generates another standard, all versions of which must be implemented by the appropriate systems to ensure backwards compatibility with those who have not yet changed.
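The backwards-compatibility burden can be sketched as a consumer that must carry a parser for every version of the standard ever issued. The version numbers and field names below are invented for illustration, not taken from any real standard.

```python
# Every version of the standard ever issued must still be readable,
# because some systems have not yet upgraded. All names are hypothetical.

def parse_v1(msg):
    # v1 of the standard had no unit field.
    return {"name": msg["name"], "unit": "unknown"}

def parse_v2(msg):
    # v2 added an explicit unit field after a round-trip to the board.
    return {"name": msg["name"], "unit": msg["unit"]}

# The table only ever grows: old versions cannot be retired while
# anyone still emits them.
PARSERS = {1: parse_v1, 2: parse_v2}

def parse(msg):
    parser = PARSERS.get(msg.get("version"))
    if parser is None:
        raise ValueError("unsupported standard version")
    return parser(msg)
```

Each new round-trip through the board adds another entry to `PARSERS`, and none of the old entries can be removed.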

This all costs money as well as time. The requests and approvals (and refusals, and resubmissions, and…) cost enterprise attention – it distracts the enterprise’s experts from the work they are being employed to do. The extra barrier reduces the willingness of people to experiment, to adapt, to innovate. Except, possibly, in ways to bypass onerous central standards committees.

(“Top down” standards on the other hand are nearly always inappropriate to the needs of the task operators, and system integrators resort to workarounds and misuses of the standards, leading again to extra work and extra delays – and increasing disconnection between the shared, documented interface protocols and the actual ones).

To avoid these problems it can be tempting to be abstract, to be vague, to provide overarching guidelines and approaches. The result of which is a set of compliance requirements that are so abstract that they do not inform or support integration efforts, but still require… compliance.

To provide core interoperability with the freedom to innovate, ‘extensible’ standards can provide the best of both worlds. Sadly, without great care, they can also provide the worst of both: the delays required to approve integration work, and a chaotic mix of incompatible extensions.

When XML first arrived it was hailed as the new interoperable standard that would – at last – mean that anything could talk to anything else. As a format it does indeed get around many of the issues with bespoke binary formats (while introducing some others), but even with a schema it solves only the format problem, not the rather more critical issues of agreed meaning and protocols. Without an agreed meaning of the data and what is required and what is not, a common format is useless. Indeed, format has been the most trivial issue in interfacing – even at low, detailed, levels – for some time. The same applies to higher level integration. The media tends to be secondary to the need to agree meanings and ways to resolve misunderstandings and uncertainties.
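The gap between format and meaning is easy to demonstrate. In the sketch below (element names and values are invented), two well-formed documents parse identically, yet without an out-of-band agreement on units the numbers cannot be compared.

```python
# Two well-formed XML documents in the same format, carrying incompatible
# meanings: one system reports range in metres, the other in nautical
# miles. The parser cannot tell us which is which.
import xml.etree.ElementTree as ET

doc_a = "<contact><range>1852</range></contact>"  # metres, by one convention
doc_b = "<contact><range>1</range></contact>"     # nautical miles, by another

def read_range(xml_text):
    # Parsing succeeds either way; the format problem is solved.
    # The meaning problem - what the number denotes - is untouched.
    return float(ET.fromstring(xml_text).findtext("range"))
```

Both documents describe the same distance, both parse cleanly, and only agreed meaning – not format – lets the two systems interoperate.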

If the tasks and organisation structures remain similar for long periods these issues tend to disappear. For example, the coalition forces in Afghanistan have a highly complex, impressively integrated network of collaborations, as there has been time to crystallise collaboration TTPs around the tasks. Strong standards that have emerged from years of work are welcome in this environment. Strong policing boards that govern such stable systems are not so welcome if they prevent those systems from adapting quickly when the tasks change, as they are likely to between contingency operations.

The Solution

The solution is, of course: ‘it depends’.

The first step is simply to recognise that the costs exist, and therefore to identify and compare them. Costs might be delivery time, money spent up-front, money spent maintaining, effort (manpower) spent maintaining, various qualities of the deliverable (reliability, etc.) and so on. So you might decide that removing redundancy is worth some extra design time, for example (although this is rare: ‘sooner’ usually has much more weight than ‘better’). Or you might decide the opposite, in which case the calculation can be bundled in with the programme oversight so that someone doesn’t come along and say “Oh look, there’s some redundancy [i.e. we are wasting money and time], we should do something [i.e. spend money and time] about that”.

By making these comparisons we can hopefully avoid the ‘flip flop’ between strong integration committees that force everything to a grinding halt, and strongly ‘agile’ approaches that result in unwelcome and (worse) unexpected gaps between important systems at critical times.

Between these extremes of ‘Ultimate Coherence Never’ and ‘Agile Incoherence Now’ should be sweet spots – or at least not too bitter ranges – of arranging for ‘Just Integrated Enough’.

For example, the integration board could be a broker rather than an enforcer: “So you want to connect your 3D terrain data to that 3D terrain data? Well, such-and-such did that; they already have an interface protocol/specification”. Having a proven interface specification can improve the speed and reliability of a new integration, so it is attractive to integrators where it is appropriate – and the people who can judge that are the integrators, not a remote board.

The board could be a ‘goto’ place for a repository or library of existing emerging interface protocols, storing and noting which systems use which protocols and so being able to ‘tweak’ and ‘adjust’ emerging standards rather than dictate them.
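A minimal sketch of the broker-as-registry idea: record which systems use which protocols, and point newcomers at proven ones rather than dictate. The class, protocol names, and systems below are all hypothetical.

```python
# A broker's registry: which systems use which interface protocols,
# so a new integrator can find a proven specification instead of
# inventing one. All names are illustrative.
from collections import defaultdict

class ProtocolRegistry:
    def __init__(self):
        # protocol name -> set of systems known to use it
        self._users = defaultdict(set)

    def register(self, protocol, system):
        self._users[protocol].add(system)

    def who_uses(self, protocol):
        # Knowing the users lets the board 'tweak' an emerging standard
        # in consultation, rather than dictate a new one.
        return sorted(self._users[protocol])

    def suggest(self, keyword):
        # Point a new integrator at existing protocols matching their need.
        return sorted(p for p in self._users if keyword in p)
```

The registry records and suggests; the judgement of whether a suggested protocol actually fits stays with the integrators.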

Ideally, the board should be a place that creates the right incentives across the enterprise so that people sort out the cross communications amongst themselves, in the same way as complex supply chains form and reform along the aligned incentives of money.

Enterprise-wide interoperability requires enterprise-wide governance, but governing bodies should set policy, not attempt to police.

This essay is a result of discussions with several people at the RUSI information management conference in September 2012.


It’s Just A Dependency

July 17, 2011

Peter Rabbit

The rabbits in my garden are eating my carrot shoots, in the best traditions of Peter Rabbit. Unlike Farmer McGregor, my livelihood does not depend on carrots; if it comes to that I don’t particularly like them, but I have a garden and apparently you should grow vegetables in it.

We can say that the rabbits depend on my carrot shoots to live. Yet if I stop growing carrots, the rabbits will not starve: they will eat something else. Probably whatever it is I try to grow instead of the carrots. If that’s not palatable, then they will no doubt eat something else, even if it’s not as tasty. They may even find something else that is more tasty, more nutritious, or more sleeky on the fur.

Carrot shoot

In other words, any observer can see that they depend on the carrot shoots for food for their current lifestyle. But it’s not an obvious step – it requires domain knowledge – to realise that not growing carrots will not cause the rabbits to starve to death (which is different from rabbit starvation). It is a resilient system, at least as far as the rabbits are concerned.

However another view is that any change results in a change of system. If I concrete over the vegetable patch, the rabbits will resort to grazing the lawn, but this is a different system. Even replacing carrots with potatoes will result in slight changes in nutrition which will in turn feed (ha ha) other consequences. In other words, the old system breaks easily at the slightest disturbance and has to be replaced by a new one – it is a fragile system.

These two rather distinct ways of looking at the same system can result in very different assessments. During and after the ‘credit crisis’ the term ‘house of cards’ was frequently used to talk about modern economies, as if the failure of a few companies within a few industries could destroy the whole practice of using tokens to describe the values of things we exchange. There are similar (anecdotal) concerns about ‘oil’, or ‘food’ or ‘our whole way of life’, backed by descriptions of the current set up and how easy it would be to attack a point on it. The ‘delicate balance of nature’ too is cast in the frame of a ‘fragile’ system where any disturbance causes failure; this is awkward as nature’s chaotic, er, nature, means that its various subsystems disturb each other all the time.

When assessing the resilience or fragility of a system we need to assess not the dependencies of the moment – the instances – but the outcomes of change to the various participants. This too is not straightforward as many systems have multiple participants with several interrelated goals, and most users have several available systems to draw on. It’s not enough to just pick on particular harms; these have to be traded off against benefits and other harms, and don’t forget those of not changing the system.

For example, I can fence the vegetable patch which would improve my garden system in my favour, but not the rabbits’. The rabbits can eat something else.

But then, so can I:

Rabbits & Carrot


Obvious deficiencies

June 25, 2011

“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

The first method is far more difficult. It demands the same skill, devotion, insight, and even inspiration as the discovery of the simple physical laws which underlie the complex phenomena of nature.”

– Tony Hoare in his Turing Award speech, 1980


Quality & Peer Review. Again.

March 10, 2011

The House of Commons Science & Technology Select Committee are holding an inquiry into Peer Review.

Like previous investigations, they focus on peer review as a vehicle for quality assurance and scientific discourse, rather than starting with what they want and working backwards. As peer review occurs after the work has been done, it simply cannot be used to assure quality of the work – although it may be used to assure quality of the published paper.

Instead the government could develop the quality controls already being introduced by academic institutions, and use these to assure the quality we would like to see in studies used to inform policy.

Today I submitted a short paper to the inquiry saying this in more detail:

MartinHillForSTC3

(Word doc, about 3 pages in large type, with a few minor language corrections from the submitted document)


An Evidence Based Approach to Scoping Reviews

February 24, 2011

“My” second paper (An Evidence Based Approach to Scoping Reviews, published in the Electronic Journal for Information Systems Evaluation) grew from a requirement I made of a PhD student I was the industrial mentor for, when I still wore a commercial hat. Essentially I wanted a more structured approach to the apparently haphazard review of existing work, so that I (as commercial customer) could be confident about what concepts were already available.

When I left that company I lost track of the work, and only recently reconnected when I started at Cranfield. In the meantime it had lost direction, mostly because trying to inspect how you do something while you do it tends to get in the way of the work (not a new problem).

The paper as initially submitted to ECIME was made of three poorly connected pieces of work, written by a foreign student whose English wasn’t all that good – much sympathy for him. I re-edited it, ran some exercises to fill in some examples, and added some pieces from my initial comments when the work first started, so I got to be the last author.

There’s a lot more that still needs to be done in this field. It’s on my list of things to do…