Archive for the ‘Engineering’ Category


“Just Integrated Enough” – Coherence vs Agility

September 27, 2012

Look around any enterprise and you will see duplication, unnecessary redundancy, specialised isolated systems that do similar things to other specialised isolated systems, and gaps between information systems that people have to bridge. “Aha!” we cry. “There is waste here – these things cost us money, and time, and attention. We must Do Something about it.” We set up a central integration committee with the power to set and enforce standards, the enterprise’s developers build systems that inter-operate across the enterprise, and so these costs disappear.

Sadly, setting standards is not free. It costs money, but it also costs time, it costs attention, and it costs innovation – just the things we are trying to save. If we are not careful, we can cause more delay, cost and distraction than we save. So when we consider the role of central coordinating boards – because we know we need them, completely ad-hoc chaotic disconnected systems are also harmful – we need to understand these costs so they can be weighed against the costs of the activities we want to reduce. And of course, we must bear in mind that this weighing activity itself costs money, time and attention…

For a central integration board to have the power to force integration it essentially must have the remit to dictate what a project must do; the project is not allowed to complete until it has satisfied the board in some way.

This essentially means the board becomes yet another vetoing ‘touchpoint’ for any proposed change. It means time and effort have to be spent by the board understanding the nature of the integration needs of the new system, and the new system must ‘fit’ into the concepts of the board – which are often far removed from the task at hand. As board personnel rotate through, certification will pause while the new staff get to grips with the situation. The board not only acts as yet another barrier to change (many enterprises already have ‘too many’ stakeholders with the ability to veto work), it acts as a bottleneck to the changes it finally approves. This doesn’t just take time, it takes brain power that could be used elsewhere.

As an example, consider “Bottom up” standards. These are driven by the requirements of the task operators (the people carrying out the work) and their need to communicate. The requirements are collected and collated, and a central standard can then be specified. The system developers can then submit their interfaces to be certified by the central board, and once approved can implement systems that fulfil those specifications. Even in the best cases, this imposes delays. In the worst cases – where the needs of the task operators change before the previous needs have been fulfilled – it never completes.

Each round-trip generates another standard, all versions of which must be implemented by the appropriate systems to ensure backwards compatibility with those who have not yet changed.
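The cost of carrying every previous version can be sketched in code. This is a hypothetical example, not any real standard: assume each message carries a version field, and note how the receiving system must accumulate one handler per standards round-trip, since some peers will always be on an older version.

```python
# Sketch of the backwards-compatibility burden: every published version of a
# standard stays live until all peers migrate, so the receiver grows a parser
# per version. All names here (Position, parse_v1, ...) are illustrative.

from dataclasses import dataclass

@dataclass
class Position:
    lat: float
    lon: float

def parse_v1(msg: dict) -> Position:
    # Version 1 of the (imaginary) standard sent one comma-separated string.
    lat, lon = msg["pos"].split(",")
    return Position(float(lat), float(lon))

def parse_v2(msg: dict) -> Position:
    # Version 2 split the fields out -- but v1 senders still exist.
    return Position(msg["lat"], msg["lon"])

# This table grows with every standards round-trip; nothing can be retired
# until the last system using it has been updated.
PARSERS = {1: parse_v1, 2: parse_v2}

def parse(msg: dict) -> Position:
    return PARSERS[msg["version"]](msg)
```

Each entry in that table is code that must be written, tested and maintained for as long as any one peer lags behind – which is exactly the money, time and attention cost described above.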

This all costs money as well as time. The requests and approvals (and refusals, and resubmissions, and…) cost enterprise attention – it distracts the enterprise’s experts from the work they are being employed to do. The extra barrier reduces the willingness of people to experiment, to adapt, to innovate. Except, possibly, in ways to bypass onerous central standards committees.

(“Top down” standards on the other hand are nearly always inappropriate to the needs of the task operators, and system integrators resort to workarounds and misuses of the standards, leading again to extra work and extra delays – and increasing disconnection between the shared, documented interface protocols and the actual ones).

To avoid these problems it can be tempting to be abstract, to be vague, to provide overarching guidelines and approaches. The result of which is a set of compliance requirements that are so abstract that they do not inform or support integration efforts, but still require… compliance.

To provide core interoperability with the freedom to innovate, ‘extensible’ standards can provide the best of both worlds. Sadly, without great care, they can also provide the worst of both: the delays required to approve integration work, and a chaotic mix of incompatible extensions.

When XML first arrived it was hailed as the new interoperable standard that would – at last – mean that anything could talk to anything else. As a format it does indeed get around many of the issues with bespoke binary formats (while introducing some others), but even with a schema it solves only the format problem, not the rather more critical issues of agreed meaning and protocols. Without an agreed meaning of the data and what is required and what is not, a common format is useless. Indeed, format has been the most trivial issue in interfacing – even at low, detailed, levels – for some time. The same applies to higher level integration. The medium tends to be secondary to the need to agree meanings and ways to resolve misunderstandings and uncertainties.
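A minimal sketch of the point, using two invented messages (no real system’s schema is being quoted): both are well-formed XML, both would validate against the same schema, and a parser extracts the same number from each – yet if one sender means kilometres and the other nautical miles, the shared format has agreed nothing that matters.

```python
# Format agreement without meaning agreement: both messages parse cleanly,
# and the schema-level 'range' field is identical, but the senders intend
# different units. The messages and units are hypothetical.

import xml.etree.ElementTree as ET

msg_a = "<target><id>T1</id><range>12</range></target>"  # sender A means km
msg_b = "<target><id>T1</id><range>12</range></target>"  # sender B means nmi

range_a = float(ET.fromstring(msg_a).find("range").text)
range_b = float(ET.fromstring(msg_b).find("range").text)

# The format problem is 'solved': both parse to the same value...
assert range_a == range_b == 12.0
# ...while the distances the senders actually mean differ by a factor
# of 1.852 -- a disagreement no amount of XML tooling will surface.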

If the tasks and organisation structures remain similar for long periods these issues tend to disappear. For example, the coalition forces in Afghanistan have a highly complex, impressively integrated network of collaborations, as there has been time for collaboration TTPs (tactics, techniques and procedures) to crystallise around the tasks. Strong standards that have emerged from years of work are welcome in this environment. Strong policing boards that govern such stable systems are not so welcome if they prevent those systems from adapting quickly when the tasks change, as they are likely to between contingency operations.

The Solution
The solution is, of course: ‘it depends’.

The first step is simply to recognise that the costs exist, and therefore to identify them and compare them. Costs might be delivery time, money spent up-front, money spent maintaining, effort (manpower) spent maintaining, various qualities of the deliverable (reliability, etc) and so on. So you might decide that removing redundancy is worth some extra design time, for example (although this is rare: “Sooner” usually has much more weight than “Better”). Or you might decide the opposite, in which case the calculation can be bundled in with the programme oversight so that someone doesn’t come along and go “Oh look, there’s some redundancy [i.e. we are wasting money and time], we should do something [i.e. spend money and time] about that”.

By making these comparisons we can hopefully avoid the ‘flip flop’ between strong integration committees that force everything to a grinding halt, and strongly ‘agile’ approaches that result in unwelcome and (worse) unexpected gaps between important systems at critical times.

Between these extremes of ‘Ultimate Coherence Never’ and ‘Agile Incoherence Now’ should be sweet spots – or at least not too bitter ranges – of arranging for ‘Just Integrated Enough’.

For example, the integration board could be a broker rather than an enforcer: “So you want to connect your 3D terrain data to that 3D terrain data? Well, such-and-such did that; they already have an interface protocol/specification”. Having a proven interface specification can improve the speed and reliability of a new integration, so it is attractive to integrators if it is appropriate – and the people who can judge that are the integrators, not a remote board.

The board could be a ‘go-to’ place for a repository or library of existing and emerging interface protocols, storing and noting which systems use which protocols, and so being able to ‘tweak’ and ‘adjust’ emerging standards rather than dictate them.
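The broker/repository role is essentially a registry, and can be sketched in a few lines. This is a toy illustration, with invented protocol and system names, of what the board would record and answer:

```python
# Toy sketch of a broker-style protocol registry: the board records which
# systems use which interface protocol, so a new integrator can find a
# proven one rather than invent another. All names are hypothetical.

from collections import defaultdict

class ProtocolRegistry:
    def __init__(self):
        # protocol name -> set of systems known to use it
        self._users = defaultdict(set)

    def register(self, protocol: str, system: str) -> None:
        """Note that `system` implements `protocol`."""
        self._users[protocol].add(system)

    def who_uses(self, protocol: str) -> set:
        """The systems a new integrator could interoperate with."""
        return set(self._users[protocol])

    def suggest(self, keyword: str) -> list:
        """Protocols matching the keyword, most widely adopted first."""
        hits = [p for p in self._users if keyword.lower() in p.lower()]
        return sorted(hits, key=lambda p: -len(self._users[p]))
```

Asking `suggest("terrain")` is the broker conversation from the previous paragraph: nothing is enforced, but the proven, widely-used option surfaces first, and adoption figures give the board a light-touch way to steer emerging standards.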

Ideally, the board should be a place that creates the right incentives across the enterprise so that people sort out the cross communications amongst themselves, in the same way as complex supply chains form and reform along the aligned incentives of money.

Enterprise-wide interoperability requires enterprise-wide governance, but governing bodies should set policy, not attempt to police.

This essay is a result of discussions with several people at the RUSI information management conference in September 2012.


It’s Just A Dependency

July 17, 2011

The rabbits in my garden are eating my carrot shoots, in the best traditions of Peter Rabbit. Unlike Farmer McGregor, my livelihood does not depend on carrots; if it comes to that, I don’t particularly like them, but I have a garden and apparently you should grow vegetables in it.

We can say that the rabbits depend on my carrot shoots to live. Yet if I stop growing carrots, the rabbits will not starve: they will eat something else. Probably whatever it is I try to grow instead of the carrots. If that’s not palatable, then they will no doubt eat something else, even if it’s not as tasty. They may even find something else that is more tasty, more nutritious, or more sleeky on the fur.

Carrot shoot

In other words, any observer can see that they depend on the carrot shoots for food for their current lifestyle. But it’s not an obvious step – it requires domain knowledge – to realise that not growing carrots will not cause the rabbits to starve to death (which is different from rabbit starvation). It is a resilient system, at least as far as the rabbits are concerned.

However another view is that any change results in a change of system. If I concrete over the vegetable patch, the rabbits will resort to grazing the lawn, but this is a different system. Even replacing carrots with potatoes will result in slight changes in nutrition which will in turn feed (ha ha) other consequences. In other words, the old system breaks easily at the slightest disturbance and has to be replaced by a new one – it is a fragile system.

These two rather distinct ways of looking at the same system can result in very different assessments. During and after the ‘credit crisis’ the term ‘house of cards’ was frequently used to talk about modern economies, as if the failure of a few companies within a few industries could destroy the whole practice of using tokens to describe the values of things we exchange. There are similar (anecdotal) concerns about ‘oil’, or ‘food’ or ‘our whole way of life’, backed by descriptions of the current set up and how easy it would be to attack a point on it. The ‘delicate balance of nature’ too is cast in the frame of a ‘fragile’ system where any disturbance causes failure; this is awkward as nature’s chaotic, er, nature, means that its various subsystems disturb each other all the time.

When assessing the resilience or fragility of a system we need to assess not the dependencies of the moment – the instances – but the outcomes of change to the various participants. This too is not straightforward as many systems have multiple participants with several interrelated goals, and most users have several available systems to draw on. It’s not enough to just pick on particular harms; these have to be traded off against benefits and other harms, and don’t forget those of not changing the system.

For example, I can fence the vegetable patch which would improve my garden system in my favour, but not the rabbits’. The rabbits can eat something else.

But then, so can I:

Rabbits & Carrot


When are Systems Of Systems not Systems?

November 26, 2010

Only the most trivial of systems are not composed of other systems, yet the term ‘System of Systems’ is used as if describing something distinct. So what is it? What’s the difference between a ‘system of systems’ and a ‘system of … things that aren’t systems’?

Is it a bigger thing?

For example, this paper (Net-Centric, Enterprise-Wide System-of-Systems Engineering And The Global Information Grid PDF) argues that systems-of-systems are not just a scaling up of systems-of-components but are distinguishable as follows:

Yet plainly many of these are simply differences of scale:

Local vs global is a simple geographic scaling, and the distinction depends on how you define ‘global’ anyway.

Similarly, lifespan extents are in practice in the eye of the beholder. Complex systems of systems such as human beings have lifespans of decades, yet systems of humans such as enterprises typically have similar lifespans.

Similarly (not) understanding information flows is a feature of the engineer not the component; a transport company is a system of components that include vehicles. Vehicles are systems of components that include engine management systems, that in turn include microchip information exchanges that are often not very well understood at all when operating in the real world, and so on. Understanding of the information exchanges varies from engineer to engineer and community to community.

The required functions too change; even if a car has been ‘optimised’ for a certain set of requirements, the uses that the owner might want to put it to changes from journey to journey and during the lifetime of ownership as the owner’s lifestyle changes.

And so on.

Is it a new thing?

This paper (A New Accident Model for Engineering Safer Systems PDF), was included as a discussion paper at a Systems of Systems Architecture (Safety) group and claims that we are dealing with new and more complicated systems as technology enables more complex systems.

Yet biological systems are some of the most complex systems that we encounter, and the primitive farmer has had to run systems of these components as a matter of course. The horse pulling a plough, for example, has to be managed as a system, and yet is an essential component of some subsistence agricultural farms.

Reverse the polarity…

A system of components is supposedly ‘well understood’ and so there is a top-down view of how the components interrelate and the components are seen as discrete black boxes. It is easy to diagram and describe.

With systems of systems these components are opened up and the interrelationships are less well understood; a kind of ‘inside out’ view, where we sit within this large surrounding system, looking around at a complexity we can’t comprehend.

These are descriptions of viewpoints and the engineers’ understanding though, not descriptions of the systems themselves. As long as the terms are used as a way to categorise viewpoints then this is alright, but unfortunately the terms seem to be used to describe a ‘new’ problem, and so therefore we need ‘new’ ways of approaching it, thus discarding much of what we have learned about systems engineering.

It’s a learning thing

It may be that this is simply part of the way that we preserve corporate or community knowledge. Because expertise is hard to pass on, there is a tendency as new blood arrives to generate ‘new paradigms’ that are a small iterative improvement (hopefully) on the previous paradigm. People are essentially re-learning many existing concepts under the guise of exciting shiny new terms that provide the motivation. More later…

It’s just that I don’t understand

As long as we don’t lose sight of the fact that systems of systems are ‘just’ systems, the term can be used to indicate the engineers’ perhaps quite legitimate incomprehension of the complexity of the system under discussion.


Life & Death Decisions using Sparse, Unreliable Evidence

August 23, 2010

Wahey, my first ‘paper’ (here, PDF) has been reviewed and accepted for the European Conference on Information Management and Exploitation in early September.  It looks rather lonely on its own on my shiny new publications list page, but the title makes up for it.

It became a bit of a brain dump for everything I could think of that was relevant, so it’s a broad sweeping outline with plenty of pieces to pick up and work on in more detail.

Many thanks to John Salt for helping to write it, my parents for their initial review, and friends especially Tom Targett and Eric Titley at the ROE for proof reading the later versions.


Incompetent Systems

December 7, 2009

A few years ago I worked on an excellent research project called AstroGrid. Nearly twenty commercial software engineers were to build a distributed astronomy data analysis toolkit to support astronomers, as part of an international effort.

Some astronomers said they had seen it all before and remained sceptical, though their enthusiasm to help was not apparently diminished. I pooh-poohed them; I’d come from developing satellite control centre software. I knew how to deliver software that worked.

We had a bunch of bright engineers, who worked hard and produced some pretty good stuff – sometimes very good stuff.  In my case, obviously, it was amazing stuff.

And after two years of quarterly iterative releases we’d still delivered no usable product.  There were a few applications deployed and used here and there, and some proper new science forced through as example test cases, but nothing that astronomers couldn’t have knocked up themselves in a few days of scripting. Sometimes they already had.

Lessons Learned

Forty man-years largely wasted and the project continued in the same vein – what was wrong?

There are lots of technical reasons: poor and shifting requirements, contradictory overall objectives, very little actual commercial experience in the team, unsuitable release procedures and version control, immature support tools, and so on.

But really these are all messing about in the weeds, looking for specific problems and specific someones to blame. Why did these technical problems exist, and why weren’t they resolved?  How did they persist – for so long?  More importantly, given there’s nothing new about these problems, why did they exist in this particular project and its organisation in the first place?

Importantly, I can’t think of any of the staff who were incompetent, and that includes the project manager and chief tech, which are roles that are sometimes (and sometimes should be) held responsible for project process. In this case, though, while I disagreed with some of the activities (particularly the release process) they were hard working and experienced, and yet still we produced nothing. For years.

Competent People working in Incompetent Systems

Imagine a new coal mine owner, who pops down the local and employs a bunch of brawny lads to mine the coal. He pays them for the amount of coal they hack off the coal face, leaves a foreman in charge to handle pick axe handle repair and pay and so on, and settles down in the now peaceful pub for a quiet pint, and waits for the money to roll in.

Within a few days the miners have pushed a little coal out of the mine to make room to get to the coal face and swing a pick, but little else.  Even if we assume they collaborate with each other (people are sociable and to some extent self-organise) to avoid bringing the roof down, there’s no incentive to get coal out and sold, just to get it far enough out to make room to hack more off the wall.  The targets, the incentives, the organisation, are all useless for the owner or any of the cold shivering pensioners waiting for the three lumps they can afford.

More importantly (because we often do things wrong, it’s nothing to be overly ashamed about), there is no local remit to make the changes required to make it useful. The foreman does not have the budget, the incentive or the executive power to change the organisation or targets to make the mine productive.

Someone somewhere can generally be found to make suitable changes. In this case, someone could pop down the pub and find the owner and tell him what’s up. But why bother? Time away from the coal face is time not earning. And who knows what the owner is like, maybe you’ll get fired for disturbing him. Again, there are barriers to improvement.

It’s an incompetent system, staffed by competent people.


What a lovely jargon word. But quite appropriate: who really is supposed to look after projects to make sure they get the support and training and the right staff at the right point, the right checkpoints and feedbacks and incentives?

In the commercial world near the marketplace these are normally fairly straightforward: money is the incentive. In order to make money, you have to provide someone with something they are willing to pay for. The focus is on that delivery of usefulness. So there are ‘automatic’ readjustments that come from that focus; if the miners were being paid for coal sold rather than hacked off a coalface, they would likely organise themselves into some sort of suitable structure and work process to deliver that coal.

As we step further away from the marketplace – to R&D projects in the commercial world, or academic research – getting these incentives right is trickier. Long term benefits of blue sky research are not only hard to define, but if you don’t have a good handle on some kind of target then you can’t differentiate between lack of delivery because we haven’t worked enough yet, and lack of delivery because the system is squashing any progress.

The Tools

Here I assume that pick axes (or more advanced machinery) are available. That people share a language (which isn’t always the case) and so self-organising is feasible. That currencies are in use, and so on.

When the tools are not available, the system isn’t the failure point. For example, evidence-based medicine has spread widely only relatively recently; it was hard to build systems for Good Medical Treatments without it.

Competent Systems

Competent systems don’t always succeed. Competent systems encourage success. Importantly, they have the feedback mechanisms that drive change from failures (and sometimes success) in order to direct effort to more success, rather than let that effort dissipate uselessly, or even have it directed unintentionally harmfully, as Incompetent Systems do.


But but but – that was my idea

October 13, 2009

Long ago, when I was young and some chap was building a big wooden ship to help cope with catastrophic climate change, I had two excellently brilliant ideas that would have made me millions if only I could persuade people to invest in them: Inflatable Space Stations, and Unpressurised Space Suits. For some reason this persuasion was quite tricky.

Inflatable Space Stations

The inflatable space stations would be made from double-layer sausage shaped cloth capsules, inflated by injecting some kind of expanding foam (‘no more big gaps’) into the double layer.

This would save cargo space in the launch vehicle (and so weight, in the fairings), and by producing many of these with standard bayonet fittings (or similar) at each end and at standard intervals around the side, you get economies of scale while building any configuration of space station you like.

Those bayonet fittings would include standard cabling and, say, a small processing and hub node.

The set foam would provide some protection against dust and small impacts. It would certainly insulate against heat loss and/or gain.

With the right expanding goo it might even help protect against radiation, though it’s worth bearing in mind that radiation shielding is mostly a matter of mass, and so we’re not likely to get overall weight savings for some spaces. It may well be that normal working & living spaces remain metal or similar, and the inflatable sausages are used for more occasional use or storage spaces, or as a framework to bolt shields on to.

Similar techniques could be used for lunar or asteroid pimple habitats, which once set could be reinforced with some kind of cement made from the local rock.

The big advantage is size; large spaces can be created using this approach that can’t practically be launched.

And since then NASA have been looking at some inflatable habitats and a private company has even launched two scale models.

They must have heard.

Unpressurised Space Suits

The unpressurised space suits were based around the idea that all we need is some pressure on the skin to act a bit like the atmosphere, rather than actual gas pressure. A kind of stretched compression body bandage would do; around the lungs particularly, so that the occupant can breathe out.

This would let us reduce the bulkiness of existing suits, which is largely caused by the special joints required to keep internal volumes constant so that the joints can bend.

It would also be more robust, as a puncture would result in only local bruising, rather than complete decompression, asphyxiation and death – or heavier and more complicated systems for partitioning a suit.

There are still problems with heating and cooling, but a metal mesh or a system of tubes embedded in the material can transfer heat around.

Hygiene is a problem anyway, and the suit needs either a washable lining, or presumably you have a couple of suits and hang one, inside out, on the washing line outside the space station and let everything boil off into the vacuum.

And it seems folks are building just such suits:


Which is fine; they must have overheard me in the pub all those years ago, and thought “What a good idea”. I can continue to have Great Ideas without having to go through all that trouble of working out how – or if – they can work.

You can thank me later.