Challenges in Highly Regulated and Especially Certified Environments

Most things that can easily kill you and others require certification to ensure that applicable risk-reduction measures have been applied and that they’re functional. While all this is pretty trivial in the analog world (open a door -> release a switch -> open switch triggers a relay -> machine stops), the digital world is a little more complex, especially when a program running on an operating system periodically scans an I/O port to check whether it has changed its state. Even when using interrupts, things don’t get much easier. Here are a few things to consider when looking into similar topics.
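
To make the polling case a bit more tangible, here is a minimal Python sketch of the worst-case delay between the door opening and the machine stopping. The function name, the parameters and all numbers are illustrative assumptions, not values from any standard or real system.

```python
def worst_case_reaction_ms(poll_period_ms, jitter_ms, handler_ms, actuator_ms):
    """Upper bound on the time from 'door opens' to 'machine stops'.

    Worst case for a polled input: the door opens right after a scan, so
    the change is only seen one full poll period later, and that scan may
    itself be delayed by scheduling jitter from the operating system.
    """
    return poll_period_ms + jitter_ms + handler_ms + actuator_ms


# Hypothetical numbers: same program, same hardware, different OS jitter.
before_os_change = worst_case_reaction_ms(10, 2, 1, 15)
after_os_change = worst_case_reaction_ms(10, 8, 1, 15)
print(before_os_change, after_os_change)
```

The point of the sketch is that nothing in the safety function itself changed between the two calls, yet the bound moved. That is the kind of effect an assessor has to reason about.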

What is Certification?

Initially, certification is similar to a quality assurance process. Somebody sits down, makes a list of the standards, rules and regulations that should apply to the assessed product and then checks whether they were actually applied. In addition, there’s usually a ginormous risk assessment “proving” that everything is just as safe as it should be. The more complex the system, the more pages of paper there are to write and read, so it can be a lengthy and expensive process. So lengthy that, depending on the field, it’s described in multiple standards. And one of the core aspects: certification is usually performed by a neutral and reliable third party.

What if I Change Something Certified?

Well, the first question usually is whether it was a relevant change. Relevant means the change touches something explicitly covered by the aims of the certification process. So, if the target is ensuring safety, the question will be: Does it have a potential influence on safety? If not, the process might offer a shortcut, i.e. just documenting the changes. If it does have a potential impact, well, time for recertification.
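
As a rough illustration, the decision above could be sketched like this in Python. The `Change` class, the `triage` function and the rule that an unclear impact is treated as relevant are my own assumptions for the example. An actual process is defined by the applicable standard and the certifying body.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    description: str
    # True or False if the impact on the certified aim (here: safety) is
    # known, None if nobody can say for sure.
    may_affect_safety: Optional[bool]


def triage(change: Change) -> str:
    """Return the path a change takes (hypothetical, simplified)."""
    if change.may_affect_safety is False:
        return "document only"
    # Assumption: an unknown impact is handled like a known one.
    return "recertify"


print(triage(Change("fix typo in operator manual", False)))
print(triage(Change("rewrite door switch handler", True)))
print(triage(Change("update underlying operating system", None)))
```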

Relevant Changes?

Some cases are super trivial: changing the function watching the I/O of the button in the initial example is safety relevant. But what if you change the underlying operating system? Is timing still spot on? Did it improve? Did it get worse? Is the worse still acceptable? Talking timing, couldn’t any change on a system / component throw off timing? What if it’s just a communication partner? Can its timing influence the other component?

It’s just as easy to describe every change as potentially relevant as it is to describe it as potentially irrelevant…

Security Life Cycles

When applying Security, one of the main aspects is patching or installing updates. In general, each update is a change and as such triggers the recertification process, starting with the question of whether the change is relevant or not. If it is, we’re talking about doing the same verification as performed initially. Obviously, there are a few shortcuts, but each page will have to be checked, and the risk analysis will have to be both adjusted and verified. For a large system with a single certification, well, this can take a significant amount of time and cost a fair amount of money.

Skipping the recertification process is not an option, as it’s how we make things safe, reliable and acceptable for use. So, talking security life cycle, we simply add an extra step, problem solved.

Timing

One big trigger for Security life cycles is the publication of vulnerabilities, some of them so critical we’d prefer to patch within minutes rather than hours. This is where we hit a wall. Short reaction times and certification processes are at odds. There is no hidden criticism here, it’s just a fact. As such, when calculating potential patch cycles we’re talking about weeks, rather than days… Which in turn can result in significant risks and practical issues when a system is attacked.
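
A small back-of-the-napkin calculation shows how the recertification step dominates the exposure window. This is only a sketch: the function name, the dates and the durations are hypothetical assumptions, not measurements.

```python
from datetime import date, timedelta


def earliest_deployment(disclosed: date, patch_days: int, recert_days: int) -> date:
    """Earliest date a fix can go live if recertification follows patching."""
    return disclosed + timedelta(days=patch_days + recert_days)


# Hypothetical: patch available after 2 days, recertification takes 4 weeks.
disclosed = date(2024, 1, 8)
go_live = earliest_deployment(disclosed, patch_days=2, recert_days=28)
exposure = go_live - disclosed
print(go_live.isoformat(), exposure.days)
```

Even with a patch ready within two days, the system in this example stays exposed for a month.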

So, is This Really a Problem?

No, for most systems out there the considerations in this post are fully irrelevant. Why? Most critical and safety relevant systems are fully air gapped, and as such the probability of an attack is close to zero. So, while they should still be patched regularly, simply to be on the safe side and protect against attacks which could be performed via portable memory devices or service notebooks, stretching the “regular” can usually be accepted from a risk perspective.

On the other hand, should you have remote access or telemetry interfaces, well, you might actually have an issue.

What Options Exist?

Tough one! The smaller the certified chunks are, the quicker changes should be able to pass recertification. This would in turn require defining interfaces between the modules to ensure that each one is surrounded by specific requirements. What might sound like a no-brainer might very well not exist, as skipping such interfaces might be a valid shortcut when certifying the “complete thing”.

In addition, the initial certification costs might be significantly higher, as it’s not a single something that has to be certified, but multiple modules. That said, in theory each module would only have to be certified once and would be compatible with all surrounding modules that implement the defined interfaces. Which sounds a bit like what we do when creating software or IT systems and components.
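
To illustrate the trade-off, here is a minimal Python sketch comparing how many pages of documentation need re-checking after a change. It assumes (my assumption, not a rule from any standard) that effort scales with page count and that the change leaves the module’s defined interface untouched. Module names and page counts are made up.

```python
def recert_pages(modules: dict, changed: set, modular: bool) -> int:
    """Pages to re-check after a change (hypothetical cost model).

    modules maps a module name to the page count of its certification
    documents; changed is the set of modules touched by the update.
    """
    unknown = changed - set(modules)
    if unknown:
        raise ValueError(f"unknown modules: {sorted(unknown)}")
    if not changed:
        return 0
    if modular:
        # Only the touched modules go back through the process.
        return sum(modules[name] for name in changed)
    # Single certification: the whole document set is in scope.
    return sum(modules.values())


system = {"io_watchdog": 40, "os_layer": 300, "telemetry": 120}
print(recert_pages(system, {"telemetry"}, modular=True))
print(recert_pages(system, {"telemetry"}, modular=False))
```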

The biggest challenge here is the fact that many of the certification processes and approaches are not only covered by standards, which take 2-5 years to change, but also by national or international law, which, well, let’s not even make estimations here.

The Market will Solve it!

Well, why should it? If a company creating certified and regulated products helped change these regulations, it might at the same time help open the market to others. Especially as modularization implies that companies don’t have to be able to provide the overall systems but might just start selling specific modules.

Why would a company enjoying limited competition on a market “closed” by strict regulations risk changing anything here?

And now?

For me this is one of the big challenges where I’m able to say “hey, go start changing standards, start adjusting regulations and prepare to amend laws” just before hitting a mental blue screen as I realize the complexity. While the high-level strategy fits onto a napkin, the detailed plan is probably enough for writing a few books.

I guess if you’re part of a big company relying on formally certified products, it’s probably up to you to start making a plan and moving in a direction that will allow you to apply proper Security to your certified components and systems. If you don’t do it yourself, well, why should anybody else?