
A Response to ‘Towards an Understanding of Technical Debt’

I recently read this blog post on tech debt: https://kellanem.com/notes/towards-an-understanding-of-technical-debt. I had a strong reaction to it, both positive and negative.

The author argues that tech debt ‘doesn’t exist’, or rarely exists, and that the term is overloaded. I agree the term is heavily loaded, but not necessarily overloaded. His breakdown is insightful: he points out that lumping the different categories together obscures their source, and I agree that when people lump them into one term they lose meaning and focus. Still, there is a fairly good reason people collect all the categories he lists (and I could name many more) under one term and give it focus: none of those things are considered ‘critical path’ to getting a product out the door on time. Yet products are expected over time to ‘get better’, ‘get cheaper’, and ‘operate properly’, and decisions made in service of that critical-path goal can impede those other goals, and eventually even impede getting the product out the door on time at some future date.

The tech debt definition he starts with is:

“Technical debt is the choices we made in our code, intentionally, to speed up development today, knowing we’d have to change them later.”

However, I will argue that not all tech debt is intentional. I offer this instead:

“Choices made to prioritize some goals (e.g., speed to market) over others within a cloud of interdependent product characteristics (quality, maintainability, operability, functionality) introduce risk. Some of those risks produce secondary negative impacts, costs, and unintended consequences that jeopardize the viability of the product’s characteristics in the future. Over time some risks compound across iterations, while others remain steady or fade away completely. Tech debt is the set of items carrying risk to the characteristics the product needs in order to remain viable over long time frames.”

It’s really simpler to call it Tech Risk.

Let’s take a simple example – a car door handle. You have to have one (critical path), but other aspects may appear variable (I expect there are regulations I’m overlooking here, but bear with me). Perhaps the quality can be lower because cost is an issue. A decision is made, and let’s say it was the best one available at the time, to choose handle X.

Handle X is provided by producer X, and perhaps it is the bleeding-edge design of its time – cool stuff. The decision was made knowing there wasn’t much information on how the design would perform over time, but it would put the car first to market. This decision involves risk, just as our blogger said: “All code is liability.”

Perhaps handle X, however, turns out not to last as long as hoped, causing repair costs within the car’s warranty period, though not so many that the company can’t absorb the liability and cost. They hurry to find a handle X.1, but for whatever reason that now blows the cost profile. And handle X was so cool that its fitting in the door (its API requirements) means no other, less cool but sturdier, model can be substituted.

As the design team wants to focus on the newest version of the car, they find they can’t easily change the fitting hole or the handle. It’s tech debt: a decision made in the past, prioritizing ‘coolness’, that now makes future change difficult. How is the manufacturing world so different from the software world that tech debt is only applied to the latter?

Probably because it’s easier to fix a debt in manufacturing, where the underlying structure supporting the design has fewer layers of infrastructural dependencies and more flexibility.

So was the decision to use handle X bad? Not necessarily. If it made the car desirable, and many were purchased, making a redesign affordable, then the risk and the penalty paid off.

Now let’s say instead that they had all the data about the handle’s ruggedness and it indicated there were no issues. And later, rather than a weakness, it turned out that car thieves found a way to pop the door open through the handle, bypassing the lock. Was that intentional? No. Is it still tech debt? Will it still be hard to change the handle now that a critical flaw has been revealed? Yes to both.

I know of no software project that can prevent this. By its nature it isn’t something that yields to identification and resolution up front, and it isn’t necessarily predictable. Where does it enter the value stream? Anywhere.

Some simple areas to consider:

  1. Design choices – you have to choose a language, libraries, and frameworks. There may be risks in those that you don’t know or can’t predict, in addition to the ones you do know.
  2. Prioritization choices – feature priorities, functional vs. non-functional priorities.
  3. Scoping choices – what’s in or out of the minimum viable product and thereafter.
  4. Resource capabilities – the people you hire or work with and their knowledge and expertise.
  5. Time to market – market windows are not open forever; the product has to ship while demand exists. This pushes staff to make scoping and design choices they wouldn’t have made given more time.
  6. Unknown underlying flaws – in libraries, in the systems an application is built on, anywhere in the tech stack really.

So now what?

The first issue is to know what you’ve got, and then to learn which items pose the highest risk to the organization. It helps if everyone understands the decisions that were made, why they were made, and what known risks each decision imposed.

Very few companies track the history of their decisions or maintain knowledge of their known tech risks. Very few companies train their staff to understand what tech risk is, to look out for it when decisions are made, and to raise it for tracking and mitigation as a normal part of doing business.

Very smart companies will do their best to:

  • Keep a history of decisions and track the risks they take on knowingly.
  • Track risks that ‘pop up’ that weren’t known at the beginning. The goal is to know the situation.
  • Prioritize all the risks. The goal is to figure out which ones are most likely to cause the organization problems so they can be addressed.
  • Minimize tech risks (note I didn’t say tech debt) within the context of all the other risks the organization has to manage: financial, market, staffing, strategic, compliance, security, operational, and reputational. Hopefully, the organization will provide ongoing time for mitigating known accumulated tech risks (tech debt). A rough sketch of what this kind of tracking could look like follows this list.
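
The sketch below is purely illustrative, not something from the original post or any particular tool: the class names, fields, and the simple likelihood-times-impact scoring are all assumptions. It only shows the shape of a decision-and-risk log a team might keep, expressed in Python.

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Decision:
    """A recorded decision: what was chosen, why, and the risks knowingly accepted."""
    made_on: date
    summary: str
    rationale: str
    known_risks: list = field(default_factory=list)  # ids of risks accepted at decision time


@dataclass
class TechRisk:
    """One tech risk, whether accepted knowingly or discovered later."""
    risk_id: str
    description: str
    likelihood: int                 # 1 (rare) to 5 (almost certain), illustrative scale
    impact: int                     # 1 (minor) to 5 (threatens product viability)
    discovered_later: bool = False  # True for risks that 'popped up' after the decision

    @property
    def score(self) -> int:
        # Simple likelihood x impact score; real prioritization would also weigh this
        # against financial, market, compliance, and other organizational risks.
        return self.likelihood * self.impact


def prioritize(risks):
    """Order risks so the ones most likely to hurt the organization come first."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# The 'handle X' story expressed as one decision plus its risks.
handle_choice = Decision(
    made_on=date(2019, 9, 1),
    summary="Use bleeding-edge handle X",
    rationale="First to market with a desirable design",
    known_risks=["RISK-1"],
)

risks = [
    TechRisk("RISK-1", "Little data on long-term durability of handle X",
             likelihood=3, impact=3),
    TechRisk("RISK-2", "Door fitting only accepts handle X; no substitute parts",
             likelihood=2, impact=4, discovered_later=True),
]

for risk in prioritize(risks):
    print(risk.risk_id, risk.score, risk.description)

The point isn’t the tooling; a spreadsheet serves just as well. The point is that decisions, the risks knowingly accepted with them, and the risks discovered later all live somewhere the whole organization can see and prioritize.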

What can you do? Be aware. Raise the risks. If you are in management, track your risks and track the decisions that led to them. Make sure your risks are understood by the organization at large so they can be weighed alongside everything else. Work with your organization to spend at least small, consistent slices of time mitigating your risks, whatever they may be.

You can’t get rid of all tech debt, but you can be aware, track the risks you know about, and keep the product in balance so that technical risks don’t become a significant liability to its ability to change for the better in the future.

 

About Sarah Baker

Professional technical and people manager.
