On costs, graphs and integrals


Lately, and by mere coincidence, I was discussing programming language choices: when, or why, should you choose language X over Y? And apart from the usual suspects like

  • “familiarity in the team”
  • “interoperability with other pieces in the ecosystem” and
  • “existing dependencies that we can leverage”

the idea of how to deal with complexity came up.

In this blog post I would like to explore an idea that I can see so clearly with graphs in my mind, but that takes more than a quick chat to get across. So hopefully, next time I’m in a position to talk about languages or any other technology to tackle a problem, I can point to the random graphs¹ in this article and it will make more sense to my interlocutor.

The graphs here were made with jazcarate :: Rough Graph.

Limits

So, let’s start by setting the groundwork for the next graphs. Some axioms we’ll assume true for the sake of discussion:

Complexity

There is an upper and a lower bound on the complexity of any given system.

The lower limit should be fairly easy to accept: there is such a thing as a trivial program². The upper bound, on the other hand, is a bit more contentious. One can think of this limit as either:

  • The allowed limit of the universe. A sub-system can never be more complex than the system in which it is embedded.
  • Or, if simulation theory is too scary, the maximum complexity the given team is capable of tackling.

Either way, complexity is bounded for the sake of this thought experiment.

Costs

There are many ways to measure costs. In this post, let’s not dwell much on which unit of measurement we choose. In other words, pick whichever is most relatable to you, and I think that my explorations will still make sense:

  • Lines of code
  • Number of entities in the ER model
  • Human hours
  • Money spent
  • Number of sticky notes in the “😟” column in the retrospective
  • other

But any of these will be, like complexity, bounded at both ends. These two parameters give us a neat 2-dimensional space in which to explore the cost/complexity of things. So, without further ado, I present you with the first graph.

Types

Absurd scenario

No matter the complexity, the cost is constant.

[Graph: Absurd scenario]

With much grief, I’ll tell you that there exists no technology that can deliver on this. Complexity will always carry a cost with it. There might be technologies that promise this kind of behavior; technologies that deliver it don’t exist³.

Ideal scenario

So a better notion is that there is a lower bound on how much cost complexity incurs. I think it would not be unreasonable to propose that the marginal cost is proportional to the change in complexity.

With this mindset, we can establish a “minimum” graph. Technologies might promise better ratios, but we should mistrust them.

[Graph: Ideal scenario]

It is worth mentioning that in all these graphs, the cost at a given complexity on the horizontal axis is but the accumulation of the costs needed to reach that complexity level, as to reach a complexity level of \(n\), one must first cover \(n-\epsilon\) (for every \(\epsilon \in \mathbb{R}_{>0}\)). If we call \(f\) the function that we are plotting, then there exists a marginal cost \(\Delta f \colon \text{Complexity} \to \text{Cost}\) such that \(f(complexity) = \int_{0}^{complexity} \Delta f(x) \,dx\). A useful corollary of this fact, since \(\Delta f\) is never negative, is that our graphs are monotonically non-decreasing. And those are fun! 😺.
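To make the integral concrete, here is a minimal Haskell sketch. Everything in it is invented for illustration: a made-up marginal cost \(\Delta f\), and a left Riemann sum standing in for the integral.

```haskell
type Complexity = Double
type Cost       = Double

-- A hypothetical marginal cost: non-negative, and never below the ideal slope.
deltaF :: Complexity -> Cost
deltaF x = 1 + 0.1 * x

-- f(c) = ∫₀ᶜ Δf(x) dx, approximated with a left Riemann sum of step dx.
f :: Complexity -> Cost
f c = sum [deltaF x * dx | x <- [0, dx .. c - dx]]
  where dx = 0.01
```

Because deltaF never returns a negative cost, f can only grow as complexity grows, which is exactly the monotonicity claimed above.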

Now that we have established common ground to talk about complexity and cost, let’s dive into three distinct categories of technologies: Pay as you go, Debt, and Upfront.

Pay as you go

Maybe the most alluring. The more complex a problem, the more it costs to solve. It is a sensible way of understanding technology. The many flavors of pay as you go differ from one another by their slope, or their initial cost. c-like languages fall squarely here.

[Graph: Pay as you go]

Technologies here could be plotted with slightly different slopes or initial heights. I don’t much care about the details of comparing each of these technologies 😅. They are great all-around options. Jack of all trades, master of none. I see these technologies as the backbone of the software industry. Some have a steeper \(\Delta f\), while some have a costlier start; but all in all, they grow linearly, never able to have a slope lower than the ideal scenario’s.
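In fact, the whole family fits in two numbers. A minimal sketch, with constants invented for illustration:

```haskell
-- A toy pay-as-you-go curve: linear in complexity, with a slope that
-- never drops below the ideal marginal cost (1, in this toy model).
payAsYouGo :: Double -> Double
payAsYouGo complexity = initialCost + slope * complexity
  where
    initialCost = 5.0  -- project setup, toolchain, boilerplate
    slope       = 1.2  -- ≥ 1: it cannot beat the ideal scenario
```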

Examples of these are:

  • Java
  • c
  • SQL
  • Hibernate*

All of these follow the pattern:

Simple things are cheap. Complex things are costly.

A note on Hibernate, and many others: complexity is treated here as a one-size-fits-all kind of thing. The reality, sadly, is much more complicated. If you were to focus on the \(\frac{cost}{complexity}\) of, say, switching databases, then an ORM like Hibernate adds an upfront cost to simple things in exchange for a better marginal cost when reaching the complex side of the spectrum.

But we are not here to talk about workhorses, but rather the odd ones out. So on to the more interesting ones!

Debt

These are the most interesting to me. Technologies that promise you a very low barrier to entry.

Simple things can be done incredibly fast and easily, without hindering other non-functional aspects like security or speed. But, as in real life, debts are not free. At some point on the complexity scale, things start to become hard. And way hard at that.

All those nice defaults start to work against your custom needs. You start to fight the framework, find ways around checks and validations. Ways that deviate from the secure, fast, reliable path.

[Graph: Debt]

Examples of these are:

  • Node’s ecosystem
  • Ruby on Rails
  • no-code

And Ruby on Rails or Node.js fans might argue that that is not the case; that the cost rarely increases; and that this reflects my general lack of understanding of the deeper systems, or the metaprogramming machinery, or yet another dependency that can be used as an import, a CLI, or a bundler ⸮ . To which I will yield that this is probably true. And that it is not a counterargument, but rather proof. At a certain point, the complexity of a problem intertwines with the complexity of the technology, compounding the costs.

Thus, the exponential nature of the matter. Remember that the cost at the end of the complexity axis must always be greater than in the ideal scenario. There is not, and will never be, a piece of technology that can beat the cruel reality that complexity carries costs.

But there is a good lesson to learn from these: if you know that the complexity of a problem is capped, then these alternatives are just what you need. If you need a CRUD API, some web views with some forms, a little OAuth without much customization; then I encourage the use of this sort of technology.

Looking at the graph, there is a point where the Ideal and the Debt curves intersect. If you can be sure that the complexity will always stay on the left-hand side, then using any other technology would be ill-advised. But if the threshold does get crossed, then you are stuck between switching to another type of technology, and re-paying all the costs that got you to this point; or dealing with an exorbitant \(\frac{cost}{complexity}\).
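To sketch that crossing point (curve shapes and constants invented for illustration), one can model Debt as cheap-then-compounding and scan for the first complexity where it overtakes the ideal line:

```haskell
-- Toy curves: an ideal line and a debt curve that starts cheaper
-- but compounds exponentially. All constants are invented.
ideal, debt :: Double -> Double
ideal x = x
debt  x = 0.2 * x + 0.01 * (exp x - 1)

-- The first complexity (on a coarse grid) where Debt stops being a bargain.
breakEven :: Maybe Double
breakEven = case filter (\x -> debt x > ideal x) [0.1, 0.2 .. 20] of
  []      -> Nothing
  (x : _) -> Just x   -- ≈ 6.3 with these made-up constants
```

Left of breakEven, Debt wins; right of it, every extra unit of complexity is paid at an ever-worsening rate.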

[Graph: Partial payment]

Be not mistaken, the cost difference before the Debt curve and the Ideal curve meet was paid by the designers of the technology. And we should be grateful that they allow us to offset a big chunk of our costs onto their endeavors.

Upfront

Finally: Upfront. These are a treat. You pay a steep cost upfront, but after some turning point, the marginal cost is cheap. These are ways of managing complexity: composition, metaprogramming, and tools for code synthesis are all techniques that pay their cost upfront for future gains.

[Graph: Upfront]

Some might call this approach over-engineering, and that is true; up until the point where the cost of failing to have done it is greater. In other words:

\[overEngineered(complex) = \left\{ \begin{array}{ll} \text{Yes} &: payAsYouGo(complex) \leq upfront(complex) \\ \text{No} &: \text{otherwise.} \\ \end{array} \right.\]
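Rendered as Haskell (a direct transcription of the formula above, with the two cost curves passed in as parameters rather than pinned to any particular technology):

```haskell
-- Over-engineered at a given complexity iff pay-as-you-go would have
-- cost at most as much as the upfront investment did.
overEngineered :: (Double -> Double)  -- payAsYouGo cost curve
               -> (Double -> Double)  -- upfront cost curve
               -> Double              -- the complexity actually reached
               -> Bool
overEngineered payAsYouGo upfront complex =
  payAsYouGo complex <= upfront complex
```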

Examples of these are:

Maybe you are one of the enlightened few to have partially paid some of these costs already and know about β-reduction, monad composition⁴, or other techniques to manage complexity; but be not mistaken: that cost will be paid by every other person that has to deal with it in the future.

Conclusion

No technology is better or worse in the abstract. It is a choice between good things and bad things. In this blog post, I wanted to explore a different way of evaluating a piece of technology with just these two axes. And hopefully, give you a new tool to measure up prospects for your next undertaking.

[Graph: Comparison]

So I leave you with this final question to encapsulate all of these learnings:

Does either cost or complexity have a ceiling in this project?

And then, it is a matter of putting a horizontal or vertical line on the limit and seeing what strategy yields better results.

More often than not, one wants to minimize costs but complexity is set. Therefore the 1-picture-summary would be:

\[f(complexity) = min( upfront(complexity), payAsYouGo(complexity), debt(complexity) )\]
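Or, as one last sketch, the same pointwise minimum in Haskell, with the three cost curves left abstract:

```haskell
-- The 1-picture summary: with complexity fixed, the best achievable
-- cost is the pointwise minimum of the three strategies' curves.
bestCost :: (Double -> Double)  -- upfront
         -> (Double -> Double)  -- payAsYouGo
         -> (Double -> Double)  -- debt
         -> Double -> Double
bestCost upfront payAsYouGo debt complexity =
  minimum [upfront complexity, payAsYouGo complexity, debt complexity]
```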

[Graph: Comparison, optimized]

I’m not advocating using a Debt technology at the beginning and then switching mid-project, as the cost at the switching point would double; but rather a methodology of choosing the correct shape to deal with the expected complexity of the project.

Nobody Gets Fired For Buying IBM

I’ll concede that choosing a technology is an architectural decision made at the start of a project, and that knowing how complex it will end up being is almost impossible. This is why, I think, pay as you go technologies are so prevalent.


  1. The idea of a graph with no units, no scale, or raw data to replicate them irks me. If you are like me, think of these not as “graphs”, but rather as pretty pictures 😊. 

  2. If you allow the idea of representing a program as data, and are Haskell-inclined, then pure () :: IO () would be what I refer to as “trivial complexity”: it does nothing, it needs no explanation of what it does, and there is nothing that can be taken away from it and still leave a valid program; hence the lower bound on complexity. 

  3. No free lunch theorem. 

  4. Or rather lack thereof. 
