Software Engineering

Content tagged "Software Engineering".

Knowledge Creates Technical Debt

Luke Plant:

The “pile of technical debt” is essentially a pile of knowledge – everything we now think is bad about the code represents what we’ve learned about how to do software better. The gap between what it is and what it should be is the gap between what we used to know and what we now know.

Seeing Like a Software Company

Sean Goedecke:

This is why tiny software companies are often much better than large software companies at delivering software: it doesn’t matter that the large company is throwing ten times the number of engineers at the problem if the small company is twenty times more efficient.

Why don’t large companies react to this by doing away with all of their processes? Are they stupid? No. The processes that slow engineers down are the same processes that make their work legible to the rest of the company. And that legibility (in dollar terms) is more valuable than being able to produce software more efficiently.

How Swift's Server Support Powers Things Cloud

Vojtěch Rylko and Werner Jainek:

Our legacy Things Cloud service was built on Python 2 and Google App Engine. While it was stable, it suffered from a growing list of limitations. In particular, slow response times impacted the user experience, high memory usage drove up infrastructure costs, and Python’s lack of static typing made every change risky. For our push notification system to be fast, we even had to develop a custom C-based service. As these issues accumulated and several deprecations loomed, we realized we needed a change.

They chose to rewrite using Swift on the server.

A Practitioner's Guide to Wide Events

Jeremy Morrell:

Adopting Wide Event-style instrumentation has been one of the highest-leverage changes I’ve made in my engineering career. The feedback loop on all my changes tightened and debugging systems became so much easier. Systems that were scary to work on suddenly seemed a lot more manageable.

The Greg Pass Strategy for Getting Unstuck

Tony Stubblebine:

Many things at Twitter were broken to the point that they could bring the entire site to a halt. Greg’s strategy, now distorted through multiple retellings and my own foggy memory, was to focus on short-term triage rather than long-term fixes.

Essentially, he realized that a collection of temporary, duct-taped fixes was the only thing that would give the Twitter team the breathing room to start working on longer-term fixes.

I think of this in school grade terms. Greg went looking for all the Fs and then turned them into Ds. Then he turned all the Ds into Cs and then all the Cs into Bs, etc.

Agile and the Long Crisis of Software

Miriam Posner:

You could say, then, that by the late 1960s, software development was facing three crises: a crying need for more programmers; an imperative to wrangle development into something more predictable; and, as businesses saw it, a managerial necessity to get developers to stop acting so weird.

CUPID—for Joyful Coding

Dan North:

The five CUPID properties are:

  • Composable: plays well with others
  • Unix philosophy: does one thing well
  • Predictable: does what you expect
  • Idiomatic: feels natural
  • Domain-based: the solution domain models the problem domain in language and structure

Also this belter from Martin Fowler.

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.”

Why Is It Getting Harder to Apply Software Architecture?

George Fairbanks:

There are two parts to software development: creating a design and expressing it as code. The code is tangible but the design is conceptual. Keeping a project healthy means doing both well. Here’s my concern: whenever you mix the conceptual with the tangible, it’s easier to neglect the conceptual. When you miss a tangible target, it’s obvious, but when you miss a conceptual target, you might not recognize it, or might rationalize that, because it’s impossible to measure, you were really quite close.

Blindly applying a factory process to software development will drive improvements to the tangible part (the code) at the expense of the conceptual part (the design). We see plenty of examples of this today, where teams have great feature velocity at first, are puzzled when velocity slows, and eventually the project is abandoned. As Cunningham warned, if we bolt features onto an existing codebase without consolidating those ideas into the code, the design will suffer, and over time “[e]ntire engineering organizations can be brought to a standstill under the debt load of an unconsolidated implementation.”

Uber and the Epic Rebuild

McLaren Stanley:

Alright folks, gather round and let me tell you the story of (almost) the biggest engineering disaster I’ve ever had the misfortune of being involved in. It’s a tale of politics, architecture and the sunk cost fallacy

Application State Management With React

Kent C. Dodds:

  1. Not everything in your application needs to be in a single state object. Keep things logically separated (user settings does not necessarily have to be in the same context as notifications). You will have multiple providers with this approach.
  2. Not all of your context needs to be globally accessible! Keep state as close to where it’s needed as possible.

Reassessing Service Split-Join Criteria

Vasco Figueira:

TL;DR “Microservices” was a good idea taken too far and applied too bluntly. The fix isn’t just to dial back the granularity knob but instead to 1) focus on the split-join criteria as opposed to size; and 2) differentiate between the project model and the deployment model when applying them.

There are build patterns that give us total control over the relative visibility between components and over how they compose into deployables. One build to many deployables, many to one or anything in between is possible. Shifting concerns left helps us build better software. An earlier error is a better error. Constraints liberate.

The tensions and constraints that shape the arrangement of our projects, modules and dependencies are of a different nature than those shaping the arrangement of our deployables. A little build-fu goes a long way in combining development friendliness with mechanical sympathy.

Programming as Theory Building

Peter Naur’s classic 1985 essay “Programming as Theory Building” argues that a program is not its source code. A program is a shared mental construct (he uses the word theory) that lives in the minds of the people who work on it. If you lose the people, you lose the program. The code is merely a written representation of the program, and it’s lossy, so you can’t reconstruct a program from its code.

A copy of the paper.

Design Docs at Google

Malte Ubl:

As software engineers our job is not to produce code per se, but rather to solve problems. Unstructured text, like in the form of a design doc, may be the better tool for solving problems early in a project lifecycle, as it may be more concise and easier to comprehend, and communicates the problems and solutions at a higher level than code.

Deploys at Slack

Michael Deng and Jonathan Chang:

Deploys require a careful balance of speed and reliability. At Slack, we value quick iteration, fast feedback loops, and responsiveness to customer feedback.

An interesting dive into how Slack handles deploys to a large fleet of users.

Write Code That Is Easy to Delete, Not Easy to Extend

tef:

Every line of code written comes at a price: maintenance. To avoid paying for a lot of code, we build reusable software. The problem with code re-use is that it gets in the way of changing your mind later on.

Building reusable code is something that’s easier to do in hindsight with a couple of examples of use in the code base, than foresight of ones you might want later.

Testing in Production

Charity Majors:

it’s actually an unqualified good for engineers to be interacting with production on a daily basis, observing the code they wrote as it interacts with infrastructure and users in ways they could never have predicted.

A system’s resilience is not defined by its lack of errors; it’s defined by its ability to survive many, many, many errors. We build systems that are friendlier to humans and users alike not by decreasing our tolerance for errors, but by increasing it. Failure is not to be feared. Failure is to be embraced, practiced, and made your good friend.

Software Design Is Human Relationships

Kent Beck:

The first step is acknowledging that our relationship is more important than the design of the system. As long as we have a productive working relationship we can move the design in any direction. When our relationship breaks down we don’t get anywhere.

Okay, so you just want to go implement the next feature and along I come and say no no no this should be designed completely differently. Even if you are right that the new structure will eventually make my behavior changes easier to implement it’s not eventually, it’s today.

First, acknowledge that our incentives diverge in this moment. It doesn’t help to pretend that we agree when we don’t.

Second, as the structure changer I need to acknowledge that I am placing a burden of learning on you. I think it’s worth it, but if I’m asking something of you I better be prepared to offer something to you.

Software design is a human relationship problem with interesting technical aspects. Geeks relating to geeks requires as much effort as geeks relating to their systems. Maintaining relationships may be hard and confusing and frustrating to geeks (I could be projecting here but yeah no I don’t think I am), but if you want your technical skills to matter you really have no choice but to improving your people skills.

Scalable Operations in Complex Systems

Liz Fong-Jones:

In order to succeed at production ownership, a team needs a roadmap for developing the necessary skills to run production systems. We don’t just need production ownership; we also need production excellence. Production excellence is a teachable set of skills that teams can use to adapt to changing circumstances with confidence. It requires changes to people, culture, and process rather than only tooling.

Even a perfect set of SLOs and instrumentation for observability do not necessarily result in a sustainable system. People are required to debug and run systems. Nobody is born knowing how to debug, so every engineer must learn that at some point. As systems and techniques evolve, everyone needs to continually update with new insights.

Magnitudes of Exploration

Will Larson:

Standardizing technology is a powerful way to create leverage: improve tooling a bit and every engineer will get more productive. Adopting superior technology is, in the long run, an even more powerful force, with successes compounding over time. The tradeoffs and timing between standardizing on what works, exploring for superior technology, and supporting adoption of superior technology are at the core of engineering strategy.

An effective approach is to prioritize standardization, while explicitly pursuing a bounded number of explorations that are pre-validated to offer a minimum of an order of magnitude improvement over the current standard.

Lessons From Six Software Rewrite Stories

Herb Caudill:

Once you’ve learned enough that there’s a certain distance between the current version of your product and the best version of that product you can imagine, then the right approach is not to replace your software with a new version, but to build something new next to it — without throwing away what you have.

Dark

Paul Biggar:

Dark is a holistic programming language, structured editor, and infrastructure, for building backend web services. It’s aimed at frontend, backend, and mobile engineers.

Soup to nuts.

Mental Models for Code and Systems

Cindy Sridharan quoting Joe Armstrong:

We should identify the error kernel. The error kernel of a system is that part which must be correct. That’s what the error kernel is. All the other code can be incorrect, it doesn’t matter. The error kernel is the part of the system that must be correct. If it’s incorrect, then all bets are off. The error kernel must be correct.

Constant Time Systems

Colm MacCárthaigh:

Now, let’s try a much better, O(1) control plane. DUMB AS ROCKS. Imagine if instead that the customer API pretty much edits a document directly, like a file on S3 or whatever. And imagine the servers just pull that file every few seconds, WHETHER IT CHANGED OR NOT.

This system is O(1). The whole config can change, or none of it, and the servers don’t even care. They are dumb. They just carry on. This is MUCH MUCH more robust. It never lags, it self-heals very quickly, and it’s always ready for a set of events like everyone changing everything at once. The SYSTEM DOES NOT CARE.

An Approach to Using React and Redux

Kevin Rutherford:

What follows is my personal, opinionated, idiosyncratic approach to React/Redux development, forged in half a dozen attempts to pursue test-driven development at a good pace.

Some interesting approaches in here. I like the use of more classic terms for some of the Redux concepts. Commands instead of Action Creators. Events instead of actions. Read models instead of reducers or selectors.

Clojure and Types

Eric Normand:

Rich Hickey explained the design choices behind Clojure and made many statements about static typing along the way. I share an interesting perspective and some stories from my time as a Haskell programmer. I conclude with a design challenge for the statically typed world.

People often talk about a Person class representing a person. But it doesn’t. It represents information about a person. A Person type, with certain fields of given types, is a concrete choice about what information you want to keep out of all of the possible choices of what information to track about a person. An abstraction would ignore the particulars and let you store any information about a person. And while you’re at it, it might as well let you store information about anything. There’s something deeper there, which is about having a higher-order notion of data.

APIs as infrastructure: future-proofing Stripe with versioning

Brandur Leach:

Versioning is always a compromise between improving developer experience and the additional burden of maintaining old versions. We strive to achieve the former while minimizing the cost of the latter, and have implemented a versioning system to help us with it. Let’s take a quick look at how it works.

Explicitly modeling changes between versions via a DSL.

PumpkinDB

About PumpkinDB:

PumpkinDB is essentially a database programming environment, largely inspired by core ideas behind MUMPS. Instead of M, it has a Forth-inspired stack-based language, PumpkinScript. Instead of hierarchical keys, it has a flat key namespace and doesn’t allow overriding values once they are set.

The core ideas behind PumpkinDB stem from the so called lazy event sourcing approach which is based on storing and indexing events while delaying domain binding for as long as possible. That said, the intention of this database is to be a building block for different kinds of event sourcing systems, ranging from the classic one (using it as an event store) all the way to the lazy one (using indices) and anywhere in between.

When in doubt, refactor at the bottom

Eric Normand:

But there’s a deeper problem. I actually think ten lines of code is too big for an abstraction (Metz agrees). Ten lines of code that happen to be repeated are so unlikely to have any interesting properties, including being bug-free, reusable, and composable, that you should be very skeptical of being able to factor the whole thing out as common.

However, don’t despair! You can more often than not find lots of one-, two-, or three-line bits of code that are 1.) easy to name and 2.) reusable. If you break enough of those out, you may start to see some underlying structure to those repeated ten lines as they become replaced by your new, small abstractions.

In Pursuit of Production Minimalism

Brandur Leach:

It’s not that new technology should never be introduced, but it should be done with rational defensiveness, and with a critical eye in how it’ll fit into an evolving (and hopefully ever-improving) architecture.

It suggests improving increasing productive output by continually improving the efficiency of a system even while keeping input the same. I project this onto technology to mean building a stack that scales to more users and more activity while the people and infrastructure supporting it stay fixed.

Building a CQRS/ES web application in Elixir using Phoenix

Ben Smith:

Building applications following domain-driven design and using CQRS feels really natural with the Elixir – and Erlang – actor model. Aggregate roots fit well within Elixir processes, which are driven by immutable messages through their own message mailboxes, allowing them to run concurrently and in isolation.

I too think that Elixir is a good fit for CQRS/ES applications.

The post provides a thorough walkthrough that builds an event sourced/CQRS Elixir application using Phoenix, Commanded, and Eventstore. The Commanded and Eventstore libraries look to have implemented most of the features I’m looking for.

Presentation on versioning event streams from Greg Young

Greg Young:

We will talk about what seems to be the most complex areas of event sourcing for most developers especially those first getting into it. Versioning.

Remapping the event stream is an approach we’ve used. I like the idea of emitting stop and stopped events to the store to support zero downtime migrations while remapping the stream.

Freactal

Formidable:

Clean and robust state management for React and React-like libs.

The library grew from the idea that state should be just as flexible as your React code; the state containers you build with freactal are just components, and you can compose them however you’d like. In this way, it attempts to address the often exponential relationship between application size and complexity in growing projects.

Like Flux and React in general, freactal builds on the principle of unidirectional flow of data. However, it does so in a way that feels idiomatic to ES2015+ and doesn’t get in your way.

The single state object, reducers, and selectors associated with Redux feels overwrought and often gets in the way. Freactal looks like an interesting alternative.

A Whole System Based on Event Sourcing is an Anti-Pattern

Jan Stenberg:

The single biggest bad thing that Young has seen during the last ten years is the common anti-pattern of building a whole system based on Event sourcing. That is a really big failure, effectively creating an event sourced monolith. CQRS and Event sourcing are not top-level architectures and normally they should be applied selectively just in few places.

Young also believes there will be more use of Process managers, especially with more and more use of Actors since they for him make perfect Process managers. He describes an Actor framework as a Process manager framework.

Cap'n Proto: RPC Protocol

Cap’n Proto RPC:

Cap’n Proto RPC employs TIME TRAVEL! The results of an RPC call are returned to the client instantly, before the server even receives the initial request!

There is, of course, a catch: The results can only be used as part of a new request sent to the same server. If you want to use the results for anything else, you must wait.

However, Cap’n Proto promises support an additional feature: pipelining. The promise actually has methods corresponding to whatever methods the final result would have, except that these methods may only be used for the purpose of calling back to the server. Moreover, a pipelined promise can be used in the parameters to another call without waiting.

An interesting feature which limits the number of network hops in the query is a pipeline of dependent requests.

Using Pseudo-URIs Between Services

Phil Calçado:

In my experience, a good-enough specification for a pURI should specify:

  • How many segments are there and if teams and service owners can add new segments
  • What is the encoding and escaping expected for the text
  • What in the identifier should be treated as opaque data as opposed to parts where consumers can make assumptions about data formats

We’ve been playing with almost the exact same scheme recently for some of our services.

Tackling Complexity in CQRS

Vladik Khononov:

From my point of view, the CQRS-induced complexity is largely accidental, and thus can be avoided. To illustrate my point, I want to discuss the goal of CQRS, and then analyze 3 common sources of accidental complexity in CQRS-based systems.

I find myself nodding to the sentiments of this post. It feels like doggedly pursing a theoretical ideal can lead to unnecessary difficulty in CQRS implementations.

What is the Mikado Method?

The Mikado Method is a structured way to make major changes to complex code. It helps you visualize, plan, and perform business value-focused improvements over several iterations and increments of work, without ever having a broken codebase during the process.

Parallel and Concurrent Haskell Videos by Bartosz Milewski

Bartosz Milewski:

The idea of this series was to teach enough Haskell to be able to read Simon Marlow’s book of the same title. It turned out to be both a Haskell course and a selection of topics on parallelism and concurrency. There should be something interesting for both total beginners and experts.

These videos contain some good introductory material to Haskell and great coverage of monads and the IO monad in particular1.

Milewski also has a video series on Category Theory which I’ve yet to watch.


  1. Skip to the 5-1/2 and 6-1/2 videos for the monad stuff. ↩︎

Open Source Project Principles

Ryan Florence:

Rackt is a GitHub organization that was created to ensure critical open source projects in the React community receive long-term support and maintenance.

We believe these four principles will provide a strong ecosystem of our open source projects by ensuring projects have active and enthusiastic owners and contributors.

Running Lisp in Production

Vsevolod Dyomkin:

At Grammarly, the foundation of our business, our core grammar engine, is written in Common Lisp. It currently processes more than a thousand sentences per second, is horizontally scalable, and has reliably served in production for almost 3 years.

RSS feed for content about Software Engineering.