Photo by Helloquence on Unsplash (Edited)

The Contract

The importance of getting the API correct, and the initial design of the API.

Editor's Note: the previous story in this series is “Introducing wrkpi.pe”, about the idea to ship a project demonstrating thinking to the world.

The fundamental assumption I am making with this application is that it will be consumed over the network. There are two different ways we can express the internal state of this application over the network:

  1. As a rendered user interface to be consumed by a browser
  2. As an interface through which clients can make specific remote procedure calls (RPCs) and receive structured information but no user interface

Generally speaking I strongly prefer the latter approach.

To render or not to render

In small teams and within a limited use case, the first approach, in which the application renders the user interface itself (as HTML documents), makes some sense. It is:

  • Well supported
  • Quickly shippable
  • Easy to reason about

However, in my experience it runs into a couple of issues if either of those assumptions starts to fail.

On a long enough timeline, all things become an API

Just about all software is subject to change over time. However, not all software changes in a way that makes future changes easy to accomplish. It is perhaps easiest to provide a concrete example.

As part of various interview requirements I was required to get information from Metacritic about how a given game is ranked, or what the most popularly ranked game would be.

Unfortunately, Metacritic does not expose this information in a structured way. Instead, to understand and transform this data one must query the page's DOM structure and traverse it looking for specific information. In effect, the DOM structure of the page has become the API through which that information can be queried.

This structure is not designed to be an API. That makes it very likely to break over time as its implementers repurpose the DOM for whatever they see fit. Further, it is unreasonable to complain when they do break my usage, as there were never any guarantees as to how the DOM would change, and the DOM regularly does change to adhere to design requirements.
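To make the fragility concrete, here is a minimal sketch of a scraper that treats markup as its API. The HTML snippets, the `score` class name and the `ScoreScraper` helper are all invented for illustration and do not reflect Metacritic's real markup.

```python
from html.parser import HTMLParser


class ScoreScraper(HTMLParser):
    """Collects the text of <span class="score"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self._in_score = False
        self.scores = []

    def handle_starttag(self, tag, attrs):
        # The scraper is coupled to one tag name and one class name.
        if tag == "span" and ("class", "score") in attrs:
            self._in_score = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_score = False

    def handle_data(self, data):
        if self._in_score:
            self.scores.append(data.strip())


def scrape_scores(html):
    parser = ScoreScraper()
    parser.feed(html)
    return parser.scores


# Two invented pages carrying the same information.
page_before = '<div><span class="score">87</span></div>'
page_after = '<div><b class="rating">87</b></div>'  # a purely visual redesign

print(scrape_scores(page_before))  # the score is found
print(scrape_scores(page_after))   # nothing is found, and no error is raised
```

The second call fails silently: the page still shows the same number to a human, but the consumer that depended on the old structure receives nothing.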

That makes my requirement very expensive to upkeep. This story is repeated across organisations almost indefinitely; if not with scraping, then with a given workflow, data format or any number of other “soft” APIs.

Team coordination is expensive

In a large enough software company software is going to be consumed by people who work in different teams, or people who work outside the organisation.

The communication process between teams tends to be horrifically expensive if it is based on email, chat or telephone calls. Invariably, there are:

  • Time zones being different
  • Scheduling difficulties
  • The need to concentrate on software development
  • The need to establish a common language about the software stack

When there is no defined contract between teams in the form of an RPC API the requirement to communicate between teams is essentially guaranteed.

That kills development velocity.

Contracts up front

As with any kind of agreement, it is generally expensive to renegotiate when requirements change, or when a requirement that was clear to one party was not clear to another.

Given this, I will generally start by defining the RPC calls — before implementing a line of code. There are several reasons for this:

RPCs are language independent — hopefully

The value of a language-interoperable RPC is that users who have limited knowledge of the application itself, but understand the nature and format of the RPC, can consume the services exposed by the application. For example, a developer who likes working in Python can query a service written in PHP so long as they follow the RPC conventions, without even knowing that the service is written in PHP.

It is possible to generate these RPC definitions from the programming languages themselves with “annotations” or similar language specific constructs. Generally speaking, I try to avoid this.

Writing endpoints within a given language and framework encourages thinking in the way that language and framework provide. The structure of URLs, the versioning of the APIs and the request and response payloads all become influenced by that particular tooling's design. While this makes the endpoints easier to write with that tooling, it makes them much harder to use for consumers, who see only the RPC specification, presented in a way that, to them, is odd.

Given this, I will write the RPC specification before any code at all and push hard to implement it as the spec reads — even if it is more complex to implement in a given application.

This creates an API that is much nicer for consumers who will, ultimately, be doing the vast majority of implementation based on that API.

Choosing the RPC tooling

There are several kinds of RPC tooling, but the ones I use the most and have the most experience with are:

  • gRPC
  • OpenAPI (formerly Swagger)

Both are excellent. In the ideal case I would elect to use gRPC; however, it is not well supported in PHP, and PHP is the language the reviewers of this project will be expecting. Given this, the contract will be specified in the OpenAPI v3 format.
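For readers who have not seen an OpenAPI v3 contract, the sketch below builds a minimal one as a Python dictionary and prints it as JSON. The path, the `getWidget` operation and the `Widget` schema are hypothetical placeholders and are not taken from the wrkpi.pe specification.

```python
import json

# Hypothetical contract: the path, operation and schema names are invented
# for illustration and are not taken from the wrkpi.pe repository.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Example contract", "version": "v1alpha1"},
    "paths": {
        "/v1alpha1/widgets/{id}": {
            "get": {
                "operationId": "getWidget",
                "parameters": [
                    {
                        "name": "id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {
                    "200": {
                        "description": "The requested widget",
                        "content": {
                            "application/json": {
                                "schema": {"$ref": "#/components/schemas/Widget"}
                            }
                        },
                    }
                },
            }
        }
    },
    "components": {
        "schemas": {
            "Widget": {
                "type": "object",
                "required": ["id"],
                "properties": {"id": {"type": "string"}},
            }
        }
    },
}

print(json.dumps(spec, indent=2))
```

Nothing in this document mentions PHP, or any other implementation language; a consumer only needs the path, the parameter and the response shape.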

The spec itself

Versioning

Because of the cost and complexity involved in changing an API, I tend to push an API through three stages, modelled after the versions exposed by Kubernetes:

  1. Alpha — This API is a prototype. It has no guarantees and will be changed based on feedback. Generally this is developed in active communication with several downstream partners.
  2. Beta — This API appears to be a good representation of the problem but because there is only limited adoption of the alpha APIs there might be issues that have been unforeseen.
  3. Stable — This API is good. We will try very hard never to break it.

API versions are written as follows:

v1alpha1

Where the constituent components are:

  1. v1 - The eventual stable API target
  2. alpha - The current lifecycle stage of this API
  3. 1 - How many times this API has changed within its current lifecycle stage

Once an API is in v1 it should essentially never be broken. This can never be guaranteed indefinitely, but a deprecation period of 12 or 24 months should be ample.

I adopt this pattern for most network-facing API design, both for OpenAPI-style and for protobuf implementations.
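As a small sketch of how the three components can be read back out of a version string, assuming the `v<major>[alpha|beta]<revision>` shape described above (the `ApiVersion` type and `parse_version` helper are hypothetical names):

```python
import re
from typing import NamedTuple, Optional

# Matches "v1", "v1alpha1", "v2beta3" and so on.
VERSION_PATTERN = re.compile(
    r"^v(?P<major>\d+)(?:(?P<stage>alpha|beta)(?P<revision>\d+))?$"
)


class ApiVersion(NamedTuple):
    major: int                # the eventual stable API target
    stage: str                # "alpha", "beta" or "stable"
    revision: Optional[int]   # changes within the stage; None once stable


def parse_version(text):
    match = VERSION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognised API version: {text!r}")
    revision = match.group("revision")
    return ApiVersion(
        major=int(match.group("major")),
        stage=match.group("stage") or "stable",
        revision=int(revision) if revision else None,
    )


print(parse_version("v1alpha1"))
print(parse_version("v1"))
```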

Routing

If implemented over HTTP, each version is at a separate endpoint. For example,

  • /v1alpha1/${METHOD_A}
  • /v1alpha1/${METHOD_B}
  • /v1beta1/${METHOD_A}
  • /v1/${METHOD_A}

In this way a method can exist across multiple versions simultaneously. This helps when migrating a given object between versions: it can be translated from v1beta1 to v1 and served from both, with v1beta1 remaining available for a deprecation period.
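One way to picture this is a routing table keyed by version and method, where the older version is a translation of the newer one. This is only a sketch: the `widget` method, its field names and the `dispatch` helper are invented for illustration.

```python
def get_widget_v1(widget_id):
    # Hypothetical stable representation.
    return {"id": widget_id, "display_name": "Example"}


def get_widget_v1beta1(widget_id):
    # The older shape is derived from the stable one, so both stay in sync
    # while v1beta1 works through its deprecation period.
    current = get_widget_v1(widget_id)
    return {"id": current["id"], "name": current["display_name"]}


ROUTES = {
    ("v1", "widget"): get_widget_v1,
    ("v1beta1", "widget"): get_widget_v1beta1,
}

DEPRECATED_VERSIONS = {"v1beta1"}


def dispatch(path, widget_id):
    """Resolves '/<version>/<method>' and returns (status, body, deprecated)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2:
        return 404, None, False
    version, method = parts
    handler = ROUTES.get((version, method))
    if handler is None:
        return 404, None, False
    return 200, handler(widget_id), version in DEPRECATED_VERSIONS


print(dispatch("/v1/widget", "42"))
print(dispatch("/v1beta1/widget", "42"))
```

Existing consumers of the beta endpoint keep working unchanged, and the deprecation flag gives them a machine-readable hint to move.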

The specs themselves are kept separately as it is difficult in the OpenAPI standard to keep multiple versions within the same file.

Entities

More generally, I will tend to first define the minimal properties of the entity and create a set of sane requests and responses. That is, I fill out a “fully functional” API before implementing too much logic associated with the entities and the responses.

The reason for this is so that I consider the full lifecycle of a given entity, including deploying it and attempting to implement that in a client before I get too invested in the entity design. The hope is that this encourages the design of an API that is optimized for consumption, rather than implementation.

Hypermedia / Links

Within an API design I will tend to make use of hypermedia links rather than attempting to determine how many of a given object should be embedded in the response.

This allows users to crawl an API response easily to generate a “graph” of a person on the client, descending as they wish.

This is fine for links with low latency, as the request and response cycle is expected to be quick. However, for requests with high latency, crawling the graph may prove too costly, as each response must be parsed before new requests can be formed. In that case a technology like GraphQL might be a worthy successor.
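The cost of crawling can be sketched with an in-memory stand-in for the API. The resource URLs, the `links` field and the `crawl` helper below are assumptions made for illustration; each call to `fetch` represents one network round trip.

```python
from collections import deque

# In-memory stand-in for an HTTP API (hypothetical URLs and fields).
RESOURCES = {
    "/v1alpha1/people/1": {
        "name": "Ada",
        "links": {"employer": "/v1alpha1/companies/7"},
    },
    "/v1alpha1/companies/7": {
        "name": "Example Ltd",
        "links": {
            "founder": "/v1alpha1/people/1",
            "office": "/v1alpha1/offices/3",
        },
    },
    "/v1alpha1/offices/3": {"city": "Berlin", "links": {}},
}


def crawl(start_url, max_depth, fetch):
    """Breadth-first walk of the link graph, fetching each URL at most once."""
    graph = {}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in graph:
            continue  # already fetched; this also guards against cycles
        body = fetch(url)
        graph[url] = body
        if depth < max_depth:
            for target in body.get("links", {}).values():
                queue.append((target, depth + 1))
    return graph


def count_requests(max_depth):
    requests = []

    def fetch(url):
        requests.append(url)  # one entry per simulated round trip
        return RESOURCES[url]

    crawl("/v1alpha1/people/1", max_depth, fetch)
    return len(requests)


for depth in (0, 1, 2):
    print(depth, count_requests(depth))
```

Each extra level the client descends adds at least one more sequential round trip, which is the cost that becomes painful on high-latency links.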

Error object

Errors are an interesting class of API return.

The OpenAPI + HTTP spec provides a remarkable amount of information about the nature of an error condition. For example,

  • No connection? TCP will pick that up. You know the state.
  • TCP RST? Something went badly wrong there. You know that you don’t know.
  • Empty response? Something went wrong at the HTTP layer. You know that you don’t know, but the app probably did not function correctly.
  • HTTP response? Great! You have an abundance of information: status codes, headers and so forth.
  • The HTTP message body? Hmm. Value can vary.

I break errors down into two forms of action:

  1. Automatically repairable. This could be retries, cancelling transactions or other tasks that can be built in to the logic of the application.
  2. Not automatically repairable. Needs to be surfaced to a human to address.
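A client might encode this split as a small classification step. The sketch below is one possible policy, not a rule from the HTTP specification: which status codes count as retryable is an assumption each client has to make for itself, and the `Action` and `classify` names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "nothing to repair"
    RETRY = "automatically repairable"
    SURFACE_TO_HUMAN = "not automatically repairable"


# Assumed policy: timeouts, rate limiting and upstream failures are usually
# transient, so they are worth retrying before involving a person.
RETRYABLE_STATUSES = {408, 429, 502, 503, 504}


def classify(status):
    """Maps an HTTP status code to the action a client should take."""
    if 200 <= status < 400:
        return Action.NONE
    if status in RETRYABLE_STATUSES:
        return Action.RETRY
    return Action.SURFACE_TO_HUMAN


for status in (200, 503, 400):
    print(status, classify(status).value)
```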

The goal is to supply a response that maximises the amount that computers can do to repair a given issue. The HTTP spec takes care of this with a number of known error cases.

To that end, the current error payload that I return looks like:

{
"uri": "e.wrkpi.pe/9bccc946-d4b3-11e9-a6d2-0ffe87baa1be",
"description": "Unexpected FooCondition"
}

A URI is a well-defined standard for assigning a unique identifier at which a given resource can be discovered. This allows machines to uniquely identify this error, and software engineers to build some compensatory action into their software. It also points humans to a specific location where they can learn more about where a given error condition is likely to manifest.

The description is just to allow rendering some human readable error condition that is easy to understand even without the context of the documented error condition.
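A consumer could use the two fields roughly as follows. The registry and the `retry_later` action are hypothetical; the source does not say what compensation, if any, is appropriate for this particular error.

```python
import json


def retry_later(error):
    # Hypothetical compensating action for a known, documented error.
    return "retry scheduled"


# Hypothetical registry: documented error URI -> compensating action.
HANDLERS = {
    "e.wrkpi.pe/9bccc946-d4b3-11e9-a6d2-0ffe87baa1be": retry_later,
}


def handle_error(raw_body):
    """Repairs known errors automatically and surfaces the rest to a human."""
    error = json.loads(raw_body)
    handler = HANDLERS.get(error["uri"])
    if handler is not None:
        return handler(error)
    # Unknown to the machine: show the description and where to read more.
    return f"{error['description']} (see {error['uri']})"


known = '{"uri": "e.wrkpi.pe/9bccc946-d4b3-11e9-a6d2-0ffe87baa1be", "description": "Unexpected FooCondition"}'
unknown = '{"uri": "e.wrkpi.pe/some-undocumented-error", "description": "Unexpected BarCondition"}'

print(handle_error(known))
print(handle_error(unknown))
```

The URI does the work for the machine and the description does the work for the person reading a log line or an error dialog.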

Authentication

As of the time of writing there is no authentication.

This is fine for now on the current API, though in future, when authentication is required, I would implement it as OAuth2 with specific scopes modelled after each operation.
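As a rough sketch of what "scopes modelled after each operation" could look like, assuming invented scope names and paths (none of this exists in the current API):

```python
# Hypothetical mapping: one scope per operation.
REQUIRED_SCOPE = {
    ("GET", "/v1alpha1/widgets"): "widgets.read",
    ("POST", "/v1alpha1/widgets"): "widgets.create",
}


def is_allowed(method, path, granted_scope):
    """Checks a space-delimited OAuth2 scope string against one operation."""
    required = REQUIRED_SCOPE.get((method, path))
    if required is None:
        return False  # unknown operations are denied by default
    return required in granted_scope.split()


print(is_allowed("GET", "/v1alpha1/widgets", "widgets.read"))
print(is_allowed("POST", "/v1alpha1/widgets", "widgets.read"))
```

Tying each scope to a single operation keeps the grant a consumer needs readable directly from the contract.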

In Conclusion

The communication boundary of a service influences the conceptual model of that service and, whether intentional or not, how users consume that service becomes its API.

It is better to design such interfaces up front with explicit guarantees such that it is possible to evolve the application underneath them while maintaining user expectations.

You can see this in action at the following commit:

https://github.com/andrewhowdencom/wrkpi.pe/commit/673eb78f7956932ecdb5da73acaa99dfd5f8f1e6

And check out the next story in this series at: __NOT_YET_PUBLISHED__
