Implementing a Connection Pool (the right way)

A pool is generally used to create a set of resources at boot time (i.e.: when the process is starting). It can be configured either to create all the resources at once, or to create just the bare minimum needed for the average load and to grow (up to some limit) on demand.
This has several effects both at runtime level and at code level.

At runtime, pools will certainly improve the average performance of your application, because the resources required by each handler have already been instantiated and, in the average case, are likely to be ready to use. Pools also prevent the process’ memory from growing indefinitely: by design they limit the number of allocable resources, which pushes towards a design that strives to reuse the same resources as much as possible.

At code level, on the other hand, you will need to be cautious, because pools are not as easy as they might seem. Nothing comes for free.

Common usage for pools

A common use is handling database connections. Designing an application to create a new database connection for each incoming HTTP request has some benefits, but also some drawbacks. It may work on a small scale, when your application is not used too often and the average number of HTTP requests is low, but it doesn’t scale: the more requests your process must handle, the more database connections it needs to create. And creating, holding, using and terminating database connections has a cost in both memory and CPU time. Not to mention the consequences for the database server, which must in turn be able to handle tons of incoming connections as well:

And things can only get worse for the database as your application scales, and more instances are spawned to handle the incoming HTTP traffic:

But a process serving an incoming request doesn’t need a database connection for the whole time it is serving the request; it needs the connection only for part of that time. This means your process can reuse the same connection to handle two different incoming requests, assigning it to the two handlers as they need it (by means of some synchronisation mechanism, of course).
This lowers the overall number of connections to the database, and decreases the memory required by your process to handle the incoming requests as well:

The Pool design pattern

Pool design pattern - UML diagram

The basic Pool design pattern is pretty straightforward: the pool holds a collection of resources of some given type and hands them out on demand. Clients that want to use a resource must ask the pool, use the resource, and return it to the pool when it isn’t needed anymore. The pool takes care of instantiating new resources, tracks which ones are available and which ones are not, and makes resources available again when clients return them.

When the pool becomes empty (all the resources have been borrowed) and the pool size is at its maximum, no new resources are created, and a client trying to acquire a resource blocks until one is returned. Nothing changes from the client’s point of view: sooner or later a resource will be returned, and the waiting client will unblock and get its resource. From its perspective, it doesn’t matter whether the acquire call lasted a few milliseconds or a few hundred; it was just a call to a method that returned an object.
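To make the blocking acquire concrete, here is a minimal promise-based sketch of such a pool (the class and method names are illustrative, not taken from any specific library):

```typescript
// Minimal promise-based pool: acquire() resolves immediately if a
// resource is free, otherwise the caller awaits until one is returned.
class Pool<T> {
  private available: T[] = [];
  private created = 0;
  private waiters: ((resource: T) => void)[] = [];

  constructor(
    private readonly factory: () => T,
    private readonly maxSize: number
  ) {}

  async acquire(): Promise<T> {
    if (this.available.length > 0) {
      return this.available.pop()!;
    }
    if (this.created < this.maxSize) {
      this.created++;
      return this.factory();
    }
    // Pool exhausted: the caller just awaits. From its point of view
    // this is still "a call to a method that returned an object".
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }

  dispose(resource: T): void {
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter(resource); // hand it straight to the next waiting client
    } else {
      this.available.push(resource);
    }
  }
}
```

A client simply writes `const conn = await pool.acquire();` and never knows whether it waited or not.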

Despite its simplicity, however, the Pool design pattern hides some common pitfalls that you must pay attention to. First of all: how should your code use a pool? Are the acquire/dispose methods enough? What do you get when you acquire a resource?

How to properly use a pool

A pool, as stated before, creates resources on your behalf and holds them as long as it can. The same resources are used over and over again throughout the lifecycle of your process, which means that a resource you get can be in an invalid state. You need to design your Resource so that you can test it before you use it.

In the example of the database connection, there’s no certainty that the connection you get from the pool is still up. Clients shouldn’t trust the status of a resource returned by a pool, because it has most certainly been used by some other service.

First rule of the pool: you don’t talk to the pool

So, you design the application to give each service a new instance of the DBConnection abstraction, with a design like the following:

And then you write your LoginManagementService like follows:

class LoginManagementService {
  
  constructor(
    private readonly conn: DBConnection
  ) {}

  /**
   * Checks if the provided credentials belong to some user
   *
   * @throws UserNotFoundException
   * @throws InvalidPasswordException
   */
  checkCredentials(username: string, password: string) {
    const user = this.conn.query('SELECT * FROM users WHERE ...');
    
    if (!user) {
      throw new UserNotFoundException();
    } else if (user.password !== password) {
      throw new InvalidPasswordException();
    }
  }
}

A common mistake is re-thinking your application’s services to use a pool, making it clear in the code that you’re using it:

and changing your code accordingly:

class LoginManagementService {
  constructor (
    private readonly pool: Pool<DBConnection>
  ) {}

  /**
   * Checks if the provided credentials belong to some user.
   *
   * @throws UserNotFoundException
   * @throws InvalidPasswordException
   */
  checkCredentials(username: string, password: string) {
    const conn = this.pool.acquire();
    const user = conn.query('SELECT * FROM users WHERE ...');
    
    if (!user) {
      throw new UserNotFoundException();
    } else if (user.password !== password) {
      throw new InvalidPasswordException();
    }

    this.pool.dispose(conn);
  }
}

This might seem good code at first sight: only a little changes, the business logic is pretty much the same, and the connection is acquired and released as agreed. But the devil is in the details.

Despite appearances, the code is error prone. When one of the exceptions is thrown, execution never reaches the point where the connection is returned to the pool, effectively leaking it forever. That connection will never make it back to the pool, and the pool now has one connection less to share between services. Design your code like this and, sooner or later, enough resources will leak that the pool becomes empty, and the processes that try to acquire connections will hang forever.

Designing your project to make explicit use of a pool is not a good idea, unless you’re very disciplined. And, even in that case, don’t design your application like this:

Instead, hide the pool behind some object that makes your code unaware of it. In the above example, one might create one ad-hoc implementation of the DBConnection abstraction to hide the pool and give the services the illusion of using a simple database connection:

Since your service doesn’t care which implementation of the DBConnection abstraction it receives, you can write one more implementation that just takes care of interacting with the pool. Under the hood, the PoolWrapper implements the two base methods, query() and count(), by acquiring a real DBConnection from the pool, performing the operation with it, and returning it to the pool. Just like in the following snippet:

class PoolWrapper implements DBConnection {
  constructor(private readonly pool: Pool<DBConnection>) {}

  query(sql: string): Row[] {
    const conn = this.pool.acquire();

    try {
      return conn.query(sql);
    } finally {
      this.pool.dispose(conn);
    }
  }

  count(sql: string): number {
    /* pretty much the same logic */
  }
}

Of course, this is just an example. You should design the wrapper depending on your real case scenarios.
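For illustration, here is how such a wrapper could be wired up end to end. The Pool and FakeConnection stand-ins below are assumptions made so that the snippet is self-contained, not part of any real library:

```typescript
interface DBConnection {
  query(sql: string): string[];
}

// Minimal synchronous stand-in for a real pool.
class Pool<T> {
  constructor(private readonly items: T[]) {}
  acquire(): T { return this.items.pop()!; }
  dispose(item: T): void { this.items.push(item); }
}

// The wrapper pattern from the snippet above: acquire, use, always release.
class PoolWrapper implements DBConnection {
  constructor(private readonly pool: Pool<DBConnection>) {}

  query(sql: string): string[] {
    const conn = this.pool.acquire();
    try {
      return conn.query(sql);
    } finally {
      this.pool.dispose(conn); // released even if query() throws
    }
  }
}

// A fake connection, standing in for a real driver.
class FakeConnection implements DBConnection {
  query(sql: string): string[] { return ['row-for: ' + sql]; }
}

// Client code depends on DBConnection only; the pool is invisible.
const pool = new Pool<DBConnection>([new FakeConnection()]);
const db: DBConnection = new PoolWrapper(pool);
const rows = db.query('SELECT 1');
```

Any service written against DBConnection, like the LoginManagementService above, can receive the wrapper unchanged.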

Anyway, hiding a pool behind some wrapper makes your code far more resilient than handling the pool directly, because you can rely on specific, transparent components to reliably handle the acquire/release flow behind the scenes.

Second rule of the pool: you don’t trust the pool

The second rule to follow when you introduce a pool in your project is to never trust what the pool returns. Once you get a resource from the pool, you must ensure that it’s still valid, all the more so if the pool contains long-lived connections. If it is not, tell the pool to destroy it and get a new one. Repeat until you get a resource in a valid state. Then you can safely use it.

Pools typically expose a method to destroy a resource instead of just disposing of it. Such a method takes care of removing the resource from the pool, giving the resource being destroyed one last chance to clean up its own resources/handlers. The pool then uses its internal policy to determine whether a new resource should be instantiated to replace the one just gone.

The logic to ensure that a resource is still valid can be put either in the PoolWrapper component (see the previous point) or in some decorator of the pool. Either way, it is a strongly recommended approach, even for security reasons.

This is the approach you should follow when using something from the pool:

  • Get an item from the pool
  • Ensure the item is still valid
  • Use it as long as you need it
  • Clean the object before putting it back into the pool
  • Put the object back into the pool
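Those five steps can be sketched as follows (assuming the pool exposes acquire/dispose/destroy and the resource can report its own validity; all names are illustrative, not from a specific library):

```typescript
interface Validatable {
  isValid(): boolean;   // e.g. ping the connection
  cleanup(): void;      // e.g. reset session state, drop sensitive data
}

interface Pool<T> {
  acquire(): T;
  dispose(item: T): void;  // return the item to the pool
  destroy(item: T): void;  // remove the item from the pool for good
}

// Step 1 + 2: get an item, discarding invalid ones until a healthy one is found.
function acquireValid<T extends Validatable>(pool: Pool<T>): T {
  for (;;) {
    const item = pool.acquire();
    if (item.isValid()) {
      return item;
    }
    pool.destroy(item); // let the pool replace it per its own policy
  }
}

// Steps 3-5: use the item, then clean it and put it back, no matter what.
function withResource<T extends Validatable, R>(
  pool: Pool<T>,
  use: (item: T) => R
): R {
  const item = acquireValid(pool);
  try {
    return use(item);
  } finally {
    item.cleanup();     // best-effort removal of sensitive state
    pool.dispose(item);
  }
}
```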

This makes the design more secure, because you can trust that a best-effort approach has been used to remove any sensitive information from the items after their use.

Think of a SaaS service that must establish a pool of connections to the same database server, where each customer’s data is segregated in a different schema. Every time a request arrives, the service must choose the right database schema for the request being served. You don’t want developers to write code that chooses the right schema at business-logic level: should a developer forget to switch to the right schema, the damage could be enormous. A pool wrapper can be designed to automatically switch the connection to the proper schema before using it to perform the query, and to switch it back to some empty/unused schema before releasing it.

You design that behaviour once and for all.
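A sketch of such a wrapper might look like this (the setSchema() method, the schema names and the SchemaAwareConnection abstraction are all assumptions made for illustration):

```typescript
interface DBConnection {
  query(sql: string): string[];
}

interface SchemaAwareConnection extends DBConnection {
  setSchema(schema: string): void;
}

// Switches the borrowed connection to the tenant's schema before every
// query, and parks it on a neutral schema afterwards, so that no service
// can accidentally touch another customer's data.
class TenantConnection implements DBConnection {
  constructor(
    private readonly conn: SchemaAwareConnection,
    private readonly tenantSchema: string
  ) {}

  query(sql: string): string[] {
    this.conn.setSchema(this.tenantSchema);
    try {
      return this.conn.query(sql);
    } finally {
      this.conn.setSchema('unused'); // neutral schema between uses
    }
  }
}
```

The schema-switching concern is designed once, here, instead of being repeated in every service.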

How to properly size a pool

A pool can be configured to create all the instances of Resource at once, for example when you know in advance that you won’t need fewer resources than that, or when you want memory usage to stay constant. Pools also allow a min/max configuration: the pool starts with a bare minimum set of resources, creating new instances only when needed. In this case, a pool can destroy the exceeding resources immediately after they’re returned, or after some grace period, trying its best to reuse them as much as possible.

It is up to you to tune your pool with a policy that best fits your needs.

CQRS Episode II – Attach the cloners

In my previous post I explained why CQRS matters and why you should adopt it if you really care about your product and don’t want data growth to become a bottleneck rather than a success for your business.

Now I’m gonna dig a bit more. I want to show you how CQRS works under the hood.

Command/Query responsibilities

CQRS stands for Command Query Responsibility Segregation, and its name reveals how it works at its core.
As stated in the previous post, a software needs two data models in order to face data growth: one model to hold the application state, and one to handle the numbers.
Well, let’s start from naming things. The requests sent to your application can be split in two main categories:

  • requests that do change the application state (e.g.: creating a new user, submitting an order, performing a bank transaction, etc.). These requests are called commands.
  • requests that only read your data and do not change the application state (e.g.: counting the number of registered users, getting the details of one user, getting the account balance). These requests are called queries.

CQRS aims to split your application in these two main areas: the commands and the queries. These have totally different structures, architectures and purposes.

Command vs Query

Usually when a command is sent to your application (e.g.: via an HTTP request), the business logic gets involved in order to determine whether or not the request can be satisfied. The typical steps are:

  • Parse the request (return an error in case of bad syntax/arguments)
  • Load the resource state from the storage
  • Ensure that the requested action is allowed based on the resource state
  • Eventually apply and persist the change

As an example, imagine a banking application with a business rule stating:

A debit transaction is allowed if the requested amount is not higher than the balance

(i.e.: the account balance cannot go negative).
Some API is then designed to handle the command. The API code will look like the following:

  • load the bank account from the storage (it can involve multiple tables)
  • verify that the account balance covers the requested amount
  • update the balance
  • commit the update to the storage (hopefully with a proper concurrency management)
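A minimal sketch of that command handler could look like this (the storage interface and the version-based concurrency check are illustrative assumptions):

```typescript
interface Account {
  id: string;
  balance: number;
  version: number; // for optimistic concurrency management
}

interface AccountStorage {
  load(id: string): Account;
  save(account: Account, expectedVersion: number): void; // throws on conflict
}

class DebitNotAllowedError extends Error {}

// Command handler: load state, check the business rule, persist the change.
function debit(storage: AccountStorage, accountId: string, amount: number): void {
  const account = storage.load(accountId);

  // Business rule: the account balance cannot go negative.
  if (amount > account.balance) {
    throw new DebitNotAllowedError('insufficient funds');
  }

  account.balance -= amount;
  storage.save(account, account.version); // fails if someone updated meanwhile
}
```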

This is how things work, regardless of the database type.
And that just works fine with the classical one model to rule them all approach: the developer designs one database schema along with the code that handles that model.

What’s new in the CQRS architectural pattern, however, is the query model: when it’s time to query your application for the numbers, the designated schema should be an ad-hoc set of tables. That is: the model that holds the application state is not touched by queries; it is read and updated only when a command is processed.
But how does that work? How is it possible for a microservice to handle these two different models?

Under the hood

As illustrated in the above diagram, the application is logically split in two models.
The command side handles all the incoming commands: it is invoked when a POST, PUT or DELETE request is sent. The command model and the business logic are involved.
The query side handles all the incoming queries: it is invoked when a GET request is sent. The query storage is used in read-only mode.

The event bus is the bridge between the two. Whenever a command is processed without errors and the resource is updated in the storage, a domain event is emitted to notify whoever is interested. An event bus can be implemented in a lot of different ways, but that’s not the core point. What matters is that, by dispatching domain events, the microservice itself can capture those same events and use them to update the query model.

This is the core point: by introducing an event bus, the business logic is not cluttered with additional code that writes the same data in different places and formats. The command side just processes the commands, ensures that the business rules are not violated, applies the changes and then returns the result. Nothing more, nothing less. Pure business logic.
In a totally asynchronous way, the domain events dispatched by the command side get captured and processed by the query side to update its model.
The two sides process the same requests at different times and speeds: should the query side need some time to update its model, the command execution time would not be affected at all.
This, however, introduces a lag between the models.
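A minimal in-process event bus illustrating the idea could look like this (a sketch only; real-world buses also deal with persistence, retries and truly asynchronous delivery):

```typescript
type DomainEvent = { name: string; payload: unknown };
type Handler = (event: DomainEvent) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  // The query side's projectors register themselves here.
  subscribe(eventName: string, handler: Handler): void {
    const list = this.handlers.get(eventName) ?? [];
    list.push(handler);
    this.handlers.set(eventName, list);
  }

  // The command side calls this after a successful state change;
  // every subscribed projector receives the event and updates its model.
  publish(event: DomainEvent): void {
    for (const handler of this.handlers.get(event.name) ?? []) {
      handler(event);
    }
  }
}
```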

But who is in charge of handling the events in the query model?

Attach the cloners

The query model is also known as a projection: data coming from the command side is projected – that is, represented – in very different ways, and each projection has a specific purpose, depending on the usage for which it has been designed.

Hence the key element in the query model is the projector: the microservice component that subscribes to specific business events and transforms their payload into some other data format. One microservice can have several projectors handling the same events, writing the same data to totally different tables and formats.

As an example, think of a domain event for a debit transaction in a banking application.
When a debit request is sent to the microservice and the debit is successfully applied, an event is dispatched. Such an event would most probably carry a payload like the following:

{
  "name": "AccountDebited",
  "date": "2017-12-18T17:23:48Z",
  "transactionId": "tx-7w89u12376162",
  "accountId": "IT32L0300203280675273924243",
  "amount": {
    "currency": "EUR",
    "amount": 42
  }
}

That event can be captured by the same microservice that emitted it and routed to different projectors, which in turn update different projections. For example by:

  • appending one row to the “Transactions” table, that just contains the transactions history
  • updating one row in the “Balances” table, that contains one row for each account, with its current balance and the last update time
  • updating one row in the “Monthly Expenses” table, which contains the sum of debit transactions for a bank account relative to one month; the table’s unique key is the [“account_id”, “month”] columns pair (the month can be extracted from the “date” field of the event payload, e.g.: “2017-12”)
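As an illustration, the third projection could be fed by a projector like the following (the in-memory map stands in for the real “Monthly Expenses” table):

```typescript
interface AccountDebited {
  name: 'AccountDebited';
  date: string;      // ISO timestamp, e.g. "2017-12-18T17:23:48Z"
  accountId: string;
  amount: { currency: string; amount: number };
}

// Projection keyed by ["account_id", "month"], as described above.
class MonthlyExpensesProjector {
  private readonly table = new Map<string, number>();

  handle(event: AccountDebited): void {
    const month = event.date.slice(0, 7);  // "2017-12"
    const key = `${event.accountId}|${month}`;
    const current = this.table.get(key) ?? 0;
    this.table.set(key, current + event.amount.amount);
  }

  // The query side reads the materialized value directly: no scan,
  // no GROUP BY over the whole transactions history.
  monthlyExpenses(accountId: string, month: string): number {
    return this.table.get(`${accountId}|${month}`) ?? 0;
  }
}
```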

By doing this, the application does not need to transform “the one” data model on the fly each time a query is performed by an API. Rather, it can rely on different data models to pick the requested data from, depending on what the query is asking for.
The query model already has materialized data.

What’s next?

In the next episode, CQRS Episode III – Rewind of the sync, I’ll show how to rebuild projections in case of bugs or migrations, and how the same applies when you need to build a brand new projection.

Stay tuned!

Antonio Seprano

Apr 2020, still covid-free.

CQRS Episode I – The phantom data

Have you ever heard about CQRS?
Maybe yes, maybe no.

Well, if you got here asking yourself how to implement CQRS, then you already know what I’m talking about.
But if you are looking for what CQRS is, then you are new to it.
Either way, in this and the following posts I will progressively explain why CQRS matters, what problems it solves, why it should be the main approach for a medium/large company that lives by its product, and how to implement it.

One data model to rule them all? No, thanks.

One of the most common approaches (if not the only one) that software developers unwittingly adopt in their job is to design their software around one specific data model. They think about the software in terms of that model, they think in terms of how they will store data in the database, then they write the code to handle that specific model.
But, and here’s the important part, one single data model is not suitable for data growth.
And no, it doesn’t matter whether you are planning for a microservice architecture or a monolith. If the data is planned to grow, one model is not enough.
Let me show you some examples.

Example 1: The organizational chart

VentureSaas Inc. wants to give its customers the ability to manage the organizational chart inside the product, so they start planning to introduce the Organizational Chart feature.
Every part of the product will be impacted by this change, ranging from the search bar to the reports, to the dashboard widgets. The new Organizational Chart feature will be a core functionality.

Typical organizational chart

The project team started by (guess what?) modelling the database schema for the organizational chart operations. It would be easy for them – as developers – to create, rename or even drop entire organizational branches. Also, it seemed very easy to create the relationships between employees and the branches they belong to.
This data model would suffice for each and every operation.
The same model would be queried at need.

A classical parent-child relationship schema to store a tree. A nested set would be way better, but it is just representative.

What the project team didn’t think about was scalability.

As the first big customer started to use the new feature, everything slowed down.
It became harder and harder for them to access the users management page, then the reports, and last but not least the dashboard.
The tickets forwarded to the dev team were clear: the new feature was performing badly because of the queries to the users catalogue, to the organizational branches, and to a mix of the two. Counting the number of employees in one department and all of its sub-sections was awfully slow.
The feature had been designed to query the aforementioned data model via a series of tangled, nested, difficult-to-read, hard-to-modify, slow-to-run queries. Guess why?

One data model to rule them all.

Example 2: The monthly expenses

Sharks&Loans Bank is planning to add a new widget to the homepage of its home banking website: a pie chart of the account holder’s monthly expenses. Each slice of the pie represents a month’s expenses as a percentage of the running year:

Their project team already has the “one table to rule them all” to start from: the Transactions table. Each and every bank transaction is stored in that table, along with the relative bank account id, the transaction type (deposit or withdrawal), the amount and the date:

Easy peasy. So they say.
But.
But that table is huge. And they had to write a complex query that filters the required transactions from a huge list, groups them by month, sums the amounts and calculates the percentage of each month relative to the entire year. All of that just to get the values of the pie chart from that infinite list.
A medium-complex SQL query, nothing impossible. Every developer is able to write something like that.
Now, just for fun, let’s assume that the above operation is very slow because there are too many transactions, and that this is the only data source you can use to calculate the values you need for the pie-chart widget.
It becomes a pain in the ass.
What would you do?
I can predict your answers:

  • Check for bad query practices
  • Check for table indexes
  • Check for misused indexes
  • Try to optimize the query
  • Look for some solution on stackoverflow
  • Scale the database server

Everything, but a data model analysis: “That table schema is fine and the problem is somewhere else”. Right?

So what?

Why is it so hard to query a data model that, paradoxically, was designed to be easy in the first place?
“Because data has grown too much!”, a developer could reply. And indeed that developer would be right. You can’t predict how much your data will grow, so your starting model works fine, at least for a while. But the more your customers use your product, the more data they generate. And the more customers you get, the more the data grows.
What that developer is not considering at all, however, is the fact that there’s really no need to scan all the historical data just to build a new representation of it.
What developers take for granted is that the required information can be rebuilt from the actual, generic data model. And it’s not a whim; there’s one specific reason: developers don’t want to store the same information in different formats. “Data MUST NOT be redundant” is a kind of mantra for developers, because they know that data redundancy is risky and expensive from a coding point of view. Keeping redundant data synchronized is slow, difficult and error prone. Why should a developer mess up the codebase just to write the same information in many different formats, when all the possible data formats can be deduced from just one?

The “What If” game

What if the two aforementioned companies already had the data they needed, rather than having to rebuild it from scratch at need? What if it was possible for them to write redundant data, represented in different models, and get faster answers?

VentureSaas in Example 1 wouldn’t have had to COUNT the number of employees in each branch. They could already have known how many employees there are in one branch, and the total number of employees in that branch and all of its sub-branches. Their product wouldn’t have become deadly slow.

A new representation of the orgchart table: members_count is the number of employees in that branch, total_members_count is the number of employees in that node and all of its sub-branches.

What if Sharks&Loans Bank in Example 2 didn’t have to recalculate the monthly expenses of each account holder? What if they already had some snapshot of the monthly expenses into one, ad-hoc, data model? Their widget wouldn’t have slowed the dashboard.

But how could it have been possible for those companies to have the same data represented in different data models?

Two sides of the same coin

Both the above examples make clear the big mistake that a software architect can make: pretending that the data model that stores the application state can also provide the numbers.

On the one hand a software needs to store its state, because the state tells the application where it is and where it can go. The state allows the software to determine whether a user can perform some action or not (e.g.: a debit transaction is allowed only if the account balance has enough funds).

On the other hand, a software needs to store the numbers, because all the metrics, all the statistics, all the information needed by humans (and/or by the UI) are expressed in terms of numbers.

And here’s the plain truth: a good software needs both models.

State.
And numbers.

Two sides of the same coin.

CQRS is the word

And here’s finally what CQRS is all about: designing your software so that it can handle both the state and the numbers, without actually mixing the two.
A software designed with the CQRS pattern at its core is a software that does not fear the growth. Your customers will thank you.

What’s next?

Next episode, CQRS Episode II – Attach the cloners, will be a tech overview of the CQRS architecture.

Stay tuned!

Antonio Seprano

Apr 2020, covid-free.

The OOP golden rule #0

Interface

Have you ever wondered why an electrical socket is the way it is? And what does it represent in terms of design?
The electrical socket makes our interaction with electric power easy.
Maybe you don’t think about it, but generating, leveling and delivering electricity to houses is not an easy job; it is a very complex production chain. Still, of that whole chain, the socket – the last ring of the chain – is our only point of interaction.
It is easy to use because it has been meant to be like this. It has been conceived so that it is easy to use in the right way and hard to use in the wrong way. It hides the complexity out of sight, behind a wall, along with annoying stuff like cables, fuses and weldings.
No matter what you need to plug into the socket, whether it is a simple device like a light bulb or a complex one like a computer, and no matter where the electricity comes from, whether it is generated by a hydroelectric plant or an old wizard, all you have to do is plug your device into the socket.

The above introduction on sockets, designed to be easy to use hiding complexity behind a wall, is suitable for introducing a very similar concept at OOP level: interfaces.

Interfaces

There are some reasons why you should seriously start using interfaces in your projects, if you haven’t yet. And even if you already are, maybe you don’t fully know the opportunities they provide, so you’d better keep on reading.
We are going to talk about what I usually like to refer to as “Rule #0” for writing clear, readable and maintainable code:

Program to an interface, not an implementation

Before we go any further, let me make it clear: the term “interface” does not refer to interfaces in the strict sense. I mean… I’m not specifically referring to the “interface” keyword at programming language level. I am talking about abstraction.

Program to an abstraction, not an implementation

The meaning of the above quote is: write your code so that it depends on abstract concepts rather than on concrete classes.

Decoupling the implementation

Introducing interfaces in your project, and using them to type your arguments, allows you to have multiple implementations of the same concept. It is not unusual to switch implementation at runtime, on the basis of some kind of strategy. Most of the time, however, one single implementation will be enough for the entire application. Still, interfaces allow you to switch to another implementation with little or no effort – provided you drew the right abstraction from the beginning and filled your code with references to it.

Imagine you are working on a project aimed at allowing users to pay with their credit card: the credit card is the abstraction that your code should rely on. The concrete payment mechanism is – of course – delegated to whatever payment gateway you want to use (e.g.: PayPal, PAYMILL, Stripe, and so on). Usually, those systems come with some proprietary frameworks making it easy for developers to interact with them. Still, it is worth creating an abstraction and passing it around in your code rather than creating a lot of references to the specific implementations of those frameworks:
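As a sketch, the abstraction could be as small as this (the method names and the Money/ChargeID types are illustrative; what matters is that it is named after the business concept, not after any vendor):

```typescript
type Money = { currency: string; amount: number };
type ChargeID = string;

// The concept the rest of the code depends on. Concrete adapters
// (e.g. one backed by Stripe, one by PayPal) live at the edges of
// the application and implement this interface.
interface CreditCard {
  charge(amount: Money): ChargeID;
  refund(chargeId: ChargeID, amount: Money): void;
}
```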

Note: the abstraction is not called ICreditCard for a very specific reason: since the word “ICreditCard” does not exist in the above business domain, there must be no room for it in your code.

Personally, I don’t like the “I<something>” convention when naming abstractions. Whether we’re introducing an interface, an abstract class or a concrete class is completely irrelevant. The declaration represents a concept, so its name must be clear and explicit, and should hide the implementation level (interface, abstract or concrete class) in use.

The credit card, as an abstraction used for payments and refunds, has nothing to do with any implementation of the payment systems on the market. Your code should only refer to the CreditCard abstraction in order to process payments and perform refunds:

class PaymentService {

  public charge(amount: Money, targetCard: CreditCard) {
    const chargeId = targetCard.charge(amount);
    // other stuff, like persisting or triggering events
  }

  public refund(chargeId: ChargeID, targetCard: CreditCard) {
    /* Code for refunding */
  }
}

By introducing the CreditCard abstraction, your code becomes more abstract as well. You can use the Adapter design pattern, along with a service container, to plug in any vendor-specific credit card implementation, avoiding references to those objects in your code. Should you switch to another payment system, the only thing you have to do is replace the adapters in use with new ones. The code that relies on the concept of CreditCard will remain the same, because the concept has not been affected by the change.

As I said before, making references to abstractions in your code allows you to use any implementation of them. For example, you could write a multi-credit-card object:

The CreditCardArray is a composition that acts as a single object. It implements the CreditCard interface and can therefore be used accordingly. But it is also composed of a collection of one or more CreditCards (you can enforce the “one or more” constraint in the constructor), and its goal is to encapsulate the logic of looping through the collection of credit cards until one that can be charged for the requested amount is found.
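A possible sketch of that composite (assuming a CreditCard abstraction whose charge() throws when the card cannot be charged; the abstraction is redeclared here so the snippet stands alone):

```typescript
type Money = { currency: string; amount: number };
type ChargeID = string;

interface CreditCard {
  charge(amount: Money): ChargeID; // throws if the card cannot be charged
}

// A composition of cards that behaves as a single CreditCard:
// it tries each card in order until one accepts the charge.
class CreditCardArray implements CreditCard {
  private readonly cards: CreditCard[];

  constructor(first: CreditCard, ...rest: CreditCard[]) {
    // The signature itself enforces the "one or more" constraint.
    this.cards = [first, ...rest];
  }

  charge(amount: Money): ChargeID {
    let lastError: unknown;
    for (const card of this.cards) {
      try {
        return card.charge(amount);
      } catch (error) {
        lastError = error; // this card declined: try the next one
      }
    }
    throw lastError; // every card declined
  }
}
```

Any code written against CreditCard accepts a CreditCardArray without knowing it is a composite.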

Also think of a fake credit card implementation that intentionally fails at charging time: such an implementation would allow you to test your system in edge cases. You could end up creating as many fake cards as you want, each one returning a specific error code, so you can test all possible situations.
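Such a test double could be as simple as this (the CreditCard abstraction is redeclared so the snippet stands alone; the error reason is just an example):

```typescript
type Money = { currency: string; amount: number };

interface CreditCard {
  charge(amount: Money): string;
}

// A test double that always refuses the charge with a chosen error,
// letting you exercise the payment flow's failure paths.
class AlwaysDecliningCard implements CreditCard {
  constructor(private readonly reason: string = 'card_declined') {}

  charge(_amount: Money): string {
    throw new Error(this.reason);
  }
}
```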

Removing what is not needed

You may be wondering if you should use an abstraction even when your application doesn’t expect to have multiple implementations. Maybe you only have one single class that does a simple task in a simple way.
Well, you actually don’t have to.
It’s not a must; it’s more of a rule of thumb. By introducing interfaces you are just one step ahead on the abstraction road, and skipping them is nothing that a simple refactor can’t fix later.

But there’s another reason why you should stay on that road.
Think of a class that implements the logic of a Queue, like the following:

An implementation of a queue
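Such a queue might be sketched like this (method names assumed, with a type parameter for the item type):

```typescript
// A queue in the classic form: push items in, pop them out,
// count the items, and clear them all at once.
class Queue<T> {
  private items: T[] = [];

  push(item: T): void {
    this.items.push(item);
  }

  pop(): T | undefined {
    return this.items.shift(); // FIFO: oldest item out first
  }

  count(): number {
    return this.items.length;
  }

  clear(): void {
    this.items = [];
  }
}
```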

It’s a queue in the classic sense: a client can push items to it, pop them out, query the number of items inside the queue and clear its contents by removing all the items at once.
It’s ok for a queue.
But clients use what they get.
If a client gets an instance of Queue like the one above, then the developer of that client will feel entitled to carry out their job by using everything at their disposal. Do you feel comfortable with that? Do you think that the developer’s sense of responsibility is enough? Do you believe that code review prevents anybody from doing the wrong thing at some point? If so, then you can go ahead and use your class as-is.
But relying on the sense of responsibility has a cost: a higher probability of errors.
Sooner or later, somebody will use a method of your object that wasn’t supposed to be used in that context. A mistake, of course. Can developers be considered unreliable? Would you really blame them? Maybe developers make mistakes, or maybe the code reviewers, looking at what the developers were given, thought they were legitimately doing what they did. Can you really blame them? They are responsible for the mistake to some extent, but the truth is that the guilty developers are not that guilty: they have just used what they were given. You gave them an object and, implicitly, expected them to use only some of its functionality. If you think about it for a while, this is what happens every time a developer – even you – writes code involving an object provided by somebody else: in order to avoid mistakes, developers must know, based on the context, what they can do with that object, which of its methods can be used and which ones are forbidden.
As time goes by, the code will need more and more maintenance, and it will become more likely that some developer uses that object in a way it wasn’t supposed to be used in that context.
So, one question arises: why should you share an object that can only be partially used? Why not provide it through the right interface instead? The kind of interface that allows a client to do only the few things it is allowed to do, and nothing more?

If, in a specific context, a client is only allowed to pop items out of a queue, then write it against an interface of the queue that enforces this constraint:

An interface that is less likely to be used in the wrong way
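For example, a narrowed interface might look like this (names assumed), together with a client that can only read:

```typescript
// The narrow interface: a client holding a ReadableQueue can pop
// items and inspect the count, but cannot push or clear.
interface ReadableQueue<T> {
  pop(): T | undefined;
  count(): number;
}

// A client written against ReadableQueue can drain items,
// but the type system forbids it from pushing or clearing.
function drain<T>(queue: ReadableQueue<T>): T[] {
  const items: T[] = [];
  while (queue.count() > 0) {
    items.push(queue.pop()!);
  }
  return items;
}
```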

Of course, you don’t have to write many different implementations of the queue, each one with a reduced set of functionalities. You can still create one full-featured queue as an object that implements the two – or even more – segregated interfaces and choose which of those interfaces you want to use in your code, depending on what the client should be allowed to do with it:

The interface segregation in action
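One possible sketch of that arrangement (interface and method names assumed):

```typescript
interface ReadableQueue<T> {
  pop(): T | undefined;
  count(): number;
}

interface WriteableQueue<T> {
  push(item: T): void;
  clear(): void;
}

// One full-featured queue implementing both segregated interfaces;
// clients receive it through whichever interface fits their role.
class Queue<T> implements ReadableQueue<T>, WriteableQueue<T> {
  private items: T[] = [];

  push(item: T): void {
    this.items.push(item);
  }

  pop(): T | undefined {
    return this.items.shift();
  }

  count(): number {
    return this.items.length;
  }

  clear(): void {
    this.items = [];
  }
}
```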

Since the Queue class implements both the ReadableQueue and the WriteableQueue interfaces, it can be passed to any function that accepts those interfaces, keeping them unaware of what specific object it really is.

const queue = new Queue();
// ...

consumeQueue(queue);

function consumeQueue(queue: ReadableQueue) {
  /*
    Here the developer doesn't know what 'queue'
    really is, and can only use it through its interface
  */
}

In the example above, the consumeQueue() function receives a ReadableQueue object. It is unaware of which specific implementation it will get, and should stay agnostic about it. The only thing the function can do is use the queue object as a ReadableQueue.

Better testing

Last but not least, introducing abstractions in your code automatically makes it easier for developers to write tests: they can inject mock, fake, or stub implementations instead of the real classes. It is true that a lot of testing frameworks can create mocks starting from concrete classes, but sometimes writing fake implementations may be easier than programming mocks, depending on how handy the framework is.
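For instance, a hand-written fake can record the calls it receives so that a test can assert on them afterwards (RecordingCard, chargeWithFee and the fixed fee of 1 are invented here for illustration):

```typescript
type Money = number;
type ChargeID = string;

interface CreditCard {
  charge(amount: Money): ChargeID;
}

// A hand-written fake: records every charge so the test
// can inspect what the code under test actually did.
class RecordingCard implements CreditCard {
  readonly charges: Money[] = [];

  charge(amount: Money): ChargeID {
    this.charges.push(amount);
    return `charge-${this.charges.length}`;
  }
}

// Code under test: a trimmed-down service that adds
// a hypothetical fixed fee of 1 before charging.
function chargeWithFee(card: CreditCard, amount: Money): ChargeID {
  return card.charge(amount + 1);
}
```

No mocking framework is involved: the fake is just another implementation of the abstraction.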

Antonio Seprano

Jan 2019