Monolith, Microservices or the Middle Ground?

I have worked as a software engineer at various tech companies in Silicon Valley over the last twenty years and witnessed the transition of software products from the monolith to the microservice architecture. Now that I have started as a founding software engineer at an early-stage startup with a small team of just a few engineers, I have to look back at those past experiences from a new perspective. A startup places a high value on engineering productivity and velocity so that it can quickly build out features and deliver value to end users. This makes me wonder whether there is a middle ground between these two architecture styles that can reap the benefits of both worlds.

Let me start the story with my past experience working with a monolithic application at a previous employer, followed by the saga of marching from the legacy monolith towards the microservice world. Finally, I will present the approach we took at our startup, which arguably reaps the benefits of both sides.

What Is a Monolith?

A monolithic application usually has a single code base with multiple modules, each handling one business or technical feature. It uses a single build system to produce a single executable or deployable binary.

Many years ago, typical enterprise software applications took the monolithic approach. The web UI presentation logic and all the backend functional modules lived in the same source code repository. There was no clear-cut demarcation of module boundaries. Functional modules established dependencies on other modules through direct function references if the build system did not prevent them from doing so. Over time, the whole codebase turned into spaghetti with a very convoluted dependency tree.

Surprisingly, one of the consumer-facing internet companies I worked for was also on a monolith when I joined. The main difference was that they had made the monolith horizontally scalable so it could handle internet-scale traffic. But the drawbacks that come with the monolithic architecture were nonetheless the same as those of its enterprise software counterpart:

  • High overhead of planning a release: coordinating all the engineering teams for feature inclusion, and branching for major/maintenance/hotfix releases
  • Longer release cycles: one simple last-minute change in one area may require full regression testing
  • Lower engineering productivity: more frequent build breakages, longer local build times, longer developer turnaround time

Here was my personal experience working with the monolithic application:

  • It took 10+ minutes just to start the application locally, as it had to initialize hundreds of modules and pre-populate tens of local caches during startup. Developers would usually go get a coffee, go to the restroom, or play a game of foosball after issuing the command to start the application.
  • Build breakages affected everyone. They were usually caused by concurrent merges of multiple feature branches or by a large refactoring.
  • Debugging and testing code changes took long cycles whenever an application restart was needed. The alternative was to set up a tool such as JRebel to hot-reload code changes and avoid the restart.
  • Local development and testing required an expensive desktop with a high-end CPU and lots of memory; otherwise the whole desktop would occasionally freeze.

After all these horrible developer experiences, the company's engineering team made up its mind to fully transition to the microservices architecture.

Transitioning into Microservices

It was not an easy task to break up a monolithic application with almost 10 years of development history. The transition was a multi-year journey. Here are some of the steps taken to make the transition happen:

  • Split the monolithic code base into multiple repositories divided by functional area.
  • Switched from the legacy source control system to GitHub.
  • Set up a binary artifact repository.
  • Gave each repository its own CI/CD pipeline and versioning.
  • Had each microservice build generate two binaries: one for the service itself, and one client library containing the data objects, API definition, and client stub for external consumption.
  • Converted source-code-level dependencies to binary dependencies.
  • Organized utility functions into a shared utility library in a separate repo.
  • Created dev/staging environments to make sure the front-end web application and backend microservices worked well together end to end.
  • Let service developers start the web application locally and, by default, point it at the microservices already up and running in the dev/staging environment, with the option to override one specific service to point at a local service under development.
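
The last step in the list, defaulting to the shared dev/staging services and overriding a single one locally, can be sketched roughly as follows. This is only an illustration in Java: the class name ServiceEndpoints, the service names, and the URLs are all hypothetical and are not taken from the system described here.

```java
import java.util.Map;

/**
 * Hypothetical sketch: resolve each backend service's base URL, defaulting to
 * a shared dev/staging deployment unless a local override is supplied.
 * All names and URLs are illustrative assumptions.
 */
final class ServiceEndpoints {
    private final String sharedEnvBaseUrl;        // e.g. the dev/staging gateway
    private final Map<String, String> overrides;  // service name -> local URL

    ServiceEndpoints(String sharedEnvBaseUrl, Map<String, String> overrides) {
        this.sharedEnvBaseUrl = sharedEnvBaseUrl;
        this.overrides = Map.copyOf(overrides);
    }

    /** Returns the local override if one exists, else the shared environment URL. */
    String resolve(String serviceName) {
        String override = overrides.get(serviceName);
        return override != null ? override : sharedEnvBaseUrl + "/" + serviceName;
    }
}
```

With a constructor call such as new ServiceEndpoints("https://dev.example.internal", Map.of("billing", "http://localhost:8081")), only the made-up "billing" service would resolve to the developer's machine; every other service would still resolve to the shared environment.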

After these changes, the local build and application startup took less than one minute to complete. Different teams could work on and release their services independently. People became more productive, free of the stress of running into merge conflicts and build breakages before release code-freeze deadlines, and were therefore happier.

All these steps completely changed the development workflow and developer experience. If I remember correctly, it took about two years for the hundreds of engineers in the engineering organization to be fully on board with this new way of developing software. At the same time, it also changed product rollout, production failure patterns, latency characteristics, and monitoring and alert response, all of which affected the operations teams.

It was a very expensive transition, considering the length of time it took and the amount of engineering resources invested to make it happen. The end result was a faster engineering clock speed and better developer experience and satisfaction. The return on the investment should be realized in the years that follow, in the form of improved engineering productivity.

A Middle Ground?

As an early-stage startup, we place a high value on engineering speed. We understand that we need to get the system up and running quickly; we cannot afford to build a perfectly engineered system that takes too much of our precious time. At the same time, the software stack we build should still be easy to modularize into well-defined microservices, to avoid the pitfalls that come with a monolith.

Here is the approach we have taken for building our data ingestion pipeline services. I would call it the middle ground between the monolithic approach and the full-fledged microservice paradigm described in the sections above:

  • Use one single source code repository for all the data processing services.
  • One CI pipeline builds a single binary, packaged into a Docker image that can be deployed to Kubernetes.
  • The service behaves according to the role it is configured with.
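
As a rough illustration of the last point, the role could be read from configuration once at startup and turned into a typed value. The sketch below is an assumption about how this might look, not our actual code: only the ALL and STREAMING role names appear in this article, while EXAMPLE_OTHER, the ServiceRole name, and the idea of passing the role as a plain string (for example from an environment variable) are hypothetical.

```java
import java.util.Locale;

/**
 * Hypothetical role enum. ALL and STREAMING are named in the article;
 * EXAMPLE_OTHER is a made-up placeholder for the remaining roles.
 */
enum ServiceRole {
    ALL, STREAMING, EXAMPLE_OTHER;

    /** Parses a raw config value (e.g. from an environment variable) into a role. */
    static ServiceRole fromConfig(String raw) {
        if (raw == null || raw.isBlank()) {
            // Assumption: fail fast rather than silently starting everything.
            throw new IllegalArgumentException("service role is not configured");
        }
        // valueOf throws IllegalArgumentException for unknown role names.
        return valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    /** True when a component owned by componentRole should run under this role. */
    boolean enables(ServiceRole componentRole) {
        return this == ALL || this == componentRole;
    }
}
```

Failing fast on a missing or unknown role is a design choice made for this sketch; it keeps a misconfigured deployment from accidentally starting every component.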

For example, we have seven different roles for the services in our data processing pipeline. They are all deployed to Google Kubernetes Engine (GKE).

Below is the high-level diagram of how the backend services work together in the GCP/GKE environment. The blue services are built from the same source code repository and deployed from the same binary or Docker image, but with different configurations. The reason for grouping those blue services into one source code repository is that they share quite a few DTO/DAO (data transfer object/data access object) and event definitions. If we followed the microservice doctrine strictly and put each of them in its own source code repository, we would have to refactor the shared code into yet another repository. That would significantly increase complexity and slow down code development and the release process.

Take the streaming service as an example: the Kafka streaming client is started only if the service is configured with the role ALL or STREAMING at startup.

If the service is configured with the STREAMING role, it should initialize only the components the streaming service needs to function properly, and nothing else. This avoids the long startup time of a truly monolithic application, which has to initialize all components.
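
A minimal sketch of this kind of role-gated startup is shown below. It is not our production code: the RoleGatedBootstrap class, the EXAMPLE_API role, and the component names are hypothetical, and the Kafka streaming client is represented by a plain Runnable rather than a real Kafka API call.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Sketch of role-gated startup; every name except ALL and STREAMING is hypothetical. */
final class RoleGatedBootstrap {
    enum Role { ALL, STREAMING, EXAMPLE_API }

    /** A startable component plus the roles under which it should run. */
    private static final class Component {
        final String name;
        final Set<Role> roles;
        final Runnable start;

        Component(String name, Set<Role> roles, Runnable start) {
            this.name = name;
            this.roles = roles;
            this.start = start;
        }
    }

    private final List<Component> components = new ArrayList<>();

    /** Registers a component; start is a stand-in for real initialization code. */
    void register(String name, Set<Role> roles, Runnable start) {
        components.add(new Component(name, EnumSet.copyOf(roles), start));
    }

    /** Starts only the components enabled for the configured role; returns their names. */
    List<String> start(Role configured) {
        List<String> started = new ArrayList<>();
        for (Component c : components) {
            if (configured == Role.ALL || c.roles.contains(configured)) {
                c.start.run();
                started.add(c.name);
            }
        }
        return started;
    }
}
```

In this sketch, a component registered with EnumSet.of(Role.STREAMING) runs when the process is started with STREAMING or ALL and is skipped under any other role, so a process only pays the initialization cost of the components its role needs.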

The ALL service role is used only in local development. In this configuration, all the service components are initialized and started in the same JVM for simplicity, so that we don't have to start many different service processes just to be able to test and debug locally.

Conclusion

The middle-ground approach described above shares some characteristics with the monolith (a single code base, a single CI pipeline, and a single binary). At the same time, that single binary can be deployed as different services when configured with different roles, which gives it some characteristics of typical microservices. The service startup time is very reasonable both in local development and in production. The approach achieved what we wanted at a relatively low engineering cost, and we are happy with the results. What are your thoughts on this middle-ground approach?

Founding Engineer @ Trace Data. Experienced software engineer, tech lead in building scalable distributed systems/data platforms.
