As an early-stage startup with a small engineering team, we build and deploy code into GCP/GKE. CircleCI was first introduced on the recommendation of a contracting firm. Our founding engineering team then quickly picked up expertise in CircleCI and started expanding its usage across various components of our software stack. This article presents our journey of using CircleCI to drive CI/CD of microservices into GKE, the managed Kubernetes environment in GCP. I hope it will be useful for engineering organizations on a similar journey.

What is CircleCI?

At a high level, CircleCI is a continuous integration and delivery platform that helps…

Early last year we were looking for a time series database (TSDB) to replace the PostgreSQL database used in our initial product prototype, because the timeline queries serving some UI components were becoming unbearably slow once the underlying table reached 10 million rows. Clearly, Postgres was not the best option for this job.

What to Look for in a TSDB?

Time-based Partitioning

Time series data is usually queried by time range. Partitioning the ingested data by time into smaller segments greatly improves query performance, because only the segments that overlap the queried time range need to be scanned.
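The pruning idea can be illustrated with a minimal in-memory sketch. It assumes fixed one-hour segments, and the names `TimePartitionedStore` and `SEGMENT_SECONDS` are hypothetical and chosen only for illustration; it is not how any particular TSDB implements partitioning.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta, timezone

SEGMENT_SECONDS = 3600  # assumed segment width: one hour


class TimePartitionedStore:
    """Toy store that buckets points into fixed-width time segments."""

    def __init__(self, segment_seconds=SEGMENT_SECONDS):
        self.segment_seconds = segment_seconds
        self.segments = defaultdict(list)  # segment id -> [(timestamp, value)]

    def _segment_id(self, ts):
        # Segment k covers [k * width, (k + 1) * width) in epoch seconds.
        return math.floor(ts.timestamp() / self.segment_seconds)

    def insert(self, ts, value):
        self.segments[self._segment_id(ts)].append((ts, value))

    def query(self, start, end):
        """Return (points with start <= ts < end, number of segments scanned)."""
        first = self._segment_id(start)
        # Last segment whose start time is strictly before `end`.
        last = math.ceil(end.timestamp() / self.segment_seconds) - 1
        results, scanned = [], 0
        for seg_id in range(first, last + 1):
            points = self.segments.get(seg_id)
            if points is None:
                continue  # no data was ingested for this segment
            scanned += 1
            results.extend(p for p in points if start <= p[0] < end)
        return sorted(results, key=lambda p: p[0]), scanned


# One point per minute for a full day, then a two-hour range query.
store = TimePartitionedStore()
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
for minute in range(24 * 60):
    store.insert(base + timedelta(minutes=minute), minute)

points, scanned = store.query(base + timedelta(hours=2), base + timedelta(hours=4))
print(len(points), "points from", scanned, "of", len(store.segments), "segments")
```

The query touches only the segments that overlap the requested range, so its cost depends on the width of the range rather than on the total amount of data stored.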

Retention Policy and Downsampling

I have worked as a software engineer at various tech companies in Silicon Valley over the last twenty years, and have witnessed the transition of software products from monolithic to microservice architectures. Now that I have started as a founding software engineer at an early-stage startup with a team of just a few engineers, I have to look back at those past experiences from a new perspective. A startup places a high value on engineering productivity and velocity, so that it can quickly build out features and products that deliver value to end users. …

I am a software engineer working at an early-stage startup. Half a year ago, we needed to choose a stream processing framework to build out some of our features. We ended up picking Kafka Streams. In this article, I will go over the decision process and our overall experience of using Kafka Streams as a reflection on this journey, without covering the technical details of how exactly Kafka Streams works internally…

What is Kafka Streams?

Kafka Streams is a library for building streaming applications that transform data from input Kafka topics and write the output to various sinks such as Kafka topics, databases…

Why Redis Cluster?

Redis is an open-source in-memory data structure store. It can be used as a distributed key-value database, cache, and message broker. We use Redis primarily for caching and for lightweight messaging between distributed components over its pub/sub channels.

To run reliably in production, Redis needs to be configured for high availability (HA). There are typically two HA configurations:

  1. Redis Sentinel

The Sentinel configuration has one master node and multiple slave nodes replicating data from the master. The master node handles write traffic, and all nodes can serve read traffic. A new master node will be elected if…

Mark Lu

Founding Engineer @ Trace Data. Experienced software engineer, tech lead in building scalable distributed systems/data platforms.
