Introduction to Distributed System Design - 1. Splitting in Microservice Architecture

sun.shuo@aliyun.com

Apr 27, 2019

CPOL

This article introduces distributed system design, starting with how to split a service in a microservice architecture.

In my article, Distributed Method of Web Services, I introduced a method of distributed design. Readers responded that it was too academic to understand, so I started this series of articles to help novices use distributed technology in practice. Distributed computing is a concept with a long history: the earliest distributed systems appeared with ARPANET, introduced in the late 1960s. Yet even now, distributed system design remains very unfriendly to novices. You may have learned a lot of distributed theory and still feel helpless in the face of a complex software system. I hope this series of articles can help you reorganize that knowledge and establish a sound methodology for designing distributed systems.

The prerequisites for this introduction are modest. You should be a software engineer with some development experience and understand the basics of concurrent programming. Concurrent programming is the basis of distributed design, and you will find that its concepts come up often in distributed system design. But don't confuse the two; they are different concepts. Concurrent programming here refers specifically to developing software systems with multithreading. Distributed system design is a more advanced kind of software design and development than concurrent programming.

In this article, we first describe a typical service and how to split it into microservices step by step. This typical case introduces the basic method of splitting a service. We will then gradually discuss why this methodology is used and the conditions and principles for using it.

When we develop a software system according to a product definition, user traffic grows until the computing and storage capacity of a single server can no longer meet the product requirements. We then try to split the software system across different hardware servers. Doing this to increase the overall computing and storage capacity of the system is called distributing the software system. The concept of microservices arises from this: splitting a service into smaller services allows them to be placed on more hardware, and infrequently used services can be merged onto one machine. By adding or removing hardware, the overall carrying capacity of the system can be adjusted.

Since services will be split, the first step is to understand how services are built. A typical web service is a server-side software system that can handle multiple HTTP requests. For example, in a Spring Boot project, the handlers for HTTP requests are conventionally placed in the controller directory; in a Node.js project using the Express framework, they are conventionally placed in the routes directory. Previously, I defined a service as a software system consisting of tasks triggered by messages.

We know that a service is composed of multiple tasks, which belong to different product functions. Although they all live in the same service, some are closely related to each other while others seem to have nothing to do with each other. Because of how products are designed, these tasks are usually not described independently and clearly. A service consists of anywhere from dozens to tens of thousands of tasks, and in large projects a developer can write one or more tasks a day. Even for the project's own developers, that is too many tasks to keep in mind. When there are too many tasks to understand, people introduce another concept, the "product function", to help classify them. For the purpose of splitting, however, we should first set aside the concept of product function, because the division into product functions lies outside the scope of the software system: computers recognize only tasks, not product functions. Product functions are only aids that help people remember and communicate during software development, and they cannot serve as the standard for microservice splitting.

The first step in microservice splitting is to eliminate the sequence and data coupling between tasks. All data is moved out into an in-memory database, which makes the service stateless. Keeping data inside a task or service is very dangerous: a service crash or hardware downtime can cause data loss or corruption. Private task data, and private exchange of data between tasks, also lead to sequence problems, that is, the execution order of tasks acquires dependencies. For example, if the purchase task passes data directly to the payment task, triggers it, and waits for it to complete, there is a sequence dependency between the two tasks. The simplest and most direct way to eliminate this coupling is to put all the data into an in-memory database. I call this way of eliminating coupling the "AP" method, i.e., "Availability partitioning or operation of services". Once all the data is in the in-memory database, the execution process of a task becomes:

1. Accept the message trigger
2. Read from the in-memory database
3. Implement the data processing logic
4. Write the processed data back to the in-memory database
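
The four steps above can be sketched in Python. This is only a minimal illustration, not code from the original project: `MemoryDB` is a hypothetical dictionary-backed stand-in for a real in-memory database, and `purchase_task`, the message fields, and the `stock:` key prefix are invented for the example.

```python
from typing import Dict


class MemoryDB:
    """Hypothetical stand-in for an in-memory database (assumption for this sketch)."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str, default: int = 0) -> int:
        return self._data.get(key, default)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


def purchase_task(db: MemoryDB, message: dict) -> None:
    """A stateless task: everything it knows comes from the message and the database."""
    # 1. Accept the message trigger (the message arrives as an argument).
    item = message["item"]
    quantity = message["quantity"]

    # 2. Read from the in-memory database.
    stock = db.get(f"stock:{item}")

    # 3. Implement the data processing logic on local variables only.
    if quantity > stock:
        # Failing here leaves the database untouched.
        raise ValueError("not enough stock")
    remaining = stock - quantity

    # 4. Write the processed data back to the in-memory database.
    db.set(f"stock:{item}", remaining)
```

Because the task keeps nothing between invocations, any server holding a connection to the same database could run it.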

With these preparations complete, you can start splitting the service. Because each task now reads and writes its data through the in-memory database, its writes can be found by inspecting its source code. First, put each task in a separate file. Then search each file for the keyword "SET" with a search tool such as grep or "Search and Replace"; the result is the write data set of each task.

In concurrent programming, we learned that if two threads write the same data at the same time, conflicts occur. The same problem arises in distributed systems. If two tasks whose write data sets intersect are distributed to different servers, their writes can overwrite each other. Two tasks whose write data sets intersect are said to have an atomic relationship. Atomicity here corresponds to mutual exclusion in concurrent programming: when we encounter an atomicity requirement there, we create a mutex lock to protect the correct writing or reading of data. In a distributed system, the service container itself has atomicity. A single-threaded server container, such as a single-threaded nginx, Tomcat, or other web server, is inherently atomic. In other words, such a container itself acts as a mutex or distributed lock.

Tasks also have a naturally transactional character when reading and writing the in-memory database. While a task is computing, its intermediate results are not stored in the in-memory database. Only when the computation succeeds is the data committed. If the task fails along the way, nothing is committed, so the failure does not pollute or damage the data.
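
Returning to the search for write data sets: the grep step can also be scripted. The sketch below assumes, purely for illustration, that every write in a task's source looks like `db.set("key", ...)` with a double-quoted literal key; the pattern and the function names are hypothetical and would need adapting to the database client actually in use.

```python
import re
from typing import Dict, Set

# Assumed convention for this sketch: writes look like  db.set("key", ...)
SET_CALL = re.compile(r'\.set\(\s*"([^"]+)"')


def write_set(source: str) -> Set[str]:
    """Return the keys that one task's source code writes."""
    return set(SET_CALL.findall(source))


def write_sets(task_sources: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each task name to its write data set."""
    return {name: write_set(src) for name, src in task_sources.items()}


def has_atomic_relationship(a: Set[str], b: Set[str]) -> bool:
    """Two tasks have an atomic relationship when their write sets intersect."""
    return bool(a & b)
```

Keys that are built dynamically (for example with string formatting) would not be caught by a literal search like this one, so the output is best treated as a starting point for review.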

Obviously, tasks with an atomic relationship cannot be split onto different servers, while tasks without one can. For example, suppose we have tasks A, B, and C, with the write data sets A {pants, skirt, coat}, B {jeans, coat}, and C {hat, glove}. Because both A and B write "coat", tasks A and B have an atomic relationship and can only be placed in the same server container. Task C has no atomic relationship with the other tasks, so it does not need to share their container. We can therefore split the service into two parts: one containing A and B, and one containing C.
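
The grouping in this example can be computed mechanically. The following sketch uses a small union-find structure, which is one possible implementation rather than a technique named in the article; `group_tasks` is a hypothetical helper. It also treats the relationship as transitive (if A shares data with B and B shares data with D, all three end up together), which seems to follow from the rule but is an assumption here.

```python
from typing import Dict, List, Set


def group_tasks(write_sets: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group tasks so that tasks with intersecting write sets share a group."""
    parent = {task: task for task in write_sets}

    def find(task: str) -> str:
        # Follow parent links to the group's representative, compressing the path.
        while parent[task] != task:
            parent[task] = parent[parent[task]]
            task = parent[task]
        return task

    owner: Dict[str, str] = {}  # data key -> first task seen writing it
    for task, keys in write_sets.items():
        for key in keys:
            if key in owner:
                # Another task already writes this key: merge the two groups.
                parent[find(task)] = find(owner[key])
            else:
                owner[key] = task

    groups: Dict[str, Set[str]] = {}
    for task in write_sets:
        groups.setdefault(find(task), set()).add(task)
    return list(groups.values())


example = {
    "A": {"pants", "skirt", "coat"},
    "B": {"jeans", "coat"},
    "C": {"hat", "glove"},
}
groups = group_tasks(example)
```

For the example data, `groups` holds two groups: tasks A and B together, and task C on its own.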

The magic of this method is that you don't actually need to split the project's code into two separate parts. As long as the write data sets do not intersect, the requests can be diverted to different servers at the router.
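
One way to picture this routing is a table that maps each task to a server, built from the task groups. The sketch below is an assumption-laden illustration: `build_routes`, the round-robin assignment, and the server names are all hypothetical, and a real deployment would express the same table in its router or load balancer configuration.

```python
from typing import Dict, List, Set


def build_routes(groups: List[Set[str]], servers: List[str]) -> Dict[str, str]:
    """Send every task of a group to the same server; spread groups round-robin."""
    routes: Dict[str, str] = {}
    for index, group in enumerate(groups):
        server = servers[index % len(servers)]
        for task in group:
            routes[task] = server
    return routes


def route(routes: Dict[str, str], task: str) -> str:
    """Return the server that must handle requests for this task."""
    return routes[task]


# Hypothetical server names; every server can run the same, unsplit code base.
routes = build_routes([{"A", "B"}, {"C"}], ["server-1", "server-2"])
```

Here requests for tasks A and B always reach `server-1`, so their writes to "coat" are serialized by one container, while task C is free to run on `server-2`.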

I call this the RP method: it classifies and splits tasks according to the atomic relationships of their write data sets. The services appear to be ripped apart yet remain linked to each other. Although "The Mythical Man-Month" tells us that there is no silver bullet, the AP&RP approach does let us achieve the splitting of microservices with a simple methodology. I think this rests on the following points. First, the AP&RP approach does not attempt to solve the problem of implementing product functions. Second, it first converts the sequencing of tasks into data, and then distributes the system by dividing that data. Third, it points out that the service container is a composite of many latent properties, including atomicity and transactionality, and the impact of these properties should be fully considered in distributed design. Concurrent programming experience suggests that creating mutexes or distributed locks from scratch can further complicate the problem.

This article has introduced basic distributed knowledge and a technique for splitting microservices. If you have any questions, we will discuss them further in the following articles.

If you reached this article through a search engine, you can click on the links below to get the latest examples and articles in this series.