Unmesh Joshi - Patterns of Distributed Systems

Publisher: Addison-Wesley Professional, Boston, MA, 2023. Genre: Computer Science.


Patterns of Distributed Systems: summary and description


Learn How to Better Understand Distributed System Design and Solve Common Problems

Enterprises today rely on a range of distributed software handling data storage, messaging, system management, and compute capability. Distributed system designs need to be implemented in some programming language, and these implementations must solve a common set of recurring problems. A patterns approach is well suited to describing these implementation aspects.

Patterns by nature are generic enough to cover a broad range of products, from cloud services like Amazon S3, to message brokers like Apache Kafka, to infrastructure frameworks like Kubernetes, to databases like MongoDB, to actor frameworks like Akka. At the same time, the pattern structure is specific enough to show real code. The beauty of this approach is that even though the code structure is shown in one programming language (Java in this case), the structure applies to many other programming languages. Patterns also form a system of names, with each name having a specific meaning in terms of code structure.

The set of patterns presented in Patterns of Distributed Systems will be useful to all developers, even those who are not directly involved in building these kinds of systems and mostly use them as a black box. Learning these patterns will help readers develop a deeper understanding of the challenges presented by distributed systems and will also help them choose appropriate cloud services and products. Coverage includes Patterns of Data Replication, Patterns of Data Partitioning, Patterns of Distributed Time, Patterns of Cluster Management, and Patterns of Communication Between Nodes.

The patterns approach used here will help you:

  • Learn what a distributed system is and why distributed systems are needed
  • Understand the implementation of a wide range of systems such as databases, in-memory data grids, message brokers, and various cloud services
  • Confidently traverse open source codebases and discover how patterns and solutions map to real-world systems like Kafka and Kubernetes


Patterns of Distributed Systems

Unmesh Joshi


Table of Contents
Part I: Narratives
Chapter 1. Why Distribute?
The four fundamental resources

We live in a digital world. Most of what we do, whether ordering our favourite food or managing our finances, is available over the network as a service. All these services run on servers somewhere. These servers store data and perform computations on that data while handling user requests over the network. A server typically waits for a user request, reads the relevant data from disk into memory, and processes it using the CPU. CPU, memory, network, and disk are the four fundamental physical resources that any computation needs.

Consider a typical retail application exposed as a networked service. Users can add items to a shopping cart and buy them. They can also view their current order and query their past orders. How many user requests can a single server process? The answer depends on many factors specific to the type of application, but the upper bound will always be determined by the capacity of these four resources.

Let's start with the network. Network bandwidth puts a maximum limit on how much data can be transferred at any given time. Consider a network bandwidth of 1 Gbps, which is 125 MB per second. If the application is writing or reading 1 KB records, the network can support at most 125,000 requests per second. If the records are 5 KB in size, only 25,000 requests per second can pass over the network.
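This back-of-the-envelope arithmetic can be sketched in a few lines (the class and method names here are illustrative, not from the book):

```java
// Back-of-the-envelope: how many records per second fit through the network?
public class NetworkBound {

    // bandwidth in bits per second, record size in bytes
    static long maxRequestsPerSecond(long bandwidthBitsPerSec, long recordBytes) {
        long bytesPerSec = bandwidthBitsPerSec / 8;
        return bytesPerSec / recordBytes;
    }

    public static void main(String[] args) {
        long gbps = 1_000_000_000L;                              // 1 Gbps link
        System.out.println(maxRequestsPerSecond(gbps, 1_000));   // 1 KB records -> 125000
        System.out.println(maxRequestsPerSecond(gbps, 5_000));   // 5 KB records -> 25000
    }
}
```

Whatever the record size, the network ceiling follows directly from this division; the same calculation applies to the disk bandwidth numbers below.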

Disks also have a limit on the amount of data they can transfer. Here are some example numbers with typical disk bandwidths.

[Figure: table of typical disk bandwidth numbers]

This is a raw hardware limitation. In practice, some software component handles the writes and reads. That component must deal with issues like concurrent reads/writes and transactions, which further limits the number of read/write requests that can be processed on a single server.

Disk and network are input/output devices, but doing any work with the data read from them consumes CPU cycles. So if, at 100K requests, the CPU is at 100% of its capacity, any additional requests must wait for their share of CPU.

The fourth resource is memory. Servers load data into memory to process it: data arriving in requests over the network as well as data read from storage. Say there is 1 TB of storage; memory will typically be in the gigabytes, so only part of the data can be in memory at a time. The whole purpose of the different kinds of storage engines is to reduce the need to load all the data into memory for processing. Storage engines use data structures and maintain indexes to quickly locate specific data items on disk, and pick and load only those. But depending on the type of request, more data might need to be loaded. For example, if all users search for the books they ordered in the last year, the server must scan through more and more data, needing more memory. If memory is full, requests again have to wait for their share.
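The index idea can be sketched minimally: an in-memory map from key to on-disk offset lets the engine read just the matching record instead of the whole file. This is an illustration only, not the book's code; the class name and byte offsets are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a storage-engine index: keys map to byte offsets on disk,
// so only the matching record needs to be loaded into memory.
public class SparseIndex {
    private final Map<String, Long> keyToOffset = new HashMap<>();

    void put(String key, long offsetOnDisk) {
        keyToOffset.put(key, offsetOnDisk);
    }

    // Locate the record without scanning the whole file; null if absent.
    Long locate(String key) {
        return keyToOffset.get(key);
    }

    public static void main(String[] args) {
        SparseIndex index = new SparseIndex();
        index.put("order-42", 4096L);              // record lives at byte offset 4096
        System.out.println(index.locate("order-42")); // 4096
    }
}
```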

One common problem when these resources reach their physical limits is queuing: more requests have to wait to be processed, which in effect degrades the server's ability to serve user requests.

Queuing and its impact on system throughput

The disk, network, CPU, and memory put an upper bound on the number of requests that can be processed. If the number of requests the application needs to handle goes above this bound, requests start queuing for their share of network, disk, CPU, or memory. Queuing then increases the time it takes to process any request.
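How sharply queuing hurts can be illustrated with the textbook M/M/1 result, average response time = service time / (1 - utilization). This formula is an illustration added here, not something the book derives; it shows response time blowing up as utilization approaches 100%.

```java
// Illustration using the standard M/M/1 queuing result: as utilization of a
// resource approaches 1, average response time grows without bound.
public class QueueingEffect {

    // serviceTimeMs: time to process one request with no waiting
    // utilization: fraction of capacity in use, 0 <= utilization < 1
    static double avgResponseTimeMs(double serviceTimeMs, double utilization) {
        return serviceTimeMs / (1.0 - utilization);
    }

    public static void main(String[] args) {
        System.out.println(avgResponseTimeMs(10, 0.5));  // 20.0 ms
        System.out.println(avgResponseTimeMs(10, 0.9));  // ~100 ms
        System.out.println(avgResponseTimeMs(10, 0.99)); // ~1000 ms
    }
}
```

The same 10 ms request takes roughly 100x longer once the resource is 99% busy, which is exactly the throughput cliff described below.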

The effect of reaching the resource limits on the overall throughput of the system is as follows.

[Figure: system throughput degrading as resource limits are reached]

This is problematic for end users: just when they expect the system to serve more and more of them, the system's performance actually starts degrading.

The only way to make sure that requests can be served properly is to divide them up and process them on multiple servers. This allows using physically separate CPUs, network links, memory, and disks for processing user requests. In the example above, the workload needs to be partitioned in such a way that each server serves about 500 requests.
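One simple way to divide requests across servers is hash-based routing, sketched below. This is an illustration only (the class name and keys are made up), and real systems use more sophisticated schemes, but it shows how each request deterministically lands on one of N servers, spreading the load over separate physical resources.

```java
// Illustrative sketch: route each request to one of N servers by hashing its
// key, so the four fundamental resources of every server share the load.
public class RequestRouter {

    static int serverFor(String requestKey, int numServers) {
        // Math.floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(requestKey.hashCode(), numServers);
    }

    public static void main(String[] args) {
        int numServers = 4;
        // The same key always maps to the same server.
        System.out.println(serverFor("user-123", numServers));
        System.out.println(serverFor("user-456", numServers));
    }
}
```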

Partitioning - Divide and Conquer
Separate business logic and data layer

One common way to divide such an architecture is as follows. The architecture has two parts: a stateless part exposing functionality to the end user, which can be a web application or, more commonly, a web API serving user-facing applications; and a stateful part, managed by a database. When user load increases, the stateless services are scaled horizontally, allowing more and more user connections to be served. The business logic executed on the data runs separately, on different servers.

[Figure: horizontally scaled stateless services in front of a single database]

This architecture works fine provided two basic assumptions hold true.

  • The database can serve a request from the stateless services in less than a few milliseconds.
  • The database can handle all the connections from the multiple stateless service instances. Applications typically work around this constraint by adding caching layers, so that not all requests need to go to the database.
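A caching layer like this is often implemented with the cache-aside pattern, sketched below as an illustration (not the book's code; the class name and the database stand-in are made up): reads are served from memory when possible, and only misses reach the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside sketch: serve reads from the cache when possible so that only
// cache misses reach the database layer.
public class CacheAside {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> database; // stand-in for the real database call
    int databaseHits = 0;

    CacheAside(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached;   // hit: no database call
        databaseHits++;                      // miss: fall through to the database
        String value = database.apply(key);
        cache.put(key, value);
        return value;
    }

    public static void main(String[] args) {
        CacheAside cache = new CacheAside(key -> "value-for-" + key);
        cache.get("user-1"); // miss: goes to the database
        cache.get("user-1"); // hit: served from memory
        System.out.println(cache.databaseHits); // 1
    }
}
```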

This architecture works very well if most requests can be served from caches put at different layers in the architecture, so that only a few of them need to reach the database layer. As Roy Fielding nicely stated in his thesis on REST, "best application performance is obtained by not using the network". But caching does not always work. When most requests write data, caching is obviously of no use. With hyper-personalized applications that always need to show the latest information, the use of caching is limited. And as more and more users start using the services, the assumptions start breaking down, for two reasons.

  • The size of the data grows, from a few terabytes to several hundred terabytes to petabytes.
  • More and more people need to access and process that data.

So the simple architecture shown above starts breaking down, again because of the physical limits of the four fundamental resources.

The impact on the classic architecture then looks as follows:

[Figure: resource limits impacting the classic architecture]
Partitioning by Domain

One way to work around these limits is to partition the application along domain boundaries. The popular architectural style today is that of Microservices.

