Resources to study distributed and highly fault-tolerant systems?

2022.01.16 17:00

Background: I work on the development team of an enterprise SQL database (one you might have heard of) and have 2 years of experience. My work is mostly focused on trivial incremental features, and I think I am missing the big picture.
How do I learn more about the architecture of highly distributed, highly fault-tolerant systems?
I have found one book so far: "Designing Data-Intensive Applications" by Martin Kleppmann.
But what else can I do? Is reading alone enough? How can I get practical exposure?
