Distributed Systems Concepts and Design: System Models

Book SUMMARY: Distributed Systems Concepts and Design 5th Edition (2)

SoniaComp
8 min read · Feb 12, 2021

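Before we begin…

Reading a thick textbook in its original English is not easy at first. I have a few that I bought and have only ever put on display 😅 But if you keep reading a little at a time, you gradually get faster! I hope this post is a useful reference when you read the original.
( 💡 These are my own notes, written up while reading the book! )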

[ System Model ]
1. Physical Models
Early -> Internet-scale -> Contemporary
2. Architectural Models
1) Elements
2) Architectural Patterns
3) Middleware Solutions
3. Fundamental Models
1) Interaction Models
2) Failure Models
3) Security Models

System Models

  • Physical models: the types of computers and devices & their interconnectivity
  • Architectural models: the computational and communication tasks performed by its computational elements
    → supported by appropriate network interconnections
    → Client-server & peer-to-peer
  • Fundamental models: abstract views of individual aspects of a system (interaction, failure, security)

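[ 💡 There is no global clock ]

  • Clock: every component of a computer, including the CPU, operates in step with an electrical signal (pulse, clock) generated at regular intervals [ one cycle runs from one clock pulse to the next; frequency is measured in Hz ]
  • CPU performance ≈ ( clock rate * IPC [instructions per cycle] ) * number of cores
  • Within a single computer, everything runs together in step with this clock
    → In a distributed system, processes coordinate by exchanging messages instead [ event-based ]
  • Risks of message passing
    1) Delay: setting time limits
    2) Variety of failures: processes and communication channels
    3) Vulnerability to security attacks: motivates the notion of a secure channel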

1. Physical Models

  • the hardware composition of a system
  1. the computers (and other devices, such as mobile phones)
  2. their interconnecting networks.

Early distributed systems

These systems typically consisted of between 10 and 100 nodes interconnected by a local area network, with limited Internet connectivity and supported a small range of services such as shared local printers and file servers as well as email and file transfer across the Internet.

Internet-scale distributed systems

They incorporate large numbers of nodes and provide distributed system services for global organizations and across organizational boundaries. The level of heterogeneity in such systems is significant in terms of networks, computer architecture, operating systems, languages employed and the development teams involved.

Contemporary distributed systems

  • embrace heterogeneity
  • ex) Grid computing: linking heterogeneous computers connected over a WAN into a single virtual high-capacity, high-performance computer that carries out heavy computation or large-scale processing. (Data, Computation Intensive)
Source: the book

2. Architectural Models

  • the computational and communication tasks performed by its computational elements (a model concerned with computation and data processing)
  • supported by appropriate network interconnections.

2.1 Elements

A. Communicating entities [ the building blocks ]

processes coupled with appropriate interprocess communication paradigms

  • distributed independent entities
    processes, objects, components or services
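  • Objects: enable an object-oriented approach.
    → objects are accessed through their interfaces (the methods defined on the object)
  • Components: proposed to overcome the shortcomings of object-based programming
    → making all dependencies explicit and providing a more complete contract for system construction
    → a component specifies not only its own interfaces but also its interactions with other components
  • Web services: the third important paradigm in the development of distributed systems [Alonso et al. 2004].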

B. Communication paradigms [ how entities communicate ]

[ Interprocess communication ]

  • the relatively low-level support for communication between processes
  • the API offered by Internet protocols (socket programming)

[ remote invocation ]

  • a two-way exchange between communicating entities
  • resulting in the calling of a remote operation, procedure or method
  1. Request-reply protocols
    a pattern imposed on an underlying message-passing service to support client-server computing.
  2. Remote procedure calls
    With RPC, a procedure in a process on a remote computer can be called as if it were a procedure in the local address space.
    → access and location transparency
  3. Remote method invocation
    a calling object can invoke a method in a remote object
  4. Publish-subscribe systems: a one-to-many style of communication
    an intermediary service that efficiently ensures information is routed to consumers who desire this information.
  5. Message queues: a point-to-point service
    → Producer processes can send messages to a specified queue
    → Consumer processes can receive messages from the queue or be notified of the arrival of new messages in the queue.
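  6. Tuple spaces
    Tuple spaces offer a further indirect communication service
    supporting a model whereby processes can place arbitrary items of structured data, called tuples, in a persistent tuple space, from which other processes can read or remove them by specifying patterns of interest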
  7. Distributed shared memory
    → provide an abstraction for sharing data between processes
    → to the programmer it feels like working with data structures in a single local address space (a high level of distribution transparency)
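
To make the difference between the one-to-many and point-to-point styles in the list above concrete, here is a minimal in-process sketch. The `Broker` class, the topic name, and the job strings are all hypothetical and only stand in for a real intermediary service; nothing crosses a network here.

```python
import queue
from collections import defaultdict


class Broker:
    """Hypothetical in-process publish-subscribe intermediary."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # One-to-many: every subscriber of the topic receives the message.
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Publish-subscribe: two consumers both see the same event.
broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("prices", inbox_a.append)
broker.subscribe("prices", inbox_b.append)
delivered = broker.publish("prices", {"symbol": "ABC", "price": 10})

# Message queue: each message is taken by exactly one consumer.
jobs = queue.Queue()
jobs.put("job-1")
jobs.put("job-2")
consumer_1_got = jobs.get()
consumer_2_got = jobs.get()
```

In both cases the sender never names the receiver directly, which is what makes these paradigms indirect.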

C. Roles and responsibilities

  • Client-Server
    a direct and relatively simple approach to the sharing of data and other resources, but it scales poorly.
  • Peer-to-peer
    → interacting cooperatively as peers without any distinction between client and server processes or the computers on which they run
    → all participating processes run the same program and offer the same set of interfaces to each other.

D. Placement

  • Mapping of services to multiple servers
  • Caching
    → checking whether cached data is still up to date: a special HTTP request
    → web proxy servers provide a shared cache of web resources for the client machines
  • Mobile code
    → Accessing services means running code that can invoke their operations.
  • Mobile agents
    → a running program (code and data) that travels from one computer to another in a network carrying out a task on someone’s behalf [ such as collecting information, and eventually returning with the results ]
    → a potential security threat to the resources
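
The caching item above mentions a special HTTP request for checking freshness. One common form, and an assumption about what is meant here, is a conditional request: the cache sends the validator it holds (an ETag, via `If-None-Match`) and the server answers `304 Not Modified` when the copy is still current. The sketch below imitates that exchange in memory; `OriginServer`, `Cache`, and the URL are hypothetical names, and no real HTTP is involved.

```python
import hashlib


class OriginServer:
    """Hypothetical origin server holding the authoritative copy of each resource."""

    def __init__(self):
        self._resources = {}

    def put(self, url, body):
        self._resources[url] = body

    @staticmethod
    def _tag(body):
        # A validator derived from the content, playing the role of an HTTP ETag.
        return hashlib.sha256(body.encode()).hexdigest()

    def get(self, url, if_none_match=None):
        body = self._resources[url]
        tag = self._tag(body)
        if if_none_match == tag:
            return 304, tag, None  # Not Modified: the cached copy is still current
        return 200, tag, body


class Cache:
    """Hypothetical client-side or proxy cache that revalidates on every fetch."""

    def __init__(self, origin):
        self._origin = origin
        self._entries = {}  # url -> (tag, body)
        self.full_fetches = 0

    def fetch(self, url):
        cached = self._entries.get(url)
        cached_tag = cached[0] if cached else None
        status, tag, body = self._origin.get(url, if_none_match=cached_tag)
        if status == 304:
            return cached[1]  # reuse the stored body, nothing was transferred
        self.full_fetches += 1
        self._entries[url] = (tag, body)
        return body


origin = OriginServer()
origin.put("/index.html", "v1")
cache = Cache(origin)
first = cache.fetch("/index.html")   # full fetch
second = cache.fetch("/index.html")  # revalidated, body reused
origin.put("/index.html", "v2")
third = cache.fetch("/index.html")   # validator no longer matches, full fetch
```

Only the first and third fetches transfer the body; the second costs one small validation round trip.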

2.2 Architectural patterns

A. Layering

  • a complex system is partitioned into a number of layers
  • layering deals with the vertical organization of services into layers of abstraction

B. Tiered architecture

  • tiering is a technique to organize functionality of a given layer
  • place this functionality into appropriate servers or on to physical nodes

C. Thin clients

  • moving complexity away from the end-user device towards services in the Internet
  • enabling access to sophisticated networked services
  • a software layer that supports a window-based user interface
    → virtual network computing (VNC)

D. Other commonly occurring patterns

  • proxy pattern: to support location transparency in remote procedure calls or remote method invocation
    → a proxy to represent the remote object [ in the local address space ]
  • The use of brokerage: supporting interoperability in potentially complex distributed infrastructures.
  • reflection: supporting both introspection (the dynamic discovery of properties of the system) and intercession (the ability to dynamically modify structure or behavior).
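
The proxy pattern above can be sketched in a few lines. This is a toy under stated assumptions: `Calculator`, `handle_request`, and `Proxy` are hypothetical names, the "network" is just a function call, and JSON stands in for whatever wire format a real RPC or RMI system would use.

```python
import json


class Calculator:
    """The 'remote' object; in a real system it would live in another process."""

    def add(self, a, b):
        return a + b


def handle_request(target, raw_request):
    """Server side: decode the request, invoke the method, encode the reply."""
    request = json.loads(raw_request)
    method = getattr(target, request["method"])
    return json.dumps({"result": method(*request["args"])})


class Proxy:
    """Local stand-in that represents the remote object in the local address space."""

    def __init__(self, transport):
        self._transport = transport  # callable: request string -> reply string

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # i.e. for the remote object's method names.
        def remote_call(*args):
            raw_request = json.dumps({"method": name, "args": list(args)})
            raw_reply = self._transport(raw_request)
            return json.loads(raw_reply)["result"]

        return remote_call


remote_calculator = Calculator()
calculator = Proxy(lambda raw: handle_request(remote_calculator, raw))
total = calculator.add(2, 3)  # reads like a local call
```

The caller writes `calculator.add(2, 3)` without knowing where the object lives, which is the location transparency the pattern is meant to provide.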

2.3 Associated middleware solutions

  • The task of middleware: to provide a higher-level programming abstraction for the development of distributed systems and, through layering, to abstract over heterogeneity in the underlying infrastructure
  • limitation: some aspects of the dependability of systems require support at the application level

Categories of middleware

3. Fundamental Models

Interaction Model

  • the structure and sequencing of the communication between the elements of the system
    → Latency: The delay of communication channels [ performance ]
    → Bandwidth: total amount of information that can be transmitted over it in a given time.
    → Jitter: variation in the time taken to deliver a series of messages. [multimedia data]
  • impossible to maintain a single global notion of time
    → Computer clocks and timing events: There are several approaches to correcting the times on computer clocks.
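    → Event Ordering: an execution can be described in terms of events and their order. Logical time assigns each event a number that corresponds to its logical ordering (a later event has a larger logical time than an earlier one)
  • Two variants of the interaction model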
    → Synchronous Distributed Systems [ Time Bounds ]
    → Asynchronous Distributed Systems [ No Bound ]
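
The event-ordering item above can be illustrated with a Lamport-style logical clock, which is one well-known way to assign such numbers (chosen here for illustration; the notes do not name a specific algorithm). The class and variable names are hypothetical.

```python
class LamportClock:
    """A per-process counter that respects the order of causally related events."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Message receipt: move past both the local time and the sender's timestamp."""
        self.time = max(self.time, sent_time) + 1
        return self.time


x, y = LamportClock(), LamportClock()
x.tick()                      # x: local event, time 1
send_ts = x.tick()            # x sends message m, time 2
y.tick()                      # y: unrelated local event, time 1
recv_ts = y.receive(send_ts)  # y receives m: max(1, 2) + 1 = 3
```

The receive event always gets a larger number than the matching send, even though the two processes never compared physical clocks.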

Failure Model

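  • A model that catalogs the kinds of failure and their effects
  1. Omission failures: a process or channel fails to perform actions it is supposed to perform
    1) a process: Process omission failures
    2) communication channel: Communication omission failures
  2. Arbitrary failures: any type of error may occur
    → Hard to detect
  3. Timing failures
    Apply where time bounds are set (synchronous systems)
    Detection: timeout → hard to rely on in an asynchronous system, where a slow reply cannot be told apart from a failure
  4. Masking failures
    Handled through redundancy (e.g. a backup server)
  5. Reliability of one-to-one communication
    Reliable communication is defined in terms of validity and integrity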
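
As a rough illustration of timeout-based detection combined with masking through a backup server, consider the sketch below. `ServerTimeout`, `call_with_failover`, and the two server functions are hypothetical; a real client would wait on a socket with a deadline rather than catch a simulated exception.

```python
class ServerTimeout(Exception):
    """Raised by a hypothetical server stub when no reply arrives in time."""


def call_with_failover(servers, request):
    """Try each replica in order and hide individual failures from the caller."""
    failures = []
    for name, server in servers:
        try:
            return name, server(request)
        except ServerTimeout as exc:
            # The timeout only tells us the server did not answer in time;
            # in an asynchronous system it may simply be slow.
            failures.append((name, str(exc)))
    raise RuntimeError(f"all replicas failed: {failures}")


def primary(request):
    # Simulates a crashed or unreachable primary.
    raise ServerTimeout("no reply within the time limit")


def backup(request):
    return request.upper()


served_by, reply = call_with_failover(
    [("primary", primary), ("backup", backup)], "ping"
)
```

The caller sees a normal reply and never learns that the primary failed, which is the point of masking.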

Security Model

  • how the system is protected against attempts to interfere with its correct operation or to steal its data.

[ Protecting objects ]

Protection is described in terms of objects, although the concepts apply equally well to resources of all types.

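  • User: issues requests (invocations) to the server through a client object
  • Server: returns the computed result to each client
  • Access rights: manage which operations each principal is permitted to perform / access control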

[ Securing processes and their interaction ]

  • Processes interact by sending messages.
  • The messages are exposed to attack because the network and the communication service are open (open so that any pair of processes can communicate).
  1. The enemy
    Threats to processes: a process that receives a request cannot reliably identify the sender. The source IP address can be used, but it is not hard for an attacker to forge it.
    Threats to communication channels: an attacker can copy or alter messages as they travel across the network and its gateways, or inject other messages. Such attacks threaten the privacy and integrity of information.
  2. Defeating security threats
    → Cryptography and shared secrets: encryption and shared keys
    → Authentication: proving the identities supplied by their senders.
    → Secure channels: built as a service layer on top of encryption and authentication
  3. Other possible threats from an enemy
    → Denial of service: resulting in overloading of physical resources (network bandwidth, server processing capacity)
    → Mobile code: receives and executes program code from elsewhere, such as the email attachment
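
The shared-secret and authentication items above can be sketched with a message authentication code. The example uses Python's standard `hmac` and `hashlib` modules; the key, the messages, and the helper names `sign` and `verify` are hypothetical, and the sketch covers authenticity and integrity only (no encryption, no replay protection).

```python
import hashlib
import hmac


def sign(secret_key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that only holders of the key can produce."""
    return hmac.new(secret_key, message, hashlib.sha256).digest()


def verify(secret_key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(secret_key, message), tag)


shared_key = b"example-shared-secret"  # hypothetical; real keys come from a key exchange
message = b"transfer 10 to alice"
tag = sign(shared_key, message)

genuine_ok = verify(shared_key, message, tag)
# An enemy alters the message in transit but cannot recompute the tag.
tampered_ok = verify(shared_key, b"transfer 1000 to mallory", tag)
# An enemy without the key tries to sign the message with a guess.
forged_tag = sign(b"attacker-guess", message)
forged_ok = verify(shared_key, message, forged_tag)
```

The receiver accepts only the first case, so an altered or injected message is detected even on an open network.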

Difficulties and threats for distributed systems

  • Widely varying modes of use
    The component parts of systems are subject to wide variations in workload
  • Wide range of system environments
    accommodate heterogeneous hardware, operating systems and networks.
  • Internal problems
    1) Non-synchronized clocks → conflicting data updates
    2) many modes of hardware and software failure
  • External threats
    Attacks on data integrity and secrecy, denial of service


SoniaComp

Data Engineer interested in Data Infrastructure Powering Fintech Innovation (https://www.linkedin.com/in/sonia-comp/)