Here, we describe the new IBM z17 mainframe server, its technical architecture, and its connectivity options. The “IBM Technical Guides” offer an in-depth overview of the z17’s hardware components, such as frames, drawers, processors (including the new Telum II and the Spyre AI accelerator), memory, and I/O subsystems, along with its reliability, availability, and serviceability (RAS) features, security aspects (including cryptography and quantum-safe technologies), and capacity planning considerations. The “IBM Z Connectivity Handbook” focuses on the channel subsystem (CSS), FICON, zHyperLink, OSA-Express network adapters, HiperSockets, and coupling links, detailing their configuration and functions. A news article highlights the z17’s enhanced capabilities for AI workloads, particularly generative AI and real-time inference.
Here are the links to the mentioned IBM Redbooks:
IBM z17 Technical Introduction
A glossary of key terms mentioned in the Redbooks:
- CPC (Central Processor Complex): The core of the IBM z17 system, containing the processors, memory, and internal interconnects.
- Drawer (CPC Drawer, I/O Drawer): A modular hardware unit within the IBM z17 that houses specific components like processors and memory (CPC drawer) or I/O adapters (I/O drawer).
- DCM (Dual Chip Module): A physical module containing two processor unit chips in the IBM z17.
- PU (Processor Unit): An individual processing core within the IBM z17 processor unit chip.
- Cache: High-speed memory used by the processors to store frequently accessed data and instructions, improving performance. The IBM z17 has multiple levels of cache: L1 and L2 caches per core, plus virtual L3 and L4 caches that are formed from the L2 caches.
- Memory (Main Storage): The primary working memory of the IBM z17 system used to store data and instructions that are actively being processed.
- RAIM (Redundant Array of Independent Memory): A memory technology that uses redundancy to detect and correct errors, enhancing memory reliability.
- FICON Express: A high-speed Fibre Channel interface used for connecting the IBM z17 to storage area networks (SANs) and storage devices.
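- zHyperLink Express: A low-latency, short-distance, point-to-point link between the IBM Z CPC and supported storage systems (such as IBM DS8000), used alongside FICON to reduce I/O latency. It is not a coupling link; Parallel Sysplex coupling uses dedicated coupling links.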
- OSA (Open Systems Adapter): A network interface card that provides Ethernet connectivity for the IBM z17, allowing it to communicate over IP networks.
- Network Express: A newer generation of network adapters for IBM Z, offering enhanced features and performance for Ethernet connectivity.
- HiperSockets: An internal, memory-based virtual network implemented in firmware that provides high-speed, low-latency IP connectivity between LPARs on the same IBM Z system, without any physical network adapter or cabling.
- SMC (Shared Memory Communications): A technology that lets TCP socket applications on IBM Z operating systems (such as z/OS and Linux on IBM Z) exchange data through shared memory with lower latency and CPU cost. Includes SMC-R (over RDMA-capable Ethernet, between systems) and SMC-D (Direct, between LPARs on the same system).
- Channel Subsystem (CSS): The hardware and microcode that control the flow of data between the central processing complex and the I/O devices.
- CHPID (Channel Path Identifier): A logical identifier for a channel path within the channel subsystem.
- I/O Subsystem: The part of the IBM z17 responsible for handling input and output operations, including channels and adapters.
- HMC (Hardware Management Console): A separate hardware appliance used to configure, manage, and monitor IBM Z systems.
- LPAR (Logical Partition): A virtualized instance of a computer system within a physical IBM Z server, capable of running its own operating system and applications.
- RAS (Reliability, Availability, Serviceability): A set of design principles and features aimed at ensuring high system uptime, preventing failures, and facilitating efficient repair.
- CoD (Capacity on Demand): The ability to activate or deactivate processor capacity and features on an IBM Z system based on workload requirements and licensing.
- Pervasive Encryption: A security approach where data is encrypted at rest and in motion across the entire IBM Z environment, minimizing the risk of data breaches.
- Quantum-Safe Algorithms: Cryptographic algorithms designed to resist attacks from future quantum computers. The IBM z17 incorporates support for these algorithms.
- z/OS: One of the primary operating systems that runs on IBM Z systems.
- z/VM: A virtualization operating system for IBM Z that allows multiple virtual machines (guests) to run on a single physical server.
- Linux on IBM Z: A port of the Linux operating system that runs natively on IBM Z hardware.
- KVM (Kernel-based Virtual Machine): An open-source virtualization technology that can run on Linux on IBM Z.
- Sparing: The inclusion of redundant hardware components that can automatically take over in case of a failure.
- Transactional Execution: A processor feature that allows a sequence of instructions to execute atomically, either all completing successfully or having no effect, which is useful for concurrency control.
- TLB (Translation Lookaside Buffer): A cache of virtual-to-real address translations used to speed up memory access.
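- HiperDispatch: IBM’s processor dispatching technology in which the PR/SM hypervisor and the operating system cooperate to keep work on the same physical processors where possible, improving cache reuse and the use of processor resources across logical partitions.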
- AI Acceleration: The integration of specialized hardware or software to improve the performance of artificial intelligence and machine learning workloads. The IBM z17 includes the Integrated Accelerator for Artificial Intelligence.
- DPU (Data Processing Unit): A specialized processor designed to accelerate data-centric workloads, such as network processing and security functions. The Telum II processor in the IBM z17 includes an on-chip DPU for I/O acceleration.
- Coupling Facility (CF): A central resource in an IBM Parallel Sysplex used for shared data, locking, and inter-system communication.
- Parallel Sysplex: A cluster of IBM Z systems that work together as a single logical computing entity for increased availability and scalability.
- GDPS (Geographically Dispersed Parallel Sysplex): An IBM solution that extends Parallel Sysplex capabilities across geographically separated sites for disaster recovery.
- System Recovery Boost: A temporary boost in processor capacity provided after a planned or unplanned outage to help the system recover and catch up on processing.
- Cyber Resiliency: The ability of a system to withstand and recover from cyberattacks and disruptions.
- Secure Boot: A process that ensures the firmware and operating system boot process is secure and has not been tampered with.
- Secure Execution for Linux: A set of technologies that provide a highly secure and isolated environment for running Linux workloads on IBM Z.
- Secure Service Container: A hardened, tamper-resistant partition type designed to run sensitive workloads and software appliances with enhanced security and isolation.
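
To make one of the glossary entries more concrete, here is a minimal Python sketch of the idea behind a TLB: a small cache of virtual-to-real address translations that avoids a slow table lookup on repeated accesses. This is a toy model for illustration only, not how the z17 hardware implements it. The `ToyTLB` class, the page table contents, the 4 KiB page size, and the least-recently-used eviction policy are all assumptions made for this example.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages, for illustration only


class ToyTLB:
    """A tiny LRU cache of virtual-page -> real-frame translations (hypothetical model)."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # full mapping; stands in for the slow table walk
        self.capacity = capacity
        self.entries = OrderedDict()  # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        # Split the address into a page number and an offset within the page.
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            self.entries[page] = self.page_table[page]  # slow path: consult the table
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
        return self.entries[page] * PAGE_SIZE + offset


# Hypothetical page table: virtual page number -> real frame number.
page_table = {0: 7, 1: 3, 2: 9}
tlb = ToyTLB(page_table, capacity=2)

addresses = [0, 8, 4096, 16, 8192, 4100]
real = [tlb.translate(a) for a in addresses]
print(real, "hits:", tlb.hits, "misses:", tlb.misses)
```

Repeated accesses to the same page (addresses 0, 8, and 16 here) are served from the cache after the first lookup, which is the effect the glossary entry describes.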
Excellent video about the new IBM z17 Server and its architecture and connectivity options
Nice to listen to my writings… 😉