Here, we describe the new IBM z17 mainframe server, its technical architecture, and its connectivity options. The “IBM Technical Guides” offer an in-depth overview of the z17’s hardware components, such as frames, drawers, processors (including the new Telum II and the Spyre AI accelerator), memory, and I/O subsystems. They also cover its reliability, availability, and serviceability (RAS) features, security aspects (including cryptography and quantum-safe technologies), and capacity planning considerations. The “IBM Z Connectivity Handbook” focuses on the channel subsystem (CSS), FICON, zHyperLink, OSA-Express network adapters, HiperSockets, and coupling links, detailing their configuration and functions. A news article highlights the z17’s enhanced capabilities for AI workloads, particularly generative AI and real-time inference.
Here are the links to the mentioned IBM Redbooks:
IBM z17 Technical Introduction
A glossary of key terms mentioned in the Redbooks:
- CPC (Central Processor Complex): The core of the IBM z17 system, containing the processors, memory, and internal interconnects.
- Drawer (CPC Drawer, I/O Drawer): A modular hardware unit within the IBM z17 that houses specific components like processors and memory (CPC drawer) or I/O adapters (I/O drawer).
- DCM (Dual Chip Module): A physical module containing two processor unit chips in the IBM z17.
- PU (Processor Unit): An individual processing core within the IBM z17 processor unit chip.
- Cache: High-speed memory used by the processors to store frequently accessed data and instructions, improving performance. The IBM z17 has multiple levels of cache: private L1 and L2 caches per core, plus virtual L3 and L4 caches that are formed from the shared L2 capacity.
- Memory (Main Storage): The primary working memory of the IBM z17 system used to store data and instructions that are actively being processed.
- RAIM (Redundant Array of Independent Memory): A memory technology that uses redundancy to detect and correct errors, enhancing memory reliability.
- FICON Express: A high-speed Fibre Channel interface used for connecting the IBM z17 to storage area networks (SANs) and storage devices.
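- zHyperLink Express: A low-latency, short-distance, direct connection between the IBM Z CPC and supported storage systems, designed to reduce I/O latency for workloads such as database reads and log writes. It complements FICON and is not a Parallel Sysplex coupling link.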
- OSA (Open Systems Adapter): A network interface card that provides Ethernet connectivity for the IBM z17, allowing it to communicate over IP networks.
- Network Express: A new generation of network adapter introduced with the IBM z17 that provides Ethernet connectivity and combines functions previously delivered by separate OSA-Express and RoCE Express adapters.
- HiperSockets: An internal, memory-based network within a single IBM Z system that provides high-speed, low-latency TCP/IP connectivity between LPARs on that system without using physical network adapters.
- SMC (Shared Memory Communications): A technology that enables high-performance, low-overhead communication between operating system instances (such as z/OS and Linux on IBM Z) by using shared memory. Includes SMC-R (over RDMA) and SMC-D (direct memory access within one system).
- Channel Subsystem (CSS): The hardware and microcode that control the flow of data between the central processor complex and the I/O devices.
- CHPID (Channel Path Identifier): A logical identifier for a channel path within the channel subsystem.
- I/O Subsystem: The part of the IBM z17 responsible for handling input and output operations, including channels and adapters.
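- HMC (Hardware Management Console): A management appliance used to configure, manage, and monitor IBM Z systems. Depending on the configuration, it may be a standalone console or run alongside the Support Elements inside the system.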
- LPAR (Logical Partition): A virtualized instance of a computer system within a physical IBM Z server, capable of running its own operating system and applications.
- RAS (Reliability, Availability, Serviceability): A set of design principles and features aimed at ensuring high system uptime, preventing failures, and facilitating efficient repair.
- CoD (Capacity on Demand): The ability to activate or deactivate processor capacity and features on an IBM Z system based on workload requirements and licensing.
- Pervasive Encryption: A security approach where data is encrypted at rest and in motion across the entire IBM Z environment, minimizing the risk of data breaches.
- Quantum-Safe Algorithms: Cryptographic algorithms designed to resist attacks from future quantum computers. The IBM z17 incorporates support for these algorithms.
- z/OS: One of the primary operating systems that runs on IBM Z systems.
- z/VM: A virtualization operating system for IBM Z that allows multiple virtual machines (guests) to run on a single physical server.
- Linux on IBM Z: A port of the Linux operating system that runs natively on IBM Z hardware.
- KVM (Kernel-based Virtual Machine): An open-source virtualization technology that can run on Linux on IBM Z.
- Sparing: The inclusion of redundant hardware components that can automatically take over in case of a failure.
- Transactional Execution: A processor feature that allows a sequence of instructions to execute atomically, either all completing successfully or having no effect, which is useful for concurrency control.
- TLB (Translation Lookaside Buffer): A cache of virtual-to-real address translations used to speed up memory access.
- HiperDispatch: A cooperative function of the hypervisor and the operating system dispatcher that keeps work on the same logical and physical processors where possible, improving cache reuse and processor efficiency across logical partitions.
- AI Acceleration: The integration of specialized hardware or software to improve the performance of artificial intelligence and machine learning workloads. The IBM z17 includes the Integrated Accelerator for Artificial Intelligence.
- DPU (Data Processing Unit): A specialized processor designed to accelerate data-centric workloads, such as network processing and security functions. The IBM z17 includes a DPU.
- Coupling Facility (CF): A central resource in an IBM Parallel Sysplex used for shared data, locking, and inter-system communication.
- Parallel Sysplex: A cluster of IBM Z systems that work together as a single logical computing entity for increased availability and scalability.
- GDPS (Geographically Dispersed Parallel Sysplex): An IBM solution that extends Parallel Sysplex capabilities across geographically separated sites for disaster recovery.
- System Recovery Boost: A temporary boost in processor capacity provided after a planned or unplanned outage to help the system recover and catch up on processing.
- Cyber Resiliency: The ability of a system to withstand and recover from cyberattacks and disruptions.
- Secure Boot: A process that ensures the firmware and operating system boot process is secure and has not been tampered with.
- Secure Execution for Linux: A set of technologies that provide a highly secure and isolated environment for running Linux workloads on IBM Z.
- Secure Service Container: A hardened, tamper-resistant type of partition designed to run sensitive workloads, typically packaged as software appliances, with enhanced security and isolation.
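
To make the TLB entry in the glossary above more concrete, here is a minimal Python sketch of the general idea: a small cache of virtual-page to real-frame translations that avoids a slower table lookup on repeated accesses. It is a conceptual toy only and does not model the actual z17 hardware. The class name `ToyTLB`, the `page_table` contents, the capacity of two entries, the least-recently-used eviction policy, and the 4 KiB page size are all assumptions made for the illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for this illustration


class ToyTLB:
    """A tiny LRU cache of virtual-page -> real-frame translations (hypothetical)."""

    def __init__(self, page_table, capacity=2):
        self.page_table = page_table  # full translation table (the "slow" path)
        self.capacity = capacity      # how many translations the cache can hold
        self.entries = OrderedDict()  # cached translations, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        # Split the address into a page number and an offset within the page.
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            # Miss: look the translation up in the full table and cache it.
            self.entries[page] = self.page_table[page]
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used entry
        return self.entries[page] * PAGE_SIZE + offset


# Hypothetical mapping of virtual pages to real frames.
page_table = {0: 7, 1: 3, 2: 9}
tlb = ToyTLB(page_table)

addresses = [0x0010, 0x0020, 0x1004, 0x0030, 0x2000, 0x1008]
real = [tlb.translate(a) for a in addresses]
print(real)
print("hits:", tlb.hits, "misses:", tlb.misses)
```

Repeated accesses to the same page are served from the cached entry, while touching more pages than the cache can hold forces evictions and fresh lookups. Real hardware TLBs differ in size, associativity, and replacement policy.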
Excellent video about the new IBM z17 Server and its architecture and connectivity options