Let’s not sugarcoat it. For most enterprise technology leaders, the mainframe feels like the dependable but dated elder in the room — rock-solid, yes, but not exactly the life of the AI party. In a tech world dominated by container orchestration, GPU acceleration, and cloud-native everything, mainframes don’t get a seat at the innovation table. They’re assumed to be ticking away in the background, quietly running core systems while the future happens elsewhere.
But that perception — like most lazy assumptions — is overdue for an upgrade. IBM is not only refusing to let the mainframe fade into irrelevance; it’s doubling down. And with the impending global debut of its next-generation mainframe platform — the successor to the z16, now officially named the z17 — it’s clear this isn’t about nostalgia or maintenance. This is about reinvention.
At Greyhound Research, we’ve been tracking IBM’s strategic positioning of the Z platform for over a decade, and this moment feels like the culmination of that long game. z17 is not just a hardware release. It’s a declaration: the future of AI and hybrid cloud will require systems that combine brute-force resilience with nuanced intelligence. IBM believes the mainframe can be that system. And truth be told, it just might be right — if it can back its rhetoric with real-world readiness.
IBM mainframes currently process 70% of the world’s financial transactions by value — a staggering figure that underscores the platform’s continued centrality in global infrastructure, even in a cloud-first era.
During recent interactions, IBM reinforced this statistic by revealing that the USD 8.5 trillion daily transaction value handled by Z is backed by an availability metric of ‘eight nines’—99.999999%. This reliability isn’t theoretical; it allows governments, banks, and logistics companies to trust Z as their real-time backbone for critical infrastructure.
It’s important to note that z17 wasn’t cooked up in a backroom with a few internal architects. This platform is the product of over 2,000 hours of design thinking sessions with more than 80 global clients across industries, regions, and job roles. That level of co-creation — across CIOs, CISOs, developers, operators, and platform engineers — signals intent. IBM didn’t just build a box. It built a system based on what its most demanding customers said they needed next.
IBM Z Day: The Global Stage for a Global Bet
That big bet IBM is making? It wasn’t quietly rolled out behind closed doors. This strategic evolution was officially unveiled during IBM Z Day, a global virtual event on April 8, 2025, featuring more than 150 speakers over seven hours. The event showcased the z17 platform’s core innovations — from hybrid cloud integration to quantum-safe security — and was structured to engage multiple enterprise personas: from seasoned developers to security architects, from infrastructure leads to new adopters.
But this wasn’t just a product showcase. Z Day marked a deliberate shift in IBM’s narrative — a strategic repositioning of the mainframe from a legacy workhorse to a modern, AI-ready platform engineered for policy-heavy, performance-sensitive environments. IBM is not claiming the platform is returning to relevance. It’s asserting it never left — only evolved to meet a different class of enterprise risk and resilience requirement.
Importantly, IBM used this moment to bring long-awaited certainty to the table. The company confirmed general availability of the z17 platform for June 18, 2025, offering CIOs a fixed milestone around which to orient investment and deployment plans.
Per the Greyhound CIO Pulse 2025, more than 40% of technology leaders in financial services, government, and telecommunications had deferred critical infrastructure refresh decisions into the second half of the year, awaiting clarity on mainframe timelines. That clarity is now on the record — and it will directly influence capital allocation windows over the coming quarter.
Per Greyhound Fieldnotes — notes from our advisory work — a global banking consortium CIO noted that z17 had moved from “architectural curiosity to strategic necessity” after IBM confirmed the GA date. Procurement teams, we were told, had already begun reshuffling budget lines to accommodate early deployments, even pulling back cloud extensions that were originally earmarked for low-latency use cases. That’s not just intent. That’s activation.
Beyond the Box: IBM’s Roadmap and Client Co-Creation
IBM’s confidence in the mainframe isn’t rooted in nostalgia — it’s engineered into a long-term roadmap and shaped by some of its most sophisticated clients. During discussions, IBM executives confirmed that development work is already underway on zNext and zNext+1. Developing three generations of mainframe innovation in parallel is not just a marketing commitment; it’s an architectural investment strategy that spans silicon, software, and systems engineering. This level of roadmap transparency is rare and speaks volumes about the company’s long-term commitment to the platform. At a time when other infrastructure vendors are hedging, IBM is staking out decades of intent — and backing it with active silicon development. See picture below for more details.

Equally important is how IBM got here. The z17 platform is not the product of internal brainstorming or an isolated skunkworks lab. Per the company, it’s the outcome of over 2,000 hours of co-creation with more than 80 enterprise clients worldwide. These sessions included developers, architects, operators, CISOs, CIOs, and platform engineers across regulated and high-volume industries. From workload orchestration to data privacy and AI acceleration, every design decision reflects live enterprise pain points and aspirations.
The result? A platform built not for hypothetical scenarios but for hybrid realities. IBM didn’t just consult its clients — it built z17 with them. IBM underlined that this co-creation effort spanned both traditional and unconventional workloads. One standout example came from Europe, where an agency is using the mainframe and AI to analyse seagrass growth from satellite imagery as a proxy for climate change impact.
At Greyhound Research, we believe that when a platform originally designed for banking is now helping assess environmental degradation, we’re no longer talking legacy — we’re talking agility.
During a recent discussion, IBM noted it has already collaborated with clients on over 250 AI use cases on the mainframe, with 21 currently in production. This breadth underlines not just technical possibility but actual enterprise commitment at scale. See picture below for more details. It’s important to note that the use cases shown below are a sample of the 250+, not necessarily those in production.

The Silicon Shift: Telum II
The foundation of IBM’s z17 narrative rests on two critical silicon innovations: the Telum II processor and the Spyre AI accelerator. Together, these chips aren’t simply upgrades — they represent a deliberate architectural pivot.
Telum II, the follow-up to the chip that first introduced native AI inferencing within the z16, delivers significantly more power and intelligence. With eight cores running at 5.5 GHz, a 40 percent increase in cache capacity, and AI compute power raised to 24 trillion operations per second, Telum II is tailor-made for real-time, high-volume AI tasks like transactional fraud detection and dynamic risk scoring. The Telum II processor introduces a refined cache architecture, featuring 10 Level-2 caches on the chip—one per core, an additional one for the integrated Data Processing Unit (DPU), and a tenth serving as the overall chip cache. This design ensures cache coherency among the 32 cores, enhancing data access speeds and processing efficiency.
Furthermore, the virtual L3 and L4 caches have expanded by 40%, now offering 360 MB and 2.8 GB respectively, contributing to significant performance improvements. Each AI accelerator also gains on-chip support for large language models. IBM shared new performance data: 450 billion inferencing operations per day with 1ms latency and a 50% performance uplift over z16. Importantly, these inferencing operations aren’t hypothetical — they’re in-transaction and on-platform, using the chip’s tight cache coherence and security by design to deliver AI at the point of need. These enhancements enable the processor to handle more complex AI models with higher efficiency, facilitating real-time analytics and decision-making.
Additionally, support for INT8 as a data type has been added to the AI accelerator, enhancing compute capacity and efficiency for applications where INT8 is preferred. This allows for the utilisation of newer models that rely on this data type. System-level enhancements in the processor drawer enable each AI accelerator to accept work from any core in the same drawer, improving load balancing across all eight AI accelerators. This configuration provides each core with access to more low-latency AI acceleration, delivering up to 192 TOPS when fully configured. It keeps AI inferencing close to the data, eliminating the latency and risk of pushing sensitive workloads into external GPU clusters. Some context is important here, however: these TOPS figures are for the mainframe and are not comparable to those of an all-purpose AI accelerator like a GPU. CIOs and technology architects should use them in the context of the mainframe and avoid drawing direct comparisons.
The inclusion of an on-chip Data Processing Unit (DPU) significantly reduces power for I/O management by up to 70% while supporting complex protocol execution. The DPU comprises four tightly coordinated clusters, each with eight programmable micro-controller cores, designed with concurrency in mind. This isn’t over-engineering — it’s orchestration at the silicon level. It sharpens the mainframe’s reflexes for I/O-heavy transactions and allows for advanced protocol handling, making sure the system doesn’t just respond; it anticipates. This makes the system not only faster but vastly more efficient under real-time loads.
Telum II also features ONNX model compatibility, enabling enterprises to deploy models trained elsewhere — including on GPU clusters or public cloud stacks — directly into mainframe environments. This flexibility addresses the rising demand for open AI model portability, reducing vendor lock-in and supporting enterprise model lifecycle strategies.
Under the hood, all of this orchestration is made seamless by Machine Learning for z/OS, a specialised software layer that handles AI workload distribution across Telum, Spyre, and external accelerators without burdening the application developer.
The architecture supports up to 32 Telum II processors in a coherent SMP system, with 12 I/O expansion drawers accommodating up to 192 PCIe cards — making it not just scalable but massively parallel in design.
IBM also confirmed that z17 is capable of handling up to 5 million inferences per second, enabling multiple inference events within a single transaction — a critical capacity for industry use-cases like fraud detection, where layered intelligence must act near-instantaneously.
Per the Greyhound AI Infrastructure Pulse 2025, 58% of financial services infrastructure leaders flagged the inability to perform concurrent in-transaction inferencing as a key limitation in scaling GenAI models to production-grade fraud detection systems.
In one Greyhound Fieldnote, a global banking CIO told us, “We don’t just need model accuracy — we need five decisions within five milliseconds, every time. That’s the only metric that matters when real money moves.” The promise of 5 million inferences per second isn’t just a benchmark — it’s a gating factor for AI deployment in production-scale environments.
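As a back-of-the-envelope check (our arithmetic, not IBM’s), the two headline figures are mutually consistent, and the throughput claim translates into real in-transaction headroom. The transaction volume in the sketch below is an illustrative assumption:

```python
# Back-of-the-envelope check on the headline inferencing figures.
# The 450B/day and 5M/sec numbers come from IBM's announcement;
# the payments workload below is our own illustrative assumption.

SECONDS_PER_DAY = 24 * 60 * 60            # 86,400

inferences_per_day = 450e9                # IBM: 450 billion per day
inferences_per_sec = inferences_per_day / SECONDS_PER_DAY
print(f"{inferences_per_sec / 1e6:.1f}M inferences/sec")   # ~5.2M/s, in line with the 5M/s claim

# Hypothetical payments workload: how many in-transaction inference
# calls does 5M inferences/sec leave room for?
peak_capacity = 5_000_000                 # IBM: inferences per second
transactions_per_sec = 10_000             # assumed payment volume
calls_per_transaction = peak_capacity / transactions_per_sec
print(f"{calls_per_transaction:.0f} inference calls per transaction")
```

At an assumed 10,000 transactions per second, that budget allows hundreds of layered model calls per transaction, which is exactly the kind of ensemble scoring the fraud-detection use case demands.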
The Silicon Shift: Spyre Redefines AI Infrastructure
But it’s Spyre that truly changes the conversation. Expected to launch in Q4 2025, this new processor is IBM’s first specifically designed for generative AI workloads. With 32 dedicated accelerator cores and support for scale-out configurations, Spyre is built to run large language models natively on the mainframe. Each Spyre card carries 128GB of LPDDR5 memory and delivers over 300 TOPS on a single PCIe Gen5 x16 adapter. Eight cards in a logical cluster bring 1TB of memory and an aggregate bandwidth of 1.6TB/s — tailor-made for generative AI’s insatiable appetite.
Christian Jacobi, IBM Fellow and CTO of IBM Systems Development, summed it up succinctly: “Spyre will furnish the mainframe with LLM-optimized processing for the first time.” It’s a powerful statement, and it signals IBM’s intent to claim space not just in AI infrastructure but in the GenAI arms race that’s rapidly defining enterprise IT agendas.
Fabricated using 5nm process technology and mounted on PCIe cards, these accelerators are designed to scale — with eight cards adding 256 additional AI cores to a single system. This gives enterprises the ability to run inference-heavy workloads directly within the walls of their own data centres—no detours, no offloading. From a security and compliance lens, that’s a huge win. The design also reflects a rare pragmatism: by enabling direct data transfer between compute engines, IBM has reduced the power draw during AI operations without compromising performance.
Each Spyre card is projected to draw just 75 watts, a footprint that stands in stark contrast to the multi-hundred-watt demands of typical off-platform GPU alternatives. This energy profile makes Spyre not only efficient, but ESG-aligned — an increasingly non-negotiable attribute for infrastructure buyers in regulated sectors.
Per the Greyhound ESG Pulse 2025, 64% of enterprise sustainability officers now require AI infrastructure purchases to report power-per-inference metrics and include lifecycle carbon projections at the time of procurement.
In one Greyhound Fieldnote, a sustainability lead at a Nordic telecom provider noted that Spyre’s energy profile “finally lets us run inferencing without blowing up our emissions targets.” The win isn’t just technical. It’s reputational — and in this market, that counts.
At Greyhound Research, we’ve taken a close look under the hood — and here’s what stands out. Each Spyre chip is architected with 32 dedicated accelerator cores and over 25 billion transistors, strung together with an eye-watering 14 miles of wiring. It’s not just brute force; it’s elegant engineering.
The Spyre architecture supports low-precision numeric formats like int4 and int8 — a deliberate nod to the memory-hungry nature of GenAI models, allowing them to run efficiently without exploding the compute bill.
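To make concrete what low-precision formats buy you, here is a minimal sketch of symmetric int8 quantisation. This is the generic technique, not IBM’s Spyre implementation; the weight values are invented for illustration:

```python
# Minimal symmetric int8 quantisation sketch (generic technique,
# not IBM's implementation; example weights are invented).

def quantize_int8(weights):
    """Map float weights onto [-127, 127] with a shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.05, 0.89, -0.33]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# int8 needs 1 byte per weight vs 4 for float32: a 4x memory saving
# (int4 roughly doubles that again), at the cost of rounding error
# bounded by half the scale factor.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                 # integers in [-127, 127]
print(f"max rounding error: {max_err:.4f}")
```

The memory arithmetic is what matters for GenAI: a 7-billion-parameter model drops from roughly 28 GB in float32 to about 7 GB in int8, which is why low-precision support is a prerequisite for running LLMs on accelerator-local memory.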
IBM has further claimed that Spyre is engineered to deliver up to 70% lower cost per inference compared to off-platform GPU-based inferencing — a figure that moves this architecture squarely into commercial conversation, not just technical viability.
Per the Greyhound AI Infrastructure Pulse 2025, CFOs and platform leads across large U.S. and EU-based enterprises ranked inferencing cost overruns as the top budgetary red flag in GenAI programs, often exceeding initial estimates by 40% or more.
In one Greyhound Fieldnote, a procurement executive at a U.S. pharmaceutical major shared that he expects Spyre-based inferencing to “deliver cost compression without compromising model accuracy” — enabling them to redeploy budget toward LLM tuning and fine-tuning stacks instead. In this light, Spyre is not just a silicon advantage. It’s a budget rebalancing tool.
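The budget-rebalancing argument is easy to quantify. In the sketch below, the 70% reduction is IBM’s claim; the workload volume and off-platform unit cost are our own assumptions, chosen only to show the shape of the calculation:

```python
# Illustrative budget arithmetic behind the "70% lower cost per
# inference" claim. The volumes and unit cost are our assumptions,
# not IBM figures.

monthly_inferences = 500_000_000          # assumed GenAI workload
gpu_cost_per_1k = 0.02                    # assumed off-platform cost, USD

gpu_bill = monthly_inferences / 1000 * gpu_cost_per_1k
spyre_bill = gpu_bill * (1 - 0.70)        # IBM's claimed reduction
freed = gpu_bill - spyre_bill             # budget available to redeploy

print(f"GPU bill:   ${gpu_bill:,.0f}/month")    # $10,000
print(f"Spyre bill: ${spyre_bill:,.0f}/month")  # $3,000
print(f"Freed up:   ${freed:,.0f}/month")       # $7,000
```

Whatever the absolute numbers turn out to be, the structure holds: a 70% unit-cost reduction frees the majority of the inferencing line item for redeployment into tuning and fine-tuning stacks.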
Still, the efficiency story is worth calling out: according to IBM, Spyre delivers markedly better power efficiency than off-platform GPU inferencing.
System-Level Advancements: Quantifying the Enterprise Impact of z17
While the architectural innovations behind z17 — including Telum II and Spyre — represent significant leaps in compute capability and AI readiness, it is equally critical to examine the platform’s system-level advancements. These enhancements, which often operate below the line in broader architectural discussions, offer direct, measurable improvements to enterprise infrastructure planning and operations.
IBM has shared a number of performance and efficiency metrics that clearly signal the platform’s intent to deliver enterprise value beyond raw compute. Before you read more details below, here’s a picture from IBM that summarises it well.

An 11% uplift in single-thread performance over the z16 generation strengthens z17’s ability to support latency-sensitive transactional workloads — including real-time fraud detection, payment processing, and high-speed reconciliation engines. For environments where milliseconds translate into margin, this performance gain is non-trivial.
15–20% growth in overall system capacity enables greater workload density and improved consolidation economics. Enterprises can defer infrastructure expansion or frame upgrades, thereby lowering capital expenditure cycles and reducing systems sprawl — a common pain point in hybrid environments.
A 60% increase in addressable memory, supporting configurations up to 64 TB, expands the platform’s capacity to host in-memory databases, accelerate LLM inference, and handle complex AI model staging. This enhancement is particularly significant for financial services, healthcare, and government clients where large-scale data processing at rest and in motion is mission-critical.
Power efficiency improvements ranging from 17–27%, achieved via architectural refinements including fine-grained voltage control loops and improved cooling dynamics, translate directly to reduced energy consumption and improved TCO. With CIOs under increasing pressure to meet both internal ESG mandates and external regulatory requirements, these gains carry weight beyond traditional infrastructure metrics.
Beyond hardware-level savings, IBM is also now quantifying efficiency per workload, claiming up to 25% lower power consumption per inference or transaction versus prior systems. This shift matters — it anchors sustainability in operational output, not just component specs.
Per the Greyhound ESG Pulse 2025, 46% of CIOs and CFOs across global capital markets and healthcare organisations said that energy metrics tied to application throughput — not just data centre draw — are now required in board-level reporting.
In one Greyhound Fieldnote, the CTO of a European insurance provider told us, “We’ve stopped asking how green the chip is — we now ask how much energy it takes to approve a claim.” IBM’s move to position power consumption per workload reflects this accountability shift — and it’s a smart one.
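Per-workload energy metrics come down to simple arithmetic: power divided by throughput gives energy per unit of work. All numbers in the sketch below are illustrative assumptions, not IBM or vendor measurements:

```python
# Illustrative energy-per-inference arithmetic (assumed numbers,
# not IBM measurements) showing why per-workload metrics matter.

def joules_per_inference(watts, inferences_per_sec):
    """Energy cost of one inference: power (J/s) / throughput (1/s)."""
    return watts / inferences_per_sec

# Hypothetical comparison: a 75 W on-platform card vs a 300 W GPU.
card = joules_per_inference(75, 50_000)      # 0.0015 J = 1.5 mJ
gpu = joules_per_inference(300, 120_000)     # 0.0025 J = 2.5 mJ

print(f"card: {card * 1000:.2f} mJ/inference")
print(f"gpu:  {gpu * 1000:.2f} mJ/inference")
# Even though the GPU has higher raw throughput here, the card wins
# on energy per unit of work - the metric boards now ask about.
```

This is precisely the shift the insurance CTO describes: the reporting unit moves from watts per chip to joules per claim approved.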
z17 also reduces physical footprint and weight, supporting flexible deployment in 1–4 frame configurations within a standard 19-inch data centre layout. This enables better space planning, simplifies on-premises deployment, and supports modular scaling. Clients aiming to modernise ageing data centre environments or shift toward distributed edge models will find this configurability particularly valuable.
Per the Greyhound Infrastructure Pulse 2025, over 35% of global infrastructure leaders — particularly in financial services and government — cited physical density and thermal constraints as active blockers to compute expansion in regulated environments.
At Greyhound Research, we believe these improvements signal more than just incremental evolution. They reflect IBM’s ongoing shift from a hardware-centric model to a platform-led engagement — one that recognises that modern IT infrastructure must align with the financial, operational, and environmental realities faced by global enterprises.
These metrics are also a direct result of IBM’s co-creation model with over 80 global clients. Feedback on power density, memory limits, and physical data centre constraints has visibly shaped z17’s physical and performance profile. In this light, the platform becomes not merely a technology investment, but a co-authored solution to enterprise-scale complexity.
For infrastructure leaders evaluating their next cycle of hardware investment, z17 offers a rare combination: measurable gains in performance and sustainability, delivered within a highly governed, highly secure system architecture. While many platforms offer performance on paper, few deliver it with this level of operational clarity.
Proof in Procurement: Texas Bets on z17
It’s one thing for IBM to promise a new mainframe — it’s another for governments to start planning their budgets around it. A February 2025 Request for Offers (RFO No. 304-26-0364RG) from the Texas Comptroller of Public Accounts explicitly names the upcoming z17 platform as the planned successor to the agency’s IBM z15 mainframe. The document outlines the intention to migrate by November 2025 — a tangible, time-boxed transition roadmap that demonstrates trust in IBM’s delivery capabilities.
Specifically, the RFO seeks ASPG software maintenance and support for 12 months to span the upgrade to z17. The systems in use today include an IBM z15 T01 with 2858 MIPS across multiple logical partitions (LPARs), and the document makes clear that ongoing support must ensure seamless continuity from the current z15 to z17 environments. This includes maintaining core functionality, job scheduling, and operational performance through the upgrade window.
This isn’t a lab pilot. It’s a production-grade mandate by one of the largest and most regulated state entities in the U.S. In a world flooded with marketing hype, nothing signals confidence quite like a state-backed RFO with hard dates and compliance-backed workloads. Texas didn’t just nod along to IBM’s roadmap — it operationalised it.
Quantum-Safe by Design: Hype or Hard Reality?
Among the more ambitious claims made by IBM around z17 is its quantum-safe cryptographic architecture. With quantum computing on the horizon, IBM has taken a proactive stance by embedding support for quantum-safe algorithms and cryptographic agility directly into the platform.
For many enterprise leaders, quantum security still feels like a future problem. But that’s a dangerous assumption. Regulatory bodies are already beginning to push forward mandates for crypto-agility, recognising that state actors may be harvesting encrypted data today with the intention of breaking it later. In this context, IBM’s inclusion of quantum-safe capabilities is less about hype and more about hardening.
Backing that commitment, IBM confirmed it has invested over USD 1 billion in security and AI capabilities for the z17 platform — a figure that signals long-term conviction, not opportunistic iteration.
Per the Greyhound Boardroom Pulse 2025, 53% of CISOs and CFOs across U.S. and European enterprises reported increased scrutiny on whether major platform vendors are making meaningful R&D investments aligned to zero-trust, post-quantum cryptography, and policy-governed AI models.
In one Greyhound Fieldnote, a CISO from a Fortune 100 healthcare provider remarked, “You can’t just patch security posture — it has to be architected and funded. IBM’s billion-dollar stake tells us they understand the scale of the threat.” For enterprise security teams under constant pressure to justify infrastructure spend, this investment isn’t marketing. It’s insurance.
However, Greyhound Research believes there’s a cautionary tale here. While embedding quantum-resilient algorithms into z17 is smart, it also increases the dependency on IBM’s toolchain and update cadence. Enterprises must remain vigilant. Crypto-agility only matters if systems remain open, interchangeable, and auditable. If IBM truly wants z17 to be a platform of trust, it will need to open the black box and let customers see under the hood.
IBM has also highlighted its proactive stance, referencing the launch of ‘Threat Detection for z/OS’ — aimed at helping enterprises move from reactive to predictive postures when managing cryptographic threats, especially in the context of ‘harvest now, decrypt later’ scenarios.
AI for IT Operations: From Smart Insight to Self-Healing Infrastructure?
Another pillar of the z17 platform is its AI-driven approach to infrastructure management. IBM is making bold promises: anomaly detection, root cause analysis, real-time remediation, and, eventually, self-healing systems. It’s a seductive vision — infrastructure that manages itself.
IBM has also been emphasising agentic AI — a term it uses to describe systems that autonomously respond to anomalies and threats without human intervention. The narrative is shifting from assistive to autonomous — not just a chatbot for your sysadmin, but a system that anticipates and resolves before impact.
To support this vision, IBM is introducing the IBM Data and AI System for z/OS — a pre-integrated, mainframe-resident software stack designed to handle the full AI lifecycle, from ingestion and model training to secure inferencing and policy enforcement. This is not a tool — it’s a tightly governed AI substrate optimised for z/OS environments.
Per the Greyhound CIO Pulse 2025, nearly 50% of enterprise technology leaders globally now view fragmented AI toolchains as a critical barrier to scaling GenAI use cases beyond pilot stages.
In one Greyhound Fieldnote, an Asia-Pacific banking client told us bluntly: “It’s not that we don’t have AI tools — we have too many, and none of them speak Z.” For them, the appeal of an integrated, mainframe-native AI stack wasn’t simplicity for simplicity’s sake. It was auditability, access control, and runtime proximity — rolled into one governed system.
But we at Greyhound Research see this a little differently. While the theory holds water, the real-world implementation often falls flat. Effective AI for operations depends on visibility — and visibility depends on complete, clean, and well-integrated telemetry. Without it, AI remains blind.
In our observation, many enterprises still struggle with fragmented observability stacks stitched together from cloud-native tools, on-premise legacy monitors, and third-party loggers. In such environments, IBM’s claims of predictive incident prevention need to be met with healthy scepticism.
At Greyhound Research, we recommend CIOs go beyond the marketing deck. Ask for real data: reduced MTTRs, fewer false positives, and lowered operational toil. If IBM can prove it, z17 may well become the most intelligent piece of infrastructure in the data centre. But until then, it remains a promise in progress.
AI Assistants for Application Development: Legacy Meets LLMs
We at Greyhound Research believe the most transformative feature of z17 is the introduction of AI assistants into the software development lifecycle — particularly for legacy workloads. In many large enterprises, core systems are still written in languages like COBOL, maintained by an increasingly ageing and shrinking talent pool.
IBM’s AI assistant aims to address this head-on by providing developers with contextual help, code explanations, refactoring suggestions, and modernisation pathways. It’s like a Copilot, but one trained not just on open-source JavaScript but on the deeply entrenched enterprise languages that still run global commerce.
This is not just helpful — it could be existential. Without some form of AI-powered assistance, many enterprises will find themselves frozen in place, unable to evolve critical systems due to sheer human bandwidth constraints. IBM’s approach here feels both necessary and overdue.
Of course, the assistant will need to be more than just a chatbot. It must integrate with modern CI/CD pipelines, support polyglot environments, and remain extensible to the realities of enterprise software. Anything less, and it risks becoming another niche tool in a growing sea of underused developer productivity add-ons.
IBM has also confirmed that Watsonx Code Assistant for Z is already in use in enterprise environments, trained not only on public data but also on customer-specific COBOL codebases. The assistant is designed to run on-prem, supporting regulated industries that demand full control of their development and deployment pipelines.
Equally important, IBM has engineered the assistant to integrate with CI/CD toolchains — allowing mainframe development to plug into broader DevOps pipelines instead of sitting on the sidelines.
Per the Greyhound Developer Pulse 2025, 68% of mainframe engineering teams in global financial and government enterprises reported that the lack of CI/CD integration was the top blocker to attracting modern developer talent.
In one Greyhound Fieldnote, a DevOps lead at a U.S. insurance firm told us, “The first thing our new hires ask is whether COBOL lives in Git. If the answer is no, they won’t stay.” The decision to embed Watsonx Code Assistant within existing software pipelines isn’t a developer convenience. It’s a talent strategy.
IBM also shared its active work with the Z Academy, running regional career fairs and building pipelines from universities into mainframe engineering roles. These efforts are particularly focused on grooming next-gen COBOL talent and equipping them with AI-first mindsets.
As one IBM executive put it during the call, the aim isn’t just to bring people to the mainframe but also to bring the mainframe to people — redesigning tooling so that even engineers without prior Z experience can contribute meaningfully to enterprise workloads. We at Greyhound Research appreciate this ambition, but how it lands in the real world, only time will tell.
The New Hybrid Reality: Not Cloud or Mainframe — Both
While assessing this announcement, we at Greyhound Research also came to appreciate the strategic shift in IBM’s messaging: the mainframe is now firmly grounded in a hybrid cloud world. Rather than compete with hyperscalers, IBM is positioning z17 as the trusted anchor — the sovereign, secure, AI-accelerated control plane that can integrate seamlessly with public cloud providers.
In our advisory work, we’re already seeing patterns emerge. Enterprises are training AI models in the cloud, where capacity is abundant, but pulling those models back in-house for inference and decisioning where latency, cost, and compliance are critical. Enterprises can potentially use z17 as their system of record while running analytics and visualisation layers on cloud-native platforms.
The platform’s architecture is engineered for ensemble AI — a layered model approach combining predictive models and LLMs to deliver enhanced accuracy. The Telum II processor and Spyre Accelerator are designed to support ensemble AI methods, which leverage the strengths of multiple AI models to improve prediction performance and accuracy. For example, in insurance claims fraud detection, combining traditional neural networks with large language models enhances the assessment’s precision and reliability. IBM is especially bullish on this for use cases like fraud detection, compliance automation, and insurance claims processing.
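The ensemble idea is straightforward to sketch. In the toy example below (our illustration; the two scorers stand in for a real neural network and an LLM, and the weights and threshold are invented), a structured-data fraud score is blended with a score derived from the claim narrative:

```python
# Minimal ensemble-scoring sketch for claims fraud (our illustration;
# the two scorers below stand in for a real neural network and an LLM).

def predictive_score(claim):
    """Stand-in for a traditional fraud model scoring structured fields."""
    return 0.9 if claim["amount"] > 10_000 and claim["new_payee"] else 0.2

def llm_score(claim):
    """Stand-in for an LLM assessing the free-text claim narrative."""
    suspicious = {"urgent", "wire", "overseas"}
    words = set(claim["narrative"].lower().split())
    return min(1.0, 0.4 * len(suspicious & words))

def ensemble_decision(claim, w_pred=0.6, w_llm=0.4, threshold=0.5):
    """Weighted blend of both signals - the ensemble AI pattern."""
    score = w_pred * predictive_score(claim) + w_llm * llm_score(claim)
    return score, score >= threshold

claim = {"amount": 15_000, "new_payee": True,
         "narrative": "Urgent overseas wire requested by claimant"}
score, flagged = ensemble_decision(claim)
print(f"ensemble score {score:.2f}, flagged: {flagged}")
```

The design point is that neither signal alone is trusted to decide: the structured model catches numeric anomalies, the language model catches narrative red flags, and the weighted blend reduces both false positives and misses.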
This model — distributed, loosely coupled, and trust-centric — is the architecture of the future. But it doesn’t come easy. IBM must reduce the complexity of integration, offer APIs that are truly open, and lower the threshold for hybrid orchestration. The vision is compelling, but it needs to be matched with usability.
During a recent discussion, IBM reiterated its support for ONNX and emphasised that Machine Learning for z/OS can now orchestrate inferencing across Telum, Spyre, and cloud-based GPUs. This means enterprises can train models on public cloud infrastructure and deploy them securely on-prem for inference without workflow fragmentation.
As part of this pivot, IBM has tightened integration between Red Hat OpenShift and IBM z/OS Container Extensions (zCX), enabling containerised workloads to run natively on Z — without requiring Linux partitions or parallel infrastructure. This expands Z’s relevance to developer teams standardised on Kubernetes while preserving governance models anchored in z/OS.
Per the Greyhound Developer Pulse 2025, 7 in 10 enterprise developers across the U.S. and EU expressed frustration with mainframe integration gaps in CI/CD toolchains — especially around container portability and dependency resolution.
In one Greyhound Fieldnote, captured during a cloud-native pilot at a European utility provider, the lead architect shared that zCX allowed them to “finally break the cycle of building twice — once for compliance, and again for deployment.” With zCX and OpenShift alignment, the mainframe no longer feels like an operational outlier. It starts to behave like an integrated, governed node in the DevOps pipeline.
The orchestration layer built into Machine Learning for z/OS allows for intelligent routing between Telum, Spyre, and off-platform GPUs. IBM clarified that this routing is automatic and transparent to the developer — ensuring latency and performance targets are met without code rewrites.
This orchestration is now anchored in z/OS 3.1, IBM’s latest system software, which plays a pivotal role in routing AI workloads across Telum II, Spyre, and even cloud-based GPUs — all without forcing rewrites or workflow reengineering.
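The routing idea can be made concrete with a small sketch. The policy below is entirely our own illustration — the target names, latency figures, and selection rule are invented assumptions, not the internals of Machine Learning for z/OS — but it captures the shape of the decision: pick an engine that satisfies the latency budget and any data-residency constraint, without the developer changing code.

```python
# Hypothetical routing policy: target names, latency figures, and rules are
# illustrative assumptions, not IBM's Machine Learning for z/OS internals.

TARGETS = {
    "telum_ii":  {"latency_ms": 1,  "on_prem": True},   # on-chip AI unit
    "spyre":     {"latency_ms": 5,  "on_prem": True},   # attached accelerator card
    "cloud_gpu": {"latency_ms": 40, "on_prem": False},  # off-platform GPU
}

def route(latency_budget_ms: int, data_must_stay_on_prem: bool) -> str:
    """Pick a target that satisfies latency and residency constraints."""
    candidates = [
        name for name, t in TARGETS.items()
        if t["latency_ms"] <= latency_budget_ms
        and (t["on_prem"] or not data_must_stay_on_prem)
    ]
    if not candidates:
        raise ValueError("no target satisfies the constraints")
    # Prefer the slowest acceptable engine, keeping the fastest ones
    # free for the tightest SLAs.
    return max(candidates, key=lambda name: TARGETS[name]["latency_ms"])

print(route(latency_budget_ms=2, data_must_stay_on_prem=True))     # prints "telum_ii"
print(route(latency_budget_ms=100, data_must_stay_on_prem=False))  # prints "cloud_gpu"
```

Hiding this decision behind the orchestration layer is what makes the “no code rewrites” claim credible: the application asks for an inference, and the platform decides where it runs.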
Per the Greyhound Developer Pulse 2025, 61% of platform engineering leads across North America and Europe flagged orchestration gaps as the top reason why hybrid AI models stall between training and production. That stat lands hard in enterprise contexts where inferencing must run close to the data, not just in the cloud.
In one Greyhound Fieldnote, a European insurance CIO shared that early attempts to bridge mainframe and cloud inferencing “kept falling through the cracks” due to telemetry mismatches and poor routing logic. The clarity offered by z/OS 3.1 — especially in how it integrates ONNX and TensorFlow-based models — is now being seen as a forcing function to revisit hybrid AI pipeline design, not just infrastructure posture.
IBM described this interoperability as a cornerstone of the mainframe’s evolution — stressing that models trained using open standards like ONNX and TensorFlow can be reliably executed on Z, preserving investment in cloud-native model training.
Per the Greyhound Developer Pulse 2025, 63% of ML engineering teams in global regulated industries reported delays in GenAI rollout due to incompatibility between model training environments (public cloud) and runtime constraints (on-prem or mainframe).
In one Greyhound Fieldnote, a data science lead at a large European bank remarked, “We stopped counting how many models sat in limbo after training — ONNX support on Z is the first time we’ve seen an actual runway to deployment.” For many teams, this isn’t just technical convenience — it’s the removal of a long-standing operational bottleneck.
Final Word: The Renaissance Must Be Earned, Not Announced
IBM’s z17 platform is not just another mainframe update. It is a foundational rethink of what enterprise infrastructure can and should be in the age of artificial intelligence, data sovereignty, and cyber risk. The chips are intelligent. The architecture is hybrid. The promise is bold.
IBM Z isn’t just fast — it’s a battle-hardened veteran. It’s stood firm through digital storms, regulatory pressure, and evolving application demands — boasting 99.999999% availability, according to ITIC’s 2023 reliability survey. That’s eight nines. In a world where uptime is revenue, this isn’t a feature — it’s a lifeline.
The platform also introduces a fine-grained voltage control loop that brings a 15–20% power reduction at the socket level, reinforcing its sustainability credentials. Additionally, the Telum II processor’s design emphasises energy efficiency — and that thinking extends to Spyre. Each Spyre accelerator card consumes no more than 75W, a modest footprint for the kind of AI processing power it delivers. It’s a pragmatic blend of performance and responsibility, one that resonates strongly with enterprises under increasing ESG scrutiny.
Beyond power efficiency, IBM also emphasised z17’s ability to reduce physical footprint — enabling customers to consolidate infrastructure from four frames to three or three to two — freeing up space while lowering energy consumption and cooling overhead.
“z17 makes more possible — more AI, more resilience, more trust” is how IBM likes to position its latest mainframe launch.
At Greyhound Research, if there is one thing that we have learned over two decades of tracking this sector, it is that technology promises don’t move markets — execution does. Enterprises will need proof that Telum II and Spyre can handle real workloads with real ROI. They’ll demand transparency in quantum-safe tooling, extensibility in AI assistants, and simplicity in hybrid deployment. Proof, not posture, will determine whether this renaissance delivers.

Analyst In Focus: Sanchit Vir Gogia
Sanchit Vir Gogia, or SVG as he is popularly known, is a globally recognised technology analyst, innovation strategist, digital consultant and board advisor. SVG is the Chief Analyst, Founder & CEO of Greyhound Research, a Global, Award-Winning Technology Research, Advisory, Consulting & Education firm. Greyhound Research works closely with global organizations, their CxOs and the Board of Directors on Technology & Digital Transformation decisions. SVG is also the Founder & CEO of The House Of Greyhound, an eclectic venture focusing on interdisciplinary innovation.
Copyright Policy. All content contained on the Greyhound Research website is protected by copyright law and may not be reproduced, distributed, transmitted, displayed, published, or broadcast without the prior written permission of Greyhound Research or, in the case of third-party materials, the prior written consent of the copyright owner of that content. You may not alter, delete, obscure, or conceal any trademark, copyright, or other notice appearing in any Greyhound Research content. We request our readers not to copy Greyhound Research content and not republish or redistribute them (in whole or partially) via emails or republishing them in any media, including websites, newsletters, or intranets. We understand that you may want to share this content with others, so we’ve added tools under each content piece that allow you to share the content. If you have any questions, please get in touch with our Community Relations Team at connect@thofgr.com.
Disclaimer. This research note has been developed by Greyhound Research in collaboration with IBM. While this partnership supports broader industry awareness, Greyhound Research has maintained full editorial independence and control throughout the creation of this content. All insights, analysis, and recommendations reflect the views of Greyhound Research alone and have not been influenced by any external party. Please refer to the Copyright Policy above for full details on permitted usage. For questions, licensing requests, or media inquiries, contact our Community Relations Team at connect@thofgr.com.
