Looking Ahead at Intel’s Secret Exascale Architecture

November 14, 2017

There has been a lot of talk this week about what architectural direction Intel will be taking for its forthcoming exascale efforts. As we learned when the Aurora system (expected to be the first U.S. exascale system) at Argonne National Lab shifted from the planned Knights Hill course, Intel was seeking a replacement architecture—one that we understand will not be part of the Knights family at all but something entirely different.

Just how different that will be is up for debate. Some have posited that the exascale architecture will feature fully integrated hardware acceleration (no offload model needed for codes) with Intel’s own GPU variant. Some see possibilities for an ultra-heterogeneous chip with integrated FPGA (based on the recent Altera acquisition) combined with GPU and other workload-specific acceleration. Others think it might look something like the PEZY processor out of Japan, which is proving its mettle on top-ranked systems with its unique 2048-core design. NEC and others with vector-based architectures showcase their wares well on real-world HPC benchmarks. Others still see efforts like Intel’s recent investments in neuromorphic and quantum technologies as indicative of a novel architecture push.

According to Barry Davis, GM of the enterprise and HPC group at Intel, the reality is that this architecture needs to be production-ready by 2021, which is not much time, especially if there is software footwork required from users. That pares the options down, but also raises several questions about the approach. From what we could gather, a pure CPU is the target: nothing that requires fancy offload models or novel approaches to programming or thinking about problems. The following Q&A we did with Davis seeks to shed light on some of these theories (keeping in mind that NDA briefings on what this will be have only just begun for the wider HPC community at SC this week).

NH: When did your teams at Intel realize there was going to be a change in architectural direction—more specifically, talk about the process of re-envisioning your approach to exascale and what you had to reroute and how?

Davis: I can’t talk about specific dates, but I can say that, as we know, within the last couple of months the DoE talked about the fact that they are pulling the exascale timeline in to 2021 and that they would be working with us. Subsequently, we said we would take our Knights Hill investments and focus those on this new exascale platform. I can’t say we have been about this for a long time, but we are accelerating.

The architecture we are moving toward for exascale is not something we just dreamed up in the last six months or even year. We’ve been working on this for a long time. What the two-year pull-in of the timeline to 2021 does is accelerate our roadmap. We could shift quickly to meet that because we’ve been working on this for a while. A two-year pull-in was not easy, and we were already trying to accelerate the roadmap before this and consider how to bring in some of the future Xeon implementations and bring those closer into the market. We are not doing something just for exascale; it was something we planned to do, so this is a lot easier to do than it may seem.

NH: Does this boosted timeline to deliver an exascale chip by 2021 change your process roadmap?

Davis: Everything is hitting within a window where we had already targeted that process node in that timeframe. It’s not like we had to say, and this is just an example not a statement about the chip, that to make this work we had to move this from 14nm to 10nm. This was our timeframe for this processor anyway.

NH: Is the software side of this exascale processor story something that will require a lot of lead time for HPC developers to get up to speed with? Are there elements that will be unfamiliar or that will lead to refactoring or changing codes?

Davis: There is definitely an ecosystem here we have to work toward. There is definitely software, but there always is.

It will not be disruptive to the ecosystem but yes, we will have to engage the ecosystem well in advance of 2021 to get codes ready. We are talking to people under NDA about this here at SC17 this week to start that conversation.

Since we are on a CPU path here, this is not going to be a strategy that completely disrupts the ecosystem. We want to run this up the middle with existing models (OpenMP as an example) but there is enablement that needs to happen.

NH: What challenges that we have not touched on yet do you see from your own production perspective, and what, other than software readiness, will be the hurdles for users?

Davis: From an Intel perspective this is a moonshot; this is a big deal. Working with our partners to create one of the first exascale platforms in the world is a big challenge. I’m not sure Intel has specific challenges other than that we do a lot as a company, and the one way to address that is to give this importance internally. We address the challenges that are present here daily in terms of CPU design and packaging and systems; this will exercise all of our muscles. The point is that we need to execute this well. We have enough time. This isn’t next year; there are a few years to figure out what needs to be done, and we do that by partnering with the ecosystem and DoE and working together.

From an ecosystem and user perspective, there’s always a software challenge. What will be difficult for people on this one is the question of what makes sense to run at that scale. In other words, what are the applications or workloads that are going to be able to take advantage of this kind of capacity? Which grand challenges? Part of that is integrating everything together. As I said before, modeling and simulation, AI, and high performance data analytics all need to be first-class citizens on this platform. You should be able to run all those workloads well and create a workflow that allows the scientist or user to do the task. A good example is running modeling and simulation, then doing AI and then analytics on that and running it all back through; that is all three workloads. It will be hard for users to do that effectively. It changes their thinking about what they do, which used to be one area (modeling and simulation, for instance). This isn’t just for exascale; it’s for all of HPC, but it is more pronounced on an exascale machine.

NH: What architectural directions seem appropriate for exascale and how did the existing Knights roadmap not fit with that vision?

Davis: We can talk about the right architecture for exascale in general but of course, I can’t be specific [about the Aurora deal] and I’m not going to say “Knights had X and the new Knights has Y”. The needs of the next generation architecture for exascale are fairly well-documented. There’s power, cooling, space, and interconnect, and that last one is a tremendous issue: creating a low latency technology that can communicate at that scale. There is also compute and storage. There’s a lot to address here.

What we have said publicly is that we are working on a new platform for exascale. That will draw from the best of the Intel portfolio to create a global exascale platform—global because while we’re working with the Department of Energy in the U.S., exascale is a global issue. But what do we need?

We need technology to allow us to compute and accelerate your workloads but do that without specialized code. Everyone talks about accelerators and those are great, but for exascale, an application you bring to that platform should not have to be an offloaded application that you port for that particular accelerator. We need something broader that can scale across all of the workloads.

Also keep in mind that the mandate of exascale is to address not just traditional modeling and simulation but also AI and high performance data analytics. You need the right level of performance—and performance per watt across those areas.

NH: There are a lot of architectural contenders right now. How does what you’re working on for the bumped-up exascale chip fit into those?

Davis: You’re talking about hardware architectures but I like to think of things in terms of workloads. So there are, like you say, FPGAs, GPUs, vector machines and things like the PEZY chips in Japan (as one example). Many of those are offload for particular codes that have to be sent over a bus for execution. That’s fine but we are a CPU company. We like to think about codes executing on our platform without the latency and offload. There are areas where offload works well, but it doesn’t work for everything, and CPUs do.

NH: Based on our conversation so far, it does not sound like we are looking at exotic or novel architectures here. Are there novel components? How do your recent investments in truly novel areas like quantum and neuromorphic fit into a current exascale (or post-exascale) future?

Davis: The neuromorphic and quantum efforts are from the labs and are great research programs at the bleeding edge of compute. The work we’re doing for exascale is not connected from that perspective. If we are bringing something to the market by 2021 and scaling it to this level, it can’t be a novel architecture. Those research areas have promise for the future, and that’s why we have invested at a future level, but for exascale, we need technologies that are grounded in reality and close to ready for primetime.

There are a lot of processor options; some with volume, some without. There are PEZY machines and there’s volume there, and GPUs and there’s definitely volume there. Truly novel architectures mean you have to change how you think about things. As long as tools and applications and development activity are similar to what we know, it is not novel. I can’t say if what we are doing is novel, I’ll let you draw your own conclusions, but to me, novel means big changes for end users and, as we discussed, we are not trying to be disruptive to the ecosystem.

Categories: HPC, SC17