
Exclusive interview with Intel’s John Hengeveld about parallelism

KitGuru has long been fascinated with the increased levels of parallelism in modern processors. The move to Core showed the world that doing more work per cycle is usually much more powerful than simply cranking up the clock. Intel's research teams rapidly embraced this increased parallelism, and now we're moving toward a world unlike any we've seen before. KitGuru managed to get up close and personal with John Hengeveld, Director of Marketing inside the Data Centre Group at Intel. We'd normally re-write Q&A sessions into a stream of prose, but this one (ironically) works really well in sequence. Our questions come first in each pair – just in case you're not sure.

Abstract concepts are often best understood with real-world analogies. What analogy best highlights the difference between a computer system with very low or no parallelism and one that is extremely parallel?

I like to use the analogy of a highway and its lanes. A highway that has only one lane gets fewer people to work than a highway with eight lanes. Computer systems with parallelism allow more work to flow through. Which is better at rush hour: a six-lane highway with a 55mph speed limit, or a one-lane highway with a 70mph speed limit? The six-lane highway gets roughly five times more cars through per hour.

An HPC workload is rush hour traffic: lots of pieces of code all trying to get someplace at the same time. A traditional Xeon processor has a small number of lanes (let’s call it six for this example) and the speed limit is 350mph. A MIC processor is a 50+ lane highway – and let’s call the speed limit 150mph. When you have a special assignment and lots of small cars, the MIC processor does better. If you only have a few big or huge cars and they can all go 350mph, then the Xeon solution is better.

Haven’t computers been working in parallel for years? What’s changing with the new Intel technologies?

Yes, computers have been working in parallel for a long time, and Intel has a long history of building more lanes onto the highway. You can take many of today's computers and have them work in parallel and get some level of performance – like building a bypass. What is changing here is performance per watt. Not only does MIC give you more lanes, it also substantially reduces the power required for the computation. It’s like all the cars on the highway getting better fuel efficiency just because they are on this road.

Another way to achieve parallelism is to build a special-purpose accelerator, like a GPGPU or an FPGA. The challenge there is that it takes a lot more work to develop an application for these devices. Think of this as mass transit. If you do the work to get to where it can pick you up, and the work to get from where it drops you off to where you are actually going, and you don’t mind the extra time and cost involved… mass transit might get you someplace. But with Intel MIC, you can use your own car (programmed with Intel standard tools and technologies). You not only gain fuel efficiency, you also get through more work, sooner.
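
To put that 'own car' point in more familiar terms, here is a minimal sketch of our own (not Intel sample code, and the function is purely illustrative): an ordinary C++ loop made parallel with a standard OpenMP directive. The source stays plain C++ built with the usual compiler; the directive simply lets the iterations spread across whatever cores are available, rather than requiring a rewrite for a separate accelerator toolchain.

    // Hedged illustration only: standard C++ plus an OpenMP directive.
    #include <cstddef>
    #include <vector>

    void scale(std::vector<double>& data, double factor) {
        #pragma omp parallel for                    // iterations run across available cores
        for (std::ptrdiff_t i = 0; i < (std::ptrdiff_t)data.size(); ++i) {
            data[i] *= factor;                      // each iteration is a "car" in its own lane
        }
    }

Build it with the compiler's OpenMP switch (for example -fopenmp) and the same code runs serially or in parallel; nothing in it is tied to a special-purpose device.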

From a hardware perspective, what are the biggest challenges as you move forward? How about from a software perspective?

The biggest challenges in getting to Exascale are power efficiency and managing programming complexity. Intel is committed to working with industry, government and research partners to try and reach, by the end of the decade, an Exascale machine within 20MW of power per system. This is a big challenge that will take a lot of work and new learning on many people’s part.

Key to Intel’s approach is preserving a straightforward programming model, so that a larger range of workloads can take advantage of such a machine. The harder it is to program, the fewer applications will be developed.

Customers are hearing a lot of new words right now, like MIC, Knights Corner and Cilk – what’s the ‘single line explanation’ for these?

MIC is Intel Many Integrated Core. It’s an adjective, followed by a noun: ‘architecture’, ‘product’, ‘products’, ‘research’, ‘programming model’. It is an architectural approach that tackles highly parallel problems with a larger number of smaller, more specialised IA cores than is used for general-purpose workloads and systems.

Knights Corner: Intel code name for the first commercial Intel MIC architecture hardware product.

Cilk: A programming language, based on C++, for expressing parallelism. Well worth a trip to Wikipedia.
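
For the curious, here is a tiny Cilk-flavoured sketch of our own (illustrative only, and it needs a compiler with Cilk support): cilk_spawn lets a call run in parallel with the rest of the function, and cilk_sync waits for the spawned work to finish.

    // Illustrative sketch, assuming Intel Cilk Plus keyword support.
    #include <cilk/cilk.h>

    long fib(int n) {
        if (n < 2) return n;
        long a = cilk_spawn fib(n - 1);   // this call may run on another core
        long b = fib(n - 2);              // this one runs in the current strand
        cilk_sync;                        // wait for the spawned call to complete
        return a + b;
    }

The appeal is exactly what John describes elsewhere in this interview: the parallelism is expressed with a couple of keywords inside otherwise ordinary C++.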

Kirk Skaugen, vice president and general manager of Intel's Data Center Group, holding the 'Knights Ferry' MIC card at the International Supercomputing Conference in Hamburg, Germany. Due to launch inside six months, or so we believe.

What is an intrinsic – how can a normal person understand this kind of terminology?

Think of a compiler like an automatic transmission for a car. Most of the time the automatic transmission takes its inputs and configures the car with the right gears to drive smoothly and efficiently to your destination.

Most of the time a compiler takes your programming language and pretty much knows how to do the right thing to get the work done.

But sometimes you face a steep hill, a slippery road or a dangerous condition where you, the driver, want specific control over how the car operates. Think of an intrinsic as an on-demand manual transmission.

An intrinsic is a programming language element that directs the compiler on how to use the underlying machine for a specific element of data or code. It is commonly used in vectorization and parallel programming, and it allows the programmer to direct the compiler much more specifically, while staying in a high-level language, on how to use the underlying hardware.
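
To make that concrete, here is a small illustration of our own (not Intel sample code): the SSE intrinsic _mm_add_ps looks like an ordinary C function, but the compiler maps it almost one-to-one onto a single vector instruction that adds four floats at once. The programmer is taking the gearstick rather than leaving everything to the automatic.

    // Hedged illustration: explicit SSE intrinsics from <xmmintrin.h>.
    #include <xmmintrin.h>

    void add4(const float* a, const float* b, float* out) {
        __m128 va = _mm_loadu_ps(a);             // load four floats from a
        __m128 vb = _mm_loadu_ps(b);             // load four floats from b
        _mm_storeu_ps(out, _mm_add_ps(va, vb));  // one vector add produces all four results
    }

An auto-vectorising compiler may or may not generate the same code from a plain loop; the intrinsic removes the doubt, at the cost of tying the source to that particular instruction set.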

On a recent call with the Intel experts, KitGuru got the impression that you have already invested a lot of time, effort and resource toward bringing this parallel evolution to market, but it seems that more is needed from ‘outside’, specifically in the area of development languages. With something as ubiquitous as C, where does the responsibility lie in making this language much more ‘parallel friendly’?

There is a great deal of innovation going on in this space. As with most things, Intel looks at this as an industry and ecosystem challenge.

New methodologies are constantly emerging. Intel’s tools for software development create a platform for the industry to innovate upon. The whole software industry has to take this on.

Given that graphics has traditionally been a slower but more parallel environment, is it fair to say that Intel has gained new insight about ways to develop and evolve next-generation processors AFTER working through the Larrabee project?

Yes, absolutely.

Intel has taken the learning from the Larrabee project [I keep promising THIS will be the last time I type Larrabee! – Interviewee note], as well as years of research at Intel Labs and learning from our mainstream HPC business, and applied it to the MIC activities and indeed to our whole product line. Yes, Intel and our partners have learned a lot, and we've gained new insight into how future processors can be better mapped onto high-performing, highly efficient systems.

Last question: with Intel’s ongoing advances in hardware, and this drive toward new thinking, languages and environments, when will an Intel ‘CPU’ have human-brain levels of processing power?

It’s a ways off. Inorganic computation today is organized around perfect data storage and retrieval and fast calculation. Organic computation (like the human brain) is organized around observation, intuition, inference and learned response. The logic elements of the human mind (neurons) have fascinating properties of adaptation that inorganic computation labours hard to reproduce. We have many years of research to go before we will truly be able to wrap our technology around this question. And even if we DO gain the ability to create this level of computation, is it economically valuable to do so, such that investment will occur and profitability will follow? There is remarkable technology in both human and inorganic computation, each optimized for a purpose.

This is fascinating stuff for KitGuru. The legal and moral issues that a human level of parallel computing would bring are simply mind-boggling, and yet that is definitely something on our horizon. At KitGuru, we like to get a feel for the person behind the voice, so we presented John Hengeveld with our standard array of mind-probing questions. Infer what you will from the replies.

First up, what are your favourite bands?
John replied, “The Beatles, Matchbox 20, Cake and Lady Gaga. Yes. Lady Gaga. And in the same sentence as the Beatles!”

What about cars?
“Easy, Mustang built half way through 1964”.

If John had to star in a film, opposite one leading actress, who would that be?
“Drew Barrymore”.

KitGuru is a great lover of food as well as technology. If John were cooking for himself, what’s his favourite food?
“I am a serious chef. Panko-and-pepper-encrusted, pan-seared ahi tuna with a raspberry wasabi vinaigrette”. Gosh.

What food can you only get 100% right in a restaurant?
“Molecular Gastronomy”, he replies. “It requires so much technology to deliver. Proper desserts.”

Given that Intel's kind of science did not exist when you were a kid – what did you want to be when you were 14?
“Naval Officer”.

How much parallelism would you need to be able to distinguish complex flavours in realtime?

KitGuru says: We thank John for taking the time to share his insights into the amazing future of parallelism. With the move to a 7nm process by the end of 2016 already well known, and a series of boosts to the levels of parallelism available already lined up, we're going to be living in a very different tomorrow. One question remains with KitGuru: if parallelism truly takes off, how many of us will actually need to know how it works in order to use it? Will the pool of hardcore programmers needed increase or decrease with time? All fascinating stuff.

Picture of John courtesy of ITTV @ Siggraph 2010


8 comments

  1. That was a great interview, learned a few interesting things too. Not the usual boring ‘bleh’.

  2. A company with many very, very intelligent people in it. I generally loathe (and I mean loathe) interviews, as they usually involve some random dude reading out a PR pamphlet from a drawer. This was really interesting. More!

  3. “The whole software industry has to take this on.” – well said, couldn’t agree more. But will it happen? I can’t see it anytime soon, unfortunately.

  4. How on earth can you have The Beatles and Lady Gaga in the same sentence :p

    Apart from that, I applaud the direction and focus on this.

  5. They should do these more often, the analogies were educational. I’m not the brightest so I need things explained in basic terms.

  6. Enjoyed this with my tea this afternoon, thanks.

  7. How much will these cards cost?