Showing posts with label NAG. Show all posts
Showing posts with label NAG. Show all posts

Friday, 10 January 2020

Over a decade of HPC consulting success at NAG

From small beginnings ...


It is almost 12 years since I joined NAG to build and lead the HPC consulting and services business. Over that time, we have built a consulting business from a tiny start to its current thriving status. We have helped a wide range of customers around the world of High-Performance Computing (HPC) and related areas such as cloud computing and machine learning by providing training and tutorials, multi-year professional services contracts, benchmarking services, focused consulting projects, impartial procurement expertise, strategic and technical advice, and more.

Protecting our customers' confidentiality and competitive advantages has been a strong theme of our success, which is why we have rarely been able to name our customers. We have helped many of the big oil & gas companies, plus several smaller ones, aerospace companies, manufacturing companies, automotive companies, public supercomputer centres, universities, government organisations, sports entities, HPC and cloud vendors, entertainment industry, and others.

The trusted position we have earned in the HPC community is arguably unique and will be difficult to replicate. There are very few other organisations worldwide who can genuinely offer the expertise, experience, impartiality and integrity that NAG delivered.

HPC requires expertise - technical and business


HPC, whether traditional simulation, or using on-premises supercomputers, or combined machine learning and simulation, or in the cloud, is hard. Creating a robust and compelling business case for investment is not easy. Reducing the risk of decisions in strategic direction, technology selection, staffing, software development, is not easy. Finding skilled HPC programmers is not easy. Delivering cost-effective and high-impact HPC services (rather than just standing up a machine) is not easy.

The current era of technology diversity in the HPC world is good for innovation and competitiveness. HPC buyers and users clearly benefit from this with better capabilities and pricing, but they must also manage the uncertainty and risk that the increased decision spaces create. Which CPU? On-premises vs cloud? Which cloud solution? Which system architecture? Which business model?

Over the last decade, we have helped customers and friends solve these challenges. The range of issues and number of customers impacted continue to grow.

I hope NAG has a bright future ahead with the new CEO, the healthy market opportunities, and the vision developing within the Executive Team. I expect NAG will continue to be a rare source of proven expertise in techncial computing.

Friday, 29 September 2017

Finding a Competitive Advantage with High Performance Computing

High Performance Computing (HPC), or supercomputing, is a critical enabling capability for many industries, including energy, aerospace, automotive, manufacturing, and more. However, one of the most important aspects of HPC is that HPC is not only an enabler, it is often also a differentiator – a fundamental means of gaining a competitive advantage.

Differentiating with HPC


Differentiating (gaining a competitive advantage) through HPC can include:
  • faster - complete calculations in a shorter time;
  • more - complete more computations in a given amount of time;
  • better - undertake more complex computations;
  • cheaper - deliver computations at a lower cost;
  • confidence - increase the confidence in the results of the computations; and 
  • impact - effectively exploiting the results of the computations in the business.
These are all powerful business benefits, enabling quicker and better decision making, reducing the cost of business operations, better understanding risk, supporting safety, etc.

Strategic delivery choices are the broad decisions about how to do/use HPC within an organization. This might include:
  • choosing between cloud computing and traditional in-house HPC systems (or points on a spectrum between these two extremes);
  • selecting between a cost-driven hardware philosophy and a capability-driven hardware philosophy;
  • deciding on a balance of internal capability and externally acquired capability;
  • choices on the balance of investment across hardware, software, people and processes.
The answers to these strategic choices will depend on the environment (market landscape, other players, etc.), how and where you want to navigate that environment, and why. This is an area where our consulting customers benefit from our expertise and experience. If I were to extract a core piece of advice from those many consulting projects, it would be: "explicitly make a decision rather than drift into one, and document the reasons, risk accepted, and stakeholder buy-in".

Which HPC technology?


A key means of differentiating with HPC, and one of the most visible, is through the choice of hardware technologies used and at what scale. The HPC market is currently enjoying (or is it suffering?) a broader range of credible hardware technology options than the previous few years.

Monday, 19 June 2017

Cutting through the clutter of ISC17: Monday lunchtime summary

ISC, the HPC community's 2nd biggest annual gathering, in fully underway in Frankfurt now. ISC week is characterized by a vibrant twitter flood (#ISC17), topped up with a deluge of press releases (a small subset of which are actually news), plus a plethora of news and analysis pieces in the HPC media. And, of course, anyone physically present at ISC, has presentations, meetings, and exhibitors further demanding their attention.

I go to ISC almost every year. It is a valuable use of time for anyone in the HPC community or who uses, or has an interest in, HPC even if they don't see themselves as part of the HPC community. However, I have decided not to attend ISC this year, due to other commitments. However, I will keep an eye on the "news" throughout the week and post a handful of summary blogs (like this one), which might be a useful catch-up on "news" so far, whether you are attending ISC or watching from afar.

Sunday, 2 June 2013

Supercomputing goes to Leipzig - a preview of ISC13

I have written my preview of ISC13 over at the NAG Blog ... a new location, Tianhe-2, MIC vs. GPU, industry, exascale, big data and ecoystems. Not quite HPC keyword bingo but close :-)

See you there!

Tuesday, 2 October 2012

The first mention of SC12

It's that time of year again. SC has started to drift into my inbox and phone conversations with increasing regularity - here comes Supercomputing 2012 in Salt Lake City. Last year, in the run up to SC11 in Seattle, I wrote the SC11 diary - blogging every few days on my preparations and thoughts for the biggest annual event of the supercomputing world.

I'm not sure I'll do such a diary again this year (unless by popular demand - not likely!). However, I will be writing some articles for some publications (HPC Wire and others - see my previous articles) in the coming weeks which will set the scene for SC from my point of view - burning issues I hope will be debated in the community, key technology areas I will be watching, and so on.

In the meantime, if you crave SC reading material, you might amuse yourself by reading my previous fun at SC time (e.g. The top ten myths of SC - in HPC Wire for SC11) or you might even want to translate my fun from ISC (Are you an ISC veteran?) to new meanings at SC.

If you want more serious content, then browse on this blog site (e.g. tagged "events") or on the NAG Blog (e.g. tagged "HPC").

If you find nothing you like - drop me a comment below or via twitter and I'll see what I can do to address the topic you are interested in!

Friday, 4 November 2011

My SC11 diary 10

It seems I have been blogging about SC11 for a long time - but it has only been two weeks since the first SC11 diary post, and this is only the 10th SC11 diary entry. However, this will also be the final SC11 diary blog post.

I will write again before SC11 in HPC Wire (to be published around or just before the start of SC11).

And, then maybe a SC11 related blog post after SC11 has all finished.

So, what thoughts for the final pre-SC11 diary then? I'm sure you have noticed that the pre-show press coverage has started in volume now. Perhaps my preview of the SC11 battleground, what to look out for, what might emerge, ...


Tuesday, 1 November 2011

My SC11 diary 8

It turns out I have to actually do some talking at SC11 as well as listen to others. So one of today's jobs was to start preparing some presentations I will be giving at SC11. My normal habit is to have a custom version of a slide set for each audience/customer. I try to avoid simply re-using the same slide deck for each talk. Obviously I do re-use large chunks of previous presentations but update it, or add/remove content to get the right focus.

Monday, 24 October 2011

My SC11 diary 3

Well, shock, today so far has not been dominated by SC11! "Normal" work (and admin) has been the focus so far today. It is easy at this time of year to scan the headlines in the main HPC news outlets such as HPC Wire, InsideHPC, twitter (!), ... and assume SC is the only thing the HPC world is thinking of right now. The same is true of article preparation emails circulating for specialist publications like The Exascale Report. And it is even true to some extent for publications with a broader remit - e.g. Scientific Computing.

Thursday, 20 October 2011

My SC11 diary 1

The SC11 conference, or just "supercomputing", will be held in Seattle this November. For many in the high performance computing community, SC is the big event of the year. Certainly it is the one that attracts the most press (and press releases), the most attendees, the biggest exhibition, and absorbs the most amount of time in planning before we even get there. It is the event where we get to meet with many of our customers, most of our potential suppliers, and many friends and collaborators.

Friday, 24 June 2011

ISC11 Review

ISC11 - the mid-season big international conference for the world of supercomputing - was held this week in Hamburg.

Here, I update my ISC11 preview post with my thoughts after the event.

I said I was watching out for three battles.

GPU vs MIC vs Fusion

The fight for top voice in manycore/GPU world will be one interesting theme of ISC11. Will this be the year that the GPU/manycore theme really means more than just NVidia and CUDA? AMD has opened the lid on Fusion in recent weeks and has sparked some real interest. Intel's MIC (or Knights) is probably set for some profile at ISC11 now the Knights Ferry program has been running a while. How will NVidia react to no longer being the loudest (only?) noise in GPU/manycore land? Or will NVidia's early momentum carry through?

Review: None of this is definitive, but my gut reaction is that MIC won this battle. GPU lost. Fusion didn't play again. My feeling from talking to attendees was that MIC was second only to the K story, in terms of what people were talking about (and asking NAG - as collaborators in the MIC programme - what we thought). Partly because of the MIC hype, and the K success (performance and power efficient without GPUs), GPUs took a quieter role than recent years. Fusion, disappointingly, once again seemed to have a quiet time in terms of people talking about it (or not). Result? As I thought, manycore is now realistically meaning more than just NVidia/CUDA.

Exascale vs Desktop HPC

Both the exascale vision/race/distraction (select according to your preference) and the promise of desktop HPC (personal supercomputing?) have space on the agenda and exhibit floor at ISC11. Which will be the defining scale of the show? Will most attendees be discussing exascale and the research/development challenges to get there? Or will the hopes and constraints of "HPC for the masses" have people talking in the aisles? Will the lone voices trying to link the two extremes be heard? (technology trickle down, market solutions to efficient parallel programming etc.) What about the "missing middle"?

Review: Exascale won this one hands down, I think. Some lone voices still tried to talk about desktop HPC, missing middles, mass usage of HPC and so-on. But exascale got the hype again (not necessarily wrong for one of the year's primary "supercomputing" shows!)

Software vs Hardware

The biggie for me. Will this be the year that software really gets as much attention as hardware? Will the challenges and opportunities of major applications renovation get the profile it deserves? Will people just continue to say "and software too". Or will the debate - and actions - start to follow? The themes above might (should) help drive this (porting to GPU, new algorithms for manycore, new paradigms for exascale, etc). Will people trying to understand where to focus their budget get answers? Balance of hardware vs software development vs new skills? Balance of "protect legacy investment" against opportunity of fresh look at applications?

Review: Hardware still got more attention than software. Top500, MIC, etc. Although ease-of-programming for MIC was a common question too. I did miss lots of talks, so perhaps there was more there focusing on applications and software challenges than I caught. But the chat in the corridors was still hardware dominated I thought.

The rest?

What have I not listed? National flag waving. I'm not sure I will be watching too closely whether USA, Japan, China, Russia or Europe get the most [systems|petaflops|press releases|whatever]. Nor the issue of cloud vs traditional HPC. I'm not saying those two don't matter. But I am guessing the three topics above will have more impact on the lives of HPC users and technology developers - both next week and for the next year once back at work.

Review: Well, I got those two wrong! Flags were out in force, with Japan (K, Fujitsu, Top500, etc) and France (Bull keynote) waving strongly among others. And clouds were seemingly the question to be asked at every panel! But in a way, I was still right - flags and clouds do matter and will get people talking - but I mainatin that manycore, exascale vs desktop, and the desperation of software all matter more.


 What did you learn? What stood out for you? Please add your comments and thoughts below ...

Friday, 18 March 2011

Performance and Results

[Originally posted on The NAG Blog]

What's in a catch phrase?

As you will hopefully know, NAG's strapline is "Results Matter. Trust NAG".

What matters to you, our customers, is results. Correct results that you can rely on. Our strapline invites you to trust NAG - our people and our software products - to deliver that for you.

When I joined NAG to help develop the High Performance Computing (HPC) services and consulting business, one of the early discussions raised the possibility of using a new version of this strapline for our HPC business, reflecting the performance emphasis of the increased HPC activity. Probably the best suggestion was "Performance Matters. Trust NAG." Close second was "Productivity Matters. Trust NAG."

Thursday, 10 March 2011

NAG out and about

[Originally posted on The NAG Blog]

The NAG website has a section called "Meet our experts - NAG out and about", which gives a list of upcoming events worldwide that NAG experts will be attending or presenting at.


The page also notes: "We regularly organise and participate in conferences, seminars and
training days with our customers and partners. If you would like to talk
to us
about hosting a NAG seminar at your organisation or any training
requirements you might have email us at
sales@nag.co.uk
".


In my own focus of high performance computing (HPC), I have previously written (for ZDNet UK) about some key supercomputing events. For those of you interested in meeting up with HPC experts (especially from NAG!), I have set up a survey of HPC events - please let us know which events you plan to attend in 2011 - and see which events other readers of The NAG Blog are attending.

Saturday, 30 October 2010

Comparing HPC across China, USA and Europe

[Originally posted on The NAG Blog]

In my earlier blog post today on China announcing the world's faster supercomputer, I said I'd be back with more later on the comparisons with the USA, Europe and others. In this morning's blog, I made the point that the world's fastest supercomputer, in itself, is not world changing. But leading supercomputers, critically matched with appropriate expertise in programming and using them, togther with the vision to ensure use across basic research, industry and defence applications can indeed be strategically beneficial to a nation - including real economic impact.



There are plenty of reports and studies describing the strategic impact of HPC within a given organisation or at national levels (some are catalogued by IDC here), so let's take it as a premise for the following thoughts.


Friday, 29 October 2010

Why does the China supercomputer matter to western governments?

[Originally posted on The NAG Blog]

There is a lot of fuss in the mainstream media (BBC, FT, CNET, even the Daily Mail!) the last few days about the world's fastest supercomputer being in China for the first time. And much ado on Twitter (me too - @hpcnotes).



But much of the mainstream reporting, twitter-fest, and blogging is missing the point I think. China deploying the world's fastest supercomputer is news (the fastest supercomputer has almost always been American for decades, with the occasional Japanese crown). But the machine alone is not the big news.


Monday, 13 September 2010

Do you want ice with your supercomputer?

[Originally posted on The NAG Blog]

Would you like ice with your drink?” It’s a common question of course. One that divides people – few will think “I don’t mind” – most have a firm preference one way or the other. There are people who hate ice with their drink and those who freak if there is none. National stereotypes have a role to play – in the USA the question is not always asked – it’s assumed you want ice with everything. In the UK, you often have to ask specifically to get ice.



Yet the role of ice in making our drinks chilled is misleading. I once had a discussion with a leading American member of the international HPC community about this. “No ice”, he was complaining as we headed out of a European country, “they had no ice for the drink”.



I don’t get this obsession with ice”, I chipped in. “What?!” He looked at me as if I were mad. “Why do you like your coke warm?



Ah, but that’s just it”, I replied. “I hate warm drinks – I really like my coke chilled. But surely, in this modern world over a century after the invention of the refrigerator, it’s not unreasonable to expect the fluid to be chilled – without the need to drop lumps of solid water into it?



Ah, fair point”, he conceded.



What has this got to do with supercomputing? Perhaps the common thread is that usually we just accept the habitual choices of ways to do things – and don’t often step back to think – “are those the only choices?



Maybe we should step back a little more often and ask ourselves what we are trying to achieve with HPC – and are the usual choices the only ways forward? Or are there different ways to approach the problem that will deliver simpler, better or cheaper performance?



Perhaps your business/research goals mean you need to conduct more complex modelling or you need faster performance. Maybe the drive of computing technology towards many-core processors rather than faster processors is limiting your ability to achieve this. (I have had several conversations recently, where companies are buying older technology because their software won’t run on multicore).



The “ice or no ice” question might be whether or not to upgrade your HPC with the latest multicore processors. But what about the “just chill the fluid” option? Well, how about upgrading the software instead, or as well?



NAG has plenty of case studies to show where enhancements to software have achieved huge gains in performance or capability (e.g., www.hector.ac.uk/cse/reports).



Sometimes buying more compute power is the right answer. Sometimes, extracting more efficient performance from what you have is the answer. Bringing them together - a balance of hardware upgrades and software innovations might well give you the best chance of optimising cost efficiency, performance and sustainability of performance.

Monday, 19 July 2010

Time Machines and Supercomputers

[Originally posted on The NAG Blog]

I found a Linpack App for the iPhone last week. Nothing special, just a bit of five minute fun. It seems a 3G model achieves about 20 MFLOPS. [Note 1]



What's that got to do with time machines? Well it got me thinking "I wonder when 20 MFLOPS was the performance of a leading edge supercomputer?" Actually, it was before the start of the Top500 list (1993), so finding out was beyond the research I was prepared to do for this blog.



So I thought instead about the first supercomputer I used in anger. As soon as I name it, if anyone is still reading this waffle, you will immediately fall into two camps - those who think I'm too young to be nostalgic about old supercomputers yet - and those who think I'm too old to be talking about modern supercomoputers :-).



It was a Cray T3D.



You're still waiting for the time machine bit ... hang on in there.



My application on that T3D sustained about 25 GFLOPS. Which is about the same as a high end PC of recent years. What this means to me is that anyone who cares to apply the effort today with a high end PC, could get comparable results to that work of 15-20 years ago that needed the supercomputer.



Or, in other words, that supercomputer gave us a 15-20 years time advantage over everyone who didn't have supercomputers - or a few years over others with smaller supercomputers. [Note 2]



That is one of the key benefits of High Performance Computing - the ability to get a result before a competitor - you could say HPC is a time machine for simulation and modelling.



Now for the [Notes] - which actually contain the real story!



Note 1 : It's not really true to say the iPhone 3G can do 20 MFLOPs - all we can say is that particular App achieved 20 MFLOPs on that iPhone 3G. The result is a factor of both the software and the hardware. Better performance can come from optimising the application as much as from buying a more powerful phone.



Note 2 : If fact, even with the same supercomputer, it would be hard for most people to replicate the results - simply because there was as much value in the software (physics, algorithms, performance engineering, implementation, etc) and the associated validation and verification program as there was in the supercomputer.



The supercomputer offered us a time machine. But the attention to performance and scalability in the application enabled us to actually use that time machine to get results faster than others - even if those others used the same supercomputer. And the validation and verification effort meant that we could trust what our time machine was telling us.

Tuesday, 22 June 2010

Technical computing futures part 2: GPU and manycore success

[Originally posted on The NAG Blog]

In my previous blog, I suggested that the HPC revolution towards GPUs (or similar many-core technologies) as the primary processor has a lot in common with the move from RISC to commodity x86 processors a few years ago. A new technology appears to offer cheaper (or better) performance than the incumbent, for some porting and tuning pain. Of course, I’m not the first HPC blogger to have made this observation, but I hope to follow it a little further.



In particular, my previous blog suggested the outcome might be: “at first the uptake is tentative ... but in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems” – in other words, hard going initially, but GPU/many-core will “win” eventually. I even ended up with an ambitious promise for my next blog (i.e. this one): “an idea of what/who will emerge as the dominant solution ...



Continuing the basis of using the past to guess the future, my prediction is that the next steady state of HPC processors will be GPU-like/manycore technologies (for most of the FLOPS at least) and, just like the current steady state (x86), those few companies with the strongest financial muscle will eventually own the dominant market share. However, other companies will have pioneered many of the technologies that make that dominant market share possible, enjoying good market share surges in the process.



I can even have a go at predicting some of the path that might get us to the next steady state of HPC architecture. NVIDIA has already shown us that GPUs for HPC are sometimes a good solution – and importantly, that a good programming ecosystem (CUDA) really helps adoption. Over the last year or so, I’d say the HPC community has moved from “if GPUs can work in this case ...” to “how do I make GPUs work across my workload?



As Intel’s Knights processors bring us many-core but with a familiar x86 instruction set, we might learn that getting good performance across a broad range of applications is possible, but critically dependent on software tools and hard work by skilled parallel programmers. AMD’s Fusion with tighter links between CPU & GPU, could show that the nature of the integration between the many-core/GPU unit and the rest of the system (be it CPU, network, main memory etc) will affect not only maximum performance on specific applications, but maybe more importantly the ease of getting “good enough” performance across a range of applications.



I don't know of any GPU/many-core/accelerator announcements from IBM, but it’s always possible IBM will throw in another useful contribution before the dust settles. They were one of the first into many-core processors for HPC acceleration with Cell and they cannot be easily counted out of top end HPC solutions - e.g. the forthcoming Blue Waters (POWER7) and Sequoia (BG/Q) chart-toppers.



But back to my “winner” prediction. When the revolution settles into a new steady state of mostly GPU/many-core for HPC processors, there won’t be (can’t be) critical distinctions between the various products anymore for most applications. Whichever product we consider (whether GPU or x86-based or whatever), many-core is sufficiently different from few-core (e.g. 1-8 cores) to mean that the early winners have been those users who are easily able to move their key applications across to get step changes in cost and performance.



The big winners in the next stages of the GPU/manycore emergence will be those users who can move the bulk of their high-value-generating HPC usage to many-core processors with the most attractive transition (economy and speed) compared to their competitors.



So what about the dominant solution I promised? For the technology to be pervasive, first there must be greater commonality between offerings (I stop short of standardization) so that programmers have at least a hope of portability. Secondly, users need to be able to extract the available performance. Ideally these would mean a software method that makes many-core programming “good enough easily enough” is discovered – and if so, that software method will be the dominant solution, across all hardware.



Or, if the magic bullet is still not market ready, skilled parallel programmers will be the dominant solution for achieving competitive performance and cost benefits - just like it is for HPC using commodity x86 processors today.

Tuesday, 8 June 2010

Revealing the future of technical computing: part 1

[Originally posted on The NAG Blog]

I recall some years ago porting an application code I worked with, which was developed and used almost exclusively on a high end supercomputer, to my PC. Naively (I was young), I was shocked to find that, per-processor, the code ran (much) faster on my PC than on the supercomputer. With very little optimization effort.


How could this be – this desktop machine costing only a few hundred pounds was matching the performance of a four processor HPC node costing many times that? Since I was also starting to get involved in HPC procurements, I naturally asked why we spend millions on special supercomputers, when for a twentieth of the price, we’d get the same throughput from a bunch of high-spec PCs?


The answer then (and now) was that I was extrapolating from only one application, and that application could be run as lots of separate test cases with no reduction in capability (i.e. we didn’t need large memory etc, just lots of parameter space). However, the other major workload (which I also ported and also ran fast on the PC) would not have been able to do the size of problem we wanted on a PC – we needed the larger memory and extra grunt from parallel processing. (We did look at the newfangled Network Of Workstations emerging at the time but decided it might be a wolf in sheep’s clothing. Sorry.)


In the end, we had to find a balance between (a) speed at lowest cost for the one application; (b) the best capability for the other application (i.e. fastest solution time for the largest problems); (c) ease of programming – to get a good enough (fast-enough) code developed with the limited developer effort and funding we had; and (d) whole life affordability.


Why do I foist this reminiscence on you? Because the current GPU crisis (maybe “crisis” is a bit strong – "PR storm" perhaps?) looks very much the same to me. The desktop HPC surprise of my youth has evolved into the dominant HPC processor and so for some years now, we have been developing and running our applications on clusters of general purpose processors – and a new upstart is trying to muscle in with the same tactic – “look how fast and how cheap” – the GPU (or similar technologies – e.g. Larrabee, sorry Knights-thingy).


The issues are the same: (a) for some applications, GPUs offer substantial performance improvements for considerably less cost than a “normal” HPC processor; (b) for other applications, the limits such as off-card bandwidth etc mean that GPU’s cannot deliver the required capability; (c) the underlying concern is ease of programming for GPUs; (d) affordability – sure GPU’s are cheap to buy, but what about power costs when in bulk, or code porting costs, etc?


Maybe the result will be the same as when commodity processors and clusters eventually exploded to leave custom supercomputer hardware as the minority solution. At first the uptake (now) is tentative - and painful. Some will have great success stories, many will get burnt. But in a few years time, we might well look back with nostalgia to when GPU’s were not the dominant processor for HPC systems.


I’ll continue on the future of HPC in my next blog in a few days, including an idea of what/who will emerge as the dominant solution ...

Tuesday, 23 March 2010

What’s the next revolution in technical computing?

[Originally posted on The NAG Blog]

It’s a question that absorbs the attention of the technical computing community, especially those working at the leading edge of technology and performance (high performance computing, HPC). What is the next disruptive technology? In other words, what is the next technology that will replace a currently dominant technology? Usually a disruptive technology presents a step-change in performance, cost or ease-of use (or a combination of these) compared to the established technology. The new technology may or may not be disruptive in the sense of discontinuous change in user experience.



Why is identifying disruptive technology so important? First, those who spot the right change early enough and deploy it effectively can attain a significant advantage over competitors as a result of a substantial improvement in technical computing capability or reduction in cost. Second, identifying the right technology change in time can help ensure that future investments (whether software engineering, procurement planning, or HPC product development) are optimally spent.



However, in a field as fast moving as technical computing, spotting the next disruptive technologies of specific relevance to your individual needs can easily become a full time activity (which is why NAG helps to do this for others).



One very credible candidate for disruptive change in HPC right now is GPU computing (or related products that might be in development). However, at the Newport conference recently, the discussion turned to what the next disruptive technology to hit HPC would be (after the possible GPU disruption). One suggestion, made by John West (of InsideHPC fame), was that the next disruptive technology could be in software, especially programming tools and interfaces. This builds on the fact that parallel computing is no longer a specialist activity unique to the HPC crowd – parallel processors are becoming pervasive across all areas of computing from embedded to personal to workgroup technical computing. Parallel programming is thus heading towards a mass market activity – and the mass market is unlikely to view what we have in HPC currently (Fortran plus MPI and/or OpenMP, or limited tools, etc) with much favour. I’m not knocking any of these, but they are not mass-market interfaces to parallel computing. So perhaps the mass market, through volume of people in need – and companies driven by economics will come up with a “better” solution for interfacing with supercomputers.



As a HPC community we lost control of much of our hardware to the commodity market some years ago. Maybe we now face losing control of our software to the commodity community too.

Thursday, 18 February 2010

Exascale or personal HPC?

[Originally posted on The NAG Blog]

Which is more interesting for HPC watchers - the ambition of exaflops or personal supercomputing? Anyone who answers "personal supercomputing" is probably not being honest (I welcome challenges!). How many people find watching cars on the local road more interesting than F1 racing? Or think local delivery vans more fascinating than the space shuttle? Of course, everyday cars and local delivery vans are more important for most people than F1 and the space shuttle. And so personal supercomputing is more important than exaflops for most people.

High performance computing at an individual or small group scale directly impacts a far broader set of researchers and business users than exaflops will (at least for the next decade or two). Of course, in the same way that F1 and the shuttle pioneer technologies that improve cars and other everyday products, so the exaflops ambition (and the petaflops race before it) will pioneer technologies that make individual scale HPC better.

One potential benefit to widespread technical computing that some are hoping for is an evolution in programming. It is almost certain that the software challenges of an exaflops supercomputer with a complex distributed processing and memory hierarchy demanding billion-way concurrency will be the critical factor to success and thus tools and language evolutions will be developed to help the task.

Languages might be extended (more likely than new languages) to help express parallelism better. Better may mean easier or with assured correctness rather than higher performance. Language implementations might evolve to better support robustness in the face of potential errors. Successful exascale applications might expect to make much greater use of solver and utility libraries optimized for specific supercomputers. Indeed one outlying idea is that libraries might evolve to become part of the computer system rather than part of the application. Developments like these should also help to make the task of programming personal scale high performance computing much easier, reducing the expertise required to get acceptable performance from a system using tens of cores or GPUs.

Of course, while we wait for the exascale benefits to trickle down, getting applications to achieve reasonable performance across many cores still requires specialist skills.