Thursday, 20 December 2012

A review of 2012 in supercomputing - Part 2

This is Part 2 of my review of the year 2012 in supercomputing and related matters.

In Part 1 of the review I re-visited the predictions I made at the start of 2012 and considered how they became real or not over the course of the year. This included cloud computing, Big Data (mandatory capitalization!), GPU, MIC, and ARM - and software innovation. You can find Part 1 here: http://www.hpcnotes.com/2012/12/a-review-of-2012-in-supercomputing-part.html.

Part 2 of the review looks at the themes and events that emerged during the year. As in Part 1, this is all thoroughly biased, of course, towards things that interested me throughout the year.

The themes that stick out in my mind from HPC/supercomputing in 2012 are:
  • The exascale race stalls
  • Petaflops become "ordinary"
  • HPC seeks to engage a broader user community
  • Assault on the Top500

The exascale race stalls

The global race towards exascale supercomputing has been a feature of the last few years. I chipped in myself at the start of 2012 with a debate on the "co-design" mantra.

Confidently tracking the Top500 trend lines, the HPC community had pinned 2018 as the inevitable arrival date of the first supercomputer with a peak performance in excess of 1 exaflops. [Note the limiting definition of the target - loosely coupled computing complexes with aggregate capacity greater than exascale will probably turn up before the HPC machines - and peak performance in FLOPS is the metric here - not application performance or any assumptions of balanced systems.]

Some more cautious folk hedged a delay into their arrival dates and talked about 2020. However, it became apparent throughout 2012 that the US government did not have the appetite (or political support) to commit to being the first to deploy an exascale supercomputer. Other regions of the world have - like the US government - stated their ambitions to be among the leaders in exascale computing. But no government has yet stood up and committed to a timetable, or to being the first to get there. Critically, nor has anyone committed the R&D funding needed now to develop the technologies [hardware and software] that will make exascale supercomputing viable.

The consensus at the end of 2012 seems to be settling on 2022 for the first exascale supercomputer - with no real agreement on which country will win the race to get there first.
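
As an aside, the original 2018 expectation was essentially trend-line arithmetic. The sketch below is a back-of-the-envelope illustration in Python (the choice of systems and the approximate Rmax figures are my own assumptions, used only to show the shape of the argument): it fits the historical exponential growth and asks when the trend line crosses 1 exaflops, assuming the growth rate continues unchanged - which is precisely the assumption now in doubt.

    # Back-of-the-envelope extrapolation of Top500 #1 performance.
    # Figures are approximate Rmax values (TFLOPS) and are illustrative only.
    import numpy as np

    years = np.array([1997, 2002, 2008, 2012])             # ASCI Red, Earth Simulator, Roadrunner, Titan
    rmax_tflops = np.array([1.1, 35.9, 1026.0, 17590.0])   # approximate Rmax in TFLOPS

    # Fit a straight line to log10(performance) vs year, i.e. exponential growth.
    slope, intercept = np.polyfit(years, np.log10(rmax_tflops), 1)

    # Year at which the fitted trend reaches 1 exaflops (1e6 TFLOPS).
    exaflop_year = (6.0 - intercept) / slope
    print("Growth: ~%.1fx per year; trend crosses 1 EF around %.0f" % (10**slope, exaflop_year))

Run on those illustrative numbers, the naive fit lands at roughly 2018 - which is why dates slipping to 2020 and now 2022 signal a break from the historical trend rather than a refinement of it.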

Perhaps we need to re-visit our communication of the benefits of more powerful supercomputers to the wider economy and society (what is the point of supercomputers?). Communicating the value to society and describing the long-term investment requirements is always a fundamental need for any specialist technology, but it becomes critical during the testing fiscal conditions (and thus political pressures) that governments face right now.


Petaflops become "ordinary"

2012 was the year that petascale supercomputers became "ordinary" (in as much as a multi-million dollar, megawatt-chewing complex technology can be "ordinary"). Certainly by the end of 2012 there were enough deployed supercomputers with a peak performance of one petaflops or more that we stopped being aware of exactly how many there were (33 according to the Nov 2012 Top500 list). In fact, even the next order of magnitude (10+ PF) is starting to be populated.

Another measure of petaflops moving beyond the domain of the leading national labs and academic supercomputer centers was the announcement by both BP and Total of petascale supercomputers deployed for industrial use. There may well be more supercomputers in industry that have not yet been announced.

And as a final measure of "ordinary", I perceive that the HPC community now regards the technical challenges of petascale computing as a production support issue, not a research topic for future technology. This, in spite of the many challenges left at petascale - only a tiny proportion of HPC users are successfully using petascale resources. But the allure of exascale is already taking the limelight in the R&D community away from the petascale reality.


HPC seeks to engage a broader user community

There was a growing realization during 2012 that the HPC community needed to take ownership of the issue of widening the user base. Although not a new theme, I did notice this crop up more often in panel discussions, remits for new supercomputer services/facilities, and conference talks. HPC has the potential to reach a much broader audience than the one currently using the technology.

Leaving aside the reality that many users of HPC would not self-identify as HPC people, there are three main growth directions that visionaries, funding bodies and practitioners have been pushing in 2012:
  • non-academic use of supercomputing (specifically industrial users);
  • more local and smaller HPC facilities in addition to the nationally leading systems - especially in non-traditional user communities (i.e. not just chemists and physicists etc.);
  • future users - the skills and awareness investments needed now to prepare the future generations of scientists and engineers for a world of HPC opportunities.

I weighed in with my own thoughts on this theme in my 1000x article at HPCwire.

Whilst this theme has thankfully risen in prominence throughout 2012, I still advocate that we focus more of our community effort on broadening the HPC user base.


Assault on the Top500

Among the most debated events of the supercomputing world in 2012 was the news that NCSA chose not to submit the NSF-funded Blue Waters supercomputer to the Top500 list. The system would clearly have made the Top10, is one of the highest profile supercomputer projects in the world, and had been anticipated as a flagship scientific resource. So the rejection of a Top500 ranking was big news. Bill Kramer of NCSA set out the reasons in an exceptionally well thought out article, listing not only the issues with the Top500 but also suggested solutions. There followed much debate on blogs, news sites and Twitter - at least some of which missed the point, in my opinion.

In my view, the Top500 is NOT broken - it is the use to which the Top500 data is put that is broken. The Top500 is useful because it is a set of 500 data points collected twice a year in a consistent manner for 20 years. Its value is as a data set - either of 500 points at any one collection, or as trends over time of subsets, etc.

It is not useful to base procurement, funding or architectural decisions on any single one of those data points alone. The ranking position of any one machine is meaningless - because it no longer correlates well with scientific delivery. Note that the problem is not with the Linpack (HPL) benchmark per se - any single benchmark would have this issue of being unrepresentative of the whole workload/mission of a supercomputer.

Delivering real science and engineering with supercomputers requires far more than a good HPL result - it requires a production service wrapped around the machine, including hardware features such as storage, software support and performance development, and skills support (training etc.). The Top500 does not measure these full service capabilities - only that a machine exists and has managed one hero run of HPL.

To me the key thing to remember is that the Top500 does have its uses - but only when used in aggregate or for trends - not when quoting a single position on a given release of the list.
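
To make that concrete, here is a minimal sketch of the kind of aggregate use I mean - assuming, hypothetically, that the list data has been exported to a CSV with columns list_date, rank and rmax_tflops (the file name and column names are my own invention, not an official format).

    # Using the Top500 as a data set: aggregate and trend figures per list edition,
    # rather than quoting the rank of any single machine.
    # Assumes a hypothetical CSV export with columns: list_date, rank, rmax_tflops.
    import csv
    from collections import defaultdict

    total_rmax = defaultdict(float)   # aggregate Rmax (TFLOPS) per list edition
    petascale = defaultdict(int)      # count of >= 1 PF systems per list edition

    with open("top500_history.csv") as f:   # hypothetical file name
        for row in csv.DictReader(f):
            edition = row["list_date"]
            rmax = float(row["rmax_tflops"])
            total_rmax[edition] += rmax
            petascale[edition] += 1 if rmax >= 1000.0 else 0   # 1 PF = 1000 TFLOPS

    for edition in sorted(total_rmax):
        print("%s: aggregate %.1f PF, %d petascale systems"
              % (edition, total_rmax[edition] / 1000.0, petascale[edition]))

Trends in figures like these across editions tell a useful story; the rank of any one machine in any one edition does not.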


Finally

This ends my (too long) review of 2012 in supercomputing - but the obvious follow on is to look towards 2013. Tune in to this blog after the Christmas season to find out my predictions for HPC/supercomputing in 2013.

In the meantime, feel free to chip in with comments and discussion on my review of 2012.

And, to all my readers (if any!) - Merry Christmas!

1 comment:

Mike Bernhardt said...

Well said Andrew. All good points. Thank you for your perspective and your many insightful contributions over this past year.

Mike Bernhardt
The Exascale Report