Editor
Masaaki Geshi
Osaka University, Toyonaka, Osaka, Japan
ISBN 978-981-13-6193-7 e-ISBN 978-981-13-6194-4
https://doi.org/10.1007/978-981-13-6194-4
Library of Congress Control Number: 2019934788
© Springer Nature Singapore Pte Ltd. 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Supercomputers have been used in various fields since around the 1980s. At that time, the vector architecture was the mainstream of supercomputing, and program developers worked on vector tuning. In the 1990s, however, parallel architectures emerged and the mainstream shifted. The development of supercomputers continues along this trend. Now, in the era of massively parallel and many-core systems, several of the top ten machines in the November 2018 TOP500 ranking, which ranks supercomputers by LINPACK performance, have over 1 million cores, and one even has over 10 million cores.
This is not an unalloyed pleasure for users of such supercomputers. Unlike vector tuning on vector machines, performance gains can no longer be expected simply from making the vector length as long as possible. Performance will not improve unless you are familiar with the characteristics of the hardware: knowing the size of a cache and tuning programs so that the proper amount of data fits in it, concealing communication latency, and reducing the number of data communications. This is the reality of the current supercomputer. Hotspots depend mainly on the algorithms of the calculation methods characteristic of each field, so there is no common development policy. Supercomputers with tens of millions of cores have already appeared, and more will be created in the future. There is also the question of whether the algorithm contains an axis of parallelism large enough to occupy all the cores. If not, we have to create one or consider an alternative. A concrete sketch of the cache idea follows.
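To make the cache point concrete, here is a minimal sketch, in generic C, of loop blocking (tiling) for a matrix multiplication. It is not taken from the chapters of this book; the matrix size N and block size B are illustrative assumptions, and in practice the block size would be chosen so that the tiles touched by the inner loops fit in the target cache.

#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* matrix dimension (illustrative assumption) */
#define B 64    /* block size; assumed small enough that the
                   tiles used by the inner loops fit in cache */

/* Blocked (tiled) matrix multiply: C += A * B.
   Working on B x B tiles keeps the data reused by the inner
   loops resident in cache instead of streaming it repeatedly
   from main memory. */
static void matmul_blocked(const double *a, const double *b, double *c)
{
    for (int ii = 0; ii < N; ii += B)
        for (int kk = 0; kk < N; kk += B)
            for (int jj = 0; jj < N; jj += B)
                for (int i = ii; i < ii + B; i++)
                    for (int k = kk; k < kk + B; k++) {
                        double aik = a[i * N + k];
                        for (int j = jj; j < jj + B; j++)
                            c[i * N + j] += aik * b[k * N + j];
                    }
}

int main(void)
{
    double *a = calloc((size_t)N * N, sizeof *a);
    double *b = calloc((size_t)N * N, sizeof *b);
    double *c = calloc((size_t)N * N, sizeof *c);
    if (!a || !b || !c) return 1;
    a[0] = 2.0;            /* simple check: c[0][0] should be 6.0 */
    b[0] = 3.0;
    matmul_blocked(a, b, c);
    printf("c[0][0] = %f\n", c[0]);
    free(a); free(b); free(c);
    return 0;
}

The same reasoning extends to communication: overlapping nonblocking sends and receives with computation conceals latency, and aggregating small messages reduces the number of data communications.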
In addition, although support for machines equipped with accelerators has advanced in recent years, there are some fields where performance improvement is not yet sufficient. In some cases, it may be necessary to tackle improvements in the fundamental algorithms.
Until now, only computer scientists and some computational scientists have taken the steps described above. However, to use current supercomputers effectively, this knowledge and these techniques are necessary for far more users. Such knowledge is indispensable for researchers who develop software, and there is an increasing amount to know even from the users' point of view. If we compile and execute a software program on a supercomputer without thinking about the hardware, we will not obtain better performance than that of the PC clusters widely used at the laboratory level.
Although tuning techniques become more difficult year by year, the fact remains that we can obtain excellent performance if we simply tune the software program properly. If we build a supercomputer, we must also cultivate human resources with the techniques to master it.
In some fields, it is difficult to have the speedup of a program recognized as a significant result. In some cases, software development itself is not recognized as an important achievement. For researchers promoting software development, therefore, speeding up a program and/or developing software carries a great risk of being unable to produce research results within the given term. However, obtaining results that could not be achieved in realistic computing time without a highly tuned program is the essence of research. Therefore, it is extremely important to decrease the computational time as much as possible.
For the general public, the TOP500 rankings bring only alternating reports of hope and despair, but for those in computational science, the peak performance of such machines and their LINPACK benchmark performance are merely for reference. The important point is whether, by tuning, we can draw the best performance out of the software programs we create or use to obtain scientific results. Only such software programs can demonstrate the real value of supercomputers.
This series of books covers the basics of parallelization, the foundations of numerical analysis, and related techniques. Although described as foundations, we assume the reader is not a complete novice in this field; if you would like to learn programming from the beginning, the basics are better learned from another book suited to that purpose. Our readers are assumed to have a background in physics, chemistry, or biology, or in fields such as earth sciences, space science, meteorology, disaster prevention, and manufacturing. Furthermore, we assume readers who use numerical calculation and simulation as research methods, and in particular those who develop software code.
Volume 1 covers field-independent general numerical analysis and parallelization techniques. From Chaps. , techniques concerning computational accuracy are introduced. Although several examples are drawn from the methods of materials science, most of the techniques can be applied to other fields.
These chapters are recommended to researchers in all fields.
Volume 2 presents advanced techniques based on concrete applications of software in several fields, in particular materials science. From Chaps. 1 to 3, advanced techniques are introduced using tuning results obtained on the K computer. The authors provide several examples from various fields. In Chap. 4, the order-N method based on density functional theory (DFT) calculations is presented. Chapter 5 introduces acceleration techniques for classical molecular dynamics (MD) simulations. In Chap. 6, techniques for large-scale quantum chemical calculations are given.