Centre for Computing and Engineering Software Systems, Faculty of Information and Communication Technologies, Swinburne University of Technology, Hawthorn, Melbourne, Australia
Centre for Innovation in IT Services and Applications, Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia
Copyright
Elsevier
225 Wyman Street, Waltham, MA 02451, USA
32 Jamestown Road, London NW1 7BY
First edition 2013
Copyright © 2013 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein.
In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-407767-6
For information on all Elsevier publications visit our website at store.elsevier.com
This book has been manufactured using Print On Demand technology. Each copy is produced to order and is limited to black ink. The online version of this book will show color figures where appropriate.
Acknowledgements
The authors are grateful for the discussions with Dr. Willem van Straten and Ms. Lina Levin from the Swinburne Centre for Astrophysics and Supercomputing regarding the pulsar searching scientific workflow. This work is supported by the Australian Research Council under Discovery Project DP110101340.
About the Authors
Dong Yuan received his PhD degree in Computer Science and Software Engineering from the Faculty of Information and Communication Technologies at Swinburne University of Technology, Melbourne, Australia in 2012. He received his Master's and Bachelor's degrees from the School of Computer Science and Technology, Shandong University, Jinan, China in 2008 and 2005, respectively, both in Computer Science. He is currently a postdoctoral research fellow in the Centre for Computing and Engineering Software Systems at Swinburne University of Technology. His research interests include data management in parallel and distributed systems, scheduling and resource management, and grid and cloud computing.
Yun Yang received a Master of Engineering degree from the University of Science and Technology of China, Hefei, China in 1987, and a PhD degree from the University of Queensland, Brisbane, Australia in 1992, both in Computer Science. He is currently a full professor in the Faculty of Information and Communication Technologies at Swinburne University of Technology, Melbourne, Australia. Prior to joining Swinburne as an associate professor in late 1999, he was a lecturer and senior lecturer at Deakin University from 1996 to 1999. Before that, he was a research scientist at DSTC Cooperative Research Centre for Distributed Systems Technology from 1993 to 1996. He also worked at Beihang University in China from 1987 to 1988. He has published about 200 papers in journals and refereed conferences. His research interests include software engineering; P2P, grid and cloud computing; workflow systems; service-oriented computing; Internet computing applications; and CSCW.
Jinjun Chen received his PhD degree in Computer Science and Software Engineering from Swinburne University of Technology, Melbourne, Australia in 2007. He is currently an associate professor in the Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia. His research interests include scientific workflow management and applications; workflow management and applications in Web service or SOC environments; workflow management and applications in grid (service)/cloud computing environments; software verification and validation in workflow systems; and QoS and resource scheduling in distributed computing systems such as cloud computing, service-oriented computing, semantics and knowledge management.
Preface
Scientific research increasingly relies on information technology: large-scale, high-performance computing systems (e.g. clusters, grids and supercomputers) are utilised by research communities to run their applications. Scientific applications are usually computation- and data-intensive, with complex computation tasks that take a long time to execute and generated data sets that are often terabytes or petabytes in size. Storing valuable generated data sets saves the cost of regenerating them when they are reused, as well as the waiting time that regeneration causes. However, the large size of scientific data sets makes storing them a major challenge.
In recent years, cloud computing has emerged as the latest distributed computing paradigm, providing redundant, inexpensive and scalable resources on demand according to system requirements. It offers researchers a new way to deploy computation- and data-intensive applications (e.g. scientific applications) without any infrastructure investment. Large generated application data sets can be flexibly stored or deleted (and regenerated whenever needed) in the cloud, since, theoretically, unlimited storage and computation resources can be obtained from commercial cloud service providers.