Spring Data
Modern Data Access for Enterprise Java
Mark Pollack
Oliver Gierke
Thomas Risberg
Jon Brisbin
Michael Hunger
Published by O'Reilly Media
Beijing Cambridge Farnham Köln Sebastopol Tokyo
Thanks to my wife, Daniela, and sons, Gabriel and Alexandre, whose patience with me stealing time away for the book made it possible.
Mark Pollack
I'd like to thank my family, friends, fellow musicians, and everyone I've had the pleasure to work with so far; the entire Spring Data and SpringSource team for this awesome journey; and last, but actually first of all, Sabine, for her inexhaustible love and support.
Oliver Gierke
To my wife, Carol, and my son, Alex, thank you for enriching my life and for all your support and encouragement.
Thomas Risberg
To my wife, Tisha; my sons, Jack, Ben, and Daniel; and my daughters, Morgan and Hannah. Thank you for your love, support, and patience. All this wouldn't be worth it without you.
Jon Brisbin
My special thanks go to Rod and Emil for starting the Spring Data project and to Oliver for making it great. My family is always very supportive of my crazy work; I'm very grateful to have such understanding women around me.
Michael Hunger
I'd like to thank my wife, Nanette, and my kids for their support, patience, and understanding. Thanks also to Rod and my colleagues on the Spring Data team for making all of this possible.
David Turanski
Foreword
Rod Johnson
Creator, Spring Framework
We live in interesting times. New business processes are driving new requirements. Familiar assumptions are under threat, among them that the relational database should be the default choice for persistence. While this is now widely accepted, it is far from clear how to proceed effectively into the new world.
A proliferation of data store choices creates fragmentation. Many newer stores require more developer effort than Java developers are used to regarding data access, pushing into the application things customarily done in a relational database.
This book helps you make sense of this new reality. It provides an excellent overview of today's storage world in the context of today's hardware, and explains why NoSQL stores are important in solving modern business problems.
Because of the language's identification with the often-conservative enterprise market (and perhaps also because of the sophistication of Java object-relational mapping [ORM] solutions), Java developers have traditionally been poorly served in the NoSQL space. Fortunately, this is changing, making this an important and timely book. Spring Data is an important project, with the potential to help developers overcome new challenges.
Many of the values that have made Spring the preferred platform for enterprise Java developers deliver particular benefit in a world of fragmented persistence solutions. Part of the value of Spring is how it brings consistency (without descending to a lowest common denominator) in its approach to different technologies with which it integrates. A distinct Spring way helps shorten the learning curve for developers and simplifies code maintenance. If you are already familiar with Spring, you will find that Spring Data eases your exploration and adoption of unfamiliar stores. If you aren't already familiar with Spring, this is a good opportunity to see how Spring can simplify your code and make it more consistent.
The authors are uniquely qualified to explain Spring Data, being the project leaders. They bring a mix of deep Spring knowledge and involvement and intimate experience with a range of modern data stores. They do a good job of explaining the motivation of Spring Data and how it continues the mission Spring has long pursued regarding data access. There is valuable coverage of how Spring Data works with other parts of Spring, such as Spring Integration and Spring Batch. The book also provides much value that goes beyond Spring: for example, the discussions of the repository concept, the merits of type-safe querying, and why the Java Persistence API (JPA) is not appropriate as a general data access solution.
While this is a book about data access rather than working with NoSQL, many of you will find the NoSQL material most valuable, as it introduces topics and code with which you are likely to be less familiar. All content is up to the minute, and important topics include document databases, graph databases, key/value stores, Hadoop, and the GemFire data fabric.
We programmers are practical creatures and learn best when we can be hands-on. The book has a welcome practical bent. Early on, the authors show how to get the sample code working in the two leading Java integrated development environments (IDEs), including handy screenshots. They explain requirements around database drivers and basic database setup. I applaud their choice of hosting the sample code on GitHub, making it universally accessible and browsable. Given the many topics the book covers, the well-designed examples help greatly to tie things together.
The emphasis on practical development is also evident in the chapter on Spring Roo, the rapid application development (RAD) solution from the Spring team. Most Roo users are familiar with how Roo can be used with a traditional JPA architecture; the authors show how Roo's productivity can be extended beyond relational databases.
When you've finished this book, you will have a deeper understanding of why modern data access is becoming more specialized and fragmented, the major categories of NoSQL data stores, how Spring Data can help Java developers operate effectively in this new environment, and where to look for deeper information on individual topics in which you are particularly interested. Most important, you'll have a great start to your own exploration in code!
Preface
Overview of the New Data Access Landscape
The data access landscape over the past seven or so years has changed dramatically. Relational databases, the heart of storing and processing data in the enterprise for over 30 years, are no longer the only game in town. The past seven years have seen the birth, and in some cases the death, of many alternative data stores that are being used in mission-critical enterprise applications. These new data stores have been designed specifically to solve data access problems that relational databases can't handle as effectively.
An example of a problem that pushes traditional [...] New data types range from media files to logfiles to sensor data (RFID, GPS, telemetry...) to tweets on Twitter and posts on Facebook. While data that is stored in relational databases is still crucial to the enterprise, these new types of data are not being stored in relational databases.
While general consumer demands drive the need to store large amounts of media files, enterprises are finding it important to store and analyze many of these new sources of data. In the United States, companies in all sectors have at least 100 TB of stored data and many have more than 1 petabyte (PB). [...] allowing the company to mail coupon books to the customer's home before public birth records are available.
Big data generally refers to the process in which large quantities of data are stored, kept in raw form, and continually analyzed and combined with other data sources to provide a deeper understanding of a particular domain, be it commercial or scientific in nature.