Getting a Big Data Job For Dummies
Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com
Copyright 2015 by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions .
Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY : THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit www.wiley.com/techsupport .
Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com . For more information about Wiley products, visit www.wiley.com .
Library of Congress Control Number: 2014935518
ISBN 978-1-118-90340-7 (pbk); ISBN 978-1-118-90383-4 (ebk); ISBN 978-1-118-90384-1 (ebk)
Manufactured in the United States of America
Appendix A
Resources
Staying current on tools and trends is always a challenge in emerging technologies. There's good news, though. An abundance of resources, vendor research, webcasts, and standards groups (groups that promote best practices) await you online, where you can learn new tools and keep up on the latest vendor news and offerings.
Vendor Websites
With more than 1,000 companies offering big data products or services, I can't list them all. Instead, the following list includes the major players in the big data space:
- Amazon Web Services ( http://aws.amazon.com/big-data ): Amazon Web Services offers a set of cloud services around big data storage and computing resources.
- Big Data University ( www.bigdatauniversity.com ): Big Data University is an online community dedicated to educating people on big data. It's funded by IBM and is free.
- Birst ( www.birst.com ): Birst is a cloud-based business intelligence and analytics tool.
- Cloudera ( www.cloudera.com ): Cloudera provides software, training, and support for the Apache Hadoop framework.
- Couchbase ( http://couchbase.com ): Couchbase is a NoSQL document database for interactive web applications.
- EMC ( http://bigdatablog.emc.com ): EMC is a hardware vendor providing storage for on-site big data analytics processing.
- Google ().
- Hortonworks ( http://hortonworks.com ): Hortonworks provides a platform for using open-source Hadoop in the enterprise.
- IBM Netezza ( www.netezza.com ): Netezza is an on-site data warehouse appliance.
- Informatica ( www.informatica.com/bigdata ): Informatica provides tools for data integration and migration. Its big data offerings help couple traditional databases and NoSQL data stores to make data integration easy for big data processing.
- Jaspersoft ( http://jaspersoft.com ): Jaspersoft provides open-source analytics tools for data visualization from the dashboard.
- MapR ( http://mapr.com ): MapR provides a complete distribution of Apache Hadoop.
- Microsoft ( www.microsoft.com/enterprise/it-trends/big-data ): Microsoft uses its Azure cloud platform and Azure HDInsight products for analytics.
- MicroStrategy ( www.microstrategy.com ): MicroStrategy delivers business intelligence and analytics tools to large and medium-size businesses.
- MongoDB ( www.mongodb.com ): This is an open-source document-centric NoSQL database. It's the most popular NoSQL database today.
- Oracle ( www.oracle.com/us/technologies/big-data ): Oracle is the creator of the world's most widely used relational database management system (RDBMS). It also creates in-memory database systems and manages the open-source MySQL Database Management System.
- Pentaho ( www.pentaho.com ): This is an open-source business analytics tool.
- Predictive Analytics Today ( www.predictiveanalyticstoday.com/top-30-software-for-text-analysis-text-mining-text-analytics ): This is a curated software list for text analytics.
- Qlik ( www.qlik.com ): Qlik is business intelligence and visualization software.
- RapidMiner ( www.rapidminer.com ): RapidMiner is an open-source analytics modeling software application. It's used for statistical analysis.
- SAP ( www.sap.com/solution/big-data ): SAP is one of the world's largest enterprise software firms. It provides business intelligence tools, cloud services, SAP HANA, and in-memory database systems for big data analytics.
- SAS ( www.sas.com/en_us/insights/big-data.html ): The world's largest privately held software company, SAS provides the premier statistical analytics software package.
- Splunk ( www.splunk.com ): Splunk is a big data analytics tool used for the analysis and collection of machine data.
- Spotfire ( http://spotfire.tibco.com ): Spotfire, now owned by Tibco, provides business intelligence and visualization tools.
- Sumo Logic ( www.sumologic.com ): Sumo Logic is a cloud-based analytics engine that specializes in log file analysis.
- Tableau Software (