Overview
- The only Pig book that covers scheduling Pig jobs with Oozie
- The only Pig book that covers submitting Pig jobs with Hue
- A one-stop shop for all Apache Pig needs
Table of contents (17 chapters)
About this book
The book is divided into four parts: the complete features of Apache Pig; integration with other tools; how to solve complex business problems; and optimization of tools.
You'll discover topics such as MapReduce and why it cannot meet every business need; the features of Pig Latin, such as data types, loading, storing, joins, grouping, and ordering; how Pig workflows can be created; submitting Pig jobs using Hue; and working with Oozie. You'll also see how to extend the framework by writing UDFs and custom load, store, and filter functions. Finally, you'll cover different optimization techniques such as gathering statistics about a Pig script, joining strategies, parallelism, and the role of data formats in good performance.
What You Will Learn
• Use all the features of Apache Pig
• Integrate Apache Pig with other tools
• Extend Apache Pig
• Optimize Pig Latin code
• Solve different use cases for Pig Latin
Who This Book Is For
All levels of IT professionals: architects, big data enthusiasts, engineers, developers, and big data administrators
Authors and Affiliations
About the author
In 2013 I won the Hadoop Hackathon event for Hyderabad, conducted by Cloudwick Technologies. As a top contributor at stackoverflow.com, I have helped many people with big data on websites such as stackoverflow.com and quora.com. With so much passion for big data, I went on to work as an independent trainer and consultant, training hundreds of people and setting up big data teams in a couple of companies.
Bibliographic Information
Book Title: Beginning Apache Pig
Book Subtitle: Big Data Processing Made Easy
Authors: Balaswamy Vaddeman
DOI: https://doi.org/10.1007/978-1-4842-2337-6
Publisher: Apress Berkeley, CA
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)
Copyright Information: Balaswamy Vaddeman 2016
Softcover ISBN: 978-1-4842-2336-9 (Published: 16 December 2016)
eBook ISBN: 978-1-4842-2337-6 (Published: 10 December 2016)
Edition Number: 1
Number of Pages: XXIII, 274
Number of Illustrations: 34 b/w illustrations, 35 illustrations in colour
Topics: Open Source, Database Management, Data Storage Representation, Data Mining and Knowledge Discovery, Information Storage and Retrieval