Abstract
Given that Hadoop-based MapReduce programming is a relatively new skill, there is likely to be a shortage of highly skilled staff for some time, and those skills will come at a premium price. ETL (extract, transform, and load) tools, like Pentaho and Talend, offer a visual, component-based method of creating MapReduce jobs, allowing ETL chains to be built and manipulated as visual objects. Such tools give staff a simpler, quicker way to approach MapReduce programming. I’m not suggesting that they replace Java- or Pig-based code, but as an entry point they offer a great deal of predefined functionality that can be combined to create and schedule complex ETL chains. This chapter will examine these two tools from installation to use, and along the way, I will offer resolutions for common problems and errors you might encounter.
© 2015 Michael Frampton
Frampton, M. (2015). ETL with Hadoop. In: Big Data Made Easy. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-0094-0_10
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-0095-7
Online ISBN: 978-1-4842-0094-0