Apache Flume

Network Data Analytics

Part of the book series: Computer Communications and Networks (CCN)

Abstract

Real-time processing of events plays an important role in data analytics. Recent advances in the Internet have led to the rise of social media, which generates massive volumes of data in real time. Analysis of such data opens up scenarios in which business decisions can be made. The previous chapters discussed data analytics with batch processing and primitive datasets. Apache Flume is one of the tools in the Hadoop ecosystem that provides a platform for real-time data analytics. This chapter gives an overview of Apache Flume, its architectural components, and its workflow. It then discusses the configuration of Flume with the Twitter social network as an example of real-time analytics.

Author information

Corresponding author

Correspondence to K. G. Srinivasa.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Srinivasa, K.G., G. M., S., H., S. (2018). Apache Flume. In: Network Data Analytics. Computer Communications and Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-77800-6_6

  • DOI: https://doi.org/10.1007/978-3-319-77800-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77799-3

  • Online ISBN: 978-3-319-77800-6

  • eBook Packages: Computer Science, Computer Science (R0)
