
FreeComputerBooks.com
Links to Free Computer, Mathematics, Technical Books all over the World
- Title: Big Data Processing with Apache Spark
- Author(s): Srini Penchikala
- Publisher: InfoQ (2018)
- Paperback: N/A
- eBook PDF (104 pages)
- Language: English
- ISBN-10: N/A
- ISBN-13: N/A
Apache Spark is an open-source big-data processing framework built around speed, ease of use, and sophisticated analytics.
Spark has several advantages compared to other big-data and MapReduce technologies like Hadoop and Storm. It provides a comprehensive, unified framework with which to manage big-data processing requirements for datasets that are diverse in nature (text data, graph data, etc.) and that come from a variety of sources (batch versus real-time streaming data).
Spark enables applications in Hadoop clusters to run up to 100 times faster in memory, and up to 10 times faster even when running on disk, than they would with Hadoop MapReduce.
In this mini-book, the reader will learn about the Apache Spark framework and will develop Spark programs for use cases in big-data analysis. The book covers all the libraries that are part of the Spark ecosystem: Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX.
About the Authors
- Srini Penchikala currently works as a software architect at a financial services organization in Austin, Texas. He has over 20 years of experience in software architecture, design, and development.

- Big Data
- Data Science
- Parallel, Concurrent, and Distributed Computing and Programming
- Data Analysis and Data Mining
- Non-relational/NoSQL Databases
