Friday, June 17, 2016

Difference Between Spark and Hadoop MapReduce


Difference: Spark vs. Hadoop MapReduce

1. Performance
   Spark: Iterative computations run in memory. Map functions simply transform one RDD into another, which saves disk and network I/O and improves performance.
   Hadoop MapReduce: Each map and reduce phase writes its output to disk, and the next phase then reads it back. The resulting disk and network I/O adds latency.
2. Programming languages
   Spark: Scala, Java, Python, R
   Hadoop MapReduce: Java
3. Basic unit of data
   Spark: RDD (Resilient Distributed Dataset)
   Hadoop MapReduce: Key-value tuples
4. Lines of code for WordCount
   Spark: As few as 6 in Python (see the linked example)
   Hadoop MapReduce: As few as 73 in Java (see the linked example)
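To make the performance point concrete, here is a minimal single-process sketch in plain Python. It is not Spark or Hadoop code: the function names (map_phase, reduce_phase, wordcount_in_memory, wordcount_via_disk) and the map_output.json file are hypothetical, chosen only for illustration. It runs the same WordCount twice, once passing the mapped pairs straight to the reducer in memory (roughly how Spark chains RDD transformations) and once writing them to disk and reading them back between phases (roughly how MapReduce hands data from mappers to reducers).

```python
import json
import os
import tempfile
from collections import Counter


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a WordCount mapper would."""
    return [(word, 1) for line in lines for word in line.split()]


def reduce_phase(pairs):
    """Sum the counts for each word, as a WordCount reducer would."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def wordcount_in_memory(lines):
    """Spark-like flow (assumption: toy model): pairs stay in memory."""
    return reduce_phase(map_phase(lines))


def wordcount_via_disk(lines, workdir):
    """MapReduce-like flow (assumption: toy model): pairs go through disk."""
    spill_path = os.path.join(workdir, "map_output.json")  # hypothetical file
    with open(spill_path, "w", encoding="utf-8") as f:
        json.dump(map_phase(lines), f)
    with open(spill_path, encoding="utf-8") as f:
        # JSON stores each pair as a list, so convert back to tuples.
        pairs = [tuple(pair) for pair in json.load(f)]
    return reduce_phase(pairs)


if __name__ == "__main__":
    sample = ["spark and hadoop", "spark is fast"]
    with tempfile.TemporaryDirectory() as tmp:
        # Both flows give the same counts; only the I/O pattern differs.
        assert wordcount_in_memory(sample) == wordcount_via_disk(sample, tmp)
    print(wordcount_in_memory(sample))
```

Both functions return the same counts. The only difference is the extra write and read in the second one, which is the cost the comparison above attributes to MapReduce when many phases are chained.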
