Five things you need to know about Hadoop v. Apache Spark

Katherine Noyes | Dec. 14, 2015
They're sometimes viewed as competitors in the big-data space, but the growing consensus is that they're better together

5: Failure recovery: different, but still good. Hadoop is naturally resilient to system faults or failures since data are written to disk after every operation. Spark has similar built-in resiliency because its data objects are stored in what are called resilient distributed datasets (RDDs), which are spread across the data cluster. "These data objects can be stored in memory or on disks, and RDD provides full recovery from faults or failures," Borne pointed out.
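To make the recovery idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: the class ToyRDD and its methods (map, compute, lose_partition, collect) are hypothetical names invented for illustration. It assumes the approach Spark's documentation describes, in which an RDD remembers the transformations used to build it (its lineage) and recomputes a lost partition from its parent data rather than restoring a copy.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's API).

    Holds either base data split into partitions, or a parent dataset
    plus the function needed to rebuild each partition from it.
    """

    def __init__(self, source=None, parent=None, fn=None):
        self._source = source   # base partitions, assumed to sit in durable storage
        self._parent = parent   # lineage: the dataset this one was derived from
        self._fn = fn           # lineage: the per-element transformation
        self._cache = {}        # partition index -> computed values held in memory

    @property
    def num_partitions(self):
        if self._parent is None:
            return len(self._source)
        return self._parent.num_partitions

    def map(self, fn):
        # Record the transformation only; nothing is computed yet.
        return ToyRDD(parent=self, fn=fn)

    def compute(self, index):
        # Rebuild the partition from lineage if it is not in memory.
        if index not in self._cache:
            if self._parent is None:
                self._cache[index] = list(self._source[index])
            else:
                parent_values = self._parent.compute(index)
                self._cache[index] = [self._fn(x) for x in parent_values]
        return self._cache[index]

    def lose_partition(self, index):
        # Simulate a node failure that wipes one in-memory partition.
        self._cache.pop(index, None)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.compute(i)]


base = ToyRDD(source=[[1, 2], [3, 4]])
squared = base.map(lambda x: x * x)

before = squared.collect()
squared.lose_partition(1)   # pretend the node holding partition 1 failed
after = squared.collect()   # partition 1 is recomputed from its parent

print(before == after)
```

In this sketch only the lost partition is rebuilt, and the surviving partition is served from memory. Real Spark adds scheduling, persistence to disk and much more that this toy leaves out.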

