Wednesday, November 5, 2014
Sunday, November 2, 2014
Large-scale graph data process
Apache Giraph - is an iterative graph processing system originated as open-source Google Pregel, that implements BSP (Bulk Synchronous Parallel) model of distributed computation based on Hadoop.
Giraph uses Hadoop for scheduling workers and loading data into memory, but then implements its own communication protocol using HadoopRPC. Jobs are run as Mapper only job on Hadoop.
Apache Hama - is a pure BSP computing engine that generally used for iterative processing like Matrix, Graph. Jobs are runs as BSP job on HDFS only.
Subscribe to:
Posts (Atom)