Towards Building a Scalable Data Analytics System on Clouds: An Early Experience on AliCloud

2018, Vol. 40, pp. 891–895

Congfeng Jiang, Wei Huang, Zujie Ren, Youhuizi Li, Jian Wan, Feng Cao, Jiangbin Lin

Abstract

With the development of big data, big data processing systems such as Hadoop and Spark are widely used to handle large-scale data. To help users avoid the complexity and expense of building a self-owned big data processing system, cloud providers tend to deploy big data processing tools as cloud services. Typical examples include Amazon EMR, Azure HDInsight, and AliCloud E-MapReduce. However, building a cost-efficient system and scaling it remain challenging. In this paper, we conduct a case study on AliCloud E-MapReduce and analyze system performance on local and remote file systems. We compare the scalability of Hadoop and Spark using scale-out and scale-up strategies, respectively. Based on the analysis results, we derive several observations and implications that can help guide performance optimization.
