Hadoop Based Data Intensive Computation on IaaS Cloud Platforms
- Sanjay Ahuja
Abstract
Cloud computing is a relatively new form of computing, which uses virtualized resources and is dynamically scalable and is often provided as pay for use service over the Internet or Intranet or both. With increasing demand for data storage in the cloud, study of data intensive applications is becoming a primary focus. Data intensive applications are those which involve a high CPU usage, processsing large volumes of data typically in size of hundreds of gigabytes, terabytes, or petabytes. This study was conducted on Amazon's Elastic Cloud Compute (EC2) and Amazon Elastic Map Reduce (EMR) using HiBench Hadoop Benchmark Suite. HiBench is a Hadoop benchmark suite and is used for performing and evaluating Hadoop based data intensive computation on both these cloud paltforms. Both quantitative and qualitative comparison was performed on both Amazon EC2 and Amazon EMR, including a study of their pricing models and measures are suggested for future studies and research.
- Full Text: PDF
- DOI:10.5539/cis.v8n3p103
Journal Metrics
WJCI (2022): 0.636
Impact Factor 2022 (by WJCI): 0.419
h-index (January 2024): 43
i10-index (January 2024): 193
h5-index (January 2024): N/A
h5-median(January 2024): N/A
( The data was calculated based on Google Scholar Citations. Click Here to Learn More. )
Index
- Academic Journals Database
- BASE (Bielefeld Academic Search Engine)
- CiteFactor
- CNKI Scholar
- COPAC
- CrossRef
- DBLP (2008-2019)
- EBSCOhost
- EuroPub Database
- Excellence in Research for Australia (ERA)
- Genamics JournalSeek
- Google Scholar
- Harvard Library
- Infotrieve
- LOCKSS
- Mendeley
- PKP Open Archives Harvester
- Publons
- ResearchGate
- Scilit
- SHERPA/RoMEO
- Standard Periodical Directory
- The Index of Information Systems Journals
- The Keepers Registry
- UCR Library
- Universe Digital Library
- WJCI Report
- WorldCat
Contact
- Chris LeeEditorial Assistant
- cis@ccsenet.org