Data Cleaning Needs and Issues: A Case Study of the National Reproductive Health Assessment (RHA) Data from Solomon Islands


  •  Richard D. Nair    
  •  Latileta L. Odrovakavula    
  •  Masoud Mohammadnezhad    
  •  K. Venkata Raman Reddy    
  •  Dilan A. Gohil    
  •  Shiwanjani S. Sami    

Abstract

Data cleaning is an essential part of any research work; without it, the validity and reliability of the data may be called into question. Aim: to document common errors found during the cleaning of datasets and to suggest ways of minimizing errors during the data entry process and reducing human error throughout data cleaning.

Design and Setting: a case study based on data from the national Reproductive Health Assessment (RHA) conducted in Solomon Islands in 2013.

Objective: the main objective of the Solomon Islands RHA was to establish the health status of women of reproductive age (15 to 49 years) in Solomon Islands.

Method: data was collected using questionnaires and entered into an SPSS database in-country by local Solomon Islands research assistants who had been trained by the Pacific Sexual and Reproductive Health Research Center (PSRHRC). The data was then brought back to Fiji, where the cleaning process took place.

Results: the case study found issues with the standardization of databases, database familiarization and data merging.

Conclusion: more training is needed for researchers involved in data collection, data entry and data cleaning to minimize such errors, which could otherwise produce results that are not a true representation of the intended study.



This work is licensed under a Creative Commons Attribution 4.0 License.