Statistics for Data Science

A Developer's Approach to Data Cleaning

This chapter discusses how a developer might understand and approach the topic of data cleaning using several common statistical methods.

This chapter is organized into the following topics:

  • Understanding basic data cleaning
  • Using R to detect and diagnose common data issues, such as missing values, special values, outliers, inconsistencies, and localization
  • Using R to address advanced statistical situations, such as transformation, deductive correction, and deterministic imputation