Data scientists are often said to spend about 80% of their time on data wrangling. This chapter covers the main data wrangling techniques:

  • Data Cleansing – The process of detecting and correcting corrupted or inaccurate records in a recordset.
  • Data Transformation – The process of converting data to the standardized data types that the user has defined in the system.
  • Data Patching – The process of filling in missing values in a recordset, based on business logic and supporting or historical records.
  • Removing Data Points – The process of eliminating outliers from a recordset.
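
The first three techniques can be sketched in a few lines of plain Python. This is a minimal illustration, not a definitive implementation: the records, the field names (id, reading, taken_on), and the helper functions are all hypothetical, and the patching rule (carry the last valid reading forward) is only one assumed example of business logic.

```python
from datetime import datetime

# Hypothetical raw records; every field name and value is invented for illustration.
raw_records = [
    {"id": "1", "reading": "20.5", "taken_on": "2024-01-01"},
    {"id": "2", "reading": "n/a", "taken_on": "2024-01-02"},  # corrupted reading
    {"id": "3", "reading": "", "taken_on": "2024-01-03"},     # missing reading
    {"id": "4", "reading": "21.5", "taken_on": "2024-01-04"},
]


def to_float(text):
    """Cleansing: return a float, or None when the text is not a valid number."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def transform(record):
    """Transformation: convert each field to the type the system expects."""
    return {
        "id": int(record["id"]),
        "reading": to_float(record["reading"]),
        "taken_on": datetime.strptime(record["taken_on"], "%Y-%m-%d").date(),
    }


def patch_missing(records):
    """Patching: fill a missing reading with the last valid earlier reading.

    This carry-forward rule is an assumption; real business logic may differ.
    """
    last_valid = None
    patched = []
    for record in records:
        reading = record["reading"]
        if reading is None:
            reading = last_valid
        else:
            last_valid = reading
        patched.append({**record, "reading": reading})
    return patched


clean_records = patch_missing([transform(r) for r in raw_records])
```

After this runs, every record has an integer id, a date object in taken_on, and a numeric reading. Records 2 and 3 inherit the reading from record 1 because their own values could not be parsed.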

Be careful when removing data points from a recordset, because a data point that looks like an outlier might represent a genuine condition that is required.
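
One cautious approach is to flag suspected outliers for review instead of deleting them immediately. The sketch below assumes the common interquartile range (IQR) rule with a multiplier of 1.5; the text above does not prescribe any particular outlier test, so the rule, the function name flag_outliers, and the sample readings are illustrative assumptions. It needs Python 3.8 or later for statistics.quantiles.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Pair each value with True if it falls outside the IQR fences.

    The IQR rule and k=1.5 are assumed here for illustration only.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(v, v < low or v > high) for v in values]


# Hypothetical sensor readings with one suspicious value.
readings = [20.5, 21.0, 19.8, 20.2, 95.0, 20.7, 20.1, 20.9]

flagged = flag_outliers(readings)
kept = [v for v, is_outlier in flagged if not is_outlier]
review = [v for v, is_outlier in flagged if is_outlier]  # inspect before discarding
```

The review list keeps the suspected outliers, so a domain expert can decide whether a value such as 95.0 is a measurement error or a real event that must stay in the recordset.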