Data scientists are commonly said to spend about 80% of their time on data wrangling. This chapter covers the following data wrangling techniques:
- Data Cleansing – The process of detecting and correcting corrupted or inaccurate records in a recordset.
- Data Transform – The process of converting data to the standardized data types that the user has defined in the system.
- Data Patching – The process of filling in missing values in a recordset, based on business logic and supporting or historical records.
- Removing Data Points – The process of eliminating outliers from the recordset.
Be careful when removing data points from a recordset, because an apparent outlier might represent a genuine condition that the analysis needs to capture.
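The four techniques can be sketched together in a few lines of plain Python. This is only an illustration under stated assumptions, not a prescribed method: the records, the field names (`id`, `date`, `amount`), the median-based patching rule, and the `outlier_factor` threshold are all hypothetical. In practice the patching rule would come from your business logic. Note that outliers are flagged first and filtered in a separate step, so they can be reviewed before anything is discarded.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records; every field name and value is made up for illustration.
raw_records = [
    {"id": "1", "date": "2023-01-05", "amount": "100.0"},
    {"id": "2", "date": "2023-01-06", "amount": "abc"},     # corrupted amount
    {"id": "3", "date": "2023-01-07", "amount": ""},        # missing amount
    {"id": "4", "date": "2023-01-08", "amount": "110.0"},
    {"id": "5", "date": "2023-01-09", "amount": "95.0"},
    {"id": "6", "date": "2023-01-10", "amount": "5000.0"},  # possible outlier
]


def parse_amount(text):
    """Return the amount as a float, or None if it is missing or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def wrangle(records, outlier_factor=10.0):
    # Data cleansing and data transform: convert each field to a standard type.
    # Amounts that cannot be parsed are detected here and marked as None.
    rows = [
        {
            "id": int(r["id"]),
            "date": datetime.strptime(r["date"], "%Y-%m-%d").date(),
            "amount": parse_amount(r["amount"]),
        }
        for r in records
    ]

    # Data patching: fill missing amounts with the median of the valid ones.
    # (Assumed rule; a real rule would follow the business logic.)
    valid_amounts = [row["amount"] for row in rows if row["amount"] is not None]
    fill_value = median(valid_amounts)
    for row in rows:
        row["patched"] = row["amount"] is None
        if row["patched"]:
            row["amount"] = fill_value

    # Outlier detection: flag amounts far above the median instead of deleting them.
    for row in rows:
        row["is_outlier"] = row["amount"] > outlier_factor * fill_value
    return rows


cleaned = wrangle(raw_records)

# Removing data points is a separate, explicit step taken after review.
kept = [row for row in cleaned if not row["is_outlier"]]
```

Keeping the `patched` and `is_outlier` flags on each row makes it possible to trace which values were altered or excluded, which helps when a flagged record turns out to be a legitimate condition.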