4-5. Frame - Data Cleaning

  1. Fill NA: Replaces NA values with another value.

  2. Drop NA: Removes rows or columns that contain NA values.

  3. Fill Outlier: Replaces outliers in a specific column.

  4. Drop Outlier: Removes outliers in a specific column.

  5. Drop Duplicates: Removes duplicate values.
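
This page does not say how outliers are detected, so the following is only a sketch of what Fill Outlier and Drop Outlier could look like in pandas. The 1.5 × IQR rule, the median replacement, and the column name `value` are all assumptions made for illustration, not the tool's documented behavior.

```python
import pandas as pd

# Hypothetical data: one value is far from the rest.
df = pd.DataFrame({"value": [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]})

# Assumption: flag outliers with the common 1.5 * IQR rule.
q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (df["value"] < lower) | (df["value"] > upper)

# Drop Outlier: keep only the rows that are not flagged.
dropped = df[~is_outlier]

# Fill Outlier: replace flagged values, here with the median of the
# remaining values (the replacement choice is also an assumption).
filled = df.copy()
filled.loc[is_outlier, "value"] = df.loc[~is_outlier, "value"].median()
```

Dropping shortens the frame, while filling keeps every row and changes only the flagged cells.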


Fill NA

  1. Method: Select a fill method.

    1. Value: Replace NA with the input value.

    2. Forward/Back Fill: Replace each NA with the value before or after it. When several NA values occur in a row, you can limit how many of them are filled.

    3. Statistics: Replace NA with a summary statistic of the column, such as the mean.
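
The three fill methods above can be sketched in pandas as follows. This is a minimal illustration, assuming the options map to `fillna`, `ffill`, and `bfill`. The column name `score`, the fill value `0`, and the choice of the mean as the statistic are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a run of two consecutive NA values.
df = pd.DataFrame({"score": [1.0, np.nan, np.nan, 4.0, np.nan]})

# Value: replace every NA with a fixed input value.
filled_value = df["score"].fillna(0)

# Forward fill: copy the previous valid value into each NA.
# limit=1 fills at most one NA per consecutive run.
filled_forward = df["score"].ffill(limit=1)

# Back fill: copy the next valid value into each NA.
# The trailing NA has no later value, so it stays NA.
filled_backward = df["score"].bfill()

# Statistics: replace NA with a statistic of the column (here, the mean).
filled_mean = df["score"].fillna(df["score"].mean())
```

With `limit=1`, the second NA in the run is left unfilled, which is the behavior the limit option describes.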


Drop NA

  1. How

    1. Select Options: Delete every row whose number of non-missing values is less than the value set in Threshold.

    2. Any: If there is any NA in the row, delete the row.

    3. All: If all values in a row are NA, delete the row.

  2. Ignore Index: Choose whether to reset the index after the operation.
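
The Drop NA options above can be sketched with pandas `dropna`, assuming How maps to the `how` argument and Threshold to `thresh`. The data and column names are hypothetical, and the index reset is shown with `reset_index(drop=True)` so the sketch does not depend on a particular pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: rows hold 3, 2, 0 and 1 non-missing values.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [1.0, 2.0, np.nan, np.nan],
    "c": [1.0, 2.0, np.nan, np.nan],
})

# Any: delete a row if it contains at least one NA.
drop_any = df.dropna(how="any")

# All: delete a row only if every value in it is NA.
drop_all = df.dropna(how="all")

# Threshold: keep rows with at least 2 non-missing values,
# i.e. delete rows with fewer than 2.
drop_thresh = df.dropna(thresh=2)

# Ignore Index: renumber the remaining rows from 0.
drop_all_reset = df.dropna(how="all").reset_index(drop=True)
```

Without the index reset, the surviving rows keep their original labels, so `drop_all` is indexed 0, 1, 3 rather than 0, 1, 2.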


Drop Duplicates

  1. Keep: Select which occurrence of each set of duplicates to keep. If you select False, every duplicated value is deleted and none is kept.

  2. Ignore Index: Choose whether to reset the index after the operation.
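
A minimal sketch of these options, assuming they correspond to the `keep` and `ignore_index` arguments of pandas `drop_duplicates`. The data is hypothetical, and the `"first"` and `"last"` values come from pandas rather than from this page.

```python
import pandas as pd

# Hypothetical frame: rows 0 and 2 are identical, as are rows 1 and 4.
df = pd.DataFrame({
    "name": ["a", "b", "a", "c", "b"],
    "value": [1, 2, 1, 3, 2],
})

# Keep the first occurrence of each duplicated row.
keep_first = df.drop_duplicates(keep="first")

# Keep the last occurrence instead.
keep_last = df.drop_duplicates(keep="last")

# keep=False: drop every row that has a duplicate, keeping none of them.
keep_none = df.drop_duplicates(keep=False)

# Ignore Index: renumber the remaining rows from 0.
keep_first_reset = df.drop_duplicates(keep="first", ignore_index=True)
```

Only the row for `"c"` survives `keep=False`, because it is the only row that never repeats.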
