Cleaning Wrong Format
Data of Wrong Format
Cells with data of wrong format can make it difficult, or even impossible, to analyze data.
To fix it, you have two options: remove the rows, or convert all cells in the columns into the same format.
Convert Into a Correct Format
In our DataFrame, two cells have the wrong format. Look at rows 22 and 26: every value in the 'Date' column should be a string that represents a date, but these two do not match the rest.
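One way to locate cells like these is to parse the column against the expected format and see which rows fail. The sketch below is illustrative only: the three-row DataFrame, its values, and the "%Y/%m/%d" format are assumptions standing in for the tutorial's data set, which is not reproduced here.

```python
import pandas as pd

# Hypothetical sample data: one well-formed date, one missing date,
# and one date written in a different format.
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Date": ["2020/12/01", None, "20201226"],
})

# Parse against the format we expect (assumed here to be YYYY/MM/DD).
# errors="coerce" turns every value that does not match into NaT
# instead of raising an exception.
parsed = pd.to_datetime(df["Date"], format="%Y/%m/%d", errors="coerce")

# Rows whose 'Date' is missing or does not match the expected format.
bad_rows = df[parsed.isna()]
print(bad_rows)
```

This check only reports the suspicious rows; it does not change the DataFrame.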
Let's try to convert all cells in the 'Date' column into dates.
Pandas has a to_datetime() function for this.
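A minimal sketch of the conversion, using a small hypothetical DataFrame rather than the tutorial's data set. It assumes pandas 2.0 or later, where format="mixed" tells to_datetime() to infer the format of each value separately; older versions do not accept this option.

```python
import pandas as pd

# Hypothetical sample data: row 1 has no date, row 2 uses a
# different date format from row 0.
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Date": ["2020/12/01", None, "20201226"],
})

# Convert the whole column to datetimes. With format="mixed"
# (pandas 2.0+), each value is parsed on its own, so both date
# styles are accepted. The missing value becomes NaT.
df["Date"] = pd.to_datetime(df["Date"], format="mixed")

print(df)
```

After the conversion the column has a datetime dtype, so the two differently written dates can be compared and sorted like any other dates.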
In the converted result, the date in row 26 is fixed, but the empty date in row 22 becomes NaT (Not a Time), which is the pandas marker for a missing date. One way to deal with empty values is simply to remove the entire row.
Removing Rows
The conversion in the example above produced a NaT value. Pandas treats NaT as a missing (NULL) value, so we can remove the row with the dropna() method.
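As a rough illustration, the sketch below drops rows whose 'Date' is NaT. The DataFrame is hypothetical and already converted to datetimes; the subset argument restricts the check to the 'Date' column so that missing values elsewhere would not cause a row to be removed.

```python
import pandas as pd

# Hypothetical sample data, already converted to datetimes.
# The None entry becomes NaT.
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Date": pd.to_datetime(["2020-12-01", None, "2020-12-26"]),
})

# Keep only the rows where 'Date' is not missing.
# dropna() returns a new DataFrame and leaves df unchanged.
cleaned = df.dropna(subset=["Date"])

print(cleaned)
```

Passing inplace=True to dropna() would modify df directly instead of returning a new DataFrame; assigning the result to a new name, as here, keeps the original available for comparison.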