Data Wrangling includes gathering, selecting, and transforming data to make it suitable for analytics and machine learning. It also includes data cleaning, imputation, summarization, aggregation, and normalization.
This competency area covers outlier/anomaly detection, cleaning data, transforming categorical data to numerical data, grouping data based on values, and joining data, among others.
- Outlier/Anomaly Detection - Applying outlier detection techniques.
- Missing values in data - Cleaning data by finding and replacing missing values using data science libraries.
- Duplicate values in data - Cleaning data by finding and removing duplicate values using data science libraries.
- Categorical data to numeric data - Transforming categorical data to numerical data using data science libraries.
- Group data based on values - Grouping data within a single dataset using data science libraries.
- Concatenate data along an axis - Concatenating data using Python data science libraries.
- Merge multiple sets of data into a single dataset - Joining multiple sets of data using data science libraries.
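
The skills listed above can be illustrated with one short sketch. It assumes pandas as the data science library (the list names none in particular), and every table, column name, and value in it is made up for illustration. The outlier step uses the common 1.5 x IQR rule, which is only one of many possible detection techniques.

```python
import pandas as pd

# Hypothetical toy data, invented for this example only.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4, 5],
    "customer": ["ann", "bob", "bob", "ann", "cy", "bob"],
    "size": ["S", "M", "M", "L", "S", None],
    "amount": [10.0, 12.0, 12.0, None, 11.0, 500.0],
})

# Duplicate values: drop rows that exactly repeat an earlier row.
orders = orders.drop_duplicates().reset_index(drop=True)

# Missing values: fill numeric gaps with the median and
# categorical gaps with the most frequent value.
orders["amount"] = orders["amount"].fillna(orders["amount"].median())
orders["size"] = orders["size"].fillna(orders["size"].mode()[0])

# Outlier detection: flag amounts outside 1.5 * IQR of the quartiles.
q1 = orders["amount"].quantile(0.25)
q3 = orders["amount"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
orders["is_outlier"] = ~orders["amount"].between(lower, upper)

# Categorical to numeric: one-hot encode the "size" column.
size_dummies = pd.get_dummies(orders["size"], prefix="size", dtype=int)

# Concatenate along an axis: attach the encoded columns side by side (axis=1).
encoded = pd.concat([orders, size_dummies], axis=1)

# Group data based on values: total amount per customer.
totals = orders.groupby("customer")["amount"].sum()

# Merge multiple datasets: left-join a second (hypothetical) table on "customer".
customers = pd.DataFrame({
    "customer": ["ann", "bob"],
    "city": ["Oslo", "Lima"],
})
merged = orders.merge(customers, on="customer", how="left")

print(encoded)
print(totals)
print(merged)
```

A left join is used in the last step so that orders without a matching customer record are kept, with a missing `city`, rather than silently dropped. Whether that is the right choice depends on the analysis.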