Data Transformation - Knowledge Base

What is Data Transformation?

Data transformation is the process of converting data from one format or structure into another. It's a crucial step in data analysis and machine learning, as raw data is rarely in a format that is immediately suitable for use.

The goal of data transformation is to prepare data for analysis by cleaning, restructuring, and enriching it.
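
The three activities named above can be illustrated with a minimal pandas sketch. The dataset, the column names (id, jan_sales, feb_sales), and the choice of mean imputation are all hypothetical and picked only for illustration. Real pipelines will differ in how they clean and enrich data.

```python
import pandas as pd

# Hypothetical raw records: one missing value, one column per month (wide layout)
raw = pd.DataFrame({
    "id": [1, 2, 3],
    "jan_sales": [100.0, None, 300.0],
    "feb_sales": [110.0, 210.0, 310.0],
})

# Cleaning: fill the missing January value with the column mean (an assumed strategy)
clean = raw.fillna({"jan_sales": raw["jan_sales"].mean()})

# Restructuring: reshape from wide (one column per month) to long (one row per id and month)
tidy = clean.melt(id_vars="id", var_name="month", value_name="sales")

# Enriching: add a derived column holding each row's share of its id's total sales
tidy["share"] = tidy["sales"] / tidy.groupby("id")["sales"].transform("sum")

print(tidy)
```

Each step returns a new DataFrame or adds a column, so the raw input stays available for comparison.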

Common Data Transformation Techniques

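Example: Min-Max Normalization using Python

Min-max normalization rescales a numeric column to the range [0, 1] by subtracting the column minimum from each value and dividing by the column range (maximum minus minimum). The example below applies it to feature1 with pandas.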

                
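import pandas as pd

# Small example dataset with two numeric features
data = {'feature1': [10, 20, 30],
        'feature2': [5, 10, 15]}

df = pd.DataFrame(data)

# Min-max normalization: rescale feature1 to the range [0, 1]
# (feature2 is left unchanged for comparison)
col_min = df['feature1'].min()
col_max = df['feature1'].max()
df['feature1'] = (df['feature1'] - col_min) / (col_max - col_min)

print(df)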
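
The same formula can be wrapped in a reusable helper that covers every numeric column. The sketch below is one possible approach, not a standard API: the function name min_max_normalize and the extra column named constant are hypothetical, and mapping a constant column to 0.0 is an assumed convention for avoiding division by zero when the maximum equals the minimum.

```python
import pandas as pd

def min_max_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Rescale every numeric column to [0, 1]; constant columns become 0.0."""
    result = df.copy()  # leave the caller's DataFrame untouched
    for col in result.select_dtypes(include="number").columns:
        col_min = result[col].min()
        col_range = result[col].max() - col_min
        if col_range == 0:
            # Assumed convention: a column with a single distinct value maps to 0.0
            result[col] = 0.0
        else:
            result[col] = (result[col] - col_min) / col_range
    return result

df = pd.DataFrame({
    "feature1": [10, 20, 30],
    "feature2": [5, 10, 15],
    "constant": [7, 7, 7],  # hypothetical column to exercise the zero-range case
})
normalized = min_max_normalize(df)
print(normalized)
```

Working on a copy keeps the original values available, which is useful when the same minimum and maximum must later be applied to new data.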
                
            

Resources