Welcome!
This module provides the fundamental Python knowledge you need to start working in data science and machine learning. We'll cover essential programming concepts and data handling techniques, and introduce you to powerful libraries like NumPy and Pandas.
Python Basics
Understand the building blocks of Python programming.
- Variables and Data Types (integers, floats, strings, booleans)
- Operators (arithmetic, comparison, logical)
- Control Flow (if-elif-else statements, for loops, while loops)
- Functions: defining and calling
- Error Handling (try-except blocks)
Example: A Simple Function
def greet(name):
    """This function greets the person passed in as a parameter."""
    print(f"Hello, {name}!")

greet("World")
# Output: Hello, World!
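Example: Control Flow and Error Handling
The topics listed above also include control flow and try-except blocks, which the greeting example does not touch. The following is a minimal sketch of how they might fit together; the function names `describe` and `safe_divide` are made up for illustration and are not part of the module.

```python
def describe(number):
    """Classify a number using an if-elif-else chain."""
    if number < 0:
        return "negative"
    elif number == 0:
        return "zero"
    else:
        return "positive"


def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when dividing by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Division by zero raises ZeroDivisionError; return None instead of crashing.
        return None


# A for loop visits each value in the list in turn.
labels = []
for value in [-2, 0, 5]:
    labels.append(describe(value))
print(labels)
# Output: ['negative', 'zero', 'positive']

# A while loop repeats until its condition becomes false.
countdown = 3
while countdown > 0:
    countdown -= 1
print(countdown)
# Output: 0

print(safe_divide(10, 4))
# Output: 2.5
print(safe_divide(1, 0))
# Output: None
```

Returning None on failure is only one possible design; re-raising the error or returning a default value are equally reasonable, depending on the caller.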
Core Data Structures
Efficiently organize and manipulate data.
- Lists: ordered, mutable sequences
- Tuples: ordered, immutable sequences
- Dictionaries: key-value pairs
- Sets: unordered collections of unique elements
Example: Working with a List
numbers = [10, 20, 30, 40, 50]
numbers.append(60)
print(numbers[2])
# Output: 30
print(numbers)
# Output: [10, 20, 30, 40, 50, 60]
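Example: Tuples, Dictionaries, and Sets
The list example covers only one of the four structures named above. As a rough sketch, the other three might be used as follows; the variable names and values are invented for illustration.

```python
# Tuple: an ordered, immutable sequence.
point = (3, 4)
x, y = point  # unpack the two elements into separate names
try:
    point[0] = 10  # item assignment is not allowed on a tuple
except TypeError:
    print("Tuples cannot be changed in place")
# Output: Tuples cannot be changed in place

# Dictionary: key-value pairs with lookup by key.
word_counts = {"data": 2, "science": 1}
word_counts["python"] = 3  # add a new key
word_counts["data"] += 1   # update an existing value
print(word_counts["data"])
# Output: 3

# Set: unique elements only, so the duplicate "ml" is stored once.
tags = {"ml", "python", "ml"}
print(len(tags))
# Output: 2
print("python" in tags)
# Output: True
```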
NumPy: Numerical Python
The cornerstone for numerical computation in Python.
- Introduction to NumPy arrays (ndarrays)
- Array creation and manipulation
- Vectorized operations for speed
- Array indexing and slicing
- Basic mathematical and statistical functions
Example: NumPy Array Operations
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b)
# Output: [5 7 9]
print(a * 2)
# Output: [2 4 6]
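Example: Indexing, Slicing, and Statistics
Indexing, slicing, and the basic statistical functions from the list above can be sketched on a small two-dimensional array. The array `matrix` is an arbitrary example chosen for illustration.

```python
import numpy as np

# Build a 2x3 array holding the values 1 through 6.
matrix = np.arange(1, 7).reshape(2, 3)
print(matrix)
# Output:
# [[1 2 3]
#  [4 5 6]]

# Indexing: row 1, column 2 (both zero-based).
print(matrix[1, 2])
# Output: 6

# Slicing: every row, first column only.
print(matrix[:, 0])
# Output: [1 4]

# Boolean mask: keep only the elements greater than 3.
print(matrix[matrix > 3])
# Output: [4 5 6]

# Statistics over the whole array, and sums down each column.
print(matrix.mean())
# Output: 3.5
print(matrix.sum(axis=0))
# Output: [5 7 9]
```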
Pandas: Data Manipulation and Analysis
Essential for data wrangling, cleaning, and exploration.
- Introduction to Series and DataFrames
- Reading data from various file formats (CSV, Excel)
- Data selection, filtering, and sorting
- Handling missing data (NaN)
- Data aggregation and grouping (groupby)
Example: Basic DataFrame Usage
import pandas as pd
data = {'col1': [1, 2, 3], 'col2': ['A', 'B', 'C']}
df = pd.DataFrame(data)
print(df.head())
# Output:
#    col1 col2
# 0     1    A
# 1     2    B
# 2     3    C
print(df['col1'].mean())
# Output: 2.0
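Example: Reading, Cleaning, and Grouping Data
The remaining Pandas topics (reading CSV data, filtering, sorting, missing values, and groupby) might look like the sketch below. The CSV text and the column names `region` and `units` are made up for illustration, and the data is read from an in-memory string rather than a real file so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content; the empty field in the second data row becomes NaN.
csv_text = """region,units
North,10
South,
North,30
South,20
"""
sales = pd.read_csv(io.StringIO(csv_text))

# Handling missing data: count the gaps, then fill them with 0.
missing = sales["units"].isna().sum()
print(missing)
# Output: 1
filled = sales.fillna({"units": 0})

# Filtering: keep rows where units is at least 20.
large = filled[filled["units"] >= 20]
print(len(large))
# Output: 2

# Sorting: largest values first.
ordered = filled.sort_values("units", ascending=False)
print(ordered["units"].tolist())
# Output: [30.0, 20.0, 10.0, 0.0]

# Grouping: total units per region.
totals = filled.groupby("region")["units"].sum()
print(totals["North"])
# Output: 40.0
print(totals["South"])
# Output: 20.0
```

Filling missing values with 0 is an assumption made for this sketch; dropping the rows with `dropna()` or filling with a mean may suit real data better.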
What's Next?
With these foundational skills, you're ready to dive deeper into: