About the Game & AI
This is a simple grid-world environment where an AI agent learns to navigate from a starting point to a goal. The agent can move up, down, left, or right. The objective is to reach the goal state in as few steps as possible, while avoiding pitfalls, if the environment includes them.
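To make the setup concrete, here is a minimal sketch of such an environment in Python. The grid size, start position, goal position, and reward values below are illustrative assumptions, not the game's actual configuration:

```python
class GridWorld:
    """Minimal grid-world sketch: agent starts at (0, 0), goal at the opposite corner."""

    ACTIONS = ["up", "down", "left", "right"]

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)
        self.state = (0, 0)

    def reset(self):
        self.state = (0, 0)
        return self.state

    def step(self, action):
        """Apply an action; return (next_state, reward, done)."""
        r, c = self.state
        if action == "up":
            r = max(r - 1, 0)
        elif action == "down":
            r = min(r + 1, self.size - 1)
        elif action == "left":
            c = max(c - 1, 0)
        elif action == "right":
            c = min(c + 1, self.size - 1)
        self.state = (r, c)
        if self.state == self.goal:
            return self.state, 1.0, True   # positive reward for reaching the goal
        return self.state, -0.01, False    # small step penalty encourages short paths
```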
The agent uses the Q-learning algorithm, a popular reinforcement learning method. It learns a Q-table, which maps state-action pairs to expected future rewards. Through trial and error, balancing exploration and exploitation, the agent refines its strategy to maximize its cumulative reward.
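The core of Q-learning is the update rule Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',a') − Q(s,a)]. Below is a minimal sketch of that update, using a dictionary as the Q-table; the function name and default hyperparameters are illustrative assumptions:

```python
from collections import defaultdict

# Q-table: maps (state, action) pairs to estimated future reward (defaults to 0).
Q = defaultdict(float)

def q_update(state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(next_state, a)] for a in actions)   # value of the best next action
    td_target = reward + gamma * best_next                 # discounted estimate of future reward
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
```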
Key Concepts:
- States: The current situation or position of the agent in the game environment.
- Actions: The possible moves the agent can make (e.g., Up, Down, Left, Right).
- Rewards: Feedback from the environment after an action. Positive rewards encourage desired behavior (reaching the goal), while negative rewards discourage undesired behavior (hitting a wall or pit).
- Q-Table: A lookup table storing the estimated value (Q-value) of taking a specific action in a specific state.
- Learning Rate (α): How much new information overrides old information.
- Discount Factor (γ): How much future rewards are valued compared to immediate rewards.
- Exploration vs. Exploitation (ε): The balance between trying new actions (exploration) and using the best-known action (exploitation); see the sketch after this list.
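The exploration/exploitation trade-off is commonly handled with an ε-greedy policy: with probability ε the agent picks a random action, otherwise it picks the action with the highest Q-value. A minimal sketch, assuming a Q-table like the one above (the function name and default ε are illustrative):

```python
import random

def choose_action(Q, state, actions, epsilon=0.1):
    """Epsilon-greedy selection over a Q-table mapping (state, action) -> value."""
    if random.random() < epsilon:
        return random.choice(actions)                      # explore: try a random action
    return max(actions, key=lambda a: Q[(state, a)])       # exploit: best-known action
```

Larger ε values favor exploration early in training; many implementations decay ε over time so the agent gradually shifts toward exploiting what it has learned.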