Parking Mania

An agent that automatically parks a car in an available parking spot.

Overview of the project

The goal of the project is to create an agent that can park autonomously in a video game environment. We built our own version of Parking Mania in Unity, which we chose for its ease of use and its ability to model realistic parking scenarios. Our aim is for the agent to park successfully in the required spot, using reinforcement learning with rewards and penalties for each move.

Environment

We designed the environment in the Unity editor, one of the most widely used frameworks for game development. Its extensive support for training with ML-Agents was a further reason to develop our own version of Parking Mania in Unity. The environment contains trees, walls (the periphery of the parking area), other car spots, the plane, road lights, and the minuscule objects that make up the most important part of training: our agent. The car agent is composed of front rays, back rays, wheels, a body, and a center of mass.

Approach

Reinforcement learning enables an AI agent to learn from its past experience, by trial and error, which actions to take. In our case, giving the agent a negative reward whenever it hits an obstacle eventually teaches it to avoid obstacles. Similarly, a positive reward after successful parking indicates a good move. Initially, while interacting with the environment, the agent follows a random policy, but gradually, with the help of well-defined rewards, it learns to park the car successfully without collision.
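The reward scheme described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the names `StepOutcome` and `step_reward` and the reward values of -1.0 and +1.0 are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical reward values; the project's real values are not stated.
COLLISION_PENALTY = -1.0
PARKING_REWARD = 1.0


@dataclass
class StepOutcome:
    """What happened during one move of the car (illustrative type)."""
    hit_obstacle: bool
    parked: bool


def step_reward(outcome: StepOutcome) -> float:
    """Return the reward for a single move.

    A collision is penalized, a successful park is rewarded,
    and any other move is treated as neutral in this sketch.
    """
    if outcome.hit_obstacle:
        return COLLISION_PENALTY
    if outcome.parked:
        return PARKING_REWARD
    return 0.0
```

For example, `step_reward(StepOutcome(hit_obstacle=True, parked=False))` yields the collision penalty, so repeated collisions lower the return of an episode and push the policy away from the actions that caused them.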

Algorithm Used

Given the needs of our environment, we decided to go with Proximal Policy Optimization (PPO). PPO strikes a balance between ease of implementation, sample complexity, and ease of tuning: at each step it tries to compute an update that minimizes the cost function while keeping the deviation from the previous policy relatively small. Our agent was trained by collecting a small batch of experiences from interacting with the environment and using that batch to update its decision-making policy. Once the policy has been updated with this batch, the experiences are discarded and a new batch is collected with the updated policy.
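To make the "small deviation from the previous policy" idea concrete, here is a minimal sketch of PPO's clipped surrogate objective in plain Python. It is not the training code used in this project; the function name `clipped_surrogate` and the default `epsilon=0.2` are assumptions chosen for illustration. Each ratio is the probability of an action under the new policy divided by its probability under the old policy, and each advantage estimates how much better that action was than average.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch.

    The objective is maximized during training; the cost that is
    minimized is its negative. Clipping the ratio to
    [1 - epsilon, 1 + epsilon] removes the incentive to move the
    new policy far from the old one in a single update.
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

With a ratio of 1.5 and a positive advantage, the term is capped at the value for a ratio of 1.2, so increasing the action's probability further earns nothing more from this batch.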

Demo

Team Members

Dennis Mistry

Dharmin Shah

Naiya Shah

Priyanshi Vora

Sagar Makwana

Shriya Katoch