World of Cubes

Daniel Zhang, Gurdeep Sullan, Takara Truong
CS 238 Fall 2021: Decision Making under Uncertainty

About

The goal of this project was to create and solve a 2D puzzle inspired by MIT’s modular M-blocks. The puzzle consists of robots, walls, and a portal. The agents must learn to work together, climbing over one another, to pass as many of themselves through the portal as possible.

The problem is solved using deep Q-learning with hindsight experience replay. Below is an example policy that the agents learned. The red squares are walls, and the square highlighted in white is the goal portal.
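To illustrate the idea behind hindsight experience replay, here is a minimal sketch of the relabeling step. It is not the project's implementation: the single-agent `(row, col)` state, the `Transition` tuple, the `sparse_reward` function, and the choice of the "final" relabeling strategy (treating the last state reached as if it had been the goal) are all assumptions made for this example.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical state encoding: one robot's (row, col) grid cell.
Position = Tuple[int, int]


class Transition(NamedTuple):
    state: Position
    action: int
    next_state: Position
    goal: Position
    reward: float
    done: bool


def sparse_reward(achieved: Position, goal: Position) -> float:
    """Assumed reward: 0 when the goal cell is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Return a copy of the episode whose goal is the last state reached.

    A failed episode then contains at least one rewarded transition,
    which gives the Q-network a learning signal under sparse rewards.
    """
    if not episode:
        return []
    new_goal = episode[-1].next_state
    relabeled = []
    for t in episode:
        relabeled.append(
            t._replace(
                goal=new_goal,
                reward=sparse_reward(t.next_state, new_goal),
                done=(t.next_state == new_goal),
            )
        )
    return relabeled


# Example: the robot aims for a portal at (0, 3) but stops at (0, 2).
portal = (0, 3)
episode = [
    Transition((0, 0), 1, (0, 1), portal, -1.0, False),
    Transition((0, 1), 1, (0, 2), portal, -1.0, False),
]
hindsight = relabel_with_final_goal(episode)
```

In a replay buffer, both the original and the relabeled transitions would typically be stored, so the network learns a goal-conditioned value function from failed attempts as well as successful ones.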


 