World of Cubes
Daniel Zhang, Gurdeep Sullan, Takara Truong
CS 238 Fall 2021: Decision Making under Uncertainty
About
The goal of this project was to create and solve a 2D puzzle inspired by MIT’s modular M-blocks. The puzzle consists of robots, walls, and a portal. The robots are the agents: they must learn to cooperate, climbing over one another so that as many of them as possible pass through the portal.
The problem is solved using deep Q-learning with hindsight experience replay. Below is an example of a policy the agents learned. The red squares are walls, and the square highlighted in white is the goal portal.
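To make the hindsight experience replay idea concrete, here is a minimal sketch of the relabeling step for a single robot on a grid. It is not the project's code. The `Transition` layout, the `sparse_reward` function (0 at the goal, -1 elsewhere), and the choice of the final achieved cell as the hindsight goal are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple

Cell = Tuple[int, int]  # hypothetical (row, col) grid position


class Transition(NamedTuple):
    """Hypothetical goal-conditioned transition stored in the replay buffer."""
    state: Cell
    action: int
    next_state: Cell
    goal: Cell
    reward: float
    done: bool


def sparse_reward(achieved: Cell, goal: Cell) -> float:
    """Assumed reward: 0 when the goal cell is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def her_relabel(episode: List[Transition]) -> List[Transition]:
    """Return the episode plus a copy relabeled with the final achieved cell as goal."""
    if not episode:
        return []
    # "Final" strategy: pretend the cell the robot ended in was the goal all along.
    hindsight_goal = episode[-1].next_state
    relabeled = [
        t._replace(
            goal=hindsight_goal,
            reward=sparse_reward(t.next_state, hindsight_goal),
            done=(t.next_state == hindsight_goal),
        )
        for t in episode
    ]
    return list(episode) + relabeled


# A failed episode: the robot never reaches the portal at (0, 3).
portal = (0, 3)
path = [(0, 0), (0, 1), (1, 1)]
episode = [
    Transition(s, 0, s2, portal, sparse_reward(s2, portal), s2 == portal)
    for s, s2 in zip(path, path[1:])
]
buffer = her_relabel(episode)
```

Under these assumptions, every original transition carries a reward of -1 because the portal was never reached, while the relabeled copy ends with a reward of 0. That gives the Q-network a success signal to learn from even when an episode fails.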