Coding question: implement a simple agent and environment from scratch that use Q-learning (with the Bellman equation) to teach the agent to traverse a simple grid environment from the top-left corner to the bottom-right corner.
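One possible way to approach this, as a minimal sketch rather than a reference answer: a tabular Q-learning agent that applies the Bellman update Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)) after every step. The question does not specify the grid size, the rewards, or the hyperparameters, so everything below is an assumption chosen for illustration: a 4x4 grid, a reward of -1 per step and +10 at the goal, alpha = 0.5, gamma = 0.9, epsilon = 0.1, and the hypothetical names `GridEnv`, `QAgent`, `train`, and `greedy_path`.

```python
import random


class GridEnv:
    """Hypothetical size x size grid: start at top-left, goal at bottom-right."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)
        self.state = (0, 0)

    def reset(self):
        self.state = (0, 0)
        return self.state

    def step(self, action):
        dr, dc = self.ACTIONS[action]
        # Clamp to the grid so that moving into a wall leaves the agent in place.
        row = min(max(self.state[0] + dr, 0), self.size - 1)
        col = min(max(self.state[1] + dc, 0), self.size - 1)
        self.state = (row, col)
        done = self.state == self.goal
        reward = 10.0 if done else -1.0  # assumed reward scheme
        return self.state, reward, done


class QAgent:
    """Tabular Q-learning agent with an epsilon-greedy behaviour policy."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = {}  # (state, action) -> estimated value; missing entries count as 0
        self.rng = random.Random(seed)

    def best_action(self, state):
        values = [self.q.get((state, a), 0.0) for a in range(self.n_actions)]
        return values.index(max(values))  # ties go to the lowest action index

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)  # explore
        return self.best_action(state)  # exploit

    def update(self, s, a, r, s_next, done):
        # Bellman target: immediate reward plus discounted best next value
        # (no bootstrap term once the episode has terminated).
        best_next = 0.0 if done else max(
            self.q.get((s_next, b), 0.0) for b in range(self.n_actions)
        )
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best_next - old)


def train(env, agent, episodes=2000, max_steps=200):
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = agent.act(s)
            s_next, r, done = env.step(a)
            agent.update(s, a, r, s_next, done)
            s = s_next
            if done:
                break


def greedy_path(env, agent, max_steps=50):
    """Follow the learned greedy policy from the start and record visited cells."""
    s = env.reset()
    path = [s]
    for _ in range(max_steps):
        s, _, done = env.step(agent.best_action(s))
        path.append(s)
        if done:
            break
    return path


if __name__ == "__main__":
    env = GridEnv(size=4)
    agent = QAgent(n_actions=len(GridEnv.ACTIONS))
    train(env, agent)
    print(greedy_path(env, agent))
```

After training, `greedy_path` should trace a route of (row, column) cells from (0, 0) to the bottom-right cell. The per-step penalty is one design choice for making shorter routes score higher; a reward of 0 per step with discounting alone would be another reasonable option.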