Posts related to passion projects of mine – from competitive programming to deep learning applications. Most of these will be open-sourced with a corresponding GitHub project and/or Google Colab notebook.

Monte Carlo Tree Search (Part 2): A Complete Explanation with Code
In the last post we discussed the problem of acting optimally in an episodic environment by estimating the value of a state. Monte Carlo Tree Search (MCTS) naturally fits the problem by incorporating intelligent exploration into decision-time multi-step planning. Give that post a read if you haven’t checked it out yet, but it isn’t necessary […]

Monte Carlo Tree Search (Part 1): Introduction to MDPs
Following on from the idea of learning to make an optimal single decision, we can expand this to making multiple sequential decisions in an optimal way. To do this we’ll be exploring Monte Carlo Tree Search (MCTS), an algorithm that combines ideas from traditional tree search algorithms and reinforcement learning (RL). Today we’re going to […]

Multi-Armed Bandits 3: UCB and some exploration tricks
In this post we’ll walk through some neat tricks to make greedy more effective, and then we’ll dig into a smarter way to handle exploration: upper confidence bound action selection. We’ll be building on what we learned in my last post, and as always the code can be found in this Colab notebook so you […]