Posts about my passion projects, from competitive programming to deep learning applications. Most of these will be open-sourced, with a corresponding GitHub project and/or Google Colab notebook.

  • Monte Carlo Tree Search (Part 2): A Complete Explanation with Code

    In the last post we discussed the problem of acting optimally in an episodic environment by estimating the value of a state. Monte Carlo Tree Search (MCTS) naturally fits the problem by incorporating intelligent exploration into decision-time multi-step planning. Give that post a read if you haven’t checked it out yet, but it isn’t necessary […]

    Continue reading…

  • Monte Carlo Tree Search (Part 1): Introduction to MDPs

    Following on from the idea of learning to make an optimal single decision, we can expand this to making multiple sequential decisions in an optimal way. To do this we’ll be exploring Monte Carlo Tree Search (MCTS), an algorithm that combines ideas from traditional tree search algorithms and reinforcement learning (RL). Today we’re going to […]

    Continue reading…

  • Multi-Armed Bandits 3: UCB and some exploration tricks

    In this post we’ll walk through some neat tricks to make ε-greedy more effective, and then we’ll dig into a smarter way to handle exploration: upper confidence bound action selection. We’ll be building on what we learned in my last post, and as always the code can be found in this colab notebook so you […]

    Continue reading…
