## Reinforcement learning

It’s been a while since we last covered a machine learning algorithm. Last time we discussed the Markov Decision Process (or MDP).

Today we’re going to build on the MDP and see how we can generalise it to solve more complex problems.

Reinforcement learning really hit the news back in 2013 when a computer learned how to play a bunch of old Atari games (like Breakout) just by observing the pixels on the screen. Let’s find out how this is possible!

## Markov Decision Process

Now that we know about Markov chains, let’s focus on a slightly different process: the Markov Decision Process.

This process is quite similar to a Markov chain but adds two more concepts: actions and rewards. Having rewards means that it’s possible to learn which actions yield the best rewards. This type of learning is also known as reinforcement learning.

In this post we’re going to see what exactly a Markov decision process is and how to solve it in an optimal way.
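To give a feeling of what “learning which actions yield the best rewards” can look like, here is a minimal sketch using value iteration, which is one standard way to solve a small MDP (not necessarily the approach taken in the full post). The two-state MDP, its rewards and the discount factor `GAMMA` are all made up for illustration.

```python
# Hypothetical toy MDP (not from the post): two states, two actions,
# deterministic transitions.
# TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    0: {"stay": (0, 0.0), "move": (1, 1.0)},
    1: {"stay": (1, 2.0), "move": (0, 0.0)},
}
GAMMA = 0.5  # assumed discount factor


def value_iteration(transitions, gamma, tol=1e-9):
    """Repeatedly apply the Bellman optimality update until values settle."""
    values = {state: 0.0 for state in transitions}
    while True:
        new_values = {
            state: max(
                reward + gamma * values[next_state]
                for (next_state, reward) in actions.values()
            )
            for state, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            return values


def greedy_policy(transitions, gamma, values):
    """Pick, in each state, the action with the best one-step lookahead."""
    return {
        state: max(
            actions,
            key=lambda a: actions[a][1] + gamma * values[actions[a][0]],
        )
        for state, actions in transitions.items()
    }


values = value_iteration(TRANSITIONS, GAMMA)
policy = greedy_policy(TRANSITIONS, GAMMA, values)
print(values, policy)
```

In this toy example the best behaviour is to move from state 0 to state 1 and then stay there, because staying in state 1 keeps paying the highest reward.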

## Hidden Markov Model

Last time we talked about what a Markov chain is. However, there is one big limitation:

A Markov chain implies that we can directly observe the state of the process (e.g. the number of people in the queue).

Many times we can only access an indirect representation or a noisy measure of the state of the system (e.g. we know the noisy GPS coordinates of a robot but we want to know its real position).

In this post we’re going to focus on the second point and see how to deal with HMMs. In fact, HMMs can be useful whenever we don’t have direct access to the system state. Let’s take some motivating examples first before we dig into the maths.
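As a rough illustration of the gap between the hidden state and what we observe, here is a small sketch of forward filtering: we keep a belief over the hidden state and update it with each noisy reading. The two-cell “robot position” model, the probabilities and the function name `forward_filter` are assumptions made up for this example, not taken from the full post.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the belief over hidden states after all observations.

    prior[i]          : belief in state i before the first time step
    transition[i][j]  : probability of moving from state i to state j
    emission[j][o]    : probability of reading o when the true state is j
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        # Update: weight each state by how well it explains the reading.
        unnormalised = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalised)
        belief = [u / total for u in unnormalised]
    return belief


# Hypothetical model: the robot is in cell 0 or cell 1, mostly stays put,
# and the sensor reports the right cell 80% of the time.
PRIOR = [0.5, 0.5]
TRANSITION = [[0.9, 0.1], [0.1, 0.9]]
EMISSION = [[0.8, 0.2], [0.2, 0.8]]

belief = forward_filter(PRIOR, TRANSITION, EMISSION, [0, 0])
print(belief)
```

After two readings of “cell 0” the belief leans strongly towards cell 0, even though each individual reading could be wrong.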

## Markov chain

I’ve heard the term “Markov chain” a couple of times but never had a look at what it actually is. Probably because it sounded too impressive and complicated.

It turns out it’s not as bad as it sounds. In fact the whole idea is pretty intuitive, and once you get a feel for how things work it’s much easier to get your head around the mathematics.

Andrey Markov was a Russian mathematician who studied stochastic processes (a stochastic process is a random process), especially systems that follow a sequence of linked events. Markov found interesting results about discrete processes that form a chain.

## Distributed training of neural networks

Distributed training of neural networks is something I’ve always wanted to try, but I couldn’t find much information about it. It seems most people train their models on a single machine.

In fact it makes sense, because training on a single machine is much more efficient than distributed training. Distributed training incurs additional costs (such as communication between machines) and is therefore slower than training on a single machine, so it should be reserved for cases where the neural network or the data (or both) don’t fit on a single machine.

## Nd4j – Numpy for the JVM

I have spent years programming in Java, and one thing (among others) that I found frustrating is the lack of mathematical libraries (not to mention machine learning frameworks) on the JVM.

In fact, if you’re a little interested in machine learning you’ll notice that all the cool stuff is written in C++ (for performance reasons) and most often provides a Python wrapper (because who wants to program in C++ anyway).

## TF-IDF

The idea for this blog post came after finishing the lab on TF-IDF in the edX Spark specialisation courses.

In this course the labs follow a step-by-step approach where you need to write some lines of code at every step. The lab is very detailed and easy to follow. However, I found that by focusing on a single step at a time I was missing the big picture of what’s happening overall.