Deep reinforcement learning on a traffic light system

Although reinforcement learning has a history dating back to the 1970s (think Temporal Difference Learning), the deep component, which uses large multi-layered neural networks as the underlying models, is still a relatively new addition. Pivotal research in Deep Reinforcement Learning (DRL) such as AlphaGo has spurred a growing interest in the field. And though it is not yet perfect, due to, for example, sample inefficiency, difficult reward design and generalization issues, DRL has also prompted an increasing, albeit still modest, number of real-world applications.

One of these applications is the use of DRL agents in traffic light control systems. Unlike traditional control policies, these DRL agents typically take in real-time traffic information and dynamically adjust the light’s phase program, with the aim of improving the efficiency of the system. For this project, I try to implement a similar learning agent on an existing intersection.


Area of overlapping circles in Python and JavaScript

The idea for this mini project emerged when I was watching how marks left by raindrops on a terrace incrementally turned the dry wood into a darker, wet version of itself. As more drops conquered new terrain, some also fell partially or fully within previously made marks. I wondered if I would be able to model that process, and simulate the time required to cover a given area. The method would have to take overlapping circles into account so that no area would be counted twice.

After the obligatory Google search I found an interesting question on Stack Overflow discussing possible ways to compute the combined area of overlapping circles. Some methods proposed to brute-force the solution by performing a Monte Carlo simulation: sample a number of random points in a bounding box around the circles. The area of the circles can then be approximated by taking the share of points that fall inside at least one circle and multiplying it by the area of the box. Another method proposed an exact solution, but I could not find a fleshed-out implementation of it.
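To make the Monte Carlo idea concrete, here is a minimal sketch in Python. It is not the code from the Stack Overflow answers. The function name `union_area_monte_carlo`, the `(x, y, r)` tuple layout, the default sample count and the fixed seed are all assumptions made for illustration.

```python
import random


def union_area_monte_carlo(circles, n_points=100_000, seed=0):
    """Estimate the combined area of possibly overlapping circles.

    circles: list of (x, y, r) tuples (hypothetical layout for this sketch).
    """
    if not circles:
        return 0.0

    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate

    # Axis-aligned bounding box that encloses every circle.
    x_min = min(x - r for x, y, r in circles)
    x_max = max(x + r for x, y, r in circles)
    y_min = min(y - r for x, y, r in circles)
    y_max = max(y + r for x, y, r in circles)
    box_area = (x_max - x_min) * (y_max - y_min)

    hits = 0
    for _ in range(n_points):
        px = rng.uniform(x_min, x_max)
        py = rng.uniform(y_min, y_max)
        # A point counts once, no matter how many circles contain it,
        # so overlapping regions are never counted twice.
        if any((px - x) ** 2 + (py - y) ** 2 <= r * r for x, y, r in circles):
            hits += 1

    # Share of points inside the union, scaled by the area of the box.
    return box_area * hits / n_points
```

Calling `union_area_monte_carlo([(0, 0, 1), (1, 0, 1)])` gives an estimate for two overlapping unit circles. The estimate is noisy, and its error shrinks only with the square root of `n_points`, which is one reason an exact method is attractive.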
