For the 2012 GisCup, me and Sérgio Freitas submitted some code to test some Geospatial methods.
The challenge goal was to match a set of points (P) from a GPS log to a set of Edges (E). In this contest, both the accuracy (points matched correctly) and the efficiency (time taken to complete) were evaluated.
Process
According to our method, for matching a point P to a edge E from the road network set, the ranking criteria used where the following:
- C1 – the distance from the point P to each edge
- C2 – the movement direction similarity between the point and the edge. Something like the the Fréchet distance, but much simplified. Using this algorithm would be overkill for the performance.
- C3 – the topological match between point and an edge in the network context. If the Edge in study is connected (or is the same) that the edge assigned to the previous and following points, this criteria has more value assigned to. In the literature, some have tried Dijkstra for short path, but for this point density, this approach would not be a plus.
The final formula was C1k1+ C2k2 + C3*k3, where k1+k2+k3=1.0 Each value for k was found by Monte Carlo Simulation (it is a fancy name for randomized automated trials).
The motorway trunks where particularly difficult
The wrong points (red) were near the intersections, where the proximity and angle are fuzzy.
Eficiency
To speed up the process, some methods were tested:
- R-Trees for storing all the Edges. This increased the access time some orders of magnitude.
- Limit of number of edges to test, with the threshold in the 60 meters of low quality GPS error in city environments.
- Load everything to memory. With the data supplied, reached almost 1 GB. In a real-world scenario with more data, this approach would be unfeasible.
The code can be accessed in GitHub. According to the email received from the organizers, we got the 7th place out of 31 :-)
edited in 2014