Monday, December 10, 2007

Today, with mixed feelings, we launched the new Atascadero Crime Report. On one hand, it’s great that we’ve launched it: the database that powers it is robust, and it seems to be working well. On the other hand, while we managed to include maps for the San Luis Obispo and Paso Robles police departments, we were unable to do so for Atascadero.

In order to explain this catastrophic and seemingly unforgivable failure, I need to explain how the systems work. First off, while the results appear almost identical, the code powering the SLO crime map is completely different from the code powering the Paso one. That’s because the two departments use different computer systems: we manually retrieve a text file from the SLO PD, and we receive a PDF file from the Paso PD. The Atascadero system works the same way as Paso’s, so I’ll concentrate on that system for now.

When we receive the PDF from the PD, it is run through a script that converts the PDF into raw text. That raw text is parsed, and the data is inserted into a database. Next, the system attempts to geocode the addresses through either Google’s or Yahoo’s geocoding tools. Any address that can’t be found is flagged by the system so that someone on the online team can manually plot the point. Manually plotted points are then added to another table in the database that contains points of interest in that city; those points are cross-referenced the next time an address is not found, so that we don’t need to re-plot, say, “Paso Robles High School” every time it occurs in the police report.
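
For the code-minded, here is a rough Python sketch of the geocode-then-fall-back flow described above. It is an illustration only, not our production code: the function names (`locate`, `process_incidents`, `record_manual_plot`), the dictionary standing in for the points-of-interest table, and the idea of passing each geocoding service in as a plain function are all assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical types: a point is (latitude, longitude); a geocoder is any
# function that returns a point for an address, or None if it has no match.
Coordinate = tuple[float, float]
Geocoder = Callable[[str], Optional[Coordinate]]


def normalize(address: str) -> str:
    """Collapse case and extra whitespace so repeat lookups match."""
    return " ".join(address.lower().split())


def locate(address, city, geocoders, points_of_interest):
    """Return a point for the address, or None if it needs manual plotting."""
    # Try each geocoding service in turn (stand-ins for Google and Yahoo).
    for geocode in geocoders:
        point = geocode(address)
        if point is not None:
            return point
    # Not found: cross-reference points already plotted by hand for this city.
    return points_of_interest.get((city, normalize(address)))


def process_incidents(incidents, city, geocoders, points_of_interest):
    """Split incidents into those we can map and those flagged for a human."""
    mapped, flagged = [], []
    for incident in incidents:
        point = locate(incident["address"], city, geocoders, points_of_interest)
        if point is None:
            flagged.append(incident)
        else:
            mapped.append({**incident, "point": point})
    return mapped, flagged


def record_manual_plot(address, city, point, points_of_interest):
    """Remember a hand-plotted point so it is found automatically next time."""
    points_of_interest[(city, normalize(address))] = point
```

In this sketch, once someone calls `record_manual_plot` for a flagged address, the next run of `process_incidents` picks that point up from the points-of-interest lookup instead of flagging the incident again.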

Still with me? Fantastic. So the problem we had with Atascadero is that because of the new housing developments in the city, many of the addresses that need to be geocoded simply don’t exist in either of the geocoding tools. In addition, the police department uses business names as addresses more often than the other departments. The last time I checked, in one weekend’s worth of data, I had 40 incidents to manually plot, and many of those I simply couldn’t find (no matter how hard I look, “next to Karen’s office” does not jump out of the map at me).

I don’t blame the police department for this; they don’t control the rate at which new addresses are added to Google’s and Yahoo’s maps. They use business names because that’s what works for them. I’d also like to thank all of the police departments we’ve worked with for helping us get our maps launched.

I’ll continue to work on getting a map feature working for Atascadero, but it may need to wait until we have up-to-date geocoding tools.

-D

2 comments:

Anonymous said...

Thanks for explaining that in language the average human can understand. =)

Anonymous said...

Danny --

Why not geocode what you can, and find a way of listing the incidents that couldn't be mapped?

Something is better than nothing, as long as the raw details are available somehow.

Signed,
Eagerly Waiting in Atrashadero
