Converting Lat/Lon to Zip Code
I noticed a question on the Analytics X Prize forum about how to determine the zip code for homicides with latitude and longitude values. While there are a plethora of online tools (Google Maps, etc) that will do this for you, I thought I'd describe a simple way to do it using PostgreSQL/PostGIS as it illustrates one aspect of the multitude of open source tools that aid in spatial analysis. Also, the described method can be easily automated in combination with a shell script and some db insert triggers.
First, I retrieved the incident data from the resource described in an earlier post. After some awk and sed wrangling, I got the data into a format where it could be imported into a PostgreSQL table with the following structure:
philly=# d incidents
Table "public.incidents"
Column | Type | Modifiers
-----------+--------------------------+-----------
id | bigint |
date | timestamp with time zone |
geom | geometry |
zip | integer |
Indexes:
"inc_gist_idx" gist (geom)
Notice that there is a geometry column which specified the geocoded location of the homicide. The data came down projected using SRID 26918 - UTM Zone 18. I had to reproject the zip code geometries as they came as unprojected lat/lon. The zip code table (which I retrieved from the source listed in an earlier post) had the following structure:
philly=# d philly
Table "public.philly"
Column | Type | Modifiers
------------+-----------------------+-----------
gid | integer |
area | numeric |
perimeter | numeric |
zt42_d00_ | bigint |
zt42_d00_i | bigint |
zcta | character varying(5) |
name | character varying(90) |
lsad | character varying(2) |
lsad_trans | character varying(50) |
the_geom | geometry |
Indexes:
"philly_gist_idx" gist (the_geom)
Now, the zip code column of the incidents table is empty. I used the following select statement to populate the zip code column with the proper zip code which it falls in:
update incidents set zip=
(select cast(name as integer) from philly
where contains(transform(the_geom,26918),
geom));
The statement is selecting the zip code name from the zip code table where the incident point falls within the zip code polygon.