Thursday, April 30, 2009

SQL Server 2008 Spatial Data And Spatial Search

SQL Server 2008 offers a great support for spatial data and spatial search. Spatial data, or geo-spatial data, means locations on the Earth. Microsoft introduces two new data types in SQL Server 2008 to present location data:

Geometry: Geometry data are based on a flat Earth model, and it doesn’t account for the Earth’s curvature.

Geography: Geography data are Points, lines, polygons, or collection of these in latitude and longitude coordinates using a round Earth model.

What’s good about SQL Server 2008 spatial data? The location information can be stored as native Geography data and indexed, so that the location-based search can be efficiently done by SQL Server 2008. I examined such spatial search on a test database with 1-million rows of test data. The result is quite impressive.

Red Gate SQL Data Generator (V1.028) is used to create 1-million test data. Following script is to mock valid latitude and longitude data inside U.S.:

-- UDF Can't use RAND directly. Using View to cheat it
CREATE VIEW vRandNumber AS SELECT RAND() as RandNumber
GO
-- UDF function returns a float number in a given range
CREATE FUNCTION RandFloat(@Min float, @Max float)
RETURNS float AS
BEGIN
RETURN @Min + (select RandNumber from vRandNumber) * (@Max-@Min)
END
GO
-- Update Address with U.S. geo codes
UPDATE Address
SET Address.Latitude = dbo.RandFloat(32, 48),
Address.Longitude = dbo.RandFloat(-124, -72)

With valid Latitude and Longitude data, now we can create a Gepgraphy column in the database:
ALTER TABLE Address ADD GeoPoint Geography NULL
GO
UPDATE Address SET GeoPoint = geography::Point(Latitude, Longitude, 4326)
GO
Querying on this un-indexed Geopgraphy column is very slow. Search locations within a 10-KM radius takes more than 5 minutes:
SET @point = geography::Point(47.32, -122.18, 4326) -—Seattle
SELECT * FROM Address WHERE GeoPoint.STDistance(@point) < 10 * 1000
Result: return 85 rows in 5 minutes 48 seconds

Once the Spatial index is created, the search is much faster than before and the same search only takes 1 second:
CREATE SPATIAL INDEX SIX_Address_GeoGraphy ON Address(GeoPoint);
GO
SELECT * FROM Address WHERE GeoPoint.STDistance(@point) < 10 * 1000
GO
Return 85 rows in 1 second
The test was conducted inside a virtual machine with Intel 2.2G Core2Duo CPU and 2G of RAM.