## Putting Geospatial on the Map Part I

Extensibility
Extensibility covers the mechanisms by which you, as the user or developer, can extend the functionality of the Teradata Database, for example with the use of User Defined Functions, or UDFs.

## Putting Geospatial on the Map Part I

In my previous article we explored Teradata's spatial features using geometric shapes on a 10x10 grid.  I hope this was a useful introduction into the new ST_Geometry data type and its powerful set of functions.  And now that we've covered the basics, let's take a step into the world of geospatial.  By geospatial, we're now talking about locations based on geographic reference points on Earth; for example longitude and latitude.   (This can and does get more complicated, but we'll try to keep this as light as possible).

So just like we did in the previous article, we'll start with an example using ST_Geometry for point data. Here's a table to store city locations as points on a map.

`CREATE TABLE cities(    id INTEGER NOT NULL PRIMARY KEY,    name VARCHAR(50),    location ST_GEOMETRY);`

This table includes an ID column, a name column and an ST_Geometry column for storing the location. This of course is the field that we'll be using to flex Teradata's geospatial powers.

`INSERT INTO cities VALUES (1, 'New York','POINT( -73.960257 40.768823)');INSERT INTO cities VALUES (2, 'San Francisco', 'POINT(-122.420782 37.795219 )');INSERT INTO cities VALUES (3, 'San Diego', 'POINT(-117.074830 32.995484 )');INSERT INTO cities VALUES (4, 'Los Angeles', 'POINT(-118.243 34.05219 )');INSERT INTO cities VALUES (5, 'Atlanta','POINT( -84.3897 33.75449)');INSERT INTO cities VALUES (6, 'Dallas','POINT(-96.7971 32.78108 )');`

The first difference you should notice is the value of the X,Y coordinates. We're using geospatial references; longitude and latitude.

Longitude are the verticle lines on a map, representing the X coordinate.  And latitude are the horizontal lines, representing the Y coordinate.

OK, so far nothing too complex here.  Let's try our distance query just like we did with our simply geometry example.

`SELECT C1.name, C2.name, C1.location.ST_Distance(C2.location) AS DistanceFROM cities C1, cities C2WHERE C1.id != C2.idORDER BY 1,3;`

Results:

` name   name   Distance Atlanta   New York  12.57 Atlanta   San Diego  32.69 Atlanta   Los Angeles  33.85 Atlanta   San Francisco  38.25 Los Angeles  San Diego  1.58 Los Angeles  San Francisco  5.61 Los Angeles  Atlanta   33.85 Los Angeles  New York  44.79 New York  Atlanta   12.57 New York  San Diego  43.81 New York  Los Angeles  44.79 New York  San Francisco  48.55 San Diego  Los Angeles  1.58 San Diego  San Francisco  7.18 San Diego  Atlanta   32.69 San Diego  New York  43.81 San Francisco  Los Angeles  5.61 San Francisco  San Diego  7.18 San Francisco  Atlanta   38.25 San Francisco  New York  48.55`

Hmm, what's going on here? Atlanta to New York is 12.57?  What units are those - they can't be miles or kilometers?

What do these numbers mean?

OK, let's back up a little and talk through this.  We've entered our coordinates as longitude and latitude, right.  So what should expect from the distance calculation.  To Teradata these are arbitrary units.  We haven't done anything to tell the database that these now represent geospatial coordinates (long/lat).  We created the table and added data just the same way we did for our geometry shapes, so how is the database to know anything differently?

The answer to this problem lies with an ST_Geometry attribute called SRS - Spatial Reference Systems.  Until we do something to indicate that we are using a geographic SRS, all the ST_Distance function is going to do is assume that these are simple units.  And in that simple world, yes, Atlanta and New York are only 12.57 units apart.

This is where the complexity comes in; converting longitude and latitude calculations into units like miles or meters.  Not nearly as easy as it sounds.  A quick web search brings us information like "on Earth, the length of an arcdegree of north-south latitude difference is about 60 nautical miles, 111 kilometres or 69 statute miles at any latitude."

Let's try this simple rule of thumb value and try our calculation using each unit of latitude as 69 miles.  If we take our Atlanta to New York value of 12.57 and multiply by 69 we then have 867.33 miles.  That seems more reasonable.  A web search confirms that the driving distance at about 850 miles.   But wait, another web search has the directly flying distance between these cities at 748 miles.  That's the one we should be using to compare with our database results.  Why then is our calculation off by over a hundred miles?   Well, let's go back to latitude and longitude.

Latitudes are parallel lines running horizontal across the globe.   They are always the same distance apart (69 miles).  Longitude lines, though, are a much different story.  At the equator they too are roughly 69 miles apart, but as they reach the North and South poles, the converge and the distance between them goes to zero.  If we are going to accurately calculate distances, we're going to need to account for these narrowing longitude uints (Atlanta and New York are separated by almost 11 degrees longitude).  Doing this will involve telling Teradata about our coordinates when we ask it to perform any calculations.  The method we'll use for this is the ST_Transform() method.   ST_Transform allows us to convert ST_Geometry data between different SRS units.  In our example, we need to transform our coordinates into geodetic units so that the ST_Distance() calculation can accurately account for the narrowing longitude lines.

This topic does start to get more and more complex, so let's skip ahead and take a look at the answer to this query.  That should make it a little easier to work our way backwards to explain how we create this query.

`SELECT C1.name, C2.name,C1.location.ST_Transform(SRS_1.srtext,SRS_2.srtext).ST_Distance(C2.location.ST_Transform(SRS_1.srtext,SRS_2.srtext))*.6/1000 as MilesFROM cities C1 ,cities C2,     sysspatial.SPATIAL_REF_SYS SRS_1,     sysspatial.SPATIAL_REF_SYS SRS_2WHERE C1.id != C2.idAND   SRS_1.AUTH_SRID = 32616 -- UTM 16 / WGS84AND   SRS_2.AUTH_SRID = 4326 -- WGS84ORDER BY 1,3;`

Results:

` name   name   Miles Atlanta   New York  728.80 Atlanta   San Diego  1872.56 Atlanta   Los Angeles  1929.26 Atlanta   San Francisco  2142.69 Los Angeles  San Diego  105.88 Los Angeles  San Francisco  375.60 Los Angeles  Atlanta   1929.26 Los Angeles  New York  2430.39 New York  Atlanta   728.80 New York  San Diego  2396.82 New York  Los Angeles  2430.39 New York  San Francisco  2560.22 San Diego  Los Angeles  105.88 San Diego  San Francisco  481.32 San Diego  Atlanta   1872.56 San Diego  New York  2396.82 San Francisco  Los Angeles  375.60 San Francisco  San Diego  481.32 San Francisco  Atlanta   2142.69 San Francisco  New York  2560.22`

First let's look at the new distance calculation between Atlanta and New York - 728.80 miles. Yeah, that looks like the right answer.  And another check on the San Diego to New York distance of 2396.82 checks out as well. Looks like we have the right query.   Now let's take a closer look to understand what we did to get these results.

We can see that our select statement now includes two calls to ST_Transform() for the distance calculation.   That makes sense; we're converting both the C1.location and the C2.location for this new calculation.   We also see a unit conversion factor here, "*.6/1000". That's a pretty standard way of converting from meters to miles, so apparently ST_Distance() is returning meters.   All right, we're good to go, right?

Well, before we accept this query as our general solution to geospatial calculations, let's dig a little deeper into these two SRS transformations.  While we do seem to have a query that is working, we need to make sure we have a solution that will work for all locations, not just these US coordinates.  Looking back at the SQL and those commented lines of the query, we should be asking "What is UTM 16 and WGS84".

WGS stands for World Geodetic System and is a standard for use in mapping the Earth's spheroidal surface.  The latest revision is WGS 84, which is also the reference coordinate system used by Global Positioning System (GPS) devices.  Geodetic references are needed because the earth is not a perfect sphere.

UTM, or Universal Transverse Mercator, is a modern mapping coordinate system developed by the US Army Corps of Engineers in the 1940s.  UTM is a grid-based coordinate system that differs from traditional longitude and latitude in that it divides the world into a series of sixty zones.  This allows distance calculations to be done more easily with less distortion than the trigonometric formulas needed for longitude/latitude.   These calculations were particularly important during WWII and it was this work during the war that led to adoption of UTM in the post-war years. See http://egsc.usgs.gov/isb/pubs/factsheets/fs07701.html for some background information.

So back to our query.  What we've done here is taken the location data for our cities, which are UTM based locations, and converted them to geodetic values for more accurate distance calculations.  But the key here is to 'know your data'.  This query works here for US data.  Other regions will be using different coordinate reference systems for their locations.   Teradata does provide over a thousand SRS references to be used by ST_Transform().  Take a look at the table sysspatial.SPATIAL_REF_SYS table:

`SELECT AUTH_SRID, SRTEXT FROM sysspatial.SPATIAL_REF_SYS`

This returns 1338 rows from my database.  Finding the right one for your data may take some research and trial and error.

## Summary

In these past two articles we've had a hands-on tour of simple spatial objects and the powerful functional capabilities of Teradata and a good introduction to the concepts of geospatial data and the considerations required for using our calculations.  In the next articles, we'll continue working with geospatial data and methods, learn how to import external spatial reference data and finally how to visualize our data by creating viewable maps.  Stay tuned - this is great stuff!

Tags (1)
2 REPLIES
Fan

## Re: Putting Geospatial on the Map Part I

a question here,

I want to calcuate the distance for longitude east   121.516 latitude north 25.041 (Taiwan/Asia)

so, I insert one additional row into table you mentioned, and try to query the result.

But it always shows ST_Transform: Error in call to transform...

Some could help me looking this problem? I doubt it maybe need to change the number of

"SRS_1.AUTH_SRID = 32616 -- UTM 16 / WGS84  or  SRS_2.AUTH_SRID = 4326 -- WGS84 "

121.516194

Enthusiast

## Re: Putting Geospatial on the Map Part I

Hi

Taiwan is not on the same UTM 16, it is different,

### WGS 84 / UTM zone 51N

Did you figure this out? If not do you still need any help? Please post your SQL details for any help/error.

Thanks!