

What's wrong with my footprintWKT?

Long, long ago, the elder gods of GIS (Geographic Information System) judged that there were only three fundamental shapes you needed for digital mapping on a plane, and that they were all based on points:

wkt1

A point is a pair of x,y coordinates. A polyline is a set of points in a particular order, so that the GIS program knows how to join the points with lines, one point after the other. A polygon is just a polyline with the last point joined to the first one.

Much later, a simple text format was devised by the Open Geospatial Consortium to describe points, polylines (here called "linestrings"), polygons and some other geometric shapes. The format is called Well-Known Text (WKT). Here are some WKT examples, with longitude and latitude for x and y:

wkt2

POINT (12.559220 55.702230)

wkt3

LINESTRING (12.559245 55.702060,12.559479 55.702275,12.559272 55.702346,12.559031 55.702122)

wkt4

POLYGON ((12.559245 55.702060,12.559479 55.702275,12.559272 55.702346,12.559031 55.702122,12.559245 55.702060))

Plotted with the OpenStreetMap WKT Playground by developer Clyde D'Cruz.

The WKT format is so simple that it's hard to get wrong. Each point is longitude first, latitude second, with a space between the two numbers and a comma between points. Parentheses go around the coordinates of a point or linestring; a polygon gets a second, inner pair of parentheses around each ring. The last point in a polygon ring is the same as the first point.
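Those rules are few enough to write down as a short Python sketch. The function names here (point_wkt, linestring_wkt, polygon_wkt) are made up for this illustration and don't come from any library, and the polygon helper only handles a single ring with no holes.

```python
def _coords(points):
    """Join (longitude, latitude) pairs: space inside a pair, comma between pairs."""
    return ",".join(f"{lon} {lat}" for lon, lat in points)


def point_wkt(lon, lat):
    """One pair of parentheses around the single coordinate pair."""
    return f"POINT ({lon} {lat})"


def linestring_wkt(points):
    """One pair of parentheses around the ordered list of points."""
    return f"LINESTRING ({_coords(points)})"


def polygon_wkt(points):
    """Outer parentheses for the polygon, inner parentheses for its one ring."""
    ring = list(points)
    if ring[0] != ring[-1]:
        # Close the ring: the last point must repeat the first.
        ring.append(ring[0])
    return f"POLYGON (({_coords(ring)}))"
```

For example, polygon_wkt([(10, 20), (11, 20), (11, 21), (10, 21)]) returns POLYGON ((10 20,11 20,11 21,10 21,10 20)). Numbers are written with Python's default formatting, so trailing zeros in decimal coordinates are not preserved.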

In the Darwin Core system for recording biological data, there is a category called footprintWKT. Biologists can use it to describe a point where they made an observation or collection, a transect (linestring) or a sampling area (polygon). They can also describe the location as a single point ± the radius of a circle, where the circle surrounds the point and includes the whole of the transect or sampling area. Some biologists include both the point-radius and WKT data in a single record.

Darwin Core data are harvested and processed by the Global Biodiversity Information Facility (GBIF). Processing by GBIF also includes validation checks on data items, and records are flagged if GBIF finds a problem.

For some time now, GBIF has been flagging "Footprint WKT invalid" for many thousands of footprintWKT entries that are clearly valid, and that display perfectly well in a GIS or other spatial data viewer. Below are just two examples:

wkt5

From here, 2021-11-13

wkt6

From here, 2021-11-13

The reason for declaring these WKT shapes invalid isn't clear. GBIF's documentation for the flag only says "The Footprint Well-Known-Text given could not be interpreted".

For some polygon WKT, GBIF may be requiring that the point order be anticlockwise, because the WKT specification says that the drawing order for nested polygons should be anticlockwise for an outer polygon and clockwise for any inner polygon(s). However, non-nested, clockwise-ordered polygons will plot correctly in a GIS. In this text file, Item1 is an anticlockwise polygon and Item2 is a clockwise one:

Class;WKT
Item1;POLYGON ((10 20,11 20,11 21,10 21,10 20))
Item2;POLYGON ((12 21,12 22,13 22,13 21,12 21))

Here are the two polygons in QGIS on an outline map of Africa with 5° lines:

wkt7
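The point order of Item1 and Item2 can also be checked without a GIS, using the sign of the shoelace-formula area. This is only a sketch: the function names are invented for illustration, the parser handles only a simple one-ring POLYGON string, longitude and latitude are treated as plane x and y, and none of it is meant to represent how GBIF checks WKT.

```python
import re


def ring_points(polygon_wkt):
    """Extract the outer ring of a simple 'POLYGON ((x y,x y,...))' string."""
    inner = re.search(r"\(\(([^()]*)\)", polygon_wkt).group(1)
    return [tuple(float(n) for n in pair.split()) for pair in inner.split(",")]


def signed_area(points):
    """Shoelace formula on a closed ring: positive if anticlockwise, negative if clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def orientation(polygon_wkt):
    """Name the drawing order of the ring (a zero-area ring is reported as clockwise)."""
    area = signed_area(ring_points(polygon_wkt))
    return "anticlockwise" if area > 0 else "clockwise"


item1 = "POLYGON ((10 20,11 20,11 21,10 21,10 20))"
item2 = "POLYGON ((12 21,12 22,13 22,13 21,12 21))"
print(orientation(item1))  # anticlockwise
print(orientation(item2))  # clockwise
```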

Another possibility is that a footprintWKT entry might be flagged by GBIF as invalid if there is no corresponding footprintSRS entry. That entry would specify the coordinate reference system used by the WKT. However, if the recorder has specified the geodeticDatum as WGS84 (as in the examples shown), it's not obvious that a footprintSRS entry would be needed as well, and GBIF has not documented a requirement for footprintSRS.
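If the missing-footprintSRS explanation is the right one, the affected records would be easy to pick out. The sketch below assumes each record is a plain Python dictionary keyed by Darwin Core term names. The function name and the sample records are invented for illustration, and nothing here reflects GBIF's actual validation code.

```python
def wkt_without_srs(records):
    """Return records with a non-empty footprintWKT but an empty or missing footprintSRS."""
    return [
        rec for rec in records
        if (rec.get("footprintWKT") or "").strip()
        and not (rec.get("footprintSRS") or "").strip()
    ]


# Hypothetical sample records, not taken from any real dataset.
records = [
    {"occurrenceID": "a", "footprintWKT": "POINT (10 20)", "geodeticDatum": "WGS84"},
    {"occurrenceID": "b", "footprintWKT": "POINT (11 21)", "footprintSRS": "EPSG:4326"},
    {"occurrenceID": "c", "geodeticDatum": "WGS84"},
]

for rec in wkt_without_srs(records):
    print(rec["occurrenceID"])  # a
```

Comparing the flagged and unflagged records in a real dataset against a list like this would show whether the flag tracks the absence of footprintSRS.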

GBIF noted the "invalid WKT" problem on an issues page on GitHub in April 2021 and is apparently working to fix it. In the meantime, the answer to the title question "What's wrong with my footprintWKT?" is: possibly nothing, especially if your WKT was generated in a GIS.


Last update: 2021-11-17
The blog posts on this website are licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License