In last month’s edition of the Geospatial Frequently Asked Question (G-FAQ), we started our two-part discussion on spatial topology in Esri’s ArcGIS. Specifically, we looked at the basics with an exploration of arc-node topology, the types of topology that can be enforced and then the file types, i.e. ArcINFO coverages and geodatabases, that can maintain it. In the July edition of the G-FAQ, we wrap up this discussion with a focus on why spatial topology was created as well as the ‘topology’ of shapefiles.
As a quick reminder, this two-part G-FAQ series on spatial topology focuses on three core questions:
- What is spatial topology, why is it important and how is it maintained in GIS?
- What are example file types that are topological in Esri’s ArcGIS?
- Do shapefiles have topology?
In last month’s G-FAQ, we learned that spatial topology is predicated on an arc-node data structure. Vector files have three basic formats, i.e. points, lines and polygons, which are constructed with nodes, arcs and vertices. The more complex a shape is, the more nodes, arcs and vertices it will have. The spatial topology relationships between the various arcs and nodes in a single vector file are stored in ‘lists’ that ArcGIS can access and interpret. In the case of ArcINFO coverage files, these lists are explicit tables, while in the other most popular format with explicit topology, the geodatabase, the relationships are stored as a set of rules.
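To make the arc-node idea concrete, here is a minimal sketch in Python of what such ‘lists’ might look like for two square parcels, A and B, that share one edge. It is an illustration only and not the actual layout of an ArcINFO coverage: the class name `Arc`, the field names, the `OUTSIDE` label and the two helper functions are all assumptions made for this example.

```python
from dataclasses import dataclass

OUTSIDE = "outside"  # hypothetical label for the area beyond the layer


@dataclass(frozen=True)
class Arc:
    arc_id: int
    from_node: int
    to_node: int
    vertices: tuple   # intermediate (x, y) points between the two nodes
    left_poly: str    # polygon on the left when walking from_node -> to_node
    right_poly: str   # polygon on the right


# Node table: node id -> (x, y). Unit squares A and B share the edge x = 1.
NODES = {1: (1.0, 0.0), 2: (1.0, 1.0)}

# Arc table: the shared edge (arc 1) is stored once and names both polygons.
ARCS = [
    Arc(1, 1, 2, (), "A", "B"),
    Arc(2, 2, 1, ((0.0, 1.0), (0.0, 0.0)), "A", OUTSIDE),
    Arc(3, 1, 2, ((2.0, 0.0), (2.0, 1.0)), "B", OUTSIDE),
]


def arcs_of(polygon, arcs):
    """Return the ids of the arcs that bound the given polygon."""
    return [a.arc_id for a in arcs if polygon in (a.left_poly, a.right_poly)]


def neighbors(polygon, arcs):
    """Return the polygons across any arc, using only the arc table."""
    found = set()
    for arc in arcs:
        if arc.left_poly == polygon:
            found.add(arc.right_poly)
        elif arc.right_poly == polygon:
            found.add(arc.left_poly)
    found.discard(OUTSIDE)
    return found
```

Note that `neighbors("A", ARCS)` never looks at a coordinate: adjacency is read straight from the left and right polygon columns, which is the kind of lookup the topological tables were meant to make cheap.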
Why Was Spatial Topology Created?
Spatial topology is a way to define the geographic relationships between various vector layers in an alternate (and complementary) fashion to coordinates, such as latitude and longitude. From the research I completed for this G-FAQ series, it appears that topology has been a part of commercially available GIS software from the start. As such, you might ask yourself why this is the case, so let’s try to answer that question.
In order to answer why topology was created, we need to put ourselves in the proper mindset of 1980s computer technology. These were the days of huge desktop computers that ran awfully slowly compared to today’s machines. We also had those relics of the past called floppy disks, and a 128-megabyte hard drive was massive! Okay, so now that you remember the good old days of computer technology, spatial topology might make more sense.
From my research, it appears that there were three reasons for topology. First, enforcing spatial topology rules made sure that the vector files someone was creating were correct and adhered to the geographic relationships they were supposed to have. For example, one common use of GIS in the 1980s was digitizing parcel information from paper maps. Parcels typically connect to their neighbors’ properties, so there should not be gaps, holes, slivers, etc. between adjacent ones; spatial topology assured this layer was created correctly as it was being digitized. In today’s versions of ArcGIS, topology can be maintained while editing and creating shapefiles without it being explicitly stored. Second, spatial topology reduced file sizes as, for example, common arcs and nodes only had to be stored once in the topology relationship tables; however, file sizes were not reduced as much as you would expect given that these tables had to be stored as well. Finally, topological tables made it quicker and easier to find vectors that adhered to the geographic relationship in question, for example adjacent polygons, when computers had far less processing power.
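The storage trade-off in the second reason can be illustrated with a toy count. The sketch below, in Python, builds a hypothetical n-by-n grid of square parcels and compares how many coordinate pairs are stored when every parcel keeps its own ring against how many are needed when shared points are stored once, along with the number of arc rows a topology table would then have to hold. The function names and the grid itself are assumptions for illustration; real coverages store more than this.

```python
def grid_rings(n):
    """Build closed, clockwise rings for an n-by-n grid of unit parcels."""
    rings = []
    for col in range(n):
        for row in range(n):
            x, y = col, row
            rings.append(
                [(x, y), (x, y + 1), (x + 1, y + 1), (x + 1, y), (x, y)]
            )
    return rings


def storage_summary(rings):
    """Count stored coordinate pairs with and without sharing."""
    # Non-topological: every ring repeats the points it shares with neighbors.
    per_feature = sum(len(ring) for ring in rings)
    # Shared storage: each distinct point is kept once.
    unique_points = {point for ring in rings for point in ring}
    # Each distinct edge needs one row (from, to, left, right) in an arc table.
    unique_edges = {
        frozenset((a, b)) for ring in rings for a, b in zip(ring, ring[1:])
    }
    return {
        "per_feature_pairs": per_feature,
        "shared_pairs": len(unique_points),
        "arc_table_rows": len(unique_edges),
    }
```

For a 10-by-10 grid, `storage_summary(grid_rings(10))` reports 500 coordinate pairs stored per feature against 121 when points are shared, but also 220 arc rows that must be kept alongside them, which is why the savings were smaller than you might expect.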
While spatial topology was created for the distinct reasons stated above, there are also two obvious drawbacks. First, it adds computational costs to creating and editing vectors, which was an issue with 1980s computer technology but is not a major issue today. Second, it forces users to create ‘clean’ data layers, which can be an advantage or a disadvantage, for example when a small fraction of the vectors you are digitizing actually might violate the overall topological rules. By clean, I mean that not only do vectors need to adhere to the spatial topology of the layer, they also need to be created with arcs that have start and end nodes and with closed polygons drawn in a clockwise fashion.
The ‘Topology’ of Shapefiles
Now that we have examined spatial topology, let’s spend the last part of this G-FAQ looking at the most common vector file format, the shapefile. As has been intimated previously, shapefiles do not store topology explicitly. Further, there is no arc-node topology with shapefiles. The basic structure of a shapefile is based around vertices and lines. There is an expected order to the vertices of a polygon (clockwise around its outer boundary), but you can still create valid shapefiles with vertices ordered counter-clockwise. The coordinates (for example latitude and longitude) as well as the order of the vertices are stored in the .SHP file of a shapefile. In the case of a line shapefile, there is also a list of the lines and the vertices that make up each one; and in the case of a polygon shapefile, there is a list of the vertices that comprise each polygon.
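Since a polygon in a shapefile is essentially an ordered list of vertices, its direction can be worked out from the coordinates alone. The sketch below, in Python, uses the standard shoelace formula to test whether a ring is closed and whether it runs clockwise. It operates on plain lists of (x, y) tuples rather than on a real .SHP file, and the function names are assumptions chosen for this example.

```python
def is_closed(ring):
    """A ring is closed when it ends on its first vertex."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise, negative for clockwise.

    Assumes a closed ring and an x-right, y-up coordinate system.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_clockwise(ring):
    """True when the vertices run clockwise."""
    return signed_area(ring) < 0


def as_clockwise(ring):
    """Return the ring with its vertices in clockwise order."""
    return list(ring) if is_clockwise(ring) else list(reversed(ring))


# A unit square digitized clockwise, and the same square reversed.
CLOCKWISE_SQUARE = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
COUNTER_SQUARE = list(reversed(CLOCKWISE_SQUARE))
```

Here `is_clockwise(CLOCKWISE_SQUARE)` is true and `is_clockwise(COUNTER_SQUARE)` is false, while `as_clockwise(COUNTER_SQUARE)` returns the vertices in the clockwise order again.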
There are three distinct advantages to the non-topological structure of shapefiles. First, since they do not have topological tables, they can load into ArcGIS more quickly and can be drawn faster. Second, even if it looks as though two polygons touch each other (or are adjacent), you can drag them apart and each would still be complete, which is not possible with a coverage file. Finally, shapefiles are created so that topology can be calculated and analyzed on the fly given the processing power of modern computers, a point I will elaborate on below. On the other side of the coin, there is a key disadvantage of non-topological shapefiles in that you have to digitize the same edge twice when polygons (or portions of lines) are adjacent, for example in the case of land parcels, which can result in a lot more work during the creation and editing of vector files.
To sum up this two-part G-FAQ series, spatial topology was created in a time when computers lacked the processing power required to perform complex spatial analyses predicated on geographic relationships. In most cases, maintaining spatial topology in an explicit fashion is not necessary, with a key exception of digitizing a complex paper map where many of the vectors are adjacent. Otherwise, shapefiles, which are often easier to work with given their relatively simple file structure compared to a geodatabase or coverage file, can be analyzed with modern computers to determine (or enforce) a suite of geographic relationships that were once defined in topological tables. For example, gaps between and holes within polygons can be located and fixed with automated tools in ArcGIS. Overlaps can be removed by looking at the intersection of polygons and deleting these areas from one of the polygons. Adjacent polygons and those completely contained within another can be located easily by comparing the vertex locations of each individual feature in a shapefile. So while a shapefile is non-topological, for all practical purposes it can maintain the same relationships found in topological geodatabases and coverage files, with the flexibility of violating these rules when required.
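As a rough illustration of working out adjacency on the fly by comparing vertex locations, the Python sketch below finds pairs of polygons that share at least one edge. It is a simplified stand-in for what GIS software does and not how ArcGIS implements it: it assumes adjacent polygons were digitized with exactly matching vertices along their common edge (no tolerance is applied), and the parcel names and function names are made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def ring_edges(ring):
    """Return the undirected edges of a closed ring of (x, y) vertices."""
    return {frozenset((a, b)) for a, b in zip(ring, ring[1:])}


def adjacent_pairs(polygons):
    """Find pairs of polygon names that share at least one identical edge."""
    owners = defaultdict(set)
    for name, ring in polygons.items():
        for edge in ring_edges(ring):
            owners[edge].add(name)
    pairs = set()
    for names in owners.values():
        for first, second in combinations(sorted(names), 2):
            pairs.add((first, second))
    return pairs


# Hypothetical parcels: A and B share the edge x = 1, C stands alone,
# and D touches B only at the corner (2, 1).
PARCELS = {
    "A": [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)],
    "B": [(1, 0), (1, 1), (2, 1), (2, 0), (1, 0)],
    "C": [(4, 0), (4, 1), (5, 1), (5, 0), (4, 0)],
    "D": [(2, 1), (2, 2), (3, 2), (3, 1), (2, 1)],
}
```

With these parcels, `adjacent_pairs(PARCELS)` returns only the pair A and B. Parcel D meets B at a single corner, so it is not counted as sharing an edge; whether a corner touch should count as adjacency is a choice the analyst has to make.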
Do you have an idea for a future G-FAQ? If so, let me know by email at [email protected].
Find Out More About This Topic Here
- Esri – ArcGIS: Working with Geodatabase Topology
- Oberlin College – Understanding Topology
- Saint Louis University – How to Create and Edit Topology
- State University of New York – Data Structures
- Eastern Michigan University – Fundamentals of GIS Data
- University of Massachusetts – Topology Slides
- University of Nebraska, Omaha – Structuring Maps
- University of North Texas – GeoDatabase Topology
- University of Texas – Vector Data Model
- University of Washington – Spatial Data Model
Brock Adam McCarty