Mar 25 2009
The magic bullet
Dealing with corrupted shapefiles can be a painful experience: programs crash for seemingly no reason, attribute tables get screwy, features get lost, query results don’t look right, and ArcGIS processing tools fail with mysterious error codes.

Never fear, OGR is here. About 90% of the time, the magic bullet for fixing a corrupted shapefile is to use ogr2ogr to convert it to another shapefile:
ogr2ogr -f "ESRI Shapefile" shiny_new_clean_dataset.shp corrupted_dataset.shp corrupted_dataset
OGR’s internal data model cleans it up, and the output is a fresh, shiny new shapefile that works without hassle.
Don’t forget the -skipfailures switch, which makes ogr2ogr keep going when a feature fails to convert and tosses out the null geometries or other funkiness that might be causing trouble.
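For anyone scripting this over a pile of shapefiles, here is a minimal Python sketch of the same conversion with -skipfailures included. The function names build_repair_command and repair_shapefile are made up for illustration, and the sketch assumes ogr2ogr is installed and on your PATH.

```python
import shutil
import subprocess


def build_repair_command(src, dst, skip_failures=True):
    """Return the ogr2ogr argument list for a shapefile-to-shapefile rewrite.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["ogr2ogr", "-f", "ESRI Shapefile"]
    if skip_failures:
        # Continue past features that fail to convert instead of aborting.
        cmd.append("-skipfailures")
    # ogr2ogr expects the destination first, then the source.
    cmd += [dst, src]
    return cmd


def repair_shapefile(src, dst, skip_failures=True):
    """Run the conversion; raises if ogr2ogr is missing or exits non-zero."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH")
    subprocess.run(build_repair_command(src, dst, skip_failures), check=True)
```

Calling repair_shapefile("corrupted_dataset.shp", "shiny_new_clean_dataset.shp") should be equivalent to the command line above with the switch added. Passing the arguments as a list avoids shell quoting problems with the space in "ESRI Shapefile".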
Thanks, this is helpful. ESRI seemed very proud of their increased error reporting in geoprocessing with the 9.3 release, but I find that almost all of the errors I get are “999999”.
I find my corrupt shapefiles tend to be those created by non-GIS applications (CADD, and the 3D visualization package EVS). In these cases, the “Repair Geometry” tool (available in ArcToolbox at the ArcView license level) is generally enough to fix it.
GDAL (and OGR) is the Swiss Army Knife of geodata. But beware: using OGR to fix shapefiles does not always give you what you expect, even when the result looks right at first glance.
For instance, we recently came across a bug in ArcGIS where deleting a feature from a shapefile removed the geometry from the .shp file but left the attributes in the .dbf file, so the two files ended up with different numbers of records. ArcGIS crashes on such a shapefile. OGR reads and translates it happily, BUT the geometries are not linked to the correct attributes. Depending on the type of data, this kind of error can be extremely difficult to check for.
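One cheap check for this particular mismatch is to compare the record counts stored in the file headers before trusting any conversion. Below is a rough Python sketch, assuming the standard layouts: the .shx index has a 100-byte header with the file length (in 16-bit words, big-endian) at byte 24 followed by one 8-byte entry per record, and the .dbf header holds the record count as a little-endian 32-bit integer at byte 4. The function names are made up for illustration.

```python
import struct
from pathlib import Path


def shx_record_count(shx_path):
    """Count geometry records using the .shx index header."""
    with open(shx_path, "rb") as f:
        header = f.read(100)
    if len(header) < 100:
        raise ValueError("truncated .shx header")
    # Bytes 24-27: total file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", header[24:28])
    # After the 100-byte header, each record takes one 8-byte index entry.
    return (length_words * 2 - 100) // 8


def dbf_record_count(dbf_path):
    """Read the attribute record count from the .dbf header."""
    with open(dbf_path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError("truncated .dbf header")
    # Bytes 4-7: number of records, little-endian unsigned 32-bit.
    (count,) = struct.unpack("<I", header[4:8])
    return count


def counts_match(shp_path):
    """Return (match, shx_count, dbf_count) for the given .shp path."""
    base = Path(shp_path)
    shx = shx_record_count(base.with_suffix(".shx"))
    dbf = dbf_record_count(base.with_suffix(".dbf"))
    return shx == dbf, shx, dbf
```

A mismatch is a strong hint that geometries and attributes are out of step. Matching counts do not prove the rows are linked correctly, and the sketch assumes lowercase .shx and .dbf extensions and an index file that is itself intact.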
Not to knock OGR, but why would you use this over the “Repair Geometry” function in ArcToolbox?
http://webhelp.esri.com/arcgisdesktop/9.3/index.cfm?TopicName=checking%20and%20repairing%20geometries
If OGR is able to fix problems that ESRI’s built-in tool cannot, ESRI should be made aware of the issue. Considering ESRI is relying more on GDAL/OGR for some of its translation tasks, I doubt they would have reservations about using OGR’s method for fixing shapefiles, if that fix handles more issues than their own solution.
http://webhelp.esri.com/arcgisdesktop/9.2/body.cfm?tocVisable=1&ID=2458&TopicName=Supported%20raster%20dataset%20file%20formats
As Jamey mentions, Repair Geometry should take care of most of your problems. Asger’s problem (null geom in shp) is handled by the Repair Geometry tool.
John,
Why OGR instead of Repair Geometry? Simply because it works better and more consistently for me. I typically don’t have time to troubleshoot a variety of options, so I go with the tool that works.