Apr 01 2006

LIDAR data processing with open source tools

Published by perrygeo at 1:41 pm under GIS Tutorials, GRASS, Python

LIDAR data is certainly a hot technology these days. LIght Detection And Ranging data can be used to create extremely detailed terrain models but there are lots of barriers to using LIDAR data effectively. USGS Center for LIDAR Information Coordination and Knowledge was put in place to “facilitate data access, user coordination and education of lidar remote sensing for scientific needs“.

Beyond the sheer size of the datasets and the knowledge and hardware required to process them, software is a big issue. In the realm of open-source GIS tools, there are many applications (GRASS being the most prominent) for dealing with elevation point data and processing it into more meaningful products such as elevation DEMs and contours.

Usually the data comes as simple ASCII text files and the x,y and z values are easily extracted from such a file. But take a look at the USGS data distribution site and you’ll notice some of the datasets are distributed as LAS binary files. It makes sense to store such massive datasets in binary so I started looking for some LAS conversion tools.So after some searching, I found a bunch of proprietary products for working with LAS but no open source tools. Luckily, the format is well documented thanks to the efforts by the ASPRS to make it an open specification.

So dusting off my notes about parsing binary files in python, I set out to create a python module for extracting LIDAR data from LAS files. The LAS format contains a header which needs to be parsed first in order to read the point cloud. Once you have the header info, you can scan your way through the dataset to pick out the x,y,z values.

Here’s an example of the python interface that will read the first 10,000 points into a 2D shapefile with the elevation as a attribute in the dbf:

import pylas
infile = 'sanand000001.las'
outfile = 'lidar.shp'
header = pylas.parseHeader(infile)
pylas.createShp(outfile, header, numpts=10000, rand=False)

The issue I struggled with is the sheer size of these datasets. A USGS quarter quad can contain 10 million points which is an excessive number of points to create, say, a 10 meter DEM over such a small area. Clearly there was a need to extract a subset of this dataset but just taking the points sequentially gives you a subset of the total area. So, by default, pylas randomly scans the data to pull the number of specified points so that the point cloud could cover the entire area (at a much lower point density). Without numpts specified, it will randomly select 1/2000th of the total number.

So the simplified interface to make a more manageable lidar shapefile would be:

header = pylas.parseHeader(infile)
pylas.createShp(outfile, header)

Once the shapefile is created, you can bring it into GRASS to do the processing to generate DEMs, contours and other derived elevation products:

v.in.ogr dsn=lidar.shp layer=lidar output=lidar
g.region vect=lidar
g.region res=10
v.surf.rst input=lidar elev=lidar_dem zcolumn=elev

# Launch the interactive 3D viewer
nviz lidar_dem

Of course the method I just described is very simplistic and does not even come close to utilizing the full potential of the LIDAR point cloud, but it’s a start.

The pylas.py module can be downloaded here. The code has worked for me on the few datasets I’ve tested it with but it should certainly be considered a rough-cut, alpha product. There is much room for improvement and, of course, if you have any suggestions or contributions, please get in touch.

14 Responses to “LIDAR data processing with open source tools”

  1. Hamishon 08 Apr 2006 at 7:01 am

    Hi,

    GRASS 5 has a module called s.cellstats which is yet to be ported to GRASS 6. It fills a raster grid with min/max/mean/median of the points which fall in that grid. GRASS 6.1-cvs can now handle ~3million input points before it runs out of memory. Don’t create a table (v.in.ascii -t) and don’t build topology for pretty much unlimitted points. v.surf.rst et al are currently (slowly) being adapted to work with maps with no topography. Obviously a port or pre-processing tool like s.cellstats would be really great to have. Search the GRASS mailing list archives for posts by Helena on the subject.
    5% and 95% of points would also be nice..

    Hamish

  2. Igoron 04 May 2006 at 7:09 pm

    Nice work Hamish,

    What approach will you suggest to read 17,000 .las files and write shp files with, let’s say, 1% points, .1% points and 0.01 % points of all the set, being the total over 2 billion points and the distribution not even across .las files?
    Quite a challenge ah ?

  3. Bernhard Hoefleon 10 Sep 2006 at 11:26 pm

    Great idea to program a pylas module!
    Exactly what I was looking for.

    The fastest way to rasterize huge LIDAR point datasets would be to extract subsets with pylas, save it as (x,y,z) ASCII files and import them with the new GRASS module r.in.xyz. There are several modules for interpolate the NODATA cells (e.g. use the triangulation of nnbathy - r.surf.nnbathy).

    Bernhard

  4. gokon 15 Sep 2006 at 10:44 am

    Anybody has a sample data in .las 1.1 or 2.0 format?
    Please, let me know.
    Thanks!
    Gennady

  5. STon 09 Nov 2006 at 11:32 am

    I need to include the projection information in the binary las file and save out the file in the las format again.

    I guess I need to change the settings in the point data records of the header file so as to populate the projection/geo-referencing info, project the data into the required system and save it back as a .las file directly or through an intermediate .shp file.

    Can you help me with that please.

  6. Martin Isenburgon 26 Feb 2007 at 1:35 am

    i have a put few links where to download sample LIDAR data in LAS format on this webpage http://www.cs.unc.edu/~isenburg/laszip/ and there is also c++ source code with examples for reading and writing the LAS format.

  7. Martin Isenburgon 04 Jun 2007 at 10:04 pm

    i have put some code here (http://www.cs.unc.edu/~isenburg/googleearth/) for converting large amounts of LIDAR points in either LAS format or ASCII format (possibly stored across multiple files) into correctly georeferenced *.KML files that contain a tiling of on-the-fly extracted elevation contours for easy visualization in google earth.

  8. Jianghua Zhengon 10 Jul 2007 at 5:30 am

    Does anybody know some more effective algorithms for linear objects’s edage detecting and crossing points extracting. I just want to obtain the edages of linear objects in vector format.
    itslbs@126.com
    Thanks for your reply.

  9. Ralph Haugerudon 27 Dec 2007 at 5:31 pm

    Thanks for writing and releasing this–it’s been a great convenience.

    But there is a small bug in pylas.py. Raw X, Y, and Z are read as unsigned integers (L format) and they are signed (l format). Not a problem until you have negative values of a coordinate, such as beach elevation at low tide, and then …

  10. mayaon 17 Apr 2009 at 10:33 am

    Hi all,

    I want to know how to import lidar data to grass and create voxels in them. I am a newbie to lidar and to GRASS..so apologies if i am asking a dumb question. Is there are sort of manual for GRASS to work with lidar data.

    Thanks.

  11. Siti Hawaon 19 Apr 2009 at 7:41 pm

    i have the cloud point data (laser data converted from .las to ascii xyz data format). i wanna create some surface model using GRASS 6. however, i’m yet not familiar with this software. anybody can list down the processing step to create surface model from ascii xyz data? besides, is anybody know any opensources to create surface model (3D)? thanks.

    _newbie GIS_

  12. deadjoon 21 Apr 2009 at 2:17 pm

    @ maya,

    I am also at this stage, trying to get Lidar ascii data into GRASS. The wiki they have is very useful (links off the website), but the ascii I have is the point number, x, y, z, intensity, (then in the same row the last return) x, y, z, intensity.

    When I’ve tried importing ascii to vector into GRASS it only gives option of point number, x, y, z. So i am wondering how to get the last return data in there as well?

    I wonder if i am importing in the right place. Anyway good luck…

  13. mrlon 15 Oct 2009 at 5:27 pm

    @deadjo,
    A raster grid can only have 1 value for each cell, so you can either have elevation, using the z value from the 3rd column, or you can use the intensity value, using the 4th column for the z value, etc.

  14. rmgon 09 Dec 2009 at 2:32 am

    do anyone have c++ open source code for reading las data i tried the one written by martin isenburg but it doesn’t work

Trackback URI | Comments RSS

Leave a Reply