Archive for the 'Uncategorized' Category

Mar 31 2010

Distributed

Published by perrygeo under Uncategorized

I’ve been playing around with some distributed version control systems (DVCS) to replace svn.

First, the why: I’ll leave the details up to Joel in his excellent HgInit tutorial. It’s Mercurial-specific, but the general concepts apply to any DVCS. The takeaway message for any project with > 1 developer is this:

Mercurial [ed: DVCS] separates the act of committing new code from the act of inflicting it on everybody else.

Next, the implementation: I’m using git to work on another project (Golden Cheetah) and it’s been a tough learning curve. Git is no doubt the most powerful DVCS out there. You can do magical things with it like combine commits and mess with history trees. And you can also screw things up pretty badly if you misinterpret the esoteric docs for some non-intuitive piece of the workflow.

I just tried mercurial this morning - hg seems to fit my mind well. There is less power but the workflow is very clear and intuitive. And there are docs written for people who don’t want to do an in-depth study of their version control software. It stays out of the way.

Long story short, I’m going to use mercurial/hg for my new projects. Ah, what the heck, my old/ongoing projects as well. My googlecode repository has been converted over to Mercurial. Svn will stick around but won’t be updated.


Feb 18 2010

Lazy raster processing with GDAL VRTs

Published by perrygeo under Uncategorized

No, not lazy as in REST :-) … Lazy as in “Lazy evaluation”:

In computer programming, lazy evaluation is the technique of delaying a computation until the result is required.

Take an example raster processing workflow to go from a bunch of tiled, latlong, GeoTiff digital elevation models to a single shaded relief GeoTiff in projected space:

  1. Merge the tiles together
  2. Reproject the merged DEM (using bilinear or cubic interpolation)
  3. Generate the hillshade from the reprojected DEM

Simple enough to do with GDAL tools on the command line. Here’s the typical, process-as-you-go implementation:

  1. gdal_merge.py -of GTiff -o srtm_merged.tif srtm_12_*.tif
  2. gdalwarp -t_srs epsg:3310 -r bilinear -of GTiff srtm_merged.tif srtm_merged_3310.tif
  3. gdaldem hillshade srtm_merged_3310.tif srtm_merged_3310_shade.tif -of GTiff

Alternately, we can simulate lazy evaluation by using GDAL Virtual Rasters (VRT) to perform the intermediate steps, only outputting the GeoTiff as the final step.

  1. gdalbuildvrt srtm_merged.vrt srtm_12_0*.tif
  2. gdalwarp -t_srs epsg:3310 -r bilinear -of VRT srtm_merged.vrt srtm_merged_3310.vrt
  3. gdaldem hillshade srtm_merged_3310.vrt srtm_merged_3310_shade2.tif -of GTiff
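If you would rather script the VRT chain than type it, here is a minimal Python sketch that assembles the same three commands as argument lists. The function name, the tile filenames and the output naming are my own assumptions for illustration; only the GDAL commands and flags are taken from the steps above. It only builds the commands, so it runs without GDAL installed.

```python
import shlex


def lazy_hillshade_commands(tiles, srs="epsg:3310", resampling="bilinear",
                            stem="srtm_merged"):
    """Build the three GDAL command lines of the VRT workflow.

    Hypothetical helper: file names are derived from `stem` and the EPSG
    code, which is an assumption made for this sketch.
    """
    code = srs.split(":")[-1]
    merged_vrt = f"{stem}.vrt"
    warped_vrt = f"{stem}_{code}.vrt"
    shaded_tif = f"{stem}_{code}_shade.tif"
    return [
        # 1. Virtual mosaic of the tiles (no pixels are copied)
        ["gdalbuildvrt", merged_vrt, *tiles],
        # 2. Virtual reprojection, written as another VRT
        ["gdalwarp", "-t_srs", srs, "-r", resampling, "-of", "VRT",
         merged_vrt, warped_vrt],
        # 3. The only step that writes real raster data
        ["gdaldem", "hillshade", warped_vrt, shaded_tif, "-of", "GTiff"],
    ]


if __name__ == "__main__":
    # Tile names below are placeholders, not the actual SRTM tiles
    for argv in lazy_hillshade_commands(["srtm_12_04.tif", "srtm_12_05.tif"]):
        print(shlex.join(argv))
```

To actually run the chain, each list could be handed to `subprocess.run(argv, check=True)` in order; the first two calls should return almost immediately, and all the real work happens in the third.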

So what’s the advantage to doing it the VRT way? They both produce exactly the same output raster. Let’s compare:

                          Process-As-You-Go    “Lazy” VRTs
Merge (#1) time           3.1 sec              0.05 sec
Warp (#2) time            7.3 sec              0.10 sec
Hillshade (#3) time       10.5 sec             19.75 sec
Total processing time     20.9 sec             19.9 sec
Intermediate files        2 tifs               2 vrts
Intermediate file size    261 MB               0.005 MB

The Lazy VRT method delays all the computationally intensive processing until it is actually required. The intermediate files, instead of containing the raw raster output of the actual computation, are XML files which contain the instructions to get the desired output. This allows GDAL to do all the processing in one step (the final step #3). The total processing time is not significantly different between the two methods, but in terms of the productivity of the GIS analyst, the VRT method is superior. Imagine working with datasets 1000x this size with many more steps: having to type the command, wait 2 hours, type the next, etc. would be a waste of human resources compared with assembling the instructions into VRTs and then kicking off the final processing step when you leave the office for a long weekend.

Additionally, the VRT method produces only small intermediate XML files instead of a potentially huge data management nightmare of shuffling around GB (or TB) of intermediate outputs! Plus those XML files serve as an excellent piece of metadata describing the exact processing steps, which you can refer to later or adapt to different datasets.

So next time you have a multi-step raster workflow, use the GDAL VRTs to your full advantage - you’ll save yourself time and disk space by being lazy.


Dec 16 2009

Peaksware licensing revisited …

Published by perrygeo under Uncategorized

I had previously bitched and moaned about the licensing restrictions on the TrainingPeaks WKO+ software. Truth be told, the reason I was so put off by their crappy licensing scheme was that my cycling training relied so heavily on their software. It was not perfect but it was the best tool available. I’ve since discovered Golden Cheetah which is a viable open-source alternative but it still lags behind WKO+ in many critical features.

Now, fresh in time for the 2010 training season, Peaksware has released a new version 3.0 of WKO+ which, amongst many UI and functionality improvements, has made considerable progress on the licensing front.

We know, our licensing has been a challenge to deal with for our customers in the past, but we’ve always tried to be as helpful as possible getting you back up and running after a hard drive crash or new computer. To remedy this, we’re pleased to announce an all new flexible licensing system. First, with every purchase we now allow you to install WKO+ 3.0 on up to two computers; second, we’ve built an online activation/deactivation system so you are free to move your active licenses from machine to machine. Are you leaving on a 2 week trip? Just de-activate your home computer, activate your laptop, and you’re on your way. When you get home, de-activate your laptop, re-activate your desktop and you’re all set.

It ain’t open source (there is still a place in this world for proprietary software if they can push the boundaries and innovate) but the sensitivity to the licensing issue just may have restored my faith in their company.


Jun 23 2009

Peaksware licensing hell

Published by perrygeo under Uncategorized

I’ve been using Peaksware’s WKO+, a cycling and running training tool to manage data from heart rate monitors, GPS units, power meters, etc. It’s a powerful tool with a clunky UI but I’ve gotten used to it.

You pay $100 for a “personal” license. Not a big deal to me since they basically have a monopoly on this software niche. I first installed it on my work computer to test the data from my daily bike commute. Cool, it works. Then I went to install it at home since that’s where I’ll be using it. Works ok. I proceed to gather all my fitness data into their proprietary binary format.

Fast forward a few months. I’m reformatting the hard drive on the laptop and want to move all my data and software to my desktop. But installing WKO+ is giving me a headache (“Error: Too many installations”). The registration process takes a hardware fingerprint and you must activate it via the web to get a registration code. However, hidden within their EULA is a term which disallows the transfer of a license to any computer other than the one on which it was originally installed. The second installation was just an allowance they make for “hard drive crashes” and such.

Since neither of those machines would be available to me, certainly there would be a way to transfer it? After several progressively more desperate communications with Matt Allen at Peaksware support, he informed me that there was no way they would transfer the license (the non-transfer clause IS in the EULA after all). I would need to purchase another license simply because I switched computers!

Here is my response:

Basically what you are telling me is that I can no longer use WKO+
without paying again. I get to use the software for a few months and
you revoke my right to use it because I buy a new computer! I am a
paying customer, trying to be totally legit here, willing to support
your business in exchange for a license to use your software and you
insist on screwing me over. Brilliant.

This is one of the most unprofessional and idiotic stances I have ever
seen from a software company. Your intention appears to be to screw
over your paying customers and milk as much cash from them as possible
- you might want to rethink that business model unless you want to
lose customers! I will never endorse, recommend or purchase another
product or service from peaksware nor will any of my family, friends,
teammates or readers once the word gets out about your disrespectful
policies.

There are numerous typical situations where a new copy of the software
would need to be installed including:

* Hard drive failure
* Operating system upgrades
* New computer purchases
* Extended traveling and touring (installing onto a laptop or netbook)

Now I fully understand why your policy is one license per computer. It
makes perfect sense. I have seen plenty of other software with a
similar licensing model. But they also allow you to uninstall the software
and re-register it on another computer due to these circumstances.
There is simply no technological reason why you could not implement a
licensing structure that allowed the user more freedom to transfer
licenses while still preventing piracy. As it stands, your licensing
model treats paying customers like criminals if they happen to run
across any one of the above situations.

So, to sum it up - your foolish license policy has lost you one
customer and many future ones.

Good riddance.

So if you want to support a company that treats its paying customers like criminals because they get a new computer, go right ahead and support Peaksware. But anyone who expects to use software that they pay for even if they happen to buy a new computer should steer clear.

The real kicker is that all that work is locked away in their proprietary file format simply because of their draconian licensing. This is the real take-home lesson for all software users (not just fitness geeks): If you lock your data away in a proprietary format and are beholden to a single company in order to access it, they can and will screw you. Always insist on open data formats, even if using proprietary software. Oh, and always read the EULA carefully before clicking OK!


Jun 12 2009

The GPS told me to do it

Published by perrygeo under Uncategorized

Another disastrous consequence of inaccurate spatial information… Not only can you accidentally tag your neighbor as a criminal, now it appears that sloppy spatial data has led to the wrong house getting demolished.

I’ve asked it before but it’s worth repeating … with all the recent advances in spatial data publishing, where are the advances in metadata and data quality assurance? How do you know where the data comes from, what’s been done to it and by whom? What is the intended use of the data? For the vast majority of the data being shoved out onto the web, these bits of metadata are sorely lacking.

Of course this case is more a matter of one person’s sheer stupidity; I’m not sure any caveats in the metadata would have stopped the wrecking ball!


Mar 25 2009

The magic bullet

Published by perrygeo under Uncategorized

Dealing with corrupted shapefiles can be a painful experience: programs crash for seemingly no reason, attribute tables get screwy, features get lost, query results don’t look right and ArcGIS processing tools fail with mysterious error codes:

Dissolve error

Never fear, OGR is here. The magic bullet for fixing corrupted shapefiles, 90% of the time, is using ogr2ogr to convert the shapefile to another shapefile.


ogr2ogr -f "ESRI Shapefile" shiny_new_clean_dataset.shp corrupted_dataset.shp corrupted_dataset

OGR’s internal data model cleans it up and the output is a fresh shiny new shapefile that works without hassle.
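If you have a whole folder of suspect shapefiles, the same conversion can be scripted. This is a rough Python sketch, not a tested recipe: the helper name and the `_clean` suffix are made up for illustration, and it only builds the ogr2ogr argument list rather than running it.

```python
from pathlib import Path


def ogr2ogr_rewrite_command(src, suffix="_clean"):
    """Build an ogr2ogr call that rewrites one shapefile to a fresh copy.

    Hypothetical helper: the output name (source stem plus `suffix`) is an
    assumption for this sketch, not something ogr2ogr requires.
    """
    src = Path(src)
    dst = src.with_name(f"{src.stem}{suffix}.shp")
    # Shapefile-to-shapefile conversion, as in the command above
    return ["ogr2ogr", "-f", "ESRI Shapefile", str(dst), str(src)]


if __name__ == "__main__":
    print(ogr2ogr_rewrite_command("corrupted_dataset.shp"))
```

Passing each list to `subprocess.run(argv, check=True)` for every `.shp` file in a directory would do the batch conversion, assuming ogr2ogr is on your PATH.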


Feb 19 2009

TV cycling coverage is dead

Published by perrygeo under Uncategorized

Real-time spatial application developers take note…

I’ve been following the Tour of California this week (looking forward to the Solvang Time Trial this Friday) and have been disappointed with the TV coverage on Versus. It’s not that the coverage is bad, it’s just that long-distance endurance sports don’t lend themselves to the traditional two-announcer, one-camera format. There are multiple groups of riders and so much spatial information to keep track of if one really wants to understand the dynamics of a cycling event.

Maybe I’ve just been spoiled by the Amgen Tour Tracker. It is a crowning example of a spatially-aware real-time web application.

It provides two cameras of live coverage, live commentary with interviews, chat, summary updates, GPS tracking of riders shown on both an elevation profile and a Yahoo-based aerial map, “gps+” location prediction, race standings, time checks, etc. Far more information than any TV coverage, without tipping into information overload.


Feb 12 2009

Stimulus watch

Published by perrygeo under Uncategorized

Last time I posted on this blog, Hillary and Obama were still battling it out for the Democratic nomination. Now Barack Obama is our president with an uphill battle to save the economy. So yeah, it’s been a while. I haven’t been doing too much innovative Geo-related stuff lately, hence the lack of blog posts. I’ll try to pick up the pace a bit, even if I have to resort to fluff pieces like this one…

Well, it looks like the economic stimulus bill is going to pass. The bill doesn’t actually specify the projects that will be funded; the money will be allocated to cities and some federal grant agencies. The mayors have already proposed thousands of “shovel-ready” projects that might get a green light depending on how much funding the city gets.

There’s a great site, stimuluswatch.org, that allows the public to review these proposals. Good to know where our tax dollars are headed!

There are several GIS proposals ranging from projects with specific, well-defined (and measurable) objectives to the nebulous “Give us $500,000 to upgrade our cities’ GIS program”. It will be interesting to see which ones pan out, which ones produce results and which ones are just a pure waste of taxpayer dollars.

P.S. If you’d like to see where most of my time and energy is going these days, it’s training for the US National Cup mountain bike race series. My cycling exploits are available for all who are inclined to read them.


Jul 15 2008

R is for Radiohead

Published by perrygeo under Uncategorized

Radiohead released their video for House of Cards yesterday. Besides being a big Radiohead fan, I was also loving the LIDAR technology behind the video.

If you want to check it out yourself, there are code samples on the site as well as access to the raw data. The CSV files have four columns (x, y, z, and intensity). For me the quickest way to visualize the data was through R and its OpenGL interface called rgl (which is a wonderful high-level 3D data visualization environment).

Assuming you have R installed, rgl is a simple add on through the CRAN repositories:

install.packages("rgl")

Then you need to load the library, load the CSV, and scale the intensity values from 0 to 1. Then it’s a simple rgl.points command to get an interactive 3D rendering:

library(rgl)
d <- read.csv("C:/temp/radiohead/22.csv", header=FALSE)

# scale intensity values from 0 to 1
d$int <- d[,4] / 255

# rgl.points(x,y,z,size=__,color=__)
# note y value is inverted
# color is a grayscale rgb based on intensity
rgl.points(d[,1],d[,2]*-1,d[,3], size=3, color=rgb(d$int,d$int,d$int))

That’s all it takes to render Thom Yorke in all his 3D digital glory.
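For anyone without R handy, the data preparation step (not the rendering) could look roughly like this in plain Python. It is only a sketch: the function name is made up, and it assumes the same things the R snippet does, namely four headerless columns and intensity values on a 0-255 scale.

```python
import csv
import io


def load_points(csv_text):
    """Parse x, y, z, intensity rows into tuples ready for plotting.

    Hypothetical helper mirroring the R snippet: y is inverted and
    intensity is scaled from 0-255 down to 0-1.
    """
    points = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        x, y, z, intensity = (float(value) for value in row[:4])
        points.append((x, -y, z, intensity / 255))
    return points


if __name__ == "__main__":
    # Two made-up rows, not real values from the Radiohead data
    sample = "1,2,3,255\n4,5,6,0\n"
    for point in load_points(sample):
        print(point)
```

The resulting tuples can be fed to whatever 3D plotting library you prefer, using the scaled intensity as a grayscale value for each point.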


Jun 12 2008

Geospatial Reddit - 2 weeks later

Published by perrygeo under Uncategorized

So, despite frustrations with getting submitted URLs to appear, Geospatial Reddit is still puttering along. Not exactly a vibrant community yet but there are currently 133 subscribers. If you’re subscribed, take a minute to submit your favorite URLs. If you haven’t subscribed, check it out.

I thought 133 subscribers was a decent number until I found that the Bacon subreddit has over 500. Apparently the world would rather discuss their greasy breakfast food than maps.

