John McPhee on geological language

Started reading John McPhee’s tetralogy on geology, Annals of the Former World. Here is a memorable sampling of the thick sediments of geological language (p. 33):

As years went by, such verbal deposits would thicken. Someone developed enough effrontery to call a piece of our earth an epieugeosyncline. There were those who said interfluve when they meant between two streams, and a perfectly good word like mesopotamian would do. A cactolith, according to the American Geological Institute’s Glossary of Geology and Related Sciences, was a “quasi-horizontal chonolith composed of anastomosing ductoliths, whose distal ends curl like a harpolith, thin like a sphenolith, or bulge discordantly like an akmolith or ethmolith.” The same class of people who called one rock serpentine called another jacupirangite. Clinoptilolite, eclogite, migmatite, tincalconite, szaibelyite, pumpellyite. Meyerhofferite. The same class of people who called one rock paracelsian called another despujolsite. Metakirchheimerite, phlogopite, katzenbuckelite, mboziite, noselite, neighborite, samsonite, pigeonite, muskoxite, pabstite, aenigmatite. Joesmithite.

He could have included turbidite, tsunamite, tempestite, unifite, homogenite, debrite, hyperpycnite, and contourite as well. As if this weren’t enough, there are sedimentary geologists who suggest introducing new ‘ites’ (PDF link) like gravite and densite.

Bullshite.

Bedforms in Matlab – everything you wanted to know about ripple marks and cross beds

David Rubin’s bedform-generating code has been implemented in Matlab (in fact, it has been out there for a while). It is a great learning, teaching, and research tool that can be downloaded as part of a USGS open-file report. Strongly recommended to anyone with some interest in sedimentary structures, bedforms, and cool Matlab graphics.

That reminds me of something else: it would be nice to have a Matlab version running on Intel Macs. I hope Mathworks will keep its promises and have something ready by early 2007. Having to reboot the iMac in Windows XP is an acceptable solution, but I could live without it [although even Windows XP looks OK on this kind of hardware 🙂 ].

Georeferencing photos on a Mac

Not long ago I managed to georeference some of my photos using GPS measurements. Before I forget how I did this, here are some notes on the process. The key piece of software is GPSPhotoLinker, written by Jeffrey Early. After downloading and installing this nice little program, the next step is to get the GPS tracks from the GPS unit. For some reason, GPSPhotoLinker did not do this for me, so I downloaded GPSBabel, connected my Garmin Vista Cx to the iMac, and saved the tracks in GPX format. [GPSBabel is the same utility that is used inside GPSPhotoLinker.] I tried to open the GPX file in GPSPhotoLinker, but it did not work. The problem was that some of the tracks on the GPS unit had been saved, and saving tracks on a Garmin GPS unit (and maybe on other units as well, I don’t know) strips the time stamp from each datapoint. GPSPhotoLinker apparently is not able to just ignore this part of the GPX file; the only solution was to manually delete all the saved tracks from the GPX file.

After that, everything went pretty smoothly. GPSPhotoLinker finds the GPS points that are closest in time to the time stamp of the photograph and writes the latitude and longitude into the EXIF header of the JPEG file. You can choose between ‘snapping’ photo locations to the nearest GPS datapoint and interpolating between two points to find the best estimate for the place where the photo was taken. It is important, of course, to record a fairly large number of GPS points when you are taking the pictures.
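The two matching modes are easy to sketch in a few lines of Python. This is only my illustration of the idea, not GPSPhotoLinker's actual code: the function name `locate_photo`, the track layout (a list of `(time, lat, lon)` tuples sorted by time), and the sample coordinates are all made up for the example, and the linear interpolation ignores complications such as crossing the 180° meridian.

```python
from bisect import bisect_left
from datetime import datetime


def locate_photo(track, photo_time, interpolate=True):
    """Estimate (lat, lon) for a photo from a time-sorted GPS track.

    track is a list of (time, lat, lon) tuples sorted by time
    (a hypothetical layout chosen for this sketch).
    """
    times = [point[0] for point in track]
    i = bisect_left(times, photo_time)
    # Photo taken before the first or after the last fix: use the end point.
    if i == 0:
        return track[0][1:]
    if i == len(track):
        return track[-1][1:]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    if interpolate:
        # Fraction of the way from the earlier fix to the later one.
        f = (photo_time - t0) / (t1 - t0)
        return (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))
    # 'Snapping': pick whichever bracketing fix is closer in time.
    if photo_time - t0 <= t1 - photo_time:
        return (lat0, lon0)
    return (lat1, lon1)


# Made-up track: two fixes one minute apart.
track = [
    (datetime(2006, 7, 1, 12, 0, 0), 37.0, -122.0),
    (datetime(2006, 7, 1, 12, 1, 0), 38.0, -120.0),
]
photo_time = datetime(2006, 7, 1, 12, 0, 15)
print(locate_photo(track, photo_time, interpolate=True))
print(locate_photo(track, photo_time, interpolate=False))
```

The sketch also shows why a dense track matters: with only a few fixes, both modes can put the photo far from where it was actually taken.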

Once I had the photos tagged with the geographic coordinates, I had two options to display them in the context of a map: either relying on Smugmug, the photo-hosting web service that I use, or on a cool iPhoto plugin called iPhotoToGoogleEarth. With Smugmug, both Google Maps and Google Earth can be used to look at the photos; the drawback is that the displayed pictures are small and you have to go back to the Smugmug page to see the photos in a reasonable size. The iPhoto plugin generates a kmz file that can be opened with Google Earth and includes all the photos in a reasonable size (that, of course, can be adjusted by the user). The advantage is that you do not have to leave Google Earth in order to look at the photos.

Here is my first try at doing the georeferencing, as shown by Smugmug in Google Maps. It is not a bad idea after all to have a GPS unit handy when you are traveling and taking photos.

PS. In addition to the saved tracks, the other thing that GPSPhotoLinker does not like in the GPX file is the part of the header that refers to the geographic bounds of the file, e.g., “bounds minlat="-51.725563835" minlon="-98.491744157" maxlat="43.777740654" maxlon="131.500083692"”. You have to delete that in order for GPSPhotoLinker to read the file.
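Both manual cleanup steps (deleting the saved tracks and deleting the bounds entry) could probably be scripted. Here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `clean_gpx` and the sample file contents are made up, and I am assuming that a "saved" track can be recognized by track points that have no `time` element; I have not checked this against what every Garmin unit writes.

```python
import xml.etree.ElementTree as ET


def clean_gpx(gpx_text):
    """Return GPX text without <bounds> and without tracks lacking time stamps."""
    root = ET.fromstring(gpx_text)
    # GPX tags carry an XML namespace, e.g. "{http://www.topografix.com/GPX/1/0}".
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    if ns:
        # Keep the namespace as the default one when writing the file back out.
        ET.register_namespace("", ns[1:-1])
    # Remove every <bounds> element, wherever it sits in the header.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == ns + "bounds":
                parent.remove(child)
    # Assumption: a saved track is one with any point missing <time>.
    for trk in root.findall(ns + "trk"):
        points = list(trk.iter(ns + "trkpt"))
        if any(pt.find(ns + "time") is None for pt in points):
            root.remove(trk)
    return ET.tostring(root, encoding="unicode")


# Made-up sample: one active log with time stamps, one saved track without.
SAMPLE = """<gpx version="1.0" creator="example" xmlns="http://www.topografix.com/GPX/1/0">
  <bounds minlat="36.9" minlon="-122.1" maxlat="37.1" maxlon="-121.9"/>
  <trk><name>ACTIVE LOG</name><trkseg>
    <trkpt lat="37.0" lon="-122.0"><time>2006-07-01T12:00:00Z</time></trkpt>
  </trkseg></trk>
  <trk><name>SAVED</name><trkseg>
    <trkpt lat="37.1" lon="-121.9"/>
  </trkseg></trk>
</gpx>"""

cleaned = clean_gpx(SAMPLE)
print(cleaned)
```

Whether GPSPhotoLinker accepts the result is something I have not tested; the script only automates the two deletions described above.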

PS 2. There is always more to learn. I thought that the ideal workflow for georeferencing photos would be to (1) do the tagging in GPSPhotoLinker, (2) import the photos to iPhoto, (3) export the ones I want to post on the web, and (4) put them on Smugmug. It turns out this does not work well; all the photos I took in California (which were correctly labeled by GPSPhotoLinker) ended up in Kamchatka. The point is that the georeferencing must be done (or redone) after the photos are exported from iPhoto.

Digital Earth

Last weekend I discovered (1) that Google Earth was even more amazing than I had previously thought [and now they have a Mac version as well!]; and (2) there is a lot more out there in terms of digital geography if you look a bit harder.

Here, for example, is a USGS site from which you can download (with some patience) not only the usual satellite imagery but digital elevation models (DEMs) as well, for pretty much the whole globe [thanks to my friend Radu Girabcea for pointing me to it]. Once you’ve got a DEM, you can use 3dem, a nice little piece of freeware, to display the elevation models in 2D and 3D and to drape georeferenced images over the topography. DEMs are available (for free, at least at this point) with a ~10 m resolution for most of the US and a ~30 m resolution for other areas (I was especially excited to savor the detailed topography of the Carpathians; the more familiar you are with a place, the more illuminating it can be to examine its morphology).

Another thing worth taking a look at is NASA’s version of Google Earth, that is, World Wind. With one click, you can switch from Landsat images to USGS topographic maps [although I often have problems with the server connection]. Can it get a lot better than this?

Global warming does not cause earthquakes

According to Wired magazine’s “Biggest Discoveries of 2005”, the most important discovery of 2005 is that

Thanks to the Asian tsunami and Hurricane Katrina, global warming can no longer be ignored.

I agree that global warming can no longer be ignored, but you don’t need to know too much about earth science to realize that the Asian tsunami has absolutely nothing to do with it.

The fractal nature of Einstein’s and Darwin’s letter writing

Power laws are gathering quite some attention again, thanks to a few new papers (e.g., this one and this one) by Albert-László Barabási and his coworkers, published in Nature. Cosma Shalizi and others disagree: once again, just because some dataset on a log-log plot looks like you could easily fit a straight line to it, it is not safe to conclude that it is a power-law distribution.

One of the papers looks at the letter-writing habits of Darwin and Einstein, and concludes that the response times have a power-law distribution with an exponent of 3/2. The other “reports that the probability distribution of time intervals between consecutive emails sent by a single user and time delays for email replies follow a power law”. Shalizi and Stouffer et al. claim that these are in fact lognormal distributions.
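To see why a straight-looking log-log plot is weak evidence, here is a small sketch (standard-library Python only). It takes an exact lognormal distribution, plots nothing, and simply fits a straight line to the log of the exceedance probability against log x over three decades. The parameter choice (sigma = 2 in natural-log units, x from 1 to 1000) is mine, picked for illustration; it is not taken from either paper.

```python
import math
from statistics import NormalDist

SIGMA = 2.0  # assumed spread of ln(x); an illustrative choice
std_normal = NormalDist()


def lognormal_ccdf(x, mu=0.0, sigma=SIGMA):
    """P(X > x) for a lognormal variable with the given log-mean and log-sd."""
    return 1.0 - std_normal.cdf((math.log(x) - mu) / sigma)


def fit_line(xs, ys):
    """Ordinary least-squares slope and R^2 of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy * sxy / (sxx * syy)


# 50 points evenly spaced in log10(x) between 0 and 3 (x from 1 to 1000).
log_x = [i * 3.0 / 49 for i in range(50)]
log_p = [math.log10(lognormal_ccdf(10 ** lx)) for lx in log_x]

slope, r2 = fit_line(log_x, log_p)
# Local slopes over roughly the first and the last decade.
first_decade_slope, _ = fit_line(log_x[:17], log_p[:17])
last_decade_slope, _ = fit_line(log_x[-17:], log_p[-17:])

print(f"overall slope {slope:.2f}, R^2 {r2:.3f}")
print(f"first decade {first_decade_slope:.2f}, last decade {last_decade_slope:.2f}")
```

The overall fit looks respectable, yet the local slope keeps steepening from one decade to the next, which a true power law would not do. A high R² on a log-log plot does not separate the two hypotheses.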

I am wondering if you could ever get a paper published in Nature that looks at some dataset, shows that it has normal or lognormal distribution, draws some overarching and universal conclusions from that, and… and that’s it.

Or, to translate it to the much more mundane language of geologists, that only applies to dirt, not to Einstein’s letters: there is no interesting story in showing that bed thicknesses or sedimentary body sizes have a lognormal distribution, but if it’s a power law, suddenly you can talk about the “scale-independent physics of turbidite deposition” and the importance of non-equilibrium thermodynamics in the geometries of deltas and everything else under the sun.

That’s why power laws are great.

Hurricanes and barrier islands

Here is the reason why one should think twice about buying or building a house on a barrier island that is in hurricane country. This USGS website also shows convincingly that Hurricane Rita should not be misunderestimated 🙂 just because it barely touched the Houston-Galveston area. It did plenty of damage where the right-front quadrant made landfall – things would have been very different around here if Rita had made landfall at Galveston or a bit to the west of Galveston.

And these images of a barrier island that migrates landward as hurricanes go over it make you wonder how much of the geologic record of barrier islands (and beaches in general) actually consists of fairweather deposits. Everything seems to be moving and redepositing during these storms.

On cumulative probability curves

Let’s go back to some good old science subjects and take some notes about sediments, something I am supposed to be an expert in.

One of my favorite pastimes lately is collecting examples from the geological literature in which the statistical analysis went incredibly wrong. Take for example the papers dealing with grain-size distributions that advertise cumulative probability plots as the best technique to identify subpopulations in a mixed distribution. Here is what G.S. Visher says in his 1969 paper on “Grain size distributions and depositional processes” (Journal of Sedimentary Petrology, v. 39, p. 1074-1106):

“The most important aspect in analysis of textural patterns is the recognition of straight line curve segments. In figure 3 four such segments occur on the log-probability curve, each defined by at least four control points. The interpretation of this distribution is that it represents four separate log-normal populations. Each population is truncated and joined with the next population to form a single distribution. This means that grain size distributions do not follow a single log-normal law, but are composed of several log-normal populations each with a different mean and a standard deviation. These separate populations are readily identifiable on the log-probability plot, but they are difficult to precisely define on the other two curves.” (p. 1079)

I am wondering if this tendency to see straight line segments in cumulative probability plots and to give them some special significance is a syndrome restricted to geologists – whose abilities for pattern recognition are excellent in general – or whether one could find such examples in other fields as well. The fact that a certain distribution looks like a straight line on a cumulative plot does not mean that mixtures of the same type of distribution will plot as straight line segments. The excellent sedimentologist Robert Folk pointed this out in a 1977 discussion of a paper coauthored by Visher (in which they try to prove that the Navajo Sandstone is not an eolian deposit – yeah, right):

“A general defect of the Visher method is exemplified by Kane Creek #2, which is shown as consisting of four straight line segments, implying that it is a mixture of four populations. It can be proved by anyone using probability paper and ordinary arithmetic that such kinky curves can be made by a simple mixing of two (not four) populations that are widely separated; the ‘flat’ portions represent the gaps in the distribution. Furthermore, mixing of populations on probability paper results in smoothly curving inflexions, not angularly joined straight-line segments.”
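Folk's "probability paper and ordinary arithmetic" can be replaced these days by a few lines of Python. The sketch below mixes two widely separated normal populations (in phi units) and looks at the slope of the mixture's cumulative curve on a probability (probit) scale. The means, standard deviations, and the 60/40 mixing proportion are invented for the illustration; they are not from Visher's or Folk's data.

```python
from statistics import NormalDist

std_normal = NormalDist()
# Two hypothetical grain-size populations, in phi units.
coarse = NormalDist(mu=1.0, sigma=0.4)
fine = NormalDist(mu=4.0, sigma=0.4)


def mixture_cdf(x, weight=0.6):
    """Cumulative distribution of a 60/40 mixture of the two populations."""
    return weight * coarse.cdf(x) + (1.0 - weight) * fine.cdf(x)


def probit_slope(cdf, x, h=1e-3):
    """Slope of the cumulative curve on probability paper, by central difference."""
    upper = std_normal.inv_cdf(cdf(x + h))
    lower = std_normal.inv_cdf(cdf(x - h))
    return (upper - lower) / (2.0 * h)


# A single normal population plots as a straight line with slope 1/sigma.
single_slope = probit_slope(coarse.cdf, 1.0)
print(f"single population: slope {single_slope:.3f}")

# The mixture: slope changes continuously and nearly vanishes in the gap.
for i in range(11):
    x = 0.5 * i
    print(f"phi = {x:.1f}  slope = {probit_slope(mixture_cdf, x):.4f}")
```

The slope of the mixture drifts continuously from one value to the next and drops to nearly zero in the gap between the two populations: a flat portion joined by smooth bends, with no truncated straight-line populations required.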

Despite this, fitting multiple straight lines to cumulative probability plots is fashionable again, although this time it is done on log-log plots of exceedance probability of either bed thickness or fault size data. But this is going to be part of a paper that I am working on right now (in the evenings and on weekends…), so more about this later.