Wednesday, October 5, 2011

A Perspective on VertNet

This guest post was written by Dr. Linda Trueb, Senior Curator of Herpetology and Associate Director for Research & Collections, University of Kansas Biodiversity Institute.
 
As I read VertNet’s recent blogs, a wave of what could only be described as nostalgia overtook me.
 
Today.  We don’t need local servers to support DiGIR portals.  We can publish our data to the cloud and then access, visualize, and search those data.  Yes, you have to prep your data before you publish them, but VertNet will provide a script for your local computer that prepares and publishes records to the system.  The script is an anticipated product of the current VertNet NSF grant.  While we await its development, we can use the GBIF Integrated Publishing Toolkit (IPT), a Java-based web application that the user can employ to publish data from a server.  The system relies on Darwin Core to ensure consistency of the data describing organisms (identity, geographic origin, age, and other associated documentation), so that data can be shared among biodiversity scientists.
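If you have never seen one, a Darwin Core record is simply a set of named fields describing a specimen.  Here is a minimal sketch, in Python, of one record written to a CSV file of the kind a publishing tool such as the IPT can ingest.  The column names are genuine Darwin Core terms; the specimen, its values, and the file name are invented for illustration.

import csv

# A minimal, hypothetical example of a single specimen record expressed
# with standard Darwin Core terms. Only the term (column) names are real;
# the values below are made up for illustration.
record = {
    "institutionCode": "KU",                      # owning institution
    "collectionCode": "Herpetology",              # collection within it
    "catalogNumber": "12345",                     # the old ledger number
    "basisOfRecord": "PreservedSpecimen",         # a pickled herp in a jar
    "scientificName": "Lithobates catesbeianus",  # identity of the animal
    "country": "United States",                   # geographic origin
    "stateProvince": "Kansas",
    "locality": "Douglas County",
    "eventDate": "1960-06-15",                    # date of collection
    "recordedBy": "J. Smith",                     # the collector
}

# Publishing tools typically ingest tabular text like this CSV file.
with open("occurrences.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(record))
    writer.writeheader()
    writer.writerow(record)

One such row per specimen, with agreed-upon column names, is what allows data from different museums to be searched together.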

Step back 50 years.  We climbed the stairs to the 7th-floor combined herpetology and ichthyology collections in what was then known as the University of Kansas Museum of Natural History.  Wooden shelves housed pickled herps and fish in Ball Mason jars, and ceramic crocks littered the floor around a large, central table that was devoted to specimen preparation and cataloguing (= data processing).  The “data processing” table was equipped with one extension telephone (rotary dial).  By 1961, an electric typewriter had replaced the manual machine, but Selectric typewriters and correction tape had yet to appear.  The essential tools of data processing were the original field notes of the collector, a Rapidograph technical pen and India ink (no polymer inks yet), the Dickensian ledger book (ca. 12 x 18 in. and 8 pounds) that was stored in a fire vault, and an employee with extraordinary penmanship to enter, specimen by specimen, the name of the animal, its field and accession numbers, locality and date of collection, and the collector(s).  The first innovation that I recall was permission to use ditto marks for series of specimens with duplicate data—a bold step forward!

The electric typewriter probably was the greatest data-processing innovation of the 60s.  With this enhanced tool, we began to think of other ways to access and use our data.  We could, for example, create species sheets; each sheet recorded all of the holdings of a taxon.  The typewritten sheets were updated as new material was catalogued and kept in 3-ring binders.  When we received a request for information, we could Thermofax, and later photocopy, the species sheet and send it to the researcher.  But what about our holdings by geographic locality?  For that, we created a geographic file on colored 3 x 5 index cards; the colors coded the taxon (frogs, lizards, etc.), and the cards were filed by state (U.S. and Mexico) and by country elsewhere in the world.  Like the species sheets, the cards could be photocopied and sent to researchers.

Fast-forward.  Mainframe computers arrived in the early 70s, along with a variety of programs to handle specimen data.  Our first experiment was with a program known as SELGEM (SELf GEnerating Master).  All of the collections data had to be entered again and verified.  The “hardcopy” was a handy, punched paper tape.  Meanwhile, we kept the hand-written catalogue up to date.  In the 80s, we had remote dumb (emphasis on dumb) terminals to hook to the mainframe and print out search records from primitive programs such as FOCUS.  With the arrival of desktop computers and a proliferation of database software in the late 80s and early 90s, we again had the opportunity to re-enter and re-verify data.  (Are you seeing a trend here?)  Many of the earlier database programs fell in the face of the selective pressure of advancing technology, and by the 90s most major collections had electronic records on one of perhaps a half-dozen systems; institutions reaped the benefits of being able to search their data and prepare reports without typewriters.  Then, with the arrival of the Internet, another great innovation emerged at the turn of the century: the concept of a network of distributed databases (e.g., HerpNet, MaNIS, FishNet, ORNIS) based on Darwin Core and available to individual researchers on their local computers via servers that supported DiGIR portals, with “biodiversity data” becoming the politically preferable name for collections data.

The Future.  Most of us no longer maintain hand-written catalogue ledgers, on the supposition that our faith in information technology is not misguided.  We have only to visit and use collections that have not entered the information age to realize the profound benefits and opportunities afforded by these electronic tools.  In time, I am certain that such collections will be incorporated into our networked system, leaving us with an even greater challenge: how to sustain the network as a whole for future biodiversity scientists.

Photo Captions (top to bottom):

Data Processing: The table and the process that I described; the lady is Barbara Berg, an assistant in 1960. Credit: KUBI

Dickensian Ledger: A page from the ledger that I actually filled out back in the 60s.  Credit: Linda Trueb

Catalog Room: The accumulation of catalogues for more than the first 100 years of herps at KU.  Credit: Linda Trueb

