EMODNET WG4 - EUBON Workshop: Mechanisms and guidelines to mobilize legacy biodiversity data

The EMODNET WG4 - EUBON Workshop took place on 8-9 June 2015 at HCMR, Heraklion (Gournes, Crete).

The overall objective of EMODnet WP4 is to fill the spatial and temporal gaps in EMODnet species occurrence data by carrying out data archaeology and rescue activities. This is a two-part process: first identifying and locating data, then performing the steps required to merge them into a digital database, which will in turn be distributed through EurOBIS and the EMODnet data portal.

During the first part, many old faunistic reports were located that contain valuable occurrence data on marine species. Extracting these data and converting them into OBIS format (a Darwin Core extension) is a slow, manual process.
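To illustrate what the target format looks like, the following is a minimal sketch of mapping one hand-extracted record to Darwin Core terms and writing it as CSV. The column names are standard Darwin Core terms. The input keys (`species`, `date`, `lat`, `lon`, `place`), the `legacy-report-` identifier scheme, the `basisOfRecord` value and the sample record itself are assumptions made up for illustration, not part of the workshop material.

```python
import csv
import io

# Standard Darwin Core terms used as column headers.
DWC_FIELDS = [
    "occurrenceID", "scientificName", "eventDate",
    "decimalLatitude", "decimalLongitude", "locality", "basisOfRecord",
]

def to_dwc_row(record, index):
    """Map one extracted record (hypothetical input keys) to Darwin Core terms."""
    return {
        "occurrenceID": f"legacy-report-{index}",  # hypothetical ID scheme
        "scientificName": record["species"],
        "eventDate": record["date"],
        "decimalLatitude": record["lat"],
        "decimalLongitude": record["lon"],
        "locality": record["place"],
        "basisOfRecord": "HumanObservation",  # assumption; depends on the source
    }

def write_dwc_csv(records):
    """Return the records as CSV text with Darwin Core column headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    for i, rec in enumerate(records, start=1):
        writer.writerow(to_dwc_row(rec, i))
    return buffer.getvalue()

# Invented sample record, for illustration only.
sample = [{
    "species": "Posidonia oceanica", "date": "1910-07-14",
    "lat": 35.33, "lon": 25.28, "place": "Gournes, Crete",
}]
print(write_dwc_csv(sample))
```

Doing this by hand for every occurrence in a long report is what makes the process slow, which is why semi-automated markup is attractive.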

In this workshop the GoldenGate Imagine software was demonstrated, and participating data managers received training on how to semi-automate this tedious process. Different types of legacy literature were explored, such as expedition results, protocol log books, as well as biodiversity research articles. GoldenGate Imagine was used both for born-digital files and for scanned image PDF files.

Via hands-on sessions the complete process was studied: scanning a document, importing it into GoldenGate Imagine, marking up document sections and entities of interest (e.g. taxonomic mentions and location names), uploading the markup to the PLAZI server (http://plazi.org/wiki/Taxon_Search_Portal), and from there retrieving the auto-generated Darwin Core Archives.
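Once a Darwin Core Archive has been retrieved, its contents can be inspected with a few lines of code. The sketch below assumes the archive is a zip file whose data file is tab-delimited, has a header row and is named `occurrence.txt`. Real archives declare the file name and delimiter in their `meta.xml`, so these defaults may need adjusting. The function name `read_occurrences` is hypothetical.

```python
import csv
import io
import zipfile

def read_occurrences(archive, data_file="occurrence.txt"):
    """Return the rows of a tab-delimited data file inside a zip archive as dicts.

    `archive` may be a file path or a binary file-like object.
    The default file name is an assumption; check the archive's meta.xml.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(data_file) as raw:
            # Decode bytes to text and let the csv module handle line endings.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))
```

Calling `read_occurrences("archive.zip")` (a placeholder path) would return one dictionary per record, keyed by the column headers of the data file.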

Finally, in addition to the hands-on sessions, extensive discussions among the data managers and the information technology experts resulted in the compilation of reward-via-publication suggestions and best practices (e.g. in scanning documents) to assist the data extraction process.