CHAH/HISCOM Workshop: Future of the AVH

Venue
National Herbarium of Victoria, South Yarra, Melbourne, 26 November 2008

Melbourne accommodation options

Purpose
To evaluate the functionality and development status of the AVH and the opportunity for future development and support. An omphaloskeptic assessment of progress and potential.

Attendees

 * CHAH: Bill Barker (AD); David Cantrill (MEL); Alex Chapman (PERTH); Dale Dixon (DNA); Gintaras Kantvilas (HO); Brett Summerell (NSW, Chair); Judy West (CANB)
 * HISCOM: Bill Barker (AD); Peter Bostock (BRI); Jim Croft (CANB); Niels Klazenga (MEL); Laurence Paine (HO); Ben Richardson (PERTH); Brett Summerell (NSW); Alison Vaughan (MEL); Greg Whitbread (CANB); Karen Wilson (NSW); Aaron Wilton (CHR)
 * Other: Andrew Drinnan (MELU); Nicole Middleton (MELU)

Apologies

 * CHAH: Kevin Thiele (PERTH); Dale Dixon (DNA); Jeremy Bruhl (UNE); Cameron Slatyer (ABRS)
 * HISCOM: Donna Lewis (DNA); Helen Thompson (ABRS)

Background papers

 * AVH Briefing Paper November 2008
 * VAH 1999 Strategic Plan

Welcome and Introductions
Brett to take over as HISCOM Chair on 1 January 2009. He plans to begin on this date because he is currently Chair of CHAH and feels that holding both roles at once would be a conflict.

The AVH Concept
Jim provided background on the development of the AVH between 1999 and now. The major focus during this time has been on putting dots on maps. The original concept was for an all-inclusive system that includes species descriptions and other applications. It is important that we revisit this original vision to see what we can achieve, especially in relation to the ALA and EoL.

Delivery/currency of data - how often is the cache refreshed?

Also need to figure out how to ensure that the data source is acknowledged appropriately, especially as our data gets dispersed more widely (e.g. through GBIF). There is an expectation from our users that our data will be freely available, and that it will be clean.


 * Acknowledgement and ownership issues
 * Need to be explicit about the limitations of the data.
 * Need effective feedback mechanisms to flag erroneous data, and for each herbarium to be committed to cleaning data at the source. The annotation system being developed by ALA, together with LSIDs, will deal with this in the future, but we need an interim measure.

ACTION: Check with Brett - action re: DEWHA/ANHAT

ACTION: Ensure linkage between AVH & the annotation systems being developed by ALA

Interface design
The layout of the AVH web pages is not yet controlled by predefined templates that describe the layout of the final page, with the appearance of page elements controlled by linked stylesheets. Updating AVH to use such an approach will enable broad appearance changes to be made efficiently across the AVH application. Ben has been working on cleaning up the source code so that changes to the site can be managed by HISCOM rather than by eRSA. Once this is finished, Jim and Siobhan will work on the new design. Need to set limits on how much change we include in this version.

ACTION: All work on interface design to be completed by Dec 19 (Ben, eRSA, Siobhan, Jim)

User registration
CHAH needs to decide who is allowed access to the extended query, and how registrations will be administered. Synchronisation issue (i.e. synchronising registration between different access sites) needs to be resolved.

Acknowledgement/IP
???

Data limits
Set at 20,000 records for public access. There are technical limitations to the number of records that can be returned via extended query.

AVH Weed Tracker
Rex and Bill are evaluating the site over the next few days and then it will be handed over to HISCOM for further assessment. Licensing and registration issues will take some time to sort out. See Bill if you want the username and password. It was suggested that the weed group (a rep from each herbarium) be involved in testing the Weed Tracker along with HISCOM.

Installation of new AVH site
It is important that the AVH is installed on as many sites as possible.

ACTION: AD, MEL and CANB all to host the AVH on their local sites.

Updating AVH data
Only two herbaria are providing data to the AVH dynamically. We need to determine how frequently data from the other herbaria is to be updated. A significant amount of mapping and data cleaning was undertaken by eRSA after the last dump of data was delivered; this should be undertaken at the source institution as much as possible.

Need to follow up on getting assistance from the ALA for the installation of wrapping software at the herbaria that are not providing data dynamically.

License/registration
Laurence has been investigating licensing and registration models for the AVH. Each herbarium will have to sublicense its data to CHAH/AVH to allow for attribution to be manageable. Need to control the use of the data by defining its fitness for purpose in a way that requires users to go to the source institution/s for appropriate data for other purposes (Laurence to rewrite this sentence). Will most likely have a two-level access system.

ACTION: HISCOM to expand upon registration options and present to CHAH.

Descriptive data
(Technical issues here - Jim to provide notes) Discussion about developing a federated approach to delivering species profile information via the AVH, and how this would link in with the APC.

New Zealand Virtual Herbarium
Aaron spoke about the development of the New Zealand Virtual Herbarium (NZVH):

 * will include 13 herbaria, ranging from ca. 2,000 sheets to CHR (which is about 60,000 sheets)
 * identified conservation managers as the primary end users
 * contract signed recently; up to initiation phase
 * governing group (subcommittee of national herbarium network) to prioritise tasks and set strategic direction
 * Aaron leading the development project
 * all contributors meeting in a couple of weeks to do some data mapping and work through other issues
 * website running with 4 providers dynamically serving data by June 2009
 * hoping to publicly release with all providers up and running by end of 2009
 * rather than restricting data to MoU fields, will provide all fields to users (according to a priority list); a concern here is r & t records, for which precise localities won't be given
 * planning to use AVH software
 * using TAPIR instead of BioCASE; issues with some herbaria not being able to provide via this method, so looking at other options for them
 * need to decide how the NZVH would interact with AVH, e.g. links on site, data sharing etc.

How to get other herbaria linking in to AVH
Andrew indicated that university herbaria are very interested in providing data to AVH.

University herbaria have teaching as their core business, and don't have the history of involvement with HISCOM/CHAH etc, which impedes their involvement in the AVH. Most data entry is done by students or volunteers, so the database needs to cater to that level of expertise.

MELU's data is HISPID compliant. Andrew's database of around 20,000 records is held in FileMaker Pro, and his staff would need help (preferably a visit) to be shown how to get the data into a format suitable for inclusion in AVH. They don't mind whether the export file is held on their servers or on a server at one of the current AVH participant herbaria.

There are a number of situations where static data file transfer remains the preferred method of updating the AVH cache:


 * 1) Where a herbarium that is already participating in the AVH makes a change to a large number of records in their source database.
 * 2) Where a new large specimen database needs to be added to the AVH network.
 * 3) Where there is insufficient IT resource at a participant herbarium to install and manage the BioCASE or TAPIR web service infrastructure. Smaller herbaria may wish to have their data hosted at larger institutions.

ACTION: CHAH members to talk to the smaller institutions in their jurisdictions to establish the state of their data and to gauge their interest in and ability to contribute data to the AVH - by end of Feb 2009 (CHAH)

(General agreement for something that I missed.)

What will we add to the AVH
(e.g. APC/APNI, protologues, on-line floras, keys to families, genera and species, tools, other resources)

The Atlas of Living Australia project was created as an infrastructure project. The AVH is a content project, and is considered a strong brand. It is thus useful to continue with this brand as an umbrella for the other data types originally envisaged in the VAH 1999 Strategic Plan. Those data types are now envisaged as sub-brands, including:


 * Mapper (the current AVH map interface)
 * Weed Tracker
 * Flora
 * Keys
 * APC/APNI
 * Collections (the current AVH Extended Query?)
 * Trust
 * Molecular (or DNA)
 * Seeds
 * Tools

Australian Plant Imaging Index (APII) - focus is currently on living plants, but there is scope to include specimen images as well. Inclusion of type and protologue images on APNI - need to have them cached so they are always available.

Alex noted that GET requests to AVH won't work with AVH 3.0 and that all existing links will be broken.

Jim sought approval from CHAH for HISCOM to continue looking at alternative mapping options for the AVH (e.g. Rex's Google mapper). A standing item at HISCOM meetings for several years has been to build a standard suite of validation tools and validation rules to apply to AVH.

ACTION: HISCOM to discuss how to implement the servicing of GET requests to AVH on the 27th.

ACTION: CANB to liaise with Dan Rosauer about the ANHAT validation tools and how they might be applied to AVH.

General agreement

Greg promised that, in four months' time, APNI will be available to anyone who wants it.

Other ideas
Strategic plan needs to be revisited and updated.

There was some discussion, both at the HISCOM 2008 AGM in Fremantle and here, that the AVH has not yet been fully defined. Is it a representation of the specimens that are stored in the state and territory herbaria, or is it (or will it be) a representation of the collections and the activities that take place in the herbaria? Should new products (e.g. species profiles etc.) be branded as part of the AVH, or are there benefits (particularly with regard to funding) of branding new products separately?

It was generally agreed that the AVH brand is strong and has good governance, so it would be good to create new products as sub-brands of the AVH, rather than as entirely separate products.

New AVH site to include a front page that points to the other AVH products under development (e.g. Plant mapper, APC, eFloras, etc.)

How does it all fit with ALA
CHAH has not interacted with ALA very much (although some individual CHAH members have had a lot of contact in other roles). It was agreed that it would be advantageous for CHAH to have an MoU-type relationship with ALA (though this might be covered by the contract that CHAH has signed with ALA). Kevin has a seat on the ALA management board, and it is his role to inform CHAH of ALA decisions and actions once they have been agreed upon. It was agreed that it would be better to have contributing herbaria listed individually on the ALA site, rather than simply as CHAH.

ACTION: CHAH to request that ALA list contributing herbaria individually.

What do we deliver to ALA/GBIF - sanitised version of our data, or the warts and all data available via AVH?

How do we spend the $80,000?

 * 1) Getting all herbaria delivering dynamically is the highest priority.
 * Need to establish whether ALA will provide funding to get all herbaria delivering dynamically. If not, then the $80,000 can go toward that (though it probably won't be enough money to get everyone delivering dynamically).
 * 2) Building a disambiguator for the APC (and linking this to the AVH) is the second priority.
 * 3) Finishing the APC is the third priority.
 * 4) eFlora/Flora online/species profiles is the fourth priority.
 * 5) Other AVH issues (quality control, curation tools etc.)

Potential sources of funding for other developments include:


 * ALA/NCRIS
 * TRIN
 * Caring for Country

For CHAH

 * Promote the AVH 'brand'
 * Promote partnerships with data harvesters/integrators
 * IP statements and data use
 * Policy on access to restricted data
 * AVH Strategic Plan

For HISCOM

 * Data custodianship associated with all records and data sets
 * AVH Interface design - look and feel
 * AVH system and user documentation
 * AVH interface help documentation
 * Dealing with 'cultivated' records - in data and in query
 * AVH Strategic Plan
 * Polygon searching

For CHAH

 * Develop/refine data use agreement(s)
 * Pursue partnerships with major data users (e.g. DEWHA, etc.)
 * Pursue 'fitness for purpose statement'
 * Update CHAH MOU to reflect fitness for purpose provisions

For HISCOM

 * Summarise limitations and caveats of AVH data
 * Ensure data custodianship/attribution in all data provision services
 * Pursue development and implementation of data annotation services
 * Update interface navigation, functionality (Ben, by end December)
 * Update interface design graphics, including logo (Siobhan, by end December)
 * Position paper on higher level regions for CHAH
 * Scope and specifications to get all herbaria delivering data dynamically
 * Prototype on-line flora information systems for delivery of descriptive profile data
 * Scope taxonomic disambiguation processes
 * Draft a policy and procedure for user registration on wiki for discussion (December 2008)
 * AVH website to act as an umbrella reflecting the totality of AVH activity
 * Clarify use of Google Maps as alternative display
 * Source more attractive base layer for AVH mapper
 * Initiate AVH 2009 Strategic Plan

Decisions

 * 'AVH Weed Tracker' as the name for the weeds 'Early Warning System'
 * AD, MEL and CANB as server sites for weed application
 * Progress on-line flora information systems to deliver descriptive data
 * Taxonomic disambiguation based on APC to be elevated in priority
 * ABRS to be included in new AVH MOU
 * AVH User Registration policy and procedure to be developed to cover different access levels
 * Herbaria to work with local universities in providing data from university collections
 * AVH to represent totality of herbarium activity and information
 * APNI to include links to images of protologues and type specimens
 * APNI to cache/mirror images for performance/resilience
 * Google Maps to be offered if licensing permits
 * Base map to be upgraded

Links

 * AVH development site public interface
 * AVH development site restricted interface
 * Follow-up HISCOM ad hoc AVH Workshop