Joint HISCOM/MAHC 2015 Hobart minutes

Attendees
Laurence Paine (HO), Clifford (HO), Aaron Wilton (CHR), Ben Richardson (PERTH), Anne Fuchs (CANB), Niels Klazenga (MEL), Donna Lewis (DNA - Chair), Wayne Cherry (NSW), Rebecca Pirzl (ALA), Nick dos Remedios (ALA), Ainsley Calladine (AD), Eleanor Crichton (AD), Simon Checksfield (CANB), Carolyn Ricci (AD), Ailsa Holland (BRI), Jo Palmer (CANB), Ines Schoenberger (CHR), Frank Zich (CNS), Peter Jobson (DNA/NT) (minutes), Lyn Cave (HO), Kim Hill (HO), Pina Milne (MEL) (chair), Gill Brown (MELU), Karen Marais (MQU), Gill Towler (NSW), Karina Knight (PERTH)

Apologies
Ian Cowie (DNA, CHAH), Michelle Waycott (AD, CHAH), Helen Vonow (AD), Ingrid Offler (DMHM), Jennifer Tate (Massey), Deb Bisa (DNA), Jeremy Bruhl (NE), Anthony Kusabs (WELT), Frank Hemmings (NSW), Dhahara Ranatunga (AK)

Welcome and Introductions
Pina Milne and Donna Lewis welcomed HISCOM and MAHC members to the joint meeting.

= SESSION 1 =

eFlora Platform demonstration and ALA project update (Rebecca Pirzl)

 * Phase 1 (completed) – building the eFlora platform using data from various institutions.
 * Phase 2 – testing the platform (expected completion end of year).
 * Phase 3 – development time based on feedback from users.

Testing of Phase 2 is proposed to end on 30th Nov. Zoe Knapp will co-ordinate feedback & present it to the committee. Tasks will then be grouped so that feedback issues can be completed efficiently.

Demo site: profiles-dlv.ala.org.au/

There will be 4 levels of access on eFlora:
 * Admin – all levels of access including templates, citations, content etc
 * Editor – can add profiles but not able to configure
 * Viewer – comment on content, but not edit
 * Public – just look at content

There is an option to make profile data (collection) private, ie: limited access.

Admin privileges:
 * Access controls: assign permissions, can configure for entire collections (profiles), can do announcements in social media, upload glossary & about page info.
 * Profile page: species name & nomenclature, authorship of profiles, map & images, boxes of text (attributes configured by administrator), sources or link can be added in, free text attributes. Attributes contain description, notes, etc.

Reviewer comments on level of availability to public. Can follow edit history in Edit mode. Images – uploading is through ALA, but there is an option NOT to make editing available to the public. There are options to use APNI for species names. Author & acknowledgements can be separated from other input; only the Administrator can set the options available, ie: general, none, or lots of options.

Link to webinar to be distributed. The eFlora platform webinar is c. 3 hrs long: https://youtu.be/un-dWczr3x4. Please note, breaks between sessions are at 0:59:30 and 2:00:30.

Old profiles can be archived to follow history after splitting/combining taxa, or for ms treatments (locked) or closed (only available to editors or higher level).

A Feedback tab is available for reporting improvements/issues with the system. Snapshot view will create a unique DOI for later down the track – only available for editor or higher. Viewers can generate PDFs of profiles. In future, the aim is to be able to download LuCID keys, but some issues encountered are currently being resolved.

DigiVol (Paul Flemons)
An institution sets up a home page & each project is called an “expedition”. Original data is digitised and volunteers transcribe the data, which is then validated by the project manager. To generate an expedition, you need images to upload (tutorial available) & you are expected to manage your site. DigiVol manages the general site & is there to assist, but ALL the rest is to be maintained by the institution.

Admin section:
 * Develop templates
 * Lots of functionality for creating & managing
 * Format downloads via CSV

In templates, the Administrator can define attributes, including HOW they appear. Geospatial options are available to improve lat/longs on old specimens. Updates include changes to front pages & the template; the new template format resembles a questionnaire. The contact person is Paul Flemons (Paul.Flemons@austmus.gov.au) from the Australian Museum.

NSL demonstration and update (Anne Fuchs and Niels Klazenga)
https://biodiversity.org.au
Aim is to have a single list of published names for ALL Australian biota (vascular plants, cryptogams & fauna). It will consist of names & classification. Services available include queries and information. Current options include APC and APNI:
 * APNI – has identical output to previous set up (nomenclature & significant publications).
 * APC – just gives accepted names. It does give usage (synonymy etc) but then gives link to APNI.

Advanced search has various options which we are encouraged to try. NSL wants feedback on the searching options from the herbarium community. The “I” above the search gives you information on such usage as wildcards etc. Anne is aware that more work is needed to improve.

At the base of a profile there is a link sign. If querying a species list within a genus, there is no need for quotation marks around the genus name. Via copy/paste, it is possible to do name checks via the tab. The data is available via the URL at the end of the name. Comments/queries via:
 * Twitter - @AuBiodiversity
 * Feedback button on side bar
 * Email: ibis@anbg.gov.au (IT problems)
 * Email: cpbr-info@anbg.gov.au (data issues)

Future tasks for NSL:
 * Ensuring APNI & APC are working concurrently
 * Moss Catalogue is currently being worked on by Niels and is close to being uploaded
 * Interactive Catalogue of Fungi
 * Checklist of Lichens
 * Australian Marine Algae Index & other datasets
 * Australian Faunal Directory (eventually)

Other future outputs include increased services, extract formats & technical/ user documentation. Quality assurance & editing is expected via community input.

= SESSION 2 =

ALA update (Nick dos Remedios)
The ALA front page has a new design & interface, including an ALA apps menu. ALA has implemented an Agile approach consisting of sprints, where 2 teams of 3-5 developers work on specific projects. The current focus is on core work, including:
 * devoting time on names in their data due to the NSL changes.
 * performance - there are issues with searches, especially time out issues. This is possibly due to large data downloads affecting concurrent small data downloads. A sprint has been implemented to deal with this by splitting the servers into (1) a download server and (2) a search server.
 * Open source software - other countries are using the ALA software for their own projects (Brazil, Spain, Argentina etc).
 * Citizen Science Portal (part of the community portal) - currently re-writing the system as it has become outdated. It is often used to capture observational data eg: Koala counts.
 * Image repository - allows high resolution images and metadata associated with images. More detail is served with image zooms (technology is similar to that of Google maps). There is a ruler bar option so you are able to do measurements. Linked sub images (close ups) will be a new feature. Will allow you to concentrate on label data etc.

ALA - Herbarium Community Projects, reports (various)
 * DNA: fungi databasing (Donna Lewis)
Encountered issues with taxonomy & data quality. Used various online resources to apply current taxonomy & understand synonymy; still have 180 unidentified specimens. Collector & old locality names were resolved through a gazetteer. 207 taxa added to NT species list, 527 non vascular taxa for NT, 726 specimens databased. Benefits included curation and a preliminary non-vascular species checklist, which will be used as the basis of the FloraNT non-vascular flora.

 * AD: imaging artworks associated with JB Cleland & CG Hanaford fungal collections (Ainsley Calladine)
Originally commissioned to illustrate fungi found in SA. Based on fresh material & often linked to vouchers 20/30 years ago, watercolours were cut & put into a folio. The project imaged & separated the individual illustrations; 532 artworks were imaged. 1/3rd are linked to a voucher & are fully identified. Others are identified as best as possible (genus or to species). AD is the custodian but is not clear on the copyright issue regarding permission to publish images. Currently determining the best way to display – as a coherent collection that is taxonomically useful, but also aesthetic.

 * MEL: development of best practice guidelines to digitise cryptogams in packets (Pina Milne)
Pina Milne facilitated a workshop held 21-22 May 2015 with Australian and NZ participants. Pina Milne directed proceedings, Tom May (MEL) provided advice for imaging fungi & Chris Cargill (CANB) provided advice for imaging liverworts. Angharad Johnson (Digitising and Database Officer) provided the technical advice on using the Leaf Aptus camera. Best options were to photograph via Leaf Aptus camera using multiple images to see label & specimen details at various magnifications. Output – a document that has a work flow to take images, plus comprehensive appendices with scenarios of various issues.

 * MELU: DigiVol expedition of Burnley Horticultural Collection (Gillian Brown)
Recently acquired 6000-7000 horticultural specimens from Burnley Ag College. The 1st run was to database all legumes in the collection. Nomenclature was curated and cleaned 1st. DigiVol tutorial & template clunky to use, but currently being upgraded. 455 specimens completed by 27 volunteers in 13 days. Data now ready to be uploaded to AVH. High resolution images of all specimens available to deliver to ALA. Validation & feedback to volunteers very important during the process to avoid repetitive errors. Next expedition 22-25 October 2015 to database all “edible” species.

 * MELU: digitising of Banksia collection (Gillian Brown)
Banksia collections based on Kevin Thiele’s PhD thesis. Miegunyah Fund granted money to have specimens mounted. Total 740 new records from 94 taxa, plus hybrids. Minor curation of specimens. Able to match cotyledon, follicle & seeds to herbarium sheets. 763 sheets with 213 seeds & fruits. Developed a protocol to image bulky collections ie: carpological specimens.

 * UNSW: databasing Alligator River special collections (Pina Milne on behalf of Frank Hemmings)
2304 specimens databased. 944 new determinations on the specimens (most housed in indet to genus/family boxes). Added 16.7% to the collection database. Many specimens were of rare or narrow endemic species. Currently specimens are going through cleaning of the database before submission to AVH.

 * UNSW: cleaning of data prior to moving to new platform (Pina Milne on behalf of Frank Hemmings)
Previous platform not compatible with AVH/HISPID standard. Started Feb 2015 & due to end October 2015. All records needed name checking, and it was necessary to re-organise data fields to be compatible for AVH migration. Other funds are required for completion. Output: 13,325 records now ready for migration to new platform; 475 records need to be fixed. New database is via UNSW then next year through AVH. All future records will be in correct/compatible format.

 * HO: database algal collections (Lyn Cave)
Dr Fiona Scott employed for 3 months. Concentrated on green algae 1st. Checked current names & redetermined where necessary; geocoded both Tasmanian & overseas collections. 1863 records generated with 365 species, although not all specimens determined to species level. 30 specimens were de-accessioned. With the remaining time, recently worked on 400 red algal specimens. Fiona to stay on as Honorary Botanist.

Action 1: ALA Herbarium community projects managers to write an article for the AVH/ALA news feeds. Minkey Faber from the ALA to assist where required. (Project Managers).

Update on HISPID Review (HISPID Review working group)
Review of terms within HISPID, aligning them with DarwinCore etc. The March 2015 meeting worked on standards and issues, with the majority now ratified by HISCOM. Terms and issues are managed through GitHub. HISCOM raised the following Classes/Terms with MAHC for advice or input:

 * Consent/Permit: this is a collective term for permissions. Examples include: "collecting permit", "export permit", "verbal authority". MAHC input is needed on what a permit actually is & the aim is to parse this from the different institutions.
 * Transactions: an interaction between two herbaria, e.g. loan or exchange. Some of the terms were removed as they were duplicated in other Terms. It is possible to use AVH for loans. MEL currently sends a url to you & it contains a list of the loaned material (Listed on AVH). Loan sequence number and shipping method terms have been removed.
 * Occurrence - otherCatalogueNumbers: A list of previous or alternate catalog numbers for the same Occurrence e.g., for the catalogue numbers for exchanged specimens when known or of multisheets. Has a new definition since HISPID3 & is now much broader. Now allows alternate duplicate catalogue numbers for the same specimen & NOT subsequent numbers for a multisheet specimen.
 * Event - collectingTripName: the collecting trip or expedition where the specimen was collected e.g. Bush Blitz surveys.
 * MediaItem - incorporated from TDWG multimedia. This is to facilitate image exchange with specimens.

Issue surrounding verbatim data (by collector) vs interpretive fields (added later by data entry, or in house). Discussion ensued over verbatim & interpretive data. Should it be either, or both? Is it 2 sets of individual data? How much extra data is no longer necessary given locality data can be autogenerated from other spatial datasets?

Discussion around physical data (specimens) vs digital data (database). What is the current point of truth (databases)? The original data is on the specimens.

Aaron suggested a summary for discussion in 6 months' time. Another issue – if changes are made by ALA or other institutions, how does the data get back to the home institution?

HISPID Verbatim Terms include:
 * Event: verbatimEventDate
 * Location: verbatimLocality, verbatimLatitude, verbatimLongitude, verbatimCoordinates, verbatimCoordinateSystem, verbatimSRS, verbatimElevation, verbatimDepth
 * Identification: verbatimDateIdentified
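
One way to picture the "both" option from the verbatim vs interpretive discussion is to leave the collector's value untouched and hold any interpretation in a separate field. The following is only a minimal sketch under stated assumptions: the `decimalLatitude` field name is borrowed from Darwin Core rather than taken from these minutes, the `interpret_latitude` and `build_record` helpers are hypothetical, and only one simple degrees/minutes/seconds label layout is handled.

```python
import re

# Hypothetical helper: derive an interpreted decimal latitude from a verbatim
# label string such as 42°52'30"S. Only this simple D°M'S"H layout is handled;
# anything else is left uninterpreted (None).
_DMS = re.compile(r"""^\s*(\d{1,2})°\s*(\d{1,2})'\s*(?:(\d{1,2}(?:\.\d+)?)")?\s*([NS])\s*$""")


def interpret_latitude(verbatim):
    match = _DMS.match(verbatim)
    if match is None:
        return None
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds or 0) / 3600
    return -value if hemisphere == "S" else value


def build_record(verbatim_latitude):
    # The verbatim value is never overwritten; the interpretation sits beside it.
    return {
        "verbatimLatitude": verbatim_latitude,
        "decimalLatitude": interpret_latitude(verbatim_latitude),
    }


record = build_record("42°52'30\"S")
print(record)
```

Keeping the two values in separate fields means a later correction to the interpretation never touches what was written on the label, which is the concern behind the "point of truth" question.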

Action 2: Institutions to discuss what data remains verbatim and what data can appropriately be interpreted (ALL).

Action 3: Circulate spreadsheet incorporating the verbatim terms for institutions to compile requirements/comments (Donna/Niels?).

Action 4: Send HISPID Terms and any outstanding GitHub Issues to MAHC when deemed appropriate for feedback (Niels).

Report on TDWG meeting (Niels Klazenga)
A new standard is being produced for the GPI project. A workshop was conducted for African delegates to help with data capture and geospatial data. Politicians attended and were very interested in bioinformatics. Africa in general is working on finding alternative income sources other than mining. The meeting also had talks on general biology & applications. Sessions included genetic resources, global gene biodiversity & a World Flora presentation. There were updates on Flora North America & the establishment of Flora Canada using the template of Flora North America. An interest group was formed on bioinformatics services regarding capture of new annotations.

= SESSION 3 =

Digital Collections update (Simon Checksfield)
Update on NRCA digital strategy. Formerly there were 6 different collections databases that were inconsistent. A software evaluation process is under way to work out the best single system for all collections. The collection management system evaluation identified 18 candidates and shortlisted 6 (4 commercial, 2 open-source), which was agreed to be too many. The final shortlist was 3 products; initial testing comprised 1 commercial and 2 open-source. The final product is CollectiveAccess PHP developer. The pilot test used ANIC as the test case. Other databases will then be tested. So far recommendations from users are very positive. In particular it was deemed excellent in performance in loans, data entry, mapping & taxonomy. The process is using an Agile approach consisting of 6 sprints. Living collections are problematic.

Issues associated with images - eg. access to GPI images, technical matters associated with delivery of images to ALA
Herbaria have a national subscription to JSTOR for global types. The vision is to be able to access Australasian types via AVH. Images can be provided to the ALA and are then uploaded to the BioCache. There is a sprint currently underway to link the images to specimens(?). There is a DarwinCore extension for MediaItem, which has now been incorporated into the new HISPID standard. Institutions can provide GPI images to the ALA by supplying a CSV metadata file that points to the URL from which each image can be downloaded.
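
The CSV route described above might look roughly like the following sketch. The column names (`catalogNumber`, `imageUrl`, `licence`), the example catalogue number and the example.org URL are all placeholders chosen for illustration; the columns ALA actually expects are not recorded in these minutes, and the `build_image_metadata_csv` helper is hypothetical.

```python
import csv
import io

# Hypothetical column names; the real ALA metadata layout may differ.
FIELDNAMES = ["catalogNumber", "imageUrl", "licence"]


def build_image_metadata_csv(rows):
    """Return CSV text with one line per image, each pointing at a download URL."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Reject rows whose image location is not a web address.
        if not row["imageUrl"].startswith(("http://", "https://")):
            raise ValueError(f"not a downloadable URL: {row['imageUrl']}")
        writer.writerow(row)
    return buffer.getvalue()


# Placeholder data for illustration only.
example_rows = [
    {
        "catalogNumber": "XYZ 0000001",
        "imageUrl": "https://example.org/images/XYZ0000001.jpg",
        "licence": "CC BY",
    },
]
csv_text = build_image_metadata_csv(example_rows)
print(csv_text)
```

The point of the layout is that the image files themselves stay on the institution's server; only the metadata file changes hands.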

Action 5: Provide an instruction document to HISCOM/MAHC members outlining the steps involved in supplying ALA/AVH GPI images. (Niels).

Discussion on images after GPI. How are institutions prioritising image digitisation? Some examples:
 * AD: exemplar specimens
 * BRI: threatened species
 * DNA: threatened species and restricted range taxa
 * NZ: Historic collections - earliest date
 * MEL: Cryptogam types
 * PERTH: has a scan of each taxon with Conservation Status, but also starting to talk about scanning loans. Also have a high rate of new accessions and it would be very difficult to keep up or start digitisation.
 * MELU: targeting special collections such as those used for teaching and exhibitions. Has been very useful for cleaning up the database as can just call up the image and check the data

Items identified from MAHC meeting - eg. data or tools specific items TBC
MAHC members each had the opportunity to outline how they use AVH and provide feedback on features. The purpose of this exercise was to identify improvements to the functionality etc of the AVH. Some examples:
 * Curation of taxa, distributions, creating a list of records in a particular area, identifications of replicates in different institutions
 * Spatial searches for a variety of uses including preparation for fieldwork.
 * Researchers to work out which specimens to borrow and institutions to request loans from.
 * Being able to see an image of the taxon would be invaluable
 * Generate distribution maps for flora treatments
 * Alerts – when people annotate specimens it is great to get the automated alert email.
 * Historical research eg. MEL and AD, to find contemporary and old records.
 * Rare and threatened species assessments, both state/Territory legislation and EPBC, and new weed notifications need to see where else they occur.
 * Validation tool to look for mistakes in lat/long, especially those that should be on land but are in sea.
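
The land/sea validation mentioned in the last item can be pictured as a point-in-polygon test. This is a toy illustration only, not how AVH or ALA implement it: the `ROUGH_LAND` outline is a hypothetical rectangle standing in for a real coastline dataset, and the record identifiers and coordinates are invented.

```python
# Toy "land" outline as (longitude, latitude) vertices. A real check would use
# a proper coastline dataset; this rectangle is purely illustrative.
ROUGH_LAND = [(145.0, -43.5), (148.0, -43.5), (148.0, -41.0), (145.0, -41.0)]


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count polygon edges crossed by a ray heading east."""
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        # Does this edge straddle the point's latitude?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            crossing = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < crossing:
                inside = not inside
    return inside


def flag_records_in_sea(records, land=ROUGH_LAND):
    """Return the ids of records whose coordinates fall outside the land outline."""
    return [
        rec["id"]
        for rec in records
        if not point_in_polygon(rec["lon"], rec["lat"], land)
    ]


records = [
    {"id": "rec-1", "lon": 147.3, "lat": -42.9},  # inside the toy outline
    {"id": "rec-2", "lon": 150.0, "lat": -42.9},  # east of the toy outline
]
print(flag_records_in_sea(records))
```

A flagged record is only a candidate for checking; marine taxa such as algae legitimately fall outside a land outline.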

The following items were raised in MAHC and discussed at the joint HISCOM/MAHC meeting. Many of the bug issues were documented by Nick dos Remedios to be followed up on. Some issues are already in GitHub, and those that aren't will be added.
 * Advanced Search (Peter Jobson): collectors - restricted to 15, e.g. when searching for A.R. Bean. Nick explained the issue here. It is also a data issue.
 * Names in AVH/ALA (Anne Fuchs): processed vs provided.
 * Data custodians (Frank Zich): dealing with issue alerts (outliers etc). Flagging issues - but cannot delete after the issue has been resolved.
 * Additional analytical tools (Peter Jobson): customize charts so there are taxonomic charts - need by family, genera, common species. Currently not user friendly for the general public.
 * Conservation codes (Donna Lewis): need to clarify the source and the legislation (i.e. EPBC, TPWCA). Data currently is incorrect and misleading for users.
 * Duplicate information (Deb Bisa): when duplicate information is changed, how can other curators be alerted to these changes? Can a report be generated for updates on duplicates?

Action 6: Follow up on any bug fixes to the AVH based on the above. (Nick).