HISCOM 2012 AGM Canberra minutes

12–14 November 2012 (HISCOM meetings 12–13 November, HISCOM/CHAH and HISCOM/MAHC (did I miss this, or was it CHAH/MAHC? -NielsKlazenga 22:35, 21 December 2012 (EST)) meetings, Launch of AVH and OZCAM - 14 November)

Crosbie Morrison Building, Australian National Botanic Gardens, Canberra

Attendees
Peter Bostock (BRI), Gary Chapple (NSW), Wayne Cherry (NSW), Ian Cowie (DNA), Jim Croft (CANB), Anne Fuchs (ANBG), Niels Klazenga (MEL), Dave Martin (ALA, part of meeting), Matt Miles (AD), Kevin Thiele (PERTH), Helen Thompson (ABRS), Alison Vaughan (MEL) (Minutes), Michelle Waycott (AD), Greg Whitbread (CANB), Aaron Wilton (CHR) (Chair)

Wednesday morning:  Peter Brenton (ALA), Peter Doherty (ALA), Paul Flemons (FCIG), John Hook (CANB), Paul Murray (CANB)

Apologies
Eleanor Crichton (AD), Beth Mantle (FCIG), Laurence Paine (HO), Ben Richardson (PERTH), Brett Summerell (NSW)

1. Minutes from previous meetings
The minutes of the previous meeting were accepted without modification.

The outstanding actions from previous meetings were reviewed; most will be discussed under other agenda items. Greg thinks he found the minutes of the 2010 AGM on the Internet Archive, but they need some work to make them readable.

Action 1: Attempt to recover the minutes of the 2010 meeting. (Greg Whitbread)

2. Wiki
The Wiki is running quite well at the moment. The URLs no longer include ‘index.php/’; several extensions have been installed (e.g. syntax highlighting, better user administration tools). The Wiki needs to be cleaned up and be better organised, using categories to give it a more hierarchical structure.

Niels is happy to run the Wiki on the MEL server for now, but it makes sense to host it more centrally. Niels can’t get the reverse proxy to work in Melbourne. The CHAH website is currently hosted by Aussie Hosting (a Sydney-based company), which is also used for CHAHBG, CABC, the Australian Seedbank Partnership, BGANZ etc. It seems to be working okay, but we would lose the short URLs if we transferred the HISCOM Wiki to it.

Some people have multiple user profiles in the Wiki, and other people don’t have access. The user list needs to be cleaned up.

Action 2: Get the reverse proxy to work at MEL, or put the HISCOM Wiki back in the Aussie Hosting domain. (Niels Klazenga, Greg Whitbread, Ben Richardson)

Action 3: Provide a short workshop on using MediaWiki on Tuesday afternoon, if time permits. (Niels Klazenga)

Action 4: Re-organise the Wiki and add user administration tools for the current version. (Niels Klazenga)

Action 5: Delete spam users from the Wiki and ensure past and present HISCOM members have appropriate editing privileges. (Niels Klazenga)

3. Roles
FCIG has recently adopted Technical Manager and Web Content Manager roles. It was agreed that this could be a good model for HISCOM to follow. It was pointed out that FCIG is a much smaller committee, so HISCOM may need to have more than two specialised roles. It was also noted that we need to be careful that the adoption of such roles doesn’t exclude participation by others in the committee. It was suggested that, rather than having a single person responsible for each function or role, two or three people could be jointly responsible for it.

It was agreed that the roles should be adopted, but that they should be coordination roles, instead of management roles.

The draft roles were discussed and modified. It was agreed that we need a separate User Liaison role.

We need to consider how these roles fit in with the proposed AVH management group.

AVH now has a news page; there was some discussion about appropriate content and how the creation of news items might be coordinated. This would be seen as part of the Web Content Coordinator role, if it is adopted.

We need to discuss with CHAH how the CHAH website will be managed, and what interaction between HISCOM and CHAH is needed with regard to web content.

The User Liaison is responsible for:
 * Coordinating responses to feedback from users
 * Administering user access to sensitive data.

The Technical Coordinator is responsible for:
 * Being a point of contact for ALA and others for technical matters
 * Coordinating the management of:
   * the HISCOM and CHAH portals
   * the HISCOM Wiki
   * data delivery mechanisms for the AVH
 * Coordinating responses to technical enquiries regarding the AVH, data access, delivery and standards.

The Web Content Coordinator is responsible for:
 * Being a point of contact for HISCOM Wiki and AVH static content
 * Maintaining web content related to HISCOM activities
 * Updating, and soliciting, content from HISCOM to maintain the news feed
 * Ensuring that AVH and CHAH websites meet accessibility, security and usability standards.

Action 6: Finalise the role descriptions and circulate to CHAH and HISCOM for further consideration. (Aaron Wilton)

4. Election of Chair and other roles

 * Aaron was re-elected as Chair for the coming year.
 * Alison Vaughan was nominated for the role of User Liaison.
 * Niels Klazenga was nominated for the role of Technical Coordinator.

We elected Michelle as Deputy Chair, and thanked Brett for his contributions.

Recommendation 1:  CHAH to endorse the following nominations:
 * Chair: Aaron Wilton
 * Deputy Chair: Michelle Waycott
 * Technical Coordinator: Niels Klazenga
 * User Liaison: Alison Vaughan

5. Terms of Reference
The changes agreed at the last HISCOM meeting have been incorporated (e.g. that the Chair of HISCOM is not a CHAH member, and that the Deputy Chair is a member of CHAH). It was agreed that subscription to the HISCOM e-mail list (and inclusion in other communication mechanisms) is granted by request, approved by the HISCOM Chair.

The wording around the nomination of transitory roles was modified, and the period of office for executive roles was clarified.

The sections about the creation of Project Manager and Coordinator roles were removed from the ToR.

It was agreed that having an ALA technical staff member as a permanent member of HISCOM would be greatly beneficial.

Recommendation 2: That CHAH endorse the proposal to have an ALA staff member as a permanent member of HISCOM.

6. HISCOM-L
Ben, Greg and Jim are responsible for moderating HISCOM-L. Jim has been moderating the list based on whether or not something is relevant to the group, rather than based on who is on the list.

The HISCOM list was then reviewed.

Action 7: Remove John Tann and avh@googlecode.com from HISCOM-L, and update Rex Croft’s e-mail address. Check with Jeremy Bruhl whether Jon Burne should still be on the list. (Jim Croft, Ben Richardson)

Aaron noted that the HISCOM list has low traffic compared to the MAHC and CHAH lists, and suggested that it could be used more effectively.

2. AVH management group
The idea of developing an AVH management group to allow HISCOM to move away from AVH and focus on new projects was discussed. The original view of AVH was that it included everything to do with what goes on with a herbarium, so it was felt that it doesn’t make sense to split off the management of ‘AVH’ to work on ‘other’ projects. There was a strong feeling that there is no need for an AVH management group.

Recommendation 3: That CHAH not endorse the formation of an AVH Management Group.

3. Brief report on ALA activities
Dave Martin provided an overview of recent ALA activities:


 * Released Version 1 of Delta
 * Funded by Australian National Data Service to supply web services developed by people at JCU to produce species distribution polygons that can be used for quality assertions; this can now be used for other taxonomic groups (it was developed for birds)
 * Tools to assist in the identification of weeds
 * Developed a hub for the Australian Seedbank Partnership
 * Developed a hub for the microbial community
 * Looking at developing a hub for OBIS
 * In February a tool called Fish Map will be released
 * PhyloJive: integration of phylogenetic trees, point data and characters from Identify Life so you can map characters or species (Jim encourages everyone to have a look at the maps; the dots on maps for multiple species or characters look really good)
 * Volunteer Portal: work on templates so people transcribing data from labels can use existing data to search for related collecting events etc.
 * Sandbox: tool that allows you to upload data and pass it through the same sort of validation and integration tools so it can be tested; it can also be faceted against ad hoc parameters (see blog post about termite mounds data)
 * Duplicate record detection: in the case of observations it is looking at duplicate records (i.e. the exact same record coming in via different pathways); for specimen-based records it’s looking for duplicate specimens. The notion of a representative record is used to indicate which is the more representative or reliable record, e.g. if two records asserted to be related have different levels of geocoordinate precision, the record with the higher precision will be the representative record.
 * Species interaction work: working closely with Greg; data mining to extract species interactions between Acacia and wasps.
 * Soils to satellites project: producing a single portal that combines distribution data and ecological data
 * ALA blog: http://www.ala.org.au/blogs-news/
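
The representative-record idea in the duplicate detection item can be illustrated with a minimal sketch. This is not ALA's implementation; the Record class, its field names and the use of coordinate uncertainty in metres as the measure of precision are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    record_id: str                   # hypothetical identifier
    uncertainty_m: Optional[float]   # assumed measure: coordinate uncertainty in metres


def representative(records: List[Record]) -> Record:
    """Pick the record with the smallest known coordinate uncertainty."""
    if not records:
        raise ValueError("need at least one record")
    # Records with unknown uncertainty sort last; ties keep input order.
    return min(
        records,
        key=lambda r: (r.uncertainty_m is None, r.uncertainty_m or 0.0),
    )
```

Given a group of records asserted to be duplicates, representative(group) returns the one with the most precise coordinates; the other records stay in the group and are not discarded.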

It was noted that there are a number of ALA projects running in the faunal community, but none in the botanical community.

1. Data delivery
Niels has summarised which fields are delivered to AVH: AVH data providers. All herbaria should deliver as many fields as they can. The fields to prioritise providing are:


 * typification fields
 * loan fields
 * cultivated flag
 * phenology (reproductive condition in DwC)
 * determination history (previous identification in DwC)

BioCASe v. 2.5 onwards has debugging on by default.

Action 8: Herbaria to review the fields they are providing and deliver additional fields if possible. (All)

Action 9: Herbaria to list which HISPID concepts they use in their collections database. (TBA; someone needs to make a place for this to be recorded)

Action 10: Provide those herbaria who have debugging turned on with a list of the errors that their BioCASe providers are reporting. (Niels Klazenga)

2. Engagement with university herbaria
The University of New England (UNE) will start delivering to AVH via a BioCASe provider soon. There was a suggestion that the state and territory herbaria could play a mentoring role with their local university herbaria. The focus of university herbaria varies; some are primarily teaching herbaria, but others participate in loans and exchange programs too. There was an acknowledgement that there needs to be interest and enthusiasm from the university herbaria for a mentoring/knowledge-sharing role to be effective.

Action 11: Write a news item about collections management and data delivery in a university herbarium (e.g. UNE) and communicate to the university herbaria. (Alison Vaughan, Niels Klazenga, Michelle Waycott)

3. Memorandum of Understanding (MoU)
There needs to be a Memorandum of Understanding that outlines what fields must be delivered, what data delivery mechanism should be used, how often data should be updated, and what licence it should be delivered under. We might also need a separate MoU to cover the delivery of images (these might not all be delivered under a CC-BY licence).

4. Images
If records come in with a URL for an image, ALA can consume that. If that is not possible, image files can also be sent in, with a table that links the file name with the record identifier. ALA has been working on Morphbank; the new version of Morphbank has just been released.

Jim noted that it is important that we manage our image data with the same rigour that we curate our specimen data.

Action 12: Address image storage and metadata with the same rigour that we address specimen storage and metadata. (All)

There was some discussion about how to obfuscate sensitive data on labels in digital images.

5. User feedback
Alison provided a summary of the feedback from AVH users. The User Voice system was disabled in September after we realised that users were not getting alerted when someone had responded to their feedback. Other feedback has come in through avh@rbg.vic.gov.au, avh@ala.org.au and webmaster@chah.org.au.

Several bugs were reported (e.g. problems with taxon name queries, problems using AVH in IE8, missing fields), which were resolved quickly by the ALA team.

The main requests for improvements have been:


 * Implementation of a bounding box query (which was available in AVH3)
 * Requests to have the map tab first (FCIG are happy to make this change in OZCAM too)
 * Better rainfall and temperature layers, so the values can be interpreted more easily.

There was some discussion about whether avh@ala.org.au is the best email address to use, or whether it should be avh@chah.org.au.

Action 13: Transfer suggestions for new functionality and bug fixes (with suggested priority ranking) into Google Code and circulate summary to HISCOM and CHAH (Alison Vaughan)

Action 14: Circulate a summary of other feedback to HISCOM and CHAH. (Alison Vaughan)

Action 15: See if we can change the contact e-mail for AVH to avh@chah.org.au. (Greg Whitbread)

6. AVH Annotations
Dave gave a demonstration of how the annotation system works.
 * The OZCAM community has provided some feedback on the issue types. ALA hasn’t received any feedback from the AVH community, but the list of issue types can be expanded if necessary.
 * Implications of flagging a geospatial issue, incorrect habitat or suspected outlier: the record is not removed from the biocache, but is removed from the spatial portal.
 * There is a facet missing; you should be able to facet by records with annotations; Dave will look into this.
 * Contacts on the Collectory page are authorised to respond to annotations within AVH (i.e. to verify them); there can be multiple contacts for the one institution (collections manager, data manager etc.). Send details directly to Miles if you want extra contacts added (or if any other details need to be updated: collections.ala.org.au).
 * HISCOM would like to ingest annotations from AVH and pass responses back to AVH with data delivery.
 * Discussion of delivering HISPID verification level flag and displaying it on the Record detail page.
 * It was pointed out that it would be good to deliver herbarium-based annotations to AVH.
 * Discussion of how to prevent multiple annotations of the same type for the same specimen.
 * The system keeps a copy of the annotated record, and will check newly uploaded versions of the record to see if the part of the record that the annotation relates to has been changed; the original annotation would not be removed, but would be flagged as possibly resolved (this facility is still being tested, but the capability is there).
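
The re-checking behaviour described in the last item can be sketched as follows. This is only an illustration of the idea, not the ALA code; the annotation structure (a dict with "fields" and "snapshot" keys) and the function name are hypothetical. The Darwin Core terms decimalLatitude and decimalLongitude are used as example field names.

```python
def recheck(annotation: dict, new_record: dict) -> dict:
    """Flag an annotation as possibly resolved if an annotated field changed.

    The annotation is never deleted; a copy with an updated flag is returned.
    """
    snapshot = annotation["snapshot"]   # copy of the record as it was when annotated
    changed = any(
        snapshot.get(field) != new_record.get(field)
        for field in annotation["fields"]  # the parts of the record the annotation relates to
    )
    updated = dict(annotation)
    updated["possibly_resolved"] = changed
    return updated


# Example: a geospatial annotation made against the original coordinates.
annotation = {
    "issue": "geospatial issue",
    "fields": ["decimalLatitude", "decimalLongitude"],
    "snapshot": {"decimalLatitude": -35.28, "decimalLongitude": 149.11, "locality": "Canberra"},
}
```

A newly uploaded version that only changes an unrelated field (such as locality) leaves the annotation open, while a change to either coordinate marks it as possibly resolved for a collections contact to verify.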

Action 16: Each herbarium to check the contacts on their Collectory page and send updates to Miles Nicholls. (All)

7. Sensitive data
Need to clarify the procedure for approving who gets access to sensitive data, and for how they access it (blanket access, or one off data dump sent to them).

Recommendation 4: That the User Liaison person acts as gatekeeper for who has access to sensitive data.

Action 17: User Liaison person to learn how the sensitive data gatekeeping works. (Alison Vaughan, Dave Martin)

8. Data delivery to GBIF
If AVH data is delivered to GBIF, the default action is for the whole AVH data archive to be downloaded in one go. The sensitive data is masked in the archive. An AVH archive has already been created, but has not been exposed to GBIF yet.

Recommendation 5: That CHAH endorse exposing the AVH archive to GBIF.

9. Future work on AVH
ALA does not have as many resources to commit to AVH as it has had previously, so any proposals for development will need to be properly documented and prioritised so that the time needed and resources available can be assessed.

Enhancements will need to be discussed in detail with the ALA team so that they can be mapped out and planned, as resources are limited.

10. Collectory/Resources of Australasian Herbaria (RAH)
CHAH had decided that the ALA collectory pages would become the new Resources of Australasian Herbaria, and would form part of AVH, but it is unclear how the work towards this is progressing.

Action 18: Check how far the ALA has progressed with replacing Resources of Australasian Herbaria with the collectory pages. (Helen Thompson)

At the moment there is no relationship between the AVH static content and the Collectory. The Collectory pages provide useful statistics on the number of records delivered to AVH, and the number of records downloaded. It was agreed that institutional or departmental logos should link to the relevant institutional or departmental page, but that we should include a list of AVH data providers on the AVH data page, which will link to the Collectory pages.

Action 19: Add a list of data providers to the AVH data page, and link them to the Collectory. (Niels Klazenga, Alison Vaughan)

Action 20: Suggest that FCIG adopt the same strategy of linking to the Collectory from OZCAM. (Aaron Wilton)

It was agreed that it would be good to have a MAHC contact as well as a HISCOM contact (i.e. a collections manager and a data manager) on each institution’s Collectory page.

Recommendation 6: That CHAH endorse the idea of having a MAHC and HISCOM person on each institution’s collectory page.

5. World Flora Online
The World Flora Online (WFO) Technical Working Group (TWG) has been established, and Terms of Reference have been written. Chuck Miller is the chair of the TWG; Greg Whitbread is a member. There have been discussions about the best way of developing the WFO, with some concern that the current proposal is not the best way forward. The Kew-based members of the TWG are coming to Canberra in early 2013 to discuss IPNI2 and whether it can form the basis of WFO. It is important that ABRS and other institutions who are working on online flora projects are kept in the loop.

Action 21: Keep HISCOM members in the loop with the progress of the WFO TWG, so that HISCOM can comment and provide feedback where appropriate. (Greg Whitbread)

6. National Species Lists
Greg provided an update on the National Species Lists project. Details about the different services are available at: biodiversity.org.au/.

Greg talked about taxonomic concepts. And names. At length.

In AVH, there are separate classifications for the names that are in APC and those that are not. The two classifications overlie each other in AVH.

7. MEL’s quality control tools
Alison demonstrated MEL's Fancy Quality Control Machine (FQCM) and GPI error checker, and discussed the enthusiastic uptake of the tools by data entry staff.

There was strong interest from other HISCOM members in this approach, and a request for the underlying queries used to detect data entry errors.

Action 22: Circulate copies of the queries used in the MEL FQCM and GPI error checkers. (Alison Vaughan)

Expedition ID

 * FCIG had a discussion about the need for a transfer field for Expedition ID, so specimens in different collections that were collected on the same expedition (e.g. BushBlitz) can be effectively queried; who records expedition or collecting trip in their database?
 * Agreed that this is important and useful (implications for collecting permits etc.)

Field names etc.

 * The idea of making the vocabulary consistent across the OZCAM and AVH sites was discussed; FCIG agree that this is a good idea
 * FCIG agree that it would be better to have the maps tab first in the results, instead of the list of matching records.

Biodiversity Heritage Library (Joe Coleman)

 * Scanning takes approximately 4 hours per book (300 pp.), plus metadata processing (0.5-1 hr) and marking articles
 * They decided it was not viable to set up mobile scanning stations (less efficient than centralised scanning; loss of volunteers and local knowledge)
 * Sites with digital publications can upload them remotely (Alison has e-mailed Joe Coleman from MV requesting more information about how this works)
 * A program called Macaw is used to clean up the scans (it makes PDFs look better, so reasonably poor quality PDFs can actually turn out okay)

9. AVH Trust Project
The AVH Trust is funding the writing of a proposal for the digitisation of herbarium specimens from Papua New Guinea. There is an additional $60,000 available from the AVH Trust. Two project ideas were discussed:


 * Resolving issues delivering high-res images to ALA (there was concern that this is too closely linked to ALA for the AVH Trust to want to fund it)
 * APNI/APC editor (there was support for this idea): $58,000 would fund completion of the APNI/APC editor and allow re-use of APNI/APC for the development of state floras

Recommendation 7: That CHAH considers the completion of the APNI/APC editor as a project that could be funded by the AVH Trust. If CHAH wants to take this forward, HISCOM will develop a project proposal.

10. Wiki editing workshop
Niels gave a short workshop on how to edit the Wiki. Editing help is available at http://meta.wikimedia.org/wiki/Help:Editing.

11. HISPID
At the last HISCOM meeting, several required changes to HISPID were flagged. Since then, Niels and Alison have identified some additional areas where new concepts could be added or existing vocabularies need to be updated. Niels has added some notes to the HISPID 5 for HISPID Users page.

This is a good time to review HISPID, as Walter Berendsohn is planning to bring out a new version of ABCD. We should aim to have a list of additions and fixes for ABCD ready to be sent off to Berlin very soon after the HISCOM meeting.

Action 23: Create a HISPID review team. (Aaron Wilton)

1. Hybrid formulas
HISCOM wants to be able to deliver atomised hybrid formulas, as well as the full scientific name of the hybrid's parents. Although this won’t be used by ALA, it will be useful for direct harvesting from other BioCASe providers. For this to work, we need the hybrid fields to be added to the identification element of ABCD, and not just in the HISPID extension.

Action 24: Consider updating HISPID to accommodate atomised hybrid formula fields. (HISPID review team)

2. Phenology
Michelle raised the issue that the HISPID vocabulary for phenology is much smaller than what is implemented in AD, and that the vocabulary needs to be expanded to better describe non-flowering plants.

Action 25: Expand the HISPID vocabulary for phenology to better provide for non-flowering plants. (Michelle Waycott, HISPID review team)

3. Expedition ID
At the recent FCIG meeting, there was a discussion about the need for a transfer field for Expedition ID, so specimens in different collections that were collected on the same expedition can be effectively queried. HISCOM agreed that this is an important and useful field to add to HISPID.

Action 26: Add ExpeditionID to HISPID. (HISPID review team)

Note: This is already in HISPID – //element(*,Unit)/NamedCollectionsOrSurveys/NamedCollectionOrSurvey –. -NielsKlazenga 17:57, 21 December 2012 (EST)

4. Geocode source
This is a mixed concept and only partly corresponds with abcd:CoordinateMethod. There is no ABCD concept to deal with the first two items in the HISPID vocabulary ('collector' and 'compiler'), but there is one in Darwin Core: dwc:georeferencedBy. This element should be added to ABCD as well. A person’s name might be more useful than just 'collector' or 'compiler', but would probably have to be translated back in AVH for privacy reasons.

Action 27: Update HISPID to provide separate fields for geocode method and georeferenced by concepts. (HISPID review team)

5. Plant occurrence and status
The posnat vocabulary contains some concepts that apply at the specimen level, and others that apply to presence of a taxon in an area.

Action 28: Revise the posnat vocabulary. (Michelle Waycott, HISPID review team)

6. Substrate
The current definition of substrate is ambiguous. It seems to refer more to the underlying geology of the collecting locality, than to what the specimen itself was growing on. At MEL, substrate has primarily been used to record microhabitat (it has mostly been used for cryptogams). MEL has recently added a separate field for geological substrate (i.e. the HISPID concept), and would like to be able to deliver both concepts.

Action 29: Clarify the definition of the existing substrate field, and add a new concept to cover microhabitat. (HISPID review team)

7. Deaccession flag
HISCOM agreed that we need to include a deaccession field in HISPID to indicate when records should be marked as deaccessioned in AVH.

Action 30: Add a deaccession flag field to HISPID. (Peter Bostock, HISPID review team)

Action 31: Implement deaccession flag field in each herbarium’s collection database. (All)

8. Deaccession reason
For the remaining occurrence record to be usefully interpreted by users, the reason for deaccessioning the specimen should be recorded.

Action 32: Add a deaccession reason field to HISPID, and come up with a vocabulary. (Peter Bostock, HISPID review team)

Action 33: Implement deaccession reason field in each herbarium’s collection database. (All)

Action 34: Communicate proposed changes to standards to FCIG. (Aaron Wilton)

Action 35: Communicate changes made to HISPID to the ABCD authors. (HISPID review team)

12. Units and GUIDs
It was agreed that we need a working group comprising members of each major herbarium to come up with a strategy for implementing GUIDs.

Action 36: Document specimen concepts at different herbaria and how UnitIDs are assigned. (Niels Klazenga, Michelle Waycott (All to contribute))

Action 37: Develop a working paper on numbering units and the assignment of GUIDs. (TBC)

13. ALA (Peter Doherty)

 * A funding proposal has been submitted to Department of Science
 * Dave Martin's and Miles Nicholls' positions are secure through to June 2013, and will probably extend beyond that.
 * Any major development work for AVH needs to be well scoped; development that also relates to the Museum community may be prioritised.

14. Morphbank (Peter Brenton)

 * ALA partnered with US developers to re-develop Morphbank in order for it to meet modern standards (it has been minimally funded since the late 1990s)
 * Australian node of Morphbank: content focus is Australian-based images; about 10,000 new images (large batches from ANIC, Queensland Museum and about to get images from Southern Cross University Herbarium)
 * Morphbank is a repository and is not designed for bulk downloads, but you can download individual images
 * Most images have CC-BY licence, but it does support other licences
 * ANIC SATSCAN images are about 450 MB; they were put up online primarily to allow researchers to view collections prior to making loan requests
 * The IIP zooming tool is used (it is similar to Zoomify, but open-source, not proprietary)
 * The full-sized images are stored in a file server
 * It is mostly being used as a mechanism for sharing images and not as a primary store for images (although a couple of institutions are using it for this). The main benefit of Morphbank is the zooming viewer.
 * The beta version of Morphbank incorporated into ALA was released a couple of weeks ago
 * Batch upload processes are available (requires admin access)
 * Potential problem with duplication of records in Morphbank and ALA
 * Records in Morphbank can only be edited manually, so the best option would be to only supply the minimum metadata required to link the image with the associated record in ALA.
 * There is a minimum set of ten mandatory fields (TSN, Scientific name etc.)
 * Multiple versions of metadata can be associated with the one image; each set of metadata has a separate URL
 * URLs are permanent, so can be cited safely
 * The process for subscribing is straightforward (gatekeeping could be devolved to CHAH and CHAFC)
 * Currently a many-to-one relationship between images and specimens, but intention to make it many-to-many
 * Tools to markup images; marked-up images would become a sub-image of the original
 * Work is continuing on the new version; ALA has run out of money for it, but the international partners are committed to it.

Recommendation 8: That CHAH and HISCOM members consider whether Morphbank is an appropriate image storage solution for their own institution.

15. AVH Trust project proposal (Greg Whitbread)
Greg presented his AVH Trust funding proposal.

16. Report from TDWG (Paul Flemons)
The 2012 meeting was in Beijing; only 85 people attended (compared to 300+ at other meetings).


 * Kevin Richards is leading a group to redevelop the TDWG website
 * A separate conference management website will be implemented
 * Conference planning: an effort is being made to plan TDWG meetings further in advance to make it easier for people to attend
 * Main themes under discussion were:
   * genomic data standards
   * tissue data standards
   * vocabularies
   * annotations.

17. Biodiversity Volunteer Portal (Paul Flemons)
Paul gave a demonstration of the Biodiversity Volunteer Portal (BVP). The BVP is based around the idea of virtual expeditions.

There are two steps in the process:


 * Transcription
 * Validation

Validators are either in-house staff, or are skilled volunteer transcribers who are invited to become validators. Volunteers are good at (and comfortable with) transcription, but are not always comfortable with georeferencing.

Data entry templates can be customised for different expeditions. Talk to Paul if you’re interested. A customised template would cost $5,000 – $10,000 to develop, or you can use one of the existing templates. To get going, you need:


 * images (images must have a URL that the BVP can grab them from)
 * upload files
 * a tutorial for users.

The upload file needs to contain the image filename, its URL, institution and registration/accession number (species name is optional).
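
As a rough illustration of the upload file requirements, the sketch below checks a CSV for the required columns before it is sent to the BVP. The column names (filename, url, institution, catalog_number, species_name) and the CSV layout are assumptions; the actual format expected by the BVP should be confirmed with Paul.

```python
import csv
import io

# Hypothetical column names; species_name is optional and therefore not listed.
REQUIRED = ("filename", "url", "institution", "catalog_number")


def check_upload(text: str) -> list:
    """Return (row number, missing column names) for each incomplete row."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems
```

An empty result means every row has an image filename, URL, institution and registration/accession number; rows without a species name are still accepted.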

Records are included in the Atlas as drafts. Once the record is ingested into the home institution’s database and delivered to ALA, the draft BVP record is removed.

The BVP tries not to have too many transcription projects running at the same time, so as not to overwhelm volunteers or spread the effort too thin. Currently there are usually between two and ten projects running, and six pending (and more in the pipeline). Paul is keen to get the herbarium community involved, so we can jump the queue if we have a botanical project to put up.

Paul’s estimate of the efficiency of the process is that you get the equivalent of three people’s work for the cost of one (but this is based on insects, which are very time-consuming to unpin and photograph; herbarium specimens will be much more efficient to image).

It would be possible for us to run a test project to gauge efficiency and test workflows, and then remove the dataset from ALA.