ALA hub analysis

Static content
AVH contains more than the six pages mentioned in the report. The pages that were likely overlooked (copyright and privacy statements and data use agreement) are actually some of the most important ones and also need to go with the data in ALA.

Schema
AVH uses ABCD 2.06. HISPID 5 is fully ABCD compliant and has vocabularies for some of the fields in AVH that ABCD doesn't. Currently AVH harvests ABCD 2.06, not HISPID. Annex B refers to a very old version of HISPID that is not used in AVH and should be replaced by the following:


 * Source institution ID (MoU): /Unit/SourceInstitutionID
 * Unit ID (MoU): /Unit/UnitID
 * Date last edited (MoU): /Unit/DateLastEdited
 * Family (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/HigherTaxa/HigherTaxon[HigherTaxonRank='familia']/HigherTaxonName
 * Scientific name (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString
 * Genus (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/GenusOrMonomial
 * Species epithet (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/FirstEpithet
 * Author team: /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/AuthorTeam
 * Infraspecific epithet (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/InfraspecificEpithet
 * Infraspecific rank (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/Rank
 * Hybrid flag: /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/HybridFlag
 * Cultivar name: /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/CultivarName
 * Identification qualifier (MoU): /Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/IdentificationQualifier
 * Determined by: /Unit/Identifications/Identification/Identifiers/IdentifiersText
 * Determination date text: /Unit/Identifications/Identification/Date/DateText
 * Determination date: /Unit/Identifications/Identification/Date/ISODateTimeBegin
 * Record basis: /Unit/RecordBasis
 * Collecting date (MoU): /Unit/Gathering/DateTime/ISODateTimeBegin
 * Collector (MoU): /Unit/Gathering/Agents/GatheringAgent[@sequence=1]/AgentText
 * Note: The highly atomised version in HISPID_Mapping_to_ABCD appears not to be supported in AVH.
 * Also note: AVH completely ignores the primarycollector attribute.
 * Additional collectors (MoU): /Unit/Gathering/Agents
 * Locality: /Unit/Gathering/LocalityText
 * Country (MoU): /Unit/Gathering/Country/Name
 * State (MoU): /Unit/Gathering/NamedAreas/NamedArea[AreaClass='state']/AreaName
 * Herbarium region: /Unit/Gathering/NamedAreas/NamedArea[AreaClass='Australian Herbarium Region']/AreaName
 * Near named place (MoU): /Unit/Gathering/NearNamedPlaces/NamedPlaceRelation/NearNamedPlace
 * Longitude (MoU): /Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LongitudeDecimal
 * Latitude (MoU): /Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LatitudeDecimal
 * Spatial datum: /Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/SpatialDatum
 * Geocode precision (MoU): /Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/CoordinateErrorDistanceInMeters
 * Geocode source (MoU): /Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinateMethod
 * Altitude: /Unit/Gathering/Altitude/MeasurementOrFactAtomised/LowerValue
 * Depth: /Unit/Gathering/Depth/MeasurementOrFactAtomised/LowerValue
 * Habitat: /Unit/Gathering/Biotope/Text
 * Collecting number (MoU): /Unit/CollectorsFieldNumber
 * Cultivated status: /Unit/MeasurementsOrFacts/MeasurementOrFact/MeasurementOrFactAtomised[Parameter='CultivatedOccurrence' and IsQuantitative='false']/LowerValue
 * Natural occurrence: /Unit/MeasurementsOrFacts/MeasurementOrFact/MeasurementOrFactAtomised[Parameter='NaturalOccurrence' and IsQuantitative='false']/LowerValue
 * Kind of collection: /Unit/KindOfUnit
 * Notes: /Unit/Notes
 * Type status: /Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypeStatus
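
As a rough illustration of how these paths can be evaluated, the sketch below pulls a few of the fields out of a single ABCD unit using Python's standard library. The sample record is invented, the `FIELD_PATHS` subset and the `extract_fields` helper are hypothetical names, and the paths are written relative to the Unit element with an `abcd:` namespace prefix added. It is not how the AVH harvester itself works.

```python
import xml.etree.ElementTree as ET

# ABCD 2.06 namespace. The sample unit below is invented for illustration.
NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

SAMPLE_UNIT = """
<abcd:Unit xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:SourceInstitutionID>MEL</abcd:SourceInstitutionID>
  <abcd:UnitID>MEL 0000001</abcd:UnitID>
  <abcd:Identifications>
    <abcd:Identification>
      <abcd:Result>
        <abcd:TaxonIdentified>
          <abcd:HigherTaxa>
            <abcd:HigherTaxon>
              <abcd:HigherTaxonName>Myrtaceae</abcd:HigherTaxonName>
              <abcd:HigherTaxonRank>familia</abcd:HigherTaxonRank>
            </abcd:HigherTaxon>
          </abcd:HigherTaxa>
          <abcd:ScientificName>
            <abcd:FullScientificNameString>Eucalyptus globulus Labill.</abcd:FullScientificNameString>
          </abcd:ScientificName>
        </abcd:TaxonIdentified>
      </abcd:Result>
    </abcd:Identification>
  </abcd:Identifications>
  <abcd:Gathering>
    <abcd:Agents>
      <abcd:GatheringAgent sequence="1">
        <abcd:AgentText>Smith, J.</abcd:AgentText>
      </abcd:GatheringAgent>
    </abcd:Agents>
  </abcd:Gathering>
</abcd:Unit>
"""

# Hypothetical subset of the mapping above, relative to the Unit element.
# ElementTree needs the predicate value quoted: [@sequence='1'].
TAXON = "abcd:Identifications/abcd:Identification/abcd:Result/abcd:TaxonIdentified"
FIELD_PATHS = {
    "source_institution_id": "abcd:SourceInstitutionID",
    "unit_id": "abcd:UnitID",
    "family": TAXON + "/abcd:HigherTaxa/abcd:HigherTaxon"
              "[abcd:HigherTaxonRank='familia']/abcd:HigherTaxonName",
    "scientific_name": TAXON + "/abcd:ScientificName/abcd:FullScientificNameString",
    "collector": "abcd:Gathering/abcd:Agents/abcd:GatheringAgent[@sequence='1']/abcd:AgentText",
    "collecting_number": "abcd:CollectorsFieldNumber",
}


def extract_fields(unit):
    """Return a dict of field name -> text, or None where the element is absent."""
    result = {}
    for name, path in FIELD_PATHS.items():
        element = unit.find(path, NS)
        has_text = element is not None and element.text
        result[name] = element.text.strip() if has_text else None
    return result


unit = ET.fromstring(SAMPLE_UNIT.strip())
fields = extract_fields(unit)
```

Fields that are missing from the unit (here the collecting number) come back as `None` rather than raising an error, which makes it easy to test for their presence afterwards.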

Wish list for AVH
In no particular order.

Separation of static and dynamic content
Static pages and static content of pages with dynamic content should not be part of the source code of the web application, as is the situation now. The current situation can't be maintained as we are not able to update anything in the static content without having to recompile the web application.

OZCAM uses WordPress. WordPress is quite nice (especially since it's not Matrix, but also because it uses PHP), and we could use it too. If the AVH web application can't be tunnelled (Alison suggests 'pummelled') into a CMS, we would have to find a solution for the pages with dynamic content. There are only two of those: the map page and the result HTML table (does anybody ever use that?).

Separation between web application and data harvest
Currently there is a CSV upload button that can be used by 'herbarium administrators' to load data into AVH by uploading a CSV file. Although it was removed from the source code, it is still present in the MEL instance of AVH. The CSV upload function was added to AVH at the very end of the test phase and was never properly tested, but even if we are ever going to allow CSV uploads, it shouldn't be done through the front end of the web application. Herbarium administrators were established at the HISCOM meeting in Cairns solely for user administration. They are not necessarily the same people that would need to upload data.

According to the (extremely limited) documentation you can even attach new BioCASe providers through the AVH front end. Not to worry, nothing I ever tried from the documentation (except Craig's pages) ever worked.

Separation between user administration and web application
The user administration module of AVH appears to behave reasonably at the moment (although it doesn't do what we asked to be implemented), but it would be better to have it sitting outside the AVH web application. Aaron Wilton suggests we look into Australian Access Federation services. (Funding has recently become available to integrate the NZ equivalent, Tuakiri, into the NZVH; this is likely to happen in the next month or so. AW.)

Monitoring of harvester
Currently, AVH only records the last time harvesting of a BioCASe provider was attempted. Log files for the AVH harvester are a minimum requirement; e-mails to the AVH administrators and to the administrator of the individual herbarium data set when no data has been harvested for a number of days would be very useful too.
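
The "no data for a number of days" check could look something like the sketch below. It assumes the harvester records the time of the last successful harvest per provider, which AVH does not currently do; the function name, the three-day threshold and the dates are all invented for illustration.

```python
from datetime import datetime, timedelta


def stale_providers(last_success, now, max_age_days=3):
    """Return the providers with no successful harvest in the last max_age_days.

    last_success maps a provider name to the datetime of its last successful
    harvest, or to None if it has never been harvested successfully.
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        name
        for name, when in last_success.items()
        if when is None or when < cutoff
    )


# Invented example data.
now = datetime(2012, 6, 10)
last_success = {
    "MEL": datetime(2012, 6, 9),
    "CANB": datetime(2012, 6, 1),
    "PERTH": None,
}
overdue = stale_providers(last_success, now)
```

The resulting list is what would go into the e-mail to the AVH administrators, with each herbarium administrator only receiving the entry for their own data set.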

The AVH harvester should check for the presence of MoU fields before uploading data into the AVH cache.
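
A minimal sketch of such a check, assuming each harvested record has already been turned into a dictionary of field values. Which MoU fields are treated as mandatory is a decision still to be made; the `REQUIRED_MOU_FIELDS` tuple and the field names below are hypothetical.

```python
# Hypothetical selection of MoU fields that every record must carry.
REQUIRED_MOU_FIELDS = (
    "source_institution_id",
    "unit_id",
    "date_last_edited",
    "scientific_name",
    "collector",
)


def missing_mou_fields(record, required=REQUIRED_MOU_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not (record.get(name) or "").strip()]


# Invented record: the collector is blank and the edit date is missing.
record = {
    "source_institution_id": "MEL",
    "unit_id": "MEL 0000001",
    "scientific_name": "Eucalyptus globulus Labill.",
    "collector": "  ",
}
problems = missing_mou_fields(record)
```

A record with a non-empty result would be logged and kept out of the cache instead of being loaded.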

Sensible database structure
Diagrams of the current AVH database structure and a suggested improved one are on a separate page. The AVH database structure hampers querying of AVH, has hampered development of AVH and will continue to do so if we do not change it.

Linking with National Species Lists
In order for the proposed database structure to work, at least family should not be harvested from herbarium collections databases, but should come from an external nomenclator (or taxonominator rather) when new taxon names are added to AVH. The current situation that taxa can be stated to belong to different families depending on which herbarium the data comes from is embarrassing and untenable, and makes querying on family misleading and completely useless.
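
The idea can be sketched as follows: the family delivered by the herbarium is ignored and replaced by the one held in a single external list. The `NOMENCLATOR` dictionary stands in for whatever service the National Species Lists would provide, and the records and function name are made up for the example.

```python
# Stand-in for an external nomenclator: one family per genus.
NOMENCLATOR = {
    "Acacia": "Fabaceae",
    "Eucalyptus": "Myrtaceae",
}


def family_for(genus, nomenclator=NOMENCLATOR):
    """Look up the family for a genus; None if the genus is not in the list."""
    return nomenclator.get(genus)


# Invented harvested records in which two herbaria disagree on the family.
harvested = [
    {"herbarium": "A", "genus": "Acacia", "family": "Mimosaceae"},
    {"herbarium": "B", "genus": "Acacia", "family": "Fabaceae"},
]

# Overwrite the harvested family with the one from the nomenclator.
normalised = [dict(rec, family=family_for(rec["genus"])) for rec in harvested]
families = {rec["family"] for rec in normalised}
```

After normalisation a query on family returns the same records regardless of which herbarium supplied them.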

Implementing APG would be nice too.

Capability to link by URL
This is Alex's GET query. Linking through URL should at least be possible for querying on taxon name fields and individual records. As we can't predict what people would want to link to it would be best and easiest if everything that can be set in the query form could also be set in a query string. There should be defaults for all settings that concern display properties and output format. The specimen detail page that can now only be reached by clicking on a dot on a map should also have a URL.
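
To make the "defaults for display properties and output format" point concrete, here is a small sketch of how a query string could be merged with defaults. The parameter names and default values are hypothetical and do not correspond to anything in the current AVH query form.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical defaults for display and output settings.
DEFAULTS = {
    "format": "html",
    "page_size": "50",
    "show_map": "false",
}


def parse_query(query_string):
    """Merge the parameters in a query string with the defaults.

    If a parameter is repeated, the last value wins.
    """
    params = dict(DEFAULTS)
    for key, values in parse_qs(query_string).items():
        params[key] = values[-1]
    return params


# A link only needs to state what differs from the defaults.
link_query = urlencode({"genus": "Eucalyptus", "species": "globulus", "format": "csv"})
settings = parse_query(link_query)
```

With this approach a link to a single taxon stays short, and new display options can be added later without breaking existing links.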

It is hard to understand why this is so hard, if you have not seen the AVH source code. If the code were properly object-oriented (or proper Java) this would be a piece of cake.

Related to this, users should be able to log in from wherever they are in AVH and then be redirected to where they were before they logged in (and then get the more precise data).

New code for AVH query
For no apparent reason, newly harvested records are not included in the Public query.

Better mapping
Current maps are embarrassing. We need a more intuitive interface: current map navigation is not consistent with 'the norm'.

Base layers should be derived from GIS services (WMS/WFS).

AVH node structure
Just because the link has been lost from the HISCOM wiki, here is the link to the AVH node structure that was discussed at the last HISCOM meeting: http://www.rbg.vic.gov.au/dbpages/avhschema/avhnodes.pdf.