ALA hub analysis

From Hiscom

Comments on the report

Static content

AVH contains more than the six pages mentioned in the report. The pages that were likely overlooked (the copyright and privacy statements and the data use agreement) are among the most important ones and must accompany the data in ALA.

Schema

AVH uses ABCD 2.06. HISPID 5 is fully ABCD compliant and provides vocabularies for some of the AVH fields for which ABCD has none. Currently AVH harvests ABCD 2.06, not HISPID. Annex B refers to a very old version of HISPID that is not used in AVH; it should be replaced by the following:

Source institution ID (MoU)
/Unit/SourceInstitutionID
Unit ID (MoU)
/Unit/UnitID
Date last edited (MoU)
/Unit/DateLastEdited
Family (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/HigherTaxa/HigherTaxon[HigherTaxonRank='familia']/HigherTaxonName
Scientific name (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString
Genus (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/GenusOrMonomial
Species epithet (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/FirstEpithet
Author team
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/AuthorTeam
Infraspecific epithet (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/InfraspecificEpithet
Infraspecific rank (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/Rank
Hybrid flag
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/HybridFlag
Cultivar name
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/CultivarName
Identification qualifier (MoU)
/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/IdentificationQualifier
Determined by
/Unit/Identifications/Identification/Identifiers/IdentifiersText
Determination date text
/Unit/Identifications/Identification/Date/DateText
Determination date
/Unit/Identifications/Identification/Date/ISODateTimeBegin
Record basis
/Unit/RecordBasis
Collecting date (MoU)
/Unit/Gathering/DateTime/ISODateTimeBegin
Collector (MoU)
/Unit/Gathering/Agents/GatheringAgent[@sequence=1]/AgentText
Note: The highly atomised version in HISPID_Mapping_to_ABCD#cnam appears not to be supported in AVH.
Also note: AVH completely ignores the primarycollector attribute.
Additional collectors (MoU)
/Unit/Gathering/Agents
Locality
/Unit/Gathering/LocalityText
Country (MoU)
/Unit/Gathering/Country/Name
State (MoU)
/Unit/Gathering/NamedAreas/NamedArea[AreaClass='state']/AreaName
Herbarium region
/Unit/Gathering/NamedAreas/NamedArea[AreaClass='Australian Herbarium Region']/AreaName
Near named place (MoU)
/Unit/Gathering/NearNamedPlaces/NamedPlaceRelation/NearNamedPlace
Longitude (MoU)
/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LongitudeDecimal
Latitude (MoU)
/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LatitudeDecimal
Spatial datum
/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/SpatialDatum
Geocode precision (MoU)
/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/CoordinateErrorDistanceInMeters
Geocode source (MoU)
/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinateMethod
Altitude
/Unit/Gathering/Altitude/MeasurementOrFactAtomised/LowerValue
Depth
/Unit/Gathering/Depth/MeasurementOrFactAtomised/LowerValue
Habitat
/Unit/Gathering/Biotope/Text
Collecting number (MoU)
/Unit/CollectorsFieldNumber
Cultivated status
/Unit/MeasurementsOrFacts/MeasurementOrFact/MeasurementOrFactAtomised[Parameter='CultivatedOccurrence' and IsQuantitative='false']/LowerValue
Natural occurrence
/Unit/MeasurementsOrFacts/MeasurementOrFact/MeasurementOrFactAtomised[Parameter='NaturalOccurrence' and IsQuantitative='false']/LowerValue
Kind of collection
/Unit/KindOfUnit
Notes
/Unit/Notes
Type status
/Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypeStatus
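
To show how the paths above might be applied, here is a minimal sketch in Python using the standard library's ElementTree. It is not the AVH harvester's code. The sample Unit is an invented, namespace-free fragment (real ABCD 2.06 documents are namespaced), the key names in FIELD_PATHS are hypothetical, and the paths are written relative to the Unit element. ElementTree only supports a subset of XPath, so the sequence attribute value is quoted here.

```python
import xml.etree.ElementTree as ET

# Invented, namespace-free fragment shaped like an ABCD 2.06 Unit.
# Real ABCD documents are namespaced; all values here are made up.
SAMPLE_UNIT = """
<Unit>
  <SourceInstitutionID>MEL</SourceInstitutionID>
  <UnitID>MEL 0000001A</UnitID>
  <Identifications>
    <Identification>
      <Result>
        <TaxonIdentified>
          <HigherTaxa>
            <HigherTaxon>
              <HigherTaxonName>Myrtaceae</HigherTaxonName>
              <HigherTaxonRank>familia</HigherTaxonRank>
            </HigherTaxon>
          </HigherTaxa>
          <ScientificName>
            <FullScientificNameString>Eucalyptus regnans F.Muell.</FullScientificNameString>
          </ScientificName>
        </TaxonIdentified>
      </Result>
    </Identification>
  </Identifications>
  <Gathering>
    <Agents>
      <GatheringAgent sequence="1"><AgentText>Mueller, F.</AgentText></GatheringAgent>
    </Agents>
  </Gathering>
</Unit>
"""

# A few of the paths listed above, relative to the Unit element.
# The dictionary keys are hypothetical names chosen for this sketch.
FIELD_PATHS = {
    "source_institution_id": "SourceInstitutionID",
    "unit_id": "UnitID",
    "family": (
        "Identifications/Identification/Result/TaxonIdentified/HigherTaxa/"
        "HigherTaxon[HigherTaxonRank='familia']/HigherTaxonName"
    ),
    "scientific_name": (
        "Identifications/Identification/Result/TaxonIdentified/"
        "ScientificName/FullScientificNameString"
    ),
    "collector": "Gathering/Agents/GatheringAgent[@sequence='1']/AgentText",
    "collecting_number": "CollectorsFieldNumber",
}


def extract_fields(unit_xml):
    """Return a dict of field name -> text, or None where the element is absent."""
    unit = ET.fromstring(unit_xml)
    return {name: unit.findtext(path) for name, path in FIELD_PATHS.items()}


fields = extract_fields(SAMPLE_UNIT)
```

Paths with compound predicates (such as the 'and' in the cultivated status path) are beyond ElementTree's XPath subset and would need a fuller XPath implementation.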

Wish list for AVH

In no particular order.

Separation of static and dynamic content

Static pages, and the static content of pages that also have dynamic content, should not be part of the source code of the web application, as they are now. This cannot be maintained: we are not able to update anything in the static content without recompiling the web application.

OZCAM uses WordPress. WordPress is quite nice (especially since it's not Matrix, but also because it uses PHP), so we could use it. If the AVH web application can't be tunnelled (Alison suggests 'pummelled') into a CMS, we would have to find a solution for the pages with dynamic content. There are only two of those: the map page and the result HTML table (does anybody ever use that?).

Separation between web application and data harvest

Currently there is a CSV upload button that 'herbarium administrators' can use to load data into AVH by uploading a CSV file. Although it was removed from the source code, it is still present in the MEL instance of AVH. The CSV upload function was added to AVH at the very end of the test phase and was never properly tested. Even if we ever do allow CSV uploads, they should not go through the front end of the web application. Herbarium administrators were established at the HISCOM meeting in Cairns solely for user administration. They are not necessarily the same people who would need to upload data.

According to the (extremely limited) documentation you can even attach new BioCASe providers through the AVH front end. Not to worry, nothing I ever tried from the documentation (except Craig's pages) ever worked.

Separation between user administration and web application

The user administration module of AVH appears to behave reasonably at the moment (although it doesn't do what we asked to be implemented), but it would be better to have it sitting outside the AVH web application. Aaron Wilton suggests we look into Australian Access Federation services. (Funding recently became available to integrate the NZ equivalent - Tuakiri - into the NZVH - this is likely to happen in the next month or so - AW.)

Monitoring of harvester

Currently, AVH only records the last time harvesting of a BioCASe provider was attempted. Log files for the AVH harvester are a minimum requirement; e-mails to the AVH administrators and to the administrator of the individual herbarium data set when no data have been harvested for a number of days would be very useful too.
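
The e-mail alert could be driven by something as simple as the following sketch. It assumes the harvester records the time of the last successful harvest per provider (which AVH does not do at present); the function name, the three-day threshold and the dates are all invented for illustration.

```python
from datetime import datetime, timedelta


def stale_providers(last_success, now, max_age_days=3):
    """Return the provider codes whose last successful harvest is missing
    or older than max_age_days, sorted alphabetically."""
    limit = timedelta(days=max_age_days)
    return sorted(
        code
        for code, when in last_success.items()
        if when is None or now - when > limit
    )


# Invented example data: None means no successful harvest on record.
last_success = {
    "MEL": datetime(2012, 5, 9, 23, 0),
    "CANB": datetime(2012, 5, 1, 23, 0),
    "PERTH": None,
}
to_alert = stale_providers(last_success, now=datetime(2012, 5, 10, 6, 0))
```

The returned list is what a daily job would turn into e-mails to the AVH administrators and the administrators of the affected data sets.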

The AVH harvester should check for the presence of the MoU fields before loading data into the AVH cache.
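
A minimal sketch of such a check, assuming each harvested unit has already been flattened into a dictionary. The key names and the choice of which MoU fields must always be filled are assumptions made for this example; some MoU fields (infraspecific epithet, for instance) are legitimately empty for many records and are left out.

```python
# Hypothetical subset of MoU fields that every record is expected to carry.
REQUIRED_MOU_FIELDS = (
    "source_institution_id",
    "unit_id",
    "date_last_edited",
    "scientific_name",
)


def missing_mou_fields(record, required=REQUIRED_MOU_FIELDS):
    """Return the required fields that are absent, None or blank in record."""
    return [
        name for name in required
        if not str(record.get(name) or "").strip()
    ]


def split_records(records):
    """Separate records into those safe to load and those to reject and log."""
    accepted, rejected = [], []
    for record in records:
        missing = missing_mou_fields(record)
        if missing:
            rejected.append((record, missing))
        else:
            accepted.append(record)
    return accepted, rejected
```

Rejected records, together with the names of their missing fields, are the kind of thing that belongs in the harvester log.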

Sensible database structure

Diagrams of the current AVH database structure and a suggested improved one are on a separate page. The AVH database structure hampers querying of AVH, has hampered development of AVH and will continue to do so if we do not change it.

Linking with National Species Lists

In order for the proposed database structure to work, at least family should not be harvested from herbarium collections databases, but should come from an external nomenclator (or taxonominator rather) when new taxon names are added to AVH. The current situation, in which a taxon can be assigned to different families depending on which herbarium the data come from, is embarrassing and untenable, and makes querying on family misleading and completely useless.

Implementing APG would be nice too.
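
To get a feel for the size of the family problem, conflicting assignments could be listed with a sketch like the one below. It assumes records are available as dictionaries with hypothetical 'herbarium', 'genus' and 'family' keys; the sample rows are invented.

```python
from collections import defaultdict


def conflicting_families(records):
    """Return {genus: [families]} for every genus that has been assigned
    to more than one family across the harvested records."""
    families = defaultdict(set)
    for record in records:
        families[record["genus"]].add(record["family"])
    return {
        genus: sorted(names)
        for genus, names in families.items()
        if len(names) > 1
    }


# Invented rows illustrating two herbaria that disagree on a family.
sample = [
    {"herbarium": "A", "genus": "Acer", "family": "Aceraceae"},
    {"herbarium": "B", "genus": "Acer", "family": "Sapindaceae"},
    {"herbarium": "A", "genus": "Acacia", "family": "Fabaceae"},
    {"herbarium": "B", "genus": "Acacia", "family": "Fabaceae"},
]
conflicts = conflicting_families(sample)
```

With family taken from a single external source, this report would be empty by construction.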

Capability to link by URL

This is Alex's GET query. Linking through URL should at least be possible for querying on taxon name fields and for individual records. As we can't predict what people would want to link to, it would be best and easiest if everything that can be set in the query form could also be set in a query string. There should be defaults for all settings that concern display properties and output format. The specimen detail page, which can now only be reached by clicking on a dot on a map, should also have a URL.
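
As an illustration of query strings with defaults, here is a small sketch using Python's standard urllib.parse. The base URL, the parameter names (genus, species, format, page_size, map) and the default values are all hypothetical; none of them exist in AVH.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical defaults for display properties and output format.
DISPLAY_DEFAULTS = {"format": "html", "page_size": "50", "map": "off"}


def parse_query(query_string):
    """Turn a query string into a settings dict, filling in display defaults.
    If a parameter is repeated, the last value wins."""
    supplied = {key: values[-1] for key, values in parse_qs(query_string).items()}
    return {**DISPLAY_DEFAULTS, **supplied}


def build_url(base, **criteria):
    """Build a linkable URL from query criteria."""
    return base + "?" + urlencode(criteria)


settings = parse_query("genus=Acacia&species=dealbata&format=csv")
link = build_url("https://avh.example.org/query", genus="Acacia", species="dealbata")
```

A link only needs to carry the search criteria; anything about display that is left out falls back to the defaults.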

It is hard to understand why this is so hard, if you have not seen the AVH source code. If the code were properly object-oriented (or proper Java) this would be a piece of cake.

Related to this, users should be able to log in from wherever they are in AVH and then be redirected to where they were before they logged in (and then get the more precise data).
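
The redirect after login is usually handled by passing the current location along to the login page. A rough sketch follows, with a hypothetical /login path and 'next' parameter; the check that the return address is a local path is an addition of this sketch, there to stop the login page being used to bounce users to other sites.

```python
from urllib.parse import quote, urlsplit


def login_url(current_path):
    """Link to the (hypothetical) login page, remembering where the user was."""
    return "/login?next=" + quote(current_path, safe="")


def safe_return_path(next_param, default="/"):
    """Return next_param if it is a local path, otherwise the default."""
    parts = urlsplit(next_param)
    if parts.scheme or parts.netloc or not next_param.startswith("/"):
        return default
    return next_param


example_link = login_url("/query?genus=Acacia")
```

After a successful login the application would redirect to safe_return_path of the 'next' value and rerun the query with the user's access level.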

New code for AVH query

For no apparent reason, newly harvested records are not included in the Public query.

Better mapping

Current maps are embarrassing. We need a more intuitive interface: current map navigation is not consistent with 'the norm'.

Base layers should be derived from GIS services (WMS/WFS).

AVH node structure

Just because the link has been lost from the HISCOM wiki, here is the link to the AVH node structure that was discussed at the last HISCOM meeting: http://www.rbg.vic.gov.au/dbpages/avhschema/avhnodes.pdf.