Talk:KeyBase technical specification

KevinThiele 0:37, 30 June 2012 (EST)

Hi Niels,

I have (finally) been having a play with KeyBase. It’s very good, and fast. I tried pushing it by uploading the Families key from the Flora of Australia – handled it well. I also tried (and failed – bug) uploading a key with images (The genera of grasses of Australia) – you’ll need to delete that entry as it’s a dud key.

Couple of suggestions/buglets

1. With a large key, when you choose a couplet the window elements do some jiggery in Firefox – look at the Families key – the right and left borders jiggle as it draws the next couplet

2. Suggest you include the number of taxa remaining and discarded – gives a good sense as to how far you’ve got to go

3. Would you consider allowing the upload of text keys as well as LPXK? The problem with the latter is that one needs to go through so many steps to get a key ready to upload. A text key in the format attached can be fairly easily parsed – would need a checking step to make sure all is in order. When I was loading keys into my system I formatted them as text as shown, then imported into Phoenix just to check that all was well and there were no errors, then imported directly from the text. I think this would be good.

So what are your big plans for the weekend?

Also – how would one link the Apocynaceae key I’ve uploaded to the Families key, to start building a superkey?

Cheers - Kevin

NielsKlazenga 15:11, 30 June 2012 (EST)

Hi Kevin, I think the failure to upload a key with images might not so much be a bug, as the fact that for some reason the loading of images is very slow on the RBG web server. If you give me the URL, I can try it from my computer at home. Uploading of images to that server was also very slow when I tried it from the command line. If I can't fix this I will just temporarily store the image URLs in the database and load the images later as a background process. Later on we will also be able to load images from a zip file with a Phoenix deployment, but I need to have an example to work from to know where all the files are sitting.

The Families key doesn't do the jiggery thing when I try, in either Firefox or Chrome (which is what I mostly use). But will keep an eye on it.

Re the number of remaining and discarded entities, I will definitely do that.
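A minimal sketch of how the remaining and discarded counts could be worked out, assuming a key held as a plain dictionary of couplets. The structure, the names `KEY`, `taxa_below` and `remaining_and_discarded`, and the example taxa are all hypothetical and are not KeyBase's actual data model.

```python
# Hypothetical key structure: each couplet id maps to its leads. A lead either
# points to another couplet ("goto") or ends in a taxon ("taxon").
KEY = {
    1: [{"text": "Leaves opposite", "goto": 2},
        {"text": "Leaves alternate", "taxon": "Taxon C"}],
    2: [{"text": "Flowers white", "taxon": "Taxon A"},
        {"text": "Flowers yellow", "taxon": "Taxon B"}],
}


def taxa_below(key, couplet_id):
    """Return the set of taxa reachable from a couplet (assumes no cycles)."""
    taxa = set()
    for lead in key[couplet_id]:
        if "taxon" in lead:
            taxa.add(lead["taxon"])
        else:
            taxa |= taxa_below(key, lead["goto"])
    return taxa


def remaining_and_discarded(key, root_id, current_id):
    """Count taxa still reachable from the current couplet and those ruled out."""
    all_taxa = taxa_below(key, root_id)
    remaining = taxa_below(key, current_id)
    return len(remaining), len(all_taxa - remaining)


# After choosing "Leaves opposite" the user is at couplet 2.
print(remaining_and_discarded(KEY, root_id=1, current_id=2))
```

Using sets means a taxon that keys out in more than one place is only counted once.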

I have been thinking about uploading from text files, but was worried about the parsing. The file you attached is very easy to parse though, so I will write a script for that too. Will also allow comma-delimited text, so people can make the keys in Excel. At a later stage we may even get the error checking inside KeyBase, but we don't need to have everything at once.
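As a rough illustration of the comma-delimited import plus a checking step, here is a sketch that assumes a three-column layout (couplet number, lead text, destination), where the destination is either another couplet number or a taxon name. That layout and the function names `parse_delimited_key` and `check_key` are assumptions for the example, not the format Kevin attached or anything KeyBase defines.

```python
import csv
import io


def parse_delimited_key(text):
    """Group rows of (couplet, lead text, destination) by couplet number."""
    couplets = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row}")
        couplet, lead_text, destination = (cell.strip() for cell in row)
        couplets.setdefault(couplet, []).append((lead_text, destination))
    return couplets


def check_key(couplets):
    """Return a list of problems; an empty list means the key looks in order."""
    problems = []
    for couplet, leads in couplets.items():
        if len(leads) < 2:
            problems.append(f"couplet {couplet} has fewer than two leads")
        for _lead_text, destination in leads:
            # Assumption: a purely numeric destination refers to a couplet.
            if destination.isdigit() and destination not in couplets:
                problems.append(
                    f"couplet {couplet} points to missing couplet {destination}"
                )
    return problems


SAMPLE = """1,Leaves opposite,2
1,Leaves alternate,Taxon C
2,Flowers white,Taxon A
2,Flowers yellow,Taxon B
"""

parsed = parse_delimited_key(SAMPLE)
print(check_key(parsed))
```

Because the `csv` module handles quoting, lead text that itself contains commas survives a round trip through Excel.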

When I was talking about 'big plans', I didn't mean so much for this weekend as for KeyBase in general. However, I have written a script to produce HTML keys, both bracketed (is that the right word?) and indented. The indented keys are really cool, but don't work so well for long keys (as they run off the page on the right-hand side). Will have to do some more thinking about how to deal with long keys, as I would really like to keep the indented keys in. I will move the HTML keys to the RBG server this afternoon, so you can have a look (you've seen the bracketed keys already).
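To make the difference between the two layouts concrete, here is a small sketch that renders the same key both ways as plain text. It is not the script mentioned above: the key structure and the functions `bracketed` and `indented` are hypothetical, and it assumes a key without cycles.

```python
# Hypothetical key structure: a lead either goes to another couplet ("goto")
# or ends in a taxon ("taxon").
KEY = {
    1: [{"text": "Leaves opposite", "goto": 2},
        {"text": "Leaves alternate", "taxon": "Taxon C"}],
    2: [{"text": "Flowers white", "taxon": "Taxon A"},
        {"text": "Flowers yellow", "taxon": "Taxon B"}],
}


def bracketed(key):
    """Couplets in numeric order; each lead names its destination."""
    lines = []
    for couplet_id in sorted(key):
        for lead in key[couplet_id]:
            destination = lead.get("taxon") or str(lead["goto"])
            lines.append(f"{couplet_id}. {lead['text']} ... {destination}")
    return lines


def indented(key, couplet_id=1, depth=0):
    """Depth-first layout; each level of the key is indented one step further."""
    lines = []
    for lead in key[couplet_id]:
        prefix = "  " * depth + f"{couplet_id}. {lead['text']}"
        if "taxon" in lead:
            lines.append(f"{prefix} ... {lead['taxon']}")
        else:
            lines.append(prefix)
            lines.extend(indented(key, lead["goto"], depth + 1))
    return lines


print("\n".join(bracketed(KEY)))
print()
print("\n".join(indented(KEY)))
```

The indentation in `indented` grows with the depth of the key, which is why a long, deep key drifts off the right-hand side of the page.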

I have only done a little bit of thinking on how to link up all the keys we might get. I was thinking about landing pages for all the end taxa with links to further keys and external links (and links to static HTML pages that can be uploaded along with the LPXK file). I think I will just grab the taxon name lists from the NSL (the data dumps they do to ALA are available online), so we can use the GUIDs from there. There will also be projects. For keys that are part of a project you can immediately link to the next key and (optionally) skip the landing page. This is how you build your super keys. Also, if there are more people involved in a project they can edit each other's keys and not just the keys they uploaded themselves.

There will be more editing facilities. For example, people will be able to upload images separately from the key and link them to key leads and end taxa. I would like to do more with images than Phoenix does, so the images that sit inside the keys are really just thumbnails that can link to full size images. We also need to ask for much more metadata for the keys, as Phoenix doesn't have a lot. For example we need publication details, copyright and licences etc. This will be done in the Add key form (the Edit key form will be similar).

I will add the filter facility for individual keys pretty soon. There will be two kinds of filters: filters made by the authors, which can already be in the LPXK file or be added later and are available to everyone; and filters made by users, either as a one-off, or named and saved to the database and available for later use by the same user. (All filters will be saved to the database, but if they haven't been named they will be deleted when the session expires, or a new filter is made, or the filter is turned off.) Global filters will be implemented later.

I am not sure if you have seen it, but the end taxa in the Insect order key link to HTML pages with descriptions and more images. For keys that are uploaded through a URL, this will be the HTML pages on the site where the key comes from. However, I thought that if they come in a ZIP archive with the LPXK, they can be stored onsite and displayed in KeyBase. Then they can also be added (or edited) later.

These are just part of my big plans. Do you have time for a phone call in the coming week? By the way, all the keys you gave me earlier in an Access file are already in the right format. Are we allowed to use them?

Niels

KevinThiele 17:45, 19 July 2012 (EST)

Hi Niels – I wonder if there’s any reason for Key Title amongst the key metadata? So often the title is redundant when we have taxonomic and geographic scope, and maybe just these will suffice. Have a look amongst the key titles and you’ll see what I mean.

Then again, I’ve been thinking also about taxonomic scope. Occasionally it’s a little more complex than we accommodate – for example, in a recent Nuytsia we have “Key to scale-leaved Mirbelia in Western Australia”. The taxonomic scope of this would be Mirbelia, but it’s a partial key to the genus, not a full key. One way to deal with this would be to have a tick-box for whether the key is full or partial.

The other issue with key title is that we have no good way to maintain a standard amongst the titles. Look at the Dicranoloma, Families and Apocynaceae keys for three very different title forms.

Maybe we do need to keep key title, for cases like the scale-leaved Mirbelia above, but have taxonomic scope and geographic scope first amongst the fields - you could then auto-populate title using JavaScript once the other two are filled out e.g. enter “Apocynaceae” and “Australia” and you autofill “Key to Apocynaceae in Australia”. Someone would then only need to edit this if it’s not a suitable title.
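The autofill rule could look something like the sketch below. It is written in Python only to show the logic (in the form itself it would run as client-side script), and the function name `autofill_title` and its parameters are made up for the example.

```python
def autofill_title(current_title, taxonomic_scope, geographic_scope):
    """Suggest a title from the two scopes without overwriting a user's edit."""
    if current_title.strip():
        return current_title  # the user already typed or edited a title
    taxon = taxonomic_scope.strip()
    region = geographic_scope.strip()
    if not taxon or not region:
        return ""  # wait until both scope fields are filled out
    return f"Key to {taxon} in {region}"


print(autofill_title("", "Apocynaceae", "Australia"))
# An existing title is left alone:
print(autofill_title("Key to scale-leaved Mirbelia in Western Australia",
                     "Mirbelia", "Western Australia"))
```

Leaving a non-empty title untouched covers cases like the scale-leaved Mirbelia key, where the generated title would not be suitable.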

Another thing we’re not capturing is the taxonomic rank of the terminals in the key (e.g. some keys are now keys to genera, others are to species). At the moment I can’t think of a good reason why we should capture this, but it’s worth keeping an eye on it.

Cheers - Kevin

NielsKlazenga 17:59, 19 July 2012 (EST)

Hi Kevin, We need to have the Title for the simple reason that it is required in LPXK and SDD. It is also required metadata for HTML pages, although I haven't dealt with page metadata yet. I think most of the other things you mention will take care of themselves when we link up to APNI and APC, but we'll keep an eye on it. By the way, I made a page on the HISCOM Wiki with a sort of technical specification: http://www.rbg.vic.gov.au/wiki/hiscom/index.php/KeyBase_technical_specification. Could you put all these sorts of ideas in there? It is easier to keep track of them and tick them off in the long run than a string of e-mails. I can make you an editor, as I have full power over the Wiki now (so does Ben). Niels