Noark 5 extraction data sets? (Was: Importing Noark 5 XML files in noark5-parser now works without duplicates in web GUI)
Petter Reinholdtsen
pere at hungry.com
Mon Jun 12 13:36:52 CEST 2017
[Ole Aamot]
> Importing XML in the free tool noark5-parser is now working without
> any known bugs and displays the imported Noark 5 archive entries in
> the web GUI.
Very cool!
How far are you from importing the data files too?
> The tool is capable of parsing these XML files:
> <URL:https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/arkivstruktur.xml>
> <URL:https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xml>
I believe we should try to locate as many test data sets as possible and
make a list. One way is to search for arkivstruktur.xml in search
engines and see what we find. The arkade5 git repo has several
different data sets. Did you try them all?
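For repos we have already cloned, the list can be built locally. Here is
a minimal sketch, assuming a local checkout (for example of arkade5) is
passed as the first argument. The function name find_extractions is
made up for this example and is not part of noark5-parser or any other
tool mentioned here.

```python
import os
import sys


def find_extractions(root, name="arkivstruktur.xml"):
    """Return the sorted paths of every file called `name` below `root`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Each Noark 5 extraction is expected to keep its structure
        # file under this fixed file name.
        if name in filenames:
            hits.append(os.path.join(dirpath, name))
    return sorted(hits)


if __name__ == "__main__":
    # Assumption: the checkout to scan is given on the command line,
    # defaulting to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_extractions(root):
        print(path)
```

Each printed path is one candidate data set to feed to the importer.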
I found more data sets:
* https://github.com/SesamResearch/Records-Management-and-Archive-Systems-Research/blob/master/samples/
* https://github.com/documaster/noark-extraction-validator-samples/tree/0.2.0-add-samples
If anyone on this list knows of any more Noark 5 extraction data sets,
please let us know. :)
--
Happy hacking
Petter Reinholdtsen