Noark 5 extraction data sets? (Was: Importing Noark 5 XML files in noark5-parser now works without duplicates in web GUI)

Petter Reinholdtsen pere at hungry.com
Mon Jun 12 13:36:52 CEST 2017


[Ole Aamot]
> Importing XML in the free tool noark5-parser is now working without
> any known bugs and displays the imported Noark 5 archive entries in
> the web GUI.

Very cool!

How far are you from importing the data files too?

> The tool is capable of parsing these XML files:
> <URL:https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/arkivstruktur.xml>
> <URL:https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xml>

I believe we should try to locate as many test data sets as possible and
make a list.  One way is to search for arkivstruktur.xml in search
engines and see what we find.  The arkade5 git repo has several
different data sets.  Did you try them all?

I found more data sets:

 * https://github.com/SesamResearch/Records-Management-and-Archive-Systems-Research/blob/master/samples/
 * https://github.com/documaster/noark-extraction-validator-samples/tree/0.2.0-add-samples

If anyone on this list knows of any more Noark 5 extraction data sets,
please let us know. :)

-- 
Happy hacking
Petter Reinholdtsen

