Running noark5-validator/src/importExtraction.php on Debian and MySQL

Petter Reinholdtsen pere at
Sun May 21 18:52:42 CEST 2017

[Ole Aamot]
> I think it may be possible since it appears to me that Thomas Sødring
> has already written PHP code for importing XML in
> noark5-validator/src/importExtraction.php from


Adding the ability to import existing data sets will make nikita useful
for many new use cases. :)

> Can you describe these tables and their purpose in Noark5?

If you browse to
<URL: >
(and sakarkiv) using the API browser available from
<URL: > and the spec available from
<URL: >
you should get quite far.

Here is my understanding of the API end point names:

>> > +-------------------------------+
>> > | Tables_in_noark5              |
>> > +-------------------------------+
>> > | basic_record                  | arkivstruktur/basisregistrering
>> > | basic_record_keyword          | arkivstruktur/basisregistrering.noekkelord
>> > | case_file                     | sakarkiv/saksmappe
>> > | classification_system         | metadata/klassifikasjonssystem
>> > | correspondence_part           | arkivstruktur/basisregistrering.korrespondansepart
>> > | document_description          | arkivstruktur/dokumentbeskrivelse
>> > | document_object               | arkivstruktur/dokumentobjekt
>> > | file                          | arkivstruktur/mappe
>> > | file_keyword                  | arkivstruktur/mappe.noekkelord
>> > | fonds                         | arkivstruktur/arkiv
>> > | fonds_creator                 | arkivstruktur/arkivskaper
>> > | keyword                       | metadata/noekkelord
>> > | record                        | arkivstruktur/registrering
>> > | record_correspondence_part    | arkivstruktur/registrering.korrespondansepart
>> > | registry_entry                | sakarkiv/journalpost
>> > | series                        | arkivstruktur/arkivdel
>> > | series_classfication_system   | metadata/klassifikasjonssystem
>> > | series_storage_location       | arkivstruktur/arkivdel.oppbevaringssted
>> > +-------------------------------+

Written from memory, so I might have gotten some of them wrong.
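
For anyone scripting against the database, the table above can be kept
as a simple lookup. This is only a sketch in Python: the table and end
point names are copied from the list above (written from memory, so
possibly wrong), and the helper names endpoint_for() and
tables_by_area() are made up for illustration.

```python
# Table name -> API end point, copied from the list in this message.
# The list was written from memory and may contain mistakes.
TABLE_TO_ENDPOINT = {
    "basic_record": "arkivstruktur/basisregistrering",
    "basic_record_keyword": "arkivstruktur/basisregistrering.noekkelord",
    "case_file": "sakarkiv/saksmappe",
    "classification_system": "metadata/klassifikasjonssystem",
    "correspondence_part": "arkivstruktur/basisregistrering.korrespondansepart",
    "document_description": "arkivstruktur/dokumentbeskrivelse",
    "document_object": "arkivstruktur/dokumentobjekt",
    "file": "arkivstruktur/mappe",
    "file_keyword": "arkivstruktur/mappe.noekkelord",
    "fonds": "arkivstruktur/arkiv",
    "fonds_creator": "arkivstruktur/arkivskaper",
    "keyword": "metadata/noekkelord",
    "record": "arkivstruktur/registrering",
    "record_correspondence_part": "arkivstruktur/registrering.korrespondansepart",
    "registry_entry": "sakarkiv/journalpost",
    "series": "arkivstruktur/arkivdel",
    "series_classfication_system": "metadata/klassifikasjonssystem",
    "series_storage_location": "arkivstruktur/arkivdel.oppbevaringssted",
}


def endpoint_for(table):
    """Return the end point name recorded for a table (hypothetical helper)."""
    return TABLE_TO_ENDPOINT[table]


def tables_by_area():
    """Group table names by the first path component of their end point."""
    areas = {}
    for table, endpoint in TABLE_TO_ENDPOINT.items():
        area = endpoint.split("/", 1)[0]
        areas.setdefault(area, []).append(table)
    return areas
```

Calling tables_by_area() groups the tables under arkivstruktur,
sakarkiv and metadata, which can help when deciding which part of the
API to look at first.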

There are some base/subclass relations:

  mappe -> saksmappe
  registrering -> basisregistrering -> journalpost
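
The two relations above can be pictured as plain class inheritance.
This is a sketch in Python for illustration only; the class names
simply mirror the Noark 5 terms and say nothing about how nikita
itself models them.

```python
# Base/subclass relations as listed above, expressed as empty classes.
class Mappe:
    """A file (mappe)."""


class Saksmappe(Mappe):
    """A case file (saksmappe) is a specialised mappe."""


class Registrering:
    """A record (registrering)."""


class Basisregistrering(Registrering):
    """A basic record (basisregistrering) extends registrering."""


class Journalpost(Basisregistrering):
    """A registry entry (journalpost) extends basisregistrering."""
```

With this picture, anything that accepts a registrering also accepts a
journalpost, but a saksmappe is unrelated to the registrering branch.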

There are also some parent/child container relationships:

 arkivskaper -> arkiv -> arkivdel -> mappe -> registrering ->
 dokumentbeskrivelse -> dokumentobjekt -> {selve filen}

Note, it is possible to skip some of these steps in the chain.  For
example, I believe this is valid too:

 arkiv -> arkivdel -> registrering -> dokumentbeskrivelse -> dokumentobjekt

i.e. dropping arkivskaper and/or mappe.
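
The container chain and the skipping rule can be sketched as a small
check in Python. It assumes that only arkivskaper and mappe may be left
out, which is just what the example above suggests; the specification
may allow other variations. The names FULL_CHAIN, OPTIONAL and
is_valid_chain() are invented for this sketch.

```python
# Full parent -> child order as given above (the file itself is left out).
FULL_CHAIN = [
    "arkivskaper",
    "arkiv",
    "arkivdel",
    "mappe",
    "registrering",
    "dokumentbeskrivelse",
    "dokumentobjekt",
]

# Assumption: only these two levels may be skipped.
OPTIONAL = {"arkivskaper", "mappe"}


def is_valid_chain(chain):
    """Check that chain follows FULL_CHAIN order and skips only OPTIONAL levels."""
    # Every level must be a known one.
    if any(name not in FULL_CHAIN for name in chain):
        return False
    # Levels must appear in strictly increasing parent -> child order.
    positions = [FULL_CHAIN.index(name) for name in chain]
    if any(a >= b for a, b in zip(positions, positions[1:])):
        return False
    # Every non-optional level must be present.
    return all(name in chain for name in FULL_CHAIN if name not in OPTIONAL)
```

For example, is_valid_chain(["arkiv", "arkivdel", "registrering",
"dokumentbeskrivelse", "dokumentobjekt"]) accepts the shortened chain
shown above, while a chain without arkivdel is rejected under the
stated assumption.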

I hope this makes it clearer.

> Is the Nikita Noark5 Core storing the data in the MySQL database, in
> memory (Java) or both?

It uses a database-agnostic framework, so it can work with many
databases.  The demo setup uses an in-memory database, but both
PostgreSQL and MySQL should work too.

Happy hacking
Petter Reinholdtsen
