How should email be stored in a NOARK5 database?

Petter Reinholdtsen pere at hungry.com
Fri Apr 14 20:13:21 CEST 2017


[Thomas Sødring]
> I think you can use XMP
> (https://www.pdflib.com/knowledge-base/xmp-metadata/xmp-in-pdfa/)

Right.  Did not find anything obvious while skimming the document.

> Arkivforskriften is out for comments at the moment. This would be a good
> time to propose that.

I guess we should do this.  I've started on a comment to send in on
<URL: https://titanpad.com/noark5-forskrift>.  Please have a look and
improve it. :)

> Given that I haven't read IETF RFC 5322, you have to forgive me if my
> devil's advocate arguments are out of place, but I think it's worth
> considering. Let's take an email with 4 attachments. This results in a
> journalpost with one (the text at the beginning of the mail)
> "hoveddokument" and the 4 attachments are "vedlegg". So they would
> have to be parsed out as documents. But I guess this is defined in 5322
> so is possible.

The MIME part specifies how to extract attachments, sure.  I am not quite
sure how best to handle this in Noark 5, as I assume the original email
should be connected somehow to the separate parts if they are extracted
and stored separately.
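
To make the extraction step concrete, here is a minimal sketch using the
email package from the Python standard library.  It only splits a raw
message into the main text and its attachments.  How the parts map to
"hoveddokument" and "vedlegg", and how they are linked back to the
original email, is left open.  The function names split_email and
sample_message and the example addresses are made up for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw):
    """Return (main_text, attachments) for a raw email message in bytes."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Prefer the plain text body, fall back to HTML if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))
    main_text = body.get_content() if body is not None else ""
    attachments = []
    for part in msg.iter_attachments():
        attachments.append({
            "filename": part.get_filename(),
            "mimetype": part.get_content_type(),
            # decode=True undoes the base64/quoted-printable transfer encoding
            "data": part.get_payload(decode=True),
        })
    return main_text, attachments


def sample_message():
    """Build a small test message with one attachment (hypothetical data)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "archive@example.com"
    msg["Subject"] = "Test"
    msg["Message-ID"] = "<1@example.com>"
    msg.set_content("Main text of the mail.\n")
    msg.add_attachment(b"%PDF-1.4 dummy", maintype="application",
                       subtype="pdf", filename="report.pdf")
    return msg.as_bytes()
```

Keeping the raw bytes passed to split_email around as well would make it
possible to store the original email next to the extracted parts.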

> I think a contrived challenge here is that the message to be archived
> might have, for example, been forwarded twice, with or without proper
> ID values in the mail and the case-handlers will complain about how
> difficult it is, or part of the mail contains private information and
> is not meant to head into the archive. But these situations are not
> the majority, and routines should try to pick up the difficult
> use-cases.

Yes, there will be edge cases that need more thought.  For example,
depending on the 'forwarding' mechanism used, storing the original email
might be anywhere from easy to impossible.

> But I think it's a really good idea. The less heterogeneity, the easier
> it is to do preservation. And surely it can't be that difficult to get a
> message from exchange to the proper format.

One challenge to solve is how/where to store the message "ID", as it
will need to be easily searchable if you want to group all emails in an
email thread into the same file.  Is there a good place to store such a
value in Noark 5?
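
As a rough illustration of why the ID needs to be searchable, here is a
sketch of grouping messages into threads using the Message-ID,
In-Reply-To and References headers from RFC 5322.  It is a simple
heuristic that assumes the mail clients kept these headers intact, and it
says nothing about where the value should live in Noark 5.  The names
thread_key and group_by_thread are made up for this example.

```python
from collections import defaultdict
from email import message_from_string


def thread_key(msg):
    """Guess the Message-ID of the first message in the thread."""
    # References lists the ancestors, oldest first, when it is present.
    refs = (msg.get("References") or "").split()
    if refs:
        return refs[0]
    # Otherwise fall back to the direct parent, if any.
    parent = (msg.get("In-Reply-To") or "").split()
    if parent:
        return parent[0]
    # No parent information: the message starts its own thread.
    return (msg.get("Message-ID") or "").strip()


def group_by_thread(raw_messages):
    """Map thread key -> list of Message-ID values, in input order."""
    threads = defaultdict(list)
    for raw in raw_messages:
        msg = message_from_string(raw)
        message_id = (msg.get("Message-ID") or "").strip()
        threads[thread_key(msg)].append(message_id)
    return dict(threads)
```

A lookup like this is only cheap if the stored ID can be searched
directly, which is the reason for the question above.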

I've extended archive-file in the noark5-tester git repo to be able to
store emails.  But it isn't yet doing a good job of sorting them into
files.

-- 
Happy hacking
Petter Reinholdtsen


More information about the nikita-noark mailing list