Re: How should email be stored in a NOARK5 database?

Thomas Sødring thomas.sodring at hioa.no
Sat Apr 15 17:35:10 CEST 2017


On 04/15/2017 09:58 AM, Petter Reinholdtsen wrote:
> [Thomas Sødring]
>> I'll gladly admit that I haven't looked at XMP in detail, nor have I
>> ever heard anyone in the KAI-miljø in Norway talk about it. But my
>> understanding is that you can extend XML as with the following:
>>
>> https://www.pdflib.com/fileadmin/pdflib/products/XMP/machine_extension_schema_1.xmp
>>
>> So you could create an email description extension.
> Hm, interesting.  Will try to keep it in mind if the email in PDF
> approach is explored further.
>
>>> I guess we should do this.  I've started on a comment to send in on
>>> <URL: https://titanpad.com/noark5-forskrift>.  Please have a look and
>>> improve it. :)
>> I will take a look at it. Norsk Arkivråd are looking for input. Would
>> you be interested in also sending it to Norsk Arkivråd?
> Feel free to send it to whoever might be interested.  My comments are
> public for anyone to have a look.  But I'm not quite sure what kind of
> coordination 'sending it to Norsk Arkivråd' would require, and given
> that there are just 15 days left to the deadline and that I have very
> limited time to work on this, I doubt I will have time to coordinate
> with anyone expecting to follow some democratic work flow before sending
> in my comments.
>
>> No, I don't think so ... If you wanted to be very "microsofty" you could
>> hack the understanding of M007 dokumentnummer. But that is meant to be
>> 1,2,3,4,5,6,7,8,.. but it is defined as an integer in XSD. I really
>> think it would be better if we identify the need for such a field to be
>> included in a revision of the tjenestegrensesnitt / next version of
>> Noark. I think you make a very good case for it. My gut feeling is that
>> dokumentObjekt should be extendible to specific types of documents,
>> emailDocument, SMS, MMS, that have their own additional metadata
>> requirements.
> I find M711 virksomhetsspesifikkeMetadata mentioned several times in the
> spec.  Any idea what it is and how to use it?  It is part of
> dokumentbeskrivelse, among other things.

M711 virksomhetsspesifikkeMetadata is a way to add additional metadata
to mappe, basisregistrering and sakspart. I don't believe it is part of
dokumentbeskrivelse though.

<mappe>
   ... obligatory values ..
   <virksomhetsspesifikkeMetadata>
     <myMetadataField1></myMetadataField1>
     <myMetadataField2></myMetadataField2>
   </virksomhetsspesifikkeMetadata>
   ... obligatory values ..
</mappe>

It's a nice way to extend the arkivstruktur; you have to provide an
additional XSD file that validates the contents.
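
As a rough sketch of how such a block could be produced, the following
Python uses only the standard library to build a mappe element with a
virksomhetsspesifikkeMetadata child. The function name build_mappe and
the field name emailMessageId are made up for illustration and are not
part of Noark 5; tittel stands in for the obligatory values, and a real
extension would still need its own XSD file.

```python
import xml.etree.ElementTree as ET


def build_mappe(title, extra_metadata):
    """Build a <mappe> element with a virksomhetsspesifikkeMetadata child.

    extra_metadata maps hypothetical, vendor-defined field names to values.
    """
    mappe = ET.Element("mappe")
    # Placeholder for the obligatory values; only tittel is shown here.
    ET.SubElement(mappe, "tittel").text = title
    vsm = ET.SubElement(mappe, "virksomhetsspesifikkeMetadata")
    for name, value in extra_metadata.items():
        # Each vendor-defined field becomes one child element.
        ET.SubElement(vsm, name).text = value
    return mappe


mappe = build_mappe("Example", {"emailMessageId": "<abc@example.org>"})
xml_text = ET.tostring(mappe, encoding="unicode")
print(xml_text)
```

Nothing here validates the result; checking it against the extra XSD
would be a separate step.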

But seeing as M711 is vendor-specific, there is no guarantee of
interoperability. I've been nagging the KAI-miljø that they have to
standardise their own mappe types (flyktningsmappe, byggesaksmappe,
etc.) or start defining M711 for various case handling areas. No-one
really cares or has the time to do anything with it.

So in one way it's really good, but in another it will probably cause
interoperability problems.

> Another alternative might be to use filnavn in dokumentobjekt.  Emails
> do not really have file names, and the Message-ID would fit in there
> just fine.  Is it a problem for the filnavn value that Message-ID is not
> always unique?  Spam emails tend to reuse Message-ID or set it to empty.
> Proper email clients and servers should always strive to make unique
> values.

That's a nice approach. There is no requirement for filnavn to be unique
as far as I know.
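
A minimal sketch of pulling out the value that would go into filnavn,
using Python's standard email package. The function name
filnavn_for_email and the sample message are made up for illustration.
Since spam may leave Message-ID empty, the sketch returns None in that
case rather than storing an empty filnavn; that choice is an assumption,
not something the standard prescribes.

```python
from email import message_from_string

# Made-up sample message, for illustration only.
RAW = """\
From: a@example.org
To: b@example.org
Subject: Test
Message-ID: <123@example.org>

Body text.
"""


def filnavn_for_email(raw_message):
    """Return the Message-ID to store as filnavn, or None if missing or empty."""
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip()
    return message_id or None


print(filnavn_for_email(RAW))
```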

>
> A key for this to work would be to be able to quickly search for all
> dokumentobjekt entries with a given filename value.  This way the email
> injector would loop over all the values in In-Reply-To and References
> and find if any of them are already stored in the archive, and propose
> to store the email in the same file as these existing archive objects.
>
> With my previously proposed work flow (store everything in a temp file
> and move individual documents to their proper file afterwards), there
> should be an automatic task moving emails into files and creating new
> files for new email threads.  It would probably handle file merging,
> as there are email clients breaking email threads.  It would only work
> well for well-behaved email clients.
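
The lookup described above could be sketched roughly like this in
Python. The names referenced_ids, propose_mapper and filnavn_index are
hypothetical; the index is assumed to be a plain mapping from filnavn
(the Message-ID) to the set of mappe identifiers holding a dokumentobjekt
with that value, which also covers non-unique Message-IDs. How the
archive would actually expose such a search is left open.

```python
import re
from email import message_from_string

# Matches one Message-ID token such as <abc@example.org>.
MSGID_RE = re.compile(r"<[^<>\s]+>")


def referenced_ids(raw_message):
    """Collect the Message-IDs listed in In-Reply-To and References, in order."""
    msg = message_from_string(raw_message)
    ids = []
    for header in ("In-Reply-To", "References"):
        for value in msg.get_all(header, []):
            for mid in MSGID_RE.findall(value):
                if mid not in ids:
                    ids.append(mid)
    return ids


def propose_mapper(raw_message, filnavn_index):
    """Return the mappe identifiers that already hold a referenced email.

    filnavn_index is a hypothetical mapping: filnavn -> set of mappe ids.
    """
    candidates = set()
    for mid in referenced_ids(raw_message):
        candidates |= filnavn_index.get(mid, set())
    return sorted(candidates)
```

An empty result would mean a new mappe for a new thread, one hit would
mean storing the email there, and several hits would flag a possible
merge.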
>
>> Should we send in a mangelmelding asking for this? Or we could ask for
>> this to be included in Noark 6. I think there may be a Noark 5v4.1 but
>> am hearing rumours that the tjenestegrensesnitt will be finalised in Noark
>> 6. But that does not really make sense. They have to finalise the
>> interface in Noark 5v4.1 and then can move forward with Noark 6. I think
>> a mangelmelding identifying the need for this will get them thinking and
>> is perhaps more important from the perspective of a standalone Noark 5
>> core for fagsystem integration than for Noark 5 komplett. I guess the
>> vendors hack these kinds of requirements on top of their Noark systems.
> I suspect a defect report with a proposal for storing emails would be
> good.  The key defect is missing an RFC 822 format as allowed storage
> format.  Not sure the lack of Message-ID field is a defect yet, given
> that I do not understand virksomhetsspesifikkeMetadata and filnavn might
> be an OK fit.
>
>> That's really good. I think I've been so stuck in the Noark structure
>> that I don't really look at the possibilities!  I see more the
>> limitations. The word "tvangstrøye" was often used to describe Noark 4
>> and I think in some ways it still holds true and your use-cases are
>> showing how Noark lacks flexibility.
> Well, I am not quite there yet, but I worry a bit that I might be
> proposing solutions that break expectations so much that no other system
> will be able to use the information we store in the archive. :)
>

I think RA are scared of doing anything here, as they will be criticised
if they mess up, so a lot of this is left to the vendors to push. But
there is no incentive for them to push this. You are now bordering on
the part of the standard that is extensible or has no pre-defined
requirements. Once we are here, we can pretty much do what we want. If
we do something good, I believe there is a good chance it will make its
way into the standard.

 - Tom

