<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
On 11.01.2015 at 11:41, Gnonthgol wrote:<br>
<blockquote type="cite">On 11 Jan 2015 at 01:24, Thomas Weidner
wrote:<br>
> There seemed to be no good tool to convert SOSI files to OSM
(yes there is sosi2osm, but it seemed incomplete...why didn't I
work on sosi2osm? good question...)<br>
<br>
That is a good question. I think it is pretty easy to convert your
"plugins" to Lua scripts for use in sosi2osm, though, so as not to
duplicate the work.<br>
</blockquote>
Yes, I totally plead guilty to NIH syndrome here. One thing that did
not seem to be possible with sosi2osm is collecting all roads and
their route numbers and outputting a corresponding relation at the end.
But then, at some point I stopped looking at it. If sosi2osm is the
project to go with, and all you can reuse is some of the SOSI->OSM tag
mapping from my source, that would also be fine with me.<br>
<blockquote type="cite"><br>
> This tool is now pretty usable and understands nearly all
information available in the freely available "N50" map data. I
created some nice maps of the area around Ålesund for my Garmin
GPS using it. :-)<br>
<br>
The tool looks pretty usable. There are some bugs, however.
Firstly, I do not see where you are reading the "TEGNSETT" tag in
the header. Not all SOSI files are ISO8859-10, and that tag may
hold other values.<br>
</blockquote>
You are right, this is a problem, and I am aware of this field.
Unfortunately, Java has no nice way of switching the input encoding
while reading the file, so I put it on the TODO list, but it should
be fixed. <br>
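<p>One possible workaround is to avoid switching mid-stream at all:
read the raw bytes, look for ..TEGNSETT using a byte-preserving
decoding, then decode everything again with the charset found. What
follows is only a minimal sketch under assumptions: the file fits in
memory, the tag sits in the first 4 KiB, and the header value can be
turned into a Java charset name with a trivial rename. The class and
method names are made up for illustration and are not part of any
existing tool. Values the JVM does not know simply fall back to a
default, so a real implementation would need an explicit mapping table.</p>
<pre>
```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not taken from sosi2osm or any other project.
final class SosiEncodingSketch {

    // Returns the value of the ..TEGNSETT header line, or null if none is found.
    static String readTegnsett(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so the ASCII
        // keywords survive whatever the real encoding is.
        // Assumption: the tag appears within the first 4096 bytes.
        int n = Math.min(data.length, 4096);
        String probe = new String(data, 0, n, StandardCharsets.ISO_8859_1);
        for (String line : probe.split("\\R")) {
            String t = line.trim();
            if (t.startsWith("..TEGNSETT")) {
                return t.substring("..TEGNSETT".length()).trim();
            }
        }
        return null;
    }

    // Maps a TEGNSETT value to a Java charset, or returns the fallback.
    // Assumption: "ISO8859-n" only needs a hyphen to become a Java name.
    static Charset charsetFor(String tegnsett, Charset fallback) {
        if (tegnsett == null) {
            return fallback;
        }
        String name = tegnsett.startsWith("ISO8859-")
                ? "ISO-8859-" + tegnsett.substring(8)
                : tegnsett;
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return fallback;
        }
    }

    // Decodes the whole file with the charset announced in its header.
    static String decode(byte[] data, Charset fallback) {
        return new String(data, charsetFor(readTegnsett(data), fallback));
    }
}
```
</pre>
<p>With something like this, the parser would receive an already
decoded string, for example from
SosiEncodingSketch.decode(Files.readAllBytes(path), StandardCharsets.ISO_8859_1),
and would never have to change encodings halfway through.</p>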
<blockquote type="cite">You have probably seen the SOSI standard
definitions over at
<a class="moz-txt-link-freetext" href="http://www.kartverket.no/en/SOSI-Standard-in-English/SOSI/">http://www.kartverket.no/en/SOSI-Standard-in-English/SOSI/</a> but it
would be fun to make a proper parser that follows the standard. I
see that you have made a regex parser based on the actual standard. I
use the official library, which does not implement such a
parser. I have also seen that the files can violate the standard
in some areas.<br>
</blockquote>
I read the Norwegian standard document about the SOSI format (using
Google Translate, but the BNF is international ;) ). I think, apart
from the encoding issue, there are three problems:<br>
<ol>
<li>The parser does not handle data values correctly: strings ("a
quoted string") are not parsed properly.</li>
<li>It does not parse references correctly (..REF is not handled
by the parser).</li>
<li>To understand files correctly, you seem to have to read the
definition files. For example, "..VNR X Y Z" is really shorthand
notation for "..VNR ...VEGKATEGORI X ...VEGSTATUS Y ...VEGNUMMER
Z". </li>
</ol>
</ol>
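<p>To make the first and third points concrete, here is a rough
sketch of what a fix could look like. It is not what the tool
currently does. The class, the method names and the hard-coded field
table are hypothetical: in a real parser the field names would come
from the definition files. The sketch also assumes that only double
quotes delimit strings and that there are no escape sequences, which
should be checked against the standard. It needs Java 9 or later for
Matcher.results().</p>
<pre>
```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the parser from the tool discussed above.
final class SosiValueSketch {

    // Either a double-quoted string (quotes dropped) or a run of non-blanks.
    // Assumption: no escape sequences inside quoted strings.
    static final Pattern VALUE = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Stand-in for the definition files: the sub-fields of ..VNR, in order.
    static final String[] VNR_FIELDS = {"VEGKATEGORI", "VEGSTATUS", "VEGNUMMER"};

    // Splits the value part of a line, keeping quoted strings together.
    static String[] splitValues(String rest) {
        return VALUE.matcher(rest).results()
                .map(m -> m.group(1) != null ? m.group(1) : m.group(2))
                .toArray(String[]::new);
    }

    // Expands "..VNR X Y Z" into its long form; other lines pass through.
    static String[] expandVnr(String line) {
        String t = line.trim();
        if (!t.startsWith("..VNR ")) {
            return new String[] {line};
        }
        String[] values = splitValues(t.substring("..VNR".length()));
        if (values.length != VNR_FIELDS.length) {
            // Not the shorthand form we know about: leave the line alone.
            return new String[] {line};
        }
        String[] out = new String[values.length + 1];
        out[0] = "..VNR";
        for (int i = 0; i != values.length; i++) {
            out[i + 1] = "..." + VNR_FIELDS[i] + " " + values[i];
        }
        return out;
    }
}
```
</pre>
<p>Running expandVnr over each line before the tag mapping would let
the mapping code see only the long form, so it would not need to know
about the shorthand at all.</p>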
<p>The FYBA C library seems to be pretty complete, but also pretty
ancient. Maybe having an alternative is not that bad. <br>
</p>
<p>- Thomas<br>
</p>
</body>
</html>