Question

For the life of me, I cannot deserialize the protobuf file from Open Street Maps.

I am trying to deserialize the following extract to get nodes: http://download.geofabrik.de/osm/north-america/us-northeast.osm.pbf and I am using http://code.google.com/p/protobuf-net/ as the library. I have tried to deserialize a number of different objects, but they all come back null.

The proto files can be found here: http://trac.openstreetmap.org/browser/applications/utils/export/osm2pgsql/protobuf

Any suggestions?


Solution

Right; the problem is that this isn't just protobuf - it is a hybrid file format (defined here) that includes protobuf among various formats internally. It also incorporates compression (although that looks to be optional).

I've pulled apart what I can from the spec, and I've got a C# reader here that uses protobuf-net to handle the chunks - it happily reads through that file to the end - I can tell you there are 4515 blocks (BlockHeader). When it gets to the Blob, I'm a bit confused as to how the spec demarcates OSMHeader and OSMData - I'm open to suggestions here! I've also used ZLIB.NET to handle the zlib compression that is being used. Since I couldn't get my head around that part, I've settled for decompressing the ZLIB data and validating it against the claimed size, to check that it is at least sane.

If you can figure out (or ask the author) how they separate OSMHeader and OSMData, I'll happily crank something else in. I hope you don't mind that I've stopped here - but it has been a few hours ;p

using System;
using System.IO;
using OpenStreetMap; // where my .proto-generated entities are living
using ProtoBuf; // protobuf-net
using zlib; // ZLIB.NET    

class OpenStreetMapParser
{

    static void Main()
    {
        using (var file = File.OpenRead("us-northeast.osm.pbf"))
        {
            // from http://wiki.openstreetmap.org/wiki/ProtocolBufBinary:
            //A file contains a header followed by a sequence of fileblocks. The design is intended to allow future random-access to the contents of the file and skipping past not-understood or unwanted data.
            //The format is a repeating sequence of:
            //int4: length of the BlockHeader message in network byte order
            //serialized BlockHeader message
            //serialized Blob message (size is given in the header)

            int length, blockCount = 0;
            while (Serializer.TryReadLengthPrefix(file, PrefixStyle.Fixed32, out length))
            {
                // I'm just being lazy and re-using something "close enough" here
                // note that v2 has a big-endian option, but Fixed32 assumes little-endian - we
                // actually need the other way around (network byte order):
                uint len = (uint)length;
                len = ((len & 0xFF) << 24) | ((len & 0xFF00) << 8) | ((len & 0xFF0000) >> 8) | ((len & 0xFF000000) >> 24);
                length = (int)len;

                BlockHeader header;
                // again, v2 has capped-streams built in, but I'm deliberately
                // limiting myself to v1 features
                using (var tmp = new LimitedStream(file, length))
                {
                    header = Serializer.Deserialize<BlockHeader>(tmp);
                }
                Blob blob;
                using (var tmp = new LimitedStream(file, header.datasize))
                {
                    blob = Serializer.Deserialize<Blob>(tmp);
                }
                if(blob.zlib_data == null) throw new NotSupportedException("I'm only handling zlib here!");

                using(var ms = new MemoryStream(blob.zlib_data))
                using(var zlib = new ZLibStream(ms))
                { // at this point I'm very unclear how the OSMHeader and OSMData are packed - it isn't clear
                    // read this to the end, to check we can parse the zlib
                    int payloadLen = 0;
                    while (zlib.ReadByte() >= 0) payloadLen++;
                    if (payloadLen != blob.raw_size) throw new FormatException("Screwed that up...");
                }
                blockCount++;
                Console.WriteLine("Read block " + blockCount.ToString());


            }
            Console.WriteLine("all done");
            Console.ReadLine();
        }
    }
}
abstract class InputStream : Stream
{
    protected abstract int ReadNextBlock(byte[] buffer, int offset, int count);
    public sealed override int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead, totalRead = 0;
        while (count > 0 && (bytesRead = ReadNextBlock(buffer, offset, count)) > 0)
        {
            count -= bytesRead;
            offset += bytesRead;
            totalRead += bytesRead;
            pos += bytesRead;
        }
        return totalRead;
    }
    long pos;
    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotImplementedException();
    }
    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }
    public override long Position
    {
        get
        {
            return pos;
        }
        set
        {
            if (pos != value) throw new NotImplementedException();
        }
    }
    public override long Length
    {
        get { throw new NotImplementedException(); }
    }
    public override void Flush()
    {
        throw new NotImplementedException();
    }
    public override bool CanWrite
    {
        get { return false; }
    }
    public override bool CanRead
    {
        get { return true; }
    }
    public override bool CanSeek
    {
        get { return false; }
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotImplementedException();
    }
}
class ZLibStream : InputStream
{   // uses ZLIB.NET: http://www.componentace.com/download/download.php?editionid=25
    private ZInputStream reader; // seriously, why isn't this a stream?
    public ZLibStream(Stream stream)
    {
        reader = new ZInputStream(stream);
    }
    public override void Close()
    {
        reader.Close();
        base.Close();
    }
    protected override int ReadNextBlock(byte[] buffer, int offset, int count)
    {
        // OMG! reader.Read is the base-stream, reader.read is decompressed! yeuch
        return reader.read(buffer, offset, count);
    }

}
// deliberately doesn't dispose the base-stream    
class LimitedStream : InputStream
{
    private Stream stream;
    private long remaining;
    public LimitedStream(Stream stream, long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (stream == null) throw new ArgumentNullException("stream");
        if (!stream.CanRead) throw new ArgumentException("stream");
        this.stream = stream;
        this.remaining = length;
    }
    protected override int ReadNextBlock(byte[] buffer, int offset, int count)
    {
        if(count > remaining) count = (int)remaining;
        int bytesRead = stream.Read(buffer, offset, count);
        if (bytesRead > 0) remaining -= bytesRead;
        return bytesRead;
    }
}
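
As a side note on the byte-swapping above: the 4-byte length can also be read directly in network byte order, without going through Fixed32 and swapping afterwards. This is only a minimal sketch; the names LengthPrefix and TryReadInt32BigEndian are made up for illustration and are not part of protobuf-net.

```csharp
using System;
using System.IO;

static class LengthPrefix
{
    // Hypothetical helper (not a protobuf-net API): reads a 4-byte big-endian
    // (network byte order) integer from the stream.
    // Returns false if the stream ends cleanly before any byte is read.
    public static bool TryReadInt32BigEndian(Stream stream, out int value)
    {
        byte[] buf = new byte[4];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until full
        while (read < 4)
        {
            int n = stream.Read(buf, read, 4 - read);
            if (n <= 0) break;
            read += n;
        }
        if (read == 0)
        {
            value = 0;
            return false; // clean end of file: no more blocks
        }
        if (read < 4) throw new EndOfStreamException("Truncated length prefix");
        // most significant byte comes first in network byte order
        value = (buf[0] << 24) | (buf[1] << 16) | (buf[2] << 8) | buf[3];
        return true;
    }
}
```

Under that assumption, the loop condition would become something like while (LengthPrefix.TryReadInt32BigEndian(file, out length)) and the swap lines could go.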

Other tips

Following the outline set up by Mark, I figured out the last part by looking at http://git.openstreetmap.nl/index.cgi/pbf2osm.git/tree/src/main.c?h=35116112eb0066c7729a963b292faa608ddc8ad7

Here is the final code.

using System;
using System.Diagnostics;
using System.IO;
using crosby.binary;
using OSMPBF;
using PerlLLC.Tools;
using ProtoBuf;
using zlib;

namespace OpenStreetMapOperations
{
    class OpenStreetMapParser
    {
        static void Main()
        {
            using (var file = File.OpenRead(StaticTools.AssemblyDirectory + @"\us-pacific.osm.pbf"))
            {
                // from http://wiki.openstreetmap.org/wiki/ProtocolBufBinary:
                //A file contains a header followed by a sequence of fileblocks. The design is intended to allow future random-access to the contents of the file and skipping past not-understood or unwanted data.
                //The format is a repeating sequence of:
                //int4: length of the BlockHeader message in network byte order
                //serialized BlockHeader message
                //serialized Blob message (size is given in the header)

                int length, blockCount = 0;
                while (Serializer.TryReadLengthPrefix(file, PrefixStyle.Fixed32, out length))
                {
                    // I'm just being lazy and re-using something "close enough" here
                    // note that v2 has a big-endian option, but Fixed32 assumes little-endian - we
                    // actually need the other way around (network byte order):
                    length = IntLittleEndianToBigEndian((uint)length);

                    BlockHeader header;
                    // again, v2 has capped-streams built in, but I'm deliberately
                    // limiting myself to v1 features
                    using (var tmp = new LimitedStream(file, length))
                    {
                        header = Serializer.Deserialize<BlockHeader>(tmp);
                    }
                    Blob blob;
                    using (var tmp = new LimitedStream(file, header.datasize))
                    {
                        blob = Serializer.Deserialize<Blob>(tmp);
                    }
                    if (blob.zlib_data == null) throw new NotSupportedException("I'm only handling zlib here!");

                    HeaderBlock headerBlock;
                    PrimitiveBlock primitiveBlock;

                    using (var ms = new MemoryStream(blob.zlib_data))
                    using (var zlib = new ZLibStream(ms))
                    {
                        if (header.type == "OSMHeader")
                            headerBlock = Serializer.Deserialize<HeaderBlock>(zlib);

                        if (header.type == "OSMData")
                            primitiveBlock = Serializer.Deserialize<PrimitiveBlock>(zlib);
                    }
                    blockCount++;
                    Trace.WriteLine("Read block " + blockCount.ToString());


                }
                Trace.WriteLine("all done");
            }
        }

        // 4-byte number
        static int IntLittleEndianToBigEndian(uint i)
        {
            return (int)(((i & 0xff) << 24) + ((i & 0xff00) << 8) + ((i & 0xff0000) >> 8) + ((i >> 24) & 0xff));
        }
    }

    abstract class InputStream : Stream
    {
        protected abstract int ReadNextBlock(byte[] buffer, int offset, int count);
        public sealed override int Read(byte[] buffer, int offset, int count)
        {
            int bytesRead, totalRead = 0;
            while (count > 0 && (bytesRead = ReadNextBlock(buffer, offset, count)) > 0)
            {
                count -= bytesRead;
                offset += bytesRead;
                totalRead += bytesRead;
                pos += bytesRead;
            }
            return totalRead;
        }
        long pos;
        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }
        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }
        public override long Position
        {
            get
            {
                return pos;
            }
            set
            {
                if (pos != value) throw new NotImplementedException();
            }
        }
        public override long Length
        {
            get { throw new NotImplementedException(); }
        }
        public override void Flush()
        {
            throw new NotImplementedException();
        }
        public override bool CanWrite
        {
            get { return false; }
        }
        public override bool CanRead
        {
            get { return true; }
        }
        public override bool CanSeek
        {
            get { return false; }
        }
        public override long Seek(long offset, SeekOrigin origin)
        {
            throw new NotImplementedException();
        }
    }
    class ZLibStream : InputStream
    {   // uses ZLIB.NET: http://www.componentace.com/download/download.php?editionid=25
        private ZInputStream reader; // seriously, why isn't this a stream?
        public ZLibStream(Stream stream)
        {
            reader = new ZInputStream(stream);
        }
        public override void Close()
        {
            reader.Close();
            base.Close();
        }
        protected override int ReadNextBlock(byte[] buffer, int offset, int count)
        {
            // OMG! reader.Read is the base-stream, reader.read is decompressed! yeuch
            return reader.read(buffer, offset, count);
        }

    }
    // deliberately doesn't dispose the base-stream    
    class LimitedStream : InputStream
    {
        private Stream stream;
        private long remaining;
        public LimitedStream(Stream stream, long length)
        {
            if (length < 0) throw new ArgumentOutOfRangeException("length");
            if (stream == null) throw new ArgumentNullException("stream");
            if (!stream.CanRead) throw new ArgumentException("stream");
            this.stream = stream;
            this.remaining = length;
        }
        protected override int ReadNextBlock(byte[] buffer, int offset, int count)
        {
            if (count > remaining) count = (int)remaining;
            int bytesRead = stream.Read(buffer, offset, count);
            if (bytesRead > 0) remaining -= bytesRead;
            return bytesRead;
        }
    }
}
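
If pulling in ZLIB.NET is not an option, one possible alternative is the framework's own DeflateStream. A zlib stream (RFC 1950) is a 2-byte header, raw DEFLATE data, and a 4-byte Adler-32 trailer, so skipping the first two bytes lets DeflateStream read the rest. This is a sketch under assumptions: the ZlibPayload class and Inflate method are hypothetical names, the Adler-32 checksum is not verified, and a preset dictionary is not supported.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class ZlibPayload
{
    // Hypothetical helper: inflates a zlib-wrapped buffer and checks the result
    // against the size claimed by the blob (raw_size in the .proto).
    public static byte[] Inflate(byte[] zlibData, int expectedSize)
    {
        if (zlibData == null || zlibData.Length < 2)
            throw new InvalidDataException("Not a zlib stream");
        // bit 5 of the second header byte (FDICT) signals a preset dictionary
        if ((zlibData[1] & 0x20) != 0)
            throw new NotSupportedException("Preset dictionary not handled");

        // skip the 2-byte zlib header; the Adler-32 trailer is left unread
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 2))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            byte[] result = output.ToArray();
            if (result.Length != expectedSize)
                throw new InvalidDataException("Decompressed size does not match raw_size");
            return result;
        }
    }
}
```

The returned bytes could then be wrapped in a MemoryStream and handed to Serializer.Deserialize in place of the ZLibStream wrapper.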

Yes, it came from protogen in Fileformat.cs (based on the OSM Fileformat.proto file; code below).

package OSM_PROTO;
  message Blob {
    optional bytes raw = 1;
    optional int32 raw_size = 2; 
    optional bytes zlib_data = 3;
    optional bytes lzma_data = 4;
    optional bytes bzip2_data = 5;
  }

  message BlockHeader {
    required string type = 1;
    optional bytes indexdata = 2;
    required int32 datasize = 3;
  }
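
Since Blob offers several optional payload fields, it can help to check which one is actually set before trying to decode anything. The following is only a sketch: BlobEncoding and BlobInspector are hypothetical names, and it assumes that unset optional bytes fields come back as null and that exactly one payload field is populated per blob.

```csharp
using System;

// Hypothetical enum mirroring the payload fields of the Blob message above
enum BlobEncoding { Raw, Zlib, Lzma, Bzip2 }

static class BlobInspector
{
    // Reports which payload field is set; assumes unset fields are null.
    public static BlobEncoding Detect(byte[] raw, byte[] zlibData, byte[] lzmaData, byte[] bzip2Data)
    {
        int present = 0;
        BlobEncoding found = BlobEncoding.Raw;
        if (raw != null) { present++; found = BlobEncoding.Raw; }
        if (zlibData != null) { present++; found = BlobEncoding.Zlib; }
        if (lzmaData != null) { present++; found = BlobEncoding.Lzma; }
        if (bzip2Data != null) { present++; found = BlobEncoding.Bzip2; }
        // assumption: a well-formed blob carries exactly one payload
        if (present != 1)
            throw new FormatException("Expected exactly one payload field, found " + present);
        return found;
    }
}
```

With a check like this, an unsupported encoding shows up as an explicit error instead of a null payload.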

Here is the declaration of BlockHeader in the generated file:

public sealed partial class BlockHeader : pb::GeneratedMessage<BlockHeader, BlockHeader.Builder> {...}

-> with pb = global::Google.ProtocolBuffers;

(ProtocolBuffers.dll) came with this package:

http://code.google.com/p/protobuf-csharp-port/downloads/detail?name=protobuf-csharp-port-2.4.1.473-full-binaries.zip&can=2&q=

Have you tried getting a somewhat smaller area? Like us-pacific.osm.pbf

Finally, it would be useful to post the error messages.

Licensed under: CC-BY-SA with attribution