Why does `OpCode.Value` have the “wrong” endianness?
27-06-2021

Question
Facts:
1. The correct encoding for the CIL instruction rethrow's op-code is the two-byte sequence FE 1A.
2. OpCodes.Rethrow.Value (which has type short) has the value 0xFE1A on my little-endian machine.
3. BitConverter honours the machine's endianness when converting to/from byte sequences.
4. On my little-endian machine, BitConverter.GetBytes(OpCodes.Rethrow.Value) results in the byte sequence 1A FE.

That means serializing an OpCode.Value on a little-endian machine using BitConverter does not produce the correct encoding for the op-code; the byte order is reversed.
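The observation can be reproduced in a few lines (a minimal sketch; the commented output is what a little-endian machine produces):

```csharp
using System;
using System.Reflection.Emit;

class Program
{
    static void Main()
    {
        short value = OpCodes.Rethrow.Value;
        Console.WriteLine(value.ToString("X4"));          // FE1A

        // BitConverter honours the machine's byte order, so on a
        // little-endian machine the low byte (1A) comes out first.
        byte[] bytes = BitConverter.GetBytes(value);
        Console.WriteLine(BitConverter.ToString(bytes));  // 1A-FE, not FE-1A
    }
}
```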
Questions:
1. Is the byte ordering of OpCode.Value documented (and if so, where?), or is it an "implementation detail"?
2. Does step 4 above on a big-endian machine also result in the wrong byte ordering? That is, would OpCodes.Rethrow.Value be 0x1AFE on a big-endian machine?
Solution 2
I've reached the conclusion that serializing an op-code representation based on the OpCode.Value property, i.e.:
OpCode someOpCode = …;
byte[] someOpCodeEncoding = BitConverter.GetBytes(someOpCode.Value);
is a bad idea, but not because of the use of BitConverter.GetBytes(short), whose behaviour is well-documented. The main culprit is the OpCode.Value property, whose documentation is vague in two respects:
1. It states that this property contains "the value of the immediate operand", which may or may not refer to the op-code's encoding; that term doesn't appear anywhere in the CLI specification.
2. Even if we assume that it does in fact contain an op-code's encoding, the documentation says nothing about byte order. (Byte order comes into play when converting between byte[] and short.)
Why am I basing my argument on MSDN documentation, and not on the CLI standard? Because System.Reflection.Emit is not part of the Reflection Library as defined by the CLI standard. For this reason, I think it's fairly safe to say that the MSDN reference documentation for this namespace is as close as it gets to an official specification. (But unlike @Hans Passant's answer, I would not go one step further and claim that the reference source is in any way a specification.)
Conclusion:
There are two ways to output the op-code encoding for a given OpCode object:
1. Stay with System.Reflection.Emit functionality and use ILGenerator.Emit(someOpCode). This may be too restrictive in some situations.
2. Create your own mapping between op-code encodings (i.e. byte[] sequences) and the various OpCode objects.
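The second option can be sketched by enumerating the public static OpCode fields of the OpCodes class via reflection (OpCodeMap and Build are hypothetical names; emitting the high byte of a two-byte op-code first matches the reference source's behaviour):

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

static class OpCodeMap
{
    // Build a table from each OpCode to its byte encoding, derived from
    // its Size and Value properties rather than from BitConverter.
    public static Dictionary<OpCode, byte[]> Build()
    {
        var map = new Dictionary<OpCode, byte[]>();
        foreach (FieldInfo field in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            if (field.FieldType != typeof(OpCode)) continue;
            var opCode = (OpCode)field.GetValue(null);
            ushort value = (ushort)opCode.Value;
            byte[] encoding = opCode.Size == 1
                ? new[] { (byte)value }                       // one-byte op-codes: low byte only
                : new[] { (byte)(value >> 8), (byte)value };  // two-byte op-codes: high byte first
            map[opCode] = encoding;
        }
        return map;
    }
}
```

Because the byte order is fixed explicitly by the shifts, the resulting table is the same on little- and big-endian machines.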
Other hints
The Value property looks like this in the Reference Source:
public short Value
{
get
{
if (m_size == 2)
return (short) (m_s1 << 8 | m_s2);
return (short) m_s2;
}
}
That looks entirely sane of course: m_s2 is always the least significant byte. Looking at ILGenerator:
internal void InternalEmit(OpCode opcode)
{
if (opcode.m_size == 1)
{
m_ILStream[m_length++] = opcode.m_s2;
}
else
{
m_ILStream[m_length++] = opcode.m_s1;
m_ILStream[m_length++] = opcode.m_s2;
}
UpdateStackSize(opcode, opcode.StackChange());
}
Which is what you expected: the 0xFE byte gets emitted first.
So the framework code carefully avoids taking a dependency on endianness. CIL doesn't have an endianness dependency; no variable-length data ever does. That is true for text files, UTF-8 encoding, x86 machine code instructions. And CIL. So if you convert variable-length data to a single value, as the Value property getter does, then that code inevitably makes a conversion from endianness-free data to endianness-dependent data. Which inevitably gets half of the world upset, because they think it was the wrong way around. And 100% of all programmers that run into it.
Probably the best way is to do it like the framework does and recover m_s1 and m_s2 as quickly as you can, using your own version of the Opcode type. Easy to do with:
foo.m_s1 = (byte)(opc.Value >> 8);
foo.m_s2 = (byte)(opc.Value & 0xff);
foo.m_size = opc.Size;
Which has no endian-ness dependency.
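Such a stand-in type might look like this (a sketch; MyOpCode, From, and WriteTo are made-up names, while the m_s1/m_s2/m_size fields mirror the reference source):

```csharp
using System;
using System.IO;
using System.Reflection.Emit;

// Hypothetical stand-in for OpCode that stores the encoding bytes directly,
// so serialization never depends on the machine's byte order.
struct MyOpCode
{
    public byte m_s1;   // prefix byte of a two-byte op-code (e.g. 0xFE); unused if m_size == 1
    public byte m_s2;   // always the last byte of the encoding
    public int m_size;  // 1 or 2

    public static MyOpCode From(OpCode opc) => new MyOpCode
    {
        m_s1 = (byte)(opc.Value >> 8),
        m_s2 = (byte)(opc.Value & 0xff),
        m_size = opc.Size,
    };

    public void WriteTo(Stream stream)
    {
        if (m_size == 2)
            stream.WriteByte(m_s1); // prefix byte first, matching InternalEmit
        stream.WriteByte(m_s2);
    }
}
```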
Try:
var yourStream = new MemoryStream();
var writer = new System.IO.BinaryWriter(yourStream);
writer.Write(OpCodes.Rethrow.Value);
You don't need to worry about byte order, since BinaryWriter (or BinaryReader) handles the implementation details for you: per its documentation, BinaryWriter.Write(Int16) always writes the value in little-endian format, regardless of the machine's byte order. I suspect that the reason you're getting the "wrong" byte order is that you're applying BitConverter.GetBytes to the OpCode value when it's already stored as little endian, so converting it again reverses the byte order, giving you the "wrong" result.