Question

Today I ran into a problem where I need to fill memory with a certain pattern like 0x11223344, so that the whole memory looks like (in hex):

1122334411223344112233441122334411223344112233441122334411223344...

I can't figure out how to do it with memset() because it takes only a single byte, not 4 bytes.

Any ideas?

Thanks, Boda Cydo.


Solution

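An efficient way would be to fill the memory through a pointer to an integer type of the needed size (e.g. uint32_t for 4 bytes) instead of byte by byte. It's a little ugly though: casting a char buffer to uint32_t * is only safe if the buffer is suitably aligned and the aliasing rules permit the access, so the buffer here is declared as uint32_t to begin with. Also note that the byte order in memory follows the machine's endianness; on a little-endian CPU, storing 0x11223344 produces the bytes 44 33 22 11.

#include <stddef.h>
#include <stdint.h>

uint32_t buf[256 / sizeof(uint32_t)] = { 0 };  /* 256 bytes, aligned for uint32_t */
uint32_t * p = buf;
size_t i;

/* one 32-bit store per element instead of four single-byte stores */
for(i = 0; i < sizeof(buf) / sizeof(* p); ++i) {
        p[i] = 0x11223344;
}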

Not tested!
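
If the bytes have to appear in memory in exactly the order 11 22 33 44 regardless of the CPU's byte order, one option is to copy the pattern as bytes rather than store it as an integer. The following is only a sketch of that idea; fill_bytes4 is a made-up helper name, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* fill_bytes4 is a hypothetical helper name, not part of any library.
 * It repeats the 4 pattern bytes across dst; len is counted in bytes.
 * If len is not a multiple of 4, the tail gets the leading pattern bytes. */
static void fill_bytes4(void *dst, const unsigned char pattern[4], size_t len)
{
    unsigned char *out = dst;
    size_t i = 0;

    assert(dst != NULL || len == 0);

    /* Whole 4-byte groups; memcpy has no alignment or aliasing requirements. */
    for (; i + 4 <= len; i += 4)
        memcpy(out + i, pattern, 4);

    /* Leftover 1 to 3 bytes, if any. */
    if (i < len)
        memcpy(out + i, pattern, len - i);
}
```

Called as fill_bytes4(buf, (const unsigned char[4]){ 0x11, 0x22, 0x33, 0x44 }, sizeof buf), this writes the same byte sequence on little-endian and big-endian machines.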

Other tips

On OS X, one uses memset_pattern4( ) for this; I would expect other platforms to have similar APIs.

I don't know of a simple portable solution, other than just filling in the buffer with a loop (which is pretty darn simple).
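
A rough sketch of how the two could be combined: use memset_pattern4() where it exists and fall back to the plain loop elsewhere. The wrapper name fill_pattern4 and the choice of __APPLE__ as the feature test are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* fill_pattern4 is a hypothetical wrapper name; len is counted in bytes. */
static void fill_pattern4(void *dst, const void *pattern4, size_t len)
{
    assert(dst != NULL && pattern4 != NULL);

#ifdef __APPLE__
    /* memset_pattern4() is declared in <string.h> on OS X. */
    memset_pattern4(dst, pattern4, len);
#else
    /* Portable fallback: the simple loop. */
    unsigned char *out = dst;
    const unsigned char *pat = pattern4;
    size_t i;

    for (i = 0; i < len; ++i)
        out[i] = pat[i % 4];
#endif
}
```

The pattern is passed as a pointer to 4 bytes, so the caller decides the byte order explicitly.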

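Repeatedly copy the memory, using the area you have already filled as the template for the next iteration, so that the filled region doubles each time (O(log(N)) copy calls):

#include <stddef.h>
#include <string.h>

size_t fillLen = ...;   // total number of bytes to fill, assumed >= blockSize
size_t blockSize = 4;   // Size of your pattern

memmove(dest, srcPattern, blockSize);   // seed the buffer with one copy of the pattern
char * start = dest;
char * current = dest + blockSize;
char * end = start + fillLen;
// invariant: [start, current) is filled and is exactly blockSize bytes long
while((size_t)(end - current) > blockSize) {
    memmove(current, start, blockSize);
    current += blockSize;
    blockSize *= 2;
}
// fill the rest
memmove(current, start, (size_t)(end - current));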

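[EDIT] What I mean by "O(log(N))" is the number of memmove() calls, since the filled region doubles on every iteration. The total number of bytes written is still O(N), but the runtime will usually be much better than if you fill the memory manually, since memmove() usually uses special, hand-optimized assembler loops that are blazing fast.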
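
To make the call count concrete, here is a self-contained sketch of the doubling idea that also returns how many copy calls it made. The function name fill_doubling and its return value are inventions for this illustration. It uses memcpy() because the source and destination ranges never overlap in this formulation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* fill_doubling is a hypothetical name. It fills len bytes of dst with the
 * repeated pattern and returns the number of memcpy() calls it made. */
static size_t fill_doubling(void *dst, const void *pattern,
                            size_t pattern_len, size_t len)
{
    unsigned char *out = dst;
    size_t filled;
    size_t calls = 0;

    assert(pattern_len > 0);
    if (len == 0)
        return 0;

    /* Seed the start of the buffer with one (possibly truncated) pattern. */
    filled = pattern_len < len ? pattern_len : len;
    memcpy(out, pattern, filled);
    calls++;

    /* Copy [0, n) to [filled, filled + n); since n <= filled, no overlap. */
    while (filled < len) {
        size_t n = len - filled < filled ? len - filled : filled;
        memcpy(out + filled, out, n);
        filled += n;
        calls++;
    }
    return calls;
}
```

For a 1024-byte buffer and a 4-byte pattern this makes 9 calls (one seed copy plus 8 doublings), while every byte is still written exactly once.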

You could set up the sequence somewhere then copy it using memcpy() to where you need it.
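
One possible reading of this suggestion, sketched below: build a small block ("tile") that already contains the repeated pattern, then memcpy() that tile over the destination chunk by chunk. The names fill_from_tile and TILE_BYTES, and the 64-byte tile size, are arbitrary choices for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TILE_BYTES = 64 };  /* illustrative size; must be a multiple of 4 */

/* fill_from_tile is a hypothetical name; len is counted in bytes. */
static void fill_from_tile(void *dst, const unsigned char pattern[4], size_t len)
{
    unsigned char tile[TILE_BYTES];
    unsigned char *out = dst;
    size_t i;

    assert(dst != NULL || len == 0);

    /* Set up the sequence once. */
    for (i = 0; i < TILE_BYTES; ++i)
        tile[i] = pattern[i % 4];

    /* Copy whole tiles while they fit. */
    while (len >= TILE_BYTES) {
        memcpy(out, tile, TILE_BYTES);
        out += TILE_BYTES;
        len -= TILE_BYTES;
    }

    /* Tail of 0 to 63 bytes; the tile start is pattern-aligned. */
    if (len > 0)
        memcpy(out, tile, len);
}
```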

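If your pattern fits in a wchar_t, you can use wmemset() as you would have used memset(). Keep in mind that the size of wchar_t is platform-dependent (commonly 4 bytes on Linux and OS X, 2 bytes on Windows), so this only gives you a 4-byte pattern where wchar_t is 4 bytes wide.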
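
A minimal sketch of the wmemset() route, with a runtime check that wchar_t really is 4 bytes wide. The wrapper name fill_wide and its return convention are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* fill_wide is a hypothetical wrapper name. count is the number of wchar_t
 * elements, not bytes. Returns 0 on success, or -1 if wchar_t is not 4 bytes
 * wide on this platform (in which case nothing is written). */
static int fill_wide(wchar_t *dst, uint32_t value, size_t count)
{
    if (sizeof(wchar_t) != sizeof(uint32_t))
        return -1;

    assert(dst != NULL || count == 0);

    /* Assumes value is representable in wchar_t, as 0x11223344 is. */
    wmemset(dst, (wchar_t)value, count);
    return 0;
}
```

As with any integer store, the resulting byte order in memory depends on the machine's endianness.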

Well, a commonly seen method of doing that is to manually set up the first four bytes, and then call memcpy(ptr+4, ptr, len-4).

The idea is that this copies the first four bytes into the second four bytes, then copies the second four bytes into the third, and so on.

Note that this only "usually" works and is not guaranteed to: calling memcpy() on overlapping regions is undefined behavior in C, so the result depends on your CPU architecture and your C run-time library.
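
The same propagation can be written with well-defined behavior by spelling out the forward copy as an explicit loop, so the copy order no longer depends on how the library implements memcpy(). This is a sketch only; propagate4 is a made-up name, and it expects the first four bytes of the buffer to hold the pattern already. (memmove() would not help here: it behaves as if the source were copied to a temporary first, so the pattern would not propagate.)

```c
#include <assert.h>
#include <stddef.h>

/* propagate4 is a hypothetical name. buf[0..3] must already contain the
 * pattern; every later byte is copied from the byte 4 positions before it. */
static void propagate4(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);

    /* Strictly ascending order, so buf[i - 4] is always final when read. */
    for (i = 4; i < len; ++i)
        buf[i] = buf[i - 4];
}
```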

Using "memcpy" or "memset" may not be the most efficient method.

Don't give up on plain loops such as "for" or "while" when the library function would end up doing the same thing.

The standard C library has no such function. But memset is usually implemented as an unrolled loop to minimize branching and condition checking, and the same can be done for 32-bit values:

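#include <stddef.h>
#include <stdint.h>

/* len is the number of 32-bit words to store, not the number of bytes */
static inline void memset4(uint32_t *restrict p, uint32_t val, size_t len) {
  uint32_t *end = p + (len & ~(size_t)0x1f); // round down to the nearest multiple of 32 words
  while (p != end) { // store 32 words per iteration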
    p[ 0] = val;
    p[ 1] = val;
    p[ 2] = val;
    p[ 3] = val;
    p[ 4] = val;
    p[ 5] = val;
    p[ 6] = val;
    p[ 7] = val;
    p[ 8] = val;
    p[ 9] = val;
    p[10] = val;
    p[11] = val;
    p[12] = val;
    p[13] = val;
    p[14] = val;
    p[15] = val;
    p[16] = val;
    p[17] = val;
    p[18] = val;
    p[19] = val;
    p[20] = val;
    p[21] = val;
    p[22] = val;
    p[23] = val;
    p[24] = val;
    p[25] = val;
    p[26] = val;
    p[27] = val;
    p[28] = val;
    p[29] = val;
    p[30] = val;
    p[31] = val;
    p += 32;
  }
  end += len & 0x1f; // remainder: the last 0 to 31 words
  while (p != end) *p++ = val; // store the remaining words one at a time
}

A good compiler will likely use some CPU-specific instructions to optimize it further (e.g. SSE 128-bit stores), but even without such optimizations it should be about as fast as a library memset for large buffers, because such simple loops are bound by memory access.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow