The size of an object and its alignment are not the same thing. A struct whose size is 16 bytes, or some multiple of 16 bytes, is not necessarily 16-byte aligned.
In your case, since your code is compiled in 64-bit mode, you just need to pad the struct to 32 bytes. In 64-bit mode the stack is 16-byte aligned on Windows and Linux/Unix.
In 32-bit mode the stack does not have to be 16-byte aligned, and you can test this. If you compile the code below with MSVC in 32-bit mode, you will likely see that the addresses of the array elements are not 16-byte aligned (you might have to run it a few times). So even though the size of the struct is a multiple of 16 bytes, it is not necessarily 16-byte aligned.
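    #include <stdio.h>
    #include <stdint.h>

    int main() {
        union a {
            float data[4];
            struct {
                double x;
                double y;
                float z;
                float pad[3];
            };
        };
        a b[10];
        for (int i = 0; i < 10; i++) {
            // Print each element's address modulo 16 (0 means 16-byte aligned).
            printf("%u\n", (unsigned)((uintptr_t)&b[i] % 16));
        }
    }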
If you want your code to work in 32-bit mode as well, then you should align the memory explicitly. If you run the code below in 32-bit mode on Windows or Linux, you will see that every element is always 16-byte aligned.
    #include <stdio.h>
    #include <stdint.h>

    #ifdef _MSC_VER // If Microsoft compiler
    #define Alignd(X) __declspec(align(16)) X
    #else // Gnu compiler, etc.
    #define Alignd(X) X __attribute__((aligned(16)))
    #endif

    int main() {
        union a {
            float data[4];
            struct {
                double x;
                double y;
                float z;
                float pad[3];
            };
        };
        a Alignd(b[10]); // request 16-byte alignment for the array
        for (int i = 0; i < 10; i++) {
            // Print each element's address modulo 16 (0 means 16-byte aligned).
            printf("%u\n", (unsigned)((uintptr_t)&b[i] % 16));
        }
    }