How to determine struct used to declare the instance layout of PyObject?
13-03-2021
Question
I'm writing Python 3 extensions in C++ and I'm trying to find a way to check whether a PyObject is related to the type (struct) that defines its instance layout. I'm only interested in static-size PyObject, not PyVarObject. The instance layout is defined by a struct with a certain well-defined layout: a mandatory PyObject header and (optional) user-defined members.
Below is an example of a PyObject extension based on the well-known Noddy example in Defining New Types:
// Noddy struct specifies PyObject instance layout
struct Noddy {
    PyObject_HEAD
    int number;
};
// type object corresponding to Noddy instance layout
PyTypeObject NoddyType = {
    PyVarObject_HEAD_INIT(NULL, 0) /* Python 3 form; covers ob_size */
    "noddy.Noddy",                 /* tp_name */
    sizeof(Noddy),                 /* tp_basicsize */
    0,                             /* tp_itemsize */
    ...
    Noddy_new,                     /* tp_new */
};
It is important to notice that Noddy is a type, a compile-time entity, but NoddyType is an object present in memory at run-time. The only obvious relation between Noddy and NoddyType seems to be the value of sizeof(Noddy) stored in the tp_basicsize member.
The hand-written inheritance implemented in Python specifies rules that allow casting between PyObject and the type used to declare the instance layout of that particular PyObject:
PyObject* Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    // When a Python object is a Noddy instance,
    // its PyObject* pointer can be safely cast to Noddy*
    Noddy *self = reinterpret_cast<Noddy*>(type->tp_alloc(type, 0));
    if (self == NULL)
        return NULL; // allocation failed
    self->number = 0; // initialise Noddy members
    return reinterpret_cast<PyObject*>(self);
}
In circumstances like the various slot functions, it is safe to assume "a Python object is a Noddy" and cast without any checks. However, a cast is sometimes necessary in other situations, where it feels like a blind conversion:
void foo(PyObject* obj)
{
    // How to perform safety checks?
    Noddy* noddy = reinterpret_cast<Noddy*>(obj);
    ...
}
It is possible to check sizeof(Noddy) == Py_TYPE(obj)->tp_basicsize, but that is an insufficient solution because:
1) A user may derive from Noddy:
class BabyNoddy(Noddy):
    pass
If obj in foo points to an instance of BabyNoddy, Py_TYPE(obj)->tp_basicsize is different. But it is still safe to use reinterpret_cast<Noddy*>(obj) to get a pointer to the instance layout part.
2) There can be another struct declaring an instance layout of the same size as Noddy:
struct NeverSeenNoddy {
    PyObject_HEAD
    short word1;
    short word2;
};
In fact, at the C language level, the NeverSeenNoddy struct is compatible with the NoddyType type object - it can fit into NoddyType. So, the cast could be perfectly fine.
So, my big question is this:
Is there any Python policy which could be used to determine if a PyObject is compatible with the Noddy instance layout?
Is there any way to check if a PyObject* points to the object part which is embedded in Noddy?
If not a policy, is there any hack possible?
EDIT: There are a few questions which seem to be similar, but in my opinion they are different to the one I have asked. For example: Accessing the underlying struct of a PyObject
EDIT2: In order to understand why I marked Sven Marnach's response as the answer, see comments below that answer.
Solution
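In Python, you can check if obj is of type Noddy or a derived type by using the test isinstance(obj, Noddy). The test in the C-API whether some PyObject *obj is of type NoddyType or a derived type is basically the same: you use PyObject_IsInstance(). Its second argument is a PyObject*, so in C++ the address of the type object needs a cast:
PyObject_IsInstance(obj, reinterpret_cast<PyObject*>(&NoddyType))
The call returns 1 for a match, 0 for no match, and -1 with an exception set on failure, so compare the result against 1 rather than treating it as a plain boolean.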
As for your second question, there is no way to achieve this, and if you think you need this, your design has severe shortcomings. It would be better to derive NeverSeenNoddyType from NoddyType in the first place; then the above check will also recognize an object of the derived type as an instance of NoddyType.
Other tips
Because every object starts with PyObject_HEAD, it is always safe to access the fields defined by this header. One of the fields is ob_type (usually accessed using the Py_TYPE macro). If this points to NoddyType or any other type derived from NoddyType (which is what PyObject_IsInstance tells you), then you can assume the object's layout is that of struct Noddy.
In other words, an object is compatible with the Noddy instance layout if its Py_TYPE points to NoddyType or any of its subclasses.
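The rule above can be sketched as a checked cast. The block below is a stand-alone model rather than real extension code: MockType, MockObject, is_subtype and noddy_cast are hypothetical names that only mimic ob_type and tp_base so the example compiles without Python.h. In an actual extension the same decision would presumably be made with PyObject_TypeCheck(obj, &NoddyType) or PyObject_IsInstance.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for CPython's type object and object header.
// All Mock* names are hypothetical and exist only for this sketch.
struct MockType {
    const char*     name;
    const MockType* base;       // plays the role of tp_base
    std::size_t     basicsize;  // plays the role of tp_basicsize
};

struct MockObject {
    const MockType* type;       // plays the role of ob_type
};

struct Noddy {
    MockObject head;            // plays the role of PyObject_HEAD
    int number;
};

const MockType NoddyType{"Noddy", nullptr, sizeof(Noddy)};

// Walk the base chain: true if `t` is `target` or derives from it.
inline bool is_subtype(const MockType* t, const MockType* target) {
    for (; t != nullptr; t = t->base) {
        if (t == target) return true;
    }
    return false;
}

// Checked cast: returns nullptr instead of reinterpreting an object
// whose type is unrelated to NoddyType. The size is never consulted.
inline Noddy* noddy_cast(MockObject* obj) {
    assert(obj != nullptr);
    if (!is_subtype(obj->type, &NoddyType)) return nullptr;
    return reinterpret_cast<Noddy*>(obj);
}
```

Note that the decision depends only on the type pointer in the header, so a subclass with a larger basicsize is accepted and an unrelated type with the same basicsize is rejected.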
In the second question, the cast wouldn't be fine. The layouts of Noddy and NeverSeenNoddy are different, even though the size might be the same.
Assuming that NeverSeenNoddy is the layout of a NeverSeenNoddy_Type type, you should never cast to NeverSeenNoddy if PyObject_IsInstance(obj, &NeverSeenNoddy_Type) is false.
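To illustrate why equal size does not mean equal layout, here is a small stand-alone sketch. MockHead is a hypothetical stand-in for PyObject_HEAD, and fixed-width integer types replace int and short so that the two structs have the same size on mainstream ABIs; both choices are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct MockHead { const void* type; };  // hypothetical stand-in for PyObject_HEAD

struct Noddy {
    MockHead head;
    std::int32_t number;
};

struct NeverSeenNoddy {
    MockHead head;
    std::int16_t word1;
    std::int16_t word2;
};

// The size-only test from the question accepts the unrelated struct...
constexpr bool size_check_passes = sizeof(Noddy) == sizeof(NeverSeenNoddy);

// ...but the bytes where Noddy keeps `number` hold two shorts here.
// memcpy inspects those bytes without an aliasing violation.
inline std::int32_t bytes_at_number_offset(const NeverSeenNoddy& obj) {
    // Keep the read inside obj.
    assert(offsetof(Noddy, number) + sizeof(std::int32_t) <= sizeof(NeverSeenNoddy));
    std::int32_t value = 0;
    std::memcpy(&value,
                reinterpret_cast<const unsigned char*>(&obj) + offsetof(Noddy, number),
                sizeof value);
    return value;
}
```

With word1 = 1 and word2 = 2, the value read at the position of number is a byte-order-dependent mix of both fields, not either field, which is the sense in which the cast "wouldn't be fine".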
If you want to have two C-level types with common fields, you should derive both types from a common base that has only the common fields in its instance layout.
The subtypes should then include the base layout at the top of their layouts:
struct SubNoddy {
    // No PyObject_HEAD because it's already in Noddy
    Noddy noddy;
    int extra_field;
};
Then, if PyObject_IsInstance(obj, &SubNoddy_Type) returns true, you can cast to SubNoddy and access the extra_field field.
If PyObject_IsInstance(obj, &NoddyType) returns true, you can cast to Noddy and access the common fields.
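The reason both casts work is that the base struct sits at offset zero of the subtype. A minimal sketch of that property follows, again with a hypothetical MockHead in place of PyObject_HEAD so it compiles without Python.h. The static_assert lines and the as_noddy helper are illustrative additions, not part of the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct MockHead { const void* type; };  // hypothetical stand-in for PyObject_HEAD

struct Noddy {
    MockHead head;
    int number;
};

struct SubNoddy {
    Noddy noddy;       // base layout first, so no second header
    int extra_field;
};

// Compile-time guards: the base must be the very first member.
static_assert(std::is_standard_layout<SubNoddy>::value,
              "SubNoddy must stay a plain C-style struct");
static_assert(offsetof(SubNoddy, noddy) == 0,
              "Noddy must be at the start of SubNoddy");

// Upcast: every SubNoddy begins with a complete Noddy.
inline Noddy* as_noddy(SubNoddy* sub) {
    assert(sub != nullptr);
    return reinterpret_cast<Noddy*>(sub);
}
```

Guards like these cost nothing at run time and would catch someone later inserting a member above noddy, which would silently break every cast to Noddy.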