Pointers and Arrays
Shroud creates code to map between C and Fortran pointers. The
interoperability with C features of Fortran 2003 and the
call-by-reference feature of Fortran provide most of the features
necessary to pass arrays to C++ libraries. Shroud can also provide
additional semantic information. Adding the rank attribute declares
the argument as an assumed-shape array with the given rank: +rank(2)
creates arg(:,:). The +dimension(n) attribute instead gives an
explicit dimension: +dimension(10,20) creates arg(10,20).
Using dimension on intent(in) arguments makes the Fortran wrapper use the explicit dimension shape instead of assumed-shape. This adds some safety since many compilers will warn if the actual argument is too small. It is useful when the C++ function implicitly assumes a shape, for example when it expects a pointer to 16 elements. The Fortran wrapper will pass a pointer to contiguous memory with no explicit shape information.
When a function returns a pointer, the default behavior of Shroud
is to convert it into a Fortran variable with the POINTER attribute
using c_f_pointer. This can be made explicit by adding
+deref(pointer) to the function declaration in the YAML file.
For example, int *getData(void) +deref(pointer) creates the Fortran
function interface:
function get_data() result(rv)
    integer(C_INT), pointer :: rv
end function get_data
The result of the Fortran function directly accesses the memory returned from the C++ library.
An array can be returned by adding the dimension attribute to
the function. The dimension expression will be used to provide the
shape argument to c_f_pointer. The arguments to dimension
are C++ expressions which are evaluated after the C++ function is
called and can be the name of another argument to the function or a
call to another C++ function. As a simple example, this declaration
returns a pointer to a constant-sized array:
- decl: int *returnIntPtrToFixedArray(void) +dimension(10)
Example returnIntPtrToFixedArray shows the generated code.
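For illustration, the C++ side of such a declaration might look like the sketch below. Only the function name and the length of 10 come from the declaration above; the static storage and the fill values are assumptions made for this example.

```cpp
#include <cstddef>

// Hypothetical library function matching the declaration above.
// The array lives in static storage, so the library keeps ownership
// and a Fortran POINTER created from it aliases this memory directly.
int *returnIntPtrToFixedArray(void)
{
    static int values[10];
    for (std::size_t i = 0; i < 10; ++i) {
        values[i] = static_cast<int>(i) + 1;  // 1, 2, ..., 10
    }
    return values;
}
```

Because the length is fixed at 10, the wrapper needs no extra information from the library to build the shape argument.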
If the dimension is unknown when the function returns, a type(C_PTR)
can be returned with +deref(raw). This will allow the user
to call c_f_pointer once the shape is known.
Instead of a Fortran pointer to a scalar, a scalar can be returned
by adding +deref(scalar).
A common idiom for C++ is to return pointers to memory via arguments.
This would be declared as int **arg +intent(out). By default,
Shroud treats the argument similarly to a function which returns a
pointer: it adds the deref(pointer) attribute to treat it as a
POINTER to a scalar. The dimension attribute can be used to
create an array, as with a function result.
If the deref(allocatable) attribute is added, then a Fortran array
will be allocated to the size given by the dimension attribute and
the argument will be copied into the Fortran memory.
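The idiom of returning a pointer through an argument can be sketched in C++ as follows. The function name fetchArray, the second length argument, and the storage are hypothetical and exist only to make the example self-contained.

```cpp
namespace {
int storage[4] = {10, 20, 30, 40};  // memory owned by the "library"
}

// Hypothetical function in the int **arg +intent(out) style: the caller
// passes the address of a pointer and the function stores an address
// into it. The length comes back through a second argument.
void fetchArray(int **arg, int *n)
{
    *arg = storage;
    *n = 4;
}
```

With deref(pointer) the Fortran side would alias storage directly, while deref(allocatable) would copy the four values into newly allocated Fortran memory.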
A function which returns multiple layers of indirection will return
a type(C_PTR). This is also true for function arguments beyond
int **arg +intent(out).
This pointer can represent non-contiguous memory and Shroud
has no way to know the extent of each pointer in the array.
A special case is provided for arrays of NULL-terminated strings,
char **. While this also represents non-contiguous memory, it is a
common idiom and can be processed since the length of each string can
be found with strlen.
See example acceptCharArrayIn.
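A minimal sketch of why char ** can be processed: each entry is a separate allocation, but its length is recoverable with strlen. The function name stringLengths and the explicit entry count are assumptions of this example, not part of Shroud.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Measure each string in a char ** array. The entries need not be
// contiguous in memory; std::strlen finds each terminating NUL.
// The entry count n is passed separately (an assumption of this sketch).
std::vector<std::size_t> stringLengths(const char *const *names, std::size_t n)
{
    std::vector<std::size_t> lengths;
    lengths.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        lengths.push_back(std::strlen(names[i]));
    }
    return lengths;
}
```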
In Python wrappers, Shroud will allocate intent(out) arguments
before calling the function. This requires the dimension attribute,
for example int **arg +intent(out)+dimension(n). Its value defines
the shape of the array and must be known before the library function
is called. The argument will then be returned by the function along
with the function result and other intent(out) arguments. The
dimension attribute can include the Fortran intrinsic size to define
the shape in terms of another array.
char * functions are treated differently. By default, the deref
attribute is set to allocatable. After the C++ function
returns, a CHARACTER variable is allocated and the contents are
copied. This converts a NULL-terminated string into a Fortran
variable of the proper length.
For very long strings or strings with embedded NULL, deref(raw)
will return a type(C_PTR).
void * functions return a type(C_PTR) argument and cannot
have deref, dimension, or rank attributes.
A type(C_PTR) argument will be passed by value. For a void **
argument, the type(C_PTR) will be passed by reference (the default).
This will allow the C wrapper to assign a value to the argument.
See example passVoidStarStar.
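The difference between the two passing modes can be sketched on the C++ side. The function copyVoidPtr below is hypothetical; it only illustrates that a void * is received by value while a void ** lets the callee assign to the caller's pointer.

```cpp
// Hypothetical function: "in" arrives by value, "out" is the address
// of the caller's pointer, so the assignment is visible to the caller.
void copyVoidPtr(void *in, void **out)
{
    *out = in;
}
```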
If the C++ library function can also provide the length of the
pointer, then it is possible to return a Fortran POINTER or
ALLOCATABLE variable. This allows the caller to directly use the
returned value of the C++ function. However, there is a price: the
user will have to release the memory if owner(caller) is set. To
accomplish this with POINTER arguments, an additional argument is
added to the function which contains information about how to delete
the array. If the argument is declared Fortran ALLOCATABLE, then
the values the C++ pointer refers to are copied into a newly
allocated Fortran array. The C++ memory is deleted by the wrapper and
it is the caller's responsibility to deallocate the Fortran array.
However, Fortran will release the array automatically under some
conditions when the caller function returns. If owner(library) is
set, the Fortran caller never needs to release the memory.
See Memory Management for details of the implementation.
A void pointer may also be used in a C function when any type may be
passed in. The attribute assumedtype can be used to declare a
Fortran argument as assumed-type: type(*).
- decl: int passAssumedType(void *arg+assumedtype)
function pass_assumed_type(arg) &
        result(SHT_rv) &
        bind(C, name="passAssumedType")
    use iso_c_binding, only : C_INT, C_PTR
    implicit none
    type(*) :: arg
    integer(C_INT) :: SHT_rv
end function pass_assumed_type
Memory Management
Shroud tracks ownership of memory via the owner attribute. It uses the value of the attribute to decide when to release memory.
Use owner(library) when the library owns the memory and the user
should not release it. For example, this is used when a function
returns const std::string & for a reference to a string which is
maintained by the library. Fortran and Python will both get the
reference, copy the contents into their own variable (Fortran
CHARACTER or Python str), then return without releasing any
memory. This is the default behavior.
Use owner(caller) when the library allocates new memory which is returned to the caller. The caller is then responsible for releasing the memory. Fortran and Python can both hold on to the memory and then provide ways to release it using a C++ callback when it is no longer needed.
For shadow classes with a destructor defined, the destructor will be used to release the memory.
The c_statements may also define a way to destroy memory.
For example, std::vector provides the lines:
destructor_name: std_vector_{cxx_T}
destructor:
- std::vector<{cxx_T}> *cxx_ptr = reinterpret_cast<std::vector<{cxx_T}> *>(ptr);
- delete cxx_ptr;
Patterns can be used to provide code to free memory for a wrapped
function. The address of the memory to free will be in the variable
void *ptr, which should be referenced in the pattern:
declarations:
- decl: char * getName() +free_pattern(free_getName)
patterns:
   free_getName: |
      decref(ptr);
Without any explicit destructor_name or pattern, free will be
used to release POD pointers; otherwise, delete will be used.
C and Fortran
Fortran keeps track of C++ objects with the struct
C_capsule_data_type and the bind(C) equivalent
F_capsule_data_type. Their names in the format dictionary both
default to {C_prefix}SHROUD_capsule_data.
In the Tutorial these types are defined in typesTutorial.h as:
// helper capsule_CLA_Class1
struct s_CLA_Class1 {
    void *addr;     /* address of C++ memory */
    int idtor;      /* index of destructor */
};
typedef struct s_CLA_Class1 CLA_Class1;
And in wrapftutorial.f:
! helper capsule_data_helper
type, bind(C) :: CLA_SHROUD_capsule_data
    type(C_PTR) :: addr = C_NULL_PTR  ! address of C++ memory
    integer(C_INT) :: idtor = 0       ! index of destructor
end type CLA_SHROUD_capsule_data
addr is the address of the C or C++ variable, such as a char *
or std::string *. idtor is a Shroud generated index of the
destructor code defined by destructor_name or the free_pattern attribute.
These code segments are collected and written to the function
C_memory_dtor_function. A value of 0 indicates the memory will not
be released and is used with the owner(library) attribute.
Each class creates its own capsule struct for the C wrapper. This is to provide a measure of type safety in the C API. All Fortran classes use the same derived type since the user does not directly access the derived type.
A typical destructor function would look like:
// Release library allocated memory.
void TUT_SHROUD_memory_destructor(TUT_SHROUD_capsule_data *cap)
{
    void *ptr = cap->addr;
    switch (cap->idtor) {
    case 0:   // --none--
    {
        // Nothing to delete
        break;
    }
    case 1:   // new_string
    {
        std::string *cxx_ptr = reinterpret_cast<std::string *>(ptr);
        delete cxx_ptr;
        break;
    }
    default:
    {
        // Unexpected case in destructor
        break;
    }
    }
    cap->addr = nullptr;
    cap->idtor = 0;  // avoid deleting again
}
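The effect of resetting idtor can be seen in a small self-contained sketch. Capsule and releaseCapsule are simplified, hypothetical stand-ins for the generated struct and destructor function; the index values follow the example above.

```cpp
#include <string>

// Simplified stand-in for the generated capsule struct.
struct Capsule {
    void *addr;  // address of C++ memory
    int idtor;   // index of destructor, 0 = do not release
};

// Simplified stand-in for the generated destructor function.
void releaseCapsule(Capsule *cap)
{
    switch (cap->idtor) {
    case 0:  // owner(library): nothing to delete
        break;
    case 1:  // std::string created with new
        delete static_cast<std::string *>(cap->addr);
        break;
    default:  // unexpected index: leave the memory alone
        break;
    }
    cap->addr = nullptr;
    cap->idtor = 0;  // a second call becomes a no-op
}
```

A capsule built with idtor 1 is deleted on the first call and ignored on any later call, while a capsule built with idtor 0 never touches the memory it refers to.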
Character and Arrays
In order to create an allocatable copy of a C++ pointer, an additional
structure is involved. For example, getConstStringPtrAlloc
returns a pointer to a new string. From strings.yaml:
declarations:
- decl: const std::string * getConstStringPtrAlloc() +owner(library)
The C wrapper calls the function and saves the result along with
metadata consisting of the address of the data within the
std::string and its length. The Fortran wrapper allocates its
return value to the proper length, then copies the data from the C++
variable and deletes it.
The metadata for variables are saved in the C struct C_array_type
and the bind(C) equivalent F_array_type:
// helper array_context
struct s_STR_SHROUD_array {
    STR_SHROUD_capsule_data cxx;      /* address of C++ memory */
    union {
        const void * base;
        const char * ccharp;
    } addr;
    int type;        /* type of element */
    size_t elem_len; /* bytes-per-item or character len in c++ */
    size_t size;     /* size of data in c++ */
    int rank;        /* number of dimensions, 0=scalar */
    long shape[7];
};
typedef struct s_STR_SHROUD_array STR_SHROUD_array;
The union for addr makes some assignments easier by removing
the need for casts and also aids debugging.
The union is replaced with a single type(C_PTR) for Fortran:
! helper array_context
type, bind(C) :: STR_SHROUD_array
    ! address of C++ memory
    type(STR_SHROUD_capsule_data) :: cxx
    ! address of data in cxx
    type(C_PTR) :: base_addr = C_NULL_PTR
    ! type of element
    integer(C_INT) :: type
    ! bytes-per-item or character len of data in cxx
    integer(C_SIZE_T) :: elem_len = 0_C_SIZE_T
    ! size of data in cxx
    integer(C_SIZE_T) :: size = 0_C_SIZE_T
    ! number of dimensions
    integer(C_INT) :: rank = -1
    integer(C_LONG) :: shape(7) = 0
end type STR_SHROUD_array
The C wrapper does not return a std::string pointer. Instead it
passes in a C_array_type pointer as an argument. It calls
getConstStringPtrAlloc, then saves the results and metadata into the
argument. This allows it to be easily accessed from Fortran.
Since the attribute is owner(library), cxx.idtor is set to 0
to avoid deallocating the memory.
void STR_getConstStringPtrAlloc_bufferify(
    STR_SHROUD_array *SHT_rv_cdesc)
{
    // splicer begin function.getConstStringPtrAlloc_bufferify
    const std::string * SHCXX_rv = getConstStringPtrAlloc();
    ShroudStrToArray(SHT_rv_cdesc, SHCXX_rv, 0);
    // splicer end function.getConstStringPtrAlloc_bufferify
}
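The body of ShroudStrToArray is not shown here. As a rough sketch of the kind of bookkeeping described above (recording the address of the character data and its length), a simplified helper might look like the following. ArrayContext and stringToArray are hypothetical names, and the struct omits the capsule and shape members.

```cpp
#include <cstddef>
#include <string>

// Simplified stand-in for the array metadata struct.
struct ArrayContext {
    const char *ccharp;    // address of the character data
    std::size_t elem_len;  // character length
    std::size_t size;      // number of items
    int rank;              // number of dimensions, 0 = scalar
};

// Hypothetical helper: record where the string data lives and how long
// it is, so the caller can allocate a CHARACTER of matching length.
void stringToArray(ArrayContext *array, const std::string *src)
{
    array->ccharp = src->data();
    array->elem_len = src->size();
    array->size = 1;
    array->rank = 0;
}
```

No character data is copied at this stage; the struct only points into the std::string, which is why the string must stay alive until the Fortran wrapper has made its copy.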
The Fortran wrapper uses the metadata to allocate the return argument to the correct length:
function get_const_string_ptr_alloc() &
        result(SHT_rv)
    character(len=:), allocatable :: SHT_rv
    ! splicer begin function.get_const_string_ptr_alloc
    type(STR_SHROUD_array) :: SHT_rv_cdesc
    call c_get_const_string_ptr_alloc_bufferify(SHT_rv_cdesc)
    allocate(character(len=SHT_rv_cdesc%elem_len):: SHT_rv)
    call STR_SHROUD_copy_string_and_free(SHT_rv_cdesc, SHT_rv, &
        SHT_rv_cdesc%elem_len)
    ! splicer end function.get_const_string_ptr_alloc
end function get_const_string_ptr_alloc
Finally, the helper function SHROUD_copy_string_and_free is called
to set the value of the result and possibly free memory for
owner(caller) or intermediate values:
// helper copy_string
// Copy the char* or std::string in context into c_var.
// Called by Fortran to deal with allocatable character.
void STR_ShroudCopyStringAndFree(STR_SHROUD_array *data, char *c_var, size_t c_var_len) {
    const char *cxx_var = data->addr.ccharp;
    size_t n = c_var_len;
    if (data->elem_len < n) n = data->elem_len;
    std::strncpy(c_var, cxx_var, n);
    STR_SHROUD_memory_destructor(&data->cxx); // delete data->cxx.addr
}
Note
The three steps of call, allocate, copy could be replaced
with a single call by using the further interoperability
with C features of Fortran 2018 (a.k.a. TS 29113). This
feature allows Fortran ALLOCATABLE variables to be
allocated by C. However, not all compilers currently support
that feature. The current Shroud implementation works with
Fortran 2003.
Python
NumPy arrays control garbage collection of C++ memory by creating
a PyCapsule as the base object of NumPy objects.
Once the final reference to the NumPy array is removed, the reference
count on the PyCapsule is decremented.
When the count reaches 0, the destructor for the capsule is called and releases the C++ memory.
This technique is discussed at [blog1] and [blog2].
Old
Note
C_finalize is replaced by statement.final
Shroud generated C wrappers do not explicitly delete any memory.
However, a destructor may be automatically called for some C++ STL
classes. For example, a function which returns a std::string
will have its value copied into Fortran memory since the function’s
returned object will be destroyed when the C++ wrapper returns. If a
function returns a char * value, it will also be copied into Fortran
memory. But if the caller of the C++ function wants to transfer
ownership of the pointer to its caller, the C++ wrapper will leak the
memory.
The C_finalize variable may be used to insert code before returning from the wrapper. Use C_finalize_buf for the buffer version of wrapped functions.
For example, a function which returns a new string will have to
delete it before the C wrapper returns:
std::string * getConstStringPtrLen()
{
    std::string * rv = new std::string("getConstStringPtrLen");
    return rv;
}
Wrapped as:
- decl: const string * getConstStringPtrLen+len=30()
  format:
    C_finalize_buf: delete {cxx_var};
The C buffer version of the wrapper is:
void STR_get_const_string_ptr_len_bufferify(char * SHF_rv, int NSHF_rv)
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    if (SHCXX_rv->empty()) {
        std::memset(SHF_rv, ' ', NSHF_rv);
    } else {
        ShroudStrCopy(SHF_rv, NSHF_rv, SHCXX_rv->c_str());
    }
    {
        // C_finalize
        delete SHCXX_rv;
    }
    return;
}
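The wrapper above fills a fixed-length Fortran buffer, which is blank padded rather than NUL terminated. The implementation of ShroudStrCopy is not shown here; the sketch below is a hypothetical stand-in named copyBlankPadded that illustrates the assumed behavior of copying at most ndest characters and filling the remainder with blanks.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a blank-padding copy into a Fortran
// CHARACTER buffer of length ndest. No NUL terminator is written.
void copyBlankPadded(char *dest, int ndest, const char *src)
{
    std::size_t nsrc = std::strlen(src);
    std::size_t n = static_cast<std::size_t>(ndest);
    std::size_t ncopy = nsrc < n ? nsrc : n;  // truncate if src is too long
    std::memcpy(dest, src, ncopy);
    std::memset(dest + ncopy, ' ', n - ncopy);  // pad the tail with blanks
}
```

Because the data is copied before the wrapper returns, the std::string can be deleted in the C_finalize block without invalidating the Fortran result.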
The unbuffered version of the function cannot destroy the string since
only a pointer to the contents of the string is returned. It would
leak memory when called:
const char * STR_get_const_string_ptr_len()
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    const char * SHC_rv = SHCXX_rv->c_str();
    return SHC_rv;
}
Footnotes
[blog1] http://blog.enthought.com/python/numpy-arrays-with-pre-allocated-memory
[blog2] http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory