Pointers and Arrays
Shroud will create code to map between C and Fortran pointers. The
interoperability with C features of Fortran 2003 and the
call-by-reference feature of Fortran provide most of the features
necessary to pass arrays to C++ libraries. Shroud can also provide
additional semantic information. Adding the +rank attribute
will declare the argument as an assumed-shape array with the given
rank: +rank(2) creates arg(:,:). The +dimension(n) attribute
will instead give an explicit dimension: +dimension(10,20) creates
arg(10,20).
Using +dimension on +intent(in) arguments will use the explicit shape in the Fortran wrapper instead of assumed-shape. This adds some safety, since many compilers will warn if the actual argument is too small. It is useful when the C++ function expects a defined shape, for example a pointer to 16 elements. The Fortran wrapper will pass a pointer to contiguous memory with no explicit shape information.
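As a minimal sketch of the situation described above, consider a hypothetical library function that reads exactly 16 values through a bare pointer. The name trace4x4 is invented for illustration and is not part of Shroud. Nothing in the C++ signature records the extent; a declaration along the lines of `double trace4x4(const double *m +intent(in)+dimension(16))` is what would give the Fortran wrapper that information.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library function: reads exactly 16 contiguous values.
// The pointer type alone says nothing about the required extent.
double trace4x4(const double *m)
{
    assert(m != nullptr);
    double sum = 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        sum += m[i * 4 + i];  // diagonal of a 4x4 matrix stored contiguously
    }
    return sum;
}
```

Passing a shorter array would read past its end. With an explicit-shape dummy argument in the wrapper, many Fortran compilers can report that mistake at compile time.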
When a function returns a pointer, the default behavior of Shroud
is to convert it into a Fortran variable with the POINTER attribute
using c_f_pointer. This can be made explicit by adding
+deref(pointer) to the function declaration in the YAML file.
For example, int *getData(void) +deref(pointer) creates the Fortran
function interface
function get_data() result(rv)
    integer(C_INT), pointer :: rv
end function get_data
The result of the Fortran function directly accesses the memory returned from the C++ library.
An array can be returned by adding the +dimension attribute to
the function. The dimension expression will be used to provide the
shape argument to c_f_pointer. The arguments to +dimension
are C++ expressions which are evaluated after the C++ function is
called and can be the name of another argument to the function or a call
to another C++ function.
As a simple example, this declaration returns a pointer to a constant-sized array.
- decl: int *returnIntPtrToFixedArray(void) +dimension(10)
Example returnIntPtrToFixedArray shows the generated code.
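The ordering matters when the dimension is an expression rather than a constant. The sketch below assumes a hypothetical pair of library functions, getValues and getCount (invented names), that might be declared as `int *getValues(void) +dimension(getCount())`. The ArrayView struct and wrapGetValues are also illustrative only; they show the sequence "call first, evaluate the shape second", not the code Shroud generates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical library state and functions (names invented for illustration).
static std::vector<int> g_values;

// Number of values currently held; only meaningful after getValues() has run.
int getCount() { return static_cast<int>(g_values.size()); }

// Fills the library-owned storage and returns a pointer to it.
int *getValues()
{
    g_values.assign({10, 20, 30});
    return g_values.data();
}

// Pointer plus extent: the two things c_f_pointer needs.
struct ArrayView {
    int *addr;
    std::size_t size;
};

// Conceptual order of operations inside a wrapper.
ArrayView wrapGetValues()
{
    int *rv = getValues();  // 1. call the C++ function
    int n = getCount();     // 2. evaluate the dimension expression afterwards
    assert(n >= 0);
    return ArrayView{rv, static_cast<std::size_t>(n)};
}
```

Had getCount been evaluated before getValues in this sketch, it would have reported an empty array.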
If the dimension is unknown when the function returns, a type(C_PTR)
can be returned with +deref(raw). This will allow the user
to call c_f_pointer once the shape is known.
Instead of a Fortran pointer to a scalar, a scalar can be returned
by adding +deref(scalar).
A common idiom for C++ is to return pointers to memory via arguments.
This would be declared as int **arg +intent(out). By default,
Shroud treats the argument similar to a function which returns a
pointer: it adds the deref(pointer) attribute to treat it as a
POINTER to a scalar. The +dimension attribute can be used to
create an array similar to a function result.
If the +deref(allocatable) attribute is added, then a Fortran array
will be allocated to the size given by the +dimension attribute and the
argument will be copied into the Fortran memory.
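The idiom can be sketched from the C++ side as follows. getTable is a hypothetical library function that might be declared as `void getTable(int **arg +intent(out)+dimension(n), int *n +intent(out))`. copyTable is an illustrative stand-in for the copy semantics of +deref(allocatable), written with std::vector instead of a Fortran array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical library function: returns a pointer to library-owned data
// through `arg` and its length through `n`.
void getTable(int **arg, int *n)
{
    static int table[4] = {2, 4, 6, 8};
    *arg = table;
    *n = 4;
}

// Roughly what +deref(allocatable) implies: allocate caller-side storage of
// the +dimension size and copy the values into it.
std::vector<int> copyTable()
{
    int *ptr = nullptr;
    int n = 0;
    getTable(&ptr, &n);
    assert(ptr != nullptr && n >= 0);
    return std::vector<int>(ptr, ptr + n);
}
```

With the default deref(pointer) behavior the caller would instead keep using the library's memory directly, as the first call in a test of getTable would.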
A function which returns multiple layers of indirection uses
deref(raw) and will return a type(C_PTR). This is also true for
function arguments beyond int **arg +intent(out).
This pointer can represent non-contiguous memory and Shroud
has no way to know the extent of each pointer in the array.
The default behavior of Shroud for intent(out) and intent(inout) arguments can be modified by setting the options F_deref_arg_array, F_deref_arg_character, F_deref_arg_implied_array, and F_deref_arg_scalar. For function results the options are F_deref_func_array, F_deref_func_character, F_deref_func_implied_array, and F_deref_func_scalar.
A special case is provided for arrays of NULL terminated strings,
char **. While this also represents non-contiguous memory, it is a
common idiom and can be processed since the length of each string can
be found with strlen.
See example acceptCharArrayIn.
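The reason char ** is tractable can be shown in a few lines. The helper below is a sketch with an invented name (stringLengths), not a Shroud function; it assumes the number of strings is known and recovers each string's extent with strlen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Lengths of `n` NUL-terminated strings; each may live at an unrelated address.
std::vector<std::size_t> stringLengths(const char *const *strs, std::size_t n)
{
    std::vector<std::size_t> lens;
    lens.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(strs[i] != nullptr);
        lens.push_back(std::strlen(strs[i]));  // extent comes from the terminator
    }
    return lens;
}
```

No comparable trick exists for an int ** whose rows have unknown lengths, which is why that case falls back to type(C_PTR).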
In Python wrappers, Shroud will allocate +intent(out) arguments
before calling the function. This requires the +dimension attribute,
which defines the shape of the array and must be known before the
library function is called. For example,
int **arg +intent(out)+dimension(n). The argument will then be
returned by the function along with the function result and other
+intent(out) arguments. The +dimension attribute can
include the Fortran intrinsic size to define the shape in terms of
another array.
Function results
char * functions have several options. By default, the +deref
attribute will be set to allocatable. After the C++ function
returns, a CHARACTER variable will be allocated and the contents
copied. This converts a NULL-terminated string into a Fortran
variable of the proper length. +deref(pointer) returns a pointer
to the library’s memory.
For very long strings or strings with embedded NULL,
+deref(raw) will return a type(C_PTR). It is the caller’s
responsibility to dereference the C_PTR, typically by using the
Fortran intrinsic c_f_pointer.
The default value of the deref attribute for char * and
std::string functions is controlled by the option
F_deref_func_character.
When the function has the +funcarg attribute, the function result
will be returned in a function argument. Adding the +deref(copy)
attribute will use the type CHARACTER(*) for the argument. The C++ function
return value will be copied into the argument. This avoids any issues
with memory management since the caller provides the memory, and it works
with any version of Fortran. However, if the argument is too short the result
will be truncated.
See example getConstCharPtrAsCopyArg.
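The copy-with-truncation behavior can be sketched as below. copyWithBlankFill is a hypothetical helper written for this illustration, under the assumption that the destination is a fixed-length Fortran CHARACTER buffer that is blank padded and not NUL-terminated; it is not the helper Shroud generates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy a NUL-terminated string into a fixed-length, blank-padded buffer.
// Input longer than the buffer is silently truncated.
void copyWithBlankFill(char *dest, std::size_t ndest, const char *src)
{
    assert(dest != nullptr);
    std::size_t nsrc = (src == nullptr) ? 0 : std::strlen(src);
    std::size_t ncopy = std::min(nsrc, ndest);
    if (ncopy > 0) {
        std::memcpy(dest, src, ncopy);
    }
    std::memset(dest + ncopy, ' ', ndest - ncopy);  // pad the remainder with blanks
}
```

Because the caller supplies the buffer, nothing is allocated or released here, which is the memory-management advantage the paragraph above describes.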
void * functions return a type(C_PTR) argument and cannot
have deref, dimension, or rank attributes.
A type(C_PTR) argument will be passed by value. For a void ** argument,
the type(C_PTR) will be passed by reference (the default). This
will allow the C wrapper to assign a value to the argument.
See example passVoidStarStar.
If the C++ library function can also provide the length of the
pointer, then it is possible to return a Fortran POINTER or
ALLOCATABLE variable. This allows the caller to directly use the
returned value of the C++ function. However, there is a price; the
user will have to release the memory if owner(caller) is set. To
accomplish this with POINTER arguments, an additional argument is
added to the function which contains information about how to delete
the array. If the argument is declared Fortran ALLOCATABLE, then
the values at the C++ pointer are copied into a newly allocated Fortran
array. The C++ memory is deleted by the wrapper and it is the caller’s
responsibility to deallocate the Fortran array. However, Fortran
will release the array automatically under some conditions when the
calling function returns. If owner(library) is set, the Fortran
caller never needs to release the memory.
See Memory Management for details of the implementation.
A void pointer may also be used in a C function when any type may be
passed in. The attribute assumedtype can be used to declare a
Fortran argument as assumed-type: type(*).
- decl: int passAssumedType(void *arg+assumedtype)
function pass_assumed_type(arg) &
        result(SHT_rv) &
        bind(C, name="passAssumedType")
    use iso_c_binding, only : C_INT, C_PTR
    implicit none
    type(*) :: arg
    integer(C_INT) :: SHT_rv
end function pass_assumed_type
Memory Management
Shroud will maintain ownership of memory via the owner attribute. It uses the value of the attribute to decide when to release memory.
Use owner(library) when the library owns the memory and the user
should not release it. For example, this is used when a function
returns const std::string & for a reference to a string which is
maintained by the library. Fortran and Python will both get the
reference, copy the contents into their own variable (Fortran
CHARACTER or Python str), then return without releasing any
memory. This is the default behavior.
Use owner(caller) when the library allocates new memory which is returned to the caller. The caller is then responsible for releasing the memory. Fortran and Python can both hold on to the memory and then provide ways to release it using a C++ callback when it is no longer needed.
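The two ownership cases look like this from the library side. Both functions are hypothetical (invented names and strings); the demo function only illustrates what a caller must and must not release in each case.

```cpp
#include <cassert>
#include <string>

// owner(library): the library keeps the string alive; callers only read it.
const std::string &getLibraryName()
{
    static const std::string name("library-owned");
    return name;
}

// owner(caller): each call allocates; the caller must delete the result.
std::string *makeCallerName()
{
    return new std::string("caller-owned");
}

// Typical caller-side handling of the two cases.
void demo()
{
    std::string copy = getLibraryName();  // copy out, release nothing
    assert(copy == "library-owned");

    std::string *mine = makeCallerName();
    assert(*mine == "caller-owned");
    delete mine;                          // the caller's duty under owner(caller)
}
```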
When a library function returns a C++ object such as std::string
or std::vector by value and the Fortran wrapper is returning a
POINTER via +deref(pointer) or uses +deref(raw), the C wrapper
must allocate a new instance. In addition to the POINTER, a
capsule variable is added as an argument. The caller is responsible
for releasing the memory via call capsule%delete. Otherwise the
memory will leak. The FINAL subroutine of the capsule will be
called when it goes out of scope, so an explicit call to delete
may not be needed. If the declaration uses +deref(allocatable) or
+deref(copy), the wrapper will release the memory before returning
to the caller. At this point the returned variable is owned by Fortran
and released via DEALLOCATE or going out of scope.
For shadow classes with a destructor defined, the destructor will be used to release the memory.
The c_statements may also define a way to destroy memory.
For example, the mixin group c_mixin_destructor_new-vector
is used with std::vector and provides the lines:
destructor_name: std_vector_{cxx_T}
destructor:
- std::vector<{cxx_T}> *cxx_ptr = reinterpret_cast<std::vector<{cxx_T}> *>(ptr);
- delete cxx_ptr;
Destructor code can be defined without creating a new statement group by defining it in the destructors section of the YAML file. Then use the +destructor_name attribute in the declaration. This allows custom destructor code to be used more easily.
The address of the memory to free will be in the variable
void *ptr, which should be referenced in the pattern:
declarations:
- decl: char *getName() +destructor_name(free_getName)
destructors:
  free_getName: |
    decref(ptr);
Without any explicit destructor_name, free will be
used to release POD pointers; otherwise, delete will be used.
C and Fortran
Fortran keeps track of C++ objects with the struct
C_capsule_data_type and the bind(C) equivalent
F_capsule_data_type. In the format dictionary, both names default to
{C_prefix}SHROUD_capsule_data.
In the Tutorial these types are defined in typesTutorial.h as:
// C capsule CLA_Class1
struct s_CLA_Class1 {
    void *addr;     // address of C++ memory
    int idtor;      // index of destructor
    int cmemflags;  // memory flags
};
typedef struct s_CLA_Class1 CLA_Class1;
And wrapftutorial.f:
! helper capsule_data
type, bind(C) :: CLA_SHROUD_capsule_data
    type(C_PTR) :: addr = C_NULL_PTR  ! address of C++ memory
    integer(C_INT) :: idtor = 0       ! index of destructor
    integer(C_INT) :: cmemflags = 0   ! memory flags
end type CLA_SHROUD_capsule_data
addr is the address of the C or C++ variable, such as a char *
or std::string *. idtor is a Shroud generated index of the
destructor code defined by destructor_name in the statement group
or the destructor_name attribute.
These code segments are collected and written to the function
C_memory_dtor_function. A value of 0 indicates the memory will not
be released and is used with the owner(library) attribute.
cmemflags contains bit flags to set pointer properties.
Each class creates its own capsule struct for the C wrapper. This is to provide a measure of type safety in the C API. All Fortran classes use the same derived type since the user does not directly access the derived type.
A typical destructor function would look like:
// Release library allocated memory.
void TUT_SHROUD_memory_destructor(TUT_SHROUD_capsule_data *cap)
{
    void *ptr = cap->addr;
    switch (cap->idtor) {
    case 0:   // --none--
    {
        // Nothing to delete
        break;
    }
    case 1:   // std::string
    {
        std::string *cxx_ptr = reinterpret_cast<std::string *>(ptr);
        delete cxx_ptr;
        break;
    }
    case 2:   // new_string
    {
        std::string *cxx_ptr = reinterpret_cast<std::string *>(ptr);
        delete cxx_ptr;
        break;
    }
    default:
    {
        // Unexpected case in destructor
        break;
    }
    }
    cap->addr = nullptr;
    cap->idtor = 0;  // avoid deleting again
    cap->cmemflags = cap->cmemflags & ~SWIG_MEM_OWN;
}
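To see why the destructor resets idtor, the following reduced sketch can be exercised on its own. Capsule and capsuleDestroy are simplified stand-ins invented for this illustration (cmemflags is omitted and only one destructor index is handled); they follow the same dispatch idea as the generated function above but are not Shroud output.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the generated capsule struct (cmemflags omitted).
struct Capsule {
    void *addr;  // address of C++ memory
    int idtor;   // index of destructor, 0 = do not release
};

// Same dispatch idea as the generated destructor, reduced to one case.
void capsuleDestroy(Capsule *cap)
{
    assert(cap != nullptr);
    switch (cap->idtor) {
    case 0:   // owner(library): nothing to delete
        break;
    case 1:   // std::string allocated with new
        delete static_cast<std::string *>(cap->addr);
        break;
    default:  // unexpected index: leave the memory alone
        break;
    }
    cap->addr = nullptr;
    cap->idtor = 0;  // a second call becomes a no-op
}
```

A capsule created with idtor 1 releases its string on the first call and does nothing on the second, while a capsule with idtor 0 never touches the memory it points at.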
Character and Arrays
In order to create an allocatable copy of a C++ pointer, an additional
structure is involved. For example, getConstStringPtrAlloc
returns a pointer to a new string. From strings.yaml:
declarations:
- decl: const std::string *getConstStringPtrAlloc() +owner(library)
The C wrapper calls the function and saves the result along with
metadata consisting of the address of the data within the
std::string and its length. The Fortran wrapper allocates its
return value to the proper length, then copies the data from the C++
variable and deletes it.
The metadata for variables is saved in the C struct C_array_type
and the bind(C) equivalent F_array_type:
// helper array_context
struct s_STR_SHROUD_array {
    void *base_addr;
    int type;         /* type of element */
    size_t elem_len;  /* bytes-per-item or character len in c++ */
    size_t size;      /* size of data in c++ */
    int rank;         /* number of dimensions, 0=scalar */
    long shape[7];
};
typedef struct s_STR_SHROUD_array STR_SHROUD_array;
The union for addr makes some assignments easier by removing
the need for casts and also aids debugging.
The union is replaced with a single type(C_PTR) for Fortran:
! helper array_context
type, bind(C) :: STR_SHROUD_array
    ! address of data
    type(C_PTR) :: base_addr = C_NULL_PTR
    ! type of element
    integer(C_INT) :: type
    ! bytes-per-item or character len of data in cxx
    integer(C_SIZE_T) :: elem_len = 0_C_SIZE_T
    ! size of data in cxx
    integer(C_SIZE_T) :: size = 0_C_SIZE_T
    ! number of dimensions
    integer(C_INT) :: rank = -1
    integer(C_LONG) :: shape(7) = 0
end type STR_SHROUD_array
The C wrapper does not return a std::string pointer. Instead it
passes in a C_array_type pointer as an argument. It calls
getConstStringPtrAlloc, saves the results and metadata into the
argument. This allows it to be easily accessed from Fortran.
Since the attribute is owner(library), cxx.idtor is set to 0
to avoid deallocating the memory.
void STR_getConstStringPtrAlloc_bufferify(
    STR_SHROUD_array *SHT_rv_cdesc,
    STR_SHROUD_capsule_data *SHT_rv_capsule)
{
    // splicer begin function.getConstStringPtrAlloc_bufferify
    const std::string *SHC_rv_cxx = getConstStringPtrAlloc();
    ShroudStringToCdesc(SHT_rv_cdesc, SHC_rv_cxx);
    SHT_rv_capsule->addr = const_cast<std::string *>(SHC_rv_cxx);
    SHT_rv_capsule->idtor = 0;
    SHT_rv_capsule->cmemflags = SWIG_MEM_RVALUE;
    // splicer end function.getConstStringPtrAlloc_bufferify
}
The Fortran wrapper uses the metadata to allocate the return argument to the correct length:
function get_const_string_ptr_alloc() &
        result(SHT_rv)
    character(len=:), allocatable :: SHT_rv
    ! splicer begin function.get_const_string_ptr_alloc
    type(STR_SHROUD_array) :: SHT_rv_cdesc
    type(STR_SHROUD_capsule_data) :: SHT_rv_capsule
    call c_get_const_string_ptr_alloc_bufferify(SHT_rv_cdesc, &
        SHT_rv_capsule)
    allocate(character(len=SHT_rv_cdesc%elem_len):: SHT_rv)
    call STR_SHROUD_copy_string(SHT_rv_cdesc, SHT_rv, &
        SHT_rv_cdesc%elem_len)
    call STR_SHROUD_capsule_dtor(SHT_rv_capsule)
    ! splicer end function.get_const_string_ptr_alloc
end function get_const_string_ptr_alloc
Finally, the helper function SHROUD_copy_string_and_free is called
to set the value of the result and possibly free memory for
owner(caller) or intermediate values:
// helper copy_string
// Copy the char* or std::string in context into c_var.
// Called by Fortran to deal with allocatable character.
void STR_ShroudCopyString(STR_SHROUD_array *data, char *c_var,
    size_t c_var_len) {
    const void *cxx_var = data->base_addr;
    size_t n = c_var_len;
    if (data->elem_len < n) n = data->elem_len;
    std::memcpy(c_var, cxx_var, n);
}
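The call, allocate, copy sequence can be tried in isolation with the reduced sketch below. ArrayDesc, stringToDesc, and copyFromDesc are hypothetical stand-ins written for this illustration; they assume the descriptor only needs the data address and character length, and they are not the helpers Shroud generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Reduced form of the array descriptor shown above (type, rank, shape omitted).
struct ArrayDesc {
    const void *base_addr;  // address of the character data
    std::size_t elem_len;   // character length
    std::size_t size;       // number of elements, 1 for a scalar string
};

// Hypothetical stand-in for the helper that records a std::string's metadata.
void stringToDesc(ArrayDesc *desc, const std::string *src)
{
    assert(desc != nullptr);
    if (src == nullptr) {
        desc->base_addr = nullptr;
        desc->elem_len = 0;
    } else {
        desc->base_addr = src->data();
        desc->elem_len = src->length();
    }
    desc->size = 1;
}

// Copy step: never write more than the destination can hold.
void copyFromDesc(const ArrayDesc *desc, char *dest, std::size_t ndest)
{
    std::size_t n = (desc->elem_len < ndest) ? desc->elem_len : ndest;
    if (n > 0) {
        std::memcpy(dest, desc->base_addr, n);
    }
}
```

The caller reads elem_len between the two calls to size its destination, which corresponds to the allocate statement in the Fortran wrapper.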
Note
The three steps of call, allocate, copy could be replaced
with a single call by using the further interoperability
with C features of Fortran 2018 (a.k.a. TS 29113). This
feature allows Fortran ALLOCATABLE variables to be
allocated by C. However, not all compilers currently support
that feature. The current Shroud implementation works with
Fortran 2003.
Python
NumPy arrays control garbage collection of C++ memory by creating
a PyCapsule as the base object of NumPy objects.
Once the final reference to the NumPy array is removed, the reference
count on the PyCapsule is decremented.
When the count reaches 0, the destructor for the capsule is called and releases the C++ memory.
This technique is discussed at [blog1] and [blog2].
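The lifetime rule can be illustrated in plain C++ by analogy. The sketch below does not use the Python or NumPy C API; it swaps in std::shared_ptr with a custom deleter, where the shared pointer plays the role of the array's base object and releaseBuffer plays the role of the capsule destructor. All names are invented for the illustration.

```cpp
#include <cassert>
#include <memory>

static int g_released = 0;  // counts how many buffers were freed

// Plays the role of the capsule destructor: frees the C++ buffer.
void releaseBuffer(double *p)
{
    delete[] p;
    ++g_released;
}

// Plays the role of creating an array whose base object owns the buffer.
std::shared_ptr<double> makeOwnedBuffer(int n)
{
    assert(n > 0);
    return std::shared_ptr<double>(new double[n](), releaseBuffer);
}
```

Copying the returned pointer adds a reference, and the buffer is released only when the last copy is dropped, which mirrors the reference counting described above.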
Old
Note
C_finalize is replaced by statement.final
Shroud generated C wrappers do not explicitly delete any memory.
However a destructor may be automatically called for some C++ STL
classes. For example, a function which returns a std::string
will have its value copied into Fortran memory since the function’s
returned object will be destroyed when the C++ wrapper returns. If a
function returns a char * value, it will also be copied into Fortran
memory. But if the caller of the C++ function wants to transfer
ownership of the pointer to its caller, the C++ wrapper will leak the
memory.
The C_finalize variable may be used to insert code before returning from the wrapper. Use C_finalize_buf for the buffer version of wrapped functions.
For example, a function which returns a new string will have to
delete it before the C wrapper returns:
std::string * getConstStringPtrLen()
{
    std::string * rv = new std::string("getConstStringPtrLen");
    return rv;
}
Wrapped as:
- decl: const string * getConstStringPtrLen+len=30()
  format:
    C_finalize_buf: delete {cxx_var};
The C buffer version of the wrapper is:
void STR_get_const_string_ptr_len_bufferify(char * SHF_rv, int NSHF_rv)
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    if (SHCXX_rv->empty()) {
        std::memset(SHF_rv, ' ', NSHF_rv);
    } else {
        ShroudStrCopy(SHF_rv, NSHF_rv, SHCXX_rv->c_str());
    }
    {
        // C_finalize
        delete SHCXX_rv;
    }
    return;
}
The unbuffer version of the function cannot destroy the string since
only a pointer to the contents of the string is returned. It would
leak memory when called:
const char * STR_get_const_string_ptr_len()
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    const char * SHC_rv = SHCXX_rv->c_str();
    return SHC_rv;
}
Footnotes