Types¶
A typemap is created for each type to describe to Shroud how it should
convert a type between languages for each wrapper. Native types are
predefined and a Shroud typemap is created for each struct and class
declaration.
The general form is:
declarations:
- type: type-name
  fields:
    field1:
    field2:
type-name is the name used by C++. Some fields are used by all wrappers; others are used only by language-specific wrappers.
type fields¶
These fields are common to all wrapper languages.
base¶
The base type of type-name. This is used to generalize operations for several types. The base types that Shroud uses are string, vector, or shadow.
cpp_if¶
A C preprocessor test which is used to conditionally use other fields of the type such as c_header and cxx_header:
- type: MPI_Comm
  fields:
    cpp_if: ifdef USE_MPI
flat_name¶
A flattened version of cxx_type which allows the name to be
used as a legal identifier in C, Fortran and Python.
By default any scope separators are converted to underscores,
e.g. internal::Worker becomes internal_Worker.
Embedded blanks are converted to underscores,
e.g. unsigned int becomes unsigned_int.
Template arguments are converted to underscores with the trailing
> being dropped, e.g. std::vector<int> becomes std_vector_int.
One use of this name is as the function_suffix for templated functions.
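The flattening rules above can be sketched in a few lines of C++. This is only an illustration of the three rules as stated here; the function name flatten_name is hypothetical and this is not Shroud's own implementation. It handles only the cases described in this section.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Shroud): flatten a C++ type name into
// a legal identifier using the rules described in the text.
std::string flatten_name(const std::string &cxx_type)
{
    std::string out;
    for (std::size_t i = 0; i < cxx_type.size(); ++i) {
        const char ch = cxx_type[i];
        if (ch == ':' && i + 1 < cxx_type.size() && cxx_type[i + 1] == ':') {
            out += '_';   // scope separator "::" becomes one underscore
            ++i;          // skip the second ':'
        } else if (ch == ' ' || ch == '<') {
            out += '_';   // embedded blank or template opener
        } else if (ch == '>') {
            // the trailing '>' is dropped
        } else {
            out += ch;
        }
    }
    return out;
}
```

With this sketch, flatten_name("std::vector<int>") yields std_vector_int, matching the example in the text.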
idtor¶
Index of the capsule_data destructor in the function
C_memory_dtor_function.
This value is computed by Shroud and should not be set.
It can be used when formatting statements as {idtor}.
Defaults to 0, indicating no destructor.
result_as_arg¶
Override fields when result should be treated as an argument. Defaults to None.
Statements¶
Each language also provides a section that is used to insert language specific statements into the wrapper. These are named c_statements, f_statements, and py_statements.
They are broken down into several levels. The first is the intent of the argument. result is used as the intent for function results.
- intent_in
- Code to add for an argument with intent(IN). Can be used to convert types or implement copy-in semantics. For example, char * to std::string.
- intent_out
- Code to add after the call when intent(OUT). Used to implement copy-out semantics.
- intent_inout
- Code to add after the call when intent(INOUT). Used to implement copy-out semantics.
- result
- Result of the function, including when it is passed as an argument (F_string_result_as_arg).
Each intent is then broken down into code to be added into specific sections of the wrapper. For example, declaration, pre_call and post_call.
Each statement is formatted using the format dictionary for the argument. This will define several variables.
- c_var
- The C name of the argument.
- cxx_var
- Name of the C++ variable.
- f_var
- Fortran variable name for argument.
For example:
f_statements:
  intent_in:
  - '{c_var} = {f_var} ! coerce to C_BOOL'
  intent_out:
  - '{f_var} = {c_var} ! coerce to logical'
Note that the code lines are quoted since they begin with a curly brace. Otherwise YAML would interpret them as a dictionary.
See the language specific sections for details.
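As a rough illustration of how a statement is expanded from the format dictionary, the following C++ sketch replaces each {name} field with its value. The function name expand_statement and the choice to leave unknown names untouched are assumptions made for this example; Shroud's own formatting may behave differently.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical illustration (not Shroud's code): expand {name} fields in
// a statement using a format dictionary. Unknown names are left as-is.
std::string expand_statement(const std::string &stmt,
                             const std::map<std::string, std::string> &fmt)
{
    std::string out;
    std::size_t pos = 0;
    while (pos < stmt.size()) {
        const std::size_t open = stmt.find('{', pos);
        if (open == std::string::npos) {
            out += stmt.substr(pos);          // no more fields
            break;
        }
        const std::size_t close = stmt.find('}', open);
        if (close == std::string::npos) {
            out += stmt.substr(pos);          // unbalanced brace: copy verbatim
            break;
        }
        out += stmt.substr(pos, open - pos);  // literal text before the field
        const std::string key = stmt.substr(open + 1, close - open - 1);
        const auto it = fmt.find(key);
        if (it != fmt.end()) {
            out += it->second;                // substitute the value
        } else {
            out += stmt.substr(open, close - open + 1);
        }
        pos = close + 1;
    }
    return out;
}
```

For example, with c_var set to SH_flag and f_var set to flag (both values chosen for illustration), the statement '{c_var} = {f_var}' expands to SH_flag = flag.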
Numeric Types¶
The numeric types usually require no conversion. In this case the type map is mainly used to generate declaration code for wrappers:
type: int
fields:
  c_type: int
  cxx_type: int
  f_type: integer(C_INT)
  f_kind: C_INT
  f_module:
    iso_c_binding:
    - C_INT
  f_cast: int({f_var}, C_INT)
One case where a conversion is required is when the Fortran argument
is one type and the C++ argument is another. This may happen when an
overloaded function is generated so that a C_INT or C_LONG
argument may be passed to a C++ function expecting a long.
The f_cast field is used to convert the argument to the
type expected by the C++ function.
Bool¶
The first thing to notice is that f_c_type is defined. This is
the type used in the Fortran interface for the C wrapper. The type
is logical(C_BOOL) while f_type, the type of the Fortran
wrapper argument, is logical.
The f_statements section describes code to add into the Fortran
wrapper to perform the conversion. c_var and f_var default to
the same value as the argument name. By setting c_local_var, a
local variable is generated for the call to the C wrapper. It will be
named SH_{f_var}.
There is no Fortran intrinsic function to convert between default
logical and logical(C_BOOL). The pre_call and
post_call sections will insert an assignment statement to allow
the compiler to do the conversion.
If a function returns a bool result then a wrapper is always needed
to convert the result. The result section sets need_wrapper
to force the wrapper to be created. By default a function with no
arguments would not need a wrapper since there would be no pre_call
or post_call code blocks. Only the C interface would be required
since Fortran could call the C function directly.
See example checkBool.
Char¶
Any C++ function which has char or std::string arguments or
result will create an additional C function which includes additional
arguments for the length of the strings. Most Fortran compilers use
this convention when passing CHARACTER arguments. Shroud makes
this convention explicit for three reasons:
- It allows an interface to be used. Functions with an interface will not pass the hidden, non-standard length argument, depending on compiler.
- It may pass the result of len and/or len_trim. The convention just passes the length.
- Returning a character argument from C to Fortran is non-portable.
Arguments with the intent(in) annotation are given the len_trim annotation. The assumption is that the trailing blanks are not part of the data but only padding. Return values and intent(out) arguments add a len annotation with the assumption that the wrapper will copy the result and blank fill the argument, so it needs to know the declared length.
The additional function will be named the same as the original function with the option C_bufferify_suffix appended to the end. The Fortran wrapper will use the original function name, but call the C function which accepts the length arguments.
The character type maps use the c_statements section to define code which will be inserted into the C wrapper. intent_in, intent_out, and result subsections add actions for the C wrapper. intent_in_buf, intent_out_buf, and result_buf are used for arguments with the len and len_trim annotations in the additional C wrapper.
There are occasions when the bufferify wrapper is not needed. For
example, when using char * to pass a large buffer. It is better
to just pass the address of the argument instead of creating a copy
and appending a NULL. The F_create_bufferify_function option
can be set to false to turn off this feature.
Char¶
Ndest is the declared length of argument dest and Lsrc is
the trimmed length of argument src. These generated names must
not conflict with any other arguments. There are two ways to set the
names. The first is by using the options C_var_len_template and
C_var_trim_template. These can be used to control how the names are
generated for all functions if set globally or for just a single function
if set in the function’s options. The other is by explicitly setting
the len and len_trim annotations, which only affect a single
declaration.
The pre_call code creates space for the C strings by allocating
buffers with space for an additional character (the NULL). The
intent(in) string copies the data and adds an explicit terminating
NULL. The function is called, then the post_call section copies
the result back into the dest argument and deletes the scratch
space. ShroudStrCopy is a function provided by Shroud which
copies characters into the destination up to Ndest characters, then
blank fills any remaining space.
MPI_Comm¶
MPI_Comm is provided by Shroud and serves as an example of how to wrap a non-native type. MPI provides a Fortran interface and the ability to convert MPI_Comm between Fortran and C. The type map tells Shroud how to use these routines:
type: MPI_Comm
fields:
  cxx_type: MPI_Comm
  c_header: mpi.h
  c_type: MPI_Fint
  f_type: integer
  f_kind: C_INT
  f_c_type: integer(C_INT)
  f_c_module:
    iso_c_binding:
    - C_INT
  cxx_to_c: MPI_Comm_c2f({cxx_var})
  c_to_cxx: MPI_Comm_f2c({c_var})
This mapping makes the assumption that integer and integer(C_INT)
are the same type.
Templates¶
Shroud will wrap templated classes and functions for explicit instantiations.
The template is given as part of the decl and the instantiations are
listed in the cxx_template section:
- decl: |
    template<typename ArgType>
    void Function7(ArgType arg)
  cxx_template:
  - instantiation: <int>
  - instantiation: <double>
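To make the idea of wrapping explicit instantiations concrete, here is a minimal C++ sketch. C functions cannot be overloaded or templated, so each instantiation needs its own plain function with a distinct name. The wrapper names function7_int and function7_double, the body of Function7, and the last_call variable are all assumptions for illustration and do not show the names Shroud actually generates.

```cpp
#include <string>

// Records the most recent call so the wrappers can be exercised.
static std::string last_call;

// Hypothetical body for the function template from the declaration.
template<typename ArgType>
void Function7(ArgType arg)
{
    last_call = std::to_string(arg);
}

// One C-callable wrapper per explicit instantiation. Each wrapper gets a
// suffix derived from the template argument so the names do not collide.
extern "C" void function7_int(int arg)
{
    Function7<int>(arg);
}

extern "C" void function7_double(double arg)
{
    Function7<double>(arg);
}
```

A Fortran generic interface can then map both wrappers back onto a single name, which is one reason a per-instantiation suffix is useful.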
options and format may be provided to control the generated code:
- decl: template<typename T> class vector
  cxx_header: <vector>
  cxx_template:
  - instantiation: <int>
    format:
      C_impl_filename: wrapvectorforint.cpp
    options:
      optblah: two
  - instantiation: <double>
For a class template, the class_name is modified to include the
instantiation type. If only a single template parameter is provided,
then the template argument is used. For the above example,
C_impl_filename will default to wrapvector_int.cpp but has been
explicitly changed to wrapvectorforint.cpp.
Memory Management¶
Shroud will maintain ownership of memory via the owner attribute. It uses the value of the attribute to decide when to release memory.
Use owner(library) when the library owns the memory and the user
should not release it. For example, this is used when a function
returns const std::string & for a reference to a string which is
maintained by the library. Fortran and Python will both get the
reference, copy the contents into their own variable (Fortran
CHARACTER or Python str), then return without releasing any
memory. This is the default behavior.
Use owner(caller) when the library allocates new memory which is returned to the caller. The caller is then responsible for releasing the memory. Fortran and Python can both hold on to the memory and then provide ways to release it using a C++ callback when it is no longer needed.
For shadow classes with a destructor defined, the destructor will be used to release the memory.
The c_statements may also define a way to destroy memory.
For example, std::vector provides the lines:
destructor_name: std_vector_{cxx_T}
destructor:
- std::vector<{cxx_T}> *cxx_ptr = reinterpret_cast<std::vector<{cxx_T}> *>(ptr);
- delete cxx_ptr;
Patterns can be used to provide code to free memory for a wrapped
function. The address of the memory to free will be in the variable
void *ptr, which should be referenced in the pattern:
declarations:
- decl: char * getName() +free_pattern(free_getName)
patterns:
  free_getName: |
    decref(ptr);
Without any explicit destructor_name or pattern, free will be
used to release POD pointers; otherwise, delete will be used.
C and Fortran¶
Fortran keeps track of C++ objects with the struct
C_capsule_data_type and the bind(C) equivalent
F_capsule_data_type. Their names default to
{C_prefix}SHROUD_capsule_data and SHROUD_{F_name_scope}capsule.
In the Tutorial these types are defined in typesTutorial.h as:
And wrapftutorial.f:
addr is the address of the C or C++ variable, such as a char *
or std::string *. idtor is a Shroud generated index of the
destructor code defined by destructor_name or the free_pattern attribute.
These code segments are collected and written to function
C_memory_dtor_function. A value of 0 indicates the memory will not
be released and is used with the owner(library) attribute. A
typical function would look like:
// Release library allocated memory.
void TUT_SHROUD_memory_destructor(TUT_SHROUD_capsule_data *cap)
{
    void *ptr = cap->addr;
    switch (cap->idtor) {
    case 0:   // --none--
    {
        // Nothing to delete
        break;
    }
    case 1:   // new_string
    {
        std::string *cxx_ptr = reinterpret_cast<std::string *>(ptr);
        delete cxx_ptr;
        break;
    }
    default:
    {
        // Unexpected case in destructor
        break;
    }
    }
    cap->addr = NULL;
    cap->idtor = 0;  // avoid deleting again
}
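The following self-contained sketch shows how such a destructor might be exercised, and why resetting idtor makes a second call harmless. The names capsule_data and memory_destructor are hypothetical stand-ins for the generated type and function; the field layout is assumed from the description of addr and idtor.

```cpp
#include <string>

// Hypothetical capsule with the two fields described in the text.
struct capsule_data {
    void *addr;   // address of the C++ object
    int idtor;    // index of the destructor to use; 0 means do not release
};

// Sketch of a destructor dispatch in the style of the generated function.
void memory_destructor(capsule_data *cap)
{
    void *ptr = cap->addr;
    switch (cap->idtor) {
    case 0:   // owner(library): nothing to delete
        break;
    case 1:   // a std::string allocated with new
        delete static_cast<std::string *>(ptr);
        break;
    default:  // unexpected index: do nothing
        break;
    }
    cap->addr = nullptr;
    cap->idtor = 0;   // a second call becomes a no-op
}
```

A capsule created with idtor set to 1 is released on the first call; afterwards addr is null and idtor is 0, so calling the function again does not delete anything. A capsule with idtor set to 0 leaves the library-owned object untouched.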
Character and Arrays¶
In order to create an allocatable copy of a C++ pointer, an additional
structure is involved. For example, getConstStringPtrAlloc
returns a pointer to a new string. From strings.yaml:
declarations:
- decl: const std::string * getConstStringPtrAlloc() +owner(library)
The C wrapper calls the function and saves the result along with
metadata consisting of the address of the data within the
std::string and its length. The Fortran wrapper allocates its
return value to the proper length, then copies the data from the C++
variable and deletes it.
The metadata for variables are saved in the C struct C_array_type
and the bind(C) equivalent F_array_type:
struct s_STR_SHROUD_array {
    STR_SHROUD_capsule_data cxx;      /* address of C++ memory */
    union {
        const void * base;
        const char * ccharp;
    } addr;
    int type;        /* type of element */
    size_t elem_len; /* bytes-per-item or character len in c++ */
    size_t size;     /* size of data in c++ */
};
typedef struct s_STR_SHROUD_array STR_SHROUD_array;
The union for addr makes some assignments easier and also aids debugging.
The union is replaced with a single type(C_PTR) for Fortran:
type, bind(C) :: SHROUD_array
    ! address of C++ memory
    type(SHROUD_capsule_data) :: cxx
    ! address of data in cxx
    type(C_PTR) :: base_addr = C_NULL_PTR
    ! type of element
    integer(C_INT) :: type
    ! bytes-per-item or character len of data in cxx
    integer(C_SIZE_T) :: elem_len = 0_C_SIZE_T
    ! size of data in cxx
    integer(C_SIZE_T) :: size = 0_C_SIZE_T
end type SHROUD_array
The C wrapper does not return a std::string pointer. Instead it
takes a C_array_type pointer as an argument. It calls
getConstStringPtrAlloc, then saves the result and metadata into the
argument. This allows it to be easily accessed from Fortran.
Since the attribute is owner(library), cxx.idtor is set to 0
to avoid deallocating the memory.
void STR_get_const_string_ptr_alloc_bufferify(STR_SHROUD_array *DSHF_rv)
{
    // splicer begin function.get_const_string_ptr_alloc_bufferify
    const std::string * SHCXX_rv = getConstStringPtrAlloc();
    ShroudStrToArray(DSHF_rv, SHCXX_rv, 0);
    return;
    // splicer end function.get_const_string_ptr_alloc_bufferify
}
The Fortran wrapper uses the metadata to allocate the return argument to the correct length:
function get_const_string_ptr_alloc() &
        result(SHT_rv)
    type(SHROUD_array) :: DSHF_rv
    character(len=:), allocatable :: SHT_rv
    ! splicer begin function.get_const_string_ptr_alloc
    call c_get_const_string_ptr_alloc_bufferify(DSHF_rv)
    allocate(character(len=DSHF_rv%elem_len):: SHT_rv)
    call SHROUD_copy_string_and_free(DSHF_rv, SHT_rv, DSHF_rv%elem_len)
    ! splicer end function.get_const_string_ptr_alloc
end function get_const_string_ptr_alloc
Finally, the helper function SHROUD_copy_string_and_free is called
to set the value of the result and possibly free memory for
owner(caller) or intermediate values:
// Copy the char* or std::string in context into c_var.
// Called by Fortran to deal with allocatable character.
void STR_ShroudCopyStringAndFree(STR_SHROUD_array *data, char *c_var, size_t c_var_len) {
    const char *cxx_var = data->addr.ccharp;
    size_t n = c_var_len;
    if (data->elem_len < n) n = data->elem_len;
    std::strncpy(c_var, cxx_var, n);
    STR_SHROUD_memory_destructor(&data->cxx); // delete data->cxx.addr
}
Note
The three steps of call, allocate, copy could be replaced
with a single call by using the further interoperability
with C features of Fortran 2018 (introduced in TS 29113). This
feature allows Fortran ALLOCATABLE variables to be
allocated by C. However, not all compilers currently support
that feature. The current Shroud implementation works with
Fortran 2003.
Python¶
NumPy arrays control garbage collection of C++ memory by creating
a PyCapsule as the base object of NumPy objects.
Once the final reference to the NumPy array is removed, the reference
count on the PyCapsule is decremented.
When it reaches 0, the destructor for the capsule is called and releases the C++ memory.
This technique is discussed at [blog1] and [blog2].
Old¶
Note
C_finalize is replaced by statement.final
Shroud generated C wrappers do not explicitly delete any memory.
However a destructor may be automatically called for some C++ STL
classes. For example, a function which returns a std::string
will have its value copied into Fortran memory since the function’s
returned object will be destroyed when the C++ wrapper returns. If a
function returns a char * value, it will also be copied into Fortran
memory. But if the caller of the C++ function wants to transfer
ownership of the pointer to its caller, the C++ wrapper will leak the
memory.
The C_finalize variable may be used to insert code before returning from the wrapper. Use C_finalize_buf for the buffer version of wrapped functions.
For example, a function which returns a new string will have to
delete it before the C wrapper returns:
std::string * getConstStringPtrLen()
{
    std::string * rv = new std::string("getConstStringPtrLen");
    return rv;
}
Wrapped as:
- decl: const string * getConstStringPtrLen+len=30()
  format:
    C_finalize_buf: delete {cxx_var};
The C buffer version of the wrapper is:
void STR_get_const_string_ptr_len_bufferify(char * SHF_rv, int NSHF_rv)
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    if (SHCXX_rv->empty()) {
        std::memset(SHF_rv, ' ', NSHF_rv);
    } else {
        ShroudStrCopy(SHF_rv, NSHF_rv, SHCXX_rv->c_str());
    }
    {
        // C_finalize
        delete SHCXX_rv;
    }
    return;
}
The unbuffered version of the function cannot destroy the string since
only a pointer to the contents of the string is returned. It would
leak memory when called:
const char * STR_get_const_string_ptr_len()
{
    const std::string * SHCXX_rv = getConstStringPtrLen();
    const char * SHC_rv = SHCXX_rv->c_str();
    return SHC_rv;
}
Note
Reference counting and garbage collection are still a work in progress
Footnotes
[blog1] http://blog.enthought.com/python/numpy-arrays-with-pre-allocated-memory
[blog2] http://blog.enthought.com/python/numpy/simplified-creation-of-numpy-arrays-from-pre-allocated-memory