5.6  Interface to external functions and variables

5.6.1  Accessing external objects

[syntax] (define-foreign-type NAME TYPE [ARGCONVERT [RETCONVERT]])
Defines an alias for TYPE. TYPE may be a type-specifier or a string naming a C type. The namespace of foreign type specifiers is separate from the normal Scheme namespace. The optional arguments ARGCONVERT and RETCONVERT should evaluate to procedures that map argument- and result-values to a value that can be transformed to TYPE:

(declare (uses extras))

(define-foreign-type char-vector 
  nonnull-c-string
  (compose list->string vector->list)
  (compose list->vector string->list) )

(define strlen
  (foreign-lambda int "strlen" char-vector) )

(strlen '#(#\a #\b #\c))                    ==> 3

(define memset
  (foreign-lambda char-vector "memset" char-vector char int) )

(memset '#(#\_ #\_ #\_) #\X 3)               ==> #(#\X #\X #\X)

Foreign type-definitions are only visible in the compilation-unit in which they are defined, so use include to use the same definitions in multiple files.

[syntax] (define-foreign-variable NAME TYPE [STRING])
Defines a foreign variable of name NAME. STRING should be the real name of a foreign variable or parameterless macro. If STRING is not given, then the variable name NAME will be converted to a string and used instead. All references and assignments (via set!) are modified to correctly convert values between Scheme and C representation. This foreign variable can only be accessed in the current compilation unit, but the name can be lexically shadowed. Note that STRING can name an arbitrary C expression. If no assignments are performed, then STRING doesn't even have to specify an lvalue.

#>
enum { abc=3, def, ghi };
<#

(define-macro (define-foreign-enum . items)
  `(begin
     ,@(map (match-lambda 
              [(name realname) `(define-foreign-variable ,name int ,realname)]
              [name `(define-foreign-variable ,name int)] )
     items) ) )

(define-foreign-enum abc def ghi)

ghi                               ==> 5
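
The value 5 follows from ordinary C enumeration rules: an enumerator without an explicit value is one greater than its predecessor. The following is a minimal C sketch of that arithmetic. The helper enum_value is hypothetical, is used only for illustration, and is not part of the Chicken runtime:

```c
#include <string.h>

/* Same declaration as in the #> ... <# block above. */
enum { abc = 3, def, ghi };

/* Hypothetical helper (not a Chicken API): maps an enumerator name to the
   integer that a foreign variable of type int would read.
   Returns -1 for an unknown name. */
int enum_value(const char *name)
{
  if (strcmp(name, "abc") == 0) return abc;  /* explicitly 3 */
  if (strcmp(name, "def") == 0) return def;  /* 3 + 1 */
  if (strcmp(name, "ghi") == 0) return ghi;  /* 4 + 1 */
  return -1;
}
```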

[syntax] (foreign-callback-lambda RETURNTYPE NAME ARGTYPE ...)
This is similar to foreign-lambda, but also allows the called function to call Scheme functions. See 5.6.4.

[syntax] (foreign-callback-lambda* RETURNTYPE ((ARGTYPE VARIABLE) ...) STRING ...)
This is similar to foreign-lambda*, but also allows the called function to call Scheme functions. See 5.6.4.

[syntax] (foreign-lambda RETURNTYPE NAME ARGTYPE ...)
Represents a binding to an external routine. This form can be used in the position of an ordinary lambda expression. NAME specifies the name of the external procedure and should be a string or a symbol.

[syntax] (foreign-lambda* RETURNTYPE ((ARGTYPE VARIABLE) ...) STRING ...)
Similar to foreign-lambda, but instead of generating code to call an external function, the body of the C procedure is directly given in STRING ...:

(define my-strlen
  (foreign-lambda* int ((c-string str))
    "int n = 0;
     while(*(str++)) ++n;
     return(n);") )

(my-strlen "one two three")             ==> 13

For obscure technical reasons any use of the return statement should enclose the result value in parentheses. For the same reasons return without an argument is not allowed.
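
The body strings are plain C. As a rough illustration, the my-strlen body corresponds to a standalone C function like the one below. The function name my_strlen and its signature are assumptions made for this sketch; the code that the compiler actually generates around the body is not described in this section.

```c
/* Standalone equivalent of the my-strlen body shown above.
   The name and signature are assumptions for illustration only. */
int my_strlen(char *str)
{
  int n = 0;
  while (*(str++)) ++n;   /* advance until the terminating zero byte */
  return(n);              /* parenthesized, as required in foreign-lambda* */
}
```

Only the local copy of the pointer is advanced, so the caller's string is left untouched.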

5.6.2  Foreign type specifiers

Here is a list of valid foreign type specifiers:

scheme-object
An arbitrary Scheme data object (immediate or non-immediate).

bool
As argument: any value (#f is false, anything else is true). As result: anything different from 0 and the NULL-pointer is #t.

char unsigned-char
A character.

short unsigned-short
A short integer number.

int unsigned-int
A small integer number in fixnum range (at least 30 bit).

integer unsigned-integer
Either a fixnum or a flonum in the range of a (unsigned) machine ``int''.

long unsigned-long
Either a fixnum or a flonum in the range of a (unsigned) machine ``long''.

float double
A floating-point number. If an exact integer is passed as an argument, then it is automatically converted to a float.

pointer
An untyped pointer to the contents of a non-immediate Scheme object (not allowed as return type). The value #f is also allowed and is passed as a NULL pointer.

nonnull-pointer
As pointer, but guaranteed not to be #f.

c-pointer
An untyped operating-system pointer or a locative. The value #f is also allowed and is passed as a NULL pointer. If used as the type of a return value, a NULL pointer will be returned as #f.

nonnull-c-pointer
As c-pointer, but guaranteed not to be #f/NULL.

[nonnull-]byte-vector
A byte-vector object, passed as a pointer to its contents. Arguments of type byte-vector may optionally be #f, which is passed as a NULL pointer. This is not allowed as a return type.

[nonnull-]u8vector
[nonnull-]u16vector
[nonnull-]u32vector
[nonnull-]s8vector
[nonnull-]s16vector
[nonnull-]s32vector
[nonnull-]f32vector
[nonnull-]f64vector
A SRFI-4 number-vector object, passed as a pointer to its contents. Arguments of these types may optionally be #f (unless the nonnull- variant is used), which is passed as a NULL pointer. These are not allowed as return types.

c-string
A C string (zero-terminated). The value #f is also allowed and is passed as a NULL pointer. If used as the type of a return value, a NULL pointer will be returned as #f. Note that the string is copied (with a zero-byte appended) when passed as an argument to a foreign function. A return value of this type is likewise copied into garbage-collected memory.

nonnull-c-string
As c-string, but guaranteed not to be #f/NULL.

[nonnull-]c-string*
Similar to [nonnull-]c-string, but if used as a result-type, the pointer returned by the foreign code will be freed (using the C-libraries free()) after copying.

void
Specifies an undefined return value. Not allowed as argument type.

(pointer TYPE)
(c-pointer TYPE)
An operating-system pointer or a locative to an object of TYPE.

(nonnull-pointer TYPE)
(nonnull-c-pointer TYPE)
As (pointer TYPE), but guaranteed not to be #f/NULL.

(struct NAME)
A struct of the name NAME, which should be a string. Structs cannot be passed directly as arguments to foreign functions, nor can they be result values. Pointers to structs are allowed, though.

(union NAME)
A union of the name NAME, which should be a string. Unions cannot be passed directly as arguments to foreign functions, nor can they be result values. Pointers to unions are allowed, though.

(function RESULTTYPE (ARGUMENTTYPE1 ... [...]) [CALLCONV])
A function pointer. CALLCONV specifies an optional calling convention and should be a string. The meaning of this string is entirely platform dependent. The value #f is also allowed and is passed as a NULL pointer.
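
The c-string* result type described above suits C functions that hand ownership of a malloc'ed string to the caller. A minimal sketch of such a function follows; make_greeting is a hypothetical name chosen for illustration. Declared with a c-string* result, its return value would be copied and then released with free(), so a function used this way must not return a string literal or a static buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example function: returns a freshly malloc'ed string,
   or NULL if the allocation fails. The caller owns the result and is
   responsible for passing it to free(). */
char *make_greeting(const char *name)
{
  static const char prefix[] = "hello, ";
  size_t len = strlen(prefix) + strlen(name) + 1;  /* + 1 for the zero byte */
  char *buf = malloc(len);

  if (buf == NULL) return NULL;
  strcpy(buf, prefix);
  strcat(buf, name);
  return buf;
}
```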

Foreign types are mapped to C types in the following manner:

bool                                      int
[unsigned-]char                           [unsigned] char
[unsigned-]short                          [unsigned] short
[unsigned-]int                            [unsigned] int
[unsigned-]integer                        [unsigned] int
[unsigned-]long                           [unsigned] long
float                                     float
double                                    double
[nonnull-]pointer                         void *
[nonnull-]c-pointer                       void *
[nonnull-]byte-vector                     unsigned char *
[nonnull-]u8vector                        unsigned char *
[nonnull-]s8vector                        char *
[nonnull-]u16vector                       unsigned short *
[nonnull-]s16vector                       short *
[nonnull-]u32vector                       uint32_t *
[nonnull-]s32vector                       int32_t *
[nonnull-]f32vector                       float *
[nonnull-]f64vector                       double *
[nonnull-]c-string                        char *
void                                      void
([nonnull-]pointer TYPE)                  TYPE *
(struct NAME)                             struct NAME
(union NAME)                              union NAME
(function RTYPE (ATYPE ...) [CALLCONV])   [CALLCONV] RTYPE (*)(ATYPE, ...)
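
To make the last rows of the mapping concrete, here is how two of the mapped types look when written out in C. The names binary_fn, point, add, apply_binary and point_sum are hypothetical and exist only for this sketch; the type correspondences in the comments are read off the mapping above.

```c
/* (function int (int int)) corresponds to the C type int (*)(int, int). */
typedef int (*binary_fn)(int, int);

/* (pointer (struct "point")) corresponds to struct point *. */
struct point { int x; int y; };

int add(int a, int b) { return a + b; }

/* Receives a function pointer and calls through it. */
int apply_binary(binary_fn f, int a, int b)
{
  return f(a, b);
}

/* Structs cross the interface by pointer, never by value. */
int point_sum(struct point *p)
{
  return p->x + p->y;
}
```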

5.6.3  Entry points

To simplify embedding compiled Scheme code into arbitrary programs, one can define so-called ``entry points'', which provide a uniform interface and parameter conversion facilities.

[syntax] (define-entry-point INDEX ((VAR1 TYPE1) ...) (RTYPE1 ...) EXP1 EXP2 ...)
Defines a new entry-point with index INDEX which should evaluate to an exact integer. During execution of the body EXP1 EXP2 ... the variables VAR1 ... are bound to the parameters passed from the host program to the invoked entry point. The parameters passed are converted according to the foreign type specifiers TYPE1 .... The expressions should return as many values as foreign type specifiers are given in RTYPE1 .... The results are then transformed into values that can be used in the host program.

Note: if one or more of the result types RTYPE ... specify the type c-string, then the parameter types at the same positions in TYPE1 ... have to be c-strings as well, because the result strings are copied into the same area in memory. You should also take care that the passed buffer is long enough to hold the result string or unpredictable things will happen.
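
The size requirement is the usual one for C strings: the buffer must hold every character of the result plus the terminating zero byte. The sketch below illustrates only that arithmetic. result_fits and copy_result are hypothetical helpers and are not the code the runtime uses to copy results back.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `result` plus its zero terminator
   fits into a buffer of `capacity` bytes, otherwise 0. */
int result_fits(const char *result, size_t capacity)
{
  return strlen(result) + 1 <= capacity;
}

/* Hypothetical helper: copies `result` into `buffer` only when it fits.
   Returns 1 on success; returns 0 and leaves `buffer` unchanged when the
   buffer is too small. */
int copy_result(char *buffer, size_t capacity, const char *result)
{
  if (!result_fits(result, capacity)) return 0;
  memcpy(buffer, result, strlen(result) + 1);
  return 1;
}
```

For instance, a 32-byte buffer easily holds the 9-character string "good bye!", which needs 10 bytes including the terminator.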

If entry points were defined then the program will not terminate after execution of the last toplevel expression, but instead it will enter a loop that waits for the host to invoke one of the defined entry points.

The following C functions and data types are provided:

[C function] void CHICKEN_parse_command_line(int argc, char *argv[], int *heap, int *stack, int *symbols)
Parses the program's command line contained in argc and argv and returns the heap, stack and symbol-table limits given by runtime options of the form -:..., or chooses default limits. The library procedure argv can access the command line only if this function has been called by the containing application.

[C function] int CHICKEN_initialize(int heap, int stack, int symbols, void *toplevel)
Initializes the Scheme execution context and memory. heap holds the number of bytes that are to be allocated for the secondary heap. stack holds the number of bytes for the primary heap. symbols contains the size of the symbol table. Passing 0 to one or more of these parameters will select a default size. toplevel should be a pointer to the toplevel entry point procedure. You should pass C_toplevel here. In any subsequent call to CHICKEN_run or CHICKEN_invoke you can simply pass NULL. Calling this function more than once has no effect. If enough memory is available and initialization was successful, then 1 is returned, otherwise this function returns 0.

[C function] void CHICKEN_run(void **data, int *bytes, int *maxlen, void *toplevel)
Starts the Scheme program. data, bytes and maxlen contain invocation parameters in raw form. Pass NULL here. Call this function once to execute all toplevel expressions in your compiled Scheme program. If the runtime system was not initialized before, then CHICKEN_initialize is called with default sizes. toplevel is the toplevel entry-point procedure.

[C function] void CHICKEN_invoke(int index, C_parameter *params, int count, void *toplevel)
Invoke the entry point with index index. count should contain the number of parameters passed. params is a pointer to parameter data:

typedef union
{
  C_word x;           /* parameter type scheme-object */
  long i;             /* parameter type bool, [unsigned] int/short/long */
  long c;             /* parameter type [unsigned] char */
  double f;           /* parameter type float/double */
  void *p;            /* any pointer parameter type and C strings */
} C_parameter;

This function calls CHICKEN_run if it was not called at least once before.

Here is a simple example (assuming a UNIX-like environment):

% cat foo.c
#include <stdio.h>
#include <string.h>  /* memset */
#include "chicken.h"

int main(void)
{
  C_parameter p[ 3 ];
  char str[ 32 ] = "hello!";  /* We need some space for the result string! */

  memset(p, 0, sizeof(p));
  p[ 0 ].i = -99;
  p[ 1 ].p = str;
  p[ 2 ].f = 3.14;
  CHICKEN_invoke(1, p, 3, C_toplevel);
  printf("->\n%ld\n%s\n", p[ 0 ].i, (char *)p[ 1 ].p);  /* i is a long, p a void * */
  return 0;
}

% cat bar.scm
(define-entry-point 1
    ((a integer) (b c-string) (c double))
    (int c-string)
  (print (list a b c))
  (values 123 "good bye!") )

% chicken bar.scm -quiet
% gcc foo.c bar.c -o foo `chicken-config -cflags -embedded`
% foo
(-99 "hello!" 3.14)
->
123
good bye!

Note the use of -embedded. We have to compile with additional compiler options, because the host program provides the main function.

5.6.4  Callbacks

To enable an external C function to call back to Scheme, the form foreign-callback-lambda (or foreign-callback-lambda*) has to be used. This generates special code to save and restore important state information during execution of C code. There are two ways of calling Scheme procedures from C: the first is to invoke the runtime function C_callback with the closure to be called and the number of arguments. The second is to define an externally visible wrapper function around a Scheme procedure with the define-external or foreign-callback-wrapper forms.

Note: the names of all functions, variables and macros exported by the Chicken runtime system start with ``C_''. It is advisable to use a different naming scheme for your own code to avoid name clashes.

[syntax] (define-external [QUALIFIERS] (NAME (ARGUMENTTYPE1 VARIABLE1) ...) RETURNTYPE BODY ...)
[syntax] (define-external NAME TYPE [INIT])
The first form defines an externally callable Scheme procedure. NAME should be a symbol, which, when converted to a string, represents a legal C identifier. ARGUMENTTYPE1 ... and RETURNTYPE are foreign type specifiers for the argument variables VARIABLE1 ... and the result, respectively. QUALIFIERS is an optional qualifier for the foreign procedure definition, like __stdcall.

(define-external (foo (c-string x)) int (string-length x))

is equivalent to

(define foo 
  (foreign-callback-wrapper int "foo" 
    (c-string) (lambda (x) (string-length x))))

The second form of define-external can be used to define variables that are accessible from foreign code. It declares a global variable named by the symbol NAME that has the type TYPE. INIT can be an arbitrary expression that is used to initialize the variable. NAME is accessible from Scheme just like any other foreign variable defined by define-foreign-variable.

(define-external foo int 42)
((foreign-lambda* int ()
  "return(foo);"))           ==> 42

Note: don't be tempted to assign strings or byte-vectors to external variables. Garbage collection moves those objects around, so it is a very bad idea to assign pointers to heap data. If you have to do so, then copy the data object into statically allocated memory (for example by using evict).

[syntax] (foreign-callback-wrapper RETURNTYPE NAME [QUALIFIERS] (ARGUMENTTYPE1 ...) EXP)
Defines an externally callable wrapper around the procedure EXP. EXP must be a lambda expression of the form (lambda ...). The wrapper will have the name NAME and will have a signature as specified in the return- and argument-types given in RETURNTYPE and ARGUMENTTYPE1 .... QUALIFIERS is a qualifier string for the function definition (see define-external).

[C function] C_word C_callback(C_word closure, int argc)
This function can be used to invoke the Scheme procedure closure. argc should contain the number of arguments that are passed to the procedure on the temporary stack. Values are put onto the temporary stack with the C_save macro.

5.6.5  Locations

It is also possible to define variables containing unboxed C data, so-called locations. Note that locations may only contain simple data: anything that fits into a machine word, plus double-precision floating-point values.

[syntax] (define-location NAME TYPE [INIT])
Identical to (define-external NAME TYPE [INIT]), but the variable is not accessible from outside of the current compilation unit (it is declared static).

[syntax] (let-location ((NAME TYPE [INIT]) ...) BODY ...)
Defines a lexically bound location.

[syntax] (location NAME)
NAME should be an external variable defined by define-external or a location defined by define-location or let-location. This form returns a pointer object that contains the address of the variable NAME.

(define-external foo int)
((foreign-lambda* void (((pointer int) ip)) "*ip = 123;") 
  (location foo))
foo                          ==> 123

This facility is especially useful in situations where a C function returns more than one result value:

#>
#include <math.h>
<#

(define modf
  (foreign-lambda double "modf" double (pointer double)) )

(let-location ([i double])
  (let ([f (modf 1.99 (location i))])
    (print "i=" i ", f=" f) ) )
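
For comparison, the same call written directly in C shows what the location stands for: modf returns the fractional part and stores the integral part through its pointer argument. split_number is a hypothetical wrapper name used only in this sketch.

```c
#include <math.h>

/* Hypothetical wrapper: returns the fractional part of x and stores the
   integral part in *ip, as the C library's modf does. In the Scheme
   example above, (location i) plays the role of the ip argument. */
double split_number(double x, double *ip)
{
  return modf(x, ip);
}
```

Called with 1.99, the integral part stored through ip is 1 and the returned fraction is approximately 0.99.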

location returns a value of type c-pointer when given the name of a callback procedure defined with define-external.

5.6.6  C interface

The following functions and macros are available for C code that invokes Scheme:

[C function] void C_save(C_word x)
Saves the Scheme data object x on the temporary stack.

[C macro] C_word C_fix(int integer)
[C macro] C_word C_make_character(int char_code)
[C macro] C_word C_SCHEME_END_OF_LIST
[C macro] C_word C_SCHEME_END_OF_FILE
[C macro] C_word C_SCHEME_FALSE
[C macro] C_word C_SCHEME_TRUE
These macros return immediate Scheme data objects.

[C function] C_word C_string(C_word **ptr, int length, char *string)
[C function] C_word C_string2(C_word **ptr, char *zero_terminated_string)
[C function] C_word C_intern2(C_word **ptr, char *zero_terminated_string)
[C function] C_word C_intern3(C_word **ptr, char *zero_terminated_string, C_word initial_value)
[C function] C_word C_pair(C_word **ptr, C_word car, C_word cdr)
[C function] C_word C_flonum(C_word **ptr, double number)
[C function] C_word C_int_to_num(C_word **ptr, int integer)
[C function] C_word C_mpointer(C_word **ptr, void *pointer)
[C function] C_word C_vector(C_word **ptr, int length, ...)
[C function] C_word C_list(C_word **ptr, int length, ...)
These functions allocate memory from ptr and initialize a fresh data object. The new data object is returned. ptr should be the address of an allocation pointer created with C_alloc or C_alloc_in_heap.

[C macro] C_word *C_alloc(int words)
[C function] C_word *C_alloc_in_heap(int words)
Allocates memory from the C stack (C_alloc) or the second generation heap (C_alloc_in_heap) and returns a pointer to it. words should be the number of words needed for all data objects that are to be created in this function. Note that stack-allocated data objects have to be passed to the Scheme function, or they will not be seen by the garbage collector. This is really only usable for callback procedure invocations; do not use it in normal code, because the allocated memory will be re-used after the foreign procedure returns. When invoking Scheme callback procedures a minor garbage collection is performed, so data allocated with C_alloc will already have moved to a safe place.

[C macro] int C_SIZEOF_LIST(int length)
[C macro] int C_SIZEOF_STRING(int length)
[C macro] int C_SIZEOF_VECTOR(int length)
[C macro] int C_SIZEOF_INTERNED_SYMBOL(int length)
[C macro] int C_SIZEOF_PAIR
[C macro] int C_SIZEOF_FLONUM
[C macro] int C_SIZEOF_POINTER
[C macro] int C_SIZEOF_LOCATIVE
[C macro] int C_SIZEOF_TAGGED_POINTER
These are macros that return the size in words needed for a data object of a given type.

[C macro] int C_character_code(C_word character)
[C macro] int C_unfix(C_word fixnum)
[C macro] double C_flonum_magnitude(C_word flonum)
[C function] char *C_c_string(C_word string)
[C function] int C_num_to_int(C_word fixnum_or_flonum)
[C function] void *C_pointer_address(C_word pointer)
These macros and functions can be used to convert Scheme data objects back to C data.

[C macro] int C_header_size(C_word x)
[C macro] int C_header_bits(C_word x)
Return the number of elements and the type-bits of the non-immediate Scheme data object x.

[C macro] C_word C_block_item(C_word x, int index)
This macro can be used to access slots of the non-immediate Scheme data object x. index specifies the index of the slot to be fetched, starting at 0. Pairs have 2 slots, one for the car and one for the cdr. Vectors have one slot for each element.

[C macro] void *C_data_pointer(C_word x)
Returns a pointer to the data-section of a non-immediate Scheme object.

[C macro] C_word C_make_header(C_word bits, C_word size)
A macro to build a Scheme object header from its bits and size parts.

[C function] C_word C_mutate(C_word *slot, C_word val)
Assign the Scheme value val to the location specified by slot. If the value points to data inside the nursery (the first heap-generation), then the garbage collector will remember to handle the data appropriately. Assigning nursery-pointers directly will otherwise result in lost data.

[C macro] C_word C_symbol_value(C_word symbol)
Returns the global value of the variable with the name symbol.

[C function] void C_gc_protect(C_word *ptrs[], int n)
Registers n variables at address ptrs to be garbage collection roots. The locations should not contain pointers to data allocated in the nursery; only immediate values or pointers to heap data are valid. Any assignment of potential nursery data into a root-array should be done via C_mutate(). The variables have to be initialized to sensible values before the next garbage collection starts (when in doubt, set all locations in ptrs to C_SCHEME_UNDEFINED).

[C function] void C_gc_unprotect(int n)
Removes the last n registered variables from the set of root variables.

An example:

% cat foo.scm
#>
extern int callout(int, int, int);
<#

(define callout (foreign-callback-lambda int "callout" int int int))

(define-external (callin (scheme-object xyz)) int
  (print "This is 'callin': " xyz)
  123)

(print (callout 1 2 3))

% cat bar.c
#include <stdio.h>
#include "chicken.h"

extern int callout(int, int, int);
extern int callin(C_word x);

int callout(int x, int y, int z)
{
  C_word *ptr = C_alloc(C_SIZEOF_LIST(3));
  C_word lst;

  printf("This is 'callout': %d, %d, %d\n", x, y, z);
  lst = C_list(&ptr, 3, C_fix(x), C_fix(y), C_fix(z));
  return callin(lst);  /* Note: `callin' will have GC'd the data in `ptr' */
}

% chicken foo.scm -quiet
% gcc foo.c bar.c -o foo `chicken-config -cflags -libs`
% foo
This is 'callout': 1, 2, 3
This is 'callin': (1 2 3)
123

Notes: