|
This chapter presents the Bigloo standard library. Bigloo is mostly
R5RS compliant but it proposes many extensions to this standard.
The first section (Scheme Library)
presents the Bigloo R5RS support. This section also contains various
functions that are not standard (for instance, functions used
to manage a file system). The following sections
(Structures and Records, Serialization, Bit Manipulation,
System Programming and Posix Regular Expressions)
present Bigloo-specific extensions. Bigloo input and output facilities
constitute a large superset of the standard Scheme definition. For this
reason they are presented in a separate section (Input and Output).
When the definition of a procedure or a special form is the
same in Bigloo and Scheme, we just mention its name;
otherwise, we explain it and qualify it as a ``bigloo
procedure''.
The standard boolean objects are #t and #f .
Note: the empty list is true.
not returns #t if obj is false, and returns
#f otherwise.
(not #t) => #f
(not 3) => #f
(not (list 3)) => #f
(not #f) => #t
(not '()) => #f
(not (list)) => #f
(not 'nil) => #f
|
|
boolean? obj | library procedure |
Boolean? returns #t if obj is either #t or
#f and returns #f otherwise.
(boolean? #f) => #t
(boolean? 0) => #f
(boolean? '()) => #f
|
|
5.1.2 Equivalence predicates
|
eqv? and eq? are equivalent in Bigloo.
(eq? 'a 'a) => #t
(eq? '(a) '(a)) => unspecified
(eq? (list 'a) (list 'a)) => #f
(eq? "a" "a") => unspecified
(eq? "" "") => unspecified
(eq? '() '()) => #t
(eq? 2 2) => unspecified
(eq? #\A #\A) => unspecified
(eq? car car) => #t
(let ((n (+ 2 3)))
(eq? n n)) => unspecified
(let ((x '(a)))
(eq? x x)) => #t
(let ((x '#()))
(eq? x x)) => #t
(let ((p (lambda (x) x)))
(eq? p p)) => #t
|
Since Bigloo implements eqv? as eq? , the behavior is not
always conforming to R5RS.
(eqv? 'a 'a) => #t
(eqv? 'a 'b) => #f
(eqv? 2 2) => #t
(eqv? '() '()) => #t
(eqv? 100000000 100000000) => #t
(eqv? (cons 1 2) (cons 1 2)) => #f
(eqv? (lambda () 1)
(lambda () 2)) => #f
(eqv? #f 'nil) => #f
(let ((p (lambda (x) x)))
(eqv? p p)) => unspecified
|
The following examples illustrate cases in which the above rules do
not fully specify the behavior of eqv?. All that can be said
about such cases is that the value returned by eqv? must be a
boolean.
(eqv? "" "") => unspecified
(eqv? '#() '#()) => unspecified
(eqv? (lambda (x) x)
(lambda (x) x)) => unspecified
(eqv? (lambda (x) x)
(lambda (y) y)) => unspecified
(define gen-counter
(lambda ()
(let ((n 0))
(lambda () (set! n (+ n 1)) n))))
(let ((g (gen-counter)))
(eqv? g g)) => #t
(eqv? (gen-counter) (gen-counter))
=> #f
(define gen-loser
(lambda ()
(let ((n 0))
(lambda () (set! n (+ n 1)) 27))))
(let ((g (gen-loser)))
(eqv? g g)) => #t
(eqv? (gen-loser) (gen-loser))
=> unspecified
(letrec ((f (lambda () (if (eqv? f g) 'both 'f)))
(g (lambda () (if (eqv? f g) 'both 'g))))
(eqv? f g))
=> unspecified
(letrec ((f (lambda () (if (eqv? f g) 'f 'both)))
(g (lambda () (if (eqv? f g) 'g 'both))))
(eqv? f g))
=> #f
(eqv? '(a) '(a)) => unspecified
(eqv? "a" "a") => unspecified
(eqv? '(b) (cdr '(a b))) => unspecified
(let ((x '(a)))
(eqv? x x)) => #t
|
|
equal? obj1 obj2 | library procedure |
(equal? 'a 'a) => #t
(equal? '(a) '(a)) => #t
(equal? '(a (b) c)
'(a (b) c)) => #t
(equal? "abc" "abc") => #t
(equal? 2 2) => #t
(equal? (make-vector 5 'a)
(make-vector 5 'a)) => #t
(equal? (lambda (x) x)
(lambda (y) y)) => unspecified
|
|
See r5rs, Equivalence predicates, for more details.
The form () is illegal.
pair-or-null? obj | bigloo procedure |
Returns #t if obj is either a pair or the empty list. Otherwise
it returns #f .
|
set-car! pair obj | procedure |
set-cdr! pair obj | procedure |
|
caar pair | library procedure |
cadr pair | library procedure |
...
cdddar pair | library procedure |
cddddr pair | library procedure |
|
null? obj | library procedure |
list? obj | library procedure |
list obj ... | library procedure |
length list | library procedure |
append list ... | library procedure |
append! list ... | bigloo procedure |
A destructive append.
|
reverse list | library procedure |
reverse! list | bigloo procedure |
A destructive reverse.
|
list-ref list k | library procedure |
list-tail list k | library procedure |
Returns the sublist of list obtained by omitting the
first k elements.
|
last-pair list | bigloo procedure |
Returns the last pair in the nonempty, possibly improper, list .
|
memq obj list | library procedure |
memv obj list | library procedure |
member obj list | library procedure |
assq obj alist | library procedure |
assv obj alist | library procedure |
assoc obj alist | library procedure |
remq obj list | bigloo procedure |
Returns a new list which is a copy of list with all items
eq? to obj removed from it.
|
remq! obj list | bigloo procedure |
Same as remq but in a destructive way.
|
delete obj list | bigloo procedure |
Returns a new list which is a copy of list with all items
equal? to obj deleted from it.
|
delete! obj list | bigloo procedure |
Same as delete but in a destructive way.
|
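The difference between the copying and the destructive variants can be sketched as follows. The variable name l is illustrative only, and the lists are built with list so that the destructive calls never touch literal constants.

```scheme
;; remq and delete return a copy; the original list is left untouched.
(define l (list 'a 'b 'a 'c))
(remq 'a l)                      ; => (b c)
l                                ; => (a b a c)

;; delete compares with equal?, so strings are matched by content.
(delete "b" (list "a" "b" "c"))  ; => ("a" "c")

;; The destructive variants may reuse the pairs of their argument,
;; so always use the returned value.
(set! l (remq! 'a l))
l                                ; => (b c)
```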
cons* obj ... | bigloo procedure |
Returns an object formed by consing all arguments together from right to left.
If only one obj is supplied, that obj is returned.
|
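A few calls illustrate the right-to-left consing described above. The last argument becomes the tail of the result, so the result is a proper list only when that argument is one.

```scheme
(cons* 1 2 '(3 4))   ; => (1 2 3 4)
(cons* 1 2 3)        ; => (1 2 . 3)
(cons* 1)            ; => 1
```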
every? pred clist1 clist2 ... | bigloo procedure |
Applies the predicate across the lists, returning true if the
predicate returns true on every application.
(every? < '(1 2 3) '(2 3 4)) => #t
(every? < '(1 2 3) '(2 3 0)) => #f
|
|
any? pred clist1 clist2 ... | bigloo procedure |
Applies the predicate across the lists, returning true if the
predicate returns true for at least one application.
(any? < '(1 2 3) '(2 3 4)) => #t
(any? < '(1 2 3) '(2 3 0)) => #t
|
|
every fun clist1 clist2 ... | bigloo procedure |
Applies the function fun across the lists. If every application
returns a non-false value, the result of every is the value returned
by the last application of fun . Otherwise the result is #f .
(every < '(1 2 3) '(2 3 4)) => #t
(every < '(1 2 3) '(2 3 0)) => #f
|
|
any fun clist1 clist2 ... | bigloo procedure |
Applies the function fun across the lists, returning non-false if the
function returns non-false for at least one application. If non-false,
the result of any is the first non-false value returned by fun .
(any < '(1 2 3) '(2 3 4)) => #t
(any < '(1 2 3) '(2 3 0)) => #t
|
|
make-list n [fill] | bigloo procedure |
Returns an n -element list, whose elements are all the value fill .
If the fill argument is not given, the elements of the list may be
arbitrary values.
(make-list 4 'c) => (c c c c)
|
|
list-tabulate n init-proc | bigloo procedure |
Returns an n -element list. Element i of the list, where 0 <= i <
n , is produced by (init-proc i) . No guarantee is made about the
dynamic order in which init-proc is applied to these indices.
(list-tabulate 4 values) => (0 1 2 3)
|
|
iota count [start step] | bigloo procedure |
Returns a list containing the elements
(start start+step ... start+(count-1)*step)
|
The start and step parameters default to 0 and 1 ,
respectively. This procedure takes its name from the APL primitive.
(iota 5) => (0 1 2 3 4)
(iota 5 0 -0.1) => (0 -0.1 -0.2 -0.3 -0.4)
|
|
See r5rs, Pairs and lists, for more details.
Symbols are case sensitive and the reader is case sensitive too. So:
(eq? 'foo 'FOO) => #f
(eq? (string->symbol "foo") (string->symbol "FOO")) => #f
|
Symbols may contain special characters (such as #\Newline or #\Space).
Symbols of this kind that have to be read must be written: |[^|]+| . The
function write uses that notation when it encounters symbols
containing special characters.
(write 'foo) => foo
(write 'Foo) => Foo
(write '|foo bar|) => |foo bar|
|
symbol->string symbol | procedure |
Returns the name of the symbol as a string. Modifying the string result
of symbol->string could yield incoherent programs. It is better
to copy the string before any physical update. For instance, don't write:
(string-downcase! (symbol->string 'foo))
|
but prefer:
(string-downcase (symbol->string 'foo))
|
See r5rs, Symbols, for more details.
|
string->symbol string | procedure |
string->symbol-ci string | bigloo procedure |
symbol-append symbol ... | bigloo procedure |
String->symbol returns a symbol whose name is string .
String->symbol respects the case of string .
String->symbol-ci returns a symbol whose name is
(string-upcase string ) . Symbol-append returns a
symbol whose name is the concatenation of the names of all its symbol arguments.
|
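The behaviour described above can be checked with a few calls. This is a minimal sketch that relies only on the case-sensitive reader described earlier in this section.

```scheme
(eq? (string->symbol "foo") 'foo)       ; => #t
(eq? (string->symbol "Foo") 'foo)       ; => #f, the case of the string is kept
(symbol-append 'foo 'bar)               ; => foobar
(symbol->string (symbol-append 'a 'b))  ; => "ab"
```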
gensym [obj] | bigloo procedure |
Returns a new fresh symbol. If obj is provided and is a string or
a symbol, it is used as prefix for the new symbol.
|
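As a small illustration, the sketch below checks only the properties stated above (freshness and symbol-ness). The printed names of generated symbols are not shown because they are implementation details; s1 , s2 and the prefix tmp are illustrative names.

```scheme
(define s1 (gensym))
(define s2 (gensym 'tmp))   ; tmp is used as a prefix for the new symbol

(symbol? s1)                ; => #t
(symbol? s2)                ; => #t
(eq? s1 s2)                 ; => #f
(eq? s1 (gensym))           ; => #f, each call returns a fresh symbol
```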
symbol-plist symbol-or-keyword | bigloo procedure |
Returns the property-list associated with symbol-or-keyword .
|
getprop symbol-or-keyword key | bigloo procedure |
Returns the value that has the key eq? to key from the
symbol-or-keyword 's property list. If there is no value associated
with key then #f is returned.
|
putprop! symbol-or-keyword key val | bigloo procedure |
Stores val using key on symbol-or-keyword 's property list.
|
remprop! symbol-or-keyword key | bigloo procedure |
Removes the value associated with key in the symbol-or-keyword 's
property list. The result is unspecified.
|
Here is an example of properties handling:
(getprop 'a-sym 'a-key) => #f
(putprop! 'a-sym 'a-key 24)
(getprop 'a-sym 'a-key) => 24
(putprop! 'a-sym 'a-key2 25)
(getprop 'a-sym 'a-key) => 24
(getprop 'a-sym 'a-key2) => 25
(symbol-plist 'a-sym) => (a-key2 25 a-key 24)
(remprop! 'a-sym 'a-key)
(symbol-plist 'a-sym) => (a-key2 25)
(putprop! 'a-sym 'a-key2 16)
(symbol-plist 'a-sym) => (a-key2 16)
|
Keywords constitute an extension to Scheme required by Dsssl [Dsssl96].
The keyword syntax is: <ident>: . Keywords are autoquoted and case insensitive.
keyword? obj | bigloo procedure |
keyword->string keyword | bigloo procedure |
string->keyword string | bigloo procedure |
|
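The keyword procedures can be exercised as in the following sketch. Only type predicates are checked here; the exact string produced by keyword->string is deliberately not shown, and foo: is an arbitrary keyword chosen for illustration.

```scheme
(keyword? foo:)                       ; => #t, no quote is needed
(keyword? 'foo)                       ; => #f, a symbol is not a keyword
(keyword? (string->keyword "foo"))    ; => #t
(string? (keyword->string foo:))      ; => #t
```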
Bigloo has only three kinds of numbers: fixnum, long fixnum and flonum.
Operations on complexes and rationals are not implemented but for
compatibility purposes, the functions complex? and
rational? exist. (In fact, complex? is the same as
number? and rational? is the same as real? in
Bigloo.) The binary radix is not implemented so the only accepted
prefixes are #o , #d and #x . For each generic
arithmetic procedure, Bigloo provides two specialized procedures, one
for fixnums and one for flonums. The name of each specialized
procedure is the name of the original one suffixed by fx or
fl . A fixnum has the size of a C integer minus 2 bits. A
flonum has the size of a C double .
complex? x | bigloo procedure |
rational? x | bigloo procedure |
|
fixnum? obj | bigloo procedure |
flonum? obj | bigloo procedure |
These two procedures are type checkers on
types integer and real .
|
elong? obj | bigloo procedure |
llong? obj | bigloo procedure |
The elong? procedure is a type checker for "hardware" integers, that is,
integers that have the native size of the host platform (e.g.,
32-bit or 64-bit integers). The llong? procedure is a type checker
for "hardware" long long integers.
|
make-elong int | bigloo procedure |
make-llong int | bigloo procedure |
Create an exact fixnum integer from the fixnum value int .
|
positive? z | library procedure |
negative? z | library procedure |
|
max x1 x2 ... | library procedure |
min x1 x2 ... | library procedure |
|
=fx i1 i2 | bigloo procedure |
=fl r1 r2 | bigloo procedure |
=elong r1 r2 | bigloo procedure |
=llong r1 r2 | bigloo procedure |
<fx i1 i2 | bigloo procedure |
<fl r1 r2 | bigloo procedure |
<elong r1 r2 | bigloo procedure |
<llong r1 r2 | bigloo procedure |
>fx i1 i2 | bigloo procedure |
>fl r1 r2 | bigloo procedure |
>elong r1 r2 | bigloo procedure |
>llong r1 r2 | bigloo procedure |
<=fx i1 i2 | bigloo procedure |
<=fl r1 r2 | bigloo procedure |
<=elong r1 r2 | bigloo procedure |
<=llong r1 r2 | bigloo procedure |
>=fx i1 i2 | bigloo procedure |
>=fl r1 r2 | bigloo procedure |
>=elong r1 r2 | bigloo procedure |
>=llong r1 r2 | bigloo procedure |
|
+fx i1 i2 | bigloo procedure |
+fl r1 r2 | bigloo procedure |
+elong r1 r2 | bigloo procedure |
+llong r1 r2 | bigloo procedure |
*fx i1 i2 | bigloo procedure |
*fl r1 r2 | bigloo procedure |
*elong r1 r2 | bigloo procedure |
*llong r1 r2 | bigloo procedure |
-fx i1 i2 | bigloo procedure |
-fl r1 r2 | bigloo procedure |
-elong r1 r2 | bigloo procedure |
-llong r1 r2 | bigloo procedure |
negelong r | bigloo procedure |
negllong r | bigloo procedure |
These two functions implement the unary function - .
|
/fx i1 i2 | bigloo procedure |
/fl r1 r2 | bigloo procedure |
/elong r1 r2 | bigloo procedure |
/llong r1 r2 | bigloo procedure |
|
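The specialized procedures listed above can be used as in the following sketch. Each fx procedure is applied to fixnums only and each fl procedure to flonums only; the literal values are arbitrary.

```scheme
;; Fixnum-specialized arithmetic and comparison.
(+fx 1 2)          ; => 3
(*fx 6 7)          ; => 42
(<fx 1 2)          ; => #t

;; Flonum-specialized arithmetic and comparison.
(+fl 1.5 2.5)      ; => 4.0
(*fl 2.0 3.5)      ; => 7.0
(>=fl 2.0 3.5)     ; => #f

;; The generic operators accept both kinds of arguments.
(+ 1 2.5)          ; => 3.5
```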
quotientelong z1 z2 | procedure |
quotientllong z1 z2 | procedure |
remainderelong z1 z2 | procedure |
remainderllong z1 z2 | procedure |
|
exact->inexact z | procedure |
inexact->exact z | procedure |
number->string z | procedure |
integer->string i | bigloo procedure |
integer->string i radix | bigloo procedure |
elong->string i | bigloo procedure |
elong->string i radix | bigloo procedure |
llong->string i | bigloo procedure |
llong->string i radix | bigloo procedure |
real->string z | bigloo procedure |
|
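A short sketch of the integer printing procedures follows. Radix 8 is used because the octal prefix is listed among the accepted ones above; the values are arbitrary.

```scheme
(integer->string 42)                         ; => "42"
(integer->string 64 8)                       ; => "100"

;; Reading the octal digits back with the same radix restores the value.
(string->number (integer->string 64 8) 8)    ; => 64
```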
string->number string | procedure |
string->number string radix | procedure |
string->elong string radix | procedure |
string->llong string radix | procedure |
Bigloo implements a restricted version of string->number . If
string denotes a floating point number, then the only radix that
may be passed to string->number is 10 . That is:
(string->number "1243" 16) => 4675
(string->number "1243.0" 16) -|
# *** ERROR:bigloo:string->number
# Only radix `10' is legal for floating point number -- 16
(string->elong "234456353") => #e234456353
|
In addition, string->number does not support radix encoded inside
string . That is:
(string->number "#x1243") => #f
|
|
string->integer string | bigloo procedure |
string->integer string radix | bigloo procedure |
string->real string | bigloo procedure |
fixnum->flonum i | bigloo procedure |
flonum->fixnum r | bigloo procedure |
elong->fixnum i | bigloo procedure |
fixnum->elong r | bigloo procedure |
llong->fixnum i | bigloo procedure |
fixnum->llong r | bigloo procedure |
elong->flonum i | bigloo procedure |
flonum->elong r | bigloo procedure |
llong->flonum i | bigloo procedure |
flonum->llong r | bigloo procedure |
These last procedures implement the natural translation
from and to fixnum, flonum, elong, and llong.
|
See r5rs, Numerical operations, for more details.
Bigloo knows one more named character #\tab in addition to the
#\space and #\newline of R5RS. A new alternate syntax exists for characters:
#a<ascii-code>
where <ascii-code> is the three digit decimal ascii number
of the character to be read. Thus, for instance, the character #\space
can be written #a032 .
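For illustration, the alternate syntax can be compared against the usual character notation. This sketch assumes an ASCII-compatible character set, in which code 65 is the letter A.

```scheme
(char=? #a032 #\space)     ; => #t
(char->integer #a065)      ; => 65
(char=? #a065 #\A)         ; => #t
(char-whitespace? #\tab)   ; => #t
```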
char=? char1 char2 | procedure |
char<? char1 char2 | procedure |
char>? char1 char2 | procedure |
char<=? char1 char2 | procedure |
char>=? char1 char2 | procedure |
char-ci=? char1 char2 | library procedure |
char-ci<? char1 char2 | library procedure |
char-ci>? char1 char2 | library procedure |
char-ci<=? char1 char2 | library procedure |
char-ci>=? char1 char2 | library procedure |
|
char-alphabetic? char | library procedure |
char-numeric? char | library procedure |
char-whitespace? char | library procedure |
char-upper-case? char | library procedure |
char-lower-case? char | library procedure |
|
char->integer char | procedure |
|
char-upcase char | library procedure |
char-downcase char | library procedure |
|
UCS-2 Characters are two byte encoded characters. They can be read with
the syntax:
#u<unicode>
where <unicode> is the four digit hexadecimal unicode value
of the character to be read. Thus, for instance, the character #\space
can be written #u0020 .
ucs2? obj | bigloo procedure |
|
ucs2=? ucs2a ucs2b | bigloo procedure |
ucs2<? ucs2a ucs2b | bigloo procedure |
ucs2>? ucs2a ucs2b | bigloo procedure |
ucs2<=? ucs2a ucs2b | bigloo procedure |
ucs2>=? ucs2a ucs2b | bigloo procedure |
ucs2-ci=? ucs2a ucs2b | bigloo procedure |
ucs2-ci<? ucs2a ucs2b | bigloo procedure |
ucs2-ci>? ucs2a ucs2b | bigloo procedure |
ucs2-ci<=? ucs2a ucs2b | bigloo procedure |
ucs2-ci>=? ucs2a ucs2b | bigloo procedure |
|
ucs2-alphabetic? ucs2 | bigloo procedure |
ucs2-numeric? ucs2 | bigloo procedure |
ucs2-whitespace? ucs2 | bigloo procedure |
ucs2-upper-case? ucs2 | bigloo procedure |
ucs2-lower-case? ucs2 | bigloo procedure |
|
ucs2->integer ucs2 | bigloo procedure |
integer->ucs2 i | bigloo procedure |
|
ucs2->char ucs2 | bigloo procedure |
char->ucs2 char | bigloo procedure |
|
ucs2-upcase ucs2 | bigloo procedure |
ucs2-downcase ucs2 | bigloo procedure |
|
There are three different syntaxes for strings in Bigloo: traditional,
foreign or Unicode. The traditional syntax for strings may conform to
the Revised Report, see r5rs, Lexical structure.
With the foreign syntax, C escape
sequences are interpreted as specified by ISO-C. In addition, Bigloo's
reader evaluates a \x?? sequence as a hexadecimal escape
character. For Unicode syntax, see Unicode (UCS-2) Strings. Only
the reader distinguishes between these three appearances of strings;
i.e., there is only one type of string at evaluation-time. The regular
expression describing the syntax for foreign string is:
#"([^"]|\")*" .
*bigloo-strict-r5rs-strings* | variable |
Traditional syntax conforms to the Revised Report if the variable
*bigloo-strict-r5rs-strings* is not #f . Otherwise
constant strings matching the regular expression "([^"]|\")*" are treated
as foreign strings.
For example, after reading the expression
"1\n23\t4\"5" , the string that is built is equal to
(string #\1 #\n #\2 #\3 #\t #\4 #\" #\5) if
*bigloo-strict-r5rs-strings* is not #f . It is
(string #\1 #\newline #\2 #\3 #\tab #\4 #\" #\5) otherwise.
Printing the first of these strings produces: 1n23t4"5 .
The new foreign syntax allows C escape sequences to be recognized. For
example, the expression #"1\n23\t4\"5" builds a string equal to:
(string #\1 #\newline #\2 #\3 #\tab #\4 #\" #\5)
and printing this string then produces 1 , a newline, 23 , a tab and 4"5 .
|
The library functions for string processing are:
make-string k char | procedure |
string char ... | library procedure |
|
string-length string | procedure |
string-ref string k | procedure |
string-set! string k char | procedure |
|
string=? string1 string2 | library procedure |
substring=? string1 string2 len | bigloo procedure |
This function returns #t if string1 and string2 have a
common prefix of size len .
(substring=? "abcdef" "ab9989898" 2)
=> #t
(substring=? "abcdef" "ab9989898" 3)
=> #f
|
|
substring-at? string1 string2 offset | bigloo procedure |
This function returns #t if string2 is at position offset
in the string string1 .
(substring-at? "abcdefghij" "def" 3)
=> #t
(substring-at? "abcdefghij" "def" 2)
=> #f
|
|
string-ci=? string1 string2 | library procedure |
string<? string1 string2 | library procedure |
string>? string1 string2 | library procedure |
string<=? string1 string2 | library procedure |
string>=? string1 string2 | library procedure |
string-ci<? string1 string2 | library procedure |
string-ci>? string1 string2 | library procedure |
string-ci<=? string1 string2 | library procedure |
string-ci>=? string1 string2 | library procedure |
|
string-compare3 string1 string2 | bigloo procedure |
string-compare3-ci string1 string2 | bigloo procedure |
These functions compare string1 and string2 . They return
a negative integer if string1 < string2 , zero if string1
equals string2 , and a positive integer if string1 > string2 .
|
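Because only the sign of the result is specified, a caller would typically test it with negative? , zero? or positive? , as in this sketch. The last line assumes that string-compare3-ci ignores case, as its name suggests; the text above does not state this explicitly.

```scheme
(negative? (string-compare3 "abc" "abd"))   ; => #t
(zero?     (string-compare3 "abc" "abc"))   ; => #t
(positive? (string-compare3 "b" "a"))       ; => #t

;; Assumption: the -ci variant compares without regard to case.
(zero? (string-compare3-ci "ABC" "abc"))
```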
substring string start end | library procedure |
string must be a string, and start and end must be
exact integers satisfying:
0 <= START <= END <= (string-length STRING)
|
substring returns a newly allocated string formed from the
characters of STRING beginning with index START (inclusive)
and ending with index END (exclusive).
(substring "abcdef" 0 5)
=> "abcde"
(substring "abcdef" 1 5)
=> "bcde"
|
|
string-shrink! string end | library procedure |
string must be a string, and end must be
an exact integer satisfying:
0 <= END <= (string-length STRING)
|
string-shrink! returns a newly allocated string formed from the
characters of STRING beginning with index 0 (inclusive)
and ending with index END (exclusive). Whenever possible, and for
the back-ends that enable it, string-shrink! modifies its argument
string as a side effect.
(let ((s (string #\a #\b #\c #\d #\e)))
(set! s (string-shrink! s 3))
s)
=> "abc"
|
|
string-append string ... | library procedure |
string->list string | library procedure |
list->string list | library procedure |
string-copy string | library procedure |
|
string-fill! string char | bigloo procedure |
Stores char in every element of the given string
and returns an unspecified value.
|
string-downcase string | bigloo procedure |
Returns a newly allocated version of string where each upper case
letter is replaced by its lower case equivalent.
|
string-upcase string | bigloo procedure |
Returns a newly allocated version of string where each lower case
letter is replaced by its upper case equivalent.
|
string-capitalize string | bigloo procedure |
Builds a newly allocated capitalized string.
|
string-downcase! string | bigloo procedure |
Physically downcases the string argument.
|
string-upcase! string | bigloo procedure |
Physically upcases the string argument.
|
string-capitalize! string | bigloo procedure |
Physically capitalizes the string argument.
|
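The contrast between the allocating and the physical variants can be sketched as follows. The string is created with string-copy so that the destructive call does not modify a literal constant; the name s is illustrative.

```scheme
(define s (string-copy "Hello World"))

;; The non-destructive variants return a new string.
(string-upcase s)       ; => "HELLO WORLD"
(string-downcase s)     ; => "hello world"
s                       ; => "Hello World", s itself is unchanged

;; The variants ending in ! modify their argument in place.
(string-upcase! s)
s                       ; => "HELLO WORLD"
```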
string-for-read string | bigloo procedure |
Returns a copy of string with each special character
replaced by an escape sequence.
|
blit-string! string1 o1 string2 o2 len | bigloo procedure |
Fills string2 starting at position o2 with
len characters taken out of string1 from
position o1 .
(let ((s (make-string 20 #\-)))
(blit-string! "toto" 0 s 16 4)
s)
=> "----------------toto"
|
|
5.1.10 Unicode (UCS-2) Strings
|
UCS-2 strings cannot be read by the standard reader but UTF-8 strings
can. The special syntax for UTF-8 is described by the
regular expression:
#u"([^"]|\")*" .
The library functions for Unicode string processing are:
ucs2-string? obj | bigloo procedure |
|
make-ucs2-string k | bigloo procedure |
make-ucs2-string k char | bigloo procedure |
ucs2-string k ... | bigloo procedure |
|
ucs2-string-length s-ucs2 | bigloo procedure |
ucs2-string-ref s-ucs2 k | bigloo procedure |
ucs2-string-set! s-ucs2 k char | bigloo procedure |
|
ucs2-string=? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string-ci=? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string<? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string>? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string<=? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string>=? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string-ci<? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string-ci>? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string-ci<=? s-ucs2a s-ucs2b | bigloo procedure |
ucs2-string-ci>=? s-ucs2a s-ucs2b | bigloo procedure |
|
subucs2-string s-ucs2 start end | bigloo procedure |
ucs2-string-append s-ucs2 ... | bigloo procedure |
ucs2-string->list s-ucs2 | bigloo procedure |
list->ucs2-string chars | bigloo procedure |
ucs2-string-copy s-ucs2 | bigloo procedure |
|
ucs2-string-fill! s-ucs2 char | bigloo procedure |
Stores char in every element of the given s-ucs2
and returns an unspecified value.
|
ucs2-string-downcase s-ucs2 | bigloo procedure |
Builds a newly allocated ucs2-string with lower case letters.
|
ucs2-string-upcase s-ucs2 | bigloo procedure |
Builds a newly allocated ucs2-string with upper case letters.
|
ucs2-string-downcase! s-ucs2 | bigloo procedure |
Physically downcases the s-ucs2 argument.
|
ucs2-string-upcase! s-ucs2 | bigloo procedure |
Physically upcases the s-ucs2 argument.
|
ucs2-string->utf8-string s-ucs2 | bigloo procedure |
utf8-string->ucs2-string string | bigloo procedure |
Convert UCS-2 strings to (or from) UTF-8 encoded ascii strings.
|
Vectors are not autoquoted objects.
make-vector k obj | procedure |
vector obj ... | library procedure |
|
vector-length vector | procedure |
vector-ref vector k | procedure |
vector-set! vector k obj | procedure |
|
vector->list vector | library procedure |
list->vector list | library procedure |
|
vector-fill! vector obj | library procedure |
Stores obj in every element of vector . For instance:
(let ((v (make-vector 5 #f)))
(vector-fill! v #t)
v)
=> #(#t #t #t #t #t)
|
|
copy-vector vector len | bigloo procedure |
Allocates a new vector of size len and fills it with the first len
elements of vector . The new length len may be bigger than
the old vector length.
|
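A minimal sketch of both uses, shrinking and growing, is given below. When len exceeds the old length, the text above does not say what the extra slots contain, so the sketch only inspects the elements copied from the original vector; the name v is illustrative.

```scheme
(define v (vector 'a 'b 'c 'd))

;; Shorter copy: only the first two elements are kept.
(copy-vector v 2)                      ; => #(a b)

;; Longer copy: the original elements come first.
(vector-length (copy-vector v 6))      ; => 6
(vector-ref (copy-vector v 6) 3)       ; => d

;; The original vector is not modified.
v                                      ; => #(a b c d)
```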
vector-copy vector start end | bigloo procedure |
vector must be a vector, and start and end must be
exact integers satisfying:
0 <= START <= END <= (vector-length VECTOR)
|
vector-copy returns a newly allocated vector formed from the
elements of VECTOR beginning with index START (inclusive)
and ending with index END (exclusive).
(vector-copy '#(1 2 3 4) 0 4)
=> '#(1 2 3 4)
(vector-copy '#(1 2 3 4) 1 3)
=> '#(2 3)
|
|
See r5rs, Vectors, for more details.
apply proc arg1 ... args | procedure |
|
map proc list1 list2 ... | library procedure |
for-each proc list1 list2 ... | library procedure |
|
filter pred list ... | library procedure |
filter! pred list ... | library procedure |
Strip out all elements of list for which the predicate pred
is not true. The second version filter! is destructive:
(filter number? '(1 2 #\a "foo" foo 3)) => (1 2 3)
(let ((l (list 1 2 #\a "foo" 'foo 3)))
(set! l (filter! number? l))
l) => (1 2 3)
|
|
sort obj proc | bigloo procedure |
Sorts obj according to proc test. The argument obj can
either be a vector or a list. In either case, a copy of the argument
is returned. For instance:
(let ((l '(("foo" 5) ("bar" 6) ("hux" 1) ("gee" 4))))
(sort l (lambda (x y) (string<? (car x) (car y)))))
=> (("bar" 6) ("foo" 5) ("gee" 4) ("hux" 1))
|
|
force promise | library procedure |
|
call/cc proc | bigloo procedure |
This function is the same as the call-with-current-continuation
function of the R5RS, see r5rs, call-with-current-continuation,
but it is necessary to compile the module with the -call/cc
option to use it, see The Bigloo command line.
Note: call/cc is difficult to compile efficiently.
For this reason, we decided to enable call/cc only with a
compiler option, and one might consider using bind-exit instead.
|
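As an illustration, the sketch below uses call/cc for an early exit from a traversal. The helper find-first is a hypothetical name introduced for this example, and the code assumes the module is compiled with the -call/cc option mentioned above.

```scheme
;; Invoking k abandons the pending multiplication.
(+ 1 (call/cc (lambda (k) (* 10 (k 41)))))   ; => 42

;; find-first is an illustrative helper, not a library function.
;; It returns the first element satisfying pred, or #f if none does.
(define (find-first pred lst)
  (call/cc
    (lambda (return)
      (for-each (lambda (x) (if (pred x) (return x))) lst)
      #f)))

(find-first even? '(1 3 4 5))   ; => 4
(find-first even? '(1 3 5))     ; => #f
```

Since the continuation is only used to escape, the same helper could be written with bind-exit, which does not require the compiler option.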
bind-exit escape body | bigloo syntax |
This form provides an escape operator facility. bind-exit
evaluates the body , which may refer to the variable
escape which will denote an ``escape function'' of one
argument: when called, this escape function will return from
the bind-exit form with the given argument as the value of
the bind-exit form. The escape can only be used
while in the dynamic extent of the form. Bindings introduced by
bind-exit are immutable.
(bind-exit (exit)
(for-each (lambda (x)
(if (negative? x)
(exit x)))
'(54 0 37 -3 245 19))
#t) => -3
(define list-length
(lambda (obj)
(bind-exit (return)
(letrec ((r (lambda (obj)
(cond ((null? obj) 0)
((pair? obj)
(+ (r (cdr obj)) 1))
(else (return #f))))))
(r obj)))))
(list-length '(1 2 3 4)) => 4
(list-length '(a b . c)) => #f
|
|
unwind-protect expr protect | bigloo syntax |
This form provides protections. Expression expr is
evaluated. If this evaluation requires the invocation of an
escape procedure (a procedure bound by the bind-exit
special form), protect is evaluated before control
jumps to the exit procedure. If expr does not invoke any
exit procedure, unwind-protect has the same behaviour as
the begin special form except that the value of the form is
always the value of expr .
(define (my-open f)
(if (file-exists? f)
(let ((port (open-input-file f)))
(if (input-port? port)
(unwind-protect
(bar port)
(close-input-port port))))))
|
|
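The interaction between bind-exit and unwind-protect can be sketched with a small trace. The names events and result are illustrative; the point is that the protect expression runs even though the body escapes.

```scheme
(define events '())

(define result
  (bind-exit (exit)
    (unwind-protect
      (begin
        (set! events (cons 'body events))
        (exit 'escaped)              ; jump out of the bind-exit form
        'not-reached)
      ;; Evaluated before control reaches the exit procedure's target.
      (set! events (cons 'cleanup events)))))

result     ; => escaped
events     ; => (cleanup body)
```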
dynamic-wind before thunk after | procedure |
Calls thunk without arguments, returning the result(s) of this call.
Before and after are called, also without arguments, as required
by the following rules (note that in the absence of calls to continuations
captured using call/cc the three arguments are
called once each, in order). Before is called whenever execution
enters the dynamic extent of the call to thunk and after is called
whenever it exits that dynamic extent. The dynamic extent of a procedure
call is the period between when the call is initiated and when it
returns. In Scheme, because of call/cc , the
dynamic extent of a call may not be a single, connected time period.
It is defined as follows:
- The dynamic extent is entered when execution of the body of the
called procedure begins.
- The dynamic extent is also entered when execution is not within
the dynamic extent and a continuation is invoked that was captured
(using call/cc ) during the dynamic extent.
- It is exited when the called procedure returns.
- It is also exited when execution is within the dynamic extent and
a continuation is invoked that was captured while not within the
dynamic extent.
If a second call to dynamic-wind occurs within the dynamic extent of the
call to thunk and then a continuation is invoked in such a way that the
after s from these two invocations of dynamic-wind are both to be
called, then the after associated with the second (inner) call to
dynamic-wind is called first.
If a second call to dynamic-wind occurs within the dynamic extent of the
call to thunk and then a continuation is invoked in such a way that the
before s from these two invocations of dynamic-wind are both to be
called, then the before associated with the first (outer) call to
dynamic-wind is called first.
If invoking a continuation requires calling the before from one call
to dynamic-wind and the after from another, then the after
is called first.
The effect of using a captured continuation to enter or exit the dynamic
extent of a call to before or after is undefined.
(let ((path '())
(c #f))
(let ((add (lambda (s)
(set! path (cons s path)))))
(dynamic-wind
(lambda () (add 'connect))
(lambda ()
(add (call/cc
(lambda (c0)
(set! c c0)
'talk1))))
(lambda () (add 'disconnect)))
(if (< (length path) 4)
(c 'talk2)
(reverse path))))
=> (connect talk1 disconnect connect talk2 disconnect)
|
|
unspecified | bigloo procedure |
Returns the unspecified (noted as #unspecified ) object with
no specific property.
|
values obj ... | procedure |
Delivers all of its arguments to its continuation.
Except for continuations created by the call-with-values
procedure, all continuations take exactly one value.
Values might be defined as follows:
(define (values . things)
(call/cc
(lambda (cont) (apply cont things))))
|
|
call-with-values producer consumer | procedure |
Calls its producer argument with no values and
a continuation that, when passed some values, calls the
consumer procedure with those values as arguments.
The continuation for the call to consumer is the
continuation of the call to call-with-values.
(call-with-values (lambda () (values 4 5))
(lambda (a b) b))
=> 5
(call-with-values * -)
=> -1
|
|
multiple-value-bind (var ...) producer exp ... | bigloo syntax |
receive (var ...) producer exp ... | bigloo syntax |
Evaluates exp ... in an environment where var ... are bound
from the evaluation of producer . The result of producer must
be a call to values where the number of arguments is the number of
bound variables.
(define (bar a)
(values (modulo a 5) (quotient a 5)))
(define (foo a)
(multiple-value-bind (x y)
(bar a)
(print x " " y)))
(foo 354)
-| 4 70
|
|
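The receive form follows the same pattern as multiple-value-bind . The sketch below uses a hypothetical helper div-mod that returns two values, and shows the equivalent call-with-values expression for comparison.

```scheme
;; div-mod is an illustrative helper returning two values.
(define (div-mod a b)
  (values (quotient a b) (remainder a b)))

(receive (q r)
   (div-mod 17 5)
   (list q r))                          ; => (3 2)

;; The same computation with call-with-values.
(call-with-values
   (lambda () (div-mod 17 5))
   (lambda (q r) (list q r)))           ; => (3 2)
```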
This section describes Scheme operations for reading and writing data.
The section Files describes functions for handling files.
call-with-input-file string proc | library procedure |
call-with-output-file string proc | library procedure |
These two procedures call proc with one argument, a port obtained
by opening string .
See r5rs, Ports, for more details.
(call-with-input-file "/etc/passwd"
(lambda (port)
(let loop ((line (read-line port)))
(if (not (eof-object? line))
(begin
(print line)
(loop (read-line port)))))))
|
|
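The two procedures can be combined to write a datum to a file and read it back. This is only a sketch: the file name /tmp/bigloo-example.txt is an arbitrary, hypothetical path, and the example assumes that the directory is writable.

```scheme
;; Hypothetical file name chosen for this example only.
(define example-file "/tmp/bigloo-example.txt")

;; Write one datum; proc receives the freshly opened output port.
(call-with-output-file example-file
  (lambda (port)
    (write '(1 "two" (3 4)) port)
    (newline port)))

;; Read the datum back; proc receives the freshly opened input port.
(define datum
  (call-with-input-file example-file
    (lambda (port) (read port))))

datum
;; => (1 "two" (3 4))
```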
input-port? obj | procedure |
input-string-port? obj | procedure |
output-port? obj | procedure |
output-string-port? obj | procedure |
|
input-port-name obj | bigloo procedure |
Returns the file name for which obj has been opened.
|
input-port-reopen! obj | bigloo procedure |
Re-open the input port obj . That is, re-start reading from the first
character of the input port.
|
current-input-port | procedure |
current-output-port | procedure |
current-error-port | bigloo procedure |
|
with-input-from-file string thunk | optional procedure |
with-input-from-string string thunk | optional procedure |
with-input-from-procedure procedure thunk | optional procedure |
with-output-to-file string thunk | optional procedure |
with-error-to-file string thunk | bigloo procedure |
with-output-to-string thunk | bigloo procedure |
with-error-to-string thunk | bigloo procedure |
A port is opened from file string . This port is made the
current input port (resp. the current output port or the current error port)
and thunk is called.
See r5rs, Ports, for more details.
(with-input-from-file "/etc/passwd"
(lambda ()
(let loop ((line (read-line (current-input-port))))
(if (not (eof-object? line))
(begin
(print line)
(loop (read-line (current-input-port))))))))
|
|
with-input-from-port port thunk | bigloo procedure |
with-output-to-port port thunk | bigloo procedure |
with-error-to-port port thunk | bigloo procedure |
with-input-from-port , with-output-to-port and
with-error-to-port all require port to be a valid port. They
call thunk with port as the current input (resp. output or
error) port. None of these functions closes port when thunk
returns.
(with-output-to-port (current-error-port)
(lambda () (display "hello")))
|
|
open-input-file file-name | procedure |
If file-name is a regular file name, open-input-file behaves as
the function defined in the Scheme report. If file-name starts with
special prefixes it behaves differently. Here are the recognized prefixes:
| (a string made of the characters #\| and #\space )
Instead of opening a regular file, Bigloo opens an input pipe.
The same syntax is used for output file.
(define pin (open-input-file "| cat /etc/passwd"))
(define pout (open-output-file "| wc -l"))
(display (read pin) pout)
(close-input-port pin)
(newline pout)
(close-output-port pout)
|
pipe:
Same as | .
file:
Opens a regular file.
string:
Opens a port on a string. This is equivalent to open-input-string .
Example:
(with-input-from-file "string:foo bar Gee"
(lambda ()
(print (read))
(print (read))
(print (read))))
-| foo
-| bar
-| Gee
|
http:server/path
Opens an http connection on server and opens an input file
on file path .
http:server:port-number/path
Opens an http connection on server , on port number
port-number , and opens an input file on file path .
ftp:server/path
Opens an ftp connection on server and opens an input file
on file path .
|
open-input-string string | bigloo procedure |
Returns an input-port able to deliver characters from
string .
|
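A string port can be read with the same procedures as a file port. The following sketch collects every datum found in a string; read-all-from-string is a hypothetical helper name, not a Bigloo function.

```scheme
;; read-all-from-string is a hypothetical helper (not part of Bigloo).
;; It reads successive data from a string port until end of file.
(define (read-all-from-string str)
  (let ((port (open-input-string str)))
    (let loop ((datum (read port))
               (acc '()))
      (if (eof-object? datum)
          (begin
            (close-input-port port)
            (reverse acc))
          (loop (read port) (cons datum acc))))))

(read-all-from-string "1 (2 3) \"four\"")
;; => (1 (2 3) "four")
```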
open-input-c-string string | bigloo procedure |
Returns an input-port able to deliver characters from
C string string . The buffer used by the input port is the exact
same string as the argument. That is, no buffer is allocated.
|
open-input-procedure procedure | bigloo procedure |
Returns an input-port able to deliver characters from
procedure . Each time a character has to be read, the procedure
is called. This procedure may return a character, a string of characters, or
the boolean #f . This last value stands for the end of file.
Example:
(let ((p (open-input-procedure (let ((s #t))
(lambda ()
(if s
(begin (set! s #f) "foobar")
s))))))
(read p))
|
|
open-output-file file-name | procedure |
The same syntax as open-input-file for file names applies here.
When a file name starts with | , Bigloo opens an output pipe
instead of a regular file.
|
append-output-file file-name | bigloo procedure |
If file-name exists, this function returns an output-port
on it, without removing it. New output will be appended to file-name .
If file-name does not exist, it is created.
|
open-output-string | bigloo procedure |
This function returns an output string port. This object has almost
the same purpose as output-port . It can be used with all
the printer functions which accept output-port . An output
string port accumulates all the characters written to it. An
invocation of flush-output-port or close-output-port on an
output string port returns a new string which contains all the
characters accumulated in the port.
|
get-output-string output-port | bigloo procedure |
Given an output port created by open-output-string ,
returns a string consisting of the characters that have been
output to the port so far.
|
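Output string ports are convenient for building a string piece by piece. The sketch below joins the displayed form of a list of objects; join-displayed is a hypothetical helper name introduced for this example.

```scheme
;; join-displayed is a hypothetical helper (not part of Bigloo).
;; It displays each item into a string port, separated by separator,
;; and returns the accumulated characters.
(define (join-displayed items separator)
  (let ((port (open-output-string)))
    (let loop ((items items)
               (first? #t))
      (if (pair? items)
          (begin
            (if (not first?)
                (display separator port))
            (display (car items) port)
            (loop (cdr items) #f))))
    (get-output-string port)))

(join-displayed '(1 2 3) ", ")
;; => "1, 2, 3"
```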
close-input-port input-port | procedure |
close-output-port output-port | procedure |
According to R5RS, the value returned is unspecified. However, if
output-port was created using open-output-string , the value
returned is the string consisting of all characters sent to the port.
|
input-port-name input-port | bigloo procedure |
Returns the name of the file used to open the input-port .
|
input-port-position port | bigloo procedure |
output-port-position port | bigloo procedure |
Returns the current position (a character number), in the port .
|
set-input-port-position! port pos | bigloo procedure |
set-output-port-position! port pos | bigloo procedure |
These functions set the file position indicator for port . The new
position, measured in bytes, is specified by pos . It is an error
to seek a port that cannot be changed (for instance, a string or a
console port). The result of these functions is unspecified. An error
is raised if the position cannot be changed.
|
input-port-reopen! input-port | bigloo procedure |
This function re-opens the input port input-port . That is, it resets the
position in the input-port to the first character.
|
read [input-port] | procedure |
read/case case [input-port] | bigloo procedure |
read-case-sensitive [input-port] | bigloo procedure |
read-case-insensitive [input-port] | bigloo procedure |
Reads a Lisp expression. The case sensitivity of read is unspecified.
If you have to enforce a specific behavior regarding case, use
read/case , read-case-sensitive or read-case-insensitive .
The value of the case argument of read/case may be either upcase , downcase or
sensitive . Using any other value is an error.
Let us consider the following source code:
(define (main argv)
(let loop ((exp (read-case-sensitive)))
(if (not (eof-object? exp))
(begin
(display "exp: ")
(write exp)
(display " [")
(display exp)
(display "]")
(print " eq?: " (eq? exp 'FOO) " " (eq? exp 'foo))
(loop (read-case-sensitive))))))
|
Thus:
> a.out
foo
-| exp: foo [foo] eq?: #f #t
FOO
-| exp: FOO [FOO] eq?: #t #f
|
|
read/rp grammar port | bigloo procedure |
read/lalrp lalrg rg port [emptyp] | bigloo procedure |
These functions are fully explained in Regular Parsing,
and Lalr Parsing.
|
read-char [port] | procedure |
peek-char [port] | procedure |
|
char-ready? [port] | procedure |
As specified in R5RS (see r5rs, Ports), char-ready?
returns #t if a character is ready on the input port and
returns #f otherwise. If char-ready? returns #t then
the next read-char operation on the given port is guaranteed
not to hang. If the port is at end of file then char-ready?
returns #t. Port may be omitted, in which case it defaults to
the value returned by current-input-port.
When using char-ready? consider the latency that may exist
before characters are available. For instance, executing the
following source code:
(let* ((proc (run-process "/bin/ls" "-l" "/bin" output: pipe:))
(port (process-output-port proc)))
(let loop ((line (read-line port)))
(print "char ready " (char-ready? port))
(if (eof-object? line)
(close-input-port port)
(begin
(print line)
(loop (read-line port))))))
|
Produces outputs such as:
char ready #f
total 7168
char ready #f
-rwxr-xr-x 1 root root 2896 Sep 6 2001 arch
char ready #f
-rwxr-xr-x 1 root root 66428 Aug 25 2001 ash
char ready #t
...
|
For a discussion of Bigloo processes, see Process.
Note: Thanks to Todd Dukes for the example and the suggestion
of including it in this documentation.
|
read-line [input-port] | bigloo procedure |
Reads characters from input-port until a #\Newline ,
a #\Return or an end of file condition is encountered.
read-line returns a newly allocated string composed of the characters
read.
|
read-lines [input-port] | bigloo procedure |
Accumulates all the lines of an input-port into a list.
|
read-of-strings [input-port] | bigloo procedure |
Reads a sequence of non-space characters on input-port , makes a
string of them and returns the string.
|
read-string [input-port] | bigloo procedure |
Reads all the characters of input-port into a string.
|
read-chars size [input-port] | bigloo procedure |
Returns a newly allocated string made of size characters read
from input-port (or from (current-input-port) if
input-port is not provided). If fewer than size characters
are available on the input port, the returned string is shorter than
size . Its length is the number of available characters.
|
send-chars input-port output-port [len] | bigloo procedure |
Transfer the characters from input-port to output-port . This
procedure is sometimes mapped to a system call (such as sendfile under
Linux) and might thus be more efficient than copying the ports by hand.
|
read-fill-string! s o len [input-port] | bigloo procedure |
Fills the string s starting at offset o with at
most len characters read from the input port input-port
(or from (current-input-port) if input-port is not provided).
This function returns the number of characters filled (which may be smaller
than len if fewer characters are available).
Example:
(let ((s (make-string 10 #\-)))
(with-input-from-string "abcdefghijlkmnops"
(lambda ()
(read-fill-string! s 3 5)
s)))
=> ---abcde--
|
|
write obj [output-port] | library procedure |
display obj [output-port] | library procedure |
print obj ... | bigloo procedure |
This procedure allows several objects to be displayed. When
all these objects have been printed, print adds a newline.
|
display* obj ... | bigloo procedure |
This function is similar to print but does not add a newline.
|
fprint output-port obj ... | bigloo procedure |
This function is the same as print except that a
port is provided.
|
newline [output-port] | procedure |
write-char char [output-port] | procedure |
flush-output-port output-port | bigloo procedure |
This procedure flushes the output port output-port .
|
format format-string [objs] | bigloo procedure |
Note: Many thanks to Scott G. Miller who is the author of
SRFI-28. Most of the documentation of this function is copied from the
SRFI documentation.
Accepts a message template (a Scheme String), and processes it,
replacing any escape sequences in order with one or more characters,
the characters themselves dependent on the semantics of the escape
sequence encountered.
An escape sequence is a two character sequence in the string where the
first character is a tilde ~ . Each escape code's meaning is as
follows:
~a The corresponding value is inserted into the string
as if printed with display.
~s The corresponding value is inserted into the string
as if printed with write.
~% A newline is inserted.
~~ A tilde ~ is inserted.
~a and ~s , when encountered, require a corresponding
Scheme value to be present after the format string. The values
provided as operands are used by the escape sequences in order. It is
an error if fewer values are provided than escape sequences that
require them.
~% and ~~ require no corresponding value.
(format "Hello, ~a" "World!")
-| Hello, World!
(format "Error, list is too short: ~s~%" '(one "two" 3))
-| Error, list is too short: (one "two" 3)
|
|
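The difference between ~a and ~s is easiest to see on a string argument. The following sketch uses only the four escape sequences listed above; describe is a hypothetical helper name.

```scheme
;; describe is a hypothetical helper (not part of Bigloo).
;; ~a inserts name as display would, ~s inserts value as write would,
;; and ~% appends a newline.
(define (describe name value)
  (format "~a = ~s~%" name value))

;; Evaluates to the string: greeting = "hello" followed by a newline.
(describe "greeting" "hello")

;; ~~ inserts a literal tilde and consumes no argument.
(format "~a~~" "home")
;; => "home~"
```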
printf format-string [objs] | bigloo procedure |
fprintf port format-string [objs] | bigloo procedure |
Formats objs to the current output port or to the specified port .
|
set-write-length! len | bigloo procedure |
Sets to len the maximum number of atoms that can be printed
by write and display . This facility is useful in preventing
the printer from falling into an infinite loop when printing circular
structures.
|
get-write-length | bigloo procedure |
Gets the current length of the printer. That is, get-write-length
returns the maximum number of Bigloo objects that are allowed to be printed
when printing compound objects. This function is useful in preventing
the system from looping when printing circular data structures.
|
set-printer! proc | bigloo procedure |
Set the current printer to be proc ; proc has to be a
procedure expecting at least one argument: an expression to
print; an output port is an optional, second argument.
|
native-printer | bigloo procedure |
Returns Bigloo's native printer.
|
current-printer | bigloo procedure |
Returns Bigloo's current printer.
|
pp obj [output-port] | bigloo procedure |
Pretty print obj on output-port .
|
Sets the variable to respect , lower or upper
to change the case for pretty-printing.
|
*pp-width* | bigloo variable |
The width of the pretty-print.
|
write-circle obj [output-port] | bigloo procedure |
Display recursive object obj on output-port . Each component
of the object is displayed using the write library function.
|
display-circle obj [output-port] | bigloo procedure |
Display recursive object obj on output-port . Each component
of the object is displayed using the display library function.
For instance:
(define l (list 1 2 3))
(set-car! (cdr l) l)
(set-car! (cddr l) l)
(display-circle l) -| #0=(1 #0# #0#)
|
|
5.3 Structures and Records
|
Bigloo supports two kinds of enumerated types: the structures and
the records. They offer similar facilities. Structures
predate records and are maintained mainly for backward
compatibility. Records are compliant with the Scheme Request for
Implementation 9 (SRFI-9).
There is, in Bigloo, a new class of objects:
structures, which are equivalent to C struct .
define-struct name field... | bigloo syntax |
This form defines a structure with name name , which is a symbol,
having fields field ... which are symbols or lists, each
list being composed of a symbol and a default value. This form creates
several functions: creator, predicate, accessor and assigner functions. The
name of each function is built in the following way:
- Creator:
make-name
- Predicate:
name ?
- Accessor:
name -field
- Assigner:
name -field -set!
The function make-name accepts an optional argument. If
provided, all the slots of the created structure are filled with it. The
creator named name accepts as many arguments as the number of
slots of the structure. This function allocates a structure and fills
each of its slots with its corresponding argument.
If a structure is created using make-name and no initialization
value is provided, the slot default values (when provided) are used
to initialize the new structure. For instance, the execution of the program:
(define-struct pt1 a b)
(define-struct pt2 (h 4) (g 6))
(make-pt1)
=> #{PT1 () ()}
(make-pt1 5)
=> #{PT1 5 5}
(make-pt2)
=> #{PT2 4 6}
(make-pt2 5)
=> #{PT2 5 5}
|
|
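The naming scheme for the generated functions can be exercised as follows. This is a sketch based on the rules stated above; the structure name coord and its fields are hypothetical, chosen for this example.

```scheme
;; coord is a hypothetical structure used for illustration only.
(define-struct coord x y)

;; The creator named after the structure takes one argument per slot.
(define c (coord 1 2))

(coord? c)          ; predicate: name?
(coord-x c)         ; accessor: name-field, here 1
(coord-y-set! c 20) ; assigner: name-field-set!
(coord-y c)         ; now 20
(struct? c)         ; any structure satisfies struct?
```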
struct? obj | bigloo procedure |
Returns #t if and only if obj is a structure.
|
Bigloo supports records as specified by SRFI-9. This section is a copy
of the SRFI-9 specification by Richard Kelsey. This SRFI describes
syntax for creating new data types, called record types. A predicate,
constructor, and field accessors and modifiers are defined for each
record type. Each new record type is distinct from all existing types,
including other record types and Scheme's predefined types.
define-record-type expression... | syntax |
The syntax of a record-type definition is:
<record-type-definition> ==> (define-record-type <type-name>
(<constructor-name> <field-tag> ...)
<predicate-name>
<field-spec> ...)
<field-spec> ==> (<field-tag> <accessor-name>)
| (<field-tag> <accessor-name> <modifier-name>)
<field-tag> ==> <identifier>
<accessor-name> ==> <identifier>
<predicate-name> ==> <identifier>
<modifier-name> ==> <identifier>
<type-name> ==> <identifier>
|
Define-record-type is generative: each use creates a new record
type that is distinct from all existing types, including other record
types and Scheme's predefined types. Record-type definitions may only
occur at top-level (there are two possible semantics for `internal'
record-type definitions, generative and nongenerative, and no consensus
as to which is better).
An instance of define-record-type is equivalent to the following
definitions:
- <type-name>
is bound to a representation of the record type itself. Operations on
record types, such as defining print methods, reflection, etc. are left
to other SRFIs.
- <constructor-name>
is bound to a procedure that takes as many arguments as
there are <field-tag>s in the (<constructor-name> ...) subform
and returns a new <type-name> record. Fields whose tags are listed
with <constructor-name> have the corresponding argument as their
initial value. The initial values of all other fields are unspecified.
- <predicate-name>
is a predicate that returns #t when given a value returned by
<constructor-name> and #f for everything else.
- Each
<accessor-name> is a procedure that takes a record of
type <type-name> and returns the current value of the corresponding
field. It is an error to pass an accessor a value which is not a record
of the appropriate type.
- Each
<modifier-name> is a procedure that takes a record of
type <type-name> and a value which becomes the new value of the
corresponding field; an unspecified value is returned. It is an error
to pass a modifier a first argument which is not a record of the appropriate
type.
Records are disjoint from the types listed in Section 4.2 of R5RS.
Setting the value of any of these identifiers has no effect on the
behavior of any of their original values.
The following
(define-record-type pare
(kons x y)
pare?
(x kar set-kar!)
(y kdr))
|
defines kons to be a constructor, kar and kdr to be
accessors, set-kar! to be a modifier, and pare? to be a
predicate for pares.
(pare? (kons 1 2)) => #t
(pare? (cons 1 2)) => #f
(kar (kons 1 2)) => 1
(kdr (kons 1 2)) => 2
(let ((k (kons 1 2)))
(set-kar! k 3)
(kar k)) => 3
|
|
string->obj string | bigloo procedure |
This function converts a string which has been produced by
obj->string into a Bigloo object.
|
obj->string object | bigloo procedure |
This function converts into a string any Bigloo object
which does not contain a procedure.
|
The implementation of the last two functions ensures that for every
Bigloo object obj (containing no procedure), the expression:
(equal? obj (string->obj (obj->string obj)))
=> #t
|
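The round-trip property can be checked directly. The sketch below serializes a small compound object that contains no procedure; the variable names are hypothetical.

```scheme
;; A compound object without any procedure inside.
(define original (list 1 "two" (vector 3 4) (cons 5 6)))

;; obj->string produces a string; string->obj rebuilds an object from it.
(define serialized (obj->string original))
(define restored (string->obj serialized))

(string? serialized)        ; the serialized form is a string
(equal? original restored)  ; structurally equal, as stated above
```

The restored object is rebuilt from the string, so the comparison uses equal? rather than eq? .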
binary-port? obj | bigloo procedure |
open-output-binary-file file-name | bigloo procedure |
append-output-binary-file file-name | bigloo procedure |
open-input-binary-file file-name | bigloo procedure |
close-binary-port binary-port | bigloo procedure |
input-obj binary-port | bigloo procedure |
output-obj binary-port obj | bigloo procedure |
Bigloo allows Scheme objects to be dumped into, and restored from, files.
These operations are performed by the previous functions. The dump and
the restore use the two functions obj->string and
string->obj .
It is also possible to use a binary file as a flat character file. This can
be done by the means of output-char , input-char ,
output-string , and input-string functions.
|
input-char binary-port | bigloo procedure |
output-char binary-port | bigloo procedure |
The function input-char reads a single character from a
binary-port . It returns the read character or the end-of-file
object. The function output-char writes a character into a
binary-port .
|
input-string binary-port len | bigloo procedure |
output-string binary-port | bigloo procedure |
The function input-string reads a string from a binary-port of
maximum length len . It returns a newly allocated string whose length
is possibly smaller than len . The function output-string writes
a string into a binary-port .
|
input-fill-string! binary-port string len | bigloo procedure |
Fills a string with characters read from binary-port with at most
len characters. The function returns the number of filled characters.
|
register-procedure-serialization serializer unserializer | bigloo procedure |
There is no existing portable method to dump and restore a procedure. Thus,
if obj->string is passed a procedure, it will emit an error message.
Sometimes, under strict restrictions, it may be convenient to use an
ad-hoc framework to serialize and unserialize procedures. Users may
specify their own procedure serializer and unserializer. This is the
role of register-procedure-serialization . The argument
serializer is a procedure of one argument, converting a procedure
into a character string. The argument unserializer is a procedure
of one argument, converting a character string into a procedure. It is up
to the user to provide a correct serializer and unserializer.
Here is an example of a procedure serializer and unserializer that
may be correct under some Unix platforms:
(module foo
(extern (macro %sprintf::int (::string ::string ::procedure) "sprintf")))
(define (string->procedure str)
(pragma "(obj_t)(strtoul(BSTRING_TO_STRING($1), 0, 16))" str))
(define (procedure->string proc)
(let ((item (make-string 10)))
(%sprintf item "#p%lx" proc)
item))
(register-procedure-serialization procedure->string string->procedure)
(let ((x 4))
(let ((obj (cons "toto" (lambda (y) (+ x y)))))
(let ((nobj (string->obj (obj->string obj))))
(print ((cdr nobj) 5)))))
|
|
get-procedure-serialization | bigloo procedure |
Returns a pair whose car is the current procedure serializer
and the cdr is the current procedure unserializer.
|
register-process-serialization serializer unserializer | bigloo procedure |
Same as register-procedure-serialization for Bigloo processes.
|
get-process-serialization | bigloo procedure |
Same as get-procedure-serialization for Bigloo processes.
|
These procedures allow the manipulation of fixnums as bit-fields.
bit-or i1 i2 | bigloo procedure |
bit-orelong i1 i2 | bigloo procedure |
bit-orllong i1 i2 | bigloo procedure |
bit-xor i1 i2 | bigloo procedure |
bit-xorelong i1 i2 | bigloo procedure |
bit-xorllong i1 i2 | bigloo procedure |
bit-and i1 i2 | bigloo procedure |
bit-andelong i1 i2 | bigloo procedure |
bit-andllong i1 i2 | bigloo procedure |
bit-not i | bigloo procedure |
bit-notelong i | bigloo procedure |
bit-notllong i | bigloo procedure |
bit-lsh i1 i2 | bigloo procedure |
bit-lshelong i1 i2 | bigloo procedure |
bit-lshllong i1 i2 | bigloo procedure |
bit-rsh i1 i2 | bigloo procedure |
bit-ursh i1 i2 | bigloo procedure |
bit-rshelong i1 i2 | bigloo procedure |
bit-rshllong i1 i2 | bigloo procedure |
(bit-or 5 3) => 7
(bit-orelong #e5 #e3) => #e7
(bit-xor 5 3) => 6
(bit-andllong #l5 #l3) => #l1
(bit-not 5) => -6
(bit-lsh 5 3) => 40
(bit-rsh 5 1) => 2
|
|
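A common use of these procedures is to pack boolean flags into a fixnum. The following is a minimal sketch; flag-mask , flag-set , flag-clear and flag-set? are hypothetical helper names, not Bigloo functions.

```scheme
;; Hypothetical helpers built on bit-lsh, bit-or, bit-and and bit-not.

;; Mask with only bit n set.
(define (flag-mask n)
  (bit-lsh 1 n))

;; Returns flags with bit n turned on.
(define (flag-set flags n)
  (bit-or flags (flag-mask n)))

;; Returns flags with bit n turned off.
(define (flag-clear flags n)
  (bit-and flags (bit-not (flag-mask n))))

;; Tests whether bit n is on in flags.
(define (flag-set? flags n)
  (not (= 0 (bit-and flags (flag-mask n)))))

(flag-set 0 3)      ; => 8
(flag-set? 8 3)     ; => #t
(flag-clear 12 3)   ; => 4
```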
Bigloo offers hash tables. The functions that create
and use them are described here.
make-hashtable [bucket-len] [max-bucket-len] | bigloo procedure |
Creates a hash table for which the number of buckets is bucket-len .
The argument max-bucket-len specifies when the table should be
resized. If provided, these two values have to be exact integers greater than or
equal to 1. Normally you can ignore the bucket-len and max-bucket-len
arguments and call make-hashtable with no argument at all.
|
hashtable? obj | bigloo procedure |
Returns #t if obj is a hash table, constructed by
make-hashtable .
|
hashtable-size table | bigloo procedure |
Returns the number of entries contained in table .
|
hashtable-contains? table key | bigloo procedure |
Returns the boolean #t if there exists at least one entry whose key
is key in table . If no entry is found, #f is returned.
|
hashtable-get table key | bigloo procedure |
Returns the entry whose key is key in table . If no entry
is found, #f is returned.
|
hashtable-put! table key obj | bigloo procedure |
Puts obj in table under the key key . This function
returns the object bound in the table. If there was an object
obj-old already in the table with the same key as obj ,
this function returns obj-old ; otherwise it returns obj .
|
hashtable-remove! table key | bigloo procedure |
Removes the object associated with key from table ,
returning #t if such an object
was bound in table and #f otherwise.
|
hashtable-update! table key update-fun init-value | bigloo procedure |
If key is already in table, the new value is calculated by
(update-fun current-value) . Otherwise the table is extended
by an entry linking key and init-value .
|
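hashtable-update! is well suited to counting. The sketch below follows the semantics stated above: the first occurrence of a key stores init-value , later occurrences apply update-fun to the stored value. count-occurrences is a hypothetical helper name.

```scheme
;; count-occurrences is a hypothetical helper (not part of Bigloo).
;; It returns a hash table mapping each item to its number of occurrences.
(define (count-occurrences items)
  (let ((table (make-hashtable)))
    (for-each
      (lambda (item)
        ;; First sighting stores 1; later sightings add 1 to the stored count.
        (hashtable-update! table item (lambda (n) (+ n 1)) 1))
      items)
    table))

(define counts (count-occurrences '(a b a c a b)))

(hashtable-get counts 'a)   ; => 3
(hashtable-get counts 'b)   ; => 2
(hashtable-get counts 'z)   ; => #f
(hashtable-size counts)     ; => 3
```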
hashtable->vector table | bigloo procedure |
hashtable->list table | bigloo procedure |
Convert a hash table table to a vector or to a list.
|
hashtable-key-list table | bigloo procedure |
Returns the list of keys used in the table .
|
hashtable-for-each table fun | bigloo procedure |
Applies fun to each of the keys and elements of table
(no order is specified). In consequence, fun must be a procedure
of two arguments. The first one is a key and the second one, an
associated object.
|
Here is an example of hash table usage:
(define *table* (make-hashtable))
(hashtable-put! *table* "toto" "tutu")
(hashtable-put! *table* "tata" "titi")
(hashtable-put! *table* "titi" 5)
(hashtable-put! *table* "tutu" 'tutu)
(hashtable-put! *table* 'foo 'foo)
(print (hashtable-get *table* "toto"))
-| "tutu"
(print (hashtable-get *table* 'foo))
-| 'foo
(print (hashtable-get *table* 'bar))
-| #f
(hashtable-for-each *table* (lambda (key obj) (print (cons key obj))))
-| ("toto" . "tutu")
("tata" . "titi")
("titi" . 5)
("tutu" . TUTU)
(foo . foo)
|
object-hashnumber object | bigloo generic |
This generic function computes a hash number of the instance object .
Example:
(define-method (object-hashnumber pt::point)
(with-access::point pt (x y)
(+fx (*fx x 10) y)))
|
|
5.6.2 Deprecated Hash tables
|
Bigloo offers hash tables. The functions that create
and use them are described here.
make-hash-table ms nb gk eq [is] | bigloo procedure |
Defines a hash table of maximum length ms . If
is is provided, it sets the initial table size. The
formal nb is a function which, if applied to a
key , has to return an integer in the interval
[0..ms ]. The formal gk
is a function which, if applied to an object, returns its
key. For example, the identity is a good candidate for integers,
strings, symbols, etc. The last formal eq is an
equivalence predicate which is used when searching for an entry in
the table.
|
hash-table? obj | bigloo procedure |
Returns #t if obj is a hash table.
|
hash-table-nb-entry table | bigloo procedure |
Returns the number of entries contained in table .
|
get-hash key table | bigloo procedure |
Returns the entry whose key is key in table . If no entry
is found, #f is returned.
|
put-hash! obj table | bigloo procedure |
Puts obj in table . This function returns the object bound
in the table. If there was an object obj-old already in the
table with the same key as obj , this function returns obj-old ;
otherwise it returns obj .
|
rem-obj-hash! obj table | bigloo procedure |
Removes obj from table , returning #t if such object
was bound in table and #f otherwise.
|
rem-key-hash! key table | bigloo procedure |
Removes an object associated with key in
table , returning #t if such an object was bound
in table and #f otherwise.
|
for-each-hash fun table | bigloo procedure |
Applies fun to each of the elements of table (no order
is specified).
|
Some functions compute hash numbers from Scheme objects:
string->0..255 string | bigloo procedure |
string->0..2^x-1 string power | bigloo procedure |
The formal power has to be a power of 2 between 1 and 16.
|
int->0..255 int | bigloo procedure |
int->0..2^x-1 int power | bigloo procedure |
obj->0..255 obj | bigloo procedure |
obj->0..2^x-1 obj power | bigloo procedure |
|
Here is an example of hash table that contains pairs:
(define *table*
(make-hash-table 1024
(lambda (o) (string->0..2^x-1 o 10))
(lambda (x) (car x))
string=?
64))
(let ((cell1 (cons "toto" "tutu"))
(cell2 (cons "tata" "titi"))
(cell3 (cons "titi" 5))
(cell4 (cons "tutu" 'tutu)))
(put-hash! cell1 *table*)
(put-hash! cell2 *table*)
(put-hash! cell3 *table*)
(put-hash! cell4 *table*))
(print (cdr (get-hash "toto" *table*)))
-| "tutu"
(for-each-hash print *table*)
-| ("toto" . "tutu")
("tata" . "titi")
("titi" . 5)
("tutu" . TUTU)
|
5.7.1 Operating System interface
|
register-exit-function! proc | bigloo procedure |
Registers proc as an exit function. proc is a procedure
accepting one argument. This argument is the numerical value which
is the status of the exit call. The registered functions are called when the
execution ends.
|
Applies all the registered exit functions, then stops the execution,
returning the integer int .
|
signal n proc | bigloo procedure |
Provides a signal handler for the operating system dependent signal
n . proc is a procedure of one argument.
|
get-signal-handler n | bigloo procedure |
Returns the current handler associated with signal n or
#f if no handler is installed.
|
system . strings | bigloo procedure |
Appends all the argument strings and invokes the native host
system command on the resulting string. The result is an integer.
|
system->string . strings | bigloo procedure |
Appends all the argument strings and invokes the native host
system command on the resulting string. If the command completes,
system->string returns a string made of the output of the
command.
|
getenv string | bigloo procedure |
Returns the string value of the Unix shell's string variable. If no
such variable is bound, getenv returns #f .
|
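Because getenv returns #f for an unbound variable, callers usually need a fallback. A minimal sketch follows; getenv/default is a hypothetical helper name and BIGLOO_EXAMPLE_UNSET is a hypothetical variable name assumed not to be bound.

```scheme
;; getenv/default is a hypothetical helper (not part of Bigloo).
;; It returns the value of the environment variable name,
;; or default when getenv returns #f.
(define (getenv/default name default)
  (let ((value (getenv name)))
    (if value value default)))

;; Falls back to "/" when HOME is not bound.
(getenv/default "HOME" "/")

;; Assuming this variable is not bound, the default is returned.
(getenv/default "BIGLOO_EXAMPLE_UNSET" "fallback")
```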
putenv string val | bigloo procedure |
Adds or modifies the global environment variable string so that
it is bound to val after the call. This facility is not supported
by all back-ends. In particular, the JVM back-end does not support it.
|
Returns the current date in a string . See also Date.
|
Sleeps for at least ms microseconds.
|
command-line | bigloo procedure |
Returns a list of strings which are the Unix command line arguments.
|
executable-name | bigloo procedure |
Returns the name of the running executable.
|
Gives the OS class (e.g. unix).
|
Gives the OS name (e.g. Linux).
|
Gives the host architecture (e.g. i386).
|
os-version | bigloo procedure |
Gives the operating system version (e.g. RedHat 2.0.27).
|
Gives the regular temporary directory (e.g. /tmp).
|
file-separator | bigloo procedure |
Gives the operating system file separator (e.g. #\/).
|
path-separator | bigloo procedure |
Gives the operating system file path separator (e.g. #\:).
|
For additional functions (such as directory->list )
see Input and Output.
unix-path->list | bigloo procedure |
Converts a Unix path to a Bigloo list of strings.
(unix-path->list ".") => (".")
(unix-path->list ".:/usr/bin") => ("." "/usr/bin")
|
|
Returns the fully qualified name of the current host.
|
See Input and Output for file and directory handling. This
section only deals with name handling. Four procedures exist to
manipulate Unix filenames.
basename string | bigloo procedure |
Returns a copy of string where the longest prefix ending in / is
deleted if any existed.
|
prefix string | bigloo procedure |
Returns a copy of string where the suffix starting with
the char #\. is deleted. If no such suffix is found,
the result of prefix is a copy of string . For
instance:
(prefix "foo.scm")
=> "foo"
(prefix "./foo.scm")
=> "./foo"
(prefix "foo.tar.gz")
=> "foo.tar"
|
|
suffix string | bigloo procedure |
Returns a new string which is the suffix of string . If no
suffix is found, this function returns an empty string. For instance,
(suffix "foo.scm")
=> "scm"
(suffix "./foo.scm")
=> "scm"
(suffix "foo.tar.gz")
=> "gz"
|
|
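prefix and suffix combine naturally to change or test a file extension. The sketch below relies on the results shown in the examples above; replace-suffix and scheme-source? are hypothetical helper names.

```scheme
;; Hypothetical helpers built on prefix and suffix.

;; Replaces the last extension of file-name with new-suffix.
(define (replace-suffix file-name new-suffix)
  (string-append (prefix file-name) "." new-suffix))

;; Tests whether file-name has the extension "scm".
(define (scheme-source? file-name)
  (string=? (suffix file-name) "scm"))

(replace-suffix "foo.scm" "o")        ; => "foo.o"
(replace-suffix "foo.tar.gz" "bz2")   ; => "foo.tar.bz2"
(scheme-source? "./foo.scm")          ; => #t
(scheme-source? "foo.tar.gz")         ; => #f
```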
dirname string | bigloo procedure |
Returns a new string which is the directory component of string .
For instance:
(dirname "abc/def/ghi")
=> "abc/def"
(dirname "abc")
=> "."
(dirname "abc/")
=> "abc"
(dirname "/abc")
=> "/"
|
|
Returns the current working directory.
|
chdir dir-name | bigloo procedure |
Changes the current directory to dir-name . On success, chdir
returns #t . On failure it returns #f .
|
make-file-name dir-name name | bigloo procedure |
Make an absolute file-name from a directory name dir-name and a relative
name name .
|
make-file-path dir-name name . names | bigloo procedure |
Make an absolute file-name from a directory name dir-name and the relative
names name and names.
|
find-file/path name path | bigloo procedure |
Searches, in sequence, the directories of the list path for the file
name. If name is an absolute name, then path is not
used to find the file. If name is a relative name, the function
make-file-name is used to build an absolute name from name and
each directory in path. The current directory is not included
automatically in path. Consequently, to check the
current directory one may add "." to the path list. On
success, the absolute file name is returned. On failure,
#f is returned. Example:
(find-file/path "/etc/passwd" '("/toto" "/titi"))
=> "/etc/passwd"
(find-file/path "passwd" '("/toto" "/etc"))
=> "/etc/passwd"
(find-file/path "pass-wd" '("." "/etc"))
=> #f
|
|
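Combined with unix-path->list, find-file/path can look up a command along a Unix search path. The following is only a sketch: the helper name which is hypothetical, and it assumes that getenv returns a string when the variable is set.

```scheme
;; which is a hypothetical helper, not a Bigloo procedure.
;; It returns the absolute name of NAME found along $PATH, or #f.
(define (which name)
   (let ((path (getenv "PATH")))
      (and (string? path)
           (find-file/path name (unix-path->list path)))))

(print (which "ls"))
```

As noted above, the current directory is searched only if "." appears in the path list.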
make-static-library-name name | bigloo procedure |
Make a static library name from
name by adding the static library regular suffix.
|
make-shared-library-name name | bigloo procedure |
Make a shared library name from
name by adding the shared library regular suffix.
|
file-exists? string | bigloo procedure |
This procedure returns #t if the file string exists. Otherwise
it returns #f .
|
delete-file string | bigloo procedure |
Deletes the file named string. The result of this procedure
is #f if the operation succeeded. The result is #t otherwise.
|
rename-file string1 string2 | bigloo procedure |
Renames the file string1 as string2 . The two files have to
be located on the same file system. If the renaming succeeds, the result
is #t , otherwise it is #f .
|
copy-file string1 string2 | bigloo procedure |
Copies the file string1 into string2 . If the copy succeeds,
the result is #t , otherwise it is #f .
|
directory? string | bigloo procedure |
This procedure returns #t if the file string exists and is a
directory. Otherwise it returns #f .
|
make-directory string | bigloo procedure |
Creates a new directory named string . It returns #t if the
directory was created. It returns #f otherwise.
|
make-directories string | bigloo procedure |
Creates a new directory named string , including any necessary
but nonexistent parent directories. It returns #t if the
directory was created. It returns #f otherwise. Note that
if this operation fails it may have succeeded in creating some
of the necessary parent directories.
|
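As a small sketch of how the boolean results of directory? and make-directories can be combined (the directory name below is an arbitrary example):

```scheme
;; Ensure that a directory exists, creating missing parents if needed.
;; The path is an arbitrary example, not a Bigloo convention.
(let ((dir "/tmp/bigloo-demo/a/b"))
   (if (or (directory? dir) (make-directories dir))
       (print "ready: " dir)
       (print "could not create " dir)))
```

The directory? test comes first because make-directories returns #t only when it actually created the directory.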
delete-directory string | bigloo procedure |
Deletes the directory named string . The directory must be empty
in order to be deleted. The result of this procedure is unspecified.
|
directory->list string | bigloo procedure |
If file string exists and is a directory, this function returns the
list of files in string .
|
file-modification-time string | bigloo procedure |
The date (in seconds) of the last modification of file string. The
number of seconds is represented by a value that may be converted into
a date by the means of seconds->date (see Date).
|
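For instance, the modification time can be made readable by chaining the conversions described in this chapter. This is a sketch; print-mtime is a hypothetical helper name.

```scheme
;; print-mtime is a hypothetical helper, not a Bigloo procedure.
;; It converts the modification time (seconds) into a date, then a string.
(define (print-mtime file)
   (if (file-exists? file)
       (print file ": "
              (date->string (seconds->date (file-modification-time file))))
       (print file ": no such file")))

(print-mtime "/etc/passwd")
```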
file-size string | bigloo procedure |
Returns the size (in bytes) for file string .
|
chmod string [option] | bigloo procedure |
Change the access mode of the file named string . The option
must be either a list of the following symbols read , write
and execute or an integer. If the operation succeeds, chmod
returns #t . It returns #f otherwise.
Example:
(chmod (make-file-name (getenv "HOME") ".bigloorc") 'read 'write)
(chmod (make-file-name (getenv "HOME") ".bigloorc") #o777)
|
|
Bigloo provides access to Unix-like processes as first class
objects. The implementation and this documentation are, to a great
extent, copies of the STk [Gallesio95] process
support. Basically, a process contains four pieces of information: the
standard Unix process identification (aka PID) and the three standard files of
the process.
run-process command arg... | bigloo procedure |
run-process creates a new process and runs the executable specified
in command. The args correspond to the command line arguments.
When a process completes its execution, ports that are not associated with
a pipe are automatically closed. Pipe-associated ports have to be explicitly
closed by the program. The following keyword arguments have a special meaning:
input: redirects the standard input file of the process.
Redirection can come from a file or from a pipe. To redirect the standard
input from a file, the name of this file must be specified after input:.
Use the special keyword pipe: to redirect the standard input
from a pipe.
output: redirects the standard output file of the
process. Redirection can go to a file or to a pipe. To redirect the
standard output to a file, the name of this file must be specified
after output:. Use the special keyword pipe: to redirect the
standard output to a pipe.
error: redirects the standard error file of the
process. Redirection can go to a file or to a pipe. To redirect the
standard error to a file, the name of this file must be specified
after error:. Use the special keyword pipe: to redirect the
standard error to a pipe.
wait: must be followed by a boolean value. This value
specifies whether the process must be run asynchronously or not. By
default, the process is run asynchronously (i.e. wait: is
#f).
host: must be followed by a string. This string represents the
name of the machine on which the command must be executed. This
option uses the external command rsh. The shell variable PATH
must be correctly set for accessing it without specifying its absolute
path.
fork: must be followed by a boolean value. This value
specifies whether the process must substitute the current execution. That is,
if the value is #t, a new process is spawned; otherwise, the current
execution is stopped and replaced by the execution of command.
env: must be followed by a string of
the form var=val. This binds an environment variable
in the spawned process. A run-process command may contain several
env: arguments. The variables of the current process are
also passed to the new process.
The following example launches a process which executes the Unix command
ls
with the arguments -l and /bin. The lines printed by this command
are stored in the file /tmp/X.
(run-process "ls" "-l" "/bin" output: "/tmp/X")
|
The same example with a pipe for output:
(let* ((proc (run-process "ls" "-l" "/bin" output: pipe:))
(port (process-output-port proc)))
(let loop ((line (read-line port)))
(if (eof-object? line)
(close-input-port port)
(begin
(print line)
(loop (read-line port))))))
|
One should note that the same program can be written without explicit
process handling, by making use of the | notation for
open-input-file.
(let ((port (open-input-file "| ls -l /bin")))
(let loop ((line (read-line port)))
(if (eof-object? line)
(close-input-port port)
(begin
(print line)
(loop (read-line port))))))
|
Both input and output ports can be piped:
(let* ((proc (run-process "/usr/bin/dc" output: pipe: input: pipe:))
(inport (process-input-port proc))
(port (process-output-port proc)))
(fprint inport "16 o")
(fprint inport "16 i")
(fprint inport "10")
(fprint inport "10")
(fprint inport "+ p")
(flush-output-port inport)
(let loop ((line (read-line port)))
(if (eof-object? line)
(close-input-port port)
(begin
(print line)
(loop (read-line port)))))) -| 20
|
Note: The call to flush-output-port is mandatory in order
for the dc process to receive its input characters.
Note: Thanks to Todd Dukes for the example and the suggestion
of including it in this documentation.
|
process? obj | bigloo procedure |
Returns #t if obj is a process, otherwise returns #f .
|
process-alive? process | bigloo procedure |
Returns #t if process is currently running, otherwise
returns #f .
|
close-process-ports process | bigloo procedure |
Close the three ports associated with a process. In general the ports should
not be closed before the process is terminated.
|
process-pid process | bigloo procedure |
Returns an integer value which represents the Unix identification (PID) of
the process .
|
process-input-port process | bigloo procedure |
process-output-port process | bigloo procedure |
process-error-port process | bigloo procedure |
Return the file port associated with the standard input, output, and
error of process, respectively, or #f if there is no such port.
Note that the returned port is opened for reading when calling
process-output-port or process-error-port .
It is opened for writing when calling process-input-port .
|
process-wait process | bigloo procedure |
This function stops the current process until process completion.
This function returns #f when process is already terminated. It
returns #t otherwise.
|
process-exit-status process | bigloo procedure |
This function returns the exit status of process if it has
finished its execution. It returns #f otherwise.
|
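As a sketch of how wait: and process-exit-status fit together, the example below runs the command synchronously and then inspects its status. It assumes a Unix system where ls is found through PATH; the output file name is the one used in the earlier example.

```scheme
;; Run ls synchronously (wait: #t), so the process has finished
;; by the time run-process returns and its exit status is available.
(let ((proc (run-process "ls" "-l" "/bin" output: "/tmp/X" wait: #t)))
   (print "alive?      " (process-alive? proc))
   (print "exit status " (process-exit-status proc)))
```

With the default asynchronous mode, one would call process-wait before reading the exit status, since process-exit-status returns #f while the process is still running.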
process-send-signal process s | bigloo procedure |
Sends the signal whose integer value is s to process . Value
of s is system dependent. The result of process-send-signal
is undefined.
|
process-kill process | bigloo procedure |
This function brutally kills process . The result of process-kill
is undefined.
|
process-stop process | bigloo procedure |
process-continue process | bigloo procedure |
Those procedures are only available on systems that support job control.
The function process-stop stops the execution of process and
process-continue resumes its execution.
|
process-list | bigloo procedure |
This function returns the list of processes which are currently running
(i.e. alive).
|
Bigloo defines sockets, on systems that support them, as first class objects.
Sockets permit processes to communicate even if they are on different
machines. Sockets are useful for creating client-server applications.
The implementation and this documentation are, to a great
extent, copies of the STk [Gallesio95] socket support.
make-client-socket hostname port-number [buffered] | bigloo procedure |
make-client-socket returns a new socket object. This socket establishes
a link between the running application and the application listening on
port port-number of hostname. If optional argument buffered is #f then
the input port associated with the socket is unbuffered. This is useful for
socket clients connected to servers that do not emit a #\Newline character
after emissions. If optional argument buffered is missing or is not
#f, the input port uses a buffer.
When a socket is used in unbuffered mode, the characters available on
the input port must be read exclusively with read-char
or read-line. It is forbidden to use read or any regular
grammar. This limitation is imposed by Rgc (see Regular Parsing), which
intrinsically associates buffers with regular grammars. If the Rgc
implementation is improved in a coming version, this restriction will
be lifted.
|
socket? obj | bigloo procedure |
socket-server? obj | bigloo procedure |
socket-client? obj | bigloo procedure |
Return #t if obj is a socket, a socket server, or a socket client, respectively.
Otherwise returns #f . Socket servers and socket clients are
sockets.
|
socket-hostname socket | bigloo procedure |
Returns a string which contains the name of the distant host attached to
socket . If socket has been created with make-client-socket
this procedure returns the official name of the distant machine used for
connection. If socket has been created with make-server-socket ,
this function returns the official name of the client connected to the socket.
If no client has used the socket yet, this function returns #f.
|
socket-host-address socket | bigloo procedure |
Returns a string which contains the IP number of
the distant host attached to socket . If socket has been
created with make-client-socket this procedure returns the
IP number of the distant machine used for connection. If
socket has been created with make-server-socket , this
function returns the address of the client connected to the
socket. If no client has used the socket yet, this function returns
#f.
|
socket-local-address socket | bigloo procedure |
Returns a string which contains the IP number of
the local host attached to socket .
|
socket-port-number socket | bigloo procedure |
Returns the integer number of the port used for socket .
|
socket-input socket | bigloo procedure |
socket-output socket | bigloo procedure |
Returns the file port associated for reading or writing with the program
connected with socket . If no connection has already been established,
these functions return #f .
The following example shows how to make a client socket. Here we create a
socket on port 13 of the machine ``kaolin.unice.fr'':
(let ((s (make-client-socket "kaolin.unice.fr" 13)))
(print "Time is: " (read-line (socket-input s)))
(socket-shutdown s))
|
|
make-server-socket [port-number] | bigloo procedure |
make-server-socket returns a new socket object. If port-number
is specified, the socket is listening on the specified port; otherwise, the
communication port is chosen by the system.
|
socket-accept socket [buffered] | bigloo procedure |
The function socket-accept replaces the former Bigloo functions
socket-accept-connection and socket-dup .
socket-accept waits for a client connection on the given
socket . It returns a client-socket . If no client is
already waiting for a connection, this procedure blocks its caller;
otherwise, the first connection request on the queue of pending
connections is connected to socket . This procedure must be
called on a server socket created with make-server-socket . If
optional argument buffered is #f then the input port
associated with the socket is unbuffered. This is useful for socket
clients connected to servers that do not emit #\Newline character
after emissions. If optional argument buffered is missing or is
not #f, the input port uses a buffer.
Note: When a socket is used in unbuffered mode, the characters
available on the input port must be read exclusively with
read-char or read-line. It is forbidden to use read
or any regular grammar. This limitation is imposed by Rgc (see
Regular Parsing), which intrinsically associates buffers with regular
grammars. If the Rgc implementation is improved in a coming
version, this restriction will be lifted.
The following example is a simple server which waits for a connection
on port 1234. Once the connection with the
distant program is established, we read a line on the input port
associated with the socket and we write the length of this line on its
output port.
(let* ((s (make-server-socket 1234))
(s2 (socket-accept s)))
(let ((l (read-line (socket-input s2))))
(fprint (socket-output s2) "Length is: " (string-length l))
(flush-output-port (socket-output s2)))
(socket-close s2)
(socket-shutdown s))
|
|
socket-accept-connection socket [buffered] | bigloo procedure |
socket-accept-connection waits for a client connection on the
given socket . If no client is already waiting for a connection,
this procedure blocks its caller; otherwise, the first connection
request on the queue of pending connections is connected to
socket .
Note: This function is part of the old Bigloo API. It was
intended to be used in conjunction with
socket-dup. The new API made of
socket-accept and socket-close should be preferred.
This procedure must be called on a server socket created
with make-server-socket . If optional argument buffered is
#f then the input port associated with the socket is
unbuffered. This is useful for socket clients connected to servers that
do not emit a #\Newline character after emissions. If optional argument
buffered is missing or is not #f, the input port uses a
buffer. The result of socket-accept-connection is undefined.
Note: When a socket is used in unbuffered mode, the characters
available on the input port must be read exclusively with
read-char or read-line. It is forbidden to use read
or any regular grammar. This limitation is imposed by Rgc (see
Regular Parsing), which intrinsically associates buffers with regular
grammars. If the Rgc implementation is improved in a coming
version, this restriction will be lifted.
The following example is a simple server which waits for a connection
on port 1234. Once the connection with the
distant program is established, we read a line on the input port
associated with the socket and we write the length of this line on its
output port.
(let ((s (make-server-socket 1234)))
(socket-accept-connection s)
(let ((l (read-line (socket-input s))))
(fprint (socket-output s) "Length is: " (string-length l))
(flush-output-port (socket-output s)))
(socket-shutdown s))
|
|
socket-close socket | bigloo procedure |
The function socket-close closes the connection established with
a socket-client .
|
socket-shutdown socket [close] | bigloo procedure |
socket-shutdown shuts down the connection associated with socket.
close is a boolean; it indicates whether the socket must be closed
when the connection is destroyed. Closing the socket forbids further
connections on the same port with the socket-accept-connection
procedure. Omitting a value for close implies the closing of the socket.
The result of socket-shutdown is undefined.
The following example shows a simple server: when there is a new connection
on port number 1234, the server displays the first line sent to it by the
client, discards the others, and goes back to waiting for further client connections.
(let ((s (make-server-socket 1234)))
(let loop ()
(socket-accept-connection s)
(print "I've read: " (read-line (socket-input s)))
(socket-shutdown s #f)
(loop)))
|
|
socket-down? socket | bigloo procedure |
Returns #t if socket has been previously closed
with socket-shutdown . It returns #f otherwise.
|
socket-dup socket | bigloo procedure |
Returns a copy of socket .
Note: This function is part of the old Bigloo
API. It was intended to be used in conjunction with
socket-accept-connection . The new API made of socket-accept
and socket-close should be preferred.
The original and the copy socket can be used interchangeably. However,
if a new connection is accepted on one socket, the characters
exchanged on this socket are not visible on the other socket.
Duplicating a socket is useful when a server must accept multiple
simultaneous connections. The following example creates a server
listening on port 1234. This server is duplicated and, once two
clients are present, a message is sent on both connections.
(define s1 (make-server-socket 1234))
(define s2 (socket-dup s1))
(socket-accept-connection s1)
(socket-accept-connection s2)
;; blocks until two clients are present
(display #"Hello,\n" (socket-output s1))
(display #"world\n" (socket-output s2))
(flush-output-port (socket-output s1))
(flush-output-port (socket-output s2))
|
|
Here is another example of making use of sockets:
(define s1 (make-server-socket))
(define s2 #unspecified)
(dynamic-wind
;; Init: Launch an xterm with telnet running
;; on the s1 listening port and connect
(lambda ()
(run-process "/usr/X11R6/bin/xterm" "-display" ":0" "-e" "telnet" "localhost"
(number->string (socket-port-number s1)))
(set! s2 (socket-accept s1))
(display #"\nWelcome on the socket REPL.\n\n> " (socket-output s2))
(flush-output-port (socket-output s2)))
;; Action: A toplevel like loop
(lambda ()
(let loop ()
(let ((obj (eval (read (socket-input s2)))))
(fprint (socket-output s2) "; Result: " obj)
(display "> " (socket-output s2))
(flush-output-port (socket-output s2))
(loop))))
;; Termination: We go here when
;; -a: an error occurs
;; -b: connection is closed
(lambda ()
(print #"Shutdown ......\n")
(socket-close s2)
(socket-shutdown s1)))
|
Here is a second example that uses sockets. It implements
a client-server architecture and it uses unbuffered
(see socket-accept) input ports.
First, here is the code of the client:
(module client)
(let* ((s (make-client-socket "localhost" 8080 #f))
(p (socket-output s)))
(display "string" p)
(newline p)
(display "abc" p)
(flush-output-port p)
(let loop ()
(loop)))
|
Then, here is the code of the server:
(module server)
(let* ((s (make-server-socket 8080))
(s2 (socket-accept s #f)))
(let ((pin (socket-input s2)))
(let loop ()
(display (read-char pin))
(flush-output-port (current-output-port))
(loop))))
|
Finally, here is the source code for a server waiting for multiple
consecutive connections:
(define (main argv)
(let ((n (if (pair? (cdr argv))
(string->integer (cadr argv))
10))
(s (make-server-socket)))
(print "s: " s)
(let loop ((i 0))
(if (<fx i n)
(let ((s2 (socket-accept s)))
(print "i: " i " " s2)
(print (read-line (socket-input s2)))
(socket-close s2)
(loop (+fx i 1)))
(socket-shutdown s)))))
|
date? obj | bigloo procedure |
Returns #t if and only if obj is a date as returned
by make-date , current-date , or seconds->date . It
returns #f otherwise.
|
make-date sec min hour day mon year [timezone] [dst] | bigloo procedure |
Creates a date object from the integer values passed as arguments.
Example:
(write (make-date 0 22 17 5 2 2003 0))
-| #<date:Wed Feb 5 17:22:00 2003>
|
The argument dst is either -1 when the information is not
available, 0 when daylight saving is disabled, or 1 when daylight
saving is enabled.
|
date-copy date [s] [m] [h] [d] [m] [year] | bigloo procedure |
Creates a new date from the argument date .
Example:
(date-copy (current-date) 1 0 0)
|
|
current-date | bigloo procedure |
Returns a date object representing the current date.
|
current-seconds | bigloo procedure |
Returns an elong integer representing the current date expressed
in seconds.
|
date->seconds | bigloo procedure |
seconds->date | bigloo procedure |
Convert between date and elong.
|
date->string date | bigloo procedure |
date->utc-string date | bigloo procedure |
seconds->string elong | bigloo procedure |
seconds->utc-string elong | bigloo procedure |
Construct a textual representation of the date passed as argument.
|
date-second date | bigloo procedure |
Returns the number of seconds of a date, in the range 0...59 .
|
date-minute date | bigloo procedure |
Returns the minute of a date, in the range 0...59 .
|
date-hour date | bigloo procedure |
Returns the hour of a date, in the range 0...23 .
|
date-day date | bigloo procedure |
Returns the day of a date, in the range 1...31 .
|
date-wday date | bigloo procedure |
Returns the week day of a date, in the range 1...7 .
|
date-yday date | bigloo procedure |
Returns the year day of a date, in the range 1...366 .
|
date-month date | bigloo procedure |
Returns the month of a date, in the range 1...12 .
|
date-year date | bigloo procedure |
Returns the year of a date.
|
date-timezone date | bigloo procedure |
Returns the timezone of a date.
|
date-is-dst date | bigloo procedure |
Returns -1 if the information is not available, 0 if the
date does not contain a daylight saving adjustment, 1 if it
contains a daylight saving adjustment.
|
+second elong1 elong2 | bigloo procedure |
*second elong1 elong2 | bigloo procedure |
-second elong1 elong2 | bigloo procedure |
=second elong1 elong2 | bigloo procedure |
>second elong1 elong2 | bigloo procedure |
>=second elong1 elong2 | bigloo procedure |
<second elong1 elong2 | bigloo procedure |
<=second elong1 elong2 | bigloo procedure |
Arithmetic operators on seconds.
|
integer->second | bigloo procedure |
Converts a Bigloo fixnum integer into a second number.
|
day-seconds | bigloo procedure |
Returns the number of seconds contained in one day.
|
day-name int | bigloo procedure |
day-aname int | bigloo procedure |
Return the name and the abbreviated name of a week day.
|
month-name int | bigloo procedure |
month-aname int | bigloo procedure |
Return the name and the abbreviated name of a month.
|
leap-year? int | bigloo procedure |
Returns #t if and only if the year int is a leap year.
Returns #f otherwise.
|
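The date accessors above can be combined as in the following sketch. It assumes that day-name and month-name accept the integers returned by date-wday and date-month; the text above gives those ranges but does not state this explicitly.

```scheme
;; Describe the current date using the accessors documented above.
;; Assumption: day-name and month-name take the values returned by
;; date-wday (1...7) and date-month (1...12).
(let ((d (current-date)))
   (print (day-name (date-wday d)) " "
          (date-day d) " "
          (month-name (date-month d)) " "
          (date-year d))
   (print "leap year? " (leap-year? (date-year d))))
```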
5.11 Posix Regular Expressions
|
This whole section has been written by Dorai Sitaram.
It consists of the documentation of the pregexp package that may be
found at http://www.ccs.neu.edu/~dorai/pregexp/pregexp.html.
The regexp notation supported is modeled on Perl's, and includes such
powerful directives as numeric and nongreedy quantifiers, capturing and
non-capturing clustering, POSIX character classes, selective case- and
space-insensitivity, backreferences, alternation, backtrack pruning,
positive and negative lookahead and lookbehind, in addition to the more
basic directives familiar to all regexp users. A regexp is a
string that describes a pattern. A regexp matcher tries to match
this pattern against (a portion of) another string, which we will call
the text string. The text string is treated as raw text and not
as a pattern. Most of the characters in a regexp pattern are meant to match
occurrences of themselves in the text string. Thus, the pattern
"abc" matches a string that contains the characters a , b ,
c in succession. In the regexp pattern, some characters act as
metacharacters, and some character sequences act as
metasequences. That is, they specify something
other than their literal selves. For example, in the
pattern "a.c" , the characters a and c do
stand for themselves but the metacharacter .
can match any character (other than
newline). Therefore, the pattern "a.c"
matches an a , followed by any character,
followed by a c. If we needed to match the character . itself,
we escape it, i.e., precede it with a backslash
(\). The character sequence \. is thus a
metasequence, since it doesn't match itself but rather
just . . So, to match a followed by a literal
. followed by c, we use the regexp pattern
"a\\.c".
Another example of a metasequence is \t, which is a
readable way to represent the tab character.
We will call the string representation of a regexp the
U-regexp, where U can be taken to mean Unix-style or
universal, because this
notation for regexps is universally familiar. Our
implementation uses an intermediate tree-like
representation called the S-regexp, where S
can stand for Scheme, symbolic, or
s-expression. S-regexps are more verbose
and less readable than U-regexps, but they are much
easier for Scheme's recursive procedures to navigate.
5.11.1 Regular Expressions Procedures
|
Five procedures, pregexp, pregexp-match-positions,
pregexp-match, pregexp-replace, and
pregexp-replace*, enable compilation and matching of regular
expressions.
pregexp U-regexp | bigloo procedure |
The procedure pregexp takes a U-regexp, which is a
string, and returns an S-regexp, which is a tree.
(pregexp "c.r") => (:sub (:or (:seq #\c :any #\r)))
|
There is rarely any need to look at the S-regexps returned by pregexp .
|
pregexp-match-positions regexp string | bigloo procedure |
The procedure pregexp-match-positions takes a
regexp pattern and a text string, and returns a match
if the pattern matches the text string.
The pattern may be either a U- or an S-regexp.
(pregexp-match-positions will internally compile a
U-regexp to an S-regexp before proceeding with the
matching. If you find yourself calling
pregexp-match-positions repeatedly with the same
U-regexp, it may be advisable to explicitly convert the
latter into an S-regexp once beforehand, using
pregexp , to save needless recompilation.)
pregexp-match-positions returns #f if the pattern did not
match the string; and a list of index pairs if it
did match. Eg,
(pregexp-match-positions "brain" "bird")
=> #f
(pregexp-match-positions "needle" "hay needle stack")
=> ((4 . 10))
|
In the second example, the integers 4 and 10 identify the substring that
was matched. 4 is the starting
(inclusive) index and 10 the ending (exclusive) index of
the matching substring.
(substring "hay needle stack" 4 10)
=> "needle"
|
Here, pregexp-match-positions 's return list contains only
one index pair, and that pair represents the entire
substring matched by the regexp. When we discuss
subpatterns later, we will see how a single match
operation can yield a list of submatches.
pregexp-match-positions takes optional third
and fourth arguments that specify the indices of
the text string within which the matching should
take place.
(pregexp-match-positions "needle"
"his hay needle stack -- my hay needle stack -- her hay needle stack"
24 43)
=> ((31 . 37))
|
Note that the returned indices are still reckoned
relative to the full text string.
|
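The advice about repeated matching can be sketched as follows: the U-regexp is compiled once with pregexp and the resulting S-regexp is reused for every text string. The variable name needle-rx and the sample strings are illustrative only.

```scheme
;; Compile the pattern once, then reuse the S-regexp for each match,
;; so that the U-regexp is not recompiled on every call.
(define needle-rx (pregexp "needle"))

(for-each (lambda (text)
             (print text " -> " (pregexp-match-positions needle-rx text)))
          '("hay needle stack" "bird"))
```

The first string yields the index pair shown earlier; the second yields #f.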
pregexp-match regexp string | bigloo procedure |
The procedure pregexp-match is called like
pregexp-match-positions
but instead of returning index pairs it returns the
matching substrings:
(pregexp-match "brain" "bird")
=> #f
(pregexp-match "needle" "hay needle stack")
=> ("needle")
|
pregexp-match also takes optional third and
fourth arguments, with the same meaning as does
pregexp-match-positions .
|
pregexp-replace regexp string1 string2 | bigloo procedure |
The procedure pregexp-replace replaces the
matched portion of the text string by another
string. The first argument is the regexp,
the second the text string, and the third
is the insert string (string to be inserted).
(pregexp-replace "te" "liberte" "ty")
=> "liberty"
|
If the pattern doesn't occur in the text string, the returned string is
identical (eq? ) to the text string.
|
pregexp-replace* regexp string1 string2 | bigloo procedure |
The procedure pregexp-replace* replaces all matches in the
text string by the insert string:
(pregexp-replace* "te" "liberte egalite fraternite" "ty")
=> "liberty egality fratyrnity"
|
As with pregexp-replace , if the pattern doesn't occur in the text
string, the returned string is identical (eq? ) to the text string.
|
pregexp-split regexp string | bigloo procedure |
The procedure pregexp-split takes two arguments, a
regexp pattern and a text string, and returns a list of
substrings of the text string, where the pattern identifies the
delimiter separating the substrings.
(pregexp-split ":" "/bin:/usr/bin:/usr/bin/X11:/usr/local/bin")
=> ("/bin" "/usr/bin" "/usr/bin/X11" "/usr/local/bin")
(pregexp-split " " "pea soup")
=> ("pea" "soup")
|
If the first argument can match an empty string, then
the list of all the single-character substrings is returned.
(pregexp-split "" "smithereens")
=> ("s" "m" "i" "t" "h" "e" "r" "e" "e" "n" "s")
|
To identify one-or-more spaces as the delimiter,
take care to use the regexp " +" , not " *" .
(pregexp-split " +" "split pea soup")
=> ("split" "pea" "soup")
(pregexp-split " *" "split pea soup")
=> ("s" "p" "l" "i" "t" "p" "e" "a" "s" "o" "u" "p")
|
|
pregexp-quote string | bigloo procedure |
The procedure pregexp-quote takes an arbitrary string and
returns a U-regexp (string) that precisely represents it. In particular,
characters in the input string that could serve as regexp metacharacters are
escaped with a backslash, so that they safely match only themselves.
(pregexp-quote "cons")
=> "cons"
(pregexp-quote "list?")
=> "list\\?"
|
pregexp-quote is useful when building a composite regexp
from a mix of regexp strings and verbatim strings.
|
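As a sketch of such a composite regexp, the hypothetical helper starts-with-rx (not part of pregexp) anchors a verbatim string at the beginning of the text:

```scheme
;; starts-with-rx is a hypothetical helper, not a pregexp procedure.
;; The "^" is regexp syntax; the literal part is quoted so that
;; metacharacters such as ? match only themselves.
(define (starts-with-rx literal)
   (pregexp (string-append "^" (pregexp-quote literal))))

(pregexp-match (starts-with-rx "list?") "list? is a predicate")
   => ("list?")
```

Without pregexp-quote, the ? would be read as a quantifier making the preceding t optional, and the literal question mark would not be part of the match.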
5.11.2 Regular Expressions Pattern Language
|
Here is a complete description of the regexp pattern
language recognized by the pregexp procedures.
5.11.2.1 Basic assertions
The assertions ^ and $ identify the beginning and
the end of the text string respectively. They ensure that their
adjoining regexps match at one or other end of the text string.
Examples:
(pregexp-match-positions "^contact" "first contact") => #f
|
The regexp fails to match because contact does not occur at the beginning of the text string.
(pregexp-match-positions "laugh$" "laugh laugh laugh laugh") => ((18 . 23))
|
The regexp matches the last laugh .
The metasequence \b asserts that
a word boundary exists.
(pregexp-match-positions "yack\\b" "yackety yack") => ((8 . 12))
|
The yack in yackety doesn't end at a word boundary so it isn't matched. The second yack does and is.
The metasequence \B has the opposite effect to \b. It
asserts that a word boundary does not exist.
(pregexp-match-positions "an\\B" "an analysis") => ((3 . 5))
|
The an that doesn't end in a word boundary is matched.
5.11.2.2 Characters and character classes
Typically a character in the regexp matches the same character in the
text string. Sometimes it is necessary or convenient to use a regexp
metasequence to refer to a single character. Thus, metasequences
\n, \r, \t, and \. match the newline,
return, tab and period characters respectively.
The metacharacter period (.) matches
any character other than newline.
(pregexp-match "p.t" "pet") => ("pet")
|
It also matches pat, pit, pot, put, and p8t but not peat or pfffft.
A character class matches any one character from a set of
characters. A typical format for this is the bracketed character
class [ ... ], which matches any one character from the
non-empty sequence of characters enclosed within the
brackets. 5 Thus "p[aeiou]t" matches
pat, pet, pit, pot, put and nothing
else.
Inside the brackets, a hyphen (-) between two characters
specifies the ASCII range between the characters. Eg,
"ta[b-dgn-p]" matches tab, tac, tad,
tag, tan, tao, and tap.
An initial caret (^) after the left bracket inverts the set
specified by the rest of the contents, ie, it specifies the set of
characters other than those identified in the brackets. Eg,
"do[^g]" matches all three-character sequences starting with
do except dog.
Note that the metacharacter ^ inside brackets means something
quite different from what it means outside. Most other metacharacters
(., *, +, ?, etc) cease to be metacharacters
when inside brackets, although you may still escape them for peace of
mind. - is a metacharacter only when it's inside brackets, and
neither the first nor the last character.
Bracketed character classes cannot contain other bracketed character
classes (although they may contain certain other types of character classes
--- see below). Thus a left bracket ([) inside a bracketed
character class doesn't have to be a metacharacter; it can stand for
itself. Eg, "[a[b]" matches a, [, and b.
Furthermore, since empty bracketed character classes are disallowed, a
right bracket (]) immediately occurring after the opening left
bracket also doesn't need to be a metacharacter. Eg, "[]ab]"
matches ], a, and b.
5.11.2.3 Some frequently used character classes
Some standard character classes can be conveniently represented as
metasequences instead of as explicit bracketed expressions. \d
matches a digit ( [0-9] ); \s matches a whitespace
character; and \w matches a character that could be part of a
``word''. 6
The upper-case versions of these metasequences stand for the inversions
of the corresponding character classes. Thus \D matches a
non-digit, \S a non-whitespace character, and \W a
non-``word'' character.
Remember to include a double backslash when putting these metasequences
in a Scheme string:
(pregexp-match "\\d\\d" "0 dear, 1 have 2 read catch 22 before 9") => ("22")
|
These character classes can be used inside
a bracketed expression. Eg,
"[a-z\\d]" matches a lower-case letter
or a digit.
5.11.2.4 POSIX character classes
A POSIX character class is a special metasequence
of the form [: ... :] that can be used only
inside a bracketed expression. The POSIX classes
supported are
[:alnum:] letters and digits
[:alpha:] letters
[:algor:] the letters c , h , a and d
[:ascii:] 7-bit ascii characters
[:blank:] widthful whitespace, ie, space and tab
[:cntrl:] ``control'' characters, viz, those with code < 32
[:digit:] digits, same as \d
[:graph:] characters that use ink
[:lower:] lower-case letters
[:print:] ink-users plus widthful whitespace
[:space:] whitespace, same as \s
[:upper:] upper-case letters
[:word:] letters, digits, and underscore, same as \w
[:xdigit:] hex digits
|
For example, the regexp "[[:alpha:]_]" matches a letter or underscore.
(pregexp-match "[[:alpha:]_]" "--x--") => ("x")
(pregexp-match "[[:alpha:]_]" "--_--") => ("_")
(pregexp-match "[[:alpha:]_]" "--:--") => #f
|
The POSIX class notation is valid only inside a
bracketed expression. For instance, [:alpha:] ,
when not inside a bracketed expression, will not
be read as the letter class.
Rather it is (from previous principles) the character
class containing the characters : , a , l ,
p , h .
(pregexp-match "[:alpha:]" "--a--") => ("a")
(pregexp-match "[:alpha:]" "--_--") => #f
|
By placing a caret (^) immediately after
[: , you get the inversion of that POSIX
character class. Thus, [:^alpha:]
is the class containing all characters
except the letters.
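A short sketch of the inverted class in use, with arbitrary sample strings. As with the other POSIX classes, the inverted form has to sit inside a bracketed expression:

```scheme
;; [[:^alpha:]] matches any single character that is not a letter.
(pregexp-match "[[:^alpha:]]" "abc1")      ; => ("1")
;; Quantified, it matches a run of non-letters.
(pregexp-match "[[:^alpha:]]+" "abc--42")  ; => ("--42")
```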
5.11.2.5 Quantifiers
The quantifiers * , + , and ? match
respectively: zero or more, one or more, and zero or one instances of
the preceding subpattern.
(pregexp-match-positions "c[ad]*r" "cadaddadddr") => ((0 . 11))
(pregexp-match-positions "c[ad]*r" "cr") => ((0 . 2))
(pregexp-match-positions "c[ad]+r" "cadaddadddr") => ((0 . 11))
(pregexp-match-positions "c[ad]+r" "cr") => #f
(pregexp-match-positions "c[ad]?r" "cadaddadddr") => #f
(pregexp-match-positions "c[ad]?r" "cr") => ((0 . 2))
(pregexp-match-positions "c[ad]?r" "car") => ((0 . 3))
|
5.11.2.6 Numeric quantifiers
You can use braces to specify much finer-tuned quantification than is
possible with *, +, ?.
The quantifier {m} matches exactly m
instances of the preceding subpattern. m
must be a nonnegative integer.
The quantifier {m,n} matches at least m and at most
n instances. m and n are nonnegative integers with
m <= n. You may omit either or both numbers, in which case
m defaults to 0 and n to infinity.
It is evident that + and ? are abbreviations for
{1,} and {0,1} respectively. * abbreviates
{,}, which is the same as {0,}.
(pregexp-match "[aeiou]{3}" "vacuous") => ("uou")
(pregexp-match "[aeiou]{3}" "evolve") => #f
(pregexp-match "[aeiou]{2,3}" "evolve") => #f
(pregexp-match "[aeiou]{2,3}" "zeugma") => ("eu")
|
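Numeric quantifiers are handy for fixed-width fields. The sketch below uses an illustrative name, iso-date-re (not part of Bigloo), and also shows a quantifier whose upper bound is omitted:

```scheme
;; iso-date-re is an illustrative name, not part of Bigloo.
;; Exactly 4 digits, a dash, 2 digits, a dash, 2 digits.
(define iso-date-re "^\\d{4}-\\d{2}-\\d{2}$")

(pregexp-match iso-date-re "2024-01-31")  ; => ("2024-01-31")
(pregexp-match iso-date-re "24-1-31")     ; => #f

;; {2,} leaves n unspecified: two or more instances.
(pregexp-match "^a{2,}$" "aaaa")          ; => ("aaaa")
(pregexp-match "^a{2,}$" "a")             ; => #f
```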
5.11.2.7 Non-greedy quantifiers
The quantifiers described above are greedy, ie, they match the
maximal number of instances that would still lead to an overall match
for the full pattern.
(pregexp-match "<.*>" "<tag1> <tag2> <tag3>")
=> ("<tag1> <tag2> <tag3>")
|
To make these quantifiers non-greedy, append a ? to them.
Non-greedy quantifiers match the minimal number of instances needed to
ensure an overall match.
(pregexp-match "<.*?>" "<tag1> <tag2> <tag3>") => ("<tag1>")
|
The non-greedy quantifiers are respectively:
*? , +? , ?? , {m}? , {m,n}? .
Note the two uses of the metacharacter ? .
5.11.2.8 Clusters
Clustering, ie, enclosure within parens ( ... ) ,
identifies the enclosed subpattern as a single entity. It causes
the matcher to capture the submatch, or the portion of the
string matching the subpattern, in addition to the overall match.
(pregexp-match "([a-z]+) ([0-9]+), ([0-9]+)" "jan 1, 1970")
=> ("jan 1, 1970" "jan" "1" "1970")
|
Clustering also causes a following quantifier to treat
the entire enclosed subpattern as an entity.
(pregexp-match "(poo )*" "poo poo platter") => ("poo poo " "poo ")
|
The number of submatches returned is always equal to the number of
subpatterns specified in the regexp, even if a particular subpattern
happens to match more than one substring or no substring at all.
(pregexp-match "([a-z ]+;)*" "lather; rinse; repeat;")
=> ("lather; rinse; repeat;" " repeat;")
|
Here the *-quantified subpattern matches three times, but it is the last submatch that is returned.
It is also possible for a quantified subpattern to
fail to match, even if the overall pattern matches.
In such cases, the failing submatch is represented
by #f .
(define date-re
;match `month year' or `month day, year'.
;subpattern matches day, if present
(pregexp "([a-z]+) +([0-9]+,)? *([0-9]+)"))
(pregexp-match date-re "jan 1, 1970")
=> ("jan 1, 1970" "jan" "1," "1970")
(pregexp-match date-re "jan 1970")
=> ("jan 1970" "jan" #f "1970")
|
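Because the result list always has one slot per subpattern, the submatches can be picked out by position. The following sketch wraps the date regexp above in a hypothetical helper, parse-date (not part of Bigloo), that drops the overall match and keeps the three submatches:

```scheme
;; parse-date is a hypothetical helper built on the date regexp above.
(define (parse-date str)
  (let ((m (pregexp-match "([a-z]+) +([0-9]+,)? *([0-9]+)" str)))
    ;; m is #f on failure; otherwise (whole month day-or-#f year)
    (and m (list (cadr m) (caddr m) (cadddr m)))))

(parse-date "jan 1, 1970")  ; => ("jan" "1," "1970")
(parse-date "jan 1970")     ; => ("jan" #f "1970")
(parse-date "1970")         ; => #f
```

Callers have to test the day slot for #f before using it, since the optional subpattern may not have matched.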
5.11.2.9 Backreferences
Submatches can be used in the insert string argument of the procedures
pregexp-replace and pregexp-replace* . The insert string
can use \n as a backreference to refer back to the
nth submatch, ie, the substring that matched the nth
subpattern. \0 refers to the entire match, and it can also be
specified as \& .
(pregexp-replace "_(.+?)_"
"the _nina_, the _pinta_, and the _santa maria_"
"*\\1*")
=> "the *nina*, the _pinta_, and the _santa maria_"
(pregexp-replace* "_(.+?)_"
"the _nina_, the _pinta_, and the _santa maria_"
"*\\1*")
=> "the *nina*, the *pinta*, and the *santa maria*"
;recall: \S stands for non-whitespace character
(pregexp-replace "(\\S+) (\\S+) (\\S+)"
"eat to live"
"\\3 \\2 \\1")
=> "live to eat"
|
Use \\ in the insert string to specify a literal
backslash. Also, \$ stands for an empty string,
and is useful for separating a backreference \n
from an immediately following number.
Backreferences can also be used within the regexp
pattern to refer back to an already matched subpattern
in the pattern. \n stands for an exact repeat
of the nth submatch. 7
(pregexp-match "([a-z]+) and \\1"
"billions and billions")
=> ("billions and billions" "billions")
|
Note that the backreference is not simply a repeat of the previous subpattern. Rather it is a repeat of
the particular substring already matched by the
subpattern.
In the above example, the backreference can only match
billions. It will not match millions, even
though the subpattern it harks back to --- ([a-z]+)
--- would have had no problem doing so:
(pregexp-match "([a-z]+) and \\1"
"billions and millions")
=> #f
|
The following corrects doubled words:
(pregexp-replace* "(\\S+) \\1"
"now is the the time for all good men to to come to the aid of of the party"
"\\1")
=> "now is the time for all good men to come to the aid of the party"
|
The following marks all immediately repeating patterns
in a number string:
(pregexp-replace* "(\\d+)\\1"
"123340983242432420980980234"
"{\\1,\\1}")
=> "12{3,3}40983{24,24}3242{098,098}0234"
|
5.11.2.10 Non-capturing clusters
It is often required to specify a cluster
(typically for quantification) but without triggering
the capture of submatch information. Such
clusters are called non-capturing. In such cases,
use (?: instead of ( as the cluster opener. In
the following example, the non-capturing cluster
eliminates the ``directory'' portion of a given
pathname, and the capturing cluster identifies the
basename.
(pregexp-match "^(?:[a-z]*/)*([a-z]+)$"
"/usr/local/bin/mzscheme")
=> ("/usr/local/bin/mzscheme" "mzscheme")
|
5.11.2.11 Cloisters
The location between the ? and the : of a non-capturing
cluster is called a cloister. 8 You can put modifiers there
that will cause the enclustered subpattern to be treated specially. The
modifier i causes the subpattern to match
case-insensitively:
(pregexp-match "(?i:hearth)" "HeartH") => ("HeartH")
|
The modifier x causes the subpattern to match
space-insensitively, ie, spaces and
comments within the
subpattern are ignored. Comments are introduced
as usual with a semicolon ( ; ) and extend till
the end of the line. If you need
to include a literal space or semicolon in
a space-insensitized subpattern, escape it
with a backslash.
(pregexp-match "(?x: a lot)" "alot")
=> ("alot")
(pregexp-match "(?x: a \\ lot)" "a lot")
=> ("a lot")
(pregexp-match "(?x:
a \\ man \\; \\ ; ignore
a \\ plan \\; \\ ; me
a \\ canal ; completely
)"
"a man; a plan; a canal")
=> ("a man; a plan; a canal")
|
The global variable *pregexp-comment-char* contains the comment character ( #\; ).
For Perl-like comments,
(set! *pregexp-comment-char* #\#)
|
You can put more than one modifier in the cloister.
(pregexp-match "(?ix:
a \\ man \\; \\ ; ignore
a \\ plan \\; \\ ; me
a \\ canal ; completely
)"
"A Man; a Plan; a Canal")
=> ("A Man; a Plan; a Canal")
|
A minus sign before a modifier inverts its meaning.
Thus, you can use -i and -x in a
subcluster to overturn the insensitivities caused by an
enclosing cluster.
(pregexp-match "(?i:the (?-i:TeX)book)"
"The TeXbook")
=> ("The TeXbook")
|
This regexp will allow any casing for the and book but insists that TeX not be
differently cased.
5.11.2.12 Alternation
You can specify a list of alternate
subpatterns by separating them by | . The |
separates subpatterns in the nearest enclosing cluster
(or in the entire pattern string if there are no
enclosing parens).
(pregexp-match "f(ee|i|o|um)" "a small, final fee")
=> ("fi" "i")
(pregexp-replace* "([yi])s(e[sdr]?|ing|ation)"
"it is energising to analyse an organisation
pulsing with noisy organisms"
"\\1z\\2")
=> "it is energizing to analyze an organization
pulsing with noisy organisms"
|
Note again that if you wish
to use clustering merely to specify a list of alternate
subpatterns but do not want the submatch, use (?:
instead of ( .
(pregexp-match "f(?:ee|i|o|um)" "fun for all")
=> ("fo")
|
An important thing to note about alternation is that
the leftmost matching alternate is picked regardless of
its length. Thus, if one of the alternates is a prefix
of a later alternate, the latter may not have
a chance to match.
(pregexp-match "call|call-with-current-continuation"
"call-with-current-continuation")
=> ("call")
|
To allow the longer alternate to have a shot at
matching, place it before the shorter one:
(pregexp-match "call-with-current-continuation|call"
"call-with-current-continuation")
=> ("call-with-current-continuation")
|
In any case, an overall match for the entire regexp is
always preferred to an overall nonmatch. In the
following, the longer alternate still wins, because its
preferred shorter prefix fails to yield an overall
match.
(pregexp-match "(?:call|call-with-current-continuation) constrained"
"call-with-current-continuation constrained")
=> ("call-with-current-continuation constrained")
|
5.11.2.13 Backtracking
We've already seen that greedy quantifiers match
the maximal number of times, but the overriding priority
is that the overall match succeed. Consider
(pregexp-match "a*a" "aaaa")
|
The regexp consists of two subregexps, a* followed by a.
The subregexp a* cannot be allowed to match
all four a's in the text string "aaaa", even though
* is a greedy quantifier. It may match only the first
three, leaving the last one for the second subregexp.
This ensures that the full regexp matches successfully.
The regexp matcher accomplishes this via a process
called backtracking. The matcher
tentatively allows the greedy quantifier
to match all four a's, but then when it becomes
clear that the overall match is in jeopardy, it
backtracks to a less greedy match of
three a's. If even this fails, as in the
call
(pregexp-match "a*aa" "aaaa")
|
the matcher backtracks even further. Overall failure is conceded only when all possible backtracking
has been tried with no success.
Backtracking is not restricted to greedy quantifiers.
Nongreedy quantifiers match as few instances as
possible, and progressively backtrack to more and more
instances in order to attain an overall match. There
is backtracking in alternation too, as the more
rightward alternates are tried when locally successful
leftward ones fail to yield an overall match.
5.11.2.14 Disabling backtracking
Sometimes it is efficient to disable backtracking. For
example, we may wish to commit to a choice, or
we know that trying alternatives is fruitless. A
nonbacktracking regexp is enclosed in (?> ... ) .
(pregexp-match "(?>a+)." "aaaa")
=> #f
|
In this call, the subregexp (?>a+) greedily matches
all four a's, and is denied the opportunity to
backpedal. So the overall match is denied. The effect
of the regexp is therefore to match one or more a's
followed by something that is definitely non-a.
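The contrast with the ordinary backtracking form can be sketched as follows, using arbitrary sample strings:

```scheme
;; With backtracking, a+ gives back one a so that . can match it.
(pregexp-match "a+." "aaaa")      ; => ("aaaa")
;; Without backtracking, a+ keeps all four a's and . has nothing left.
(pregexp-match "(?>a+)." "aaaa")  ; => #f
;; A trailing non-a character lets the nonbacktracking form succeed.
(pregexp-match "(?>a+)." "aaab")  ; => ("aaab")
```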
5.11.2.15 Looking ahead and behind
You can have assertions in your pattern that look
ahead or behind to ensure that a subpattern does
or does not occur. These ``look around'' assertions are
specified by putting the subpattern checked for in a
cluster whose leading characters are: ?= (for positive
lookahead), ?! (negative lookahead), ?<=
(positive lookbehind), ?<! (negative lookbehind).
Note that the subpattern in the assertion does not
generate a match in the final result. It merely allows
or disallows the rest of the match.
5.11.2.16 Lookahead
Positive lookahead ( ?= ) peeks ahead to ensure that
its subpattern could match.
(pregexp-match-positions "grey(?=hound)"
"i left my grey socks at the greyhound")
=> ((28 . 32))
|
The regexp "grey(?=hound)" matches grey, but only if it is followed by hound. Thus, the first
grey in the text string is not matched.
Negative lookahead ( ?! ) peeks ahead
to ensure that its subpattern could not possibly match.
(pregexp-match-positions "grey(?!hound)"
"the gray greyhound ate the grey socks")
=> ((27 . 31))
|
The regexp "grey(?!hound)" matches grey, but only if it is not followed by hound. Thus
the grey just before socks is matched.
5.11.2.17 Lookbehind
Positive lookbehind ( ?<= ) checks that its subpattern could match
immediately to the left of the current position in
the text string.
(pregexp-match-positions "(?<=grey)hound"
"the hound in the picture is not a greyhound")
=> ((38 . 43))
|
The regexp (?<=grey)hound matches hound, but only if it is preceded by grey.
Negative lookbehind
( ?<! ) checks that its subpattern
could not possibly match immediately to the left.
(pregexp-match-positions "(?<!grey)hound"
"the greyhound in the picture is not a hound")
=> ((38 . 43))
|
The regexp (?<!grey)hound matches hound, but only if
it is not preceded by grey.
Lookaheads and lookbehinds can be convenient when they
are not confusing.
5.11.3 An Extended Example
|
Here's an extended example from Friedl that covers many of the features
described above. The problem is to fashion a regexp that will match any
and only IP addresses or dotted quads, ie, four numbers separated
by three dots, with each number between 0 and 255. We will use the
commenting mechanism to build the final regexp with clarity. First, a
subregexp n0-255 that matches 0 through 255.
(define n0-255
"(?x:
\\d ; 0 through 9
| \\d\\d ; 00 through 99
| [01]\\d\\d ;000 through 199
| 2[0-4]\\d ;200 through 249
| 25[0-5] ;250 through 255
)")
|
The first two alternates simply get all single- and
double-digit numbers. Since 0-padding is allowed, we
need to match both 1 and 01. We need to be careful
when getting 3-digit numbers, since numbers above 255
must be excluded. So we fashion alternates to get 000
through 199, then 200 through 249, and finally 250
through 255. 9
An IP-address is a string that consists of
four n0-255 s with three dots separating
them.
(define ip-re1
(string-append
"^" ;nothing before
n0-255 ;the first n0-255,
"(?x:" ;then the subpattern of
"\\." ;a dot followed by
n0-255 ;an n0-255,
")" ;which is
"{3}" ;repeated exactly 3 times
"$" ;with nothing following
))
|
Let's try it out.
(pregexp-match ip-re1 "1.2.3.4") => ("1.2.3.4")
(pregexp-match ip-re1 "55.155.255.265") => #f
|
which is fine, except that we also have
(pregexp-match ip-re1 "0.00.000.00") => ("0.00.000.00")
|
All-zero sequences are not valid IP addresses! Lookahead to the rescue.
Before starting to match ip-re1 , we look ahead to ensure we don't
have all zeros. We could use positive lookahead to ensure there
is a digit other than zero.
(define ip-re
(string-append
"(?=.*[1-9])" ;ensure there's a non-0 digit
ip-re1))
|
Or we could use negative lookahead to ensure that what's ahead isn't
composed of only zeros and dots.
(define ip-re
(string-append
"(?![0.]*$)" ;not just zeros and dots
;(note: dot is not metachar inside [])
ip-re1))
|
The regexp ip-re will match all and only valid IP addresses.
(pregexp-match ip-re "1.2.3.4") => ("1.2.3.4")
(pregexp-match ip-re "0.0.0.0") => #f
|
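In a program, the pattern would typically be compiled once and wrapped in a predicate. The sketch below assumes ip-re is the string defined above. The names ip-rx and valid-ip? are hypothetical, not part of Bigloo.

```scheme
;; ip-rx and valid-ip? are hypothetical names; ip-re is the string
;; defined in the text above.
(define ip-rx (pregexp ip-re))   ; compile once, reuse for every call

(define (valid-ip? str)
  ;; pregexp-match returns a list or #f; coerce the result to a boolean
  (and (pregexp-match ip-rx str) #t))

(valid-ip? "1.2.3.4")         ; => #t
(valid-ip? "0.0.0.0")         ; => #f
(valid-ip? "55.155.255.265")  ; => #f
```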
|