There are quite a few make-fooer functions hanging around. Now that regexp-position does caching, these are basically useless, but we've kept them around for backwards compatibility. Unfortunately, internally most of the functions are implemented in terms of make-regexp-positioner. To minimize the amount of rewriting, I've liberally applied seals and inline declarations so that make-regexp-positioner won't clobber all type information. The downside, of course, is that everything's sealed, but hey, no one ever subclassed [ed: specialized?] regexp-position anyway.
Parsing a regexp is not cheap, so we cache the parsed regexps and only parse a string if we haven't seen it before. Because in practice almost all regexp strings are string literals, we're free to choose \== or \= depending on whatever's fastest. However, because a string is parsed differently depending on whether the search is case sensitive or not, we also have to keep track of that information. (The case-dependent parse boils down to the parse creating a <character-set>, which must be either case sensitive or case insensitive.)
Note: Currently, only regexp-position uses this cache, because the other functions are still using make-regexp-positioner. With caching, that make-regexp-whatever stuff should probably go.
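The caching scheme described above can be pictured roughly as follows. This is only a sketch under stated assumptions, not the library's actual code: cached-parse, stand-in-parse, and the two table constants are hypothetical names, and it assumes <string-table> (a table whose string keys are compared by contents) is available, for example from a table-extensions module. Keeping one table per case-sensitivity setting is just one simple way to track that extra piece of information.

```dylan
// Hypothetical sketch of a parse cache; none of these names come from the library.
// Assumes <string-table> is imported into the enclosing module.

define constant $case-sensitive-cache = make(<string-table>);
define constant $case-insensitive-cache = make(<string-table>);

// Stand-in for the real (expensive) parser: it only records its arguments.
define function stand-in-parse
    (regexp :: <string>, case-sensitive? :: <boolean>) => (parsed :: <pair>)
  pair(regexp, case-sensitive?)
end function stand-in-parse;

define function cached-parse
    (regexp :: <string>, case-sensitive? :: <boolean>) => (parsed :: <pair>)
  // The same string parses differently per case setting, so pick the table first.
  let cache = if (case-sensitive?)
                $case-sensitive-cache
              else
                $case-insensitive-cache
              end if;
  let found = element(cache, regexp, default: #f);
  if (found)
    found
  else
    // First time this string is seen with this setting: parse and remember it.
    let parsed = stand-in-parse(regexp, case-sensitive?);
    cache[regexp] := parsed;
    parsed
  end if
end function cached-parse;
```

With this shape, a second call to cached-parse with the same string and the same case setting returns the stored object instead of parsing again.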
regexp-position | [Function] |
The index of a regexp in a string
Synopsis
regexp-position (big, regexp, #key start, end, case-sensitive) => (regexp-start, #rest marks)
Parameters
big  An instance of <string>. The string to parse.
regexp  An instance of <string>.
start:  An instance of <object>. Where to start parsing the string. Defaults to 0.
end:  An instance of <object>. If defined, where to stop parsing the string. Defaults to #f.
case-sensitive:  An instance of <object>. Match case in regexp while parsing. Defaults to #f.
Return Values
regexp-start  An instance of false-or(<integer>). If defined, the index of the start of the match.
marks  Instances of false-or(<integer>). The position of the end of the match in the string, followed by a pair of marks for each group (see below).
Description
Find the position of a regular expression inside a string. If the regexp is not found, return #f, otherwise return a variable number of marks.

This function returns the index of the start of the regular expression match in big, or #f if the regular expression is not found. As a second value, it returns the index of the end of the match in big (assuming it was found; otherwise there is no second value). These values are called marks, and they come in pairs, a start-mark and an end-mark. If there are groups in the regular expression, regexp-position will return an additional pair of marks (a start and an end) for each group. If the group is matched, these marks will be integers; if the group is not matched, the marks will be #f. So

    regexp-position("This is a string", "is");

returns values(2, 4), and

    regexp-position("This is a string", "(is)(.*)ing");

returns values(2, 16, 2, 4, 4, 13), while

    regexp-position("This is a string", "(not found)(.*)ing");

returns #f. Marks are always given relative to the start of big, not relative to the start: keyword.
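Because the marks come back as multiple values, a caller typically binds them with a multiple-value let. The following is a minimal sketch, assuming the enclosing module uses the regular-expressions module and a module that provides format-out; show-marks is a hypothetical name used only for illustration.

```dylan
// Assumes the enclosing module uses the regular-expressions module
// and a module providing format-out. show-marks is a made-up name.

define function show-marks () => ()
  let text = "This is a string";
  // The first two values are the marks of the whole match;
  // #rest collects the group marks that follow.
  let (match-start, match-end, #rest group-marks)
    = regexp-position(text, "(is)(.*)ing");
  if (match-start)
    format-out("whole match: %d to %d\n", match-start, match-end);
    // Group marks arrive as start/end pairs, so step through them two at a time.
    for (i from 0 below group-marks.size by 2)
      let group-start = group-marks[i];
      let group-end = group-marks[i + 1];
      if (group-start)
        format-out("group: %s\n",
                   copy-sequence(text, start: group-start, end: group-end));
      else
        format-out("group did not match\n");
      end if;
    end for;
  else
    format-out("no match\n");
  end if;
end function show-marks;

show-marks();
```

Testing match-start before touching the other values matters: when nothing matches, only #f comes back, so the remaining variables carry no marks.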
regexp-replace | [Function] |
Replace information in a string.
Synopsis
regexp-replace (input, regexp, new-substring, #key count, case-sensitive, start, end) => (changed-string)
Parameters
input  An instance of <string>. The string to parse and replace pieces of.
regexp  An instance of <string>.
new-substring  An instance of <string>. The replacement string.
count:  An instance of <object>. If supplied, number of substitutions to make. Defaults to #f.
case-sensitive:  An instance of <object>. Match case in regexp while parsing. Defaults to #f.
start:  An instance of <object>. Where to start parsing the string. Defaults to 0.
end:  An instance of <object>. If defined, where to stop parsing the string. Defaults to #f.
Return Values
changed-string  An instance of <string>.
Description
This replaces all occurrences of regexp in input with new-substring. If count: is specified, it replaces only the first count occurrences of regexp. (This is different from Perl, which replaces only the first occurrence unless /g is specified.)

New-substring can contain backreferences to groups in the regexp. For instance,

    regexp-replace("The rain in Spain and some other text", "the (.*) in (\\w*\\b)", "\\2 has its \\1")

returns "Spain has its rain and some other text". If the subgroup referred to by the backreference was not matched, the reference is interpreted as the null string. For instance,

    regexp-replace("Hi there", "Hi there(, Bert)?", "What do you think\\1?")

returns "What do you think?" because ", Bert" wasn't found.
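The effect of count: can be seen by running the same replacement with and without it. This is a small sketch, assuming the enclosing module uses the regular-expressions module and a module that provides format-out; show-replace and the sample string are made up for illustration.

```dylan
// Assumes the enclosing module uses the regular-expressions module
// and a module providing format-out. show-replace is a made-up name.

define function show-replace () => ()
  let text = "one-two-three-four";
  // No count: every "-" is replaced.
  format-out("%s\n", regexp-replace(text, "-", "+"));
  // count: 2 limits the substitution to the first two occurrences.
  format-out("%s\n", regexp-replace(text, "-", "+", count: 2));
end function show-replace;

show-replace();
```

Going by the description above, the first call should print one+two+three+four and the second should leave the last dash untouched.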
translate | [Method] |
Equivalent to Perl's tr. Does a character by character translation.
Synopsis
translate (input, from-set, to-set, #key delete, start, end) => (output)
Parameters
input  An instance of <string>. The string to translate.
from-set  An instance of <string>. String specification of a character set.
to-set  An instance of <string>. Another character set.
delete:  An instance of <object>. If #t, any characters in from-set that don't have matching characters in to-set are deleted. Defaults to #f.
start:  An instance of <object>. Where to start parsing the string. Defaults to 0.
end:  An instance of <object>. If defined, where to stop parsing the string. Defaults to #f.
Return Values
output  An instance of <string>.
Description
This is equivalent to Perl's tr/// construct. From-set is a string specification of a character set, and to-set is another character set. Translate converts input character by character, according to the sets. For instance,

    translate("any string", "a-z", "A-Z")

will convert "any string" to all uppercase: "ANY STRING".

Like Perl, character ranges are not allowed to be "backwards". The following is not legal:

    translate("any string", "a-z", "z-a")

(This restriction may be removed in future releases.) Unlike Perl's tr///, translate doesn't return the number of characters translated.

If delete: is #t, any characters in from-set that don't have matching characters in to-set are deleted. The following will remove all vowels from a string and convert periods to commas:

    translate("any string", ".aeiou", ",", delete: #t)

Delete: is #f by default. If delete: is #f and there aren't enough characters in to-set, the last character in to-set is reused as many times as necessary. The following converts several punctuation characters into spaces:

    translate("any string", ",./:;[]{}()", " ");

Start: and end: indicate which part of input to translate. They default to the entire string.

Note: Translate is always case sensitive.
split | [Function] |
Breaks up a string along boundary characters.
Synopsis
split (pattern, input, #key count, remove-empty-items, start, end) => (#rest whole-bunch-of-strings)
Parameters
pattern  An instance of <string>. The regexp to split on.
input  An instance of <string>. The string to split.
count:  An instance of <object>. If supplied, maximum number of strings to return. Defaults to #f.
remove-empty-items:  An instance of <object>. Magically skips empty items when #t. Defaults to #t.
start:  An instance of <object>. Where to start parsing the string. Defaults to 0.
end:  An instance of <object>. If defined, where to stop parsing the string. Defaults to #f.
Return Values
whole-bunch-of-strings  Instances of <string>.
Description
This is like Perl's split function. It searches input for occurrences of pattern, and returns substrings that were delimited by that regexp. For instance,

    split("-", "long-dylan-identifier")

returns values("long", "dylan", "identifier"). Note that what matched the regexp is left out.

Remove-empty-items, which defaults to true, magically skips over empty items, so that

    split("-", "long--with--multiple-dashes")

returns values("long", "with", "multiple", "dashes").

Count is the maximum number of strings to return. If there are n strings and count is specified, the first count - 1 strings are returned as usual, and the count-th string is the remainder, unsplit. So

    split("-", "really-long-dylan-identifier", count: 3)

returns values("really", "long", "dylan-identifier"). If remove-empty-items is #t, empty items aren't counted.

Start: and end: indicate what part of input should be looked at for delimiters. They default to the entire string. For instance,

    split("-", "really-long-dylan-identifier", start: 8)

returns values("really-long", "dylan", "identifier").

Note: Unlike Perl, empty regular expressions are never legal regular expressions, so there is no way to split a string into a bunch of single character strings. Of course, in Dylan this is not a useful thing to do (as one can get each character of the string by iteration or by indexing), so this is not really a problem.
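Since split hands back its pieces as multiple values rather than as a collection, a caller that doesn't know the number of pieces in advance can gather them with #rest. A minimal sketch, assuming the enclosing module uses the regular-expressions module and a module that provides format-out; show-split is a hypothetical name.

```dylan
// Assumes the enclosing module uses the regular-expressions module
// and a module providing format-out. show-split is a made-up name.

define function show-split () => ()
  // #rest gathers however many values split returns into one sequence.
  let (#rest pieces) = split("-", "long-dylan-identifier");
  format-out("%d pieces\n", pieces.size);
  for (piece in pieces)
    format-out("%s\n", piece);
  end for;
end function show-split;

show-split();
```

When the number of pieces is known, binding them individually, as in let (first, second) = split(...), works just as well.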
join | [Function] |
Does the opposite of split.
Synopsis
join (delimiter, #rest strings) => (big-string)
Parameters
delimiter  An instance of <string>.
strings  Instances of <object>.
Return Values
big-string  An instance of <string>.
Description
This is like Perl's join function. This is not really any more efficient than concatenate-as, but it's more convenient.

    join(":", word1, word2, word3)

is equivalent to

    concatenate(word1, ":", word2, ":", word3)

(and no more efficient).
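Because join takes its strings as #rest arguments, strings that are already held in a sequence have to be spread out with apply. A minimal sketch, assuming the enclosing module uses the regular-expressions module and a module that provides format-out; show-join and the sample words are made up for illustration.

```dylan
// Assumes the enclosing module uses the regular-expressions module
// and a module providing format-out. show-join is a made-up name.

define function show-join () => ()
  let words = #("usr", "local", "bin");
  // apply passes the elements of words as the #rest strings of join.
  format-out("%s\n", apply(join, "/", words));
end function show-join;

show-join();
```

By the equivalence with concatenate given above, this amounts to concatenate("usr", "/", "local", "/", "bin").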
<illegal-regexp> | [sealed Class] |
Signaled when a function receives an illegal regular expression.
Superclasses
<error>
Initialization Keywords
regexp:  An instance of <string>. The regexp that caused the error.
Description
Signaled when a function receives an illegal regular expression.
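A caller that builds regexps from untrusted input may want to catch this condition. The following is a sketch only: safe-position is a hypothetical helper, the enclosing module is assumed to use the regular-expressions module and a module that provides format-out, and the empty regexp is used as the illegal input on the assumption that regexp-position rejects it the same way the note on split describes.

```dylan
// Assumes the enclosing module uses the regular-expressions module
// and a module providing format-out. safe-position is a made-up name.

define function safe-position
    (big :: <string>, regexp :: <string>) => (start :: false-or(<integer>))
  block ()
    // Only the first value (the start of the match) is kept here.
    let start = regexp-position(big, regexp);
    start
  exception (err :: <illegal-regexp>)
    // Treat an unparsable regexp like a failed match.
    format-out("could not parse the regexp\n");
    #f
  end block
end function safe-position;

// Assumption: an empty regexp is rejected as illegal.
safe-position("This is a string", "");
```

Returning #f on a bad regexp keeps the helper's result type the same as for a plain failed match; a caller that needs to tell the two cases apart would handle the condition itself.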
These functions still work, but are deprecated. Use the foo functions (described above) instead of these make-foo functions.
make-regexp-positioner | [Function] |
[Deprecated] Creates a function that finds the index of a regexp in a string.
Synopsis
make-regexp-positioner (regexp, #key byte-character-only, needs-marks, maximum-compile, case-sensitive) => (regexp-positioner)
Parameters
regexp  An instance of <string>.
byte-character-only:  An instance of <object>. Ignored. Defaults to #f.
needs-marks:  An instance of <object>. Ignored. Defaults to #f.
maximum-compile:  An instance of <object>. Ignored. Defaults to #f.
case-sensitive:  An instance of <object>. Match case in regexp while parsing. Defaults to #f.
Return Values
regexp-positioner  An instance of <function>. The function to execute a match on a string.
Description
Once upon a time, this was how you interfaced to the NFA stuff (maximum-compile: #t). That's gone. Now it's just here for backwards compatibility. All keywords except case-sensitive: are now ignored.
make-regexp-replacer | [Function] |
[Deprecated] Creates a function that replaces information in a string.
Synopsis
make-regexp-replacer (regexp, #key replace-with, case-sensitive) => (replacer)
Parameters
regexp  An instance of <string>.
replace-with:  An instance of <object>. The replacement string.
case-sensitive:  An instance of <object>. Match case in regexp while parsing. Defaults to #f.
Return Values
replacer  An instance of <function>. The function that does the replacement.
Description
This returns an anonymous replacer function that is either

    method (big-string, #key count, start, end)

or

    method (big-string, replace-string, #key count, start, end)

The first form is returned if the replace-with: keyword is supplied, since the replacement string is then already known. Otherwise the second form is returned, and the replacement string is passed on each call.
make-translator | [Method] |
[Deprecated] Creates a function that translates a string.
Synopsis
make-translator (from-set, to-set, #key delete) => (translator)
Parameters
from-set  An instance of <string>. String specification of a character set.
to-set  An instance of <string>. Another character set.
delete:  An instance of <object>. If #t, delete from-set characters not in to-set. Defaults to #f.
Return Values
translator  An instance of <function>. The function that does the translation.
Description
This returns an anonymous translation function.
make-splitter | [Function] |
[Deprecated] Creates a function that splits a string.
Synopsis
make-splitter (pattern) => (splitter)
Parameters
pattern  An instance of <string>. The regexp to split on.
Return Values
splitter  An instance of <function>. (If you're Brit, don't smile.) The function that does the split on the string.
Description
This returns an anonymous splitter function.