The MIME parsing system

Authors

Richard Frith-Macdonald (rfm@gnu.org)

Version: 1.100

Date: 2004/06/01 12:19:04

Copyright: (C) 2000 Free Software Foundation, Inc.


Contents -

  1. Mime Parser
  2. Software documentation for the GSMimeCodingContext class
  3. Software documentation for the GSMimeDocument class
  4. Software documentation for the GSMimeHeader class
  5. Software documentation for the GSMimeParser class

Mime Parser

The GNUstep Mime parser. This is a collection of Objective-C classes for representing MIME (and HTTP) documents and managing conversions to and from convenient internal formats.

The design centers on two classes -

document
A container for the actual data (and headers) of a mime/http document, this is also used to create raw MIME data for sending.
parser
An object that can be fed data and will parse it into a document. This object also provides various utility methods and an API that permits overriding in order to extend the functionality to cope with new document types.
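
As a rough illustration of how the two classes fit together, the sketch below builds a document, turns it into raw MIME data, and parses that data back into a second document. It uses only methods documented later in this file. The function name roundTripText is hypothetical, and the code assumes manual reference counting with an autorelease pool already in place.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: serialise a text document and parse it back.
 * Assumes the caller has already set up an autorelease pool. */
static NSString *
roundTripText(NSString *text)
{
  GSMimeDocument *original;
  GSMimeDocument *parsed;
  NSData         *raw;

  /* Document side: wrap the text and generate raw MIME data. */
  original = [GSMimeDocument documentWithContent: text
                                            type: @"text/plain"
                                            name: nil];
  raw = [original rawMimeData];

  /* Parser side: feed the raw data back in as a single item. */
  parsed = [GSMimeParser documentFromData: raw];

  /* Ask for text explicitly rather than relying on the stored form. */
  return [parsed convertToText];
}
```

The text that comes back may differ from the input in line endings, so the sketch makes no claim of byte-for-byte equality.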

Software documentation for the GSMimeCodingContext class

GSMimeCodingContext : NSObject

Declared in:
GNUstepBase/GSMime.h
Standards:

Coding contexts are objects used by the parser to store the state of decoding incoming data while it is being incrementally parsed.
This base class is the most rudimentary context: it is used for plain text and binary data (i.e. data which is not really decoded at all), and all other decoding work is done by subclasses.



Instance Variables for GSMimeCodingContext Class

atEnd

@private BOOL atEnd;

Description forthcoming.





Method summary

atEnd 

- (BOOL) atEnd;

Returns the current value of the 'atEnd' flag.


decodeData: length: intoData: 

- (BOOL) decodeData: (const void*)sData length: (unsigned)length intoData: (NSMutableData*)dData;

Decode length bytes of data from sData and append the results to dData.
Return YES on success, NO if there is an error.


setAtEnd: 

- (void) setAtEnd: (BOOL)flag;

Sets the current value of the 'atEnd' flag.


Software documentation for the GSMimeDocument class

GSMimeDocument : NSObject

Declared in:
GNUstepBase/GSMime.h
Conforms to:
NSCopying
Standards:

This class is intended to provide a wrapper for MIME messages permitting easy access to the contents of a message and providing a basis for parsing and unparsing messages that have arrived via email or as a web document.

The class keeps track of all the document headers, and provides methods for modifying and examining the headers that apply to a document.



Instance Variables for GSMimeDocument Class

content

@private id content;

Description forthcoming.


headers

@private NSMutableArray* headers;

Description forthcoming.





Method summary

charsetFromEncoding: 

+ (NSString*) charsetFromEncoding: (NSStringEncoding)enc;

Return the MIME character set name corresponding to the specified string encoding.


decodeBase64: 

+ (NSData*) decodeBase64: (NSData*)source;

Decode the source data from base64 encoding and return the result.


decodeBase64String: 

+ (NSString*) decodeBase64String: (NSString*)source;

Converts the base64 encoded data in source to a decoded ASCII string using the +decodeBase64: method. If the encoded data does not represent an ASCII string, you should use the +decodeBase64: method directly.


documentWithContent: type: name: 

+ (GSMimeDocument*) documentWithContent: (id)newContent type: (NSString*)type name: (NSString*)name;

Convenience method to return an autoreleased document using the specified content, type, and name value. This calls the -setContent:type:name: method to set up the document.


encodeBase64: 

+ (NSData*) encodeBase64: (NSData*)source;

Encode the source data to base64 encoding and return the result.


encodeBase64String: 

+ (NSString*) encodeBase64String: (NSString*)source;

Converts the ASCII string source into base64 encoded data using the +encodeBase64: method. If the original data is not an ASCII string, you should use the +encodeBase64: method directly.
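
The four base64 methods above pair up naturally. The sketch below round-trips an ASCII string through the string variants and arbitrary bytes through the data variants. The function names are hypothetical, and an autorelease pool is assumed to be in place.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: YES if an ASCII string survives an
 * encode/decode cycle using the string convenience methods. */
static BOOL
base64StringRoundTrips(NSString *ascii)
{
  NSString *encoded = [GSMimeDocument encodeBase64String: ascii];
  NSString *decoded = [GSMimeDocument decodeBase64String: encoded];

  return [decoded isEqualToString: ascii];
}

/* Hypothetical helper: the same check for arbitrary bytes, using
 * the NSData methods directly as the text above recommends for
 * non-ASCII content. */
static BOOL
base64DataRoundTrips(NSData *bytes)
{
  NSData *encoded = [GSMimeDocument encodeBase64: bytes];
  NSData *decoded = [GSMimeDocument decodeBase64: encoded];

  return [decoded isEqualToData: bytes];
}
```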


encodingFromCharset: 

+ (NSStringEncoding) encodingFromCharset: (NSString*)charset;

Return the string encoding corresponding to the specified MIME character set name.


addContent: 

- (void) addContent: (id)newContent;

Adds a part to a multipart document.


addHeader: 

- (void) addHeader: (GSMimeHeader*)info;

This method may be called to add a header to the document. The header is supplied as a GSMimeHeader object.

Certain well-known headers are restricted to one occurrence in an email, and when extra copies are added they replace the originals.

The mime-version header is special... it is inserted before any other mime headers rather than being added at the end.


addHeader: value: parameters: 

- (GSMimeHeader*) addHeader: (NSString*)name value: (NSString*)value parameters: (NSDictionary*)parameters;

Convenience method to create a new header and add it to the receiver.
Returns the newly created header.
See [GSMimeHeader -initWithName:value:parameters:] and -addHeader: methods.


allHeaders 

- (NSArray*) allHeaders;

This method returns an array containing GSMimeHeader objects representing the headers associated with the document.

The order of the headers in the array is the order of the headers in the document.


content 

- (id) content;

This returns the content data of the document in the same format in which the data was placed in the document. This may be one of -

text
an NSString object
binary
an NSData object
multipart
an NSArray object containing GSMimeDocument objects
If you want to be sure that you get a particular type of data, use the -convertToData or -convertToText method.
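
Because -content returns whatever kind of object was stored, callers usually need to check its class before using it. A minimal sketch of that check follows. The function name describeContent and the returned labels are illustrative only.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: classify the content of a document using
 * the three cases listed above. */
static NSString *
describeContent(GSMimeDocument *doc)
{
  id content = [doc content];

  if ([content isKindOfClass: [NSString class]])
    {
      return @"text";
    }
  if ([content isKindOfClass: [NSData class]])
    {
      return @"binary";
    }
  if ([content isKindOfClass: [NSArray class]])
    {
      return @"multipart";      /* array of GSMimeDocument parts */
    }
  return @"none";               /* nil or an unexpected class */
}
```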


contentByID: 

- (id) contentByID: (NSString*)key;

Search the content of this document to locate a part whose content ID matches the specified key. Recursively descend into other documents.
Wraps the supplied key in angle brackets if they are not present.
Return nil if no match is found, the matching GSMimeDocument otherwise.


contentByName: 

- (id) contentByName: (NSString*)key;

Search the content of this document to locate a part whose content-type name or content-disposition name matches the specified key. Recursively descend into other documents.
Return nil if no match is found, the matching GSMimeDocument otherwise.


contentFile 

- (NSString*) contentFile;

Convenience method to fetch the content file name from the header.


contentID 

- (NSString*) contentID;

Convenience method to fetch the content ID from the header.


contentName 

- (NSString*) contentName;

Convenience method to fetch the content name from the header.


contentSubtype 

- (NSString*) contentSubtype;

Convenience method to fetch the content sub-type from the header.


contentType 

- (NSString*) contentType;

Convenience method to fetch the content type from the header.


contentsByName: 

- (NSArray*) contentsByName: (NSString*)key;

Search the content of this document to locate all parts whose content-type name or content-disposition name matches the specified key. Do NOT recurse into other documents.
Return nil if no match is found, an array of matching GSMimeDocument instances otherwise.


convertToData 

- (NSData*) convertToData;

Return the content as an NSData object (unless it is multipart).
Perform conversion from text to data using the charset specified in the content-type header, or infer the charset, and update the header accordingly.
If the content can not be represented as a plain NSData object, this method returns nil.


convertToText 

- (NSString*) convertToText;

Return the content as an NSString object (unless it is multipart). If the content cannot be represented as text, this returns nil.


copyWithZone: 

- (id) copyWithZone: (NSZone*)z;

Returns a copy of the receiver.


deleteContent: 

- (void) deleteContent: (GSMimeDocument*)aPart;

Deletes all occurrences of parts identical to aPart from the receiver.
Recursively deletes from enclosed documents as necessary.


deleteHeader: 

- (void) deleteHeader: (GSMimeHeader*)aHeader;

This method removes all occurrences of header objects identical to the one supplied as an argument.


deleteHeaderNamed: 

- (void) deleteHeaderNamed: (NSString*)name;

This method removes all occurrences of headers whose name matches the supplied string.


headerNamed: 

- (GSMimeHeader*) headerNamed: (NSString*)name;

This method returns the first header whose name equals the supplied argument.


headersNamed: 

- (NSArray*) headersNamed: (NSString*)name;

This method returns an array of GSMimeHeader objects for all headers whose names equal the supplied argument.


makeBoundary 

- (NSString*) makeBoundary;

Make a probably unique string suitable for use as the boundary parameter in the content of a multipart document.

This implementation provides base64 encoded data consisting of an MD5 digest of some pseudo random stuff, plus an incrementing counter. The inclusion of the counter guarantees that we won't produce two identical strings in the same run of the program.


makeContentID 

- (GSMimeHeader*) makeContentID;

Create a new content ID header, set it as the content ID of the document and return it.
This is a convenience method which simply places angle brackets around an [NSProcessInfo -globallyUniqueString] to form the header value.


makeHeader: value: parameters: 

- (GSMimeHeader*) makeHeader: (NSString*)name value: (NSString*)value parameters: (NSDictionary*)parameters;

Deprecated... use -setHeader:value:parameters:


makeMessageID 

- (GSMimeHeader*) makeMessageID;

Create a new message ID header, set it as the message ID of the document and return it.
This is a convenience method which simply places angle brackets around an [NSProcessInfo -globallyUniqueString] to form the header value.


rawMimeData 

- (NSMutableData*) rawMimeData;

Return an NSData object representing the MIME document as raw data ready to be sent via an email system.
Calls -rawMimeData: with the isOuter flag set to YES.


rawMimeData: 

- (NSMutableData*) rawMimeData: (BOOL)isOuter;

Return an NSData object representing the MIME document as raw data ready to be sent via an email system.

The isOuter flag denotes whether this document is the outermost part of a MIME message, or is a part of a multipart message.

During generation of the document this method will perform some consistency checks and try to automatically generate missing header information needed to build the mime data (eg. filling in the boundary parameter in the content-type header for multipart documents).
However, you should not depend on automatic behaviors but should fill in as much detail as possible before generating data.


setContent: 

- (void) setContent: (id)newContent;

Sets a new value for the content of the document.


setContent: type: 

- (void) setContent: (id)newContent type: (NSString*)type;

Convenience method calling -setContent:type:name: to set document content and type with a nil value for name... useful for top-level documents rather than parts within a document (parts should really be named).


setContent: type: name: 

- (void) setContent: (id)newContent type: (NSString*)type name: (NSString*)name;

Convenience method to set the content of the document along with creating a content-type header for it.

The type parameter may be a simple common content type (text, multipart, or application), in which case the default subtype for that type is used. Alternatively it may be full detail of a content type header value, which will be parsed into 'type', 'subtype' and 'parameters'.
NB. In this case, if the parsed data contains a 'name' parameter and the name argument is non-nil, the argument value will override the parsed value.

You can get the same effect by calling -setContent: to set the document content, then creating a GSMimeHeader instance, initialising it with the content type information you want using [GSMimeHeader -initWithName:value:parameters:] , and calling the -setHeader: method to attach it to the document.

Using this method imposes a few extra checks and restrictions on the combination of content and type/subtype you may use... so you may want to use the more primitive methods in order to bypass these checks if you are using unusual type/subtype information or if you need to provide additional parameters in the header.
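
As an illustration of the convenience methods, the following sketch assembles a two-part message and generates its raw data. The function name buildMessage, the subject text, and the choice of multipart/mixed and application/octet-stream are assumptions made for the example. An autorelease pool and manual reference counting are assumed.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: build a message with a text body and one
 * binary attachment, then return it as raw MIME data. */
static NSData *
buildMessage(NSString *body, NSData *attachment, NSString *fileName)
{
  GSMimeDocument *message;
  GSMimeDocument *textPart;
  GSMimeDocument *filePart;

  /* Each part is a document in its own right. */
  textPart = [GSMimeDocument documentWithContent: body
                                            type: @"text/plain"
                                            name: nil];
  filePart = [GSMimeDocument documentWithContent: attachment
                                            type: @"application/octet-stream"
                                            name: fileName];

  /* The outer document holds an array of parts. */
  message = [[[GSMimeDocument alloc] init] autorelease];
  [message setContent: [NSArray arrayWithObjects: textPart, filePart, nil]
                 type: @"multipart/mixed"];
  [message setHeader: @"subject" value: @"Example report" parameters: nil];

  /* Missing details such as the boundary are filled in here. */
  return [message rawMimeData];
}
```

The sketch leaves the boundary parameter to -rawMimeData for brevity; the advice above about not depending on automatic behaviour still applies to real code.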


setContentType: 

- (void) setContentType: (NSString*)newType;

Convenience method to set the content type of the document without altering any content. The supplied newType may be full type information including subtype and parameters as found after the colon in a mime Content-Type header.


setHeader: 

- (void) setHeader: (GSMimeHeader*)info;

This method may be called to set a header in the document. Any other headers with the same name will be removed from the document.


setHeader: value: parameters: 

- (GSMimeHeader*) setHeader: (NSString*)name value: (NSString*)value parameters: (NSDictionary*)parameters;

Convenience method to create a new header and add it to the receiver replacing any existing header of the same name.
Returns the newly created header.
See [GSMimeHeader -initWithName:value:parameters:] and -setHeader: methods.
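
The difference between adding and setting a header can be seen in a short sketch. The header name x-note is a made-up example chosen on the assumption that it is not one of the restricted well-known headers, and the function name is hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: add two headers with the same name, then
 * replace them with a single one, and report how many remain. */
static unsigned
noteHeadersAfterSet(void)
{
  GSMimeDocument *doc = [[[GSMimeDocument alloc] init] autorelease];

  /* Adding keeps earlier headers of the same name (for names that
   * are not restricted to a single occurrence). */
  [doc addHeader: @"x-note" value: @"first" parameters: nil];
  [doc addHeader: @"x-note" value: @"second" parameters: nil];

  /* Setting removes any other headers with the same name. */
  [doc setHeader: @"x-note" value: @"only" parameters: nil];

  return (unsigned)[[doc headersNamed: @"x-note"] count];
}
```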


Software documentation for the GSMimeHeader class

GSMimeHeader : NSObject

Declared in:
GNUstepBase/GSMime.h
Conforms to:
NSCopying
Standards:

Description forthcoming.



Instance Variables for GSMimeHeader Class

name

@private NSString* name;

Description forthcoming.


objects

@private NSMutableDictionary* objects;

Description forthcoming.


params

@private NSMutableDictionary* params;

Description forthcoming.


value

@private NSString* value;

Description forthcoming.





Method summary

makeQuoted: always: 

+ (NSString*) makeQuoted: (NSString*)v always: (BOOL)flag;

Makes the value into a quoted string if necessary (ie if it contains any special / non-token characters). If flag is YES then the value is made into a quoted string even if it does not contain special characters.


makeToken: 

+ (NSString*) makeToken: (NSString*)t;

Convert the supplied string to a standardized token by making it lowercase and removing all illegal characters.


copyWithZone: 

- (id) copyWithZone: (NSZone*)z;

Description forthcoming.


initWithName: value: 

- (id) initWithName: (NSString*)n value: (NSString*)v;

Convenience method calling -initWithName:value:parameters: with the supplied argument and nil parameters.


initWithName: value: parameters: 

- (id) initWithName: (NSString*)n value: (NSString*)v parameters: (NSDictionary*)p;
This is a designated initialiser for the class.

Initialise a GSMimeHeader supplying a name, a value and a dictionary of any parameters occurring after the value.
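
A minimal sketch of creating a header with one parameter and reading the parameter back is shown here. The function name and the sample values are illustrative, and manual reference counting is assumed.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: build a content-type header carrying a
 * charset parameter and return that parameter's value. */
static NSString *
charsetOfSampleHeader(void)
{
  NSDictionary *params;
  GSMimeHeader *header;

  params = [NSDictionary dictionaryWithObject: @"utf-8"
                                       forKey: @"charset"];
  header = [[[GSMimeHeader alloc] initWithName: @"Content-Type"
                                         value: @"text/plain"
                                    parameters: params] autorelease];

  /* Parameter keys are lowercase; values keep their case. */
  return [header parameterForKey: @"charset"];
}
```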


name 

- (NSString*) name;

Returns the name of this header... a lowercase string.


objectForKey: 

- (id) objectForKey: (NSString*)k;

Return extra information specific to a particular header type.


objects 

- (NSDictionary*) objects;

Returns a dictionary of all the additional objects for the header.


parameterForKey: 

- (NSString*) parameterForKey: (NSString*)k;

Return the named parameter value.


parameters 

- (NSDictionary*) parameters;

Returns the parameters of this header... a dictionary whose keys are all lowercase strings, and whose values are strings which may contain mixed case.


rawMimeData 

- (NSMutableData*) rawMimeData;

Returns the full text of the header, built from its component parts, and including a terminating CR-LF.


setName: 

- (void) setName: (NSString*)s;

Sets the name of this header... converts to lowercase and removes illegal characters. If given a nil or empty string argument, sets the name to 'unknown'.


setObject: forKey: 

- (void) setObject: (id)o forKey: (NSString*)k;

Method to store specific information for particular types of header. This is used for non-standard parts of headers.
Setting a nil value for o will remove any existing value set using the k as its key.


setParameter: forKey: 

- (void) setParameter: (NSString*)v forKey: (NSString*)k;

Sets a parameter of this header... converts name to lowercase and removes illegal characters.
If a nil parameter value is supplied, removes any parameter with the specified key.


setParameters: 

- (void) setParameters: (NSDictionary*)d;

Sets all parameters of this header... converts names to lowercase and removes illegal characters from them.


setValue: 

- (void) setValue: (NSString*)s;

Sets the value of this header (without changing parameters).
If given a nil argument, set an empty string value.


text 

- (NSString*) text;

Returns the full text of the header, built from its component parts, and including a terminating CR-LF.


value 

- (NSString*) value;

Returns the value of this header (excluding any parameters).


Software documentation for the GSMimeParser class

GSMimeParser : NSObject

Declared in:
GNUstepBase/GSMime.h
Standards:

This class provides support for parsing MIME messages into GSMimeDocument objects. Each parser object maintains an associated document into which data is stored.

You supply the document to be parsed as one or more data items passed to the -parse: method and, if the method has returned YES every time, you give it a final nil argument to mark the end of the document.

On completion of parsing a valid document, the [GSMimeParser -mimeDocument] method returns the resulting parsed document.
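
The calling pattern described above might be sketched as follows, for data that arrives in several pieces. The function name parseChunks is hypothetical, and an autorelease pool is assumed to be in place.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical helper: feed an array of NSData chunks to a parser
 * and return the parsed document, or nil if parsing did not
 * complete successfully. */
static GSMimeDocument *
parseChunks(NSArray *chunks)
{
  GSMimeParser *parser = [GSMimeParser mimeParser];
  NSEnumerator *enumerator = [chunks objectEnumerator];
  NSData       *chunk;
  BOOL         wantsMore = YES;

  /* Keep feeding data for as long as the parser asks for more. */
  while (wantsMore == YES && (chunk = [enumerator nextObject]) != nil)
    {
      wantsMore = [parser parse: chunk];
    }

  /* If the parser never said it was done, mark the end of data. */
  if (wantsMore == YES)
    {
      [parser parse: nil];
    }

  return [parser isComplete] ? [parser mimeDocument] : nil;
}
```

Checking -isComplete before returning the document keeps a partially parsed or erroneous message from being mistaken for a valid one.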



Instance Variables for GSMimeParser Class

_defaultEncoding

@private NSStringEncoding _defaultEncoding;

Description forthcoming.


boundary

@private NSData* boundary;

Description forthcoming.


bytes

@private unsigned char* bytes;

Description forthcoming.


child

@private GSMimeParser* child;

Description forthcoming.


context

@private GSMimeCodingContext* context;

Description forthcoming.


data

@private NSMutableData* data;

Description forthcoming.


dataEnd

@private unsigned int dataEnd;

Description forthcoming.


document

@private GSMimeDocument* document;

Description forthcoming.


expect

@private unsigned int expect;

Description forthcoming.


flags

@private struct ... flags;

Description forthcoming.


input

@private unsigned int input;

Description forthcoming.


lineEnd

@private unsigned int lineEnd;

Description forthcoming.


lineStart

@private unsigned int lineStart;

Description forthcoming.


rawBodyLength

@private unsigned int rawBodyLength;

Description forthcoming.


sectionStart

@private unsigned int sectionStart;

Description forthcoming.





Method summary

documentFromData: 

+ (GSMimeDocument*) documentFromData: (NSData*)mimeData;

Convenience method to parse a single data item as a MIME message and return the resulting document.


mimeParser 

+ (GSMimeParser*) mimeParser;

Create and return a parser.


contextFor: 

- (GSMimeCodingContext*) contextFor: (GSMimeHeader*)info;

Return a coding context object to be used for decoding data according to the scheme specified in the header.

The default implementation supports the following transfer encodings specified in either a transfer-encoding or content-transfer-encoding header -

To add new coding schemes to the parser, you need to override this method to return a new coding context for your scheme when the info argument indicates that this is appropriate.
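
A sketch of such an extension follows, using a made-up transfer encoding called x-rot13. The class names Rot13Context and Rot13Parser are hypothetical. The sketch assumes manual reference counting, that the returned context should be autoreleased, and that the parser passes incoming bytes to the context's -decodeData:length:intoData: method; none of these points is confirmed by this documentation.

```objc
#import <Foundation/Foundation.h>
#import <GNUstepBase/GSMime.h>

/* Hypothetical context for the made-up "x-rot13" encoding. */
@interface Rot13Context : GSMimeCodingContext
@end

@implementation Rot13Context
/* Rotate ASCII letters by 13 places and copy other bytes as-is. */
- (BOOL) decodeData: (const void*)sData
             length: (unsigned)length
           intoData: (NSMutableData*)dData
{
  const unsigned char *src = (const unsigned char*)sData;
  unsigned            i;

  for (i = 0; i < length; i++)
    {
      unsigned char c = src[i];

      if (c >= 'a' && c <= 'z')
        {
          c = (unsigned char)('a' + (c - 'a' + 13) % 26);
        }
      else if (c >= 'A' && c <= 'Z')
        {
          c = (unsigned char)('A' + (c - 'A' + 13) % 26);
        }
      [dData appendBytes: &c length: 1];
    }
  return YES;
}
@end

/* Hypothetical parser subclass that knows about the new scheme. */
@interface Rot13Parser : GSMimeParser
@end

@implementation Rot13Parser
- (GSMimeCodingContext*) contextFor: (GSMimeHeader*)info
{
  if ([[[info value] lowercaseString] isEqualToString: @"x-rot13"])
    {
      /* Assumption: callers expect an autoreleased context. */
      return [[[Rot13Context alloc] init] autorelease];
    }
  /* Everything else is left to the default implementation. */
  return [super contextFor: info];
}
@end
```

Falling back to the superclass keeps all of the standard encodings working unchanged.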


data 

- (NSData*) data;

Return the data accumulated in the parser. If the parser is still parsing headers, this will be the header data read so far. If the parser has parsed the body of the message, this will be the data of the body, with any transfer encoding removed.


decodeData: fromRange: intoData: withContext: 

- (BOOL) decodeData: (NSData*)sData fromRange: (NSRange)aRange intoData: (NSMutableData*)dData withContext: (GSMimeCodingContext*)con;

Decodes the raw data from the specified range in the source data object and appends it to the destination data object. The context object provides information about the content encoding type in use, and the state of the decoding operation.

This method may be called repeatedly to incrementally decode information as it arrives on some communications channel. It should be called with a nil source data item (or with the atEnd flag of the context set to YES) in order to flush any information held in the context to the output data object.

You may override this method in order to implement additional coding schemes, but usually it should be enough for you to implement a custom GSMimeCodingContext subclass for this method to use.


expectNoHeaders 

- (void) expectNoHeaders;

This method may be called to tell the parser that it should not expect to parse any headers, and that the data it will receive is body data.
If the parser is already in the body, or is complete, this method has no effect.
This is for use when some other utility has been used to parse headers, and you have set the headers of the document owned by the parser accordingly. You can then use the GSMimeParser to read the body data into the document.


isComplete 

- (BOOL) isComplete;

Returns YES if the document parsing is known to be completed successfully. Returns NO if either more data is needed, or if the parser encountered an error.


isHttp 

- (BOOL) isHttp;

Returns YES if the parser is parsing an HTTP document rather than a true MIME document.


isInBody 

- (BOOL) isInBody;

Returns YES if all the document headers have been parsed but the document body parsing may not yet be complete.


isInHeaders 

- (BOOL) isInHeaders;

Returns YES if parsing of the document headers has not yet been completed.


mimeDocument 

- (GSMimeDocument*) mimeDocument;

Returns the GSMimeDocument instance into which data is being parsed or has been parsed.


parse: 

- (BOOL) parse: (NSData*)d;

This method is called repeatedly to pass raw mime data into the parser. It returns YES as long as it wants more data to complete parsing of a document, and NO if parsing is complete, either due to having reached the end of a document or due to an error.

Since it is not always possible to determine if the end of a MIME document has been reached from its content, the method may need to be called with a nil or empty argument after you have passed all the data to it... this tells it that the data is complete.

The parser attempts to be as flexible as possible and to continue parsing wherever it can. If an error occurs in parsing, the -isComplete method will always return NO, even after the -parse: method has been called with a nil argument.

A multipart document will be parsed to content consisting of an NSArray of GSMimeDocument instances representing each part.
Otherwise, a document will become content of type NSData, unless it is of content type text, in which case it will be an NSString.
If a document has no content type specified, it will be treated as text, unless it is identifiable as a file (e.g. it has a content-disposition header containing a filename parameter).


parseHeader: 

- (BOOL) parseHeader: (NSString*)aHeader;

This method is called to parse a header line for the current document, split its contents into a GSMimeHeader object, and add that information to the document.
The method is normally used internally by the -parse: method, but you may also call it to parse an entire header line and add it to the document (this may be useful in conjunction with the -expectNoHeaders method, to parse a document body data into a document where the headers are available from a separate source).

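   GSMimeParser *parser = [GSMimeParser mimeParser];

   /* Supply the header separately, then feed only body data. */
   [parser parseHeader: @"content-type: text/plain"];
   [parser expectNoHeaders];
   [parser parse: bodyData];   /* bodyData: an NSData holding the raw body */
   [parser parse: nil];        /* nil marks the end of the data */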
 

The standard implementation of this method scans the header name and then calls -scanHeaderBody:into: to complete the parsing of the header.

This method also performs consistency checks on headers scanned so it is recommended that it is not overridden, but that subclasses override -scanHeaderBody:into: to implement custom scanning.

As a special case, for HTTP support, this method also parses lines in the format of HTTP responses as if they were headers named http. The resulting header object contains additional object values -

HttpMajorVersion
The first part of the version number
HttpMinorVersion
The second part of the version number
NSHTTPPropertyServerHTTPVersionKey
The full HTTP protocol version number
NSHTTPPropertyStatusCodeKey
The HTTP status code
NSHTTPPropertyStatusReasonKey
The text message (if any) after the status code


scanHeaderBody: into: 

- (BOOL) scanHeaderBody: (NSScanner*)scanner into: (GSMimeHeader*)info;

This method is called to parse a header line and split its contents into the info header object.

On entry, the header object is already partially filled in: its name is a lowercase representation of the header name, and the scanner is set to a scan location immediately after the colon in the header string.

If the header is parsed successfully, the method should return YES, otherwise NO.

You should not call this method directly yourself, but may override it to support parsing of new headers.

You should be aware of the parsing that the standard implementation performs, and that needs to be done for certain headers in order to permit the parser to work generally -

content-disposition
  Value
    The content disposition (excluding parameters) as a lowercase string.
content-type
  Subtype
    The MIME subtype in lowercase
  Type
    The MIME type in lowercase
  value
    The full MIME type (xxx/yyy) in lowercase
content-transfer-encoding
  Value
    The transfer encoding type in lowercase
http
  HttpVersion
    The HTTP protocol version number
  HttpMajorVersion
    The first component of the version number
  HttpMinorVersion
    The second component of the version number
  HttpStatus
    The response status value (numeric code)
  Value
    The text message (if any)
transfer-encoding
  Value
    The transfer encoding type in lowercase


scanName: 

- (NSString*) scanName: (NSScanner*)scanner;

A convenience method to use a scanner (that is set up to scan a header line) to scan a name - a simple word.


scanPastSpace: 

- (BOOL) scanPastSpace: (NSScanner*)scanner;

A convenience method to scan past any whitespace in the scanner in preparation for scanning something more interesting that comes after it. Returns YES if any space was read, NO otherwise.


scanSpecial: 

- (NSString*) scanSpecial: (NSScanner*)scanner;

A convenience method to use a scanner (that is set up to scan a header line) to scan in a special character that terminated a token previously scanned. If the token was terminated by whitespace and no other special character, the string returned will contain a single space character.


scanToken: 

- (NSString*) scanToken: (NSScanner*)scanner;

A convenience method to use a scanner (that is set up to scan a header line) to scan a header token - either a quoted string or a simple word.


setBuggyQuotes: 

- (void) setBuggyQuotes: (BOOL)flag;

Method to inform the parser that the data it is parsing is likely to contain fields with buggy use of backslash quotes... and it should try to be tolerant of them and treat them as if they were escaped backslashes. This is for use with things like Microsoft Internet Explorer, which puts the backslashes used as file path separators in parameters without quoting them.


setDefaultCharset: 

- (void) setDefaultCharset: (NSString*)aName;

Method to inform the parser that body parts with no content-type header (which are treated as text/plain) should use the specified character set rather than the default (us-ascii).


setIsHttp 

- (void) setIsHttp;

Method to inform the parser that the data it is parsing is an HTTP document rather than true MIME. This method is called internally if the parser detects an HTTP response line at the start of the headers it is parsing.


