
12.2.2 Parsing email messages

Message object trees can be created in one of two ways: they can be created from whole cloth by instantiating Message objects and stringing them together via add_payload() and set_payload() calls, or they can be created by parsing a flat text representation of the email message.
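As a minimal sketch of the first approach, the following builds a small two-level tree by hand. It assumes a current Python 3 release, where the class is imported from email.message and subparts are added with attach(); add_payload() is not available in such releases. The header values and body text are made up for illustration.

```python
from email.message import Message

# Root container; the header values here are illustrative only.
root = Message()
root["Subject"] = "Built by hand"
root["Content-Type"] = "multipart/mixed"

# A leaf part whose payload is a plain string.
text_part = Message()
text_part["Content-Type"] = "text/plain"
text_part.set_payload("Hello from the first part.\n")

# attach() turns the root's payload into a list of sub-messages.
# (Assumption: a modern Python where attach() replaces add_payload().)
root.attach(text_part)
```

After the attach() call, the root's payload is a list containing text_part, so root.is_multipart() returns true.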

The email package provides a standard parser that understands most email document structures, including MIME documents. You can pass the parser a string or a file object, and the parser will return to you the root Message instance of the object tree. For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME messages, the root object will return true from its is_multipart() method, and the subparts can be accessed via the get_payload() and walk() methods.
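The behaviour described above can be sketched as follows. This is an illustration rather than a definitive recipe: it assumes a current Python 3 release, where the module is spelled email.parser in lowercase, and the two sample messages (addresses, subjects, and the boundary string) are invented for the example.

```python
from io import StringIO
from email.parser import Parser

# A simple, non-MIME message (sample data, not from the documentation).
plain_text = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Just a short note.\n"
)

# A small MIME message with two subparts (sample data).
mime_text = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Two parts\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain\n"
    "\n"
    "First part.\n"
    "--BOUNDARY\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Second part.</p>\n"
    "--BOUNDARY--\n"
)

parser = Parser()

# Parse from a string, and from a file-like object.
plain_msg = parser.parsestr(plain_text)
file_msg = parser.parse(StringIO(plain_text))
mime_msg = parser.parsestr(mime_text)

# Non-MIME message: the payload is the body text as a string.
plain_body = plain_msg.get_payload()

# MIME message: the payload is a list of sub-messages, and walk()
# visits the root followed by every subpart.
subparts = mime_msg.get_payload()
part_types = [part.get_content_type() for part in mime_msg.walk()]
```

Here walk() includes the root container itself, so part_types has one more entry than subparts.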

Note that the parser can be extended in limited ways, and of course you can implement your own parser completely from scratch. There is no magical connection between the email package's bundled parser and the Message class, so your custom parser can create message object trees in any way it finds necessary.

The primary parser class is Parser, which parses both the headers and the payload of the message. In the case of multipart messages, it will recursively parse the body of the container message. The email.Parser module also provides a second class, called HeaderParser, which can be used if you're only interested in the headers of the message. HeaderParser can be much faster in these situations, since it does not attempt to parse the message body; instead, it sets the payload to the raw body as a string. HeaderParser has the same API as the Parser class.
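The difference between the two classes can be seen by running both over the same text. The sketch below assumes a current Python 3 release, where both classes are imported from email.parser in lowercase; the sample message and its boundary string are invented for the example.

```python
from email.parser import HeaderParser, Parser

# Sample multipart message (illustrative data only).
raw = (
    "From: alice@example.com\n"
    "Subject: Two parts\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain\n"
    "\n"
    "First part.\n"
    "--BOUNDARY--\n"
)

# Parser descends into the body and builds a sub-message per part.
full = Parser().parsestr(raw)

# HeaderParser reads the headers only and leaves the body unparsed.
headers_only = HeaderParser().parsestr(raw)

subject = headers_only["Subject"]
raw_body = headers_only.get_payload()  # the whole body as one string
```

Both calls use the same parsestr() method, but only the tree returned by Parser reports itself as multipart. The HeaderParser result keeps the boundary lines inside its string payload.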

