SquirrelMail MIME Support Introduction
======================================

This document is intended for people who want to understand how the MIME
code works. It is a technical description of how mime.php works and how it
parses a MIME-encoded message.


Object Structure
----------------
Two objects are used: "message" and "msg_header". Here is a brief overview
of what each object contains.

msg_header
   Contains variables for all the necessary parts of the header of a
   message. This includes (but is not limited to) the following: to, from,
   subject, type (type0), subtype (type1), filename ...

message
   This contains the structure for the message. It contains two parts:
   $header and $entities[]. $header is of type msg_header, and $entities[]
   is an array of message objects. The $entities[] array is optional. If
   it does not exist, then we are at a leaf node, and have an actual
   attachment (entity) that can be displayed. Here is a tree view of how
   this object is structured.

      header
      entities
        |
        +--- header
        |
        +--- header
        |    entities
        |      |
        |      +--- header
        |      |
        |      +--- header
        |
        +--- header
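
To make the tree above concrete, here is a minimal sketch of the two objects
in Python. It is an illustration only, not the PHP code in mime.php: the
class names MsgHeader and Message, the helper count_leaves, and the sample
MIME types and filename are all assumptions made for this example. Only the
fields type0, type1, filename, header and entities come from the description
above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MsgHeader:
    """Hypothetical stand-in for msg_header; only a few fields are shown."""
    type0: str = "text"    # MIME type
    type1: str = "plain"   # MIME subtype
    filename: str = ""


@dataclass
class Message:
    """Hypothetical stand-in for message: a header plus optional children."""
    header: MsgHeader
    entities: List["Message"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # No child entities means this node is a displayable entity.
        return not self.entities


def count_leaves(msg: Message) -> int:
    """Recursively count the displayable entities below (and including) msg."""
    if msg.is_leaf():
        return 1
    return sum(count_leaves(child) for child in msg.entities)


# Same shape as the tree view: three children, the second one nested.
tree = Message(
    MsgHeader("multipart", "mixed"),
    [
        Message(MsgHeader("text", "plain")),
        Message(
            MsgHeader("multipart", "alternative"),
            [
                Message(MsgHeader("text", "plain")),
                Message(MsgHeader("text", "html")),
            ],
        ),
        Message(MsgHeader("image", "png", "logo.png")),
    ],
)
```

A node with an empty entities list plays the role of the leaf node described
above; every other node only groups its children.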


Getting the Structure
---------------------
Previously (version 0.4 and below), SquirrelMail handled all the parsing of
the email message itself. It would read the entire message in, search for
boundaries, and create an array similar to the $message object described
above. This was very inefficient.

Currently, all the parsing of the body of the message takes place on the
IMAP server itself. According to RFC 2060 section 7.4.2, we can use the
BODYSTRUCTURE data item of the FETCH command, which returns the structure
of the body (imagine that). The RFC goes into detail about how the
BODYSTRUCTURE response is formatted, and we have based our new MIME support
on this specification.

A simple text/plain message would have a BODYSTRUCTURE similar to the
following:

  ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)

A more complicated multipart message with an attachment would look like:

  (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)("TEXT"
  "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff")
  "<960723163407.20117h@cac.washington.edu>" "Compiler diff" "BASE64"
  4554 73) "MIXED")

Our MIME functionality implements several functions that recursively run
through this text and parse out the structure of the message. If you want
to learn more about how the structure of a message is returned in the
BODYSTRUCTURE response, please see RFC 2060 section 7.4.2.
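
The recursive parsing described above can be sketched roughly as follows.
This is a simplified Python illustration, not the parser in mime.php: the
function names parse_sexp and to_entity and the dictionary layout are
assumptions made for this example. It handles quoted strings, NIL, numbers
and nested lists only; IMAP literals and backslash unescaping are left out.

```python
import re

# One token is a parenthesis, a quoted string, or a bare atom.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')


def parse_sexp(text):
    """Turn a parenthesized IMAP list into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            return tok[1:-1]  # quoted string, escapes left as-is
        if tok.upper() == "NIL":
            return None
        return int(tok) if tok.isdigit() else tok

    return read()


def to_entity(node):
    """Convert a parsed BODYSTRUCTURE list into a small dict tree."""
    if isinstance(node[0], list):
        # Multipart: the leading lists are the parts, then the subtype.
        children = []
        i = 0
        while i < len(node) and isinstance(node[i], list):
            children.append(to_entity(node[i]))
            i += 1
        return {"type0": "multipart", "type1": node[i].lower(),
                "entities": children}
    # Single part: type, subtype, params, id, description, encoding, size.
    return {"type0": node[0].lower(), "type1": node[1].lower(),
            "encoding": node[5].lower(), "size": node[6], "entities": []}


simple = '("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)'
multi = ('(("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)'
         '("TEXT" "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff") '
         '"<960723163407.20117h@cac.washington.edu>" "Compiler diff" '
         '"BASE64" 4554 73) "MIXED")')

simple_entity = to_entity(parse_sexp(simple))
multi_entity = to_entity(parse_sexp(multi))
```

Splitting the work into a generic list parser and a separate conversion step
keeps the recursion easy to follow: the first function only understands
parentheses, and the second only understands the field order of a body part.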

NOTE:  SquirrelMail passes the MIME Torture Test written by Mark
       Crispin (author of the IMAP protocol). This message is crazy! It
       has about 30 parts nested inside each other, which makes it a
       very good test. It can be found here:

  ftp://ftp.lysator.liu.se/mirror/unix/imapd/mime/torture-test.mbox

Getting the Body
----------------
Once the whole structure of the message has been read into the $message
object, we then need to display the body of one entity. There are a number
of ways we decide which entity to display at a certain time, and I won't go
into that here.

Each entity has its own ID. Entity IDs look something like "1.2.1", or
"4.1", or just "2". You can find a detailed description of how entities
should be identified by reading RFC 2060 section 6.4.5. To fetch the body
of a particular entity, we use the FETCH data item "BODY[<section>]". For
instance, if we wanted to retrieve entity 1.2.1, we would send the IMAP
server the command: "a001 FETCH <msg_id> BODY[1.2.1]".
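
As a rough illustration of how entity IDs relate to the message tree, the
Python sketch below numbers the children of each multipart node from 1 and
joins the levels with dots, then builds the FETCH command shown above. The
function names entity_ids and fetch_body_command and the dictionary layout
are assumptions made for this example, not part of mime.php. Numbering of
parts nested inside message/rfc822 entities is not covered; see the RFC for
the full rules.

```python
def entity_ids(entity, prefix=""):
    """Return {entity_id: entity} for every node below the given node."""
    ids = {}
    for index, child in enumerate(entity["entities"], start=1):
        # Children are numbered from 1; nested levels are joined by dots.
        child_id = f"{prefix}.{index}" if prefix else str(index)
        ids[child_id] = child
        ids.update(entity_ids(child, child_id))
    return ids


def fetch_body_command(tag, msg_id, entity_id):
    """Build the IMAP command that fetches the body of one entity."""
    return f"{tag} FETCH {msg_id} BODY[{entity_id}]"


# Hypothetical message: plain text, an alternative pair, and an attachment.
root = {"type": "multipart/mixed", "entities": [
    {"type": "text/plain", "entities": []},
    {"type": "multipart/alternative", "entities": [
        {"type": "text/plain", "entities": []},
        {"type": "text/html", "entities": []},
    ]},
    {"type": "application/pdf", "entities": []},
]}

ids = entity_ids(root)
command = fetch_body_command("a001", 42, "2.2")
```

Note that a message that is not multipart has no children in this layout;
its single body is addressed as entity "1".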

The server's response contains the entire body of that entity as a string.
Depending on what the entity's header says (for example, its transfer
encoding), we may need to decode it or otherwise process it before it can
be displayed.
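
The decoding step can be sketched like this in Python, assuming the
transfer encoding has already been read from the entity's header. The
function name decode_body is hypothetical; base64 and quopri are modules
from the Python standard library. This shows only transfer decoding, not
character set conversion or anything else mime.php may do.

```python
import base64
import quopri


def decode_body(raw: bytes, encoding: str) -> bytes:
    """Undo the content transfer encoding named in the entity's header."""
    enc = encoding.lower()
    if enc == "base64":
        return base64.b64decode(raw)
    if enc == "quoted-printable":
        return quopri.decodestring(raw)
    # 7bit, 8bit and binary bodies need no transfer decoding.
    return raw


attachment = decode_body(b"SGVsbG8sIE1JTUUh", "BASE64")
text = decode_body(b"caf=C3=A9", "quoted-printable")
plain = decode_body(b"just text", "7BIT")
```

The encoding name is compared case-insensitively because servers commonly
report it in upper case, as in the BODYSTRUCTURE examples earlier.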


Closing Notes
-------------
That is basically how it works. There is a variable in mime.php called
$debug_mime that is defined at the top of that file. If you set it to true,
it will output all kinds of valuable information while it tries to decode
the MIME message.

The code in mime.php is pretty well documented, so you might want to poke
around there as well to find out more details of how this works.

If you have questions about this, please direct them to our mailing list:
squirrelmail-users@lists.sourceforge.net