1 | mime.txt |
2 | by Luke Ehresman |
3 | June 22, 2000 - Last updated: June 22, 2000 |
4 | |
5 | Who should read this? |
6 | --------------------- |
7 | This document is intended for people who want to understand how the MIME |
8 | code works. It is technical documentation of how mime.php works and how |
9 | it parses a MIME-encoded message. |
10 | |
11 | |
12 | Object Structure |
13 | ---------------- |
14 | There are two objects that are used: "message" and "msg_header". Here is a |
15 | brief overview of what each object contains. |
16 | |
17 | msg_header |
18 | Contains variables for all the necessary parts of the header of a |
19 | message. This includes (but is not limited to) the following: to, from, |
20 | subject, type (type0), subtype (type1), filename ... |
21 | |
22 | message |
23 | This contains the structure for the message. It contains two parts: |
24 | $header and $entities[]. $header is of type msg_header, and $entities[] |
25 | is an array of type message. The $entities[] array is optional. If |
26 | it does not exist, then we are at a leaf node, and have an actual |
27 | attachment (entity) that can be displayed. Here is a tree view of how |
28 | this object functions. |
29 | |
30 |     header |
31 |     entities |
32 |       | |
33 |       +--- header |
34 |       | |
35 |       +--- header |
36 |       |    entities |
37 |       |      | |
38 |       |      +--- header |
39 |       |      | |
40 |       |      +--- header |
41 |       | |
42 |       +--- header |
43 | |
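The two objects can be sketched in a few lines. This is only an
illustration in Python, not the PHP from mime.php: the class names
MsgHeader and Message, the leaves() helper, and the sample values
(including "logo.png") are assumptions made for this example, and only a
handful of the header fields are shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MsgHeader:
    # A few of the header fields named above; the real object has more.
    type0: str = "text"   # MIME type
    type1: str = "plain"  # MIME subtype
    filename: str = ""


@dataclass
class Message:
    header: MsgHeader
    # Child messages; empty for a leaf node (a displayable entity).
    entities: List["Message"] = field(default_factory=list)


def leaves(msg: Message) -> List[Message]:
    """Collect the leaf entities of a message tree, depth first."""
    if not msg.entities:
        return [msg]
    found = []
    for child in msg.entities:
        found.extend(leaves(child))
    return found


# The same shape as the tree view above: three children, the second of
# which has two children of its own.
tree = Message(MsgHeader("multipart", "mixed"), [
    Message(MsgHeader()),
    Message(MsgHeader("multipart", "alternative"),
            [Message(MsgHeader()), Message(MsgHeader("text", "html"))]),
    Message(MsgHeader("image", "png", "logo.png")),
])
```

Calling leaves(tree) walks the tree and returns the four entities that
have no $entities[] of their own, which are the ones that can be
displayed.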
44 | |
45 | Getting the Structure |
46 | --------------------- |
47 | Previously (version 0.4 and below), SquirrelMail handled all the parsing of |
48 | the email message. It would read the entire message in, search for |
49 | boundaries, and create an array similar to the $message object described |
50 | above. This was very inefficient. |
51 | |
52 | Currently, all the parsing of the body of the message takes place on the |
53 | IMAP server itself. According to RFC 2060 section 7.4.2, we can use the |
54 | BODYSTRUCTURE function, which returns the structure of the body (imagine |
55 | that). The RFC goes into detail about how the bodystructure should be |
56 | formatted, and we have based our new MIME support on this specification. |
57 | |
58 | A simple text/plain message would have a BODYSTRUCTURE similar to the |
59 | following: |
60 | |
61 | ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23) |
62 | |
63 | A more complicated multipart message with an attachment would look like: |
64 | |
65 | (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)("TEXT" |
66 | "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff") |
67 | "<960723163407.20117h@cac.washington.edu>" "Compiler diff" "BASE64" |
68 | 4554 73) "MIXED") |
69 | |
70 | Our MIME functionality implements several functions that recursively |
71 | run through this text and parse out the structure of the message. If you |
72 | want to learn more about how the structure of a message is returned with |
73 | the BODYSTRUCTURE function, please see RFC 2060 section 7.4.2. |
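
To give a feel for the recursive approach, here is a minimal sketch in
Python. It is not the code from mime.php: the function name
parse_bodystructure is made up for this example, and the sketch assumes
the response contains only quoted strings, NIL, numbers, and nested
parentheses (IMAP literals are not handled).

```python
def parse_bodystructure(text):
    """Parse a parenthesized BODYSTRUCTURE string into nested lists."""
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_item():
        nonlocal pos
        skip_spaces()
        if pos >= len(text):
            raise ValueError("unexpected end of input")
        ch = text[pos]
        if ch == "(":
            # A nested list: recurse until the matching ")".
            pos += 1
            items = []
            while True:
                skip_spaces()
                if pos >= len(text):
                    raise ValueError("unbalanced parentheses")
                if text[pos] == ")":
                    pos += 1
                    return items
                items.append(parse_item())
        if ch == '"':
            # A quoted string; a backslash escapes the next character.
            pos += 1
            chars = []
            while pos < len(text) and text[pos] != '"':
                if text[pos] == "\\" and pos + 1 < len(text):
                    pos += 1
                chars.append(text[pos])
                pos += 1
            if pos >= len(text):
                raise ValueError("unterminated string")
            pos += 1
            return "".join(chars)
        # Otherwise an atom: NIL, a number, or a bare word.
        start = pos
        while (pos < len(text) and not text[pos].isspace()
               and text[pos] not in "()"):
            pos += 1
        atom = text[start:pos]
        if not atom:
            raise ValueError("unexpected character at %d" % start)
        if atom.upper() == "NIL":
            return None
        return int(atom) if atom.isdigit() else atom

    return parse_item()


simple = parse_bodystructure(
    '("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)')
```

Here simple[0] and simple[1] hold the type and subtype, and simple[2] is
the parameter list. For a multipart message the first elements are
themselves lists (one per part), followed by the multipart subtype.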
74 | |
75 | NOTE: SquirrelMail passes the MIME Torture Test written by Mark |
76 | Crispin (author of the IMAP protocol). This message is crazy! It |
77 | has about 30 parts nested inside each other. A very good test, |
78 | and SquirrelMail passed it. It can be found here: |
79 | |
80 | ftp://ftp.lysator.liu.se/mirror/unix/imapd/mime/torture-test.mbox |
81 | |
82 | Getting the Body |
83 | ---------------- |
84 | Once all of the structure of the message has been read into the $message |
85 | object, we then need to display the body of one entity. There are a number |
86 | of ways we decide which entity to display at a certain time, and I won't go |
87 | into that here. |
88 | |
89 | Each entity has its own ID. Entity IDs look something like "1.2.1", or |
90 | "4.1", or just "2". You can find a detailed description of how entities |
91 | should be identified by reading RFC 2060 section 6.4.5. To fetch the body |
92 | of a particular entity, we use the function "BODY[<section>]". For |
93 | instance, if we wanted to fetch entity 1.2.1, we would send the |
94 | IMAP server the command: "a001 FETCH <msg_id> BODY[1.2.1]". |
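
As a rough illustration of how entity IDs line up with the structure,
the Python sketch below numbers the parts of a nested list and builds the
FETCH command for one of them. The function names entity_ids and
fetch_command are hypothetical, the nested-list input format is an
assumption of this example, and the special numbering of parts inside an
embedded message/rfc822 entity is deliberately ignored.

```python
def entity_ids(structure, prefix=""):
    """Map each leaf entity ID to its (type, subtype) pair."""
    if not isinstance(structure[0], list):
        # A single part. A message that is not multipart is entity "1".
        return {prefix or "1": (structure[0], structure[1])}
    ids = {}
    number = 0
    for child in structure:
        if not isinstance(child, list):
            break  # reached the multipart subtype, e.g. "MIXED"
        number += 1
        child_id = "%s.%d" % (prefix, number) if prefix else str(number)
        ids.update(entity_ids(child, child_id))
    return ids


def fetch_command(tag, msg_id, entity_id):
    """Build the IMAP command that fetches the body of one entity."""
    return "%s FETCH %s BODY[%s]" % (tag, msg_id, entity_id)


# A text part followed by a nested multipart with two alternatives.
structure = [
    ["TEXT", "PLAIN"],
    [["TEXT", "PLAIN"], ["TEXT", "HTML"], "ALTERNATIVE"],
    "MIXED",
]
ids = entity_ids(structure)
command = fetch_command("a001", 42, "2.2")
```

In this sample, ids has the keys "1", "2.1", and "2.2", and command is
the string "a001 FETCH 42 BODY[2.2]", which asks the server for the
text/html part.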
95 | |
96 | The server returns the entire body of that entity as a string. Based upon |
97 | what is in the header, we may need to decode it or otherwise process it. |
98 | |
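The decoding step might look something like the Python sketch below. It
is an illustration only, not what mime.php does: the function name
decode_body and its parameters are assumptions of this example. It
handles the base64 and quoted-printable transfer encodings and passes
everything else (7BIT, 8BIT, BINARY) through unchanged.

```python
import base64
import quopri


def decode_body(raw, encoding, charset="us-ascii"):
    """Undo the transfer encoding of a fetched body and return text."""
    encoding = encoding.upper()
    if encoding == "BASE64":
        data = base64.b64decode(raw)
    elif encoding == "QUOTED-PRINTABLE":
        data = quopri.decodestring(raw)
    else:
        # 7BIT, 8BIT and BINARY bodies need no transfer decoding.
        data = raw
    # Bytes that are invalid in the charset are replaced, not fatal.
    return data.decode(charset, errors="replace")


text = decode_body(b"SGVsbG8=", "base64")
```

The encoding and charset arguments would come from the header of the
entity, for example the "BASE64" and "US-ASCII" values seen in a
BODYSTRUCTURE response.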
99 | |
100 | Closing Notes |
101 | ------------- |
102 | That is basically how it works. There is a variable in mime.php called |
103 | $debug_mime that is defined at the top of that file. If you set it to true, |
104 | it will output all kinds of valuable information while it tries to decode |
105 | the MIME message. |
106 | |
107 | The code in mime.php is pretty well documented, so you might want to poke |
108 | around there as well to find out more details of how this works. |
109 | |
110 | If you have questions about this, please direct them to our mailing list: |
111 | squirrelmail-list@sourceforge.net |