| 1 | <?xml version="1.0" encoding="iso-8859-1"?> |
| 2 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" |
| 3 | "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| 4 | <html xmlns="http://www.w3.org/1999/xhtml" |
| 5 | lang="en" xml:lang="en"> |
| 6 | <head> |
| 7 | <title>GNU MediaGoblin</title> |
| 8 | <meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1"/> |
| 9 | <meta name="generator" content="Org-mode"/> |
| 10 | <meta name="generated" content="2011-03-27 23:35:24 CDT"/> |
| 11 | <meta name="author" content="Christopher Allan Webber"/> |
| 12 | <meta name="description" content=""/> |
| 13 | <meta name="keywords" content=""/> |
| 14 | <style type="text/css"> |
| 15 | <!--/*--><![CDATA[/*><!--*/ |
| 16 | html { font-family: Times, serif; font-size: 12pt; } |
| 17 | .title { text-align: center; } |
| 18 | .todo { color: red; } |
| 19 | .done { color: green; } |
| 20 | .tag { background-color: #add8e6; font-weight:normal } |
| 21 | .target { } |
| 22 | .timestamp { color: #bebebe; } |
| 23 | .timestamp-kwd { color: #5f9ea0; } |
| 24 | .right {margin-left:auto; margin-right:0px; text-align:right;} |
| 25 | .left {margin-left:0px; margin-right:auto; text-align:left;} |
| 26 | .center {margin-left:auto; margin-right:auto; text-align:center;} |
| 27 | p.verse { margin-left: 3% } |
| 28 | pre { |
| 29 | border: 1pt solid #AEBDCC; |
| 30 | background-color: #F3F5F7; |
| 31 | padding: 5pt; |
| 32 | font-family: courier, monospace; |
| 33 | font-size: 90%; |
| 34 | overflow:auto; |
| 35 | } |
| 36 | table { border-collapse: collapse; } |
| 37 | td, th { vertical-align: top; } |
| 38 | th.right { text-align:center; } |
| 39 | th.left { text-align:center; } |
| 40 | th.center { text-align:center; } |
| 41 | td.right { text-align:right; } |
| 42 | td.left { text-align:left; } |
| 43 | td.center { text-align:center; } |
| 44 | dt { font-weight: bold; } |
| 45 | div.figure { padding: 0.5em; } |
| 46 | div.figure p { text-align: center; } |
| 47 | textarea { overflow-x: auto; } |
| 48 | .linenr { font-size:smaller } |
| 49 | .code-highlighted {background-color:#ffff00;} |
| 50 | .org-info-js_info-navigation { border-style:none; } |
| 51 | #org-info-js_console-label { font-size:10px; font-weight:bold; |
| 52 | white-space:nowrap; } |
| 53 | .org-info-js_search-highlight {background-color:#ffff00; color:#000000; |
| 54 | font-weight:bold; } |
| 55 | /*]]>*/--> |
| 56 | </style> |
| 57 | <script type="text/javascript"> |
| 58 | <!--/*--><![CDATA[/*><!--*/ |
| 59 | function CodeHighlightOn(elem, id) |
| 60 | { |
| 61 | var target = document.getElementById(id); |
| 62 | if(null != target) { |
| 63 | elem.cacheClassElem = elem.className; |
| 64 | elem.cacheClassTarget = target.className; |
| 65 | target.className = "code-highlighted"; |
| 66 | elem.className = "code-highlighted"; |
| 67 | } |
| 68 | } |
| 69 | function CodeHighlightOff(elem, id) |
| 70 | { |
| 71 | var target = document.getElementById(id); |
| 72 | if(elem.cacheClassElem) |
| 73 | elem.className = elem.cacheClassElem; |
| 74 | if(elem.cacheClassTarget) |
| 75 | target.className = elem.cacheClassTarget; |
| 76 | } |
| 77 | /*]]>*///--> |
| 78 | </script> |
| 79 | |
| 80 | </head> |
| 81 | <body> |
| 82 | <div id="content"> |
| 83 | |
| 84 | <h1 class="title">GNU MediaGoblin</h1> |
| 85 | |
| 86 | |
| 87 | <div id="table-of-contents"> |
| 88 | <h2>Table of Contents</h2> |
| 89 | <div id="text-table-of-contents"> |
| 90 | <ul> |
| 91 | <li><a href="#sec-1">1 About </a></li> |
| 92 | <li><a href="#sec-2">2 Milestones </a> |
| 93 | <ul> |
| 94 | <li><a href="#sec-2_1">2.1 Basic image hosting </a></li> |
| 95 | <li><a href="#sec-2_2">2.2 Multi-media hosting (including video and audio) </a></li> |
| 96 | <li><a href="#sec-2_3">2.3 API(s) </a></li> |
| 97 | <li><a href="#sec-2_4">2.4 Federation </a></li> |
| 98 | <li><a href="#sec-2_5">2.5 Plugin system </a></li> |
| 99 | </ul> |
| 100 | </li> |
| 101 | <li><a href="#sec-3">3 Technology </a> |
| 102 | <ul> |
| 103 | <li><a href="#sec-3_1">3.1 Why python </a></li> |
| 104 | <li><a href="#sec-3_2">3.2 Why mongodb </a></li> |
| 105 | <li><a href="#sec-3_3">3.3 Why wsgi minimalism / Why not Django </a></li> |
| 106 | </ul> |
| 107 | </li> |
| 108 | </ul> |
| 109 | </div> |
| 110 | </div> |
| 111 | |
| 112 | <div id="outline-container-1" class="outline-2"> |
| 113 | <h2 id="sec-1"><span class="section-number-2">1</span> About </h2> |
| 114 | <div class="outline-text-2" id="text-1"> |
| 115 | |
| 116 | |
| 117 | <p> |
| 118 | What is MediaGoblin? I'm shooting for: |
| 119 | </p> |
| 120 | <ul> |
| 121 | <li>Initially, a place to store all your photos that's as awesome as, |
| 122 | if not more awesome than, existing proprietary solutions |
| 123 | </li> |
| 124 | <li>Later, a place for hosting all sorts of media, such as video, |
| 125 | music, etc. |
| 126 | </li> |
| 127 | <li>Federated, like StatusNet/OStatus (we should use OStatus, in fact!) |
| 128 | </li> |
| 129 | <li>Customizable |
| 130 | </li> |
| 131 | <li>A place for people to collaborate and show off original and derived |
| 132 | creations |
| 133 | </li> |
| 134 | <li>Free, as in freedom. Under the GNU AGPL, v3 or later. Encourages |
| 135 | free formats and free licensing for content, too. |
| 136 | </li> |
| 137 | </ul> |
| 138 | |
| 139 | <p> |
| 140 | Wow! That's pretty ambitious. Hopefully we're cool enough to do it. |
| 141 | I think we can. |
| 142 | </p> |
| 143 | <p> |
| 144 | It's also necessary, for multiple reasons. Centralization and |
| 145 | proprietization of media on the internet are a serious problem and |
| 146 | make the web go from a system of extreme resilience to a system |
| 147 | of frightening fragility. People should be able to own their data. |
| 148 | Etc. If you're reading this, chances are you already agree though. :) |
| 149 | </p> |
| 150 | </div> |
| 151 | |
| 152 | </div> |
| 153 | |
| 154 | <div id="outline-container-2" class="outline-2"> |
| 155 | <h2 id="sec-2"><span class="section-number-2">2</span> Milestones </h2> |
| 156 | <div class="outline-text-2" id="text-2"> |
| 157 | |
| 158 | |
| 159 | <p> |
| 160 | Excepting the first, not necessarily in this order. |
| 161 | </p> |
| 162 | |
| 163 | </div> |
| 164 | |
| 165 | <div id="outline-container-2_1" class="outline-3"> |
| 166 | <h3 id="sec-2_1"><span class="section-number-3">2.1</span> Basic image hosting </h3> |
| 167 | <div class="outline-text-3" id="text-2_1"> |
| 168 | |
| 169 | </div> |
| 170 | |
| 171 | </div> |
| 172 | |
| 173 | <div id="outline-container-2_2" class="outline-3"> |
| 174 | <h3 id="sec-2_2"><span class="section-number-3">2.2</span> Multi-media hosting (including video and audio) </h3> |
| 175 | <div class="outline-text-3" id="text-2_2"> |
| 176 | |
| 177 | </div> |
| 178 | |
| 179 | </div> |
| 180 | |
| 181 | <div id="outline-container-2_3" class="outline-3"> |
| 182 | <h3 id="sec-2_3"><span class="section-number-3">2.3</span> API(s) </h3> |
| 183 | <div class="outline-text-3" id="text-2_3"> |
| 184 | |
| 185 | </div> |
| 186 | |
| 187 | </div> |
| 188 | |
| 189 | <div id="outline-container-2_4" class="outline-3"> |
| 190 | <h3 id="sec-2_4"><span class="section-number-3">2.4</span> Federation </h3> |
| 191 | <div class="outline-text-3" id="text-2_4"> |
| 192 | |
| 193 | |
| 194 | <p> |
| 195 | Maybe this is 0.2 :) |
| 196 | </p> |
| 197 | </div> |
| 198 | |
| 199 | </div> |
| 200 | |
| 201 | <div id="outline-container-2_5" class="outline-3"> |
| 202 | <h3 id="sec-2_5"><span class="section-number-3">2.5</span> Plugin system </h3> |
| 203 | <div class="outline-text-3" id="text-2_5"> |
| 204 | |
| 205 | |
| 206 | </div> |
| 207 | </div> |
| 208 | |
| 209 | </div> |
| 210 | |
| 211 | <div id="outline-container-3" class="outline-2"> |
| 212 | <h2 id="sec-3"><span class="section-number-2">3</span> Technology </h2> |
| 213 | <div class="outline-text-2" id="text-3"> |
| 214 | |
| 215 | |
| 216 | <p> |
| 217 | I have a pretty specific set of tools that I expect to use in this |
| 218 | project. Those are: |
| 219 | </p> |
| 220 | <ul> |
| 221 | <li><b><a href="http://python.org/">Python</a>:</b> because I love, and know well, the language |
| 222 | </li> |
| 223 | <li><b><a href="http://www.mongodb.org/">MongoDB</a>:</b> a "document database", because it's extremely flexible |
| 224 | (and scales up well, but I guess not down well) |
| 225 | </li> |
| 226 | <li><b><a href="http://namlook.github.com/mongokit/">MongoKit</a>:</b> a lightweight ORM for mongodb. Helps us define our |
| 227 | structures better, does schema validation, schema evolution, and |
| 228 | helps make things more fun and pythonic. |
| 229 | </li> |
| 230 | <li><b><a href="http://jinja.pocoo.org/docs/">Jinja2</a>:</b> for templating. Pretty much django templates++ (wow, I |
| 231 | can actually pass arguments into method calls instead of tediously |
| 232 | writing custom tags!) |
| 233 | </li> |
| 234 | <li><b><a href="http://wtforms.simplecodes.com/">WTForms</a>:</b> for form handling, validation, abstraction. Almost just |
| 235 | like Django's forms. |
| 236 | </li> |
| 237 | <li><b><a href="http://pythonpaste.org/webob/">WebOb</a>:</b> gives nice request/response objects (also somewhat djangoish) |
| 238 | </li> |
| 239 | <li><b><a href="http://pythonpaste.org/deploy/">Paste Deploy</a> and <a href="http://pythonpaste.org/script/">Paste Script</a>:</b> as the default way of configuring |
| 240 | and launching the application. Since MediaGoblin will be fairly |
| 241 | WSGI minimalist, you can probably launch it in other ways too, |
| 242 | but this will be the default. |
| 243 | </li> |
| 244 | <li><b><a href="http://routes.groovie.org/">Routes</a>:</b> for URL routing. It works well enough. |
| 245 | </li> |
| 246 | <li><b><a href="http://jquery.com/">jQuery</a>:</b> for all sorts of things on the JavaScript end of things, |
| 247 | for all sorts of reasons. |
| 248 | </li> |
| 249 | <li><b><a href="http://beaker.groovie.org/">Beaker</a>:</b> for sessions, because that seems like it's generally |
| 250 | considered the way to go I guess. |
| 251 | </li> |
| 252 | <li><b><a href="http://somethingaboutorange.com/mrl/projects/nose/1.0.0/">nose</a>:</b> for unit tests, because it makes testing a bit nicer. |
| 253 | </li> |
| 254 | <li><b><a href="http://celeryproject.org/">Celery</a>:</b> for task queueing (think resizing images, encoding |
| 255 | video) because some people like it, and even the people I know who |
| 256 | don't like it don't seem to know of anything better :) |
| 257 | </li> |
| 258 | <li><b><a href="http://www.rabbitmq.com/">RabbitMQ</a>:</b> for sending tasks to celery, because I guess that's |
| 259 | what most people do. Might be optional, might also let people use |
| 260 | MongoDB for this if they want. |
| 261 | </li> |
| 262 | </ul> |
| 263 | |
| 264 | |
| 265 | </div> |
| 266 | |
| 267 | <div id="outline-container-3_1" class="outline-3"> |
| 268 | <h3 id="sec-3_1"><span class="section-number-3">3.1</span> Why python </h3> |
| 269 | <div class="outline-text-3" id="text-3_1"> |
| 270 | |
| 271 | |
| 272 | <p> |
| 273 | Because I (Chris Webber) know Python, love Python, am capable of |
| 274 | actually making this thing happen in Python (I've worked on a lot of |
| 275 | large free software web applications before in Python, including |
| 276 | <a href="http://mirocommunity.org/">Miro Community</a>, the <a href="http://miroguide.org">Miro Guide</a>, a large portion of |
| 277 | <a href="http://creativecommons.org/">Creative Commons' site</a>, and a whole bunch of things while working at |
| 278 | <a href="http://www.imagescape.com/">Imaginary Landscape</a>). I know Python, I can make this happen in |
| 279 | Python, me starting a project like this makes sense if it's done in |
| 280 | Python. |
| 281 | </p> |
| 282 | <p> |
| 283 | You might say that PHP is way more deployable, that Rails has way more |
| 284 | cool developers riding around on fixie bikes, and all of those things |
| 285 | are true, but I know Python, like Python, and think that Python is |
| 286 | pretty great. I do think that deployment in Python is not as good as |
| 287 | with PHP, but I think the days of shared hosting are (thankfully) |
| 288 | coming to an end, and will probably be replaced by cheap virtual |
| 289 | machines spun up on the fly for people who want that sort of stuff, |
| 290 | and Python will be a huge part of that future, maybe even more than |
| 291 | PHP will. The deployment tools are getting better. Maybe we can use |
| 292 | something like Silver Lining. Maybe we can just distribute as .debs |
| 293 | or .rpms. We'll figure it out. |
| 294 | </p> |
| 295 | <p> |
| 296 | But if I'm starting this project, which I am, it's gonna be in Python. |
| 297 | </p> |
| 298 | </div> |
| 299 | |
| 300 | </div> |
| 301 | |
| 302 | <div id="outline-container-3_2" class="outline-3"> |
| 303 | <h3 id="sec-3_2"><span class="section-number-3">3.2</span> Why mongodb </h3> |
| 304 | <div class="outline-text-3" id="text-3_2"> |
| 305 | |
| 306 | |
| 307 | <p> |
| 308 | In case you were wondering, I am not a NoSQL fanboy, and I do not go |
| 309 | around telling people that MongoDB is web scale. Actually my reason |
| 310 | for choosing MongoDB isn't scalability, though scaling up really nicely is a |
| 311 | pretty good feature and sets us up well in case large volume sites |
| 312 | eventually do use MediaGoblin. But there's another side of |
| 313 | scalability, and that's scaling down, which is important for |
| 314 | federation, maybe even more important than scaling up in an ideal |
| 315 | universe where everyone ran servers out of their own homes. As a |
| 316 | memory-mapped database, MongoDB is pretty hungry, so actually I spent |
| 317 | a lot of time debating whether its inability to scale down as nicely |
| 318 | as SQL does with something like SQLite meant that it was out. |
| 319 | </p> |
| 320 | <p> |
| 321 | But I decided in the end that I really want MongoDB, not for |
| 322 | scalability, but for flexibility. Schema evolution pains in SQL are |
| 323 | almost enough reason for me to want MongoDB, but not quite. The real |
| 324 | reason is because I want the ability to eventually handle multiple |
| 325 | media types through MediaGoblin, and also allow for plugins, without |
| 326 | the rigidity of tables making that difficult. In other words, |
| 327 | something like: |
| 328 | </p> |
| 329 | |
| 330 | |
| 331 | |
| 332 | <pre class="example">{"title": "Me talking until you are bored", |
| 333 | "description": "blah blah blah", |
| 334 | "media_type": "audio", |
| 335 | "media_data": { |
| 336 | "length": "2:30", |
| 337 | "codec": "OGG Vorbis"}, |
| 338 | "plugin_data": { |
| 339 | "licensing": { |
| 340 | "license": "http://creativecommons.org/licenses/by-sa/3.0/"}}} |
| 341 | </pre> |
| 342 | |
| 343 | |
| 344 | |
| 345 | <p> |
| 346 | Being able to just dump media-specific information in a media_data |
| 347 | hashtable is pretty great, and even better is having a plugin system |
| 348 | where you can just let plugins have their own entire key-value space |
| 349 | cleanly inside the document that doesn't interfere with anyone else's |
| 350 | stuff. If we were to let plugins deposit their own information |
| 351 | inside a SQL database, either we'd let plugins create their own tables |
| 352 | which makes SQL migrations even harder than they already are, or we'd |
| 353 | probably end up creating a table with a column for key, a column for |
| 354 | value, and a column for type in one huge table called "plugin_data" or |
| 355 | something similar. (Yo dawg, I heard you liked plugins, so I put a |
| 356 | database in your database so you can query while you query.) Gross. |
| 357 | </p> |
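<p>
A minimal sketch of the per-plugin key-value space described above, with
a plain Python dict standing in for the MongoDB document. The helper
names (set_plugin_data, get_plugin_data) and the "geolocation" plugin
are hypothetical, made up for illustration; nothing here is
MediaGoblin's actual plugin API.
</p>

```python
# Hypothetical helpers; a plain dict stands in for a MongoDB document.

def set_plugin_data(entry, plugin_name, key, value):
    """Store a value inside the named plugin's own namespace."""
    namespace = entry.setdefault("plugin_data", {}).setdefault(plugin_name, {})
    namespace[key] = value
    return entry


def get_plugin_data(entry, plugin_name, key, default=None):
    """Read a value back from the plugin's namespace, if it exists."""
    return entry.get("plugin_data", {}).get(plugin_name, {}).get(key, default)


entry = {
    "title": "Me talking until you are bored",
    "description": "blah blah blah",
    "media_type": "audio",
    "media_data": {"length": "2:30", "codec": "OGG Vorbis"},
}

set_plugin_data(entry, "licensing", "license",
                "http://creativecommons.org/licenses/by-sa/3.0/")

# Two plugins reuse the same key name without stepping on each other,
# because each one only writes under plugin_data[its own name].
set_plugin_data(entry, "licensing", "version", 1)
set_plugin_data(entry, "geolocation", "version", 2)
```

<p>
In this sketch, adding a new plugin only adds a new key under
plugin_data; the rest of the document is left alone.
</p>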
| 358 | <p> |
| 359 | I also don't want things to be so loose that we forget or lose the |
| 360 | structure of things, and that's one reason why I want to use MongoKit, |
| 361 | because we can cleanly define as much structure as we want and verify |
| 362 | that documents match that structure generally without adding too much |
| 363 | bloat or overhead (MongoKit is a pretty lightweight wrapper and |
| 364 | doesn't inject extra MongoKit-specific stuff into the database, which |
| 365 | is nice and nicer than many other ORMs in that way). |
| 366 | </p> |
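<p>
To make "define as much structure as we want and verify that documents
match" concrete, here is a rough pure-Python sketch. It is only loosely
inspired by the idea of declaring a structure as a dict of types; it is
not MongoKit's API, and STRUCTURE, validate, and the audio-specific
media_data fields are assumptions made for this example.
</p>

```python
# Hypothetical structure for an audio entry. Nested dicts describe nested
# documents; a bare `dict` means "any free-form document goes here".
STRUCTURE = {
    "title": str,
    "description": str,
    "media_type": str,
    "media_data": {"length": str, "codec": str},
    "plugin_data": dict,
}


def validate(document, structure, path=""):
    """Return a list of problems; an empty list means the document fits."""
    problems = []
    for field, expected in structure.items():
        name = path + field
        if field not in document:
            problems.append(name + ": missing")
            continue
        value = document[field]
        if isinstance(expected, dict):
            # Nested structure: recurse when the value is a dict too.
            if isinstance(value, dict):
                problems.extend(validate(value, expected, name + "."))
            else:
                problems.append(name + ": expected a nested document")
        elif not isinstance(value, expected):
            problems.append(name + ": expected " + expected.__name__)
    return problems


good = {
    "title": "Me talking until you are bored",
    "description": "blah blah blah",
    "media_type": "audio",
    "media_data": {"length": "2:30", "codec": "OGG Vorbis"},
    "plugin_data": {},
}
bad = {"title": 42, "media_type": "audio", "media_data": {"length": "2:30"}}

good_problems = validate(good, STRUCTURE)
bad_problems = validate(bad, STRUCTURE)
```

<p>
Fields that are not named in the structure are simply ignored by this
sketch, which is how plugin_data stays free-form while the core fields
are still checked.
</p>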
| 367 | </div> |
| 368 | |
| 369 | </div> |
| 370 | |
| 371 | <div id="outline-container-3_3" class="outline-3"> |
| 372 | <h3 id="sec-3_3"><span class="section-number-3">3.3</span> Why wsgi minimalism / Why not Django </h3> |
| 373 | <div class="outline-text-3" id="text-3_3"> |
| 374 | |
| 375 | |
| 376 | <p> |
| 377 | If you look at the technology list above, you'll see a lot of components |
| 378 | that are very <a href="http://www.djangoproject.com/">Django-like</a>, but not actually Django components. What |
| 379 | can I say, I really like a lot of the ideas in Django! Which leads to |
| 380 | the question: why not just use Django? |
| 381 | </p> |
| 382 | <p> |
| 383 | While I really like Django's ideas and a lot of its components, I also |
| 384 | feel that most of the best ideas in Django that I want have been |
| 385 | implemented as well or even better outside of Django. I could just |
| 386 | use Django and replace the templating system with Jinja2, and the form |
| 387 | system with WTForms, and the database with MongoDB and MongoKit, but |
| 388 | at that point, how much of Django is really left? |
| 389 | </p> |
| 390 | <p> |
| 391 | I also am sometimes saddened and irritated by how coupled all of |
| 392 | Django's components are. Loosely coupled yes, but still coupled. |
| 393 | WSGI has done a good job of providing a base layer for running |
| 394 | applications on and <a href="http://pythonpaste.org/webob/do-it-yourself.html">if you know how to do it yourself</a> it's not hard, and doesn't take |
| 395 | many lines of code, to bind components together without any framework |
| 396 | at all (not even say <a href="http://pylonshq.com/">Pylons</a>, <a href="http://docs.pylonsproject.org/projects/pyramid/dev/">Pyramid</a>, or <a href="http://flask.pocoo.org/">Flask</a> which I think are still |
| 397 | great projects, especially for people who want this sort of thing but |
| 398 | have no idea how to get started). And even at this already really |
| 399 | early stage of writing MediaGoblin, that glue work is mostly done. |
| 400 | </p> |
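<p>
For a sense of how little glue a bare WSGI application needs, here is a
small sketch using only the Python standard library. It is not
MediaGoblin's code: the dict-based ROUTES table is a stand-in for what
Routes would do, the (status, text) view return value is a stand-in for
WebOb's response objects, and every name in it is hypothetical.
</p>

```python
from wsgiref.util import setup_testing_defaults


def index(environ):
    """A view: takes the WSGI environ, returns (status, text)."""
    return "200 OK", "Welcome to MediaGoblin"


def not_found(environ):
    return "404 Not Found", "Nothing here"


# Hypothetical routing table; a real router would match URL patterns.
ROUTES = {"/": index}


def application(environ, start_response):
    """Plain WSGI callable: pick a view by path, then send its response."""
    view = ROUTES.get(environ.get("PATH_INFO", "/"), not_found)
    status, text = view(environ)
    body = text.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call(path):
    """Drive the app without a server, the way a unit test might."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


home = call("/")
missing = call("/nope")
```

<p>
Any WSGI server should be able to run the application callable as-is;
for a quick local try, wsgiref.simple_server.make_server from the
standard library is one option.
</p>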
| 401 | <p> |
| 402 | Not to say I don't think Django is great for a lot of things. For |
| 403 | a lot of stuff, it's still the best, but not for MediaGoblin, I think. |
| 404 | </p> |
| 405 | <p> |
| 406 | One thing that Django does super well though is documentation. It |
| 407 | still has some faults, but even with those considered I can hardly |
| 408 | think of any other project in Python that has documentation as nice |
| 409 | as Django's. It may be worth |
| 410 | <a href="http://pycon.blip.tv/file/4881071/">learning some lessons on documentation from Django</a>, on that note. |
| 411 | </p> |
| 412 | <p> |
| 413 | I'd really like to have a good, thorough hacking-howto and |
| 414 | deployment-howto, with the former including some notes on how to |
| 415 | make it easier for Django hackers to get started. |
| 416 | </p></div> |
| 417 | </div> |
| 418 | </div> |
| 419 | <div id="postamble"> |
| 420 | <p class="author">Author: Christopher Allan Webber</p> |
| 421 | <p class="creator">Org version 7.5 with Emacs version 24</p> |
| 423 | </div> |
| 424 | </div> |
| 425 | </body> |
| 426 | </html> |