<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
               "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="en" xml:lang="en">
<head>
<title>GNU MediaGoblin</title>
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2011-03-27 23:35:24 CDT"/>
<meta name="author" content="Christopher Allan Webber"/>
<meta name="description" content=""/>
<meta name="keywords" content=""/>
<style type="text/css">
 <!--/*--><![CDATA[/*><!--*/
  html { font-family: Times, serif; font-size: 12pt; }
  .title  { text-align: center; }
  .todo   { color: red; }
  .done   { color: green; }
  .tag    { background-color: #add8e6; font-weight:normal }
  .target { }
  .timestamp { color: #bebebe; }
  .timestamp-kwd { color: #5f9ea0; }
  .right  {margin-left:auto; margin-right:0px;  text-align:right;}
  .left   {margin-left:0px;  margin-right:auto; text-align:left;}
  .center {margin-left:auto; margin-right:auto; text-align:center;}
  p.verse { margin-left: 3% }
  pre {
    border: 1pt solid #AEBDCC;
    background-color: #F3F5F7;
    padding: 5pt;
    font-family: courier, monospace;
    font-size: 90%;
    overflow:auto;
  }
  table { border-collapse: collapse; }
  td, th { vertical-align: top;  }
  th.right  { text-align:center;  }
  th.left   { text-align:center;   }
  th.center { text-align:center; }
  td.right  { text-align:right;  }
  td.left   { text-align:left;   }
  td.center { text-align:center; }
  dt { font-weight: bold; }
  div.figure { padding: 0.5em; }
  div.figure p { text-align: center; }
  textarea { overflow-x: auto; }
  .linenr { font-size:smaller }
  .code-highlighted {background-color:#ffff00;}
  .org-info-js_info-navigation { border-style:none; }
  #org-info-js_console-label { font-size:10px; font-weight:bold;
                               white-space:nowrap; }
  .org-info-js_search-highlight {background-color:#ffff00; color:#000000;
                                 font-weight:bold; }
  /*]]>*/-->
</style>
<script type="text/javascript">
<!--/*--><![CDATA[/*><!--*/
 function CodeHighlightOn(elem, id)
 {
   var target = document.getElementById(id);
   if(null != target) {
     elem.cacheClassElem = elem.className;
     elem.cacheClassTarget = target.className;
     target.className = "code-highlighted";
     elem.className   = "code-highlighted";
   }
 }
 function CodeHighlightOff(elem, id)
 {
   var target = document.getElementById(id);
   if(elem.cacheClassElem)
     elem.className = elem.cacheClassElem;
   if(elem.cacheClassTarget)
     target.className = elem.cacheClassTarget;
 }
/*]]>*///-->
</script>

</head>
<body>
<div id="content">

<h1 class="title">GNU MediaGoblin</h1>


<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#sec-1">1 About </a></li>
<li><a href="#sec-2">2 Milestones </a>
<ul>
<li><a href="#sec-2_1">2.1 Basic image hosting </a></li>
<li><a href="#sec-2_2">2.2 Multi-media hosting (including video and audio) </a></li>
<li><a href="#sec-2_3">2.3 API(s) </a></li>
<li><a href="#sec-2_4">2.4 Federation </a></li>
<li><a href="#sec-2_5">2.5 Plugin system </a></li>
</ul>
</li>
<li><a href="#sec-3">3 Technology </a>
<ul>
<li><a href="#sec-3_1">3.1 Why python </a></li>
<li><a href="#sec-3_2">3.2 Why mongodb </a></li>
<li><a href="#sec-3_3">3.3 Why wsgi minimalism / Why not Django </a></li>
</ul>
</li>
</ul>
</div>
</div>

<div id="outline-container-1" class="outline-2">
<h2 id="sec-1"><span class="section-number-2">1</span> About </h2>
<div class="outline-text-2" id="text-1">


<p>
What is MediaGoblin?  I'm shooting for:
</p>
<ul>
<li>Initially, a place to store all your photos that's as awesome as,
    if not more awesome than, existing proprietary solutions
</li>
<li>Later, a place for hosting all sorts of media, such as video,
    music, etc.
</li>
<li>Federated, like StatusNet/OStatus (we should use OStatus, in fact!)
</li>
<li>Customizable
</li>
<li>A place for people to collaborate and show off original and derived
    creations
</li>
<li>Free, as in freedom.  Under the GNU AGPL, v3 or later.  Encourages
    free formats and free licensing for content, too.
</li>
</ul>

<p>
Wow!  That's pretty ambitious.  Hopefully we're cool enough to do it.
I think we can.
</p>
<p>
It's also necessary, for multiple reasons.  Centralization and
proprietization of media on the internet are a serious problem and
turn the web from a system of extreme resilience into a system
of frightening fragility.  People should be able to own their data.
Etc.  If you're reading this, chances are you already agree though. :)
</p>
</div>

</div>

<div id="outline-container-2" class="outline-2">
<h2 id="sec-2"><span class="section-number-2">2</span> Milestones </h2>
<div class="outline-text-2" id="text-2">


<p>
Excepting the first, not necessarily in this order.
</p>

</div>

<div id="outline-container-2_1" class="outline-3">
<h3 id="sec-2_1"><span class="section-number-3">2.1</span> Basic image hosting </h3>
<div class="outline-text-3" id="text-2_1">

</div>

</div>

<div id="outline-container-2_2" class="outline-3">
<h3 id="sec-2_2"><span class="section-number-3">2.2</span> Multi-media hosting (including video and audio) </h3>
<div class="outline-text-3" id="text-2_2">

</div>

</div>

<div id="outline-container-2_3" class="outline-3">
<h3 id="sec-2_3"><span class="section-number-3">2.3</span> API(s) </h3>
<div class="outline-text-3" id="text-2_3">

</div>

</div>

<div id="outline-container-2_4" class="outline-3">
<h3 id="sec-2_4"><span class="section-number-3">2.4</span> Federation </h3>
<div class="outline-text-3" id="text-2_4">


<p>
Maybe this is 0.2 :)
</p>
</div>

</div>

<div id="outline-container-2_5" class="outline-3">
<h3 id="sec-2_5"><span class="section-number-3">2.5</span> Plugin system </h3>
<div class="outline-text-3" id="text-2_5">


</div>
</div>

</div>

<div id="outline-container-3" class="outline-2">
<h2 id="sec-3"><span class="section-number-2">3</span> Technology </h2>
<div class="outline-text-2" id="text-3">


<p>
I have a pretty specific set of tools that I expect to use in this
project.  Those are:
</p>
<ul>
<li><b><a href="http://python.org/">Python</a>:</b> because I love, and know well, the language
</li>
<li><b><a href="http://www.mongodb.org/">MongoDB</a>:</b> a "document database".  Because it's extremely flexible
    (and scales up well, but I guess not down well)
</li>
<li><b><a href="http://namlook.github.com/mongokit/">MongoKit</a>:</b> a lightweight ORM for MongoDB.  Helps us define our
    structures better, does schema validation, schema evolution, and
    helps make things more fun and pythonic.
</li>
<li><b><a href="http://jinja.pocoo.org/docs/">Jinja2</a>:</b> for templating.  Pretty much Django templates++ (wow, I
    can actually pass arguments into method calls instead of tediously
    writing custom tags!)
</li>
<li><b><a href="http://wtforms.simplecodes.com/">WTForms</a>:</b> for form handling, validation, abstraction.  Almost just
    like Django's forms.
</li>
<li><b><a href="http://pythonpaste.org/webob/">WebOb</a>:</b> gives nice request/response objects (also somewhat djangoish)
</li>
<li><b><a href="http://pythonpaste.org/deploy/">Paste Deploy</a> and <a href="http://pythonpaste.org/script/">Paste Script</a>:</b> as the default way of configuring
    and launching the application.  Since MediaGoblin will be fairly
    WSGI minimalist, you can probably launch it in other ways, but
    this will be the default.
</li>
<li><b><a href="http://routes.groovie.org/">Routes</a>:</b> for URL routing.  It works well enough.
</li>
<li><b><a href="http://jquery.com/">jQuery</a>:</b> for all sorts of things on the JavaScript end of things,
    for all sorts of reasons.
</li>
<li><b><a href="http://beaker.groovie.org/">Beaker</a>:</b> for sessions, because that seems like it's generally
    considered the way to go I guess.
</li>
<li><b><a href="http://somethingaboutorange.com/mrl/projects/nose/1.0.0/">nose</a>:</b> for unit tests, because it makes testing a bit nicer.
</li>
<li><b><a href="http://celeryproject.org/">Celery</a>:</b> for task queueing (think resizing images, encoding
    video) because some people like it, and even the people I know who
    don't like it don't seem to know of anything better :)
</li>
<li><b><a href="http://www.rabbitmq.com/">RabbitMQ</a>:</b> for sending tasks to Celery, because I guess that's
    what most people do.  Might be optional, might also let people use
    MongoDB for this if they want.
</li>
</ul>


</div>

<div id="outline-container-3_1" class="outline-3">
<h3 id="sec-3_1"><span class="section-number-3">3.1</span> Why python </h3>
<div class="outline-text-3" id="text-3_1">


<p>
Because I (Chris Webber) know Python, love Python, am capable of
actually making this thing happen in Python (I've worked on a lot of
large free software web applications before in Python, including
<a href="http://mirocommunity.org/">Miro Community</a>, the <a href="http://miroguide.org">Miro Guide</a>, a large portion of
<a href="http://creativecommons.org/">Creative Commons' site</a>, and a whole bunch of things while working at
<a href="http://www.imagescape.com/">Imaginary Landscape</a>).  I know Python, I can make this happen in
Python, and my starting a project like this makes sense if it's done
in Python.
</p>
<p>
You might say that PHP is way more deployable, that Rails has way more
cool developers riding around on fixie bikes, and all of those things
are true, but I know Python, like Python, and think that Python is
pretty great.  I do think that deployment in Python is not as good as
with PHP, but I think the days of shared hosting are (thankfully)
coming to an end, and will probably be replaced by cheap virtual
machines spun up on the fly for people who want that sort of stuff,
and Python will be a huge part of that future, maybe even more than
PHP will.  The deployment tools are getting better.  Maybe we can use
something like Silver Lining.  Maybe we can just distribute as .debs
or .rpms.  We'll figure it out.
</p>
<p>
But if I'm starting this project, which I am, it's gonna be in Python.
</p>
</div>

</div>

<div id="outline-container-3_2" class="outline-3">
<h3 id="sec-3_2"><span class="section-number-3">3.2</span> Why mongodb </h3>
<div class="outline-text-3" id="text-3_2">


<p>
In case you were wondering, I am not a NoSQL fanboy; I do not go
around telling people that MongoDB is web scale.  Actually, my reason
for choosing MongoDB isn't scalability, though scaling up really
nicely is a pretty good feature and sets us up well in case large
volume sites eventually do use MediaGoblin.  But there's another side
of scalability, and that's scaling down, which is important for
federation, maybe even more important than scaling up in an ideal
universe where everyone ran servers out of their own homes.  As a
memory-mapped database, MongoDB is pretty hungry, so actually I spent
a lot of time debating whether the inability to scale down as nicely
as SQL does with something like SQLite meant that it was out.
</p>
<p>
But I decided in the end that I really want MongoDB, not for
scalability, but for flexibility.  Schema evolution pains in SQL are
almost enough reason for me to want MongoDB, but not quite.  The real
reason is because I want the ability to eventually handle multiple
media types through MediaGoblin, and also allow for plugins, without
the rigidity of tables making that difficult.  In other words,
something like:
</p>



<pre class="example">{"title": "Me talking until you are bored",
 "description": "blah blah blah",
 "media_type": "audio",
 "media_data": {
     "length": "2:30",
     "codec": "OGG Vorbis"},
 "plugin_data": {
     "licensing": {
         "license": "http://creativecommons.org/licenses/by-sa/3.0/"}}}
</pre>



<p>
Being able to just dump media-specific information in a media_data
hashtable is pretty great, and even better is having a plugin system
where you can just let plugins have their own entire key-value space
cleanly inside the document that doesn't interfere with anyone else's
stuff.  If we were to let plugins deposit their own information
inside the database, either we'd let plugins create their own tables
which makes SQL migrations even harder than they already are, or we'd
probably end up creating a table with a column for key, a column for
value, and a column for type in one huge table called "plugin_data" or
something similar.  (Yo dawg, I heard you liked plugins, so I put a
database in your database so you can query while you query.)  Gross.
</p>
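<p>
To make the per-plugin key-value space concrete, here is a minimal
sketch in plain Python, using an ordinary dict in place of a real
MongoDB document.  The helper names (set_plugin_value,
get_plugin_value) and the "geotagging" plugin are hypothetical and
only there for illustration; this is not MediaGoblin code.
</p>

```python
# A plain dict standing in for a stored document; the fields mirror the
# example document shown earlier in this section.
entry = {
    "title": "Me talking until you are bored",
    "description": "blah blah blah",
    "media_type": "audio",
    "media_data": {"length": "2:30", "codec": "OGG Vorbis"},
    "plugin_data": {},
}


def set_plugin_value(document, plugin_name, key, value):
    """Store a value inside the plugin's own namespace under plugin_data."""
    plugin_space = document.setdefault("plugin_data", {})
    namespace = plugin_space.setdefault(plugin_name, {})
    namespace[key] = value
    return document


def get_plugin_value(document, plugin_name, key, default=None):
    """Read a value from the plugin's namespace, or return the default."""
    plugin_space = document.get("plugin_data", {})
    return plugin_space.get(plugin_name, {}).get(key, default)


# Two hypothetical plugins use the same key ("note") without colliding,
# because each one only ever touches its own sub-dictionary.
set_plugin_value(entry, "licensing", "license",
                 "http://creativecommons.org/licenses/by-sa/3.0/")
set_plugin_value(entry, "licensing", "note", "set by the licensing plugin")
set_plugin_value(entry, "geotagging", "note", "set by the geotagging plugin")
```

<p>
The point of the sketch is only that no schema change is needed when a
new plugin shows up: it gets a new key under plugin_data and nothing
else in the document moves.
</p>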
<p>
I also don't want things to be too loose so that we forget or lose the
structure of things, and that's one reason why I want to use MongoKit,
because we can cleanly define as much structure as we want and verify
that documents match that structure generally without adding too much
bloat or overhead (MongoKit is a pretty lightweight wrapper and
doesn't inject extra MongoKit-specific stuff into the database, which
is nice and nicer than many other ORMs in that way).
</p>
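<p>
As a rough illustration of what "define a structure and verify that
documents match it" means, here is a small sketch in plain Python.  It
deliberately does not use MongoKit's actual API; the
MEDIA_ENTRY_STRUCTURE mapping and the validate function are
assumptions made up for this example.
</p>

```python
# Hypothetical structure definition: field name -> required Python type.
MEDIA_ENTRY_STRUCTURE = {
    "title": str,
    "description": str,
    "media_type": str,
    "media_data": dict,
    "plugin_data": dict,
}


def validate(document, structure):
    """Return a list of problems; an empty list means the document matches."""
    problems = []
    # Every declared field must be present and have the declared type.
    for field, expected_type in structure.items():
        if field not in document:
            problems.append("missing field: %s" % field)
        elif not isinstance(document[field], expected_type):
            problems.append("wrong type for field: %s" % field)
    # Top-level fields that were never declared are reported too.
    for field in document:
        if field not in structure:
            problems.append("unexpected field: %s" % field)
    return problems


good = {
    "title": "Me talking until you are bored",
    "description": "blah blah blah",
    "media_type": "audio",
    "media_data": {"length": "2:30"},
    "plugin_data": {"licensing": {"license": "by-sa"}},
}
bad = {"title": 42, "description": "x", "media_type": "audio",
       "media_data": {}, "extra": 1}
```

<p>
Note that only the top level is checked here: the contents of
media_data and plugin_data stay free-form, which is the mix of
structure and flexibility described above.
</p>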
</div>

</div>

<div id="outline-container-3_3" class="outline-3">
<h3 id="sec-3_3"><span class="section-number-3">3.3</span> Why wsgi minimalism / Why not Django </h3>
<div class="outline-text-3" id="text-3_3">


<p>
If you look at the technology list above, you'll notice that I list a
lot of components that are very <a href="http://www.djangoproject.com/">Django-like</a>, but not actually Django
components.  What can I say, I really like a lot of the ideas in
Django!  Which leads to the question: why not just use Django?
</p>
<p>
While I really like Django's ideas and a lot of its components, I also
feel that most of the best ideas in Django that I want have been
implemented just as well or even better outside of Django.  I could
just use Django and replace the templating system with Jinja2, and the
form system with WTForms, and the database with MongoDB and MongoKit,
but at that point, how much of Django is really left?
</p>
<p>
I also am sometimes saddened and irritated by how coupled all of
Django's components are.  Loosely coupled yes, but still coupled.
WSGI has done a good job of providing a base layer for running
applications on and <a href="http://pythonpaste.org/webob/do-it-yourself.html">if you know how to do it yourself</a> it doesn't take
much effort, or many lines of code, to bind them together without any
framework at all (not even say <a href="http://pylonshq.com/">Pylons</a>, <a href="http://docs.pylonsproject.org/projects/pyramid/dev/">Pyramid</a>, or <a href="http://flask.pocoo.org/">Flask</a> which I think are still
great projects, especially for people who want this sort of thing but
have no idea how to get started).  And even at this already really
early stage of writing MediaGoblin, that glue work is mostly done.
</p>
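<p>
For a sense of how little glue a framework-free WSGI application
needs, here is a minimal sketch that uses only the Python standard
library.  It is not MediaGoblin's actual code: the view functions, the
ROUTES table, and the call helper are hypothetical, and a real setup
would use WebOb and Routes as listed above rather than a bare dict
lookup.
</p>

```python
from wsgiref.util import setup_testing_defaults


def index(environ):
    """Hypothetical front-page view: returns (status, body text)."""
    return "200 OK", "Hello from a framework-free WSGI app\n"


def not_found(environ):
    """Fallback view for any path missing from the route table."""
    return "404 Not Found", "Nothing here\n"


# Hypothetical route table: exact request path -> view function.
ROUTES = {"/": index}


def application(environ, start_response):
    """Plain WSGI callable: pick a view by path and send its response."""
    view = ROUTES.get(environ.get("PATH_INFO", "/"), not_found)
    status, body = view(environ)
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def call(path):
    """Invoke the application in-process, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

<p>
To try it in a browser, the same application object can be handed to
wsgiref.simple_server.make_server("localhost", 8000, application) and
served with serve_forever(); any other WSGI server should work as
well, since the callable follows the standard interface.
</p>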
<p>
That's not to say I don't think Django is great for a lot of things.
For a lot of stuff, it's still the best, but not for MediaGoblin, I
think.
</p>
<p>
One thing that Django does super well though is documentation.  It
still has some faults, but even with those considered I can hardly
think of any other project in Python that has documentation as nice
as Django's.  It may be worth
<a href="http://pycon.blip.tv/file/4881071/">learning some lessons on documentation from Django</a>, on that note.
</p>
<p>
I'd really like to have a good, thorough hacking-howto and
deployment-howto, and the former should include some notes to make it
easier for Django hackers to get started.
</p></div>
</div>
</div>
<div id="postamble">
<p class="author">Author: Christopher Allan Webber</p>
<p class="creator">Org version 7.5 with Emacs version 24</p>
<a href="http://validator.w3.org/check?uri=referer">Validate XHTML 1.0</a>
</div>
</div>
</body>
</html>