Commit | Line | Data |
---|---|---|
ebc4ab71 CAW |
1 | #+latex_header: \documentclass[12pt]{article} |
2 | #+latex_header: \usepackage[margin=1in]{geometry} | |
3 | #+OPTIONS: ^:nil | |
4 | ||
869704d6 CAW |
5 | GNU MediaGoblin |
6 | ||
7 | * About | |
8 | ||
9 | What is MediaGoblin? I'm shooting for: | |
10 | ||
11 | - Initially, a place to store all your photos that's as awesome as, | |
12 | or more awesome than, existing proprietary solutions | |
13 | - Later, a place for hosting all sorts of media, such as video, music, | |
14 | etc. | |
15 | - Federated, like StatusNet/OStatus (we should use OStatus, in fact!) | |
16 | - Customizable | |
17 | - A place for people to collaborate and show off original and derived | |
18 | creations | |
19 | - Free, as in freedom. Under the GNU AGPL, v3 or later. Encourages | |
20 | free formats and free licensing for content, too. | |
21 | ||
22 | Wow! That's pretty ambitious. Hopefully we're cool enough to do it. | |
23 | I think we can. | |
24 | ||
25 | It's also necessary, for multiple reasons. Centralization and | |
26 | proprietization of media on the internet is a serious problem and | |
27 | makes the web go from a system of extreme resilience to a system | |
28 | of frightening fragility. People should be able to own their data. | |
29 | Etc. If you're reading this, chances are you already agree though. :) | |
30 | ||
31 | * Milestones | |
32 | ||
33 | Excepting the first, not necessarily in this order. | |
34 | ||
35 | ** Basic image hosting | |
36 | ** Multi-media hosting (including video and audio) | |
37 | ** API(s) | |
38 | ** Federation | |
39 | ||
40 | Maybe this is 0.2 :) | |
41 | ||
42 | ** Plugin system | |
43 | ||
44 | * Technology | |
45 | ||
46 | I have a pretty specific set of tools that I expect to use in this | |
47 | project. Those are: | |
48 | ||
49 | - *[[http://python.org/][Python]]:* because I love, and know well, the language | |
50 | - *[[http://www.mongodb.org/][MongoDB]]:* a "document database". Because it's extremely flexible | |
51 | (and scales up well, but I guess not down well) | |
52 | - *[[http://namlook.github.com/mongokit/][MongoKit]]:* a lightweight ORM for mongodb. Helps us define our | |
53 | structures better, does schema validation, schema evolution, and | |
54 | helps make things more fun and pythonic. | |
55 | - *[[http://jinja.pocoo.org/docs/][Jinja2]]:* for templating. Pretty much django templates++ (wow, I | |
56 | can actually pass arguments into method calls instead of tediously | |
57 | writing custom tags!) | |
58 | - *[[http://wtforms.simplecodes.com/][WTForms]]:* for form handling, validation, abstraction. Almost just | |
59 | like Django's forms. | |
60 | - *[[http://pythonpaste.org/webob/][WebOb]]:* gives nice request/response objects (also somewhat djangoish) | |
61 | - *[[http://pythonpaste.org/deploy/][Paste Deploy]] and [[http://pythonpaste.org/script/][Paste Script]]:* as the default way of configuring | |
62 | and launching the application. Since MediaGoblin will be fairly | |
63 | WSGI minimalist, you can probably use other ways to launch | |
64 | it, but this will be the default. | |
65 | - *[[http://routes.groovie.org/][Routes]]:* for URL routing. It works well enough. | |
66 | - *[[http://jquery.com/][jQuery]]:* for all sorts of things on the JavaScript end, | |
67 | for all sorts of reasons. | |
68 | - *[[http://beaker.groovie.org/][Beaker]]:* for sessions, because that seems like it's generally | |
69 | considered the way to go I guess. | |
70 | - *[[http://somethingaboutorange.com/mrl/projects/nose/1.0.0/][nose]]:* for unit tests, because it makes testing a bit nicer. | |
71 | - *[[http://celeryproject.org/][Celery]]:* for task queueing (think resizing images, encoding | |
72 | video) because some people like it, and even the people I know who | |
73 | don't don't seem to know of anything better :) | |
74 | - *[[http://www.rabbitmq.com/][RabbitMQ]]:* for sending tasks to celery, because I guess that's | |
75 | what most people do. Might be optional, might also let people use | |
76 | MongoDB for this if they want. | |
77 | ||
78 | ** Why python | |
79 | ||
80 | Because I (Chris Webber) know Python, love Python, am capable of | |
81 | actually making this thing happen in Python (I've worked on a lot of | |
82 | large free software web applications before in Python, including | |
83 | [[http://mirocommunity.org/][Miro Community]], the [[http://miroguide.org][Miro Guide]], a large portion of | |
84 | [[http://creativecommons.org/][Creative Commons' site]], and a whole bunch of things while working at | |
85 | [[http://www.imagescape.com/][Imaginary Landscape]]). I know Python, I can make this happen in | |
86 | Python, me starting a project like this makes sense if it's done in | |
87 | Python. | |
88 | ||
89 | You might say that PHP is way more deployable, that Rails has way more | |
90 | cool developers riding around on fixie bikes, and all of those things | |
91 | are true, but I know Python, like Python, and think that Python is | |
92 | pretty great. I do think that deployment in Python is not as good as | |
93 | with PHP, but I think the days of shared hosting are (thankfully) | |
94 | coming to an end, and will probably be replaced by cheap virtual | |
95 | machines spun up on the fly for people who want that sort of stuff, | |
96 | and Python will be a huge part of that future, maybe even more than | |
97 | PHP will. The deployment tools are getting better. Maybe we can use | |
98 | something like Silver Lining. Maybe we can just distribute as .debs | |
99 | or .rpms. We'll figure it out. | |
100 | ||
101 | But if I'm starting this project, which I am, it's gonna be in Python. | |
102 | ||
103 | ** Why mongodb | |
104 | ||
105 | In case you were wondering, I am not a NoSQL fanboy, and I do not go | |
106 | around telling people that MongoDB is web scale. Actually, my reason | |
107 | for choosing MongoDB isn't scalability, though scaling up really nicely is a | |
108 | pretty good feature and sets us up well in case large volume sites | |
109 | eventually do use MediaGoblin. But there's another side of | |
110 | scalability, and that's scaling down, which is important for | |
111 | federation, maybe even more important than scaling up in an ideal | |
112 | universe where everyone ran servers out of their own homes. As a | |
113 | memory-mapped database, MongoDB is pretty hungry, so I actually spent | |
114 | a lot of time debating whether the inability to scale down as nicely | |
115 | as SQL does with something like SQLite meant that it was out. | |
116 | ||
117 | But I decided in the end that I really want MongoDB, not for | |
118 | scalability, but for flexibility. Schema evolution pains in SQL are | |
119 | almost enough reason for me to want MongoDB, but not quite. The real | |
120 | reason is that I want the ability to eventually handle multiple | |
121 | media types through MediaGoblin, and also allow for plugins, without | |
122 | the rigidity of tables making that difficult. In other words, | |
123 | something like: | |
124 | ||
125 | #+BEGIN_SRC javascript | |
126 | {"title": "Me talking until you are bored", | |
127 | "description": "blah blah blah", | |
128 | "media_type": "audio", | |
129 | "media_data": { | |
130 | "length": "2:30", | |
131 | "codec": "OGG Vorbis"}, | |
132 | "plugin_data": { | |
133 | "licensing": { | |
134 | "license": "http://creativecommons.org/licenses/by-sa/3.0/"}}} | |
135 | #+END_SRC | |
136 | ||
137 | Being able to just dump media-specific information in a media_data | |
138 | hashtable is pretty great, and even better is having a plugin system | |
139 | where you can just let plugins have their own entire key-value space | |
140 | cleanly inside the document that doesn't interfere with anyone else's | |
141 | stuff. If we were to let plugins deposit their own information | |
142 | inside a SQL database, either we'd let plugins create their own tables, | |
143 | which makes SQL migrations even harder than they already are, or we'd | |
144 | probably end up creating a table with a column for key, a column for | |
145 | value, and a column for type in one huge table called "plugin_data" or | |
146 | something similar. (Yo dawg, I heard you liked plugins, so I put a | |
147 | database in your database so you can query while you query.) Gross. | |
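
The namespacing idea above can be sketched in plain Python. This is only an illustration using ordinary dicts, not MediaGoblin or MongoDB code. The helper name =set_plugin_data= and the =geotagging= plugin are hypothetical, made up for this example.

```python
import copy

# The example document from above, as a plain Python dict.
media_entry = {
    "title": "Me talking until you are bored",
    "description": "blah blah blah",
    "media_type": "audio",
    "media_data": {"length": "2:30", "codec": "OGG Vorbis"},
    "plugin_data": {
        "licensing": {
            "license": "http://creativecommons.org/licenses/by-sa/3.0/"}},
}


def set_plugin_data(document, plugin_name, key, value):
    """Return a copy of `document` with `value` stored in the plugin's namespace.

    Hypothetical helper: each plugin only ever touches
    document["plugin_data"][plugin_name], so plugins cannot clobber
    each other's keys or the core fields.
    """
    updated = copy.deepcopy(document)
    namespace = updated.setdefault("plugin_data", {}).setdefault(plugin_name, {})
    namespace[key] = value
    return updated


# A second (made-up) plugin adds its data without any schema change.
tagged = set_plugin_data(media_entry, "geotagging", "location", "Chicago")
```

Nothing about the document's other fields had to change to make room for the new plugin, which is the whole appeal compared to adding tables or a generic key/value table in SQL.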
148 | ||
149 | I also don't want things to be so loose that we forget or lose the | |
150 | structure of things, and that's one reason why I want to use MongoKit, | |
151 | because we can cleanly define as much structure as we want and verify | |
152 | that documents match that structure generally without adding too much | |
153 | bloat or overhead (MongoKit is a pretty lightweight wrapper and | |
154 | doesn't inject extra mongokit-specific stuff into the database, which | |
155 | is nice and nicer than many other ORMs in that way). | |
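
To make "as much structure as we want" a bit more concrete, here is a rough sketch of that kind of check in plain Python. It deliberately does not use MongoKit's API. The structure, the required fields, and the =validate= function are all assumptions for illustration, not MediaGoblin's real schema.

```python
# Expected structure: field name -> required type.
# Assumed for illustration; not MediaGoblin's real schema.
MEDIA_ENTRY_STRUCTURE = {
    "title": str,
    "description": str,
    "media_type": str,
    "media_data": dict,   # free-form, media-type specific
    "plugin_data": dict,  # free-form, one sub-dict per plugin
}
REQUIRED_FIELDS = ["title", "media_type"]


def validate(document, structure=MEDIA_ENTRY_STRUCTURE, required=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means the document fits."""
    problems = []
    # Required fields must be present.
    for field in required:
        if field not in document:
            problems.append("missing required field: %s" % field)
    # Every field that is present must be known and have the right type.
    for field, value in document.items():
        if field not in structure:
            problems.append("unknown field: %s" % field)
        elif not isinstance(value, structure[field]):
            problems.append("wrong type for field: %s" % field)
    return problems


good = {"title": "Cat", "media_type": "image", "media_data": {"width": 640}}
bad = {"media_type": 42, "colour": "red"}
```

The top level is checked strictly while =media_data= and =plugin_data= stay free-form, so the structure can be as tight or as loose as each part of the document needs.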
156 | ||
157 | ** Why wsgi minimalism / Why not Django | |
158 | ||
159 | As you may notice in the technology list above, I list a lot of components | |
160 | that are very [[http://www.djangoproject.com/][Django-like]], but not actually Django components. What | |
161 | can I say, I really like a lot of the ideas in Django! Which leads to | |
162 | the question: why not just use Django? | |
163 | ||
164 | While I really like Django's ideas and a lot of its components, I also | |
165 | feel that most of the best ideas in Django that I want have been | |
166 | implemented as well or even better outside of Django. I could just | |
167 | use Django and replace the templating system with Jinja2, and the form | |
168 | system with WTForms, and the database with MongoDB and MongoKit, but | |
169 | at that point, how much of Django is really left? | |
170 | ||
171 | I also am sometimes saddened and irritated by how coupled all of | |
172 | Django's components are. Loosely coupled yes, but still coupled. | |
173 | WSGI has done a good job of providing a base layer for running | |
174 | applications on, and [[http://pythonpaste.org/webob/do-it-yourself.html][if you know how to do it yourself]] it's not hard, and doesn't take | |
175 | many lines of code, to bind these components together without any framework | |
176 | at all (not even say [[http://pylonshq.com/][Pylons]], [[http://docs.pylonsproject.org/projects/pyramid/dev/][Pyramid]], or [[http://flask.pocoo.org/][Flask]] which I think are still | |
177 | great projects, especially for people who want this sort of thing but | |
178 | have no idea how to get started). And even at this already really | |
179 | early stage of writing MediaGoblin, that glue work is mostly done. | |
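
For a sense of how little glue that is, here is a minimal sketch of a framework-free WSGI application. To keep it self-contained it swaps in the standard library's =re= for Routes and raw WSGI for WebOb's request/response objects, so it is not MediaGoblin's actual glue code. The view names and URLs are made up.

```python
import re
from wsgiref.util import setup_testing_defaults


def root_view(environ):
    return "200 OK", "Welcome to the media site"


def media_view(environ, media_id):
    return "200 OK", "Showing media entry %s" % media_id


# Route table: (compiled URL pattern, view function).
ROUTES = [
    (re.compile(r"^/$"), root_view),
    (re.compile(r"^/media/(?P<media_id>\d+)/$"), media_view),
]


def application(environ, start_response):
    """A WSGI callable: match the path, call the view, send the response."""
    path = environ.get("PATH_INFO", "/")
    status, body = "404 Not Found", "Not found"
    for pattern, view in ROUTES:
        match = pattern.match(path)
        if match:
            # Named groups in the pattern become keyword arguments.
            status, body = view(environ, **match.groupdict())
            break
    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload)))])
    return [payload]


def call_app(path):
    """Invoke the app in-process with a fake request, for quick checks."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

Because =application= is just a WSGI callable, it could be served for local testing with =wsgiref.simple_server.make_server= or handed to any other WSGI server.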
180 | ||
181 | That's not to say I don't think Django is great for a lot of things. For | |
182 | a lot of stuff, it's still the best, but not for MediaGoblin, I think. | |
183 | ||
184 | One thing that Django does super well though is documentation. It | |
185 | still has some faults, but even with those considered I can hardly | |
186 | think of any other project in Python that has documentation as nice | |
187 | as Django's. It may be worth | |
188 | [[http://pycon.blip.tv/file/4881071/][learning some lessons on documentation from Django]], on that note. | |
189 | ||
190 | I'd really like to have a good, thorough hacking-howto and | |
191 | deployment-howto, with the former especially including some notes to | |
192 | make it easier for Django hackers to get started. |