Blog/20131113 Mediawiki to pdf and offline

From Bjoern Hassler's website

1 Mediawiki to pdf and offline

We are using mediawiki to provide teacher education resources for teachers in sub-Saharan Africa; see http://www.oer4schools.org. (The big picture is the 2nd MDG of achieving Universal Primary Education.) We chose mediawiki because it's an open platform with a lot of momentum behind it, and because we can produce pdf and offline versions, which is essential for us. The new visual editor is also an excellent development.

Erik Moeller just posted a message to the mediawiki offline community about re-implementing the rendering pipeline for PDF in mediawiki (see https://www.mediawiki.org/wiki/PDF_rendering).

1.1 Some of the issues

Seeing as this thread is about pdf, I'll start with some issues around the pdf, before giving our wider use cases. We have struggled with a few things in the PediaPress pdf export:

  • We are essentially writing educational materials, and would like a way of putting text in boxes to flag the nature of that text (e.g. as a transcript, background reading, a note meant for facilitators). We have implemented this quite straightforwardly through a div with a border or a different background color. However, because the current pdf rendering works from the wiki text (rather than the html), all this formatting is lost. See [1] for examples of how we use boxes.
  • Numbered section headings. Sections in the pdf aren't numbered, which isn't helpful (in wiki speak, the magic word NUMBEREDHEADINGS is ignored). This may not be a problem for wikipedia articles, but we are using mediawiki to write materials for teacher education where you just need to be able to refer to the number of the section. (More on this below.)
  • We also make extensive use of the semantic mediawiki extension, e.g. to assign episodes to our videos. Again, this isn't implemented in the current pdf rendering pipeline.

I am not fully up to speed with what the plans are, but if the proposal is html->pdf rendering, rather than wiki text -> pdf rendering, then the above issues would be solved anyway.

One thing that was very useful for us is the 'exclude in print' feature, which allows you to exclude certain templates from printing (such as the right-hand menu for navigating through the resource, which doesn't make sense to have in print).

1.2 Our use cases

What are our use cases? Our OER4Schools resource is used by teachers in Zambia for professional development, with very limited connectivity. The following scenarios are critical in this work (and would be similarly critical for most teacher education scenarios in sub-Saharan Africa):

Scenario 1: Pdf / print. We need to be able to print our whole professional development programme (around 250 pages). At the moment, we print each wiki page needed to pdf, and then collate them. It's not a great process. We can't use the collection extension because of the above issues.

Scenario 2: Use on local web server. We would like to be able to produce a static stand-alone version of the wiki (in html) that can run off a local web server. It would be good if links to any non-static content pointed back at the live version (e.g. links to other namespaces, such as 'Special', as well as 'edit'/history links). Ideally, the same (or a similar) version could run off a memory stick for use on netbooks. We have tinkered with some scripts, and there are other scripts out there, but we'd love some help in finding something robust.
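To make the link handling in Scenario 2 concrete, here is a minimal sketch of the rule "static pages stay local, everything else points at the live wiki". It is not one of the scripts mentioned above. The live address, the `/wiki/` article prefix and the list of namespaces are all assumptions that would need adjusting for a real installation.

```python
from urllib.parse import parse_qs, unquote, urljoin, urlparse

LIVE_BASE = "https://wiki.example.org"   # hypothetical address of the live wiki
ARTICLE_PREFIX = "/wiki/"                # assumed article path of the wiki
DYNAMIC_NAMESPACES = ("Special:",)       # assumed; extend with other namespaces as needed


def rewrite_href(href: str, live_base: str = LIVE_BASE) -> str:
    """Return href unchanged for static pages, or an absolute live URL otherwise."""
    parsed = urlparse(href)
    if parsed.scheme or parsed.netloc:
        return href  # already an absolute link, leave it alone

    # Edit and history links carry an 'action' parameter; old revisions carry 'oldid'.
    query = parse_qs(parsed.query)
    if "action" in query or "oldid" in query:
        return urljoin(live_base, href)

    # Pages in non-static namespaces (e.g. Special:Search) go to the live wiki.
    path = unquote(parsed.path)
    page = path[len(ARTICLE_PREFIX):] if path.startswith(ARTICLE_PREFIX) else path.lstrip("/")
    if page.startswith(DYNAMIC_NAMESPACES):
        return urljoin(live_base, href)

    return href  # ordinary content page: keep it local
```

A static export script could apply `rewrite_href` to every link it writes out, so that ordinary pages resolve on the local server or memory stick while search, edit and history links only work once a connection is available.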

Scenario 3: Use on (Android) tablets / phones. We would love to have a version for mobile phones and tablets. Android phones and tablets are catching on in many developing countries. Tablets are overtaking netbooks at the moment, and are starting to become available cheaply.

  • Offline access: We'd love to have some advice on how we can achieve this with ZIM. I guess one issue is that we would want to update our resource, and it would be good if that didn't mean that the whole resource needs to be downloaded again. The biggest items are uploads (files, images, audio, video). I think it would be ok for the wiki text to be re-downloaded, but it would not be feasible for us to re-download uploads.
  • Online access: We'd love some advice on how to adapt the Wikipedia apps to work with our wiki, to give efficient access.
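The update problem in the offline bullet above (re-download the wiki text, but not unchanged uploads) can be sketched as a comparison of two manifests that map file names to content hashes. This is a generic illustration and says nothing about what ZIM or its tooling supports; the manifest format and the function names are hypothetical.

```python
import hashlib
from typing import Dict, List


def file_digest(data: bytes) -> str:
    """Content hash used as the manifest value for one uploaded file."""
    return hashlib.sha256(data).hexdigest()


def plan_update(local: Dict[str, str], remote: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare name -> digest manifests and list only the files that must change."""
    # Download files that are new on the server or whose content differs.
    download = sorted(name for name, digest in remote.items() if local.get(name) != digest)
    # Delete files that are no longer part of the resource.
    delete = sorted(name for name in local if name not in remote)
    return {"download": download, "delete": delete}
```

With a scheme like this, a large video that has not changed since the last sync never appears in the download list, which is the property that matters on a slow connection.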

1.3 Some footnotes

We would also like to implement the mediawiki mobile rendering (as m.orbit.educ.cam.ac.uk). If somebody wanted to help us with this, we would really appreciate it.

The section numbering issue mentioned above has a bit of a twist to it. We would really like to be able to prefix mediawiki section numbers with an arbitrary string, such as the number of a "unit" or "session". (A unit/session is just a wiki page that has a number assigned through semantic mediawiki, so that the pages can be ordered.) The mechanism for prefixing the section numbers doesn't really matter, but for us it's really helpful to have "global labels" for section numbers within pages, so that we can refer to a section as "Unit 1, Session 2, Section 3", and the heading for section 3 appears as 1.2.3.
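As an illustration of the numbering we have in mind, the following sketch numbers the headings of one page and prepends a prefix such as "1.2" (unit 1, session 2). It is only a model of the desired output, not an existing mediawiki feature; the function name and the (level, title) input format are made up for this example.

```python
from typing import Iterable, List, Tuple


def number_headings(headings: Iterable[Tuple[int, str]], prefix: str = "") -> List[str]:
    """Number (level, title) headings, prepending an optional prefix such as "1.2".

    Level 1 is the top heading level within the page; deeper levels nest below it.
    """
    counters: List[int] = []
    numbered: List[str] = []
    for level, title in headings:
        if level < 1:
            raise ValueError("heading levels start at 1")
        # Pad with zeros if a level was skipped, then forget counters of deeper levels.
        while len(counters) < level:
            counters.append(0)
        del counters[level:]
        counters[level - 1] += 1
        local = ".".join(str(n) for n in counters)
        label = f"{prefix}.{local}" if prefix else local
        numbered.append(f"{label} {title}")
    return numbered
```

For a session page with the prefix "1.2", the third top-level heading comes out as "1.2.3", and a sub-heading below the second one as "1.2.2.1".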

A bit further off topic: We have also been using various command-line based offline tools for downloading the wiki text, doing a quick edit, and then re-uploading it. This is sometimes very useful, and it would be good to have more official tools for this. If somebody is happy to engage with us on this, we would love to move this forward.
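For the download half of that workflow, a minimal sketch is shown below. It relies on mediawiki's `action=raw` parameter of `index.php`, which returns the wiki text of a page. The wiki address is hypothetical, the location of `index.php` differs between installations, and re-uploading (which needs a login and an edit token through the API) is deliberately left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def raw_wikitext_url(index_php: str, title: str) -> str:
    """Build the URL that returns the raw wiki text of one page."""
    params = {"title": title.replace(" ", "_"), "action": "raw"}
    return f"{index_php}?{urlencode(params)}"


def fetch_wikitext(index_php: str, title: str, timeout: float = 30.0) -> str:
    """Download the wiki text of one page (needs network access to the wiki)."""
    with urlopen(raw_wikitext_url(index_php, title), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_wikitext("https://wiki.example.org/index.php", "OER4Schools/Unit 1")` would then return the page source as a string, ready to be edited locally.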

Finally, in scenarios 2 and 3 above, it would be good if it was still possible to search. On the local web server, this could just be a local web server search, but on a mobile (offline) app, it would need to be built into the app. At the moment, it's a "nice-to-have", rather than essential.
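To give an idea of how small a basic offline search can be, here is a sketch of a word index built over page texts. It is an assumption about one possible approach, not a description of how any existing app or server search works, and it matches whole words only (no ranking, no stemming).

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lower-cased word to the set of page titles that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(title)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return the titles of pages that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

The index could be built once when the static version is generated and shipped alongside the pages, so that the device itself never has to scan the whole resource.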



2013-11-13