How to convert a Word doc or HTML to Wiki Markup
I came across some more tips for converting Word documents so that the content can be added to a wiki. That’s something a corporate Knowledge Management programme is likely to require, and it can be handy for individuals from time to time as well. The challenge is to avoid the temptation to open up the wiki for raw HTML input and then paste in the bloated, non-standard HTML that Word produces, because nobody is ever going to want to wade through all that rubbish in order to edit the content. It’s similar to the problem where one person uses one of the attempts at a WYSIWYG editor for MediaWiki to create pages, and then another person tries to develop the page further using the plain wikitext editor: it’s messy.
So any tool which generates nice, plain, simple wikitext from other inputs is going to be great for migrating content out of email attachments and intranet databases and onto the flat hierarchy of the open wiki space.
I generally use this one for converting HTML pages into MediaWiki syntax: HTML::WikiConverter. The tip below for converting to HTML through Gmail is a good one too. Did you know you can use that technique to convert PDFs into editable text as well?
Convert Word doc or Webpage to wiki – A Consuming Experience
For me, the two ways which worked the best were:
- Convert Word to HTML via Gmail, then convert the Webpage’s HTML to wikitext with Emiliano Bruni’s excellent HTML2Wiki Converter (where you paste the raw HTML code into the top box, and the wiki code appears in the bottom box which you can copy and paste into your wiki). Or (less good in its conversions, I found) –
- Convert Word to wiki direct using a Word macro – Word2MediaWikiPlus worked OK, though nowhere near as well as the above, for a MediaWiki wiki and PBWiki wiki that I tried them on (and those are probably two of the more popular wiki software platforms around); the results needed quite a lot of tidying.
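
To give a feel for what the HTML-to-wikitext step involves, here is a minimal Python sketch using only the standard library. It is not how HTML::WikiConverter or the HTML2Wiki Converter actually work; the names `MiniWikiConverter` and `html_to_wikitext` are made up for this illustration, and it only handles a handful of tags (headings, bold, italic, links, lists, paragraphs). Everything else, including Word-style attributes and `<style>` blocks, is simply dropped, which is roughly the clean-up you want before pasting into a wiki.

```python
import re
from html.parser import HTMLParser


class MiniWikiConverter(HTMLParser):
    """Toy HTML-to-MediaWiki converter (hypothetical, covers only a few tags)."""

    INLINE = {"b": "'''", "strong": "'''", "i": "''", "em": "''"}
    HEADINGS = {"h1": "=", "h2": "==", "h3": "===", "h4": "===="}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # collected wikitext fragments
        self.list_stack = []   # "*" for <ul>, "#" for <ol>, one entry per nesting level
        self.skip = 0          # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self.skip += 1
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append("\n" + self.HEADINGS[tag] + " ")
        elif tag in ("ul", "ol"):
            self.list_stack.append("*" if tag == "ul" else "#")
        elif tag == "li":
            self.out.append("\n" + "".join(self.list_stack) + " ")
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            self.out.append("[" + href + " ")
        elif tag == "p":
            self.out.append("\n\n")
        # Any other tag (span, font, div, ...) is ignored along with its attributes.

    def handle_endtag(self, tag):
        if tag in ("style", "script"):
            self.skip = max(0, self.skip - 1)
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag in self.HEADINGS:
            self.out.append(" " + self.HEADINGS[tag] + "\n")
        elif tag in ("ul", "ol") and self.list_stack:
            self.list_stack.pop()
            if not self.list_stack:
                self.out.append("\n")
        elif tag == "a":
            self.out.append("]")

    def handle_data(self, data):
        if self.skip:
            return  # drop CSS and JavaScript text
        # Collapse runs of whitespace, as a browser would when rendering.
        self.out.append(re.sub(r"\s+", " ", data))


def html_to_wikitext(html):
    """Convert an HTML string to (very basic) MediaWiki markup."""
    parser = MiniWikiConverter()
    parser.feed(html)
    parser.close()
    text = "".join(parser.out)
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Never leave more than one blank line in a row.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


if __name__ == "__main__":
    sample = (
        '<h2>Title</h2><p class="MsoNormal">Some <b>bold</b> text with a '
        '<a href="http://example.com">link</a>.</p>'
        "<ul><li>one</li><li>two</li></ul>"
    )
    print(html_to_wikitext(sample))
```

A real converter has to cope with tables, images, nested formatting and malformed markup, which is why the results from any tool usually still need some tidying by hand.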