Confused and uncertain about the future of WordPress?

Don’t try to fix it–Wordprexit!

Actually, it’s just a Python command-line tool to port a WordPress blog to Hugo. Many tools already do this job, but each has shortcomings. Here’s what Wordprexit can do:

  • Convert your WordPress posts to Hugo posts. All the expected fields are populated in each entry’s front matter. Messy HTML is first run through a port of the wpautop algorithm to add paragraph breaks, then converted to clean, valid Markdown.
  • Parse all the different WordPress date/time formats–yes, there are several, including creative touches like intentional use of the year “-0001” and missing timezone information–into standard ISO 8601 format.
  • Convert WordPress [shortcodes] to HTML.
  • Replace <img>, <figure>, and <blockquote> HTML tags with Hugo shortcodes to provide better options for styling, responsive image handling, and so on.
  • Download embedded images in the largest size available, regardless of whether they were part of your Media Library, and place them in a Hugo Page Bundle with your entry. Each WordPress image title is extracted (when present) and placed in your resources front matter for use by your shortcodes.
  • Convert all your comments to Staticman-compatible JSON files in your data/comments directory.
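
To give a feel for the date handling mentioned above, here is a minimal sketch of normalizing a few date shapes to ISO 8601. It is not Wordprexit’s actual code. The function name to_iso, the sentinel list, and the choice to treat timestamps without a timezone as UTC are all assumptions made for illustration, as are the two input shapes (a “YYYY-MM-DD HH:MM:SS” timestamp and an RFC 822 style date).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

# Assumed placeholder fragments that signal "no real date was ever set".
_SENTINELS = ("0000-00-00", "-0001")


def to_iso(raw: str, default_tz: timezone = timezone.utc) -> Optional[str]:
    """Return an ISO 8601 string, or None if the input holds no usable date.

    Hypothetical helper: naive timestamps are assumed to be in default_tz.
    """
    text = raw.strip()
    if not text or any(marker in text for marker in _SENTINELS):
        return None

    parsed = None
    # First try the plain "YYYY-MM-DD HH:MM:SS" shape (no timezone).
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        pass

    # Then fall back to RFC 822 style dates such as those used in RSS feeds.
    if parsed is None:
        try:
            parsed = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return None

    # Fill in missing timezone information with the assumed default.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=default_tz)
    return parsed.isoformat()
```

For example, to_iso("2012-03-04 15:06:07") returns "2012-03-04T15:06:07+00:00", while to_iso("Wed, 30 Nov -0001 00:00:00 +0000") returns None so the caller can decide what to put in the front matter instead.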

Installation and use

It’s as simple as:

# Install CLI tool
pip install wordprexit

Now download the WXR (WordPress eXtended RSS) export file from WordPress. You can do this from “Tools > Export” in the admin interface. If given the option, include both posts and media, so that the exported file contains post_type “attachment” and post_type “post” data. (The “attachment” entries are your Media Library metadata.)
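
If you want to confirm that the export really contains both kinds of entries before converting, a short script along these lines can count them. This is a sketch and not part of Wordprexit; the function name count_post_types is made up here, and it assumes each entry stores its type in an element whose local name is post_type, whatever namespace version the export uses.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_post_types(wxr_source):
    """Count entries per post_type in a WXR export (path or file object).

    Hypothetical helper for a quick sanity check, not Wordprexit's own code.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(wxr_source, events=("end",)):
        # Compare only the local tag name, ignoring the "{namespace}" prefix,
        # so the check does not depend on the WXR namespace version.
        if elem.tag.rsplit("}", 1)[-1] == "post_type":
            counts[(elem.text or "").strip()] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python check_wxr.py wxrfile.xml
    for post_type, total in sorted(count_post_types(sys.argv[1]).items()):
        print(f"{post_type}: {total}")
```

If “attachment” is missing from the output, the export probably did not include media, and it may be worth exporting again with all content selected.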

# Parse file (creates content/ and data/ trees in the current directory)
wordprexit wxrfile.xml

Source code

You can check out the source on GitHub: https://github.com/2n3906/wordprexit