Confused and uncertain about the future of Wordpress?
Don't try to fix it–Wordprexit!
- Convert your Wordpress posts to Hugo posts. All the expected fields
are populated in each entry's front matter. Messy HTML is filtered
through a port of the wpautop algorithm to provide paragraph breaks
and converted to clean, valid Markdown.
- Parse all the different Wordpress date/time formats into standard ISO 8601 format. Yes, there are several, including creative touches like intentional use of the year “-0001” and missing timezone information.
- Replace Wordpress <blockquote> HTML tags with Hugo shortcodes to provide better options for styling, responsive image handling, and so on.
- Download embedded images in the largest size possible, regardless
of whether they were part of your Media Library, and place
them in a Hugo Page Bundle with your entry. Each Wordpress image title
is extracted (when present) and placed in your
resources front matter for use by your shortcodes.
- Convert all your comments to Staticman-compatible JSON files.
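
To give a feel for the date handling mentioned in the list above, here is a minimal sketch of how mixed Wordpress date strings might be normalized to ISO 8601. It is not wordprexit's actual implementation: the function name `normalize_wp_date`, the choice to treat missing timezones as UTC, and the sample formats in the comments are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone, tzinfo
from email.utils import parsedate_to_datetime
from typing import Optional

# Assumption: placeholder dates show a standalone year token "-0001",
# e.g. "Wed, 30 Nov -0001 00:00:00 +0000" or "-0001-11-30 00:00:00".
_PLACEHOLDER_YEAR = re.compile(r"(?:^|\s)-0001(?:\s|-)")


def normalize_wp_date(raw: str, assume_tz: tzinfo = timezone.utc) -> Optional[str]:
    """Return an ISO 8601 string, or None when the value is a placeholder."""
    raw = raw.strip()
    # Placeholder values carry no real timestamp, so report them as missing.
    if not raw or raw.startswith("0000-00-00") or _PLACEHOLDER_YEAR.search(raw):
        return None
    # Naive "YYYY-MM-DD HH:MM:SS" form with no timezone information.
    try:
        parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        return parsed.replace(tzinfo=assume_tz).isoformat()
    except ValueError:
        pass
    # RFC 822 style form, e.g. "Mon, 01 Jan 2018 12:00:00 +0000".
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=assume_tz)
    return parsed.isoformat()
```

Returning `None` for placeholder dates leaves it to the caller to decide whether to fall back to another field or omit the date from the front matter.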
Installation and use
It's as simple as:
    # Install CLI tool
    pip install wordprexit
Now download the WXR file from Wordpress. You can do this from “Tools > Export” in the admin interface. If given the option, include both posts and media, so that the exported file contains post_type “attachment” and post_type “post” data. (The “attachment” entries are your Media Library metadata.)
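
If you want to confirm that the export really contains both kinds of entries before converting, a quick check along these lines may help. This is a hedged sketch and not part of wordprexit: the helper name `count_post_types` is hypothetical, and it assumes each entry is an `<item>` element with a namespaced `post_type` child.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Union


def count_post_types(wxr_xml: Union[str, bytes]) -> Counter:
    """Count post_type values across all <item> elements of a WXR document."""
    counts: Counter = Counter()
    root = ET.fromstring(wxr_xml)
    for item in root.iter("item"):
        for child in item:
            # Compare only the local tag name, so the exact WXR
            # namespace version used by the export does not matter.
            if child.tag.rsplit("}", 1)[-1] == "post_type":
                counts[(child.text or "").strip()] += 1
    return counts
```

For a real export you could pass the file contents, for example `count_post_types(open("wxrfile.xml", "rb").read())`, and check that both "post" and "attachment" appear in the result.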
    # Parse file (will create content/ and data/ tree in current directory)
    wordprexit wxrfile.xml
You can check out the source on GitHub: https://github.com/2n3906/wordprexit