Export a WordPress XML file to separate HTML files.

I needed to quickly export all the articles in a WordPress install to separate HTML files. There were over 400 posts, so copy and paste was not an option. The quickest way to do this was to export the content using the built-in export option, then process the resulting XML file using PHP.

Here is the quick and hacky code I wrote for this specific job. It is not a good example of PHP code, but it did the job required. The str_replace lines fix specific characters that caused problems in filenames. You will need to delete or modify these to suit your particular file naming issues. If you comment out the file_put_contents call, the script writes nothing, so you can check the generated filenames for problems first.

This is a php-cli script so don’t try to run it in your browser.

Here is the code to convert the WordPress XML export file to separate HTML files.
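As a rough illustration of the approach, here is a minimal sketch. It is not the original script (the gist linked further down has that). It assumes a standard WordPress export that declares the `wp` and `content` namespace prefixes. The function names `safeFilename` and `exportPosts`, and the characters handled by str_replace, are made up for this example.

```php
<?php
// Sketch only: function names and the replaced characters are illustrative assumptions.

// Build a file-system-friendly name from a post title.
function safeFilename(string $title): string
{
    // Example str_replace fixes; adjust these to your own file naming issues.
    $name = str_replace([' ', '/', '?', ':'], ['-', '-', '', ''], trim($title));
    // Catch-all: drop anything that is not a letter, digit, dot, underscore or hyphen.
    $name = preg_replace('/[^A-Za-z0-9._-]/', '', $name);
    return $name === '' ? 'untitled' : $name;
}

// Write every item of type "post" in a WordPress export to its own HTML file.
// Returns the number of files written.
function exportPosts(string $xmlFile, string $outDir): int
{
    $xml = simplexml_load_file($xmlFile, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($xml === false) {
        throw new RuntimeException("Could not parse $xmlFile");
    }
    if (!is_dir($outDir) && !mkdir($outDir, 0755, true)) {
        throw new RuntimeException("Could not create $outDir");
    }

    $count = 0;
    foreach ($xml->channel->item as $item) {
        // Namespaced children are looked up by prefix (second argument true).
        $type = (string) $item->children('wp', true)->post_type;
        if ($type !== 'post') {
            continue; // skip pages, attachments, menu items and so on
        }
        $title = (string) $item->title;
        $body = (string) $item->children('content', true)->encoded;
        $filename = $outDir . '/' . safeFilename($title) . '.html';

        echo $filename, PHP_EOL;
        // Comment out the next line to check the file names without writing anything.
        file_put_contents($filename, '<h1>' . htmlspecialchars($title) . "</h1>\n" . $body . "\n");
        $count++;
    }
    return $count;
}

// Usage: php export.php wordpress-export.xml output-dir
if (PHP_SAPI === 'cli' && isset($argv[1], $argv[2])) {
    printf("%d posts written\n", exportPosts($argv[1], $argv[2]));
}
```

Note that two posts with the same title would map to the same file name in this sketch, so the later one overwrites the earlier one. Printing each file name before writing makes that kind of clash easy to spot.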

I then used the following to convert all the HTML files to Markdown using pandoc.
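One way to run pandoc over every exported file is sketched below in PHP, to match the rest of this post. It is not necessarily the command originally used. It assumes pandoc is installed and on your PATH, and `pandocCommand` is a hypothetical helper name.

```php
<?php
// Sketch only: assumes the pandoc binary is installed and on the PATH.

// Build the pandoc command that converts one HTML file to a Markdown file
// with the same base name.
function pandocCommand(string $htmlFile): string
{
    $mdFile = preg_replace('/\.html$/', '.md', $htmlFile);
    return sprintf(
        'pandoc -f html -t markdown -o %s %s',
        escapeshellarg($mdFile),
        escapeshellarg($htmlFile)
    );
}

// Usage: php tomarkdown.php output-dir
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    // glob() can return false on error, so fall back to an empty list.
    $files = glob(rtrim($argv[1], '/') . '/*.html') ?: [];
    foreach ($files as $htmlFile) {
        passthru(pandocCommand($htmlFile), $status);
        if ($status !== 0) {
            fwrite(STDERR, "pandoc failed for $htmlFile\n");
        }
    }
}
```

The file names go through escapeshellarg, so names containing spaces or quotes do not break the command.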

You can get a cleaner version of this script, with some file filtering added, from my GitHub account.
https://gist.github.com/karlgray/c3ab17615b3c0f712cb4144a4734c25b

 
