Moving all of that cruft from one system to another is non-trivial, and I left a lot of old stuff in a format that couldn’t be used…until now.
Note: the post header image was created using NightCafe, but the words are my own (I swear!).
Thankfully, both LiveJournal and Wordpress possess ways to export content, and so I did. Unfortunately, Wordpress exports not only the final version of a post, but also “autosave” and “revision” versions, so I had to spend some time wading through them all to get the canonical version of each post. There were 4000+ files, so it took a bit.
Jekyll uses a simple filename-based post format tucked into a
_posts directory, so a blog post written in today would be something like
.html is also OK). My exported blog posts did not have that filename format, so through the combination of a couple shell scripts and Name Mangler I was able to grab a
date meta line from the file itself and rename each file accordingly. I also needed to add a specific tag to some posts so I could hide them publicly (not everything I wrote in the past is germane to this blog, but I still want to be able to read them!).
Besides re-learning how to do string manipulation in
bash and how to actually use
sed (I’ve basically almost never used it before), the main hiccup was just dealing with a large number of files (which I segmented into folders based on the first letter in the file) and trial-and-error with the logic in my shell scripts.
However, one Jekyll-related thing that came up (that I’ve solved in the past but forgot how to do until now) was this warning when serving up my site:
Conflict: The following destination is shared by multiple files The written file may end up with unexpected contents. /path/to/jekyll/_site/blog/tags/[tag-name]/index.html - blog/tags/[tag-name]/index.html - blog/tags/[tag-name]/index.html
Most of the fixes I found on the web did not address it for me (e.g. duplicate
page.md), but once I searched for the offending tag name it was obvious:
tag name vs.
tag-name was resulting in the same page, which is not ideal.
The fix? Make sure you only use one format or the other. I just renamed any tags that had spaces to use hyphens instead and voilà.
IN THE END
Now I have a way to read everything I’ve ever blogged in the past 20 years in one place, and I control all of the content and how it’s displayed. This is the dream for all of us who, while still living in a corporate-controlled web where all your content gets locked up in a few companies’ coffers, don’t, like, actually want that to be true.
I’m sure many don’t care about that, and will continue to post their stuff on the Facebooks, Instagrams, Twitters/Xs, Threads, and their ilk, and so be it. Maybe I’m too precious with my words and their archival, but I figure I can have it this way and still link to my blog on those spaces if I want, so it’s a win-win.