Medium is a great publication platform. It has good exposure, quality content, readers who really appreciate good articles, and a neat, easy-to-use UI. It’s especially great for writers who are just starting their journey.
I built my own blog using Pelican, a Python-based static site generator, and wrote an article explaining the whole process. For every Medium article, I need to copy the URL, run a command to convert it into a Markdown file, then generate the blog site using Pelican. It is simple, but not as simple as I would like it to be. So this is a great opportunity for a quick and dirty Bash script to come to the rescue. Let’s see what we can do.
Before we start writing the script, it helps to outline what we want to accomplish, which makes it easier to write quality code. Basically, we need to:
1. Put all article URLs into one text file manually (I plan to automate this part too in the future, maybe using a scraping framework).
2. Read every line of the file, and for each line:
   - Extract the title and subtitle.
   - Use the title and subtitle to create the meta-data Pelican needs to turn the Markdown file into a post.
3. Run the Pelican command to generate the static site.
4. Push the site to GitHub and trigger Netlify’s auto-build.
5. Profit.
First of all, define our variables:
#!/bin/bash
# Define variables
filename='articles.txt'
n=1
Then structure the loop to read every line of the text file:
# Read in the file and process each URL
while IFS= read -r line; do
    # reading each line
    n=$((n+1))
    slug=$(echo "$line" | sed 's|https://towardsdatascience.com/||') # get slug from URL
    FILE="$HOME/wayofnumbers.github.io/content/$slug.md" # generate Markdown file name from slug
    mediumexporter "$line" > "$FILE" # convert Medium article to Markdown file
    # some processing ...
done < "$filename"
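As a minimal sketch of the loop above, the version below reads the URL list with IFS= read -r so that backslashes and leading whitespace survive, skips blank lines, and strips the URL prefix with parameter expansion instead of sed. The helper names slug_from_url and list_slugs are hypothetical and only illustrate the idea; the prefix is the one used in this article.

```shell
# Hypothetical helper: strip the publication prefix from a Medium URL.
slug_from_url() {
    local url="$1"
    local prefix='https://towardsdatascience.com/'
    # ${var#pattern} removes the shortest match of pattern from the front
    printf '%s\n' "${url#"$prefix"}"
}

# Hypothetical helper: print one slug per non-empty line of a URL list.
list_slugs() {
    local file="$1" line
    # the "|| [ -n ... ]" part also handles a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -z "$line" ]; then
            continue   # skip blank lines
        fi
        slug_from_url "$line"
    done < "$file"
}
```

Calling list_slugs articles.txt would print one slug per line, which makes it easy to check the list before running the real conversion.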
We used the sed command to remove the first part of the URL, https://towardsdatascience.com/, so the rest can be used as our slug. For example, https://towardsdatascience.com/9-things-i-learned-from-blogging-on-medium-for-the-first-month-2bace214b814 turns into 9-things-i-learned-from-blogging-on-medium-for-the-first-month-2bace214b814, perfect for a slug. Here we also use the slug to create the filename for the Markdown file. Then we use mediumexporter to convert the URL into the Markdown file. You can find out more about mediumexporter here.
Now that we have the Markdown file, let’s fill in the processing code we want:
# Processing the Markdown file
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE" # remove the first line
fl=$(head -n 1 "$FILE") # put first line (title) into fl
firstline=$(echo "$fl" | sed 's/# //') # remove '# '
tail -n +3 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE" # remove the first two lines (the title and the line after it)
subtitle=$(head -n 1 "$FILE") # put first line (subtitle) into subtitle
tail -n +2 "$FILE" > "$FILE.tmp" && mv "$FILE.tmp" "$FILE" # remove the first line (the subtitle)
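The same values can also be pulled out without rewriting the file three times. The sketch below assumes the layout implied by the commands above: the title sits on line 2 as a Markdown heading, the subtitle on line 4, and the article body starts on line 5. The function names are hypothetical, and the layout should be checked against real mediumexporter output before relying on it.

```shell
# Hypothetical helpers; line numbers follow the layout assumed above.

# Print line 2 with a leading '# ' removed (anchored at the start of the line).
extract_title() {
    sed -n '2p' "$1" | sed 's/^# //'
}

# Print line 4, assumed to hold the subtitle.
extract_subtitle() {
    sed -n '4p' "$1"
}

# Print everything from line 5 onward, assumed to be the article body.
extract_body() {
    tail -n +5 "$1"
}
```

With these, firstline=$(extract_title "$FILE") and subtitle=$(extract_subtitle "$FILE") read the values first, and the file only needs to be rewritten once with the output of extract_body.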
These processing steps are rather self-explanatory. Now that we have the firstline variable as the title and the subtitle variable as the subtitle, we are ready to construct the Markdown file meta-data for Pelican:
# handle metadata for Pelican
meta="
Title: $firstline
Slug: $slug
Subtitle: $subtitle
Date: $(date)
Category: Machine Learning
Tags: Machine Learning, Artificial Intelligence
author: Michael Li
Summary: $firstline
[TOC]
"
You can refer to Pelican’s documentation for more information about the meta-data format. Simply put, the Markdown file doesn’t need to contain the title and subtitle in its body: as long as we specify the title and subtitle fields in our meta-data, Pelican will automatically render them in the post, styled according to the theme you choose.
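As a rough sketch, the header can also be built by a small function that takes the title, slug, and subtitle as arguments, which keeps the field list in one place. The function name build_meta is hypothetical, and the fields mirror the ones used above. The date format is an assumption: it follows the YYYY-MM-DD HH:MM style shown in Pelican’s documentation examples rather than the default output of date used in the original script.

```shell
# Hypothetical helper: print a Pelican meta-data header for one article.
build_meta() {
    local title="$1" slug="$2" subtitle="$3"
    # Unquoted EOF so that the variables and $(date ...) are expanded.
    cat <<EOF
Title: $title
Slug: $slug
Subtitle: $subtitle
Date: $(date '+%Y-%m-%d %H:%M')
Category: Machine Learning
Tags: Machine Learning, Artificial Intelligence
author: Michael Li
Summary: $title
[TOC]
EOF
}
```

The result can be stored the same way as before, for example meta=$(build_meta "$firstline" "$slug" "$subtitle"). Note that command substitution strips the trailing newline, so one has to be added back when the header is written in front of the article body.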
With the correct meta-data, we can finally update the Markdown file and get it ready for site generation:
{ printf '%s' "$meta"; cat "$FILE"; } > "$FILE.new" # stitch meta-data and article content together
mv "$FILE"{.new,}
head -n -8 "$FILE" > "$FILE.new" # remove Medium's recommended articles (the last 8 lines)
mv "$FILE"{.new,}
done < "$filename" # don't forget to close the loop
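One caveat about the head -n -8 call above, which trims the last eight lines of each file: a negative line count is a GNU extension, so other head implementations (the BSD one on macOS, for example) may reject it. A minimal, more portable sketch computes the number of lines to keep instead. The function name drop_last_lines is hypothetical.

```shell
# Hypothetical helper: print a file without its last n lines.
drop_last_lines() {
    local file="$1" n="$2" total keep
    # wc -l counts newline characters, so a final line without a
    # trailing newline is not included in the total.
    total=$(wc -l < "$file")
    keep=$((total - n))
    # Print nothing if the file has n lines or fewer.
    if [ "$keep" -gt 0 ]; then
        head -n "$keep" "$file"
    fi
}
```

In the script this would be used as drop_last_lines "$FILE" 8 > "$FILE.new" followed by the same mv as before.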
All my Medium articles end with several recommendations for further reading. I removed those for my blog (the head -n -8 line in the code above). Now that the Markdown file is ready, it is time to generate the site and push it to the server:
# push to server
cd "$HOME/wayofnumbers.github.io" || exit 1 # stop if the repository directory is missing
pelican content -s publishconf.py
git add .
git commit -m "fix"
git push origin dev
So there you go. This script only works with the Pelican static site generator, but the gist of it can be applied to other blogging platforms. I hope you learned a thing or two. Happy blogging/coding!
Found this article useful? Follow me (Michael Li) on Medium or you can find me on Twitter @lymenlee or my blog site wayofnumbers.com.