len.ro v4, Hugo-based

For a while I had wanted to revive len.ro and find it a new home, but it took time to find the necessary resources (especially time). This is the fourth iteration:

  1. a static site in 2000
  2. a Plone-based site in 2006
  3. a WordPress-based site in 2008
  4. this one, based on Hugo

My choice was based on the following:

  • fully static output
    • fewer security issues
    • no more need to update WordPress from time to time
    • no database
  • structured (Markdown) content. I know the HTML-to-Markdown conversion will be buggy, but such is life.

Data migration. 1st try.

The first try was to use blog2md, which takes a WordPress XML export, parses it and generates Markdown pages from the HTML (using turndown). I quickly had to fork it to differentiate between categories and tags, and to add slugs to the page metadata so that I could recreate the same permalinks. The Markdown left something to be desired: it still contained WordPress captions and other shortcodes, and some <p> conversions lacked proper newlines.

\[caption id="" align="alignnone" width="480" caption="Almonds chicken"\]![Almonds chicken](http://www.len.ro/photo/cooking/slides/almonds-chicken.jpg "Almonds chicken")\[/caption\] 
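A regex pass could unwrap leftovers like the one above. This is only a minimal Python sketch, not something blog2md does: the names `CAPTION_RE` and `strip_caption_shortcodes` are made up for illustration, and it assumes the shortcode attributes never contain a `]`. It accepts the brackets with or without the backslash escapes.

```python
import re

# Matches a WordPress [caption ...]...[/caption] shortcode. Each bracket may
# be preceded by a backslash escape, as in the converted Markdown.
# Assumption: attribute values never contain a "]".
CAPTION_RE = re.compile(
    r'\\?\[caption\b[^\]]*?\\?\](.*?)\\?\[/caption\\?\]',
    re.DOTALL,
)


def strip_caption_shortcodes(markdown: str) -> str:
    """Replace each caption shortcode with the content it wraps."""
    return CAPTION_RE.sub(lambda m: m.group(1).strip(), markdown)
```

Applied to the line above, this should leave just the Markdown image; the caption text survives only as the image title.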


Data migration. 2nd try.

The second try was to use the Jekyll exporter to generate an archive from the existing site, then use Hugo's built-in import:

hugo import jekyll jekyll-export len.ro

This has a few advantages:

  • url (permalink) included
  • better markdown
  • proper categories and tags

However, the image captions were again badly converted:

![Almonds chicken](http://www.len.ro/photo/cooking/slides/almonds-chicken.jpg "Almonds chicken")Almonds chicken

## Ingredients

Quick fix:

perl -0pi -e 's/<div class="wp-caption[^>]*>(.*\s*)<\/div>/$1/gm' *.md
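For anyone who wants to check what that substitution would touch before editing files in place, here is a rough Python port. It is a sketch under assumptions: the function names are hypothetical, and the sample `<div>` in the usage note is invented since the exporter's exact HTML is not shown here. As in the perl version, `.` does not cross a newline, so the wrapped content has to start on the same line as the opening `<div>`.

```python
import re
from pathlib import Path

# Same pattern as the perl one-liner: "." stops at a newline, while "\s*"
# may swallow the whitespace (including newlines) before the closing </div>.
CAPTION_DIV_RE = re.compile(r'<div class="wp-caption[^>]*>(.*\s*)</div>')


def unwrap_caption_divs(markdown: str) -> str:
    """Drop the wp-caption <div> wrapper and keep what it contains."""
    return CAPTION_DIV_RE.sub(r'\1', markdown)


def preview(directory: str = ".") -> None:
    """Print the .md files that would change, without writing anything."""
    for path in sorted(Path(directory).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        if unwrap_caption_divs(text) != text:
            print(f"would change: {path}")
```

For example, `unwrap_caption_divs('<div class="wp-caption alignnone">![x](y.jpg)Caption</div>')` keeps only the image and caption text, and `preview()` lists the affected files in the current directory without modifying them.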

Other “quick fixes”:

perl -0pi -e 's/<pre class="wp-block-code">```\s*([^`]*)\s*```\s*/$1/gms' *.md