Diffstat (limited to '_pages/notes/a-simple-static-website-generator.html')
-rw-r--r-- _pages/notes/a-simple-static-website-generator.html | 307
1 files changed, 307 insertions, 0 deletions
diff --git a/_pages/notes/a-simple-static-website-generator.html b/_pages/notes/a-simple-static-website-generator.html
new file mode 100644
index 0000000..e5399a8
--- /dev/null
+++ b/_pages/notes/a-simple-static-website-generator.html
@@ -0,0 +1,307 @@
+TITLE="Andrea Corsini's Notes - A simple static website generator"
+DESCRIPTION="I share and explain my simple static website generator, written from scratch as a shell script."
+---
+<article>
+ <header>
+ <h2>A simple static website generator (sswg)</h2>
+ <time>June 3, 2022</time>
+ </header>
+ <nav class="toc">
+ TOC
+ </nav>
+
+ <p>Why bother creating another static website generator? There are already
+ many excellent open source projects, Jekyll and Hugo to name just two.
+ Well, I wanted something dead simple, yet flexible enough to be
+ customized easily. So I wrote my own simple static website generator (sswg)
+ as a shell script.</p>
+
+
+ <p>This is possible because I don't need many features yet. In the future, I
+ might end up switching to a proper generator. For the time being, I
+ can use mine and share it here. Hopefully, it will be useful to other
+ people who want to practice shell scripting or implement their own
+ website generator.</p>
+
+ <p>Here, you will find the original version and its rationale. Most likely,
+ the generator will change to keep up with my website's needs. You can find the
+ current version in my <a href="https://git.andreacorsini.xyz/sswg">git
+ repository</a>.</p>
+
+ <h3>Features</h3>
+ <p>At the time of writing, I am a simple person. I just want to:</p>
+ <ul>
+ <li>Invoke the content generation with a simple command, such as <code>./sswg.sh</code>.</li>
+ <li>Keep the generation <i>idempotent</i>.</li>
+ <li>Copy the header and footer templates into every page.</li>
+ <li>Support simple macro substitution, to allow a different title and description for each page.</li>
+ <li>Support the insertion of verbatim source code.</li>
+ </ul>
+
+
+ <p>Anything more than that is not necessary at the moment. For example, I
+ don't need Markdown or other languages to write pages; plain HTML is good
+ enough. In the future, it would be nice to have some extra features, such as
+ automatic generation of the RSS feed and of a table of contents (TOC).</p>
+
+ <h3>Folder structure</h3>
+ <p>My sswg is just a tiny shell script to include in the website root. This
+ way, it can be versioned along with the rest of the website code. The script
+ expects to find a folder tree similar to:</p>
+
+ <samp>
+ <b>├── _assets</b><br>
+ │    ├── style.css<br>
+ │    ├── favicon.ico<br>
+ │    ├── images<br>
+ │    │     └── picture.png<br>
+ │    └── rss.xml<br>
+ <b>├── _footer.t.html</b><br>
+ <b>├── _header.t.html</b><br>
+ <b>├── _pages</b><br>
+ │     ├── email.html<br>
+ │     ├── index.html<br>
+ │     ├── my-notes.html<br>
+ │     ├── notes<br>
+ │     │     └── a-simple-static-website-generator.html<br>
+ │     └── privacy-policy.html<br>
+ <b>└── sswg.sh</b>
+ </samp>
+
+ <p>Only the elements in <b>bold</b> are mandatory. The other files and folders
+ are just examples. Once the script is invoked with <code>./sswg.sh</code>,
+ sswg regenerates the website in the output folder, named <code>_static</code>.
+ All the content of the directory <code>_assets</code> is copied to the root
+ of <code>_static</code>. The contents of the template
+ files <code>_header.t.html</code> and <code>_footer.t.html</code> are prepended
+ and appended to each HTML page, respectively. The sswg also copies to the output
+ folder all the HTML files contained in <code>_pages</code>, together with
+ their sub-directories and images.</p>
+
+ <p>The output folder <code>_static</code> for this example will result in</p>
+
+ <samp>
+ ├── style.css<br>
+ ├── favicon.ico<br>
+ ├── images<br>
+ │     └── picture.png<br>
+ ├── rss.xml<br>
+ ├── email.html<br>
+ ├── index.html<br>
+ ├── my-notes.html<br>
+ ├── notes<br>
+ │     └── a-simple-static-website-generator.html<br>
+ └── privacy-policy.html<br>
+ </samp>
+
+
+ <h3>Add a webpage</h3>
+ <p>Each HTML page in the directory <code>_pages</code> only contains the main
+ content, while the header and footer are shared by every page. Thus we don't need
+ to copy them every time.</p>
+
+ <p>So, adding a page is very easy. Save every new page within the
+ folder <code>_pages</code>, or in one of its subfolders. Use the first
+ two lines to define the title and description of the page, with the
+ macros T&#8205;ITLE and D&#8205;ESCRIPTION from the sswg custom syntax. Then
+ separate the macros from the rest of the page with three dashes "<code>---</code>".
+ For example:</p>
+
+ |<pre>
+ |T&#8205;ITLE=&quot;My new web page&quot;
+ |D&#8205;ESCRIPTION=&quot;This page contains information about...&quot;
+ |---
+ |&lt;h2&gt;TI&#8205;TLE&lt;/h2&gt;
+ |# This is a sswg comment.
+ |&lt;p&gt;A page can contain whatever valid HTML code.&lt;/p&gt;
+ |</pre>
+
+# Test comment.
+ <p>The example also shows the comment syntax. Every line that starts with a
+ hash mark # is considered a source comment. Hence, it won't be copied to
+ the final static HTML page.</p>
+
+ <h3>Header and Footer templates</h3>
+ <p>Inside <code>_header.t.html</code>, write the page content from the DOCTYPE
+ declaration, through the navigation of your page, up to the beginning of your main content.
+ Use the macros <code>TI&#8205;TLE</code> and <code>D&#8205;ESCRIPTION</code> for the HTML title
+ and meta description. The generator will replace them with the values specified
+ within the HTML page. Here is an example of a header template:</p>
+
+ |<pre>
+ |&lt;!DOCTYPE html&gt;
+ |&lt;html&gt;
+ | &lt;head&gt;
+ | &lt;title&gt;TI&#8205;TLE&lt;/title&gt;
+ | &lt;meta name=&quot;description&quot; content=&quot;DE&#8205;SCRIPTION&quot;&gt;
+ | &lt;link rel=&quot;stylesheet&quot; type=&quot;text/css&quot; href=&quot;/style.css&quot; media=&quot;screen&quot;&gt;
+ | &lt;link rel=&quot;shortcut icon&quot; href=&quot;/favicon.ico&quot; type=&quot;image/x-icon&quot;&gt;
+ | &lt;/head&gt;
+ |
+ | &lt;body&gt;
+ | &lt;header&gt;
+ | &lt;h1&gt;Generic website title&lt;/h1&gt;
+ | &lt;div&gt;
+ | &lt;aside&gt;I believe in personal digital freedom&lt;/aside&gt;
+ | &lt;nav&gt;
+ | &lt;a href=&quot;/index.html&quot;&gt;Home&lt;/a&gt; |
+ | &lt;a href=&quot;/my-notes.html&quot;&gt;My notes&lt;/a&gt; |
+ | &lt;a href=&quot;/email.html&quot;&gt;Email&lt;/a&gt;
+ | &lt;/nav&gt;
+ | &lt;/div&gt;
+ | &lt;/header&gt;
+ | &lt;main&gt;
+ |</pre>
+
+ <p>Similarly to the header, <code>_footer.t.html</code> contains the rest
+ of the markup to be inserted at the end of each page. No macro substitution
+ is needed in this file. Here is an example:</p>
+
+ |<pre>
+ | &lt;/main&gt;
+ | &lt;footer&gt;
+ | &lt;p&gt;&lt;a href=&quot;/rss.xml&quot;&gt;&lt;img src=&quot;/images/feed.svg&quot;&gt;Feed RSS&lt;/a&gt;.&lt;/p&gt;
+ | &lt;p&gt;Copyright &amp;copy; 2020-2021 Acme -
+ | &lt;a href=&quot;/privacy-policy.html&quot;&gt;Privacy policy&lt;/a&gt;&lt;/p&gt;
+ | &lt;/footer&gt;
+ | &lt;/body&gt;
+ |&lt;/html&gt;
+ |</pre>
+
+ <h3>Inserting verbatim source code</h3>
+ <p>I would like the HTML code assembled by sswg to maintain a proper
+ indentation. Therefore, the script takes care of indenting the page
+ contents between the header and footer templates. However, this is an issue
+ for the verbatim source code.</p>
+
+ <p>Indeed, any whitespace indentation added to the content of the
+ code block tags (<code>&lt;pre&gt;...&lt;/pre&gt;</code>) is interpreted by
+ browsers as part of the verbatim code. Therefore, the
+ extra whitespace appears in the rendered code blocks as unwanted
+ indentation.</p>
+
+ <p>To solve this issue, I decided to prefix every line of a code block with a pipe
+ character |. The sswg script takes care of removing the pipe and the
+ whitespace before it. For example, the code</p>
+
+ |<pre>
+ | |&lt;pre&gt;
+ | |cd
+ | |ls -la
+ | |&lt;/pre&gt;
+ |</pre>
+
+ <p>will be transformed into the HTML source code</p>
+
+ |<pre>
+ |&lt;pre&gt;
+ |cd
+ |ls -la
+ |&lt;/pre&gt;
+ |</pre>
+
+ <h3>Code step by step</h3>
+ <p>By the time you read this article, the code might have changed. You
+ can find the current version in the
+ repository <a href="https://git.andreacorsini.xyz/sswg">git.andreacorsini.xyz/sswg</a>.</p>
+
+ <p>As for the original static website generator script, it starts by
+ setting constants for folders and template elements:</p>
+
+ |<pre>
+ |#!/bin/sh
+ |
+ |SSWG_OUTPUT_DIR="_static"
+ |SSWG_ASSETS_DIR="_assets"
+ |SSWG_PAGES_DIR="_pages"
+ |SSWG_HEADER_TEMPLATE="_header.t.html"
+ |SSWG_FOOTER_TEMPLATE="_footer.t.html"
+ |</pre>
+
+ <p>Second, the output folder <code>_static</code> is cleaned:</p>
+
+ |<pre>
+ |rm -rf "$SSWG_OUTPUT_DIR"
+ |mkdir "$SSWG_OUTPUT_DIR"
+ |</pre>
+
+ <p>So, be careful if you copied something into it by hand: it will be deleted. It is wiser to put
+ external files into <code>_assets</code>, as they are automatically copied
+ into <code>_static</code>:</p>
+
+ |<pre>
+ |cp -r "$SSWG_ASSETS_DIR"/* "$SSWG_OUTPUT_DIR"/.
+ |</pre>
+
+ <p>Then, we can finally generate each HTML page:</p>
+ |<pre>
+ |for page in $(find "$SSWG_PAGES_DIR" -iname '*.html' -o \
+ | -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png');
+ |do
+ | filename="$SSWG_OUTPUT_DIR/${page##$SSWG_PAGES_DIR/}"
+ | mkdir -p "$(dirname "$filename")"
+ |
+ | if [ "${filename##*.}" = "html" ]; then
+ |</pre>
+
+ <p>Prepend the header template:</p>
+ |<pre>
+ | cat "$SSWG_HEADER_TEMPLATE" >> "$filename"
+ |</pre>
+
+ <p>Indent the page content to match the header level, skipping the first three lines (the two macros and the separator):</p>
+ |<pre>
+ | cat "$page" | awk '
+ | BEGIN {print ""}
+ | FNR>3 {print " " $0}
+ | END {print ""}' >> "$filename"
+ |</pre>
+
+ <p>Append the footer template:</p>
+ |<pre>
+ | cat "$SSWG_FOOTER_TEMPLATE" >> "$filename"
+ |</pre>
+
+ <p>To perform the macro substitution, first shell-evaluate the first two
+ lines of the page. They are supposed to contain shell-like declarations of the
+ variables TI&#8205;TLE and DES&#8205;CRIPTION. Then, each macro can be
+ substituted in the whole document.</p>
+ |<pre>
+ | eval `cat "$page" | awk 'FNR&lt;3'`
+ | sed -i'' "s@TI&#8205;TLE@$TIT&#8205;LE@g" "$filename"
+ | sed -i'' "s@DES&#8205;CRIPTION@$DES&#8205;CRIPTION@g" "$filename"
+ |</pre>
+
+ <p>We can now remove the comment lines, then strip the leading whitespace and pipe from code-block lines:</p>
+ |<pre>
+ | sed -i'' "/^[ \t]*#/d" "$filename"
+ | sed -i'' "s/^[ \t]*|//g" "$filename"
+ |</pre>
+
+ <p>Continue with the rest of the script. If we are not reading an HTML file,
+ it must be one of the images matched by <code>find</code>. Just copy it to its destination
+ in <code>_static</code>:</p>
+
+ |<pre>
+ | else
+ | cp "$page" "$filename"
+ | fi;
+ |done;
+ |</pre>
+
+ <hr>
+ <h3>Conclusion</h3>
+ <p>Static website generators are valid lightweight alternatives to larger and more
+ complicated content management systems (CMS), such as WordPress.
+ They are normally simple to use, and require neither a database nor a complicated
+ installation procedure. Users stay in control through plain text files.</p>
+
+ <p>This post showed how to write a simple static website generator (sswg) to get our
+ web pages started. So far, the generator script has only bare-bones
+ functionality, but it is simple enough to be extended with additional
+ features.</p>
+
+ <p>I hope you found this post useful and interesting. Don't hesitate to
+ contact me for any questions or comments.</p>
+
+</article>