Publishing from GitHub — Building a Version-Controlled Blog
When I first started organizing my writing projects, I noticed something: every article was living in a different place. Some were local files, others were drafts lost in my notes, and a few were somewhere in the cloud with no structure. It wasn’t sustainable, and more importantly, it wasn’t scalable. I needed a way to manage my posts like I manage my code — with control, traceability, and safety.
That’s when I realized GitHub was the perfect fit.
Why GitHub?
GitHub might seem like an odd choice for a blog at first. It’s designed for developers, version control, and collaboration — not necessarily publishing prose. But if you think about what good content management requires, the match is surprisingly natural.
It’s free, secure, and version-controlled by design. Each post is a Markdown file, easy to edit, diff, and review. Each commit becomes a snapshot of the article’s evolution — whether it’s an early draft or the polished, published version.
There’s also the practical side. Using GitHub means I can rely on its API to fetch files directly from my repositories. I don’t need an expensive headless CMS or an external database. I can fetch content securely, through an access token, and render it dynamically in my own frontend.
And perhaps most importantly, GitHub opens a path to future automation. It’s easy to imagine GitHub Actions or scheduled workflows that validate metadata, auto-deploy new posts, or trigger publishing pipelines — all without leaving my development environment.
For now, that automation is still a work in progress, but the foundation is already there.
How the workflow works
The writing workflow revolves around two branches in my private GitHub repositories, both inaccessible to the public. No one can browse or fetch these files directly — access happens exclusively through the front-end application, which connects securely using a GitHub token stored as an environment variable.
The main branch stores published posts. These are the final, approved versions that the website serves publicly — but only through the application layer. The raw repository itself remains private.
The drafts branch (or sometimes a dedicated draft repository) contains work in progress — pieces still being written, refined, or reviewed. This separation ensures a clean boundary between stable and evolving content: I can safely edit, restructure, or experiment in drafts without risking anything that’s already live.
The front-end determines where to load each post from based on the environment configuration.
When BLOG_ACCESS=public, the system reads from the main branch using the GitHub API and the private token.
When BLOG_ACCESS=private, it switches to the drafts branch (or draft repo), again authenticated through the same secure channel.
This setup means everything — even the “public” version — is still protected under the same private repository and authentication logic. The application, not GitHub, controls what’s visible.
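To make the branch switch concrete, here is a minimal sketch of how the access mode could map to a branch and a request URL. The helper names resolveBranch and buildPostUrl are hypothetical, and the example repository path (me/blog) is a placeholder, not my actual setup.

```typescript
type BlogAccess = "public" | "private";

// Maps the access mode to the branch it reads from:
// "public" -> main (published posts), "private" -> drafts.
function resolveBranch(access: BlogAccess): "main" | "drafts" {
  return access === "public" ? "main" : "drafts";
}

// Builds a GitHub contents API URL for one post.
// `baseUrl` stands in for the BLOG_SOURCE_GITHUB value; posts are assumed
// to be stored as <slug>.md under that path.
function buildPostUrl(baseUrl: string, slug: string, access: BlogAccess): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  const file = `${encodeURIComponent(slug)}.md`;
  return `${trimmed}/${file}?ref=${resolveBranch(access)}`;
}

// Placeholder repository path, for illustration only.
console.log(
  buildPostUrl("https://api.github.com/repos/me/blog/contents/posts", "hello-world", "private")
);
// prints https://api.github.com/repos/me/blog/contents/posts/hello-world.md?ref=drafts
```

Keeping the branch decision in one small function means the rest of the code never has to know which branch it is reading from.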
Additionally, during local development I can use the same workflow to preview drafts directly from GitHub, pulling them from the drafts branch as if I were in production. This makes the local environment behave almost identically to the deployed version, ensuring I can test layout, metadata, and post loading under real conditions before publishing.
As an extra layer of control, each post also carries its own status flag (e.g., "published" or "draft").
This metadata ensures that even if a draft exists in the repository, it won’t appear on the live blog until explicitly marked as published — a small but crucial safeguard that reinforces the separation between what’s ready and what’s still in progress.
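A status gate like this can be sketched in a few lines. The visiblePosts helper and the PostSummary type are hypothetical, and I am assuming here that private mode (previews) shows drafts while public mode shows only published posts.

```typescript
type PostStatus = "draft" | "published";

interface PostSummary {
  slug: string;
  status: PostStatus;
}

// In public mode only posts explicitly marked "published" are listed.
// In private mode (assumed to be used for previews) drafts are listed too.
function visiblePosts(posts: PostSummary[], access: "public" | "private"): PostSummary[] {
  if (access === "private") return posts;
  return posts.filter((post) => post.status === "published");
}

const samplePosts: PostSummary[] = [
  { slug: "math-set-topology1", status: "draft" },
  { slug: "hello-world", status: "published" }, // hypothetical second post
];

console.log(visiblePosts(samplePosts, "public").map((post) => post.slug));  // only "hello-world"
console.log(visiblePosts(samplePosts, "private").map((post) => post.slug)); // both slugs
```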
The token is stored in an environment variable called BLOG_GITHUB_TOKEN, never hardcoded in the source code. That means the build system or local environment handles access securely — no credentials are ever exposed to the client.
The relevant environment configuration looks something like this:
BLOG_STORAGE=github # Options: "local", "github"
BLOG_ACCESS=private # Options: "public", "private"
BLOG_SOURCE_LOCAL=<local-directory-here> # For local development and quick previews
BLOG_SOURCE_GITHUB=https://api.github.com/repos/:username/:repo/contents/:path
BLOG_GITHUB_TOKEN=<your-token-here>
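Since a typo in any of these variables would only surface later as a missing post, it can help to validate them once at startup. The following is a sketch under my own assumptions: readBlogConfig and the BlogConfig shape are hypothetical names, and the "private" default mirrors the default used by getPostContent.

```typescript
type BlogStorage = "local" | "github";
type BlogAccess = "public" | "private";

interface BlogConfig {
  storage: BlogStorage;
  access: BlogAccess;
  sourceLocal: string | undefined;
  sourceGithub: string | undefined;
  githubToken: string | undefined;
}

type Env = Record<string, string | undefined>;

function isStorage(value: string | undefined): value is BlogStorage {
  return value === "local" || value === "github";
}

function isAccess(value: string | undefined): value is BlogAccess {
  return value === "public" || value === "private";
}

// Reads and checks the blog variables so a misconfiguration fails loudly.
function readBlogConfig(env: Env): BlogConfig {
  const storage = env.BLOG_STORAGE;
  if (!isStorage(storage)) {
    throw new Error(`BLOG_STORAGE must be "local" or "github", got "${storage}"`);
  }
  const access = env.BLOG_ACCESS || "private"; // fall back to the safer mode
  if (!isAccess(access)) {
    throw new Error(`BLOG_ACCESS must be "public" or "private", got "${access}"`);
  }
  if (storage === "local" && !env.BLOG_SOURCE_LOCAL) {
    throw new Error("BLOG_SOURCE_LOCAL is required when BLOG_STORAGE=local");
  }
  if (storage === "github" && (!env.BLOG_SOURCE_GITHUB || !env.BLOG_GITHUB_TOKEN)) {
    throw new Error("BLOG_SOURCE_GITHUB and BLOG_GITHUB_TOKEN are required when BLOG_STORAGE=github");
  }
  return {
    storage,
    access,
    sourceLocal: env.BLOG_SOURCE_LOCAL,
    sourceGithub: env.BLOG_SOURCE_GITHUB,
    githubToken: env.BLOG_GITHUB_TOKEN,
  };
}

// In the real app this would receive process.env; a literal keeps the sketch testable.
const blogConfig = readBlogConfig({ BLOG_STORAGE: "local", BLOG_SOURCE_LOCAL: "./posts" });
console.log(blogConfig.access); // prints private
```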
The front-end simply calls getPostContent(slug), and the function determines the source and access mode automatically. There’s no need to manually construct URLs for different branches — the function abstracts that away.
This keeps the workflow simple, consistent, and secure: drafts remain private, and published posts are reachable only through the application layer.
It also makes switching between modes trivial. Locally, I can set BLOG_STORAGE=local and read files from disk for faster testing, with no GitHub authentication involved; in production, I pull directly from GitHub.
Fetching and decoding posts
When the app needs to load a post, it calls the getPostContent() function, which handles all sources transparently.
- If BLOG_STORAGE is "local", it reads files from disk.
- If it is "github", it fetches the post via the GitHub API using the token, automatically selecting the "main" or "drafts" branch depending on BLOG_ACCESS.
The GitHub contents API returns the file body Base64-encoded in the content field of its JSON response, so the content is decoded before rendering:
const decodedContent = Buffer.from(data.content, "base64").toString("utf-8");
From this point on, the rest of the pipeline doesn’t need to care where the file came from. It treats the decoded Markdown as if it were read directly from disk.
That transparency is intentional — the whole idea is to abstract away the source. Whether I’m working locally, pulling from drafts, or showing published content, the rendering logic stays identical.
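The decoding step can be tried in isolation. This sketch simulates an API payload rather than calling GitHub; decodeContent is a hypothetical helper name. The payload typically arrives wrapped across several lines, and although Node's Base64 decoder tolerates whitespace, the helper strips it defensively.

```typescript
import { Buffer } from "node:buffer";

// Decodes the Base64 `content` field of a contents API response into UTF-8 text.
// Whitespace (including line breaks inside the payload) is removed first.
function decodeContent(content: string): string {
  return Buffer.from(content.replace(/\s+/g, ""), "base64").toString("utf-8");
}

// Simulate a payload: encode some Markdown, then wrap it into 60-character lines.
const originalMarkdown = "# Hello\n\nThis post was fetched from GitHub. Unicode survives too: café.";
const encodedPayload = Buffer.from(originalMarkdown, "utf-8").toString("base64");
const wrappedPayload = encodedPayload.match(/.{1,60}/g)?.join("\n") ?? "";

console.log(decodeContent(wrappedPayload) === originalMarkdown); // prints true
```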
import { BlogAccessType, BlogStorageType } from "@/types";
export async function getPostContent(
  slug: string,
  storage: BlogStorageType = process.env.BLOG_STORAGE as BlogStorageType,
  access: BlogAccessType = (process.env.BLOG_ACCESS as BlogAccessType) ||
    "private"
): Promise<string | null> {
  try {
    if (storage === "local") {
      // Placeholder: the disk read is not shown in this excerpt.
      return null;
    } else if (storage === "github") {
      // Published posts live on "main"; everything else comes from "drafts".
      const targetBranch = access === "public" ? "main" : "drafts";
      const url = `${process.env.BLOG_SOURCE_GITHUB}/${slug}.md?ref=${targetBranch}`;
      const res = await fetch(url, {
        headers: {
          Authorization: `Bearer ${process.env.BLOG_GITHUB_TOKEN}`,
          Accept: "application/vnd.github.v3+json",
        },
      });
      if (!res.ok) {
        throw new Error(`Failed to fetch ${slug} from GitHub repo (HTTP ${res.status})`);
      }
      // The contents API returns the file body Base64-encoded in `content`.
      const json = await res.json();
      return Buffer.from(json.content, "base64").toString("utf-8");
    } else {
      // Unknown storage type.
      return null;
    }
  } catch (err) {
    // Log the failure instead of swallowing it, then let the caller fall back.
    console.error(`getPostContent("${slug}") failed:`, err);
    return null;
  }
}
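In the listing above, the local branch simply returns null. A disk read could look roughly like the sketch below. It is not the code my blog runs: readLocalPost is a hypothetical helper, the <slug>.md layout under the local directory is assumed, and the slug pattern is a deliberately strict assumption so that a slug can never point outside that directory.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Reads <baseDir>/<slug>.md and returns its text, or null when the file is missing.
// Slugs are limited to letters, digits, "-" and "_", which rejects "../" tricks.
function readLocalPost(baseDir: string, slug: string): string | null {
  if (!/^[A-Za-z0-9_-]+$/.test(slug)) return null;
  const filePath = join(baseDir, `${slug}.md`);
  if (!existsSync(filePath)) return null;
  return readFileSync(filePath, "utf-8");
}

// Demo against a throwaway directory standing in for BLOG_SOURCE_LOCAL.
const demoDir = mkdtempSync(join(tmpdir(), "blog-"));
writeFileSync(join(demoDir, "hello-world.md"), "# Hello\n");

console.log(readLocalPost(demoDir, "hello-world")); // the file's Markdown
console.log(readLocalPost(demoDir, "missing"));     // null
console.log(readLocalPost(demoDir, "../secrets"));  // null (rejected slug)
```

Returning null for a missing file keeps the local source consistent with the GitHub branch, which also signals a miss with null.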
Data structure and metadata
Each post isn’t just Markdown — it comes with metadata that defines how it behaves within the blog. Here’s a typical post entry:
{
id: "math-set-topology1",
slug: "math-set-topology1",
collection: "mathematics",
level: "i-m-just-curious",
title: "The Birth of Sets — From Counting to Abstract Structures",
category: "",
description:
"Understanding the basics of set theory, including sets, subsets, unions, intersections, and Cartesian products",
date: "2025-12-15",
readTime: "6 min read",
tags: ["Mathematics", "Set Theory", "Foundations"],
status: "draft",
href: "",
}
This structure allows the system to index posts, build collections, filter by topic, and even estimate reading time. The status field is especially useful, since it determines whether a post is visible publicly or still considered part of the draft workflow.
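As a rough illustration of what that structure enables, here is a sketch that types the entry above and builds two simple views over it. The PostMeta interface is inferred from the example entry, the helper names are hypothetical, and the 200 words-per-minute reading speed is my assumption.

```typescript
interface PostMeta {
  id: string;
  slug: string;
  collection: string;
  level: string;
  title: string;
  category: string;
  description: string;
  date: string; // ISO date, e.g. "2025-12-15"
  readTime: string;
  tags: string[];
  status: "draft" | "published";
  href: string;
}

// Groups posts by their `collection` field.
function groupByCollection(posts: PostMeta[]): Map<string, PostMeta[]> {
  const groups = new Map<string, PostMeta[]>();
  for (const post of posts) {
    const bucket = groups.get(post.collection) ?? [];
    bucket.push(post);
    groups.set(post.collection, bucket);
  }
  return groups;
}

// Case-insensitive tag filter.
function filterByTag(posts: PostMeta[], tag: string): PostMeta[] {
  const wanted = tag.toLowerCase();
  return posts.filter((post) => post.tags.some((t) => t.toLowerCase() === wanted));
}

// Estimates reading time from a word count (assumed 200 words per minute).
function estimateReadTime(markdown: string, wordsPerMinute = 200): string {
  const words = markdown.trim().split(/\s+/).filter(Boolean).length;
  return `${Math.max(1, Math.ceil(words / wordsPerMinute))} min read`;
}

const examplePost: PostMeta = {
  id: "math-set-topology1",
  slug: "math-set-topology1",
  collection: "mathematics",
  level: "i-m-just-curious",
  title: "The Birth of Sets — From Counting to Abstract Structures",
  category: "",
  description: "Understanding the basics of set theory",
  date: "2025-12-15",
  readTime: "6 min read",
  tags: ["Mathematics", "Set Theory", "Foundations"],
  status: "draft",
  href: "",
};

console.log(groupByCollection([examplePost]).get("mathematics")?.length); // prints 1
console.log(filterByTag([examplePost], "set theory").length);             // prints 1
```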
Under the hood, the front-end component BlogPostPage orchestrates the process: it looks for the post locally first, then falls back to the public source (published posts on main), and finally to the private one (drafts). Each fallback layer ensures that missing or private content doesn’t crash the site; instead, the user sees a graceful “Not Found” page.
The fallback order is: local → public → private → not found. This allows flexible deployment setups and smooth previews without exposing sensitive data.
On error handling and fallbacks
Not every post will always exist in every source. For example, a draft may only live in the private repository until it’s ready to publish. In those cases, a request to the public repo will return a 404.
Rather than treating that 404 as a fatal error, the logic detects it and moves on to the next fallback source. Only after all options are exhausted does it return the not-found page.
The benefit of this layered design is resilience. It makes the system robust against missing files, branch mismatches, or token misconfigurations.
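The layered lookup can be sketched as a loop over sources. This is an illustration, not the actual BlogPostPage code: loadWithFallback and PostSource are hypothetical names, and the three loaders are stubs standing in for the local, public, and private sources.

```typescript
type PostLoader = (slug: string) => Promise<string | null>;

interface PostSource {
  name: string;
  load: PostLoader;
}

// Tries each source in order and returns the first hit.
// A source that throws (for example on a bad token) is logged and treated
// as a miss, so the next source still gets its turn.
async function loadWithFallback(slug: string, sources: PostSource[]): Promise<string | null> {
  for (const source of sources) {
    try {
      const content = await source.load(slug);
      if (content !== null) return content;
    } catch (err) {
      console.warn(`[blog] ${source.name} source failed for "${slug}":`, err);
    }
  }
  return null; // the caller renders the "Not Found" page
}

// Stub sources: local misses, public fails, private has the draft.
const stubSources: PostSource[] = [
  { name: "local", load: async () => null },
  { name: "public", load: async () => { throw new Error("404"); } },
  { name: "private", load: async (slug) => `# Draft of ${slug}` },
];

loadWithFallback("my-post", stubSources).then((content) => console.log(content));
// prints # Draft of my-post (after a warning about the public source)
```

Logging the failed source before moving on keeps the fallback from hiding a real misconfiguration.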
Why this model works
Using GitHub as a CMS might sound unconventional, but it aligns perfectly with how a developer thinks. It brings traceability — every edit has a timestamp, author, and reason. It’s also reproducible: I can rebuild my entire blog history from commits alone.
It’s secure, because authentication is handled through personal access tokens. It’s free, with no external dependencies beyond GitHub itself. And it’s structured, letting Markdown remain the source of truth for both content and metadata.
- From a workflow perspective, it fits the natural rhythm of writing: draft, review, publish.
- From an engineering perspective, it fits the principle of single responsibility — content lives in repositories; rendering happens in the frontend.
Looking ahead
There’s still more to do.
Future iterations may introduce GitHub Actions to automate deployments when a post is merged into main. Metadata validation could ensure all posts have correct fields before publishing. I could even build a small dashboard to manage drafts visually — but still keep GitHub as the underlying source.
This system also opens doors for other applications: internal documentation, research notes, or even lightweight course material could all use the same model.
The elegance of it lies in its simplicity — everything is just text and commits.
In the end, the choice of GitHub wasn’t just about convenience — it was about control. It gave me a reliable, transparent, and scalable foundation for managing my writing. And as the system evolves, I know that the same principles that make version control essential for code — clarity, reversibility, and collaboration — are equally valuable for ideas.