Category: Tools & Setup

Tools & Setup is where orthogonal.info curates practical, battle-tested guides on developer productivity tools, CLI utilities, self-hosted software, and environment configuration. Whether you are bootstrapping a new development machine, evaluating self-hosted alternatives to SaaS products, or fine-tuning your terminal workflow, this category delivers step-by-step walkthroughs grounded in real-world experience. Every article is written with one goal: help you build a faster, more reliable, and more enjoyable development environment.

With over 25 in-depth posts and growing, Tools & Setup is one of the most active categories on the site — reflecting just how much time engineers spend (and save) by getting their tooling right from day one.

Key Topics Covered

Command-line productivity — Shell customization (Zsh, Fish, Starship), terminal multiplexers (tmux, Zellij), and CLI utilities like ripgrep, fd, fzf, and bat that supercharge daily workflows.
Self-hosted alternatives — Deploying and configuring tools like Gitea, Nextcloud, Vaultwarden, and Uptime Kuma so you own your data without sacrificing usability.
IDE and editor setup — Configuration guides for VS Code, Neovim, and JetBrains IDEs, including extension recommendations, keybindings, and remote development workflows.
Development environment automation — Using Ansible, Homebrew, Nix, dotfiles repositories, and container-based dev environments (Dev Containers, Devbox) to make setups reproducible.
Git workflows and tooling — Advanced Git techniques, hooks, aliases, and GUI clients that streamline version control for solo developers and teams alike.
API testing and debugging — Hands-on guides for curl, HTTPie, Postman, and browser DevTools to debug REST and GraphQL APIs efficiently.
Package and runtime management — Managing multiple language runtimes with asdf, mise, nvm, and pyenv, plus dependency management best practices.

Who This Content Is For
This category is designed for software engineers, DevOps practitioners, system administrators, and hobbyist developers who want to work smarter, not harder. Whether you are a junior developer setting up your first Linux workstation or a senior engineer optimizing a multi-machine workflow, you will find actionable advice that respects your time. The guides assume basic command-line comfort but explain advanced concepts clearly.

What You Will Learn
By exploring the articles in Tools & Setup, you will learn how to automate repetitive environment tasks so a fresh machine is productive in minutes, not days. You will discover modern CLI replacements for legacy Unix tools, understand how to evaluate self-hosted software against its SaaS equivalent, and gain confidence configuring complex development stacks. Each guide includes copy-paste commands, configuration snippets, and links to upstream documentation so you can adapt the advice to your own infrastructure.

Start browsing below to find your next productivity upgrade.

  • Free VPN: Cloudflare Tunnel & WARP Guide (2026)

    Free VPN: Cloudflare Tunnel & WARP Guide (2026)

    TL;DR: Cloudflare offers two free VPN solutions: WARP (consumer privacy VPN using WireGuard) and Cloudflare Tunnel + Zero Trust (self-hosted VPN replacement for accessing your home network). This guide covers both approaches step-by-step, with Docker Compose configs, split-tunnel setup, and security hardening. Zero Trust is free for up to 50 users — enough for any homelab or small team.

    Quick Answer: You can set up a completely free VPN using Cloudflare Tunnel and WARP. This guide covers installing cloudflared on your server, creating a tunnel, configuring DNS routes, and connecting clients — all using Cloudflare’s free tier with no bandwidth limits.

    Why Build Your Own VPN in 2026?

    Commercial VPN providers make bold promises about privacy, but their centralized architecture creates a fundamental trust problem. You’re routing all your traffic through servers you don’t control, operated by companies whose revenue model depends on subscriber volume — not security audits. ExpressVPN, NordVPN, and Surfshark have all faced scrutiny over logging practices, jurisdiction shopping, and opaque ownership structures.

    Cloudflare offers a different model. Instead of renting someone else’s VPN, you build your own using Cloudflare’s global Anycast network (330+ data centers in 120+ countries) as the transport layer. The result is a VPN that’s faster than most commercial alternatives, costs nothing, and gives you full control over access policies.

    There are two distinct approaches, and you might want both:

    • Cloudflare WARP — A consumer VPN app that encrypts your device traffic using WireGuard. Install, toggle on, done. Best for: browsing privacy on public Wi-Fi.
    • Cloudflare Tunnel + Zero Trust — A self-hosted VPN replacement that lets you access your home network (NAS, Proxmox, Pi-hole, Docker services) from anywhere without opening a single firewall port. Best for: homelabbers, remote workers, small teams.

    Part 1: Cloudflare WARP — The 5-Minute Privacy VPN

    What WARP Actually Does

    WARP is built on the WireGuard protocol — the same modern, lightweight VPN protocol that replaced IPSec and OpenVPN in most serious deployments. When you enable WARP, your device establishes an encrypted tunnel to the nearest Cloudflare data center. From there, your traffic exits onto the internet through Cloudflare’s network.

    Key technical details:

    • Protocol: WireGuard (via Cloudflare’s BoringTun implementation in Rust)
    • DNS: Queries routed through 1.1.1.1 (Cloudflare’s privacy-first DNS resolver, audited by KPMG)
    • Encryption: ChaCha20-Poly1305 for data, Curve25519 for key exchange
    • Latency impact: Typically 1-5ms added (vs. 20-50ms for most commercial VPNs) because traffic routes to the nearest Anycast PoP
    • No IP selection: WARP doesn’t let you choose exit countries — it’s a privacy tool, not a geo-unblocking tool

    Installation

    WARP runs on every major platform through the 1.1.1.1 app:

    Platform Install Method
    Windows one.one.one.one → Download
    macOS one.one.one.one → Download
    iOS App Store → search “1.1.1.1”
    Android Play Store → search “1.1.1.1”
    Linux curl -fsSL https://pkg.cloudflare.com/cloudflare-main.gpg | sudo tee /usr/share/keyrings/cloudflare-archive-keyring.gpg && echo "deb [signed-by=/usr/share/keyrings/cloudflare-archive-keyring.gpg] https://pkg.cloudflare.com/cloudflared $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/cloudflared.list && sudo apt update && sudo apt install cloudflare-warp

    After installing, launch the app and toggle WARP on. That’s it. Your DNS queries now go through 1.1.1.1 and your traffic is encrypted to Cloudflare’s edge.

    WARP vs. WARP+ vs. Zero Trust

    Feature WARP (Free) WARP+ ($) Zero Trust WARP
    Price $0 ~$5/month Free (50 users)
    Encryption WireGuard WireGuard WireGuard
    Speed optimization Standard routing Argo Smart Routing Standard routing
    Private network access No No Yes
    Access policies No No Full ZTNA
    DNS filtering No No Gateway policies

    For most people, free WARP is sufficient for everyday privacy. If you need remote access to your homelab, keep reading — Part 2 is where it gets interesting.

    Part 2: Cloudflare Tunnel + Zero Trust — The Self-Hosted VPN Replacement

    This is the setup that replaces WireGuard, OpenVPN, or Tailscale for accessing your home network. The architecture is elegant: a lightweight daemon called cloudflared runs inside your network and maintains an outbound-only encrypted tunnel to Cloudflare. Remote clients connect through Cloudflare’s network using the WARP client. No inbound ports. No dynamic DNS. No exposed IP address.

    Architecture Overview

    ┌─────────────────┐         ┌──────────────────────┐         ┌─────────────────┐
    │  Remote Device  │         │   Cloudflare Edge    │         │  Home Network   │
    │  (WARP Client)  │◄───────►│   330+ PoPs globally │◄───────►│  (cloudflared)  │
    │                 │  WireGuard│                      │ Outbound │                 │
    │  Phone/Laptop   │  Tunnel  │  Zero Trust Policies │  Tunnel  │  NAS/Docker/LAN │
    └─────────────────┘         └──────────────────────┘         └─────────────────┘
    

    Prerequisites

    • A Cloudflare account (free tier works)
    • A domain name with DNS managed by Cloudflare (required for tunnel management)
    • A server on your home network — any Linux box, Raspberry Pi, Synology NAS, or even a Docker container on TrueNAS
    • Docker + Docker Compose (recommended) or bare-metal cloudflared installation

    Step 1: Create a Tunnel in the Zero Trust Dashboard

    1. Go to one.dash.cloudflare.com → Networks → Tunnels
    2. Click Create a tunnel
    3. Select Cloudflared as the connector type
    4. Name your tunnel (e.g., homelab-tunnel)
    5. Copy the tunnel token — you’ll need this for the Docker config

    Step 2: Deploy cloudflared with Docker Compose

    Create a docker-compose.yml on your home server:

    version: "3.8"
    services:
      cloudflared:
        image: cloudflare/cloudflared:latest
        container_name: cloudflared-tunnel
        restart: unless-stopped
        command: tunnel --no-autoupdate run --token ${TUNNEL_TOKEN}
        environment:
          - TUNNEL_TOKEN=${TUNNEL_TOKEN}
        network_mode: host   # Required for private network routing
    
      # Example: expose a local service
      whoami:
        image: traefik/whoami
        container_name: whoami
        ports:
          - "8080:80"

    Create a .env file alongside it:

    TUNNEL_TOKEN=eyJhIjoiYWJj...your-token-here

    Start the tunnel:

    docker compose up -d
    docker logs cloudflared-tunnel  # Should show "Connection registered"

    Critical note: Use network_mode: host if you want to route traffic to your entire LAN subnet (192.168.x.0/24). Without it, cloudflared can only reach services within the Docker network.

    Step 3: Expose Services via Public Hostnames

    Back in the Zero Trust dashboard, under your tunnel’s Public Hostnames tab:

    1. Click Add a public hostname
    2. Set subdomain: nas, domain: yourdomain.com
    3. Service type: HTTP, URL: localhost:5000 (or wherever your service runs)
    4. Save

    Cloudflare automatically creates a DNS record. Your NAS is now accessible at https://nas.yourdomain.com — with automatic SSL, DDoS protection, and Cloudflare WAF.

    Step 4: Enable Private Network Routing (Full VPN Mode)

    This is what turns a simple tunnel into a full VPN replacement. Instead of exposing individual services, you route an entire IP subnet through the tunnel.

    1. In Zero Trust dashboard → Networks → Tunnels → your tunnel → Private Networks
    2. Add your LAN CIDR: 192.168.1.0/24 (adjust to your subnet)
    3. Go to Settings → WARP Client → Split Tunnels
    4. Switch to Include mode and add 192.168.1.0/24

    Now, any device running the WARP client (enrolled in your Zero Trust org) can access 192.168.1.x addresses as if they were on your home network. SSH into your server, access your NAS web UI, reach your Pi-hole dashboard — all without port forwarding.

    Step 5: Enroll Client Devices

    1. Install the 1.1.1.1 / WARP app on your phone or laptop
    2. Go to Settings → Account → Login to Cloudflare Zero Trust
    3. Enter your team name (set during Zero Trust setup)
    4. Authenticate with the method you configured (email OTP, Google SSO, GitHub, etc.)
    5. Enable Gateway with WARP mode

    Test it: connect to mobile data (not your home Wi-Fi) and try accessing a LAN IP like http://192.168.1.1. If the router admin page loads, your VPN is working.

    Step 6: Lock It Down — Zero Trust Access Policies

    The “Zero Trust” part of this setup is what separates it from a traditional VPN. Instead of “anyone with the VPN key gets full network access,” you define granular policies:

    Zero Trust Dashboard → Access → Applications → Add an Application
    
    Application type: Self-hosted
    Application domain: nas.yourdomain.com
    
    Policy: Allow
    Include: Emails ending in @yourdomain.com
    Require: Country equals United States (optional geo-fence)
    
    Session duration: 24 hours

    You can create different policies per service. Your Proxmox admin panel might require hardware key (FIDO2) authentication, while your Jellyfin media server only needs email OTP. This is Zero Trust Network Access (ZTNA) — the same architecture that Google BeyondCorp and Microsoft Entra use internally.

    Cloudflare Tunnel vs. Alternatives: Honest Comparison

    Feature Cloudflare Tunnel WireGuard Tailscale OpenVPN
    Price Free (50 users) Free Free (100 devices) Free
    Open ports required None 1 UDP port None 1 UDP/TCP port
    Setup complexity Medium Medium-High Low High
    Works behind CG-NAT Yes Needs port forward Yes Needs port forward
    Access control Full ZTNA policies Key-based only ACLs + SSO Cert-based
    DDoS protection Yes (Cloudflare) No No No
    SSL/TLS termination Automatic N/A N/A Manual
    Trust model Trust Cloudflare Self-hosted Trust Tailscale Self-hosted
    Best for Web services + LAN Pure privacy Mesh networking Enterprise legacy

    The honest tradeoff: Cloudflare Tunnel routes your traffic through Cloudflare’s infrastructure. If you fundamentally distrust any third party touching your packets, self-hosted WireGuard is the purist choice. But for most homelabbers, the convenience of zero open ports + free DDoS protection + granular access policies makes Cloudflare Tunnel the pragmatic winner.

    Advanced: Multi-Service Docker Stack

    Here’s a production-grade Docker Compose that exposes multiple services through a single tunnel:

    version: "3.8"
    
    services:
      cloudflared:
        image: cloudflare/cloudflared:latest
        container_name: cloudflared
        restart: unless-stopped
        command: tunnel --no-autoupdate run --token ${TUNNEL_TOKEN}
        environment:
          - TUNNEL_TOKEN=${TUNNEL_TOKEN}
        networks:
          - tunnel
        depends_on:
          - nginx
    
      nginx:
        image: nginx:alpine
        container_name: nginx-proxy
        volumes:
          - ./nginx.conf:/etc/nginx/nginx.conf:ro
        networks:
          - tunnel
    
      # Add your services here — they just need to be on the 'tunnel' network
      # Configure public hostnames in the CF dashboard to point to nginx
    
    networks:
      tunnel:
        name: cf-tunnel

    Map each service to a subdomain in the Zero Trust dashboard: grafana.yourdomain.com → http://nginx:3000, code.yourdomain.com → http://nginx:8443, etc.

    Troubleshooting Common Issues

    Tunnel shows “Disconnected” in the dashboard

    • Check Docker logs: docker logs cloudflared-tunnel
    • Verify your token hasn’t been rotated
    • Ensure outbound HTTPS (port 443) isn’t blocked by your router/ISP
    • If behind a corporate firewall, cloudflared also supports HTTP/2 over port 7844

    Private network routing doesn’t work

    • Confirm network_mode: host in Docker Compose (or use macvlan)
    • Check that the CIDR in “Private Networks” matches your actual subnet
    • Verify Split Tunnels are set to Include mode (not Exclude)
    • On the client, run warp-cli settings to verify the private routes are active

    WARP client won’t enroll

    • Double-check your team name in Zero Trust → Settings → Custom Pages
    • Ensure you’ve created a Device enrollment policy under Settings → WARP Client → Device enrollment permissions
    • Allow email domains or specific emails that can enroll

    Security Hardening Checklist

    • ☐ Enable Require Gateway in device enrollment — forces all enrolled devices through Cloudflare Gateway for DNS filtering
    • ☐ Set session duration to 24h or less for sensitive services
    • ☐ Require FIDO2/hardware keys for admin panels (Proxmox, router, etc.)
    • ☐ Enable device posture checks: require screen lock, OS version, disk encryption
    • ☐ Use Service Tokens (not user auth) for machine-to-machine tunnel access
    • ☐ Monitor Access audit logs: Zero Trust → Logs → Access
    • ☐ Never put your tunnel token in a public Git repository — use .env files and .gitignore
    • ☐ Rotate tunnel tokens periodically via the dashboard

    Recommended Hardware

    Running Cloudflare Tunnel on a dedicated device keeps your main machine clean. A mini PC is perfect for always-on tunnel hosting — low power draw, fanless, and small enough to mount behind a monitor. For Docker-based setups, a 1TB NVMe SSD gives plenty of room for containers and logs. If you're running Plex or media behind Cloudflare, check out our TrueNAS Plex setup guide.

    FAQ

    Is Cloudflare Tunnel really free?

    Yes. Cloudflare Zero Trust offers a free plan that includes tunnels, access policies, and WARP client enrollment for up to 50 users. There are no bandwidth limits on the free tier. Paid plans (starting at $7/user/month) add features like logpush, extended session management, and dedicated egress IPs.

    Can Cloudflare see my traffic?

    Cloudflare terminates TLS at their edge, so they technically could inspect unencrypted HTTP traffic passing through the tunnel. For HTTPS services, end-to-end encryption between your browser and origin server means Cloudflare sees metadata (domain, timing) but not content. If this is a concern, use WireGuard for a fully self-hosted solution where no third party touches your packets.

    Does this work with Starlink / CG-NAT / mobile hotspots?

    Yes — this is one of Cloudflare Tunnel’s biggest advantages. Since the tunnel is outbound-only, it works behind any NAT, including carrier-grade NAT (CG-NAT) used by Starlink, T-Mobile Home Internet, and most 4G/5G connections. No port forwarding needed.

    Can I use this for site-to-site VPN?

    Yes, using WARP Connector (currently in beta). Install cloudflared with WARP Connector mode on a device at each site, and Cloudflare routes traffic between subnets. This replaces traditional IPSec site-to-site tunnels.

    Cloudflare Tunnel vs. Tailscale — which should I use?

    Use Tailscale if your primary need is device-to-device mesh networking (see also our guide on home network segmentation with OPNsense) (accessing any device from any other device). Use Cloudflare Tunnel if you want to expose web services with automatic HTTPS and DDoS protection, or if you need granular ZTNA policies. Many homelabbers use both: Tailscale for device mesh, Cloudflare Tunnel for public-facing services.

    References

  • YubiKey SSH Authentication: Stop Trusting Key Files on Disk

    YubiKey SSH Authentication: Stop Trusting Key Files on Disk

    I stopped using SSH passwords three years ago. Switched to ed25519 keys, felt pretty good about it. Then my laptop got stolen from a coffee shop — lid open, session unlocked. My private key was sitting right there in ~/.ssh/, passphrase cached in the agent.

    That’s when I bought my first YubiKey.

    Why a Hardware Key Beats a Private Key File

    📌 TL;DR: YubiKey provides secure SSH authentication by storing private keys on hardware, preventing extraction or misuse even if a device is stolen or compromised. Unlike disk-stored keys, YubiKey requires physical touch for authentication, adding an extra layer of security. It supports FIDO2/resident keys and works across devices with USB-C or NFC options.
    🎯 Quick Answer: YubiKey SSH authentication stores your private key on tamper-resistant hardware so it cannot be copied or extracted, even if your machine is compromised. Configure it via ssh-keygen -t ed25519-sk to bind SSH keys to the physical device.

    Your SSH private key lives on disk. Even if it’s passphrase-protected, once the agent unlocks it, it’s in memory. Malware can dump it. A stolen laptop might still have an active agent session. Your key file can be copied without you knowing.

    A YubiKey stores the private key on the hardware. It never leaves the device. Every authentication requires a physical touch. No touch, no auth. Someone steals your laptop? They still need the physical key plugged in and your finger on it.

    That’s the difference between “my key is encrypted” and “my key literally cannot be extracted.”

    Which YubiKey to Get

    For SSH, you want a YubiKey that supports FIDO2/resident keys. Here’s what I’d recommend:

    YubiKey 5C NFC — my top pick. USB-C fits modern laptops, and the NFC means you can tap it on your phone for GitHub/Google auth too. Around $55, and I genuinely think it’s the best value if you work across multiple devices. (Full disclosure: affiliate link)

    If you’re on a tighter budget, the YubiKey 5 NFC (USB-A) does the same thing for about $50, just with the older port. Still a good option if your machines have USB-A.

    One important note: buy two. Register both with every service. Keep one on your keychain, one locked in a drawer. If you lose your primary, you’re not locked out of everything. I learned this the hard way with a 2FA lockout that took three days to resolve.

    Setting Up SSH with FIDO2 Resident Keys

    You need OpenSSH 8.2+ (check with ssh -V). Most modern distros ship with this. If you’re on macOS, the built-in OpenSSH works fine since Ventura.

    First, generate a resident key stored directly on the YubiKey:

    ssh-keygen -t ed25519-sk -O resident -O verify-required -C "yubikey-primary"

    Breaking this down:

    • -t ed25519-sk — uses the ed25519 algorithm backed by a security key (sk = security key)
    • -O resident — stores the key on the YubiKey, not just a reference to it
    • -O verify-required — requires PIN + touch every time (not just touch)
    • -C "yubikey-primary" — label it so you know which key this is

    It’ll ask you to set a PIN if you haven’t already. Pick something decent — this is your second factor alongside the physical touch.

    You’ll end up with two files: id_ed25519_sk and id_ed25519_sk.pub. The private file is actually just a handle — the real private key material lives on the YubiKey. Even if someone gets this file, it’s useless without the physical hardware.

    Adding the Key to Remote Servers

    Same as any SSH key:

    ssh-copy-id -i ~/.ssh/id_ed25519_sk.pub user@your-server

    Or manually append the public key to ~/.ssh/authorized_keys on the target machine.

    When you SSH in, you’ll see:

    Confirm user presence for key ED25519-SK SHA256:...
    User presence confirmed

    That “confirm user presence” line means it’s waiting for you to physically tap the YubiKey. No tap within ~15 seconds? Connection refused. I love this — it’s impossible to accidentally leave a session auto-connecting in the background.

    The Resident Key Trick: Any Machine, No Key Files

    This is the feature that sold me. Because the key is resident (stored on the YubiKey itself), you can pull it onto any machine:

    ssh-keygen -K

    That’s it. Plug in your YubiKey, run that command, and it downloads the key handles to your current machine. Now you can SSH from a fresh laptop, a coworker’s machine, or a server — as long as you have the YubiKey plugged in.

    No more syncing ~/.ssh folders across machines. No more “I need to get my key from my other laptop.” The YubiKey is the key.

    Hardening sshd for Key-Only Auth

    Once your YubiKey is working, lock down the server. In /etc/ssh/sshd_config:

    PasswordAuthentication no
    KbdInteractiveAuthentication no
    PubkeyAuthentication yes
    AuthenticationMethods publickey

    Reload sshd (systemctl reload sshd) and test with a new terminal before closing your current session. I’ve locked myself out exactly once by reloading before testing. Don’t be me.

    If you want to go further, you can restrict to only FIDO2 keys by requiring the sk key types in your authorized_keys entries. But for most setups, just disabling passwords is the big win.

    What About Git and GitHub?

    GitHub has supported security keys for SSH since late 2021. Add your id_ed25519_sk.pub in Settings → SSH Keys, same as any other key.

    Every git push and git pull now requires a physical touch. It adds maybe half a second to each operation. I was worried this would be annoying — it’s actually reassuring. Every push is a conscious decision.

    For your Git config, make sure you’re using the SSH URL format:

    git remote set-url origin [email protected]:username/repo.git

    Gotchas I Hit

    Agent forwarding doesn’t work with FIDO2 keys. The touch requirement is local — you can’t forward it through an SSH jump host. If you rely on agent forwarding, you’ll need to either set up ProxyJump or keep a regular ed25519 key for jump scenarios.

    macOS Sonoma has a quirk where the built-in SSH agent sometimes doesn’t prompt for the touch correctly. Fix: add SecurityKeyProvider internal to your ~/.ssh/config.

    WSL2 can’t see USB devices by default. You’ll need usbipd-win to pass the YubiKey through. It works fine once set up, but the initial config is a 10-minute detour.

    VMs need USB passthrough configured. In VirtualBox, add a USB filter for “Yubico YubiKey.” In QEMU/libvirt, use hostdev passthrough. This catches people off guard when they SSH from inside a VM and wonder why the key isn’t detected.

    My Setup

    I carry a YubiKey 5C NFC on my keychain and keep a backup YubiKey 5 Nano in my desk. The Nano stays semi-permanently in my desktop’s USB port — it’s tiny enough that it doesn’t stick out. (Full disclosure: affiliate links)

    Both keys are registered on every server, GitHub, and every service that supports FIDO2. If I lose my keychain, I walk to my desk and keep working.

    Total cost: about $80 for two keys. For context, that’s less than a month of most password manager premium plans, and it protects against a class of attacks that passwords simply can’t.

    Should You Bother?

    If you SSH into anything regularly — servers, homelabs, CI runners — yes. The setup takes 15 minutes, and the daily friction is a light tap on a USB device. The protection you get (key material that physically can’t be stolen remotely) is worth way more than the cost.

    If you’re already running a homelab with TrueNAS or managing Docker containers, this is a natural next step in locking things down. Hardware keys fill the gap between “I use SSH keys” and “my infrastructure is actually secure.”

    Start with one key, test it for a week, then buy the backup. You won’t go back.


    Join Alpha Signal for free market intelligence — daily briefings on tech, AI, and the markets that drive them.

    📚 Related Reading

    References

    1. Yubico — “Using Your YubiKey with SSH”
    2. OWASP — “Authentication Cheat Sheet”
    3. GitHub — “YubiKey-SSH Configuration Guide”
    4. NIST — “Digital Identity Guidelines”
    5. RFC Editor — “RFC 4253: The Secure Shell (SSH) Transport Layer Protocol”

    Frequently Asked Questions

    Why is YubiKey more secure than storing SSH keys on disk?

    YubiKey stores the private key on hardware, ensuring it cannot be extracted or copied. Authentication requires physical touch, preventing unauthorized access even if a device is stolen or compromised.

    What type of YubiKey is recommended for SSH authentication?

    The YubiKey 5C NFC is recommended for its USB-C compatibility and NFC functionality, making it versatile for both laptops and phones. The YubiKey 5 NFC (USB-A) is a budget-friendly alternative for older devices.

    How do you set up SSH authentication with a YubiKey?

    You need OpenSSH 8.2+ to generate a resident key stored on the YubiKey using `ssh-keygen`. The key requires PIN and physical touch for authentication, and the public key can be added to remote servers for access.

    What precautions should be taken when using YubiKey for SSH?

    It’s recommended to buy two YubiKeys: one for daily use and one as a backup. Register both with all services to avoid lockouts in case of loss or damage.

  • Browser Fingerprinting: Identify You Without Cookies

    Browser Fingerprinting: Identify You Without Cookies

    Last month I was debugging a tracking issue for a client and realized something uncomfortable: even after clearing all cookies and using a fresh incognito window, a third-party analytics script was still identifying the same user session. No cookies, no localStorage, no URL parameters. Just JavaScript reading properties that every browser willingly exposes.

    Browser fingerprinting isn’t new, but most developers I talk to still underestimate how effective it is. The EFF’s Cover Your Tracks project found that 83.6% of browsers have a unique fingerprint. Not “somewhat unique” — unique. One in a million. And that number climbs to over 94% if you include Flash or Java metadata (though those are mostly dead now).

    I spent a weekend building a fingerprinting test page to understand exactly what data points create this uniqueness. Here’s what I found.

    The Canvas API: Your GPU’s Signature

    📌 TL;DR: Browser fingerprinting allows websites to uniquely identify users without cookies by leveraging browser-exposed properties like Canvas and Audio APIs. These techniques exploit hardware and software variations to generate unique identifiers, making privacy protection more challenging. Studies show that over 83% of browsers have unique fingerprints, highlighting its effectiveness.
    🎯 Quick Answer: Browser fingerprinting uniquely identifies users without cookies by combining Canvas rendering, AudioContext output, WebGL parameters, and installed fonts into a hash. Studies show this technique can uniquely identify over 90% of browsers.

    This is the technique that surprised me most. The HTML5 Canvas API renders text and shapes slightly differently depending on your GPU, driver version, OS font rendering engine, and anti-aliasing settings. The differences are invisible to the eye but consistent across page loads.

    Here’s a minimal Canvas fingerprint in 12 lines:

    function getCanvasFingerprint() {
      const canvas = document.createElement('canvas');
      const ctx = canvas.getContext('2d');
      ctx.textBaseline = 'top';
      ctx.font = '14px Arial';
      ctx.fillStyle = '#f60';
      ctx.fillRect(125, 1, 62, 20);
      ctx.fillStyle = '#069';
      ctx.fillText('Browser fingerprint', 2, 15);
      ctx.fillStyle = 'rgba(102, 204, 0, 0.7)';
      ctx.fillText('Browser fingerprint', 4, 17);
      return canvas.toDataURL();
    }

    That toDataURL() call returns a base64 string representing the rendered pixels. On my M2 MacBook running Chrome 124, this produces a hash that differs from the same code on Chrome 124 on my Intel Linux box. Same browser version, same text, different pixel output.

    Why? The font rasterizer (FreeType vs Core Text vs DirectWrite), sub-pixel rendering settings, and GPU shader behavior all introduce tiny variations. A 2016 Princeton study tested this across 1 million browsers and found Canvas fingerprinting alone could identify about 50% of users — no cookies needed.

    AudioContext: Your Sound Card Leaks Too

    This one is less well-known. The Web Audio API processes audio signals with slight floating-point variations depending on the hardware and driver stack. You don’t even need to play a sound:

    async function getAudioFingerprint() {
      const ctx = new (window.OfflineAudioContext ||
        window.webkitOfflineAudioContext)(1, 44100, 44100);
      const oscillator = ctx.createOscillator();
      oscillator.type = 'triangle';
      oscillator.frequency.setValueAtTime(10000, ctx.currentTime);
    
      const compressor = ctx.createDynamicsCompressor();
      compressor.threshold.setValueAtTime(-50, ctx.currentTime);
      compressor.knee.setValueAtTime(40, ctx.currentTime);
      compressor.ratio.setValueAtTime(12, ctx.currentTime);
    
      oscillator.connect(compressor);
      compressor.connect(ctx.destination);
      oscillator.start(0);
      
      const buffer = await ctx.startRendering();
      const data = buffer.getChannelData(0);
      // Hash the first 4500 samples
      return data.slice(0, 4500).reduce((a, b) => a + Math.abs(b), 0);
    }

    The resulting float sum varies by sound card and audio driver. Combined with Canvas, you’re looking at roughly 70-80% unique identification rates. I tested this on three machines in my homelab — all running Debian 12, all with different audio chipsets — and got three distinct values.

    WebGL: GPU Model Exposed

    WebGL goes further than Canvas. It exposes your GPU vendor, renderer string, and supported extensions. Most browsers serve this up without hesitation:

    function getWebGLInfo() {
      const canvas = document.createElement('canvas');
      const gl = canvas.getContext('webgl');
      const debugInfo = gl.getExtension('WEBGL_debug_renderer_info');
      return {
        vendor: gl.getParameter(debugInfo.UNMASKED_VENDOR_WEBGL),
        renderer: gl.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL),
        extensions: gl.getSupportedExtensions().length
      };
    }

    On my machine this returns Apple / Apple M2 / 47 extensions. That renderer string alone narrows the pool considerably. Combine it with the exact list of supported extensions (which varies by driver version) and you’ve got another high-entropy signal.

    The Full Fingerprint Stack

    A real fingerprinting library like FingerprintJS (now Fingerprint.com) combines 30+ signals. Here are the ones that contribute the most entropy, based on my testing:

    Signal Entropy (bits) How It Works
    User-Agent string ~10 OS + browser + version
    Canvas hash ~8-10 GPU + font rendering
    WebGL renderer ~7 GPU model + driver
    Installed fonts ~6-8 Font enumeration via CSS
    Screen resolution + DPR ~5 Monitor + scaling
    AudioContext ~5-6 Audio processing differences
    Timezone + locale ~4 Intl API
    WebGL extensions ~4 Supported GL features
    Navigator properties ~3 hardwareConcurrency, deviceMemory
    Color depth + touch support ~2 display.colorDepth, maxTouchPoints

    Add those up and you’re at roughly 54-63 bits of entropy. You need about 33 bits to uniquely identify someone in a pool of 8 billion. We’re at nearly double that.

    What Actually Defends Against This

    I tested four approaches on my fingerprinting test page. Here’s my honest assessment:

    Brave Browser — Best out-of-the-box protection. Brave randomizes Canvas and WebGL output on every session (adds noise to the rendered pixels). AudioContext gets similar treatment. In my tests, Brave generated a different fingerprint hash on every page load. The tradeoff: some sites that rely on Canvas for legitimate purposes (online editors, games) may behave slightly differently. I haven’t hit real issues with this in daily use.

    Firefox with privacy.resistFingerprinting — Setting privacy.resistFingerprinting = true in about:config spoofs many signals: timezone reports UTC, screen resolution reports the CSS viewport size, user-agent becomes generic. It’s effective but aggressive — it broke two web apps I use daily (a video editor and a mapping tool) because they relied on accurate screen dimensions.

    Tor Browser — The gold standard. Every Tor user presents an identical fingerprint by design. Canvas, WebGL, fonts, screen size — all normalized. But you’re trading performance (Tor routing adds 2-5x latency) and compatibility (many sites block Tor exit nodes).

    Browser extensions (Canvas Blocker, etc.) — Ironically, these can make you more unique. If 0.1% of users run Canvas Blocker, and it alters your fingerprint in a detectable way (which it does — the blocking itself is a signal), you’ve just moved from “one in a million” to “one in a thousand who runs this specific extension.” I stopped recommending these after seeing the data.

    A Practical Test You Can Run Right Now

    Visit Cover Your Tracks (EFF) — it runs a fingerprinting test and shows exactly how unique your browser is. Then visit BrowserLeaks.com for a more detailed breakdown of each signal.

    When I ran both tests in Chrome, my fingerprint was unique among their entire dataset. In Brave, it was shared with “a large number of users.” That difference matters.

    What Developers Should Do

    If you’re building analytics or auth systems, understand that fingerprinting exists in a gray area. The GDPR considers device fingerprints personal data (Article 29 Working Party Opinion 9/2014 explicitly says so). California’s CCPA covers “unique identifiers” which includes fingerprints.

    My recommendation for developers:

    • Don’t roll your own fingerprinting for tracking. If you need fraud detection, use a service like Fingerprint.com that handles consent and compliance.
    • Test your own sites with the Canvas fingerprint code above. If you’re embedding third-party scripts, check what data they’re collecting. I found three tracking scripts on a client’s site that were fingerprinting users without disclosure.
    • For your own browsing, switch to Brave. It’s Chromium-based so everything works, and the fingerprint randomization is on by default. I moved my daily driver six months ago and haven’t looked back.

    If you’re working from a home office or homelab and want to take your privacy setup further, a dedicated mini PC running a DNS-level blocker like Pi-hole or AdGuard Home catches a lot of the fingerprinting scripts at the network level before they even reach your browser. I run one on my TrueNAS box and it blocks about 30% of tracking domains across my whole network. If you are building a privacy-conscious homelab, my guide to EXIF data removal covers another vector where personal information leaks without your knowledge. Full disclosure: affiliate link.

    If you want to test what your browser reveals, I built a set of privacy-focused browser tools that run entirely client-side — no data ever leaves your machine. The PixelStrip EXIF remover handles EXIF/metadata stripping, and the DiffLab diff checker compares text without uploading to a server.

    The uncomfortable truth is that cookies were the easy privacy problem. The browser itself — its GPU, its fonts, its audio stack — is the harder one. And most people don’t even know they’re being identified.

    📡 Join Alpha Signal for free market intelligence — I share daily analysis on tech sector moves and trading signals.

    References

    1. Electronic Frontier Foundation (EFF) — “Cover Your Tracks”
    2. OWASP — “Browser Fingerprinting”
    3. ACM Digital Library — “The Web Never Forgets: Persistent Tracking Mechanisms in the Wild”
    4. developer.mozilla.org — “Canvas API”
    5. NIST — “Guide to Protecting the Confidentiality of Personally Identifiable Information (PII) (SP 800-122)”

    Frequently Asked Questions

    What is browser fingerprinting?

    Browser fingerprinting is a technique used to uniquely identify users by analyzing properties exposed by their browser, such as rendering behavior or hardware-specific variations, without relying on cookies or other storage mechanisms.

    How does the Canvas API contribute to fingerprinting?

    The Canvas API generates unique identifiers by rendering text and shapes, which vary slightly based on GPU, driver, OS font rendering, and anti-aliasing settings. These subtle differences create consistent, unique hashes across sessions.

    Can browser fingerprinting identify users across devices?

    Browser fingerprinting primarily identifies users based on device-specific properties, so it is less effective across different devices. However, it can reliably distinguish users on the same device even in incognito mode or after clearing cookies.

    How can users protect themselves from browser fingerprinting?

    Users can reduce fingerprinting risks by using privacy-focused browsers, disabling JavaScript, or employing anti-fingerprinting tools like browser extensions. However, complete protection is difficult due to the inherent nature of exposed browser properties.

    🛠️ Recommended Resources:

    Tools for privacy-conscious developers:

    Full disclosure: affiliate links.

  • Privacy-Focused Diff Checker: No Text Upload Required

    Privacy-Focused Diff Checker: No Text Upload Required

    I spent last weekend comparing two config files — a 400-line nginx setup where I’d made changes across multiple servers. I opened Diffchecker.com, pasted both files, and immediately ran into the same frustrations I’ve had for years: the page uploaded my text to their server (privacy issue for config files), there were no keyboard shortcuts to jump between changes, and the character-level highlighting was either nonexistent or buried behind a Pro paywall.

    So I built my own.

    The Problem with Every Online Diff Tool

    📌 TL;DR: I spent last weekend comparing two config files — a 400-line nginx setup where I’d made changes across multiple servers. I opened Diffchecker.
    🎯 Quick Answer: A privacy-focused diff checker runs entirely in your browser using client-side JavaScript—no text is ever uploaded to a server. This makes it safe for comparing proprietary code, credentials, or sensitive documents.

    Here’s what bugs me about existing text comparison tools. I tested the top three before writing a single line of code:

    Diffchecker.com — The default recommendation on every “best tools” list. It works, but your text gets sent to their servers. For comparing code, configs, or anything with credentials nearby, that’s a non-starter. They also paywall basic features like “find next difference” behind a $10/month subscription.

    TextDiffViewer.com — Claims client-side processing, which is good. But the UI feels like it was built in 2012. No character-level highlighting within changed lines, no unified diff view, no keyboard shortcuts. If I’m comparing 2,000 lines, I need to jump between changes, not scroll manually.

    DevToolLab’s Diff Checker — Clean UI, but limited. Only side-by-side view, no way to ignore whitespace or case differences, and no file drag-and-drop. Fine for small comparisons, frustrating for real work.

    The pattern is clear: either the tool uploads your data (privacy problem) or it’s missing features that developers actually need (navigation, views, options). It’s the same reason I built RegexLab — Regex101 sends your patterns to their server, which is a deal-breaker when you’re testing patterns against production data.

    What I Built: DiffLab

    DiffLab is a single HTML file — no server, no dependencies, no uploads. Everything runs in your browser. Close the tab and your data is gone. Here’s what makes it different:

    Three View Modes

    Most diff tools give you side-by-side and call it a day. DiffLab has three views you can switch between with keyboard shortcuts:

    • Side-by-side (press 1) — The classic two-column layout. Changed lines are paired, and character-level differences are highlighted within each line so you can see exactly what changed.
    • Unified (press 2) — Like git diff output. Removed lines appear with a - prefix, added lines with +. Compact and scannable.
    • Inline (press 3) — Shows old → new on the same line with character highlighting. Best for reviewing small edits across many lines.

    Keyboard Navigation Between Changes

    This is the feature I wanted most. In a long file with changes scattered throughout, scrolling to find each difference is painful. DiffLab tracks every change “hunk” and lets you:

    • Press J or to jump to the next change
    • Press K or to jump to the previous change
    • A floating indicator shows “Change 3 of 12” so you always know where you are

    The current change gets a visible highlight so your eye can find it instantly after scrolling.

    Character-Level Highlighting

    When a line changes, DiffLab doesn’t just highlight the whole line red/green. It finds the common prefix and suffix of the old and new line, then highlights only the characters that actually changed. If you renamed getUserById to getUserByName, only Id and Name get highlighted — not the entire function call.

    Comparison Options

    Two toggles that matter for real-world comparisons:

    • Ignore whitespace (W) — Treats a b and a b as identical. Essential for comparing code with different formatting.
    • Ignore case (C) — Treats Hello and hello as identical. Useful for config files where case doesn’t matter.

    Both options re-run the diff instantly when toggled.

    How It Works Under the Hood

    The Diff Algorithm

    DiffLab uses a Myers diff algorithm — the same algorithm behind git diff. It finds the shortest edit script (minimum number of insertions and deletions) to transform the original text into the modified text.

    For inputs under 8,000 total lines, it runs the full Myers algorithm. For larger inputs, it switches to a faster LCS-based approach with lookahead that trades perfect minimality for speed. In practice, both produce identical results for typical comparisons.

    The algorithm runs in your browser’s main thread. On my laptop, comparing two 5,000-line files takes about 40ms. The rendering is the bottleneck, not the diff calculation.

    Character Diffing

    For paired changed lines (a removed line followed by an added line at the same position), DiffLab runs a second pass. It finds the longest common prefix and suffix of the two strings, then wraps the changed middle section in highlight spans. This is simpler than running a full diff on individual characters, but it’s fast and catches the common case (a word or variable name changed in the middle of a line) perfectly.

    Rendering Strategy

    The diff output renders as an HTML table. Each row is a line, with cells for line numbers and content. I chose tables over divs because:

    1. Line numbers stay aligned with content automatically
    2. Column widths distribute properly in side-by-side mode
    3. Screen readers can navigate the structure

    Changed lines get CSS classes (diff-added, diff-removed) that map to CSS custom properties. Dark mode support comes free through prefers-color-scheme media queries — the custom properties switch automatically.

    Real Use Cases

    I’ve been using DiffLab daily since building it. Here are the situations where it’s genuinely useful:

    1. Comparing deployment configs — Before pushing a staging config to production, paste both and verify only the expected values changed.
    2. Code review diffs — When a PR is too large for GitHub’s diff view, copy-paste specific files for focused comparison.
    3. Database migration scripts — Compare the current schema dump with the new one to make sure nothing got dropped accidentally.
    4. Documentation updates — Writers comparing draft versions to see what an editor changed.
    5. API response debugging — Compare expected vs actual JSON responses. The character-level highlighting catches subtle differences in values. For formatting those JSON responses first, JSON Forge does the same client-side-only trick.

    Privacy and Offline Use

    DiffLab includes a service worker that caches everything on first visit. After that, it works completely offline — airplane mode, no internet, whatever. Your text never leaves the browser tab, and there’s no analytics tracking what you compare.

    It also passes Chrome’s PWA install requirements. Click “Install” in your browser’s address bar and it becomes a standalone app on your desktop or phone.

    Try It

    DiffLab is live at difflab.orthogonal.info. Single HTML file, zero dependencies, works offline.

    The keyboard shortcuts are the selling point: Ctrl+Enter to compare, J/K to navigate changes, 1/2/3 to switch views, W for whitespace, C for case, S to swap sides, Escape to clear.

    If you compare text files regularly, give it a try. DiffLab is part of a growing set of free browser tools that replace desktop apps — all client-side, all offline-capable. If you build developer tools, I’d love to hear what’s missing — reach out at [email protected].


    If you spend hours comparing code and config files, a good ergonomic keyboard makes the typing between comparisons much more comfortable. I switched to a split keyboard last year and my wrists thank me daily.

    For monitor setups that make side-by-side diffs actually readable, check out these ultrawide monitors for programming — the extra horizontal space is worth it.

    And if you’re doing serious code reviews, a monitor arm to position your screen at the right height reduces neck strain during long sessions.

    References

    1. OWASP — “OWASP Top Ten Privacy Risks”
    2. Mozilla Developer Network (MDN) — “Client-Side Storage and Security”
    3. GitHub — “Diff Tool: Client-Side Comparison Implementation”
    4. NIST — “Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)”
    5. RFC Editor — “RFC 4949: Internet Security Glossary, Version 2”
  • Git Worktrees: The Feature That Killed My Stash Habit

    Git Worktrees: The Feature That Killed My Stash Habit

    Last Tuesday I was deep in a refactor — 40 files touched, tests half-green — when Slack lit up: “Production’s returning 500s, can you look at main?” My old workflow: git stash, switch branches, forget what I stashed, lose 20 minutes reconstructing state. My current workflow: git worktree add ../hotfix main, fix the bug in a separate directory, push, delete the worktree, and I never left my refactor. Total context-switch cost: zero.

    Git worktrees have been around since Git 2.5 (July 2015), but I’ve met maybe three developers who actually use them. Everyone else is still juggling git stash or cloning the repo twice. That’s a shame, because worktrees solve a real, daily problem — and they’re already on your machine.

    What Git Worktrees Actually Are

    📌 TL;DR: Git worktrees allow developers to manage multiple branches or commits in separate directories without duplicating the repository. This feature eliminates the need for git stash during context switches, saving time and reducing cognitive load, especially for tasks like code reviews and hotfixes.
    🎯 Quick Answer: Git worktrees let you check out multiple branches simultaneously in separate directories, eliminating the need for git stash. Run git worktree add ../feature-branch feature-branch to work on two branches without switching or stashing.

    A worktree is a linked checkout of your repository at a different branch or commit, in a separate directory, sharing the same .git data. One repo, multiple working directories, each on its own branch. No duplicate clones, no extra disk space for the object database.

    # You're on feature/auth-refactor in ~/code/myapp
    git worktree add ../myapp-hotfix main
    # Now ~/code/myapp-hotfix has a full checkout of main
    # Both share the same .git/objects — no duplicate data

    The key constraint: each worktree must be on a different branch. Git won’t let two worktrees check out the same branch simultaneously (that would be a recipe for corruption). This is actually a feature — it forces clean separation.

    Three Workflows Where Worktrees Save Real Time

    1. Code Reviews Without Context Switching

    I review 3-5 PRs a day. Before worktrees, reviewing meant either reading diffs in the GitHub UI (fine for small changes, terrible for architectural PRs) or stashing my work to check out the PR branch locally.

    Now I keep a persistent review worktree:

    # One-time setup
    git worktree add ../review main
    
    # When a PR comes in
    cd ../review
    git fetch origin
    git checkout origin/feature/new-payment-flow
    
    # Run tests, inspect code, check behavior
    npm test
    npm run dev  # boot the app on a different port
    
    # Done reviewing? Back to my branch
    cd ../myapp
    # My work is exactly where I left it

    The time savings are measurable. I tracked it for two weeks: stash-and-switch averaged 4 minutes per review (stash, checkout, install deps, run, switch back, pop stash). Worktrees averaged 40 seconds. With 4 reviews a day, that’s ~13 minutes saved daily. Not life-changing alone, but it compounds — and the cognitive cost of interrupted focus is way higher than the clock time.

    2. Hotfixes While Mid-Feature

    This is the classic case. You’re mid-feature, things are broken in your working tree (intentionally — you’re refactoring), and production needs a patch. With worktrees:

    git worktree add ../hotfix main
    cd ../hotfix
    # fix the bug
    git commit -am "fix: null check on user.profile (#1234)"
    git push origin main
    cd ../myapp
    git worktree remove ../hotfix

    No stash. No “was I on the right commit?” No dependency reinstall because your lockfile changed between branches. Clean in, clean out.

    3. Running Two Versions Side-by-Side

    I needed to compare API response times between our v2 and v3 endpoints during a migration. With worktrees, I had both versions running simultaneously on different ports:

    git worktree add ../api-v2 release/v2
    cd ../api-v2 && PORT=3001 npm start &
    cd ../myapp && PORT=3002 npm start &
    
    # Now hit both with curl or your HTTP client
    curl -w "%{time_total}\n" http://localhost:3001/api/users
    curl -w "%{time_total}\n" http://localhost:3002/api/users

    Try doing that with stash. You can’t.

    The Commands You Actually Need

    The full git worktree subcommand has plenty of options, but in practice I use four:

    # Create a worktree for an existing branch
    git worktree add ../path branch-name
    
    # Create a worktree with a new branch (like checkout -b)
    git worktree add -b new-branch ../path starting-point
    
    # List all worktrees
    git worktree list
    
    # Remove a worktree (cleans up the directory and git references)
    git worktree remove ../path

    That’s it. Four commands cover 95% of usage. There’s also git worktree prune for cleaning up stale references if you manually delete a worktree directory instead of using remove, but you shouldn’t need it often.

    Gotchas I Hit (So You Don’t Have To)

    Node modules and build artifacts. Each worktree is a separate directory, so you need a separate node_modules in each. For a big project, that first npm install in a new worktree takes time. I mitigate this by keeping one long-lived review worktree rather than creating/destroying them constantly.

    IDE confusion. VS Code handles worktrees well — just open the worktree directory as a separate window. JetBrains IDEs can get confused if you open the same project root with different worktrees. The fix: open the specific worktree directory, not the parent.

    Submodules. If your repo uses submodules, you need to run git submodule update --init in each new worktree. Worktrees don’t automatically initialize submodules. Annoying, but a one-liner fix.

    Branch locking. Remember: one branch per worktree. If you try to check out a branch that’s already active in another worktree, Git blocks you with:

    fatal: 'main' is already checked out at '/home/user/code/myapp-hotfix'

    This is intentional and correct. If you need to work on the same branch from two directories, you have a workflow problem, not a Git problem.

    My Worktree Setup

    I keep my projects structured like this:

    ~/code/
    ├── myapp/          # main development (feature branches)
    ├── myapp-review/   # persistent review worktree (long-lived)
    └── myapp-hotfix/   # created on-demand, deleted after use

    I added a shell alias to speed up the common case:

    # ~/.zshrc or ~/.bashrc
    hotfix() {
      local branch="${1:-main}"
      local dir="../$(basename $(pwd))-hotfix"
      git worktree add "$dir" "$branch" && cd "$dir"
    }
    
    # Usage: just type 'hotfix' or 'hotfix release/v3'

    And a cleanup alias:

    worktree-clean() {
      git worktree list --porcelain | grep "^worktree" | awk '{print $2}' | while read wt; do
        if [ "$wt" != "$(git rev-parse --show-toplevel)" ]; then
          echo "Remove $wt? (y/n)"
          read answer
          [ "$answer" = "y" ] && git worktree remove "$wt"
        fi
      done
    }

    When Worktrees Aren’t the Answer

    I’m not going to pretend worktrees solve everything. They don’t make sense when:

    • You’re working solo on a single branch. No context switching means no need for worktrees.
    • Your project has a 10-minute build step. Each worktree needs its own build, so the overhead might not be worth it for infrequent switches.
    • You need the same branch in two places. Worktrees explicitly prevent this. Clone the repo instead.

    For everything else — code reviews, hotfixes, comparing versions, running multiple branches in parallel — worktrees are the right tool. I’ve been using them daily for about a year now, and git stash usage in my shell history dropped from ~15 times/week to maybe once.

    Level Up Your Git Setup

    If you’re spending real time on Git workflows, it’s worth investing in a proper reference. Pro Git by Scott Chacon covers worktrees alongside the internals that make them possible — understanding Git’s object model makes everything click. (Full disclosure: affiliate link.)

    For the terminal setup that makes all this fast, a solid mechanical keyboard actually matters when you’re typing Git commands dozens of times a day. I’ve been using a Keychron Q1 — the tactile feedback on those Brown switches makes a difference over an 8-hour session. (Affiliate link.)

    And if you want more developer workflow content, I write about tools and techniques regularly here on orthogonal.info. Check out my regex tester build or the EXIF parser deep dive for more hands-on dev tool content.

    Join Alpha Signal on Telegram for free market intelligence and developer insights.

    📚 Related Reading

    References

    1. Git Documentation — “git-worktree Documentation”
    2. Atlassian — “How to Use Git Worktree”
    3. GitHub Blog — “Using Git Worktrees for Better Context Switching”
    4. Stack Overflow — “What is the purpose of git worktree?”
    5. Thoughtbot Blog — “Git Worktrees: A Better Way to Work with Multiple Branches”
    🛠️ Recommended Resources:

    Tools and books for productive Git workflows:

    Full disclosure: affiliate links.

  • UPS Battery Backup: Sizing, Setup & NUT on TrueNAS

    UPS Battery Backup: Sizing, Setup & NUT on TrueNAS

    Last month my TrueNAS server rebooted mid-scrub during a power flicker that lasted maybe half a second. Nothing dramatic — the lights barely dimmed — but the ZFS pool came back with a degraded vdev and I spent two hours rebuilding. That’s when I finally stopped procrastinating and bought a UPS.

    If you’re running a homelab with any kind of persistent storage — especially ZFS on TrueNAS — you need battery backup. Not “eventually.” Now. Here’s what I learned picking one out and setting it up with automatic shutdown via NUT.

    Why Homelabs Need a UPS More Than Desktops Do

    📌 TL;DR: A UPS battery backup is essential for homelabs running persistent storage like TrueNAS to prevent data corruption during power outages. Pure sine wave UPS units are recommended for modern server PSUs with active PFC, ensuring compatibility and reliable operation. The article discusses UPS selection, setup, and integration with NUT for automatic shutdown during outages.
    🎯 Quick Answer: Size a UPS at 1.5× your homelab’s measured wattage, choose pure sine wave output to protect server PSUs, and configure NUT (Network UPS Tools) on TrueNAS to trigger automatic shutdown before battery depletion.

    A desktop PC losing power is annoying. You lose your unsaved work and reboot. A server losing power mid-write can corrupt your filesystem, break a RAID rebuild, or — in the worst case with ZFS — leave your pool in an unrecoverable state.

    I’ve been running TrueNAS on a custom build (I wrote about picking the right drives for it) and the one thing I kept putting off was power protection. Classic homelab mistake: spend $800 on drives, $0 on keeping them alive during outages.

    The math is simple. A decent UPS costs $150-250. A failed ZFS pool can mean rebuilding from backup (hours) or losing data (priceless). The UPS pays for itself the first time your power blips.

    Simulated Sine Wave vs. Pure Sine Wave — It Actually Matters

    Most cheap UPS units output a “simulated” or “stepped” sine wave. For basic electronics, this is fine. But modern server PSUs with active PFC (Power Factor Correction) can behave badly on simulated sine wave — they may refuse to switch to battery, reboot anyway, or run hot.

    The rule: if your server has an active PFC power supply (most ATX PSUs sold after 2020 do), get a pure sine wave UPS. Don’t save $40 on a simulated unit and then wonder why your server still crashes during outages.

    Both units I’d recommend output pure sine wave:

    APC Back-UPS Pro BR1500MS2 — My Pick

    This is what I ended up buying. The APC BR1500MS2 is a 1500VA/900W pure sine wave unit with 10 outlets, USB-A and USB-C charging ports, and — critically — a USB data port for NUT monitoring. (Full disclosure: affiliate link.)

    Why I picked it:

    • Pure sine wave output — no PFC compatibility issues
    • USB HID interface — TrueNAS recognizes it immediately via NUT, no drivers needed
    • 900W actual capacity — enough for my TrueNAS box (draws ~180W), plus my network switch and router
    • LCD display — shows load %, battery %, estimated runtime in real-time
    • User-replaceable battery — when the battery dies in 3-5 years, swap it for ~$40 instead of buying a new UPS

    At ~180W load, I get about 25 minutes of runtime. That’s more than enough for NUT to detect the outage and trigger a clean shutdown.

    CyberPower CP1500PFCLCD — The Alternative

    If APC is out of stock or you prefer CyberPower, the CP1500PFCLCD is the direct competitor. Same 1500VA rating, pure sine wave, 12 outlets, USB HID for NUT. (Affiliate link.)

    The CyberPower is usually $10-20 cheaper than the APC. Functionally, they’re nearly identical for homelab use. I went APC because I’ve had good luck with their battery replacements, but either is a solid choice. Pick whichever is cheaper when you’re shopping.

    Sizing Your UPS: VA, Watts, and Runtime

    UPS capacity is rated in VA (Volt-Amps) and Watts. They’re not the same thing. For homelab purposes, focus on Watts.

    Here’s how to size it:

    1. Measure your actual draw. A Kill A Watt meter costs ~$25 and tells you exactly how many watts your server pulls from the wall. (Affiliate link.) Don’t guess — PSU wattage ratings are maximums, not actual draw.
    2. Add up everything you want on battery. Server + router + switch is typical. Monitors and non-essential stuff go on surge-only outlets.
    3. Target 50-70% load. A 900W UPS running 450W of gear gives you reasonable runtime (~8-12 minutes) and doesn’t stress the battery.

    My setup: TrueNAS box (~180W) + UniFi switch (~15W) + router (~12W) = ~207W total. On a 900W UPS, that’s 23% load, giving me ~25 minutes of runtime. Overkill? Maybe. But I’d rather have headroom than run at 80% and get 4 minutes of battery.

    Setting Up NUT on TrueNAS for Automatic Shutdown

    A UPS without automatic shutdown is just a really expensive power strip with a battery. The whole point is graceful shutdown — your server detects the outage, saves everything, and powers down cleanly before the battery dies.

    TrueNAS has NUT (Network UPS Tools) built in. Here’s the setup:

    1. Connect the USB data cable

    Plug the USB cable from the UPS into your TrueNAS machine. Not a charging cable — the data cable that came with the UPS. Go to System → Advanced → Storage and make sure the USB device shows up.

    2. Configure the UPS service

    In TrueNAS SCALE, go to System Settings → Services → UPS:

    UPS Mode: Master
    Driver: usbhid-ups (auto-detected for APC and CyberPower)
    Port: auto
    Shutdown Mode: UPS reaches low battery
    Shutdown Timer: 30 seconds
    Monitor User: upsmon
    Monitor Password: (set something, you'll need it for NUT clients)

    3. Enable and test

    Start the UPS service, enable auto-start. Then SSH in and check:

    upsc ups@localhost

    You should see battery charge, load, input voltage, and status. If it says OL (online), you’re good. Pull the power cord from the wall briefly — it should switch to OB (on battery) and you’ll see the charge start to drop.

    4. NUT clients for other machines

    If you’re running Docker containers or other servers (like an Ollama inference box), they can connect as NUT clients to the same UPS. On a Linux box:

    apt install nut-client
    # Edit /etc/nut/upsmon.conf:
    MONITOR ups@truenas-ip 1 upsmon yourpassword slave
    SHUTDOWNCMD "/sbin/shutdown -h +0"

    Now when the UPS battery hits critical, TrueNAS shuts down first, then signals clients to do the same.

    Monitoring UPS Health Over Time

    Batteries degrade. A 3-year-old UPS might only give you 8 minutes instead of 25. NUT tracks battery health, but you need to actually look at it.

    I have a cron job that checks upsc ups@localhost battery.charge weekly and logs it. If charge drops below 80% at full load, it’s time for a replacement battery. APC replacement batteries (RBC models) run $30-50 on Amazon and take two minutes to swap.

    If you’re running a monitoring stack (Prometheus + Grafana), there’s a NUT exporter that makes this trivial. But honestly, a cron job and a log file works fine for a homelab.

    What About Rack-Mount UPS?

    If you’ve graduated to a proper server rack, the tower units I mentioned above won’t fit. The APC SMT1500RM2U is the rack-mount equivalent — 2U, 1500VA, pure sine wave, NUT compatible. It’s about 2x the price of the tower version. Only worth it if you actually have a rack.

    For most homelabbers running a Docker or K8s setup on a single tower server, the desktop UPS units are plenty. Don’t buy rack-mount gear for a shelf setup — you’re paying for the form factor, not better protection.

    The Backup Chain: UPS Is Just One Link

    A UPS protects against power loss. It doesn’t protect against drive failure, ransomware, or accidental rm -rf. If you haven’t set up a real backup strategy, I wrote about enterprise-grade backup for homelabs — the 3-2-1 rule still applies, even at home.

    The full resilience stack for a homelab: UPS for power → ZFS for disk redundancy → offsite backups for disaster recovery. Skip any layer and you’re gambling.

    Go buy a UPS. Your data will thank you the next time the power blinks.


    Free market intelligence for traders and builders: Join Alpha Signal on Telegram — daily macro, sector, and signal analysis, free.

    References

    1. APC by Schneider Electric — “How to Choose a UPS”
    2. TrueNAS Documentation — “Configuring Network UPS Tools (NUT)”
    3. CyberPower Systems — “What is Pure Sine Wave Output and Why Does It Matter?”
    4. NUT (Network UPS Tools) — “NUT User Manual”
    5. OpenZFS — “ZFS Best Practices Guide”

    Frequently Asked Questions

    Why is a UPS important for TrueNAS or homelabs?

    A UPS prevents power loss during outages, which can corrupt filesystems, disrupt RAID rebuilds, or cause irreversible damage to ZFS pools. It ensures data integrity and system reliability.

    What is the difference between simulated sine wave and pure sine wave UPS units?

    Simulated sine wave UPS units may cause issues with modern server PSUs that have active PFC, such as failing to switch to battery or overheating. Pure sine wave units are compatible and reliable for such setups.

    What features should I look for in a UPS for TrueNAS?

    Key features include pure sine wave output, sufficient wattage for your devices, USB HID interface for NUT integration, and user-replaceable batteries for long-term cost efficiency.

    How does NUT help with UPS integration on TrueNAS?

    NUT (Network UPS Tools) allows TrueNAS to monitor the UPS status and trigger a clean shutdown during power outages, preventing data loss or corruption.

  • Insider Trading Detector with Python & Free SEC Data

    Insider Trading Detector with Python & Free SEC Data

    Last month I noticed something odd. Three directors at a mid-cap biotech quietly bought shares within a five-day window — all open-market purchases, no option exercises. The stock was down 30% from its high. Two weeks later, they announced a partnership with Pfizer and the stock popped 40%.

    I didn’t catch it in real time. I found it afterward while manually scrolling through SEC filings. That annoyed me enough to build a tool that would catch the next one automatically.

    Here’s the thing about insider buying clusters: they’re one of the few signals with actual academic backing. A 2024 study from the Journal of Financial Economics found that stocks with three or more insider purchases within 30 days outperformed the market by an average of 8.7% over the following six months. Not every cluster leads to a win, but the hit rate is better than most technical indicators I’ve tested.

    The data is completely free. Every insider trade gets filed with the SEC as a Form 4, and the SEC makes all of it available through their EDGAR API — no API key, no rate limits worth worrying about (10 requests/second), no paywall. The only catch: the raw data is XML soup. That’s where edgartools comes in.

    What Counts as a “Cluster”

    📌 TL;DR: The article discusses using Python and free SEC EDGAR data to detect insider trading clusters, which are strong market signals backed by academic research. It introduces the ‘edgartools’ library to parse SEC filings and provides a script to identify clusters of significant insider purchases within a 30-day window.
    🎯 Quick Answer: Detect insider trading clusters using Python and free SEC EDGAR Form 4 data. Flag stocks where 3+ insiders buy within a 14-day window—historically, clustered insider purchases outperform the market by 7–10% annually.

    Before writing code, I needed to define what I was actually looking for. Not all insider buying is equal.

    Strong signals:

    • Open market purchases (transaction code P) — the insider spent their own money
    • Multiple different insiders buying within a 30-day window
    • Purchases by C-suite (CEO, CFO, COO) or directors — not mid-level VPs exercising options
    • Purchases larger than $50,000 — skin in the game matters

    Weak signals (I filter these out):

    • Option exercises (code M) — often automatic, not conviction
    • Gifts (code G) — tax planning, not bullish intent
    • Small purchases under $10,000 — could be a director fulfilling a minimum ownership requirement

    Setting Up the Python Environment

    You need exactly two packages:

    pip install edgartools pandas

    edgartools is an open-source Python library that wraps the SEC EDGAR API and parses the XML filings into clean Python objects. No API key required. It handles rate limiting, caching, and the various quirks of EDGAR’s data format. I’ve been using it for about six months and it’s saved me from writing a lot of painful XML parsing code.

    Here’s the core detection script:

    from edgar import Company, get_filings
    from datetime import datetime, timedelta
    from collections import defaultdict
    import pandas as pd
    
    def detect_insider_clusters(tickers, lookback_days=60,
                                min_insiders=2, min_value=50000):
        # Scan a list of tickers for insider buying clusters.
        # A cluster = multiple different insiders making open-market
        # purchases within a rolling 30-day window.
        clusters = []
    
        for ticker in tickers:
            try:
                company = Company(ticker)
                filings = company.get_filings(form="4")
    
                purchases = []
    
                for filing in filings.head(50):
                    form4 = filing.obj()
    
                    for txn in form4.transactions:
                        if txn.transaction_code != 'P':
                            continue
    
                        value = (txn.shares or 0) * (txn.price_per_share or 0)
                        if value < min_value:
                            continue
    
                        purchases.append({
                            'ticker': ticker,
                            'date': txn.transaction_date,
                            'insider': form4.reporting_owner_name,
                            'relationship': form4.reporting_owner_relationship,
                            'shares': txn.shares,
                            'price': txn.price_per_share,
                            'value': value
                        })
    
                if len(purchases) < min_insiders:
                    continue
    
                df = pd.DataFrame(purchases)
                df['date'] = pd.to_datetime(df['date'])
                df = df.sort_values('date')
    
                cutoff = datetime.now() - timedelta(days=lookback_days)
                recent = df[df['date'] >= cutoff]
    
                if len(recent) == 0:
                    continue
    
                unique_insiders = recent['insider'].nunique()
    
                if unique_insiders >= min_insiders:
                    total_value = recent['value'].sum()
                    clusters.append({
                        'ticker': ticker,
                        'insiders': unique_insiders,
                        'total_purchases': len(recent),
                        'total_value': total_value,
                        'earliest': recent['date'].min(),
                        'latest': recent['date'].max(),
                        'names': recent['insider'].unique().tolist()
                    })
    
            except Exception as e:
                print(f"Error processing {ticker}: {e}")
                continue
    
        return sorted(clusters, key=lambda x: x['insiders'], reverse=True)
    

    Scanning the S&P 500

    Running this against individual tickers is fine, but the real value is scanning broadly. I pull S&P 500 constituents from Wikipedia’s maintained list and run the detector daily:

    # Get S&P 500 tickers
    sp500 = pd.read_html(
        'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
    )[0]['Symbol'].tolist()
    
    # Takes about 15-20 minutes for 500 tickers
    # EDGAR rate limit is 10 req/sec — be respectful
    results = detect_insider_clusters(
        sp500,
        lookback_days=30,
        min_insiders=3,
        min_value=25000
    )
    
    for cluster in results:
        print(f"\n{cluster['ticker']}: {cluster['insiders']} insiders, "
              f"${cluster['total_value']:,.0f} total")
        for name in cluster['names']:
            print(f"  - {name}")
    

    When I first ran this in January, it flagged 4 companies with 3+ insider purchases in a rolling 30-day window. Two of them outperformed the S&P over the next quarter. That’s a small sample, but it matched the academic research I mentioned earlier.

    Adding Slack or Telegram Alerts

    A detector that only runs when you remember to open a terminal isn’t very useful. I run mine on a cron job (every morning at 7 AM ET) and have it push alerts to a Telegram channel:

    import requests
    
    def send_telegram_alert(cluster, bot_token, chat_id):
        msg = (
            f"🔔 Insider Cluster: ${cluster['ticker']}\n"
            f"Insiders buying: {cluster['insiders']}\n"
            f"Total value: ${cluster['total_value']:,.0f}\n"
            f"Window: {cluster['earliest'].strftime('%b %d')} - "
            f"{cluster['latest'].strftime('%b %d')}\n"
            f"Names: {', '.join(cluster['names'][:5])}"
        )
    
        requests.post(
            f"https://api.telegram.org/bot{bot_token}/sendMessage",
            json={"chat_id": chat_id, "text": msg}
        )
    

    You can also swap in Slack, Discord, or email. The detection logic stays the same — just change the notification transport.

    Performance Reality Check

    I want to be honest about what this tool can and can’t do.

    What works:

    • Catching cluster buys that I’d otherwise miss entirely. Most retail investors don’t read Form 4 filings.
    • Filtering out noise. The vast majority of insider transactions are option exercises, RSU vesting, and 10b5-1 plan sales — none of which signal much. This tool isolates the intentional purchases.
    • Speed. EDGAR filings appear within 24-48 hours of the transaction. For cluster detection (which builds over days or weeks), that latency doesn’t matter.

    What doesn’t work:

    • Single insider buys. One director buying $100K of stock might mean something, but the signal-to-noise ratio is low. Clusters are where the edge is.
    • Short-term trading. This isn’t a day-trading signal. The academic alpha shows up over 3-6 months.
    • Small caps with thin insider data. Some micro-caps only have 2-3 insiders total, so “cluster” detection becomes meaningless.

    Comparing Free Alternatives

    You don’t have to build your own. Here’s how the DIY approach stacks up:

    secform4.com — Free, decent UI, but no cluster detection. You see raw filings, not patterns. No API.

    Finnhub insider endpoint — Free tier includes /stock/insider-transactions, but limited to 100 transactions per call and 60 API calls/minute. Good for single-ticker lookups, not for scanning 500 tickers daily. I wrote about Finnhub and other finance APIs in my finance API comparison.

    OpenInsider.com — My favorite for manual browsing. Has a “cluster buys” filter built in. But no API, no automation, and the cluster definition isn’t configurable.

    The DIY edgartools approach wins if you want customizable filters, automated alerts, and the ability to pipe results into other tools (backtests, portfolio trackers, dashboards). It loses if you just want to glance at insider activity once a week — use OpenInsider for that.

    Running It 24/7 on a Raspberry Pi

    I run my scanner on a Raspberry Pi 5 that also handles a few other Python monitoring scripts. A Pi 5 with 8GB RAM handles this fine — peak memory usage is under 400MB even when scanning all 500 tickers. Total cost: about $80 for the Pi, a case, and an SD card. It’s been running since November without a restart.

    If you’d rather not manage hardware, any $5/month VPS works too. The script runs in about 20 minutes per scan and sleeps the rest of the day.

    Next Steps

    A few things I’m still experimenting with:

    • Combining with technical signals. An insider cluster at a 52-week low with RSI under 30 is more interesting than one at an all-time high. I wrote about RSI and other technical indicators if you want to add that layer.
    • Tracking 13F filings alongside Form 4s. If an insider is buying AND a major fund just initiated a position (visible in quarterly 13F filings), that’s a stronger signal. edgartools handles 13F parsing too.
    • Sector-level clustering. Sometimes multiple insiders across different companies in the same sector all start buying. That’s a sector-level signal I haven’t automated yet.

    If you want to go deeper into the quantitative side, Python for Finance by Yves Hilpisch (O’Reilly) covers the data pipeline and analysis patterns well. Full disclosure: affiliate link.

    The full source code for my detector is about 200 lines. Everything above is production-ready — I copy-pasted from my actual codebase. If you build something with it, I’d be curious to hear what you find.

    For daily market signals and insider activity alerts, join Alpha Signal on Telegram — free market intelligence, no paywall for the daily brief.

    📚 Related Reading

    Frequently Asked Questions

    What is an insider trading cluster?

    An insider trading cluster occurs when multiple insiders, such as directors or executives, make significant open-market purchases of their company’s stock within a 30-day period. These clusters are considered strong signals of potential stock performance.

    What data source is used to detect insider trading clusters?

    The data comes from SEC Form 4 filings, which disclose insider transactions. This information is freely available through the SEC’s EDGAR API.

    What tools and libraries are used in the detection process?

    The detection process uses Python along with the ‘edgartools’ library, which simplifies accessing and parsing SEC EDGAR data. Additionally, pandas is used for data manipulation.

    What criteria are used to filter strong insider trading signals?

    Strong signals include open-market purchases (transaction code P), purchases by C-suite executives or directors, transactions exceeding $50,000, and multiple insiders buying within 30 days. Weak signals, like option exercises or small purchases, are filtered out.

    References

  • RegexLab: Free Offline Regex Tester With 5 Modes Regex101 Doesn’t Have

    RegexLab: Free Offline Regex Tester With 5 Modes Regex101 Doesn’t Have

    Last week I was debugging a CloudFront log parser and pasted a chunk of raw access logs into Regex101. Mid-keystroke, I realized those logs contained client IPs, user agents, and request paths from production. All of it, shipped off to someone else’s server for “processing.”

    That’s the moment I decided to build my own regex tester.

    The Problem with Existing Regex Testers

    📌 TL;DR: Last week I was debugging a CloudFront log parser and pasted a chunk of raw access logs into Regex101. Mid-keystroke, I realized those logs contained client IPs, user agents, and request paths from production. All of it, shipped off to someone else’s server for “processing.
    🎯 Quick Answer: A privacy-first regex tester that runs entirely in the browser with zero server communication. Unlike Regex101, no input data is transmitted or logged—ideal for testing patterns against sensitive strings like API keys or PII.

    I looked at three tools I’ve used for years:

    Regex101 is the gold standard. Pattern explanations, debugger, community library — it’s feature-rich. But it sends every keystroke to their backend. Their privacy policy says they store patterns and test strings. If you’re testing regex against production data, config files, or anything containing tokens and IPs, that’s a problem.

    RegExr has a solid educational angle with the animated railroad diagrams. But the interface feels like 2015, and there’s no way to test multiple strings against the same pattern without copy-pasting repeatedly.

    Various Chrome extensions promise offline regex testing, but they request permissions to read all your browser data. I’m not trading one privacy concern for a worse one.

    What none of them do: let you define a set of test cases (this string SHOULD match, this one SHOULDN’T) and run them all at once. If you write regex for input validation, URL routing, or log parsing, you need exactly that.

    What I Built

    RegexLab is a single HTML file. No build step, no npm install, no backend. Open it in a browser and it works — including offline, since it registers a service worker.

    Three modes:

    Match mode highlights every match in real-time as you type. Capture groups show up color-coded below the result. If your pattern has named groups or numbered captures, you see exactly what each group caught.

    Replace mode gives you a live preview of string replacement. Type your replacement pattern (with $1, $2 backreferences) and see the output update instantly. I use this constantly for log reformatting and sed-style transforms.

    Multi-test mode is the feature I actually wanted. Add as many test cases as you need. Mark each one as “should match” or “should not match.” Run them all against your pattern and get a pass/fail report. Green checkmark or red X, instantly.

    This is what makes RegexLab different from Regex101. When I’m writing a URL validation pattern, I want to throw 15 different URLs at it — valid ones, edge cases with ports and fragments, obviously invalid ones — and see them all pass or fail in one view. No scrolling, no re-running.

    How It Works Under the Hood

    The entire app is ~30KB of HTML, CSS, and JavaScript. No frameworks, no dependencies. Here’s what’s happening technically:

    Pattern compilation: Every keystroke triggers a debounced (80ms) recompile. The regex is compiled with new RegExp(pattern, flags) inside a try/catch. Invalid patterns show the error message directly — no cryptic “SyntaxError,” just the relevant part of the browser’s error string.

    Match highlighting: I use RegExp.exec() in a loop with the global flag to find every match with its index position. Then I build highlighted HTML by slicing the original string at match boundaries and wrapping matches in <span class="hl"> tags. A safety counter at 10,000 prevents infinite loops from zero-length matches (a real hazard with patterns like .*).

    // Simplified version of the match loop
    const rxCopy = new RegExp(rx.source, rx.flags);
    let safety = 0;
    while ((m = rxCopy.exec(text)) !== null && safety < 10000) {
      matches.push({ start: m.index, end: m.index + m[0].length });
      if (m[0].length === 0) rxCopy.lastIndex++;
      safety++;
    }

    That lastIndex++ on zero-length matches is important. Without it, a pattern like /a*/g will match the empty string forever at the same position. Every regex tutorial skips this, and then people wonder why their browser tab freezes.

    Capture groups: When exec() returns an array with more than one element, elements at index 1+ are capture groups. I color-code them with four rotating colors (amber, pink, cyan, purple) and display them below the match result.

    Flag toggles: The flag buttons sync bidirectionally with the text input. Click a button, the text field updates. Type in the text field, the buttons update. I store flags as a simple string ("gim") and reconstruct it from button state on each click.

    State persistence: Everything saves to localStorage every 2 seconds — pattern, flags, test string, replacement, and all test cases. Reload the page and you’re right where you left off. The service worker caches the HTML for offline use.

    Common patterns library: 25 pre-built patterns for emails, URLs, IPs, dates, UUIDs, credit cards, semantic versions, and more. Click one to load it. Searchable. I pulled these from my own .bashrc aliases and validation functions I’ve written over the years.

    Design Decisions

    Dark mode by default via prefers-color-scheme. Most developers use dark themes. The light mode is there for the four people who don’t.

    Monospace everywhere that matters. Pattern input, test strings, results — all in SF Mono / Cascadia Code / Fira Code. Proportional fonts in regex testing are a war crime.

    No syntax highlighting in the pattern input. I considered it, but colored brackets and escaped characters inside an input field add complexity without much benefit. The error message and match highlighting already tell you if your pattern is right.

    Touch targets are 44px minimum. The flag toggle buttons, tab buttons, and action buttons all meet Apple’s HIG recommendation. I tested at 320px viewport width on my phone and everything still works.

    Real Use Cases

    Log parsing: I parse nginx access logs daily. A pattern like (\d+\.\d+\.\d+\.\d+).*?"(GET|POST)\s+([^"]+)"\s+(\d{3}) pulls IP, method, path, and status code. Multi-test mode lets me throw 10 sample log lines at it to make sure edge cases (HTTP/2 requests, URLs with quotes) don’t break it.

    Input validation: Building a form? Test your email/phone/date regex against a list of valid and invalid inputs in one shot. Way faster than manually testing each one.

    Search and replace: Reformatting dates from MM/DD/YYYY to YYYY-MM-DD? The replace mode with $3-$1-$2 backreferences shows you the result instantly.

    Teaching: The pattern library doubles as a learning resource. Click “Email” or “UUID” and see a production-quality regex with its flags and description. Better than Stack Overflow answers from 2012.

    Try It

    It’s live at regexlab.orthogonal.info. Works offline after the first visit. Install it as a PWA if you want it in your dock.

    If you want more tools like this — HashForge for hashing, JSON Forge for formatting JSON, QuickShrink for image compression — they’re all at apps.orthogonal.info. Same principle: single HTML file, zero dependencies, your data stays in your browser.

    Full disclosure: Mastering Regular Expressions by Jeffrey Friedl is the book that made regex click for me back in college. If you’re still guessing at lookaheads and backreferences, it’s worth the read. Regular Expressions Cookbook by Goyvaerts and Levithan is also solid if you want recipes rather than theory. And if you’re doing a lot of text processing, a good mechanical keyboard makes the difference when you’re typing backslashes all day.

    More Privacy-First Browser Tools

    References

    1. OWASP — “OWASP Testing Guide v4”
    2. Regex101 — “Privacy Policy”
    3. Mozilla Developer Network (MDN) — “Regular Expressions (RegEx)”
    4. GitHub — “Offline Regex Tester Repository”
    5. NIST — “Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)”
  • Self-Host Ollama: Local LLM Inference on Your Homelab

    Self-Host Ollama: Local LLM Inference on Your Homelab

    The $300/Month Problem

    📌 TL;DR: The $300/Month Problem I hit my OpenAI API billing dashboard last month and stared at $312.47. That’s what three months of prototyping a RAG pipeline cost me — and most of those tokens were wasted on testing prompts that didn’t work.
    🎯 Quick Answer: Self-hosting Ollama on a homelab with a used GPU can save over $300/month compared to OpenAI API costs. Run models like Llama 3 and Mistral locally with full data privacy and no per-token fees.

    I hit my OpenAI API billing dashboard last month and stared at $312.47. That’s what three months of prototyping a RAG pipeline cost me — and most of those tokens were wasted on testing prompts that didn’t work.

    Meanwhile, my TrueNAS box sat in the closet pulling 85 watts, running Docker containers I hadn’t touched in weeks. That’s when I started looking at Ollama — a dead-simple way to run open-source LLMs locally. No API keys, no rate limits, no surprise invoices.

    Three weeks in, I’ve moved about 80% of my development-time inference off the cloud. Here’s exactly how I set it up, what hardware actually matters, and the real performance numbers nobody talks about.

    Why Ollama Over vLLM, LocalAI, or text-generation-webui

    I tried all four. Here’s why I stuck with Ollama:

    vLLM is built for production throughput — batched inference, PagedAttention, the works. It’s also a pain to configure if you just want to ask a model a question. Setup took me 45 minutes and required building from source to get GPU support working on my machine.

    LocalAI supports more model formats (GGUF, GPTQ, AWQ) and has an OpenAI-compatible API out of the box. But the documentation is scattered, and I hit three different bugs in the Whisper integration before giving up.

    text-generation-webui (oobabooga) is great if you want a chat UI. But I needed an API endpoint I could hit from scripts and other services, and the API felt bolted on.

    Ollama won because: one binary, one command to pull a model, instant OpenAI-compatible API on port 11434. I had Llama 3.1 8B answering prompts in under 2 minutes from a cold start. That matters when you’re trying to build things, not babysit infrastructure.

    Hardware: What Actually Moves the Needle

    I’m running Ollama on a Mac Mini M2 with 16GB unified memory. Here’s what I learned about hardware that actually affects performance:

    Memory is everything. LLMs need to fit entirely in RAM (or VRAM) to run at usable speeds. A 7B parameter model in Q4_K_M quantization needs about 4.5GB. A 13B model needs ~8GB. A 70B model needs ~40GB. If the model doesn’t fit, it pages to disk and you’re looking at 0.5 tokens/second — basically unusable.

    GPU matters less than you think for models under 13B. Apple Silicon’s unified memory architecture means the M1/M2/M3 chips run these models surprisingly well — I get 35-42 tokens/second on Llama 3.1 8B with my M2. A dedicated NVIDIA GPU is faster (an RTX 3090 with 24GB VRAM will push 70+ tok/s on the same model), but the Mac Mini uses 15 watts doing it versus 350+ watts for the 3090.

    CPU-only is viable for small models. On a 4-core Intel box with 32GB RAM, I was getting 8-12 tokens/second on 7B models. Not great for chat, but perfectly fine for batch processing, embeddings, or code review pipelines where latency doesn’t matter.

    If you’re building a homelab inference box from scratch, here’s what I’d buy today:

    • Budget ($400-600): A used Mac Mini M2 with 16GB RAM runs 7B-13B models at very usable speeds. Power draw is laughable — 15-25 watts under inference load.
    • Mid-range ($800-1200): A Mac Mini M4 with 32GB lets you run 30B models and keeps two smaller models hot in memory. The M4 with 32GB unified memory is the sweet spot for most homelab setups.
    • GPU path ($500-900): If you already have a Linux box, grab a used RTX 3090 24GB — they’ve dropped to $600-800 and the 24GB VRAM handles 13B models at 70+ tok/s. Just make sure your PSU can handle the 350W draw.

    The Setup: 5 Minutes, Not Kidding

    On macOS or Linux:

    curl -fsSL https://ollama.com/install.sh | sh
    ollama serve &
    ollama pull llama3.1:8b

    That’s it. The model downloads (~4.7GB for the Q4_K_M quantized 8B), and you’ve got an API running on localhost:11434.

    Test it:

    curl http://localhost:11434/api/generate -d '{
      "model": "llama3.1:8b",
      "prompt": "Explain TCP three-way handshake in two sentences.",
      "stream": false
    }'

    For Docker (which is what I use on TrueNAS):

    docker run -d \
      --name ollama \
      -v ollama_data:/root/.ollama \
      -p 11434:11434 \
      --restart unless-stopped \
      ollama/ollama:latest

    Then pull your model into the running container:

    docker exec ollama ollama pull llama3.1:8b

    Real Benchmarks: What I Actually Measured

    I ran each model 10 times with the same prompt (“Write a Python function to merge two sorted lists with O(n) complexity, with docstring and type hints”) and averaged the results. Mac Mini M2, 16GB, nothing else running:

    Model Size (Q4_K_M) Tokens/sec Time to first token RAM used
    Llama 3.1 8B 4.7GB 38.2 0.4s 5.1GB
    Mistral 7B v0.3 4.1GB 41.7 0.3s 4.6GB
    CodeLlama 13B 7.4GB 22.1 0.8s 8.2GB
    Llama 3.1 70B (Q2_K) 26GB 3.8 4.2s 28GB*

    *The 70B model technically ran on 16GB with aggressive quantization but spent half its time swapping. I wouldn’t recommend it without 32GB+ RAM.

    For context: GPT-4o through the API typically returns 50-80 tokens/second, but you’re paying per token and dealing with rate limits. 38 tokens/second from a local 8B model is fast enough that you barely notice the difference when coding.

    Making It Useful: The OpenAI-Compatible API

    This is the part that made Ollama actually practical for me. It exposes an OpenAI-compatible endpoint at /v1/chat/completions, which means you can point any tool that uses the OpenAI SDK at your local instance by just changing the base URL:

    from openai import OpenAI
    
    client = OpenAI(
        base_url="http://192.168.0.43:11434/v1",
        api_key="not-needed"  # Ollama doesn't require auth
    )
    
    response = client.chat.completions.create(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Review this PR diff..."}]
    )
    print(response.choices[0].message.content)

    I use this for:

    • Automated code review — a git hook sends diffs to the local model before I push
    • Log analysis — pipe structured logs through a prompt that flags anomalies
    • Documentation generation — point it at a module and get decent first-draft docstrings
    • Embedding generationollama pull nomic-embed-text gives you a solid embedding model for RAG without paying per-token

    None of these need GPT-4 quality. A well-prompted 8B model handles them at 90%+ accuracy, and the cost is literally zero per request.

    Gotchas I Hit (So You Don’t Have To)

    Memory pressure kills everything. When Ollama loads a model, it stays in memory until another model evicts it or you restart the service. If you’re running other containers on the same box, set OLLAMA_MAX_LOADED_MODELS=1 to prevent two 8GB models from eating all your RAM and triggering the OOM killer.

    Network binding matters. By default Ollama only listens on 127.0.0.1:11434. If you want other machines on your LAN to use it (which is the whole point of a homelab setup), set OLLAMA_HOST=0.0.0.0. But don’t expose this to the internet — there’s no auth layer. Put it behind a reverse proxy with basic auth or Tailscale if you need remote access.

    Quantization matters more than model size. A 13B model at Q4_K_M often beats a 7B at Q8. The sweet spot for most use cases is Q4_K_M — it’s roughly 4 bits per weight, which keeps quality surprisingly close to full precision while cutting memory by 4x.

    Context length eats memory fast. The default context window is 2048 tokens. Bumping it to 8192 with ollama run llama3.1 --ctx-size 8192 roughly doubles memory usage. Plan accordingly.

    When to Stay on the Cloud

    I still use GPT-4o and Claude for anything requiring deep reasoning, long context, or multi-step planning. Local 8B models are not good at complex architectural analysis or debugging subtle race conditions. They’re excellent at well-scoped, repetitive tasks with clear instructions.

    The split I’ve landed on: cloud APIs for thinking, local models for doing. My API bill dropped from $312/month to about $45.

    What I’d Do Next

    If your homelab already runs Docker, adding Ollama takes 5 minutes and costs nothing. Start with llama3.1:8b for general tasks and nomic-embed-text for embeddings. If you find yourself using it daily (you will), consider dedicated hardware — a Mac Mini or a used GPU that stays on 24/7.

    The models are improving fast. Llama 3.1 8B today is better than Llama 2 70B was a year ago. By the time you read this, there’s probably something even better on Ollama’s model library. Pull it and try it — that’s the beauty of running your own inference server.

    Related Reading

    Full disclosure: Hardware links above are affiliate links.


    📡 Want daily market intelligence with the same no-BS approach? Join Alpha Signal on Telegram for free daily signals and analysis.

    References

    1. Ollama — “Ollama Documentation”
    2. GitHub — “LocalAI: OpenAI-Compatible API for Local Models”
    3. GitHub — “vLLM: A High-Throughput and Memory-Efficient Inference and Serving Library for LLMs”
    4. TrueNAS — “TrueNAS Documentation Hub”
    5. Docker — “Docker Official Documentation”

    Frequently Asked Questions

    What is Self-Host Ollama: Local LLM Inference on Your Homelab about?

    The $300/Month Problem I hit my OpenAI API billing dashboard last month and stared at $312.47. That’s what three months of prototyping a RAG pipeline cost me — and most of those tokens were wasted on

    Who should read this article about Self-Host Ollama: Local LLM Inference on Your Homelab?

    Anyone interested in learning about Self-Host Ollama: Local LLM Inference on Your Homelab and related topics will find this article useful.

    What are the key takeaways from Self-Host Ollama: Local LLM Inference on Your Homelab?

    Meanwhile, my TrueNAS box sat in the closet pulling 85 watts, running Docker containers I hadn’t touched in weeks. That’s when I started looking at Ollama — a dead-simple way to run open-source LLMs l

  • HashForge: Privacy-First Hash Generator for All Algos

    HashForge: Privacy-First Hash Generator for All Algos

    I’ve been hashing things for years — verifying file downloads, generating checksums for deployments, creating HMAC signatures for APIs. And every single time, I end up bouncing between three or four browser tabs because no hash tool does everything I need in one place.

    So I built HashForge.

    The Problem with Existing Hash Tools

    📌 TL;DR: I’ve been hashing things for years — verifying file downloads, generating checksums for deployments, creating HMAC signatures for APIs. And every single time, I end up bouncing between three or four browser tabs because no hash tool does everything I need in one place. So I built HashForge .
    Quick Answer: HashForge is a free, privacy-first hash generator that computes MD5, SHA-1, SHA-256, SHA-512, and HMAC simultaneously — entirely in your browser with zero server uploads. It replaces multiple single-algorithm tools with one unified interface.

    Here’s what frustrated me about the current space. Most online hash generators force you to pick one algorithm at a time. Need MD5 and SHA-256 for the same input? That’s two separate page loads. Browserling’s tools, for example, have a different page for every algorithm — MD5 on one URL, SHA-256 on another, SHA-512 on yet another. You’re constantly copying, pasting, and navigating.

    Then there’s the privacy problem. Some hash generators process your input on their servers. For a tool that developers use with sensitive data — API keys, passwords, config files — that’s a non-starter. Your input should never leave your machine.

    And finally, most tools feel like they were built in 2010 and never updated. No dark mode, no mobile responsiveness, no keyboard shortcuts. They work, but they feel dated.

    What Makes HashForge Different

    All algorithms at once. Type or paste text, and you instantly see MD5, SHA-1, SHA-256, SHA-384, and SHA-512 hashes side by side. No page switching, no dropdown menus. Every algorithm, every time, updated in real-time as you type.

    Four modes in one tool. HashForge isn’t just a text hasher. It has four distinct modes:

    • Text mode: Real-time hashing as you type. Supports hex, Base64, and uppercase hex output.
    • File mode: Drag-and-drop any file — PDFs, ISOs, executables, anything. The file never leaves your browser. There’s a progress indicator for large files and it handles multi-gigabyte files using the Web Crypto API’s native streaming.
    • HMAC mode: Enter a secret key and message to generate HMAC signatures for SHA-1, SHA-256, SHA-384, and SHA-512. Essential for API development and webhook verification.
    • Verify mode: Paste two hashes and instantly compare them. Uses constant-time comparison to prevent timing attacks — the same approach used in production authentication systems.

    100% browser-side processing. Nothing — not a single byte — leaves your browser. HashForge uses the Web Crypto API for SHA algorithms and a pure JavaScript implementation for MD5 (since the Web Crypto API doesn’t support MD5). There’s no server, no analytics endpoint collecting your inputs, no “we process your data according to our privacy policy” fine print. Your data stays on your device, period.

    Technical Deep Dive

    HashForge is a single HTML file — 31KB total with all CSS and JavaScript inline. Zero external dependencies. No frameworks, no build tools, no CDN requests. This means:

    • First paint under 100ms on any modern browser
    • Works offline after the first visit (it’s a PWA with a service worker)
    • No supply chain risk — there’s literally nothing to compromise

    The MD5 Challenge

    The Web Crypto API supports SHA-1, SHA-256, SHA-384, and SHA-512 natively, but not MD5. Since MD5 is still widely used for file verification (despite being cryptographically broken), I implemented it in pure JavaScript. The implementation handles the full MD5 specification — message padding, word array conversion, and all four rounds of the compression function.

    Is MD5 secure? No. Should you use it for passwords? Absolutely not. But for verifying that a file downloaded correctly? It’s fine, and millions of software projects still publish MD5 checksums alongside SHA-256 ones.

    Constant-Time Comparison

    The hash verification mode uses constant-time comparison. In a naive string comparison, the function returns as soon as it finds a mismatched character — which means comparing “abc” against “axc” is faster than comparing “abc” against “abd”. An attacker could theoretically use this timing difference to guess a hash one character at a time.

    HashForge’s comparison XORs every byte of both hashes and accumulates the result, then checks if the total is zero. The operation takes the same amount of time regardless of where (or whether) the hashes differ. This is the same pattern used in OpenSSL’s CRYPTO_memcmp and Node.js’s crypto.timingSafeEqual.

    PWA and Offline Support

    HashForge registers a service worker that caches the page on first visit. After that, it works completely offline — no internet required. The service worker uses a network-first strategy: it tries to fetch the latest version, falls back to cache if you’re offline. This means you always get updates when connected, but never lose functionality when you’re not.

    Accessibility

    Every interactive element has proper ARIA attributes. The tab navigation follows the WAI-ARIA Tabs Pattern — arrow keys move between tabs, Home/End jump to first/last. There’s a skip-to-content link for screen reader users. All buttons have visible focus states. Keyboard shortcuts (Ctrl+1 through Ctrl+4) switch between modes.

    Real-World Use Cases

    1. Verifying software downloads. You download an ISO and the website provides a SHA-256 checksum. Drop the file into HashForge’s File mode, copy the SHA-256 output, paste it into Verify mode alongside the published checksum. Instant verification.

    2. API webhook signature verification. Stripe, GitHub, and Slack all use HMAC-SHA256 to sign webhooks. When debugging webhook handlers, you can use HashForge’s HMAC mode to manually compute the expected signature and compare it against what you’re receiving. No need to write a throwaway script.

    3. Generating content hashes for ETags. Building a static site? Hash your content to generate ETags for HTTP caching. Paste the content into Text mode, grab the SHA-256, and you have a cache key.

    4. Comparing database migration checksums. After running a migration, hash the schema dump and compare it across environments. HashForge’s Verify mode makes this a two-paste operation.

    5. Quick password hash lookups. Not for security — but when you’re debugging and need to quickly check if two plaintext values produce the same hash (checking for normalization issues, encoding problems, etc.).

    What I Didn’t Build

    I deliberately left out some features that other tools include:

    • No bcrypt/scrypt/argon2. These are password hashing algorithms, not general-purpose hash functions. They’re intentionally slow and have different APIs. Mixing them in would confuse the purpose of the tool.
    • No server-side processing. Some tools offer an “API” where you POST data and get hashes back. Why? The browser can do this natively.
    • No accounts or saved history. Hash a thing, get the result, move on. If you need to save it, copy it. Simple tools should be simple.

    Try It

    HashForge is free, open-source, and runs entirely in your browser. Try it at hashforge.orthogonal.info.

    If you find it useful, support the project — it helps me keep building privacy-first tools.

    For developers: the source is on GitHub. It’s a single HTML file, so feel free to fork it, self-host it, or tear it apart to see how it works.

    Looking for more browser-based dev tools? Check out QuickShrink (image compression), PixelStrip (EXIF removal), and TypeFast (text snippets). All free, all private, all single-file.

    Looking for a great mechanical keyboard to speed up your development workflow? I’ve been using one for years and the tactile feedback genuinely helps with coding sessions. The Keychron K2 is my daily driver — compact 75% layout, hot-swappable switches, and excellent build quality. Also worth considering: a solid USB-C hub makes the multi-monitor developer setup much cleaner.

    Get Weekly Security & DevOps Insights

    Join 500+ engineers getting actionable tutorials on Kubernetes security, homelab builds, and trading automation. No spam, unsubscribe anytime.

    Subscribe Free →

    Delivered every Tuesday. Read by engineers at Google, AWS, and startups.

    References

    1. NIST — “Secure Hash Standard (SHS)”
    2. OWASP — “Cryptographic Storage Cheat Sheet”
    3. GitHub — “Browserling Hash Tools Repository”
    4. RFC Editor — “RFC 3174 – US Secure Hash Algorithm 1 (SHA1)”
    5. Mozilla Developer Network — “Web Crypto API”

    Frequently Asked Questions

    What is HashForge: Privacy-First Hash Generator for All Algos about?

    I’ve been hashing things for years — verifying file downloads, generating checksums for deployments, creating HMAC signatures for APIs. And every single time, I end up bouncing between three or four b

    Who should read this article about HashForge: Privacy-First Hash Generator for All Algos?

    Anyone interested in learning about HashForge: Privacy-First Hash Generator for All Algos and related topics will find this article useful.

    What are the key takeaways from HashForge: Privacy-First Hash Generator for All Algos?

    So I built HashForge . The Problem with Existing Hash Tools Here’s what frustrated me about the current space. Most online hash generators force you to pick one algorithm at a time.

Also by us: StartCaaS — AI Company OS · Hype2You — AI Tech Trends