Category: Tools & Setup

Tools & Setup is where orthogonal.info curates practical, battle-tested guides on developer productivity tools, CLI utilities, self-hosted software, and environment configuration. Whether you are bootstrapping a new development machine, evaluating self-hosted alternatives to SaaS products, or fine-tuning your terminal workflow, this category delivers step-by-step walkthroughs grounded in real-world experience. Every article is written with one goal: help you build a faster, more reliable, and more enjoyable development environment.

With over 25 in-depth posts and growing, Tools & Setup is one of the most active categories on the site — reflecting just how much time engineers spend (and save) by getting their tooling right from day one.

Key Topics Covered

Command-line productivity — Shell customization (Zsh, Fish, Starship), terminal multiplexers (tmux, Zellij), and CLI utilities like ripgrep, fd, fzf, and bat that supercharge daily workflows.
Self-hosted alternatives — Deploying and configuring tools like Gitea, Nextcloud, Vaultwarden, and Uptime Kuma so you own your data without sacrificing usability.
IDE and editor setup — Configuration guides for VS Code, Neovim, and JetBrains IDEs, including extension recommendations, keybindings, and remote development workflows.
Development environment automation — Using Ansible, Homebrew, Nix, dotfiles repositories, and container-based dev environments (Dev Containers, Devbox) to make setups reproducible.
Git workflows and tooling — Advanced Git techniques, hooks, aliases, and GUI clients that streamline version control for solo developers and teams alike.
API testing and debugging — Hands-on guides for curl, HTTPie, Postman, and browser DevTools to debug REST and GraphQL APIs efficiently.
Package and runtime management — Managing multiple language runtimes with asdf, mise, nvm, and pyenv, plus dependency management best practices.

Who This Content Is For
This category is designed for software engineers, DevOps practitioners, system administrators, and hobbyist developers who want to work smarter, not harder. Whether you are a junior developer setting up your first Linux workstation or a senior engineer optimizing a multi-machine workflow, you will find actionable advice that respects your time. The guides assume basic command-line comfort but explain advanced concepts clearly.

What You Will Learn
By exploring the articles in Tools & Setup, you will learn how to automate repetitive environment tasks so a fresh machine is productive in minutes, not days. You will discover modern CLI replacements for legacy Unix tools, understand how to evaluate self-hosted software against its SaaS equivalent, and gain confidence configuring complex development stacks. Each guide includes copy-paste commands, configuration snippets, and links to upstream documentation so you can adapt the advice to your own infrastructure.

Start browsing below to find your next productivity upgrade.

  • Why the Web Crypto API Won’t Compute MD5 (and How HashForge Does It in Your Browser)

    Last week I needed an MD5 checksum to verify a file against a vendor’s published manifest. Old habit kicked in: open devtools, reach for the Web Crypto API, type one line. It failed on the spot:

    await crypto.subtle.digest('MD5', new TextEncoder().encode('abc'))
    // DOMException: Algorithm: Unrecognized name MD5

    No MD5. Not deprecated-with-a-warning — just absent, like it was never on the menu. That single rejection is the whole reason HashForge, the in-browser hash generator I keep bookmarked, ships its own MD5 routine instead of asking the browser. Here’s why the browser says no, and how HashForge works around it without uploading your file anywhere.

    The Web Crypto API blocks MD5 on purpose

    The digest side of the Web Crypto API supports exactly four algorithms: SHA-1, SHA-256, SHA-384, and SHA-512. That list is fixed in the W3C spec. MD5 isn’t missing because nobody filed a ticket — the working group left it out, along with MD4, because shipping a broken hash through an API named “crypto” invites people to misuse it.

    MD5 has had practical collision attacks since 2004, when Wang and Yu produced two different inputs with the same digest by hand-tuning the message. By 2008 researchers used MD5 collisions to forge a rogue CA certificate. The hash is finished for anything where an attacker controls the input.

    Here’s the part I find funny: the browser still lets you compute SHA-1, which Google and CWI fully collided in 2017 with the SHAttered attack. SHA-1 stayed in the spec for backward compatibility with existing protocols. MD5 never made the cut at all. The vendors drew a line, and MD5 landed on the wrong side of it.

    I agree with that call for new code. The catch is that the rest of us still bump into MD5 constantly, and almost never for security:

    • Vendor downloads still publish an MD5 next to the file
    • S3 ETags are the MD5 of the object for single-part uploads
    • Legacy rows store md5(email) for Gravatar-style lookups
    • Plenty of internal tools fingerprint content with MD5 because it’s fast and short

    So you hit a wall. The data is MD5, the browser refuses to compute MD5, and you would rather not paste a confidential file into some random “free MD5 online” site that ships it off to a server you’ve never audited.

    How HashForge fills the gap

    HashForge splits the work in two. For the SHA family it calls the native API — fast, audited, hardware-accelerated on most machines:

    const ALGOS = ['MD5','SHA-1','SHA-256','SHA-384','SHA-512'];
    
    async function hashText(text, algos, enc='hex'){
      const encoded = new TextEncoder().encode(text);
      const out = {};
      for (const algo of algos){
        if (algo === 'MD5'){
          out[algo] = formatHash(md5(encoded.buffer), enc);     // pure JS
        } else {
          const hash = await crypto.subtle.digest(algo, encoded); // native
          out[algo] = formatHash(hash, enc);
        }
      }
      return out;
    }

    For MD5 it falls back to a self-contained JavaScript implementation — the classic safeAdd / bitRotateLeft / md5cmn routine you’ve seen in a dozen libraries, working directly on an ArrayBuffer. No dependency, no network call, a couple hundred lines of code.
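
    The hashText snippet above calls formatHash without showing it. A minimal sketch of what such a helper could look like, assuming it receives an ArrayBuffer and an encoding of either 'hex' or 'base64' (the body below is a guess for illustration, not HashForge's actual code):

```javascript
// Hypothetical helper: turn a digest ArrayBuffer into a hex or Base64 string.
function formatHash(buffer, enc = 'hex') {
  const bytes = new Uint8Array(buffer);
  if (enc === 'base64') {
    // Digests are at most 64 bytes, so spreading into fromCharCode is safe here.
    return btoa(String.fromCharCode(...bytes));
  }
  // Two lowercase hex digits per byte.
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}
```

    Because both the native digest and a pure-JS MD5 can hand back raw bytes, one formatter like this serves every algorithm.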

    Why MD5 is small enough to ship inline

    MD5 is a Merkle–Damgård construction. It pads the message to a multiple of 512 bits, then chews through it one 512-bit block at a time, updating four 32-bit state words across 64 operations grouped into 4 rounds. The whole thing is integer addition, bit rotation, and a handful of boolean mixing functions. That’s it — no S-boxes, no lookup tables, no big constants beyond a sine-derived table you can generate in one line.
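
    To make that description concrete, here is a compact sketch of the algorithm as laid out in RFC 1321. It is not HashForge's routine: the names md5Hex, SHIFTS, and K are made up for this example, and it takes a Uint8Array and returns hex directly.

```javascript
// Per-round left-rotation amounts (four values per round, reused four times).
const SHIFTS = [7, 12, 17, 22, 5, 9, 14, 20, 4, 11, 16, 23, 6, 10, 15, 21];
// The sine-derived table: K[i] = floor(2^32 * |sin(i + 1)|).
const K = Array.from({ length: 64 }, (_, i) => Math.floor(2 ** 32 * Math.abs(Math.sin(i + 1))));

function md5Hex(bytes) {
  // Padding: a 0x80 byte, zeros, then the bit length as a 64-bit little-endian integer.
  const padded = new Uint8Array((((bytes.length + 8) >> 6) + 1) * 64);
  padded.set(bytes);
  padded[bytes.length] = 0x80;
  const view = new DataView(padded.buffer);
  view.setUint32(padded.length - 8, (bytes.length * 8) >>> 0, true);
  view.setUint32(padded.length - 4, Math.floor(bytes.length / 0x20000000), true);

  // Four 32-bit state words, initialised to the standard constants.
  let a0 = 0x67452301, b0 = 0xefcdab89, c0 = 0x98badcfe, d0 = 0x10325476;
  for (let off = 0; off < padded.length; off += 64) {
    let a = a0, b = b0, c = c0, d = d0;
    for (let i = 0; i < 64; i++) {
      let f, g; // f: boolean mixing function, g: which message word to use
      if (i < 16) { f = (b & c) | (~b & d); g = i; }
      else if (i < 32) { f = (d & b) | (~d & c); g = (5 * i + 1) % 16; }
      else if (i < 48) { f = b ^ c ^ d; g = (3 * i + 5) % 16; }
      else { f = c ^ (b | ~d); g = (7 * i) % 16; }
      const sum = (a + f + K[i] + view.getUint32(off + g * 4, true)) >>> 0;
      const s = SHIFTS[(i >> 4) * 4 + (i % 4)];
      a = d; d = c; c = b;
      b = (b + ((sum << s) | (sum >>> (32 - s)))) >>> 0; // add the rotated sum
    }
    a0 = (a0 + a) >>> 0; b0 = (b0 + b) >>> 0;
    c0 = (c0 + c) >>> 0; d0 = (d0 + d) >>> 0;
  }
  // The digest is the four state words written out little-endian.
  const out = new DataView(new ArrayBuffer(16));
  [a0, b0, c0, d0].forEach((word, i) => out.setUint32(i * 4, word, true));
  return Array.from(new Uint8Array(out.buffer), (x) => x.toString(16).padStart(2, '0')).join('');
}
```

    Calling md5Hex(new TextEncoder().encode('abc')) is a quick way to compare a sketch like this against the published test vectors.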

    Because the algorithm is so plain, a correct MD5 fits in a few kilobytes of minified JavaScript. SHA-512 by hand would be heavier and slower in JS (it needs 64-bit arithmetic, which plain JavaScript numbers don’t provide), which is exactly why HashForge doesn’t reimplement the SHA family: the native crypto.subtle path is both faster and already vetted. You only drop to hand-rolled code for the one algorithm the platform won’t give you.

    The privacy detail that actually matters

    Files go through the same split. The page reads the file with file.arrayBuffer() and hands the raw bytes straight to either the native digest or the JS MD5:

    const buf  = await file.arrayBuffer();
    const hash = await crypto.subtle.digest('SHA-256', buf);

    That arrayBuffer() call is the whole privacy story. The bytes are read into memory inside your tab and never touch a network socket. Open the Network panel while you hash a 200 MB ISO and you’ll see zero requests. Pull your wifi and it keeps working, because there was never a server in the loop. Compare that to the typical “online hash calculator,” which POSTs your file to a backend and trusts you to believe their retention policy.

    Verify the output yourself in ten seconds

    Don’t take my word that the MD5 path is correct — a hash tool that quietly mis-pads is worse than no tool. Hash the empty string and abc, then check against the canonical test vectors:

    MD5("")        = d41d8cd98f00b204e9800998ecf8427e
    MD5("abc")     = 900150983cd24fb0d6963f7d28e17f72
    SHA-256("abc") = ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

    Type abc into HashForge and you’ll get those exact bytes. I cross-checked them against md5sum and sha256sum on a Linux box before trusting the tool with anything real. Two-minute habit, and it catches a surprising number of broken implementations.

    HMAC is native-only, and that’s the right limit

    One place HashForge refuses to fill a gap: HMAC. It offers HMAC-SHA1/256/384/512 and stops there, because Web Crypto’s importKey plus sign('HMAC', ...) only accepts the SHA family. There’s no HMAC-MD5 button.

    That’s correct, not lazy. If you’re computing an HMAC you’re authenticating something, and HMAC-MD5 has no place in new code. The tool steers you to SHA-256 by simply not offering the broken option — the same stance the browser takes on raw MD5, applied one layer up.

    Which hash for which job

    A quick field guide, because this question comes up every week:

    • Matching a published checksum: use whatever the publisher used, MD5 or SHA-256. You’re catching accidental corruption, not an attacker, so a broken hash is fine here.
    • Content fingerprint, cache key, dedup: SHA-256 if you have a free choice; MD5 only to match an existing system.
    • Passwords: none of these. Use Argon2 or bcrypt. A raw SHA-256 of a password is still a leak waiting to happen.
    • Tokens and signatures: HMAC-SHA256 at minimum.

    If you want the actual math behind why MD5 fell and SHA-256 holds, Serious Cryptography by Jean-Philippe Aumasson is the clearest book I’ve found on collision attacks without drowning you in proofs. For the engineering side — where each primitive shows up in TLS, signatures, and storage — Real-World Cryptography by David Wong is the one I lend out most. Full disclosure: both are Amazon affiliate links.

    Why I keep it bookmarked

    The pitch is narrow and that’s the point. I need a hash, I can’t install a CLI on a locked-down work laptop, and I really don’t want to upload a file to a stranger’s server. HashForge does that one job: it computes all five digests at once, outputs hex or Base64, and runs on a text string or a dropped file. It pairs with the other browser-only tools I reach for — Base64Lab when I need to decode a token and PassForge when I need a random key — none of which phone home.

    Try it: HashForge. Hash something, open your Network tab, and watch nothing happen.


    Join https://t.me/alphasignal822 for free market intelligence.

  • How a Secure Password Generator Actually Works (and Why Math.random() Fails)

    Last week I was reviewing a small auth service and found this one-liner generating reset tokens:

    const token = Array.from({length: 16}, () =>
      CHARS[Math.floor(Math.random() * CHARS.length)]
    ).join('');

    It runs. It produces things like xK9$mLp2@nQ7vR4w. It also happens to be a real security bug. That exact pattern is the one I deliberately avoided when I built our free password generator — and the reason is worth 1,200 words, because almost every “roll your own” password snippet on the web gets it wrong in the same way.

    Here’s what’s broken about Math.random() for passwords, the fix, and the two gotchas that bite people who try to fix it themselves.

    Math.random() is predictable by design

    In V8, the engine behind Chrome and Node, Math.random() has used an algorithm called xorshift128+ since version 4.9.40, shipped in late 2015. (Before that it was MWC1616, which was worse: only about 2^32 possible outputs.) xorshift128+ has 128 bits of internal state, a period of 2^128 − 1, and it passes the TestU01 statistical suite. Statistically, the numbers look random.

    But “looks random” and “unpredictable” are different properties. xorshift128+ is a pseudo-random generator: every output is a deterministic function of that 128-bit state. And the state is recoverable. Feed enough consecutive outputs into a system of linear equations and you can solve for the internal state — there are public tools on GitHub that recover it from as few as 64 to 128 consecutive Math.random() calls. Once an attacker has the state, every future output is known. Every “random” password you generate after that point is predictable.
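
    A small sketch shows what "deterministic function of the state" means in practice. This is the textbook xorshift128+ step written with BigInt; V8 turns its state into a double differently, so treat it as an illustration of the principle rather than a copy of the engine:

```javascript
const MASK64 = (1n << 64n) - 1n;

// One xorshift128+ step. `state` is a two-element array of 64-bit BigInts.
function xorshift128plus(state) {
  let s1 = state[0];
  const s0 = state[1];
  state[0] = s0;
  s1 ^= (s1 << 23n) & MASK64; // keep the left shift inside 64 bits
  s1 ^= s1 >> 17n;
  s1 ^= s0;
  s1 ^= s0 >> 26n;
  state[1] = s1;
  return (state[0] + state[1]) & MASK64;
}

// Anyone holding a copy of the state produces the same "random" numbers.
const victim = [0x1234n, 0x5678n]; // arbitrary seed for the demo
const attacker = [...victim];      // stands in for a recovered state
const same = xorshift128plus(victim) === xorshift128plus(attacker);
```

    There is no secret anywhere in that function besides the two state words, which is why recovering them is the whole attack.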

    For a UI animation or a Monte Carlo sim, who cares. For a password, an API key, or a session token, that’s the whole ballgame.

    crypto.getRandomValues() is the actual fix

    Browsers ship a cryptographically secure RNG (CSPRNG) through the Web Crypto API: crypto.getRandomValues(). It pulls from the operating system’s entropy pool (/dev/urandom on Linux, BCryptGenRandom on Windows) and is built so that observing past output tells you nothing about future output. There’s no recoverable 128-bit state to solve for.

    The function our generator uses is four lines:

    function secureRandom(max) {
      const arr = new Uint32Array(1);
      crypto.getRandomValues(arr);
      return arr[0] % max;
    }

    Read a fresh 32-bit unsigned integer from the CSPRNG, reduce it into the range you need, done. Swap Math.random() for this and the prediction attack above is gone. But notice that % max — that’s gotcha number one.

    Gotcha 1: modulo bias is real (but size matters)

    When you take a random integer modulo your alphabet size, the ranges usually don’t divide evenly, so some characters come up more often than others. I wanted to see how bad it actually is, so I generated 6.2 million random bytes and bucketed byte % 62 (a typical alphanumeric set):

    expected per character:  100,000
    lowest-frequency char:   ~96,900 hits
    highest-frequency char: ~121,400 hits
    ratio: 1.25

    That’s a 25% skew. It happens because 256 % 62 = 8: the eight leftover byte values, 248–255, wrap around onto the first eight characters, so each of those characters gets five chances out of 256 while the rest get four. With a single byte feeding a 62- or 94-character set, the bias is large and easy to measure.

    The textbook fix is rejection sampling: throw away any byte in the biased tail and draw again. Rejecting values ≥ 248 dropped the skew to a 1.02 ratio in my test, at the cost of discarding about 3.1% of draws.

    But here’s the part the “always use rejection sampling” advice skips: the bias depends entirely on how big your random integer is relative to the alphabet. Our generator doesn’t read a single byte — it reads a full Uint32 (range 0 to about 4.29 billion). For a 94-character symbol set, Uint32 % 94 makes the favored characters more likely by roughly 1 part in 45 million — a bias of 0.0000022%. For a password, that’s noise far below anything that matters. So I skipped rejection sampling on purpose and kept the code simple, because a 32-bit draw already makes the bias irrelevant. If I were minting cryptographic keys I’d add the rejection step; for human passwords, a wide draw is enough.
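
    The arithmetic behind both numbers fits in a few lines. A sketch with a hypothetical moduloBias helper: given the size of the random range and the alphabet, it reports how many values every character gets, how many characters get one extra, and the resulting ratio.

```javascript
// How uneven is `value % alphabetSize` when value is uniform over rangeSize values?
function moduloBias(rangeSize, alphabetSize) {
  const base = Math.floor(rangeSize / alphabetSize); // values every character gets
  const favored = rangeSize % alphabetSize;          // characters that get one extra
  const ratio = favored === 0 ? 1 : (base + 1) / base;
  return { base, favored, ratio };
}

const byteCase = moduloBias(256, 62);     // single byte, alphanumeric set
const wordCase = moduloBias(2 ** 32, 94); // Uint32, 94-character symbol set
```

    For the byte case this gives base 4, 8 favored characters, and a ratio of 1.25. For the Uint32 case it gives base 45,691,141 and 42 favored characters, so the favored ones are ahead by one part in roughly 45.7 million.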

    Gotcha 2: the 64KB quota wall

    The second surprise showed up while I was running that bias test. My first attempt asked getRandomValues() to fill one big buffer:

    crypto.getRandomValues(new Uint8Array(620000));
    // QuotaExceededError: The requested length exceeds 65,536 bytes

    getRandomValues() refuses any request over 65,536 bytes (64 KB) in a single call. It’s in the spec and every browser enforces it. If you’re generating one 16-character password you’ll never hit it, but the moment you batch-generate or fill a large buffer, you have to chunk:

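    function fillSecure(buf) {
      // subarray() counts elements, so convert the 64 KB byte quota to elements
      const step = 65536 / buf.BYTES_PER_ELEMENT;
      for (let i = 0; i < buf.length; i += step) {
        crypto.getRandomValues(buf.subarray(i, i + step));
      }
    }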

    Undocumented in most tutorials, and a hard failure rather than a silent one — which is at least honest of it.

    Why browser-only matters here

    Our generator runs entirely in your browser. The password is built on your machine from your OS entropy and never touches a network. That’s not a tagline — it’s the only design that makes sense for a secret. A “password generator” that does the work server-side is a service that has seen your password in plaintext, which is the same trust problem I wrote about with online SQL formatters quietly logging queries. Open the dev tools, watch the Network tab while you click generate, and you’ll see exactly zero requests.

    You can try it here: the orthogonal.info password generator. Slide to 16+ characters, toggle the symbol set, copy, done.

    One layer is never enough

    A strong, truly-random password fixes the “guessable” problem. It does nothing about phishing, reused credentials, or a leaked database. After the LastPass mess I moved my own vault into KeePassXC and put a hardware key on every account that supports one. A YubiKey 5 NFC turns a stolen password into a useless string, because login also needs the physical key in my pocket. Full disclosure: that’s an affiliate link — but it’s also literally what’s on my keyring. Generate unique passwords, store them in a real manager, and gate the important accounts with hardware 2FA. Three cheap layers beat one strong one.

    The lesson I keep relearning: in security, the code that “works” and the code that’s correct are often the same length and completely different. Math.random() works. crypto.getRandomValues() is correct.



  • How EXIF GPS Data Is Stored in a JPEG — A Byte-Level Teardown

    Last week I wanted to prove a point to a friend who insisted his vacation photos were “fine to post.” So I opened one of his JPEGs in a hex editor, scrolled about 40 bytes in, and read his hotel’s GPS coordinates straight off the screen — no tools, no library, just the raw bytes. That’s the thing nobody tells you about EXIF: it isn’t encrypted, hashed, or hidden. It’s sitting near the front of almost every photo your phone takes, in a format you can decode by hand once you know the layout. This post is the byte-level teardown, and at the end I’ll show why PixelStrip removes that data without touching a single pixel.

    A JPEG is just a stream of markers

    Every JPEG starts with two bytes: FF D8, the Start Of Image marker. After that the file is a sequence of segments, and every segment begins with FF followed by a marker byte. The one we care about is FF E1 — that’s APP1, where EXIF lives.

    Here’s the front of a real photo, annotated:

    FF D8              SOI (start of image)
    FF E1              APP1 marker  <- EXIF starts here
    00 84              segment length = 0x0084 = 132 bytes (big-endian, always)
    45 78 69 66 00 00  "Exif\0\0"
    49 49              "II" = Intel / little-endian byte order
    2A 00              42, the TIFF magic number
    08 00 00 00        offset to first IFD = 8

    Two details trip people up here. First, that segment-length field is always big-endian, because it’s part of the JPEG container, not the EXIF payload. Second, the byte order flag (II for little-endian, MM for big-endian) only applies to everything after the Exif\0\0 header. From that point on, every multi-byte number flips based on those two bytes.

    The TIFF header and IFD entries

    What follows Exif\0\0 is a tiny TIFF file. All internal offsets are measured from the start of the byte-order mark — not the start of the file. Forget that and every pointer you read lands in the wrong place. I’ve debugged this exact off-by-six error more times than I’d like to admit.

    The 4-byte offset (here 08 00 00 00 = 8) points to the first Image File Directory, or IFD0. An IFD is dead simple:

    • 2 bytes: how many entries follow
    • 12 bytes per entry
    • 4 bytes at the end: offset to the next IFD (0 means stop)

    Each 12-byte entry breaks down as: a 2-byte tag ID, a 2-byte data type, a 4-byte value count, and a 4-byte field that holds either the value itself (if it fits in 4 bytes) or an offset to where the value actually lives. GPS coordinates don’t fit in 4 bytes, so they’re always stored by offset.

    The tag we hunt for in IFD0 is 0x8825 — the GPS IFD pointer. Its value is an offset to a separate sub-directory holding the location tags. Jump there and you find the payload.
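
    Walking an IFD is mostly bookkeeping. A sketch of the layout just described, with illustrative names (readIfd, findGpsIfdOffset) and the assumption that `view` is a DataView and `tiffStart` is the position of the byte-order mark:

```javascript
// Read one IFD: a 2-byte entry count, 12 bytes per entry, then a 4-byte next-IFD offset.
function readIfd(view, tiffStart, ifdOffset, littleEndian) {
  const start = tiffStart + ifdOffset; // offsets are relative to the TIFF header
  const count = view.getUint16(start, littleEndian);
  const entries = new Map();
  for (let i = 0; i < count; i++) {
    const pos = start + 2 + i * 12;
    entries.set(view.getUint16(pos, littleEndian), {
      type: view.getUint16(pos + 2, littleEndian),
      count: view.getUint32(pos + 4, littleEndian),
      // Read as a 32-bit number; for small types this field holds the value itself.
      valueOrOffset: view.getUint32(pos + 8, littleEndian),
    });
  }
  const next = view.getUint32(start + 2 + count * 12, littleEndian);
  return { entries, next };
}

// Follow the header's first-IFD offset and look up the GPS IFD pointer (tag 0x8825).
function findGpsIfdOffset(view, tiffStart, littleEndian) {
  const firstIfd = view.getUint32(tiffStart + 4, littleEndian);
  const gps = readIfd(view, tiffStart, firstIfd, littleEndian).entries.get(0x8825);
  return gps ? gps.valueOrOffset : null;
}
```

    The returned offset is again relative to the TIFF header, so the GPS sub-IFD can be read with the same readIfd call.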

    Decoding latitude by hand

    The GPS sub-IFD uses a handful of tags. The important ones:

    • 0x0001 GPSLatitudeRef — ASCII “N” or “S”
    • 0x0002 GPSLatitude — three RATIONAL values: degrees, minutes, seconds
    • 0x0003 GPSLongitudeRef — “E” or “W”
    • 0x0004 GPSLongitude — three more RATIONALs

    A RATIONAL is two 4-byte unsigned integers: a numerator followed by a denominator. So latitude is three of them — 24 bytes total. Here’s the actual block from that photo, little-endian:

    25 00 00 00  01 00 00 00   ->  37 / 1   = 37 degrees
    2E 00 00 00  01 00 00 00   ->  46 / 1   = 46 minutes
    C4 0B 00 00  64 00 00 00   ->  3012 / 100 = 30.12 seconds

    Convert degrees-minutes-seconds to decimal: 37 + 46/60 + 30.12/3600 = 37.7750° N. Pair that with the longitude block and you have a point accurate to roughly three meters. That’s precise enough to land on a specific building. My friend went quiet after I read his back.
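
    The same decode in code, run against the 24 bytes shown above. The helper names are made up for this sketch; readRational mirrors the two-reads-and-a-divide idea used later in the post:

```javascript
// The latitude block from the example: three little-endian RATIONALs.
const block = new Uint8Array([
  0x25, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, // 37 / 1
  0x2E, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, // 46 / 1
  0xC4, 0x0B, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00, // 3012 / 100
]);

// A RATIONAL is a 32-bit numerator followed by a 32-bit denominator.
function readRational(view, pos, littleEndian) {
  return view.getUint32(pos, littleEndian) / view.getUint32(pos + 4, littleEndian);
}

// Degrees, minutes, seconds plus a hemisphere letter to signed decimal degrees.
function dmsToDecimal(degrees, minutes, seconds, ref) {
  const value = degrees + minutes / 60 + seconds / 3600;
  return ref === 'S' || ref === 'W' ? -value : value;
}

const view = new DataView(block.buffer);
const [deg, min, sec] = [0, 8, 16].map((pos) => readRational(view, pos, true));
const latitude = dmsToDecimal(deg, min, sec, 'N');
```

    Here latitude.toFixed(4) gives "37.7750", matching the hand calculation; a ref of 'S' or 'W' flips the sign.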

    A 40-line parser in the browser

    You don’t need a library to do this. Browser DataView reads typed values out of an ArrayBuffer with explicit endianness, which is exactly what EXIF needs. Here’s the core of finding the APP1 segment and its byte order:

    function findExif(view) {
      let offset = 2; // skip the FF D8 SOI
      while (offset + 4 <= view.byteLength) { // need marker + length bytes
        if (view.getUint8(offset) !== 0xFF) break;
        const marker = view.getUint8(offset + 1);
        const size = view.getUint16(offset + 2); // big-endian on purpose
        // APP1 also carries XMP, so require the "Exif" signature as well
        const isExif = marker === 0xE1 && offset + 12 <= view.byteLength &&
          view.getUint32(offset + 4) === 0x45786966; // "Exif"
        if (isExif) {
          const tiff = offset + 10;            // skip marker, length, "Exif\0\0"
          const le = view.getUint16(tiff) === 0x4949; // "II"
          return { tiff, littleEndian: le, app1Start: offset, size };
        }
        offset += 2 + size; // jump to the next segment
      }
      return null;
    }

    Note that getUint16 defaults to big-endian, which is correct for the JPEG segment length. Once you have the littleEndian flag, you pass it to every read inside the TIFF block. Reading a RATIONAL is two reads and a divide:

    function readRational(view, pos, le) {
      return view.getUint32(pos, le) / view.getUint32(pos + 4, le);
    }

    That’s the whole trick. Walk the IFD entries, find tag 0x8825, jump to the GPS sub-IFD, pull the latitude and longitude rationals, and apply the N/S/E/W sign. About 40 lines, no dependencies, runs offline.

    Two ways to strip it — and why they differ

    Now the part that actually matters. There are two ways to remove this metadata, and they are not equal.

    Re-encode the whole image. Draw the photo onto a <canvas> and call toBlob(). The new file is built from raw pixels, so it carries no EXIF at all. Clean — but every pixel gets recompressed, which means slight quality loss and a completely different byte layout. That’s the approach my QuickShrink compressor takes, and I wrote up the mechanics in how browser image compression actually works. Good when you also want a smaller file.

    Splice out the segment. If all you want is to delete the metadata and keep the image untouched, you cut the APP1 segment out of the byte stream and leave everything else identical:

    const out = new Uint8Array(bytes.byteLength - (2 + size));
    out.set(bytes.subarray(0, app1Start));
    out.set(bytes.subarray(app1Start + 2 + size), app1Start);

    The pixels stay bit-for-bit identical. No recompression, no quality loss, no visible change — just the location data gone. That’s what PixelStrip does.

    One gotcha worth knowing: a single JPEG can carry more than one metadata block. EXIF lives in APP1, but XMP often rides in a second APP1, Photoshop data sits in APP13, and the EXIF thumbnail in IFD1 can hold its own copy of the GPS tags. A parser that removes only the first APP1 it sees will miss the rest. A real stripper loops over every APPn segment, which is the unglamorous part most “remove EXIF” snippets skip.
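
    A sketch of that loop, under stated assumptions: it drops every APP1 (EXIF, XMP) and APP13 (Photoshop) segment and keeps everything else, including APP2, which commonly holds the ICC colour profile. The function name and the default marker list are choices made for this example, not PixelStrip's implementation.

```javascript
// Copy a JPEG while skipping the listed APPn segments; image data is not touched.
function stripMetadataSegments(bytes, markers = [0xE1, 0xED]) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const kept = [bytes.subarray(0, 2)]; // FF D8 start-of-image
  let offset = 2;
  while (offset + 4 <= bytes.length && bytes[offset] === 0xFF) {
    const marker = bytes[offset + 1];
    if (marker === 0xDA) break; // start of scan: compressed pixel data follows
    const size = view.getUint16(offset + 2); // big-endian, counts its own 2 bytes
    const end = offset + 2 + size;
    if (!markers.includes(marker)) kept.push(bytes.subarray(offset, end));
    offset = end;
  }
  kept.push(bytes.subarray(offset)); // scan data and trailer, byte-for-byte

  // Stitch the kept pieces back together.
  const out = new Uint8Array(kept.reduce((total, part) => total + part.length, 0));
  let pos = 0;
  for (const part of kept) {
    out.set(part, pos);
    pos += part.length;
  }
  return out;
}
```

    Stopping at the start-of-scan marker matters: past that point the bytes are entropy-coded image data, not length-prefixed segments.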

    What to actually do with this

    If you only remember one rule: platforms are inconsistent. Twitter and iMessage scrub metadata on upload; Discord, email attachments, Slack file shares, and most forums pass it through untouched. Assume the worst and clean photos before they leave your machine.

    For a one-click clean that keeps your image quality intact, drop the photo into PixelStrip — it runs entirely in your browser, so the file never uploads anywhere, and it surgically removes EXIF, GPS, and XMP without recompressing. If you want the privacy reasoning rather than the byte layout, I covered that in how to strip EXIF data before sharing. The rest of the browser tools follow the same no-upload rule.

    If you want to go deeper than a hex editor, file-format forensics books cover exactly this kind of byte-level metadata extraction across image, document, and filesystem formats — a solid digital forensics reference is what I keep on the shelf for the weird edge cases (full disclosure: Amazon affiliate link). It’s the difference between guessing at an offset and knowing why it’s there.



  • How Browser Image Compression Actually Works (Canvas API, toBlob, and Why Your JPEGs Shrink)

    Last week a teammate asked me why our little browser tool, QuickShrink, could take a 4.2 MB phone photo and hand back a 380 KB file that looked identical, all without uploading a single byte to a server. He assumed there was some clever backend doing the heavy lifting. There isn’t. It’s about 40 lines of JavaScript and a canvas API that first shipped around 2013 and is now in every major engine. I want to walk through exactly what happens between the file picker and the download link, because once you understand it, you stop trusting upload-based compressors that ship your private photos to someone else’s box.

    The whole pipeline is three steps

    Browser image compression with the Canvas API comes down to: decode the image into pixels, paint those pixels onto a canvas, then re-encode the canvas at a chosen quality. That’s it. Here’s the core of what QuickShrink runs, stripped to the essentials:

    async function compress(file, quality = 0.8) {
      // 1. Decode: turn the file bytes into a bitmap
      const bitmap = await createImageBitmap(file);
    
      // 2. Paint: draw the bitmap onto a canvas
      const canvas = document.createElement('canvas');
      canvas.width = bitmap.width;
      canvas.height = bitmap.height;
      const ctx = canvas.getContext('2d');
      ctx.drawImage(bitmap, 0, 0);
    
      // 3. Re-encode: read pixels back out as a compressed blob
      return new Promise((resolve) => {
        canvas.toBlob(resolve, 'image/jpeg', quality);
      });
    }

    The magic is in step three. When you call canvas.toBlob(callback, 'image/jpeg', 0.8), the browser runs its native JPEG encoder over the raw RGBA pixels sitting in the canvas buffer. That 0.8 is the quality factor, a number between 0 and 1, and it maps to the same quantization-table scaling that libjpeg uses under the hood. Lower the number, the encoder throws away more high-frequency detail, and the file shrinks.

    Why the file gets smaller without looking worse

    JPEG compression is lossy and it exploits a fact about human vision: we’re bad at noticing small changes in color and fine detail, but good at noticing changes in brightness and edges. The encoder splits the image into 8×8 pixel blocks, runs a discrete cosine transform on each, and then quantizes the result — rounding off the coefficients that represent detail your eye won’t miss.

    The quality factor controls how aggressive that rounding is. At 0.92 you’re barely touching anything visible. At 0.8 the file is a fraction of its original size and almost nobody can tell in a blind test. Drop to 0.6 and you’ll start seeing ringing around hard edges; text on a screenshot is where it shows up first. I settled on 0.8 as the default after eyeballing a few hundred photos. It’s the knee of the curve where you get most of the size savings before quality visibly drops.

    Real numbers from a real photo set

    I ran a batch of 20 photos straight off a Pixel 8 — landscapes, indoor shots, a couple of screenshots — through the canvas pipeline at different quality settings. Average original size was 3.8 MB per file. Here’s what came out:

    • quality 0.92 — avg 1.9 MB, about 50% reduction, visually lossless
    • quality 0.80 — avg 720 KB, about 81% reduction, no visible loss on normal viewing
    • quality 0.60 — avg 410 KB, about 89% reduction, slight softening on text edges
    • quality 0.40 — avg 280 KB, about 93% reduction, obvious artifacts
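
    A sweep like this is easy to script. The sketch below is one way it could be done, not the harness used for the numbers above: sweepQualities and toBlobAsync are invented names, and the function takes any object with a canvas-style toBlob method so it can be exercised outside a browser.

```javascript
// Promise wrapper around the callback-style canvas.toBlob.
function toBlobAsync(canvas, type, quality) {
  return new Promise((resolve) => canvas.toBlob(resolve, type, quality));
}

// Encode the same canvas at several qualities and report size and percent saved.
async function sweepQualities(canvas, originalBytes, qualities = [0.92, 0.8, 0.6, 0.4]) {
  const rows = [];
  for (const quality of qualities) {
    const blob = await toBlobAsync(canvas, 'image/jpeg', quality);
    if (!blob) continue; // toBlob passes null when encoding fails
    rows.push({
      quality,
      bytes: blob.size,
      reduction: Math.round((1 - blob.size / originalBytes) * 100),
    });
  }
  return rows;
}
```

    The reduction column is just 1 minus the size ratio: 720 KB against a 3.8 MB original is about 19% of the bytes, hence the 81% figure.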

    The reason phone photos compress this well is that they start out barely compressed. Camera apps save at quality 0.95 or higher to avoid complaints, and they bake in fat EXIF blocks with GPS coordinates, lens data, and an embedded thumbnail. Re-encoding at 0.8 and dropping the metadata is where most of the savings come from. (If the metadata part interests you, I wrote a separate piece on how EXIF leaks your home address and how to strip it.)

    The gotcha nobody warns you about: createImageBitmap vs Image

    The old way to decode an image was to create an <img> element, set its src to an object URL, and wait for the onload event. It works, but it decodes on the main thread and blocks your UI while it runs. On a big panorama, that’s a visible freeze.

    // Old way - blocks the main thread
    const img = new Image();
    img.onload = () => ctx.drawImage(img, 0, 0);
    img.src = URL.createObjectURL(file);

    createImageBitmap() is the better path. It decodes off the main thread, returns a promise, and gives you an ImageBitmap that draws to canvas faster because it’s already in a GPU-friendly format. On the 20-photo batch above, switching from the Image approach to createImageBitmap cut total processing time from 4.1 seconds to 1.6 seconds. If you build anything that compresses more than one file, use it.

    One real gotcha: createImageBitmap ignores EXIF orientation by default. Photos shot in portrait can come out sideways. You fix it by passing { imageOrientation: 'from-image' } as the second argument, which most engines now honor:

    const bitmap = await createImageBitmap(file, { imageOrientation: 'from-image' });

    WebP is where the real wins are

    JPEG is the safe default, but if you don’t need to email the file to someone on a 2014 device, WebP beats it badly. Same canvas, same code, you just change the MIME type:

    canvas.toBlob(resolve, 'image/webp', 0.8);

    On my test set, WebP at quality 0.8 came out to an average of 480 KB versus JPEG’s 720 KB at the same setting, another third smaller for the same perceived quality. Every browser shipped in the last six years decodes WebP, so the compatibility argument is mostly dead unless you’re targeting ancient hardware. Encoding is a separate question: if an engine can’t encode the type you ask for, toBlob falls back to PNG, so check blob.type on the result before trusting the output. The one place I still reach for JPEG is when the recipient is going to drag the file into some old desktop app that chokes on WebP.
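
    One way to make the WebP path safe is to check what actually came back. A sketch with hypothetical names (encodePreferWebP, toBlobAsync); it relies only on the standard behaviour that toBlob produces PNG when it cannot encode the requested type:

```javascript
// Promise wrapper around the callback-style canvas.toBlob.
function toBlobAsync(canvas, type, quality) {
  return new Promise((resolve) => canvas.toBlob(resolve, type, quality));
}

// Try WebP first; if the engine could not encode it, fall back to JPEG.
async function encodePreferWebP(canvas, quality = 0.8) {
  const webp = await toBlobAsync(canvas, 'image/webp', quality);
  if (webp && webp.type === 'image/webp') return webp;
  // An unsupported type comes back as image/png, which ignores the quality setting.
  return toBlobAsync(canvas, 'image/jpeg', quality);
}
```

    Without the type check, an engine without a WebP encoder would quietly hand back a lossless PNG that can be larger than the original photo.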

    Why “browser-only” is the part that matters

    Here’s the bit I care about most. Because every step — decode, paint, re-encode — runs inside createImageBitmap and canvas.toBlob, the image never leaves the tab. There’s no fetch, no upload, no server log with your file sitting in it. You can literally open the network tab in DevTools, compress a photo, and watch zero requests fire. Pull your ethernet cable and it still works.

    That’s not true of most “free online image compressor” sites. They POST your file to a backend, compress it there, and hand back a URL. Which means a copy of your photo — with its original GPS metadata if they don’t strip it — lives on a machine you don’t control, for however long their retention policy says, or doesn’t say. For a meme, who cares. For a photo of a document, a whiteboard with company internals, or a picture taken inside your house, that’s a real leak. I’ve gotten paranoid enough about this that I treat every upload-based dev tool as a potential logging endpoint, which is the same reason I wrote about why you should stop pasting sensitive data into online dev tools.

    Try it, or build your own

    If you just want the result, QuickShrink is the tool — drag a photo in, pick a quality, download. No account, no upload, no tracking. If you want to build your own, the code above is the whole idea; wrap it in a drag-drop handler and a quality slider and you’re done in an afternoon.

    The hardware angle matters too. Canvas re-encoding is CPU-bound and single-image-fast, but if you batch-process hundreds of RAW or high-res files, a machine with more cores and fast storage makes the difference between seconds and minutes. I do my bulk photo work on an SSD-backed box, and a good portable drive like the Samsung T7 Shield portable SSD is what I use to shuttle large photo libraries between machines without waiting on a slow USB stick. Full disclosure: that’s an Amazon affiliate link — it’s the drive I actually use.

    The takeaway: browser image compression isn’t magic and it isn’t a backend. It’s a 13-year-old canvas API, a quality number between 0 and 1, and the choice to keep your pixels on your own machine. Once you know how the pipeline works, the upload-based tools start looking like a strictly worse deal.


    Join https://t.me/alphasignal822 for free market intelligence.

  • Your Online SQL Formatter Might Be Logging Your Database Password

    Last month I watched a contractor paste a full Kubernetes secret manifest — base64 blobs and all — into the first “free YAML validator” that came up on Google. He just wanted to check indentation. What he actually did was POST a production database password to a server he’d never heard of, run by people he’ll never meet, with a privacy policy he didn’t read.

    That’s the part of online dev tools nobody talks about. A SQL formatter, a YAML validator, a JSON beautifier — they feel disposable, like a calculator. But a huge number of them send whatever you paste to a backend for processing. If that paste contains a connection string, an API key, or a customer record, you just leaked it. No breach required. You handed it over.

    Why “format my SQL” is a data exfiltration path

    Here’s the mechanic. Server-side tools work like this: your text goes into a textarea, JavaScript fires an HTTP request to /api/format, the server runs the actual formatting, and the result comes back. Simple to build, which is exactly why so many sites do it that way.

    The problem is what travels in that request body. I tested a handful of popular online formatters with my browser’s Network tab open. Several of them sent the entire input payload to their own domain. One sent it to a third-party API. The query I pasted was harmless test data, but the request was real — my text left my machine.

    Now picture the realistic version. You’re debugging a failing migration at 11pm. You copy the offending query straight out of your ORM logs to “just clean it up.” That query has a hardcoded credential a teammate left in six months ago. You paste, you format, you move on. The credential is now in someone’s request logs, maybe their analytics, maybe an LLM training pipeline if the tool resells data. You will never know.

    This isn’t paranoia. It’s the same threat model that makes pasting code into random pastebins a fireable offense at most security-conscious shops. We just don’t apply it to “format” tools because they feel too small to matter.

    The browser-only alternative

    The fix is structural, not procedural. Don’t rely on remembering to scrub secrets first — use tools that physically can’t send your data anywhere, because all the work happens in your tab.

    That’s the whole reason I built our formatters as single-file, client-side apps. When you use the SQL Formatter, the YAML Validator, or the Diff Checker, the parsing and formatting runs in JavaScript on your device. There is no /api/format endpoint. There’s no backend at all. The text in your textarea never crosses the network, because there’s nowhere for it to go.

    For a diff tool this matters even more. People routinely paste two versions of a config file — say, a working .env and a broken one — to spot what changed. Those files are nothing but secrets. A browser-only diff means you can compare two API keys character by character without either one leaving your laptop.
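
    To show how little machinery a local diff needs, here is a minimal Python sketch built on the standard library's difflib. The file contents and key names are made up for illustration, and nothing in it touches the network.

```python
import difflib

def diff_env(working: str, broken: str) -> list[str]:
    """Return unified-diff lines between two .env-style texts, computed locally."""
    return list(difflib.unified_diff(
        working.splitlines(),
        broken.splitlines(),
        fromfile="working.env",
        tofile="broken.env",
        lineterm="",
    ))

# Hypothetical file contents, for illustration only
working = "API_KEY=sk_test_abc123\nDB_HOST=localhost\n"
broken = "API_KEY=sk_test_abc128\nDB_HOST=localhost\n"

for line in diff_env(working, broken):
    print(line)
```

    The same comparison works on any two strings you already have in memory, which is the point: a diff needs no server.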

    How to actually verify a tool is client-side

    Don’t take any tool’s word for it, including mine. Verifying is a two-minute job and every developer should know how.

    1. Watch the Network tab. Open DevTools (F12), go to the Network panel, clear it, then paste your text and hit format. If you see a new XHR or fetch request fire with your input in the payload, the tool is server-side. If nothing happens on the network, the work is local.

    // What a server-side formatter looks like in Network tab:
    POST /api/format-sql
    Request Payload: { "query": "SELECT * FROM users WHERE token='sk_live_...'" }
    
    // What a client-side tool looks like:
    // (nothing — no request fires when you click format)

    2. Kill your connection. The bluntest test there is. Load the page, then turn off Wi-Fi or drop into airplane mode. If the tool still formats your text, it’s running entirely in the browser. If it spins or errors, it needed a server. I do this with any tool before I trust it with anything sensitive.

    3. Check for a service worker. Truly offline-capable tools register a service worker so they work with no connection at all. In DevTools, look under Application → Service Workers. Its presence is a strong signal the developer designed for offline-first, which usually means client-side processing too.

    Where this fits in a real workflow

    A few concrete cases where I reach for browser-only tools specifically because of the data:

    • Reviewing a teammate’s config PR. Diffing two Helm values files that contain registry credentials — done locally, nothing logged anywhere.
    • Cleaning up a query from prod logs. Format it to read it, without shipping whatever sensitive WHERE clause it carries to a stranger’s server.
    • Validating a CI secrets file. Checking that a GitHub Actions YAML parses before you commit, without exposing the encrypted values to a validation API.
    • On a locked-down network. Some client environments block external dev-tool domains entirely. Offline-capable tools just keep working.

    The broader point: treat every “paste your text here” box as a potential outbound network call until you’ve proven otherwise. Most of the time it’s fine. The one time it isn’t, it’s a leaked credential you can’t un-leak.
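
    As a rough local safety net, a few regular expressions can flag obviously secret-looking text before it goes anywhere near a paste box. This is a sketch under loud assumptions: the pattern names and regexes below are illustrative and far from exhaustive, and a clean result does not prove the text is safe.

```python
import re

# Illustrative patterns only; real secret scanners ship far larger rule sets
SECRET_PATTERNS = {
    "stripe-style key": re.compile(r"\bsk_(?:live|test)_[A-Za-z0-9]{8,}"),
    "aws access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[=:]\s*\S+"),
    "url with credentials": re.compile(r"\b[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s@]+@"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of every pattern that matches somewhere in text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

# Made-up query in the spirit of the Network tab example above
query = "SELECT * FROM users WHERE token='sk_live_abcdefgh12345678'"
print(find_secrets(query))
```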

    Defense in depth still applies

    Browser-only tools remove one exfiltration path, but they don’t make you immune to the dumber failure modes — like a secret sitting in your shell history or git log in the first place. If you handle credentials daily, a hardware key cuts a whole class of phishing and credential-theft risk off at the knees. I use a YubiKey 5 Series for exactly this (full disclosure: affiliate link, but it’s the same key I carry on my own keyring). Pair that with the pre-commit secret scanning setup I wrote about earlier, and you’ve closed the two most common ways credentials walk out the door.

    Start with the small habit, though. Next time you reach for an online formatter or diff tool, open the Network tab first. If your text leaves the browser, find one that keeps it home.



  • I Switched to KeePassXC After LastPass Got Breached — Here’s My Setup

    Last December I got the email every LastPass user dreaded: my vault backup was part of the breach. The master password was strong, but knowing encrypted blobs of my entire digital life were sitting on some attacker’s disk made me physically uncomfortable. I spent a weekend migrating everything to KeePassXC, and six months later I’m not going back.

    Why Local-First Matters for Passwords

    The LastPass breach exposed a fundamental problem with cloud password managers: your encrypted vault is only as safe as the infrastructure storing it. LastPass used 100,100 PBKDF2 iterations for newer accounts, while older accounts had as few as 5,000. At that lower setting, a weak master password is crackable with a decent GPU rig.
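
    The iteration count matters because every password guess costs the attacker one full key derivation. A minimal sketch with Python's hashlib makes the scaling visible; the password and salt are made up, and PBKDF2-HMAC-SHA256 with a 32-byte output is an assumption chosen for illustration.

```python
import hashlib
import time

def derive_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)

def time_one_guess(iterations: int) -> float:
    """Seconds spent on a single password guess at this iteration count."""
    start = time.perf_counter()
    derive_key(b"correct horse battery staple", b"example-salt", iterations)
    return time.perf_counter() - start

for iterations in (5_000, 100_100):
    print(f"{iterations:>7} iterations: {time_one_guess(iterations) * 1000:.1f} ms per guess")
```

    Absolute timings depend on the CPU, but the ratio between the two settings should come out at roughly 20 to 1, which is also the factor by which the attacker's cost per guess changes.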

    KeePassXC stores everything in a single .kdbx file on your machine. No servers, no breach notifications, no third-party trust. The file uses AES-256 or ChaCha20 encryption with Argon2d key derivation — you control the iteration count, memory usage, and parallelism. I run mine at 64MB memory / 10 iterations / 4 threads, which takes about 1 second to unlock on my laptop but would cost serious money to brute-force.

    The Setup That Actually Works Day-to-Day

    The knock against local password managers has always been “but what about sync?” Fair point. Here’s how I solved it without trusting anyone else with my vault:

    # My .kdbx lives in a Syncthing folder shared between:
    # - Work laptop (Linux)
    # - Personal desktop (Windows)
    # - Phone (via Syncthing + KeePassDX on Android)
    
    ~/.local/share/syncthing/vault/
    ├── passwords.kdbx
    └── passwords.kdbx.key   # key file (separate from master password)

    Syncthing handles peer-to-peer sync over my local network and WireGuard tunnel when I’m away. The vault never touches anyone else’s servers. Conflict resolution? KeePassXC handles .kdbx merge conflicts natively since version 2.7 — it’ll prompt you to merge changes if two devices edited simultaneously.

    Hardware Key as Second Factor

    This is where it gets good. KeePassXC supports YubiKey challenge-response as an additional key factor. My unlock requires:

    1. Master password (memorized, 6 random words)
    2. Key file (stored only on my devices, never synced to cloud)
    3. YubiKey HMAC-SHA1 challenge-response (slot 2)

    Setting this up:

    # Program YubiKey slot 2 for HMAC-SHA1 challenge-response
    ykman otp chalresp --generate 2
    
    # In KeePassXC: Database → Database Security → Add Additional Protection
    # Select "Challenge-Response" → pick your YubiKey

    An attacker who steals my .kdbx file needs all three factors. Even if they get my laptop with the key file, they still need the physical YubiKey and the password. I keep a backup YubiKey 5 NFC in my safe — $50 for peace of mind that I won’t lock myself out.
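
    For intuition, HMAC-SHA1 challenge-response boils down to one keyed hash. The sketch below simulates it in software with Python's hmac module. The secret here is a random stand-in; on a real YubiKey it stays on the device and cannot be read back, and the challenge value is random purely for illustration.

```python
import hashlib
import hmac
import os

def respond(secret: bytes, challenge: bytes) -> bytes:
    """Compute the HMAC-SHA1 response a key holding `secret` would return."""
    return hmac.new(secret, challenge, hashlib.sha1).digest()

# Stand-in for the 20-byte secret programmed into slot 2 (hypothetical value)
secret = os.urandom(20)

challenge = os.urandom(32)             # sent to the key at unlock time
response = respond(secret, challenge)  # always 20 bytes for SHA-1

print(len(response), response.hex())
```

    Without the secret, knowing the challenge is useless: a different secret or a different challenge yields an unrelated response.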

    Browser Integration Without the Extension Tax

    KeePassXC’s browser integration works through a native messaging host — no network calls, no cloud sync of browser state. I tested fill speed across three setups:

    Setup                          Fill latency   Memory overhead
    1Password (extension)          180-400ms      ~85MB
    Bitwarden (extension)          120-300ms      ~60MB
    KeePassXC (native messaging)   30-80ms        ~12MB

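    KeePassXC fills faster because the browser add-on hands each request to a small native-messaging proxy, which relays it to the running desktop app over a local socket (a Unix socket on Linux and macOS). There are no HTTP round-trips and no vault logic running inside the extension. The browser add-on is just a thin UI layer.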

    # Enable browser integration (Linux)
    # KeePassXC → Tools → Settings → Browser Integration
    # Check "Enable browser integration"
    # Check "Firefox" and/or "Chromium"
    # It writes the native messaging manifest automatically to:
    # ~/.mozilla/native-messaging-hosts/org.keepassxc.keepassxc_browser.json
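
    Native messaging itself is a simple framing protocol: each message is a 4-byte length in the machine's native byte order, followed by UTF-8 JSON. The sketch below shows only that framing. The payload is hypothetical; the messages KeePassXC actually exchanges are encrypted and shaped differently.

```python
import json
import struct

def encode_message(payload: dict) -> bytes:
    """Frame a JSON message: 4-byte native-endian length, then UTF-8 JSON."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body

def decode_message(frame: bytes) -> dict:
    """Reverse of encode_message for one complete frame."""
    (length,) = struct.unpack("=I", frame[:4])
    return json.loads(frame[4:4 + length].decode("utf-8"))

# Hypothetical payload, for illustration only
frame = encode_message({"action": "ping"})
print(frame[:4].hex(), decode_message(frame))
```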

    Honest Comparison: KeePassXC vs The Cloud Options

    vs Bitwarden — Bitwarden is the closest competitor and genuinely good. It’s open source, self-hostable (Vaultwarden), and the free tier is generous. I’d recommend it to anyone who doesn’t want to manage sync themselves. The tradeoff: you’re trusting their server-side encryption implementation, or running your own server (which means patching, backups, certificates). KeePassXC has no server component to maintain or secure.

    vs 1Password — Polished UI, great team features, expensive ($36/year individual, $60/year family). The “Secret Key” system is clever — it means 1Password can’t decrypt your vault even with a breach. But it’s closed source. You’re trusting their claims. For a solo developer who reads source code, that’s a non-starter for me.

    vs LastPass — Just don’t. After the 2022 breach, the 2023 follow-up showing employee vaults were compromised, and the consistently slow response times… there’s no reason to trust them with anything sensitive.

    The One Thing That Annoys Me

    Mobile is worse than cloud managers. Full stop. KeePassDX on Android works, but auto-fill is flaky on some apps, and you need to manually trigger sync if you added a password on desktop 30 seconds ago. I’ve accepted this tradeoff — I add most passwords on desktop anyway, and the security model is worth the occasional inconvenience on mobile.

    Migration Script

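    If you’re coming from LastPass, Bitwarden, or 1Password, KeePassXC imports CSV exports directly. Here’s the cleanup script I run on the CSV to remove duplicate entries. It expects Username, URL, and Password columns, so adjust the names to match your export:

    #!/usr/bin/env python3
    """Cleanup for a password-manager CSV export.
    Removes duplicate entries (same username + site) and writes a cleaned CSV.
    Usage: cleanup.py export.csv cleaned.csv"""
    import csv, sys
    from urllib.parse import urlparse
    
    def normalize_url(url):
        # Reduce a URL to scheme://host so /login and /signin on one site match
        parsed = urlparse(url)
        return f"{parsed.scheme}://{parsed.netloc}".lower()
    
    seen = {}
    with open(sys.argv[1], newline='', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            key = (row.get('Username',''), normalize_url(row.get('URL','')))
            # On a collision, keep the entry with the longer password
            if key not in seen or len(row.get('Password','')) > len(seen[key].get('Password','')):
                seen[key] = row
    
    # Write the surviving rows back out with the original column order
    with open(sys.argv[2], 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction='ignore')
        writer.writeheader()
        writer.writerows(seen.values())
    
    print(f"Deduplicated: {len(seen)} unique entries")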

    My Recommendation

    If you’re a developer comfortable with file management and want zero cloud trust for your passwords: KeePassXC + Syncthing + YubiKey is the strongest setup I’ve found. Total cost: $50 for the YubiKey (plus a backup), everything else is free and open source.

    If you want something that “just works” across devices without any setup: Bitwarden free tier. No shame in that — it’s genuinely good software.

    For more tools and privacy-focused workflows, check out our security guides and tools section.


    Full disclosure: Amazon links above are affiliate links (tag=orthogonalinf-20). I bought my YubiKeys at full price before writing this.


  • Your Photos Are Broadcasting Your Home Address — How EXIF Metadata Works and How to Strip It

    Last month I helped a friend figure out why a stalker knew her daily routine. The answer wasn’t in what her Instagram stories showed. It was in the metadata baked into every original JPEG she shared: GPS coordinates, timestamps accurate to the second, even her phone model. Instagram strips EXIF on upload, but she’d been sharing originals in a group chat first.

    🔒 Strip EXIF the easy way — without uploading your photos

    Re-saving an image through a client-side tool removes embedded GPS and EXIF metadata automatically, because the file is rebuilt fresh in your browser. QuickShrink does exactly that: it compresses and re-encodes your images entirely in your browser — nothing is ever uploaded to a server, so your location data never leaves your device.


    Most developers know EXIF exists. Fewer know exactly what’s in there, how to parse it programmatically, or how to strip it without degrading image quality. I spent a weekend building a browser-based EXIF stripper that never uploads your files, and learned more about the JPEG binary format than I expected.

    What EXIF Actually Contains (It’s Worse Than You Think)

    EXIF (Exchangeable Image File Format) lives in the APP1 marker segment of JPEG files, right after the SOI (Start of Image) marker, the two bytes FF D8 that open every JPEG. The structure follows the TIFF IFD (Image File Directory) format: a linked list of tagged key-value pairs.

    Here’s what a typical iPhone photo contains:

    GPS Latitude: 37.7749 N
    GPS Longitude: 122.4194 W
    GPS Altitude: 12.3m above sea level
    DateTime Original: 2026:05:20 14:32:07
    Make: Apple
    Model: iPhone 15 Pro Max
    Lens: iPhone 15 Pro Max back camera 6.765mm f/1.78
    Software: 18.4.1
    Orientation: Rotate 90 CW
    Focal Length: 6.765mm (equiv 24mm)
    Exposure: 1/120s at f/1.78, ISO 50
    Unique Image ID: 4A3B2C1D-...

    That sample is only the highlights: a single photo carries 40+ fields. The GPS data alone is accurate to about 3 meters with modern phones. Post enough photos from your apartment and anyone with exiftool can pinpoint your building.
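
    One detail worth knowing when reading raw EXIF: GPS coordinates are stored as three rationals (degrees, minutes, seconds) plus a reference letter, not as the decimal numbers most tools display. A minimal Python sketch of the conversion follows. The input tuples are made-up values chosen to land on the sample coordinates above.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref: str) -> float:
    """Convert EXIF (degrees, minutes, seconds) rationals to signed decimal degrees.

    `dms` is three (numerator, denominator) pairs, as stored in the GPS IFD.
    `ref` is the matching GPSLatitudeRef / GPSLongitudeRef: N, S, E or W.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in decimal notation
    return float(-value if ref in ("S", "W") else value)

# Made-up rationals for illustration
lat = dms_to_decimal(((37, 1), (46, 1), (2964, 100)), "N")
lon = dms_to_decimal(((122, 1), (25, 1), (984, 100)), "W")
print(round(lat, 4), round(lon, 4))
```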

    The Binary Structure: Parsing EXIF in JavaScript

    If you want to strip EXIF without re-encoding (which would lose quality), you need to understand the byte layout. A JPEG with EXIF looks like this:

    FF D8          - SOI marker (Start of Image)
    FF E1 [len]   - APP1 marker (EXIF data lives here)
      45 78 69 66 00 00  - "Exif\0\0" header
      [TIFF header + IFD entries + GPS sub-IFD]
    FF E0 [len]   - APP0 marker (JFIF, optional)
    FF DB [len]   - DQT (quantization tables)
    FF C0 [len]   - SOF (frame header)
    FF DA [len]   - SOS (Start of Scan)
    ...            - compressed image data
    FF D9          - EOI marker

    The key insight: you can remove the entire APP1 segment without touching image pixels. The compressed image data follows the SOS (Start of Scan) marker, FF DA, and is completely independent of the metadata. Here’s the core logic I use:

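    function stripExif(arrayBuffer) {
      const view = new DataView(arrayBuffer);
      // Not a JPEG (no SOI marker): hand the input back untouched
      if (view.byteLength < 4 || view.getUint16(0) !== 0xFFD8) return arrayBuffer;
    
      const segments = [];
      let offset = 2;
      let foundScan = false;
    
      // Walk the marker segments that precede the compressed scan data
      while (offset + 4 <= view.byteLength) {
        const marker = view.getUint16(offset);
        // Every segment starts with 0xFF; anything else means a malformed file
        if ((marker & 0xFF00) !== 0xFF00) return arrayBuffer;
        if (marker === 0xFFDA) {
          // SOS: everything from here to EOI is image data, copy it verbatim
          segments.push(arrayBuffer.slice(offset));
          foundScan = true;
          break;
        }
        // Segment length is big-endian and includes its own two bytes
        const segLen = view.getUint16(offset + 2);
        // Drop APP1 (EXIF, XMP) and APP13 (Photoshop/IPTC); keep the rest
        if (marker !== 0xFFE1 && marker !== 0xFFED) {
          segments.push(arrayBuffer.slice(offset, offset + 2 + segLen));
        }
        offset += 2 + segLen;
      }
      // No scan found: the file is truncated or unusual, so leave it alone
      if (!foundScan) return arrayBuffer;
    
      // Reassemble: SOI, the kept segments, then the untouched scan data
      const soi = new Uint8Array([0xFF, 0xD8]);
      const parts = [soi, ...segments.map(s => new Uint8Array(s))];
      const result = new Uint8Array(parts.reduce((a, p) => a + p.length, 0));
      let pos = 0;
      for (const part of parts) {
        result.set(part, pos);
        pos += part.length;
      }
      return result.buffer;
    }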

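    This approach is lossless: zero re-encoding, zero quality loss. The output file is typically 5-50KB smaller than the input because you’re removing the metadata block entirely. One side effect to know about: the Orientation tag lives in EXIF too, so a photo that relied on it can display sideways after stripping.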
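
    A quick way to check a stripped file is to walk its segments again and confirm no APP1 or APP13 marker survives. The sketch below does that in Python rather than JavaScript, purely as an offline verification aid. The function names are hypothetical, and it assumes a well-formed file whose segments all carry a length field.

```python
import struct

def list_segments(data: bytes) -> list[str]:
    """Return the marker of each segment up to and including SOS, e.g. ['FFE0', 'FFDA']."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    markers = []
    offset = 2
    while offset + 4 <= len(data):
        marker, length = struct.unpack(">HH", data[offset:offset + 4])
        markers.append(f"{marker:04X}")
        if marker == 0xFFDA:  # SOS: compressed data follows, stop walking
            break
        offset += 2 + length  # length field counts itself but not the marker
    return markers

def has_metadata(data: bytes) -> bool:
    """True if an APP1 (EXIF/XMP) or APP13 (Photoshop/IPTC) segment is still present."""
    return any(m in ("FFE1", "FFED") for m in list_segments(data))

# Tiny synthetic "JPEG" with one EXIF segment, built only for the demo
exif = struct.pack(">HH", 0xFFE1, 10) + b"Exif\x00\x00xx"
dqt = struct.pack(">HH", 0xFFDB, 6) + b"\x00" * 4
sos = struct.pack(">HH", 0xFFDA, 3) + b"\x00"
demo = b"\xff\xd8" + exif + dqt + sos + b"\x12\x34\xff\xd9"
print(list_segments(demo), has_metadata(demo))
```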

    Why “Browser-Only” Matters for This

    Think about the irony: you want to strip location data from your photos for privacy… so you upload them to a random website? That site now has your original files, complete with GPS coordinates, before stripping anything.

    I built the orthogonal.info image tool to process everything client-side using the Canvas API and ArrayBuffer manipulation. Your files never leave your browser tab. Verify by opening DevTools Network tab — zero upload requests during processing.

    const file = input.files[0];
    const buffer = await file.arrayBuffer();
    const stripped = stripExif(buffer);
    const blob = new Blob([stripped], { type: 'image/jpeg' });
    const url = URL.createObjectURL(blob);

    What About PNG and WebP?

    PNG stores metadata differently — in tEXt, iTXt, and eXIf chunks rather than APP1 markers. The chunk-based format makes it straightforward to filter: read each chunk’s 4-byte type identifier, skip the ones you don’t want, concatenate the rest.
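
    The PNG filter described above can be sketched in a few lines. Each chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC, so dropping a whole chunk needs no checksum recalculation. This is a Python sketch rather than the browser code, and the set of chunk types to drop is an assumption (it adds zTXt, the compressed text chunk, to the three named above).

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Text and EXIF chunks, plus compressed text (zTXt); extend as needed
METADATA_CHUNKS = {b"tEXt", b"iTXt", b"zTXt", b"eXIf"}

def strip_png_metadata(data: bytes) -> bytes:
    """Copy a PNG chunk by chunk, skipping the metadata chunk types."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG: bad signature")
    out = [PNG_SIGNATURE]
    offset = 8
    while offset + 12 <= len(data):
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        end = offset + 12 + length  # length field + type + data + CRC
        if chunk_type not in METADATA_CHUNKS:
            out.append(data[offset:end])  # each CRC covers only its own chunk
        offset = end
    return b"".join(out)

def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build one chunk with a correct CRC (used here only to fake a test file)."""
    crc = zlib.crc32(chunk_type + payload)
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)

# Synthetic file: header chunk, one text chunk, end chunk
ihdr = make_chunk(b"IHDR", b"\x00" * 13)
text = make_chunk(b"tEXt", b"Comment\x00taken at home")
iend = make_chunk(b"IEND", b"")
png = PNG_SIGNATURE + ihdr + text + iend
print(len(png), len(strip_png_metadata(png)))
```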

    WebP uses RIFF container format with an EXIF chunk. Same principle: parse chunks, drop the EXIF one, rebuild.

    Tools I Actually Use

    For batch processing on my homelab, I use exiftool:

    # Strip ALL metadata from every JPEG in a directory
    exiftool -all= -overwrite_original *.jpg
    
    # Keep orientation (so photos display correctly) but strip everything else
    exiftool -all= -tagsfromfile @ -Orientation -overwrite_original *.jpg

    That second command is important — if you strip the Orientation tag, portrait photos will display sideways in some viewers. Common gotcha.

    For quick one-off checks before sharing, I use our browser-based tool — compress and strip in one step, no install needed. For developers building apps that handle user uploads, the piexifjs library (3KB gzipped) handles read/write/strip operations well.

    If you’re processing images on a server, a Raspberry Pi 5 running an exiftool batch script works great as a dedicated metadata sanitizer on your network — keeps processing local and costs about $80 total with a case and SD card.

    Platforms That Strip vs. Don’t

    I tested 12 platforms in May 2026:

    Strip EXIF on upload: Instagram, Twitter/X, Facebook, LinkedIn, Discord, iMessage

    Preserve EXIF (danger zone): Email attachments, Signal (original quality), Telegram (as file), Google Drive, Dropbox shared links, most forum software

    Signal strips EXIF when you send as a compressed photo, but preserves everything when you tap “original quality.” Most people don’t realize the distinction. Telegram behaves the same way: compressed = stripped, sent as file = full metadata intact.

    The Real Risk Model

    For most people, the threat isn’t nation-state actors. It’s:

    • Selling items online with photos taken at home (Craigslist, Facebook Marketplace)
    • Sharing “original quality” photos in group chats with acquaintances
    • Uploading images to forums, bug trackers, or documentation sites
    • Dating app photos with location data if the platform doesn’t strip

    A privacy screen protector stops shoulder-surfers, but EXIF metadata is the silent leak most people never think about. Strip it before sharing. Every time.

    If you handle images in any application — whether it’s a side project or production — add EXIF stripping to your upload pipeline. It’s 20 lines of code and it protects your users from themselves.

    Related: Developer Tools Guide | DevSecOps in Practice


  • I Tested 4 Free Stock Market APIs — Here’s Which One Actually Works for Side Projects

    Last month I needed real-time-ish stock quotes for a personal trading dashboard. Nothing fancy — just current prices, daily OHLCV, and maybe some basic fundamentals. I figured this would take an afternoon. It took a week, because every “free” market data API has a different definition of “free.”

    I tested Polygon.io, Finnhub, Alpha Vantage, and yfinance (the unofficial Yahoo Finance wrapper) for a simple use case: pull 30 tickers every 5 minutes during market hours, store the data in SQLite, and trigger alerts on volume spikes. (For an event-driven variant, see how I built a Python alerter for SEC insider buying.)

    The Test Setup

    I wrote the same data pipeline four times — one per API. Each version pulls price data for 30 S&P 500 stocks, handles rate limits gracefully, and logs failures. The code ran on a $5 VPS for two weeks straight.

    import requests
    import time
    from datetime import datetime
    
    TICKERS = ["AAPL", "MSFT", "NVDA", "GOOGL", "AMZN", ...]  # 30 total
    
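    def fetch_polygon(ticker, api_key, max_retries=5):
        url = f"https://api.polygon.io/v2/aggs/ticker/{ticker}/prev"
        for _ in range(max_retries):
            r = requests.get(url, params={"apiKey": api_key}, timeout=10)
            if r.status_code == 429:
                time.sleep(12)  # free tier: 5 calls/min, so wait one slot and retry
                continue
            r.raise_for_status()  # surface other HTTP errors instead of a KeyError below
            return r.json()["results"][0]
        raise RuntimeError(f"{ticker}: still rate-limited after {max_retries} retries")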
    

    Polygon.io — Best Docs, Painful Rate Limits

    Polygon’s free tier gives you 5 API calls per minute. For 30 tickers, that’s 6 minutes minimum per refresh cycle. Their docs are excellent — OpenAPI spec, clear error codes, consistent response formats. The data quality is solid; I never got a stale quote during market hours.

    The catch: 5 calls/min means you’re always waiting. I ended up batching with their grouped daily endpoint (/v2/aggs/grouped/locale/us/market/stocks/{date}) which returns all tickers in one call. That’s the move if you’re on the free plan.
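
    The grouped approach can be sketched as one request followed by a local filter. This is a minimal version using only the standard library. It assumes the response carries a "results" list whose items use "T" for the ticker symbol, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

def grouped_daily_url(date: str, api_key: str) -> str:
    """Build the grouped-daily URL for one trading day (date as YYYY-MM-DD)."""
    query = urllib.parse.urlencode({"apiKey": api_key})
    return f"https://api.polygon.io/v2/aggs/grouped/locale/us/market/stocks/{date}?{query}"

def pick_tickers(payload: dict, wanted: set[str]) -> dict[str, dict]:
    """Keep only the bars for tickers we track, keyed by symbol."""
    return {bar["T"]: bar for bar in payload.get("results", []) if bar.get("T") in wanted}

def fetch_grouped(date: str, api_key: str, wanted: set[str]) -> dict[str, dict]:
    """One HTTP call for the whole market, then filter locally."""
    with urllib.request.urlopen(grouped_daily_url(date, api_key), timeout=30) as resp:
        return pick_tickers(json.load(resp), wanted)

# Offline demo with a made-up payload in the assumed shape
sample = {"results": [{"T": "AAPL", "c": 191.24, "v": 1000}, {"T": "ZZZZ", "c": 1.0, "v": 5}]}
print(pick_tickers(sample, {"AAPL", "MSFT"}))
```

    One call per day for every ticker leaves the other four calls in each minute free for anything else.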

    Verdict: Best API design. Use the grouped endpoints and you can work within 5 calls/min. Paid plan ($29/mo) removes limits entirely.

    Finnhub — Generous Limits, Quirky Data

    Finnhub gives you 60 calls/min on the free tier. That’s 12x Polygon’s allowance. I could refresh all 30 tickers in under a minute with room to spare.

    def fetch_finnhub(ticker, api_key):
        url = "https://finnhub.io/api/v1/quote"
        r = requests.get(url, params={"symbol": ticker, "token": api_key}, timeout=10)
        r.raise_for_status()  # fail loudly on HTTP errors rather than on a missing key
        data = r.json()
        return {
            "price": data["c"],      # current
            "open": data["o"],
            "high": data["h"],
            "low": data["l"],
            "prev_close": data["pc"],
            "volume": data.get("v")  # sometimes missing!
        }
    

    The issue: volume data was missing or zero for about 8% of my calls during the first hour of trading. Pre-market data is inconsistent. And their WebSocket (which is real-time on free tier!) occasionally drops connection without sending a close frame, so your reconnect logic needs to be reliable.

    Verdict: Best free tier for polling frequency. The free WebSocket is genuinely useful for real-time dashboards. Just validate your data — don’t trust volume numbers before 10:30 AM ET.
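
    That validation rule is easy to encode as a guard in front of the alert logic. A minimal sketch, assuming quotes shaped like the dict returned by fetch_finnhub and a caller that already knows the current US Eastern time of day; the function name and the treatment of zero as "no data" are assumptions.

```python
from datetime import time

# Cutoff taken from the verdict above; times are US Eastern
VOLUME_TRUSTED_AFTER = time(10, 30)

def usable_volume(quote: dict, now_et: time):
    """Return the quote's volume if it looks trustworthy, else None."""
    volume = quote.get("volume")
    if volume is None or volume <= 0:
        return None  # missing or zero: treat as "no data", not as "no trades"
    if now_et < VOLUME_TRUSTED_AFTER:
        return None  # early-session numbers are the unreliable ones noted above
    return volume

print(usable_volume({"volume": 1200}, time(11, 0)))
print(usable_volume({"volume": 1200}, time(9, 45)))
```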

    Alpha Vantage — The OG That’s Showing Its Age

    Alpha Vantage has been around forever. Free tier: 25 calls/day. Yes, per day, not per minute. They recently slashed this from 500/day (which was already tight).

    25 calls/day is useless for anything beyond a daily cron job checking your portfolio at close. I couldn’t even pull all 30 tickers once. The response format is also uniquely annoying — keys like “1. open” and “2. high” instead of just “open” and “high.”

    # Alpha Vantage response format... why?
    {
        "Global Quote": {
            "01. symbol": "AAPL",
            "02. open": "189.5100",
            "05. price": "191.2400",
            # seriously, numbered string keys?
        }
    }
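
    If you do have to consume this format, a small normalizer takes the sting out of it. A minimal sketch that strips the numeric prefix from each key; the helper name is hypothetical and the sample dict mirrors the response shown above.

```python
import re

_NUMBER_PREFIX = re.compile(r"^\d+\.\s*")

def clean_keys(quote: dict) -> dict:
    """Strip the leading 'NN. ' from each key: '02. open' becomes 'open'."""
    return {_NUMBER_PREFIX.sub("", key): value for key, value in quote.items()}

raw = {"Global Quote": {"01. symbol": "AAPL", "02. open": "189.5100", "05. price": "191.2400"}}
quote = clean_keys(raw["Global Quote"])
print(quote["symbol"], float(quote["price"]))
```

    Note that the values stay strings, so numeric fields still need an explicit float() before any arithmetic.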
    

    Verdict: Skip it in 2026. The rate limits make it impractical for anything but the simplest daily check. The API design feels stuck in 2015.

    yfinance — Free But Fragile

    yfinance is an unofficial Python library scraping Yahoo Finance. No API key needed. No rate limits (sort of). Sounds perfect, right?

    It broke twice during my two-week test. Yahoo changes their page structure, the library stops working, someone pushes a fix to PyPI in a day or two. For a personal project you check occasionally, that’s fine. For anything running unattended, it’s a liability.

    import yfinance as yf
    
    # Simple, but fragile
    ticker = yf.Ticker("AAPL")
    hist = ticker.history(period="1d", interval="5m")
    # Works great until it doesn't
    

    When it works, the data is rich — splits, dividends, options chains, financials, all free. The download() function handles batching natively. But I wouldn’t build anything I can’t babysit on top of it.

    Verdict: Best for Jupyter notebooks and research. Don’t put it in a cron job you want to forget about.

    My Actual Setup (What I Ended Up Using)

    I use Finnhub’s WebSocket for real-time price updates during market hours, Polygon’s grouped daily endpoint for end-of-day OHLCV, and yfinance for fundamentals data I pull once a week. Three APIs, each doing what it does best.

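    import json
    import websocket  # pip install websocket-client
    
    def on_open(ws):
        # Finnhub sends nothing until you subscribe to each symbol
        for symbol in TICKERS:
            ws.send(json.dumps({"type": "subscribe", "symbol": symbol}))
    
    def on_message(ws, message):
        data = json.loads(message)
        for trade in data.get("data", []):
            price = trade["p"]
            symbol = trade["s"]
            volume = trade["v"]
            # write to SQLite, check alerts
            check_volume_spike(symbol, volume)
    
    ws = websocket.WebSocketApp(
        f"wss://ws.finnhub.io?token={FINNHUB_KEY}",
        on_open=on_open,
        on_message=on_message,
        on_error=lambda ws, e: reconnect(ws),  # reconnect() is defined elsewhere
    )
    ws.run_forever()  # blocks and dispatches the callbacks above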
    

    Total cost: $0/month. The tradeoff is maintenance — when yfinance breaks or Finnhub drops connections, I fix it manually. If I valued my time at $50/hr, Polygon’s $29/mo plan would pay for itself in the first week.
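
    The storage and alert side can stay equally small. Below is one possible shape for a volume-spike check backed by SQLite. It is a sketch, not the pipeline's actual code: the table layout, the 20-trade window, the 3x threshold, and the extended signature (it takes a connection and a price as well) are all assumptions.

```python
import sqlite3

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the trades table if needed and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS trades (symbol TEXT, price REAL, volume REAL)")
    return conn

def check_volume_spike(conn, symbol, price, volume, window=20, factor=3.0) -> bool:
    """Store the trade; return True if its volume exceeds factor x the recent average."""
    rows = conn.execute(
        "SELECT volume FROM trades WHERE symbol = ? ORDER BY rowid DESC LIMIT ?",
        (symbol, window),
    ).fetchall()
    conn.execute("INSERT INTO trades VALUES (?, ?, ?)", (symbol, price, volume))
    conn.commit()
    if len(rows) < window:
        return False  # not enough history to call anything a spike
    average = sum(v for (v,) in rows) / len(rows)
    return volume > factor * average

conn = open_db()
for _ in range(20):
    check_volume_spike(conn, "AAPL", 190.0, 100)   # build up a baseline
print(check_volume_spike(conn, "AAPL", 190.0, 1000))
```

    Comparing against the history before inserting the new trade keeps a spike from diluting its own baseline.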

    Quick Comparison

    Polygon.io Free: 5 calls/min, excellent docs, 15-min delayed quotes, best for batch daily data
    Finnhub Free: 60 calls/min + free WebSocket, good data (watch pre-market volume), best for real-time
    Alpha Vantage Free: 25 calls/day, outdated format, skip it
    yfinance: No limits but breaks periodically, rich data, best for research notebooks

    What I’d Recommend

    If you’re building a trading dashboard or alert system, start with Finnhub. The 60 calls/min and free WebSocket give you the most room to experiment. Once you know your architecture works, consider Polygon’s paid tier for reliability.

    If you’re doing backtesting or research, yfinance is hard to beat for the price (free). Just pin your dependency version and keep a fallback data source.
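
    A fallback does not need a framework. Here is a minimal sketch of the pattern: try each source in order and return the first result that succeeds. The two fetchers are hypothetical stand-ins; in practice they would wrap yfinance and a second API.

```python
from typing import Callable, Sequence

Fetcher = Callable[[str], dict]

def fetch_with_fallback(ticker: str, sources: Sequence[Fetcher]) -> dict:
    """Try each data source in order; return the first successful result."""
    errors = []
    for source in sources:
        try:
            return source(ticker)
        except Exception as exc:  # scrapers fail in many ways, so catch broadly here
            errors.append(f"{getattr(source, '__name__', 'source')}: {exc}")
    raise RuntimeError(f"all sources failed for {ticker}: " + "; ".join(errors))

# Hypothetical stand-ins for illustration
def primary(ticker: str) -> dict:
    raise ConnectionError("upstream page layout changed")

def backup(ticker: str) -> dict:
    return {"ticker": ticker, "close": 191.24, "source": "backup"}

print(fetch_with_fallback("AAPL", [primary, backup]))
```

    Collecting the error messages instead of swallowing them means the final exception tells you which source broke and why.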

    For the actual trading execution side, I’ve been using Alpaca’s API which has its own market data included with a brokerage account — that’s a separate topic I covered recently.

    If you’re running this kind of setup on a home server, a Beelink Mini PC (affiliate link) draws about 15W and handles multiple Python processes and SQLite without breaking a sweat. I’ve been running mine 24/7 for months. A CyberPower UPS (affiliate link) keeps it alive through power blips — lost data during a brownout once, never again.

    For monitoring your API pipeline, I keep a Grafana dashboard tracking call counts, error rates, and data freshness. A portable second monitor (affiliate link) dedicated to dashboards saves constant window-switching.


    📡 I share trading signals and market intelligence daily in my free Telegram channel. If you’re building trading tools, the context helps. Join https://t.me/alphasignal822 for free market intelligence.

  • I Replaced All My Passwords with a YubiKey — Here’s What Actually Happened

    Last month I locked myself out of my GitHub account. Again. My TOTP app had synced to a new phone but silently dropped three seeds during the transfer. That was the third time in two years I’d lost access to something important because of software-based 2FA. I ordered a YubiKey 5 NFC that afternoon.

    Six weeks later, every account I care about uses FIDO2/WebAuthn hardware authentication. No more six-digit codes. No more seed backups. No more “did my authenticator app actually sync?” anxiety. Here’s what the transition actually looks like — the good parts and the frustrating ones.

    Why Software 2FA Keeps Failing

    TOTP (those six-digit rotating codes) has a fundamental problem: the secret is just a string that lives on your phone. Phone dies? Secret’s gone. Switch phones? Hope your backup worked. Get phished? An attacker with your password and your current TOTP code has everything they need — and phishing proxies like Evilginx2 automate this in real time.
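
    To see why the secret is the whole game, here is a minimal sketch of the TOTP computation from RFC 6238 using only the standard library. Anyone holding the secret can run this and produce valid codes. The secret below is the published RFC test value, not a real account seed.

```python
import hashlib
import hmac
import struct

def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given moment."""
    counter = unix_time // step  # which 30-second window we are in
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F      # dynamic truncation, per RFC 4226
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)

# Test secret from RFC 6238
secret = b"12345678901234567890"
print(totp(secret, unix_time=59))
```

    Nothing in the calculation involves the site's domain, which is why a phishing page can relay the code to the real site within the same 30-second window.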

    FIDO2 hardware keys solve this differently. The private key never leaves the physical device. Authentication uses a challenge-response protocol tied to the specific domain — so even if you click a perfect phishing link to g00gle.com, the key won’t respond because the domain doesn’t match. It’s not just a second factor; it’s phishing-proof by design.

    I tested this myself. I set up a fake login page on my local network and tried to authenticate with my YubiKey. The browser prompted me, I tapped the key, and the login simply failed: the key had no credential for that origin, so it had nothing to sign with. With TOTP, I would have typed the code without thinking.
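
    The domain binding can be illustrated with a toy model. This is a deliberately simplified sketch: real FIDO2 keys use public-key signatures and the browser supplies the relying-party ID, whereas this stand-in uses an HMAC and a plain dictionary. The class and method names are hypothetical.

```python
import hashlib
import hmac
import os

class ToyAuthenticator:
    """Simplified stand-in for a hardware key. Real FIDO2 uses public-key signatures."""

    def __init__(self):
        self._credentials = {}  # rp_id -> per-site secret that never leaves the "device"

    def register(self, rp_id: str) -> None:
        self._credentials[rp_id] = os.urandom(32)

    def sign(self, rp_id: str, challenge: bytes):
        """Answer a challenge only if a credential exists for this exact rp_id."""
        key = self._credentials.get(rp_id)
        if key is None:
            return None  # look-alike domain: nothing to sign with
        rp_hash = hashlib.sha256(rp_id.encode()).digest()
        return hmac.new(key, rp_hash + challenge, hashlib.sha256).digest()

key = ToyAuthenticator()
key.register("google.com")
print(key.sign("google.com", b"login-challenge") is not None)
print(key.sign("g00gle.com", b"login-challenge") is not None)
```

    The user never has to notice the misspelled domain; the lookup fails on its own.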

    The Hardware: YubiKey 5 NFC vs. the Alternatives

    I went with the YubiKey 5 NFC (USB-A) as my primary and a YubiKey 5C NFC (USB-C) as backup. You always want two keys — if you lose one, the backup gets you back in. Full disclosure: affiliate links.

    Here’s how the main options compare:

    • YubiKey 5 NFC (~$50) — supports FIDO2, U2F, smart card (PIV), OpenPGP, OTP. Works with USB-A and NFC on phones. The Swiss Army knife option. I’ve been using mine daily for six weeks with zero issues.
    • Google Titan Security Key (~$30) — FIDO2 and U2F only. No smart card, no OpenPGP. Cheaper, but if you want to sign Git commits or use SSH keys on the hardware, you’re stuck.
    • SoloKeys Solo 2 (~$30) — open-source firmware, FIDO2 only. Great if you want to audit the code yourself. Limited protocol support compared to YubiKey.
    • Nitrokey 3 (~$50) — open-source, supports FIDO2, OpenPGP, PIV. Solid open-source alternative to YubiKey, though firmware updates have historically been slower.

    I picked YubiKey because of the protocol breadth. I use FIDO2 for web logins, PIV for SSH, and OpenPGP for Git commit signing — all on one device. If you only need web authentication, the Titan or Solo 2 will save you $20.

    Setting Up FIDO2 on Everything That Matters

    The registration process is the same everywhere: go to security settings, choose “Security Key,” tap your YubiKey when prompted, done. But the details vary enough to be annoying.

    GitHub — smooth. Settings → Password and authentication → Security keys. Register both keys (primary + backup). Took 2 minutes. GitHub also supports using the key for git push verification via SSH resident keys:

    ssh-keygen -t ed25519-sk -O resident -O application=ssh:github
    # Tap YubiKey when it blinks
    # Upload the .pub to GitHub SSH keys

    Now every git push requires a physical tap. No one’s pushing to my repos from a compromised machine.

    Google — also smooth, but with a catch. You need to enroll in Google’s Advanced Protection Program to get the full benefit. Without it, Google still allows fallback to SMS or TOTP, which defeats the purpose. With Advanced Protection, only hardware keys work. Period.

    AWS — this one frustrated me. AWS IAM supports FIDO2 for root accounts and IAM users, but the console registration flow is finicky. I had to use Chrome (Firefox didn’t trigger the WebAuthn prompt correctly in May 2026). Once registered, it works reliably.

    Cloudflare — perfect support. They use hardware keys internally and it shows. Registration took 30 seconds.

    SSH Authentication Without Software Keys

    This is where things get interesting for developers. Instead of keeping an ed25519 private key in ~/.ssh/, you can generate a resident key that lives on the YubiKey itself:

    # Generate a resident SSH key on the YubiKey
    ssh-keygen -t ed25519-sk -O resident -O verify-required
    
    # Load it from the key (works on any machine with the YubiKey plugged in)
    ssh-add -K
    
    # Check it's loaded
    ssh-add -L

    The -O verify-required flag means you need to enter the YubiKey’s PIN and tap it for each SSH connection. Paranoid? Yes. But it means even if someone steals your unlocked laptop, they can’t SSH anywhere without the physical key and the PIN.

    I use this for all my homelab connections. My TrueNAS server, my development VMs, my remote build machines — all require the YubiKey tap. The ~/.ssh/ directory on my laptop has exactly zero private key files in it now.
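
    If you want to check the "zero private key files" claim on your own machine, a rough audit can be sketched in a few lines. The helper name find_private_keys is hypothetical, and the check is only a heuristic: it looks for the PEM-style header that private key files start with. Key handle files written by ssh-keygen for -sk keys use the same header, so they will be listed too even though they are useless without the hardware.

```python
from pathlib import Path


def find_private_keys(ssh_dir: Path) -> list[str]:
    """Return names of files in ssh_dir whose first line looks like a private key header."""
    found = []
    for path in sorted(ssh_dir.iterdir()):
        if not path.is_file():
            continue
        try:
            with path.open("rb") as handle:
                first_line = handle.readline(200)
        except OSError:
            continue  # unreadable file: skip rather than crash the audit
        if first_line.startswith(b"-----BEGIN") and b"PRIVATE KEY-----" in first_line:
            found.append(path.name)
    return found


if __name__ == "__main__":
    ssh_dir = Path.home() / ".ssh"
    if ssh_dir.is_dir():
        for name in find_private_keys(ssh_dir):
            print(f"private key material on disk: {name}")
```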

    The Annoying Parts (Because Nothing Is Perfect)

    I won’t pretend this is all smooth sailing. Some real friction points:

    • Mobile is awkward. NFC works on Android and iOS, but you have to hold the key against the right spot on your phone. On my Pixel 8, the NFC reader is in the center-back. On iPhones, it’s at the top. Every login on mobile involves an awkward fumble.
    • Not everything supports FIDO2. My bank doesn’t. My health insurance portal doesn’t. Some services technically support it but bury the option so deep you’d never find it without documentation.
    • Two keys minimum is expensive. At $50 each, you’re spending $100+ before you’ve protected a single account. Compared to free authenticator apps, that’s a tough sell for people who haven’t been burned yet.
    • Recovery codes are still important. If you lose both keys (fire, theft), you need recovery codes. I print mine and keep them in a fireproof safe. It’s not elegant but it works.

    What Changed After Six Weeks

    The biggest surprise wasn’t security — it was speed. Tapping a key takes about 0.5 seconds. Pulling up an authenticator app, finding the right account, and typing six digits takes 10-15 seconds. Over dozens of logins per week, that adds up.

    I also stopped worrying about phone transfers. My YubiKey doesn’t care what phone I’m using. It doesn’t sync anywhere. There are no seeds to back up; the second key is the backup. It’s just a piece of hardware on my keyring.

    For developers specifically: the SSH resident key feature alone is worth the price. Not having private keys on disk removes an entire class of key-theft attacks. Combined with a good laptop lock for when you’re at a coffee shop, your exposure shrinks significantly.

    If you’re still using TOTP and haven’t been burned yet — you will be. It’s not a question of if, it’s when. A YubiKey 5 NFC and a backup key is the best $100 I’ve spent on security tooling this year.

    For more on security and developer workflows, check out our DevSecOps guide and homelab security guide.


    Join Alpha Signal on Telegram for free market intelligence — including weekly picks on security and infrastructure companies worth watching.

  • Build a Portfolio Rebalancing Bot with Python and Alpaca API

    Last month I noticed my portfolio had drifted 12% off target allocation. Tech was at 45% instead of 30%, bonds had dropped to 8%. I’d been meaning to rebalance for weeks but kept putting it off. So I spent a Saturday afternoon writing a Python script that does it automatically — and it’s been running every Monday morning since.

    Here’s exactly how I built it, what went wrong, and why I ended up preferring Alpaca’s API over the alternatives I tried.

    Why Automate Rebalancing?

    Manual rebalancing has two problems: you forget to do it, and when you do remember, emotions get in the way. “NVDA is up 40% — maybe I should let it ride?” That’s not a strategy, that’s gambling with extra steps.

    A rebalancing bot doesn’t care about feelings. It sells what’s overweight, buys what’s underweight, and moves on. Studies from Vanguard show that disciplined rebalancing adds roughly 0.35% annually in risk-adjusted returns. Not huge, but it compounds.

    The Setup: Alpaca + Python in 50 Lines

    I picked Alpaca because it offers commission-free trading with a proper REST API. No screen scraping, no Selenium hacks. You get a paper trading environment that mirrors production exactly — same endpoints, same response formats.

    First, install the SDK:

    pip install alpaca-trade-api pandas

    Here’s the core logic. It’s shorter than you’d expect:

    import alpaca_trade_api as tradeapi
    import pandas as pd
    
    # Target allocation (adjust these to your strategy)
    TARGET = {
        'SPY': 0.40,   # S&P 500
        'QQQ': 0.20,   # Nasdaq
        'TLT': 0.15,   # Long-term bonds
        'GLD': 0.10,   # Gold
        'VWO': 0.10,   # Emerging markets
        'BIL': 0.05,   # Short-term treasury (cash-like)
    }
    
    api = tradeapi.REST(
        key_id='your-key',
        secret_key='your-secret',
        base_url='https://paper-api.alpaca.markets'  # paper first!
    )
    
    def get_current_allocation():
        account = api.get_account()
        portfolio_value = float(account.portfolio_value)
        positions = {p.symbol: float(p.market_value) 
                     for p in api.list_positions()}
        return {sym: positions.get(sym, 0) / portfolio_value 
                for sym in TARGET}
    
    def rebalance():
        account = api.get_account()
        portfolio_value = float(account.portfolio_value)
        current = get_current_allocation()
        
        # Drift per symbol, most overweight first, so sell orders are
        # submitted before the buy orders they are meant to fund
        drifts = sorted(
            ((sym, pct - current.get(sym, 0)) for sym, pct in TARGET.items()),
            key=lambda item: item[1],
        )
        
        for symbol, drift in drifts:
            # Only trade if drift exceeds 2% threshold
            if abs(drift) < 0.02:
                continue
                
            dollar_amount = abs(drift) * portfolio_value
            side = 'buy' if drift > 0 else 'sell'
            
            api.submit_order(
                symbol=symbol,
                notional=round(dollar_amount, 2),
                side=side,
                type='market',
                time_in_force='day'
            )
            print(f"{side.upper()} ${dollar_amount:.2f} of {symbol} "
                  f"(drift: {drift:+.1%})")
    
    # Without this guard the cron job would import the module and trade nothing
    if __name__ == '__main__':
        rebalance()
    

    The 2% drift threshold is important. Without it, you’d be making tiny trades every run, racking up tax events for no real benefit. I tested thresholds from 1% to 5% — 2% hit the sweet spot between staying close to target and minimizing unnecessary trades.
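
    To see what the threshold does, here is a small offline sketch that plans trades without touching any broker API. The function plan_trades and the position values are made up for illustration, and for simplicity it treats the sum of positions as the portfolio value (ignoring cash), which the live script does not.

```python
def plan_trades(positions, target, threshold=0.02):
    """Return (side, symbol, dollars) for every symbol whose drift meets the threshold."""
    total = sum(positions.values())
    trades = []
    for symbol, target_pct in target.items():
        current_pct = positions.get(symbol, 0.0) / total
        drift = target_pct - current_pct
        if abs(drift) < threshold:
            continue  # close enough to target: skip to avoid tiny trades
        side = "buy" if drift > 0 else "sell"
        trades.append((side, symbol, round(abs(drift) * total, 2)))
    return trades


# Hypothetical $10,000 portfolio measured against the article's target weights.
target = {"SPY": 0.40, "QQQ": 0.20, "TLT": 0.15, "GLD": 0.10, "VWO": 0.10, "BIL": 0.05}
positions = {"SPY": 4600, "QQQ": 2100, "TLT": 1200, "GLD": 1000, "VWO": 600, "BIL": 500}

for trade in plan_trades(positions, target):
    print(trade)
```

    Here QQQ is one percentage point overweight and is left alone, while SPY, TLT, and VWO each cross the 2% line and generate an order.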

    The Gotcha That Cost Me an Hour

    Alpaca’s notional parameter (dollar-based orders) only works for stocks, not ETFs on the old API version. I kept getting 422 Unprocessable Entity errors when trying to buy fractional TLT shares. The fix: make sure you’re using API v2 and that fractional shares are enabled on your account. It’s a checkbox in the dashboard that’s off by default.

    Another thing: market orders submitted before 9:30 AM ET queue until the open. That’s fine for rebalancing, since you’re not trying to time anything. But if you’re running this as a cron job at 6 AM Pacific (9 AM Eastern) like I do, don’t panic when orders show as “pending” for the first half hour.

    Scheduling: Cron vs. Cloud Functions

    I run mine as a weekly cron job on my homelab server:

    # Every Monday at 13:00 server time. On a UTC server that is 6:00 AM Pacific
    # during daylight saving time and 5:00 AM Pacific in winter.
    0 13 * * 1 /usr/bin/python3 /home/scripts/rebalance.py >> /var/log/rebalance.log 2>&1
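
    Because Pacific time shifts against UTC twice a year, a fixed UTC cron hour does not stay pinned to 6:00 AM Pacific. The following sketch uses the standard-library zoneinfo module to show the shift. The helper name utc_hour_for_local and the sample dates are my own, and it assumes the system has time zone data installed.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def utc_hour_for_local(year: int, month: int, day: int, hour: int,
                       tz: str = "America/Los_Angeles") -> int:
    """Return the UTC hour that corresponds to a local wall-clock hour on a given date."""
    local = datetime(year, month, day, hour, tzinfo=ZoneInfo(tz))
    return local.astimezone(ZoneInfo("UTC")).hour


# A Monday in summer (PDT, UTC-7) and a Monday in winter (PST, UTC-8).
print(utc_hour_for_local(2025, 7, 7, 6))  # 13
print(utc_hour_for_local(2025, 1, 6, 6))  # 14
```

    One way around this is to set the server's time zone (or the CRON_TZ variable, where your cron implementation supports it) to Pacific time and schedule the job at hour 6.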

    If you don’t have a server running 24/7, AWS Lambda with EventBridge works too. The free tier covers it — this script runs in under 3 seconds and uses maybe 5MB of memory. But honestly, a $35 Raspberry Pi is simpler. No IAM roles, no deployment pipeline, no cold start delays.

    For monitoring, I have it post results to a Telegram channel. If any order fails, I get a push notification. The Finnhub WebSocket alert system I built earlier handles the real-time price monitoring side.
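
    A Telegram notification needs nothing beyond the standard library. This is a minimal sketch against the Telegram Bot API's sendMessage method; the function names, the token, and the chat ID are placeholders, and it is not the notification code from my full script.

```python
import urllib.parse
import urllib.request


def build_telegram_request(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a sendMessage request for the Telegram Bot API."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return urllib.request.Request(url, data=data, method="POST")


def notify(token: str, chat_id: str, text: str) -> bool:
    """Send the message; returns True on HTTP 200. Requires network access."""
    request = build_telegram_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200


# Example (placeholders, so left commented out):
# notify("123456:ABC-your-bot-token", "-1001234567890", "SELL SPY rejected")
```

    Splitting the request builder from the network call keeps the formatting logic testable without hitting Telegram.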

    Backtesting: Does This Actually Work?

    I backtested this exact allocation with monthly rebalancing against a buy-and-hold SPY position from 2015-2025 using vectorbt:

    import vectorbt as vbt
    
    # Results over 10 years:
    # Rebalanced portfolio: 11.2% CAGR, max drawdown -18.4%
    # Buy-and-hold SPY:    13.1% CAGR, max drawdown -33.7%
    

    SPY beat on raw returns (it was a great decade for US large caps), but the rebalanced portfolio had nearly half the max drawdown. In 2020, when SPY dropped 33%, my diversified mix only fell 18%. That’s the difference between sleeping fine and stress-refreshing your brokerage app at 3 AM.
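
    For reference, the two metrics quoted above can be computed from any series of portfolio values with a few lines of plain Python. This is a sketch of the standard definitions, not the vectorbt code behind my numbers, and the sample values below are invented purely to exercise the functions.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two portfolio values."""
    return (end_value / start_value) ** (1 / years) - 1


def max_drawdown(values: list[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction (e.g. -0.25 for -25%)."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                 # highest value seen so far
        worst = min(worst, value / peak - 1)    # decline relative to that peak
    return worst


# Invented equity curve: rises to 120, falls to 90, recovers to 130.
curve = [100, 120, 90, 130, 117]
print(f"CAGR if 100 doubles in 10 years: {cagr(100, 200, 10):.2%}")
print(f"Max drawdown of sample curve: {max_drawdown(curve):.1%}")
```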

    If you want to dig deeper into the technical indicators behind timing decisions, I wrote about RSI, Ichimoku, and Stochastic indicators — useful if you want to add tactical overlays on top of the base rebalancing strategy.

    Tax-Loss Harvesting Add-On

    Once you have the rebalancing bot running, adding tax-loss harvesting is straightforward. The idea: when selling an overweight position at a loss, you book that loss for tax purposes and immediately buy a correlated (but not “substantially identical”) replacement.

    # Tax-loss harvesting pairs
    PAIRS = {
        'SPY': 'VOO',   # Both track S&P 500 (different providers)
        'QQQ': 'QQQM',  # Both track Nasdaq-100
        'VWO': 'IEMG',  # Both track emerging markets
    }
    
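    def harvest_losses(symbol, shares, current_price, cost_basis):
        # cost_basis is the average cost per share; symbols without a pair are skipped
        if symbol not in PAIRS:
            return
        if current_price < cost_basis * 0.95:  # 5%+ loss
            loss = (cost_basis - current_price) * shares
            proceeds = round(current_price * shares, 2)
            # Sell losing position, buy the pair for the same dollar amount
            # (share counts differ because the two funds trade at different prices)
            api.submit_order(symbol=symbol, qty=shares, side='sell',
                             type='market', time_in_force='day')
            api.submit_order(symbol=PAIRS[symbol], notional=proceeds, side='buy',
                             type='market', time_in_force='day')
            print(f"Harvested ${loss:.2f} loss on {symbol}")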
    

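    Be careful with wash sale rules: the loss is disallowed if you buy the same or a “substantially identical” security within 30 days before or after the sale. The paired approach above is meant to avoid this while keeping your market exposure roughly the same, but the IRS has not spelled out how “substantially identical” applies to funds that track the same index. QQQ and QQQM share both an issuer and an index, which makes that pair the riskiest of the three, so check with a tax professional before relying on it.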
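
    The wash sale window looks both backward and forward from the sale date, which is easy to get wrong in code. Here is a minimal date check, assuming you track purchase dates yourself; the function name in_wash_sale_window is hypothetical, and the check covers only the timing, not whether two securities count as substantially identical.

```python
from datetime import date, timedelta

WASH_SALE_WINDOW = timedelta(days=30)


def in_wash_sale_window(sale_date: date, purchase_date: date) -> bool:
    """True if a purchase falls within 30 days before or after a loss sale."""
    return abs(purchase_date - sale_date) <= WASH_SALE_WINDOW


sale = date(2025, 3, 1)
print(in_wash_sale_window(sale, date(2025, 3, 20)))  # True: 19 days after
print(in_wash_sale_window(sale, date(2025, 2, 10)))  # True: 19 days before
print(in_wash_sale_window(sale, date(2025, 4, 15)))  # False: 45 days after
```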

    Monitoring With a Proper Setup

    Running trading automation without monitoring is asking for trouble. At minimum, you need:

    • Daily balance check — compare actual vs. expected portfolio value
    • Order failure alerts — any rejected order gets a push notification
    • Drift report — weekly email showing allocation vs. target
    • Kill switch — a way to disable the bot instantly if something goes wrong

    I use a simple JSON log file and a Python script that reads it to generate a weekly summary. Nothing fancy, but it’s saved me twice — once when Alpaca had an API outage and orders were silently failing, and once when a stock split threw off my position calculations.
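
    The kill switch and the JSON log can both be very small. The sketch below is one possible shape, not my production code: the file names, the one-JSON-object-per-line layout, and the "status" field are assumptions chosen for illustration.

```python
import json
from pathlib import Path


def trading_enabled(kill_file: Path) -> bool:
    """Kill switch: the bot trades only while the kill file does NOT exist."""
    return not kill_file.exists()


def log_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


def failed_orders(log_path: Path) -> list[dict]:
    """Read the log back and return events marked as rejected."""
    events = []
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                events.append(json.loads(line))
    return [event for event in events if event.get("status") == "rejected"]
```

    With this layout, running touch on the kill file from a phone SSH session is enough to stop the next scheduled run, provided the script checks trading_enabled before submitting orders.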

    For the monitoring hardware side, a good multi-monitor setup helps when you’re watching positions. I use a dual monitor arm (affiliate link) to keep my terminal and brokerage dashboard side by side — worth it if you’re doing any kind of active development alongside automated trading.

    What I’d Do Differently

    If I started over, I’d skip the cron job and use Alpaca’s built-in webhook notifications to trigger rebalancing only when drift exceeds the threshold. Polling weekly works fine, but event-driven is cleaner.

    I’d also add a volatility filter — during high-VIX periods (above 30), the bot should reduce position sizes or skip rebalancing entirely. Buying into a panic selloff sounds great in theory, but the bid-ask spreads on ETFs widen during volatility, and you’ll get worse fills.
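
    One way the volatility filter could look is a function that scales order sizes by the current VIX level. Only the threshold of 30 comes from the paragraph above; the second cutoff of 40, the 50% scaling, and the function name are placeholders I picked for the sketch, and fetching the VIX value is left to whatever data source you already use.

```python
def rebalance_scale(vix: float, high: float = 30.0, extreme: float = 40.0) -> float:
    """Return a multiplier for order sizes based on the current VIX level."""
    if vix >= extreme:
        return 0.0   # skip rebalancing entirely
    if vix > high:
        return 0.5   # trade, but at half size
    return 1.0       # normal conditions: full-size orders


for level in (18.0, 32.5, 45.0):
    print(level, rebalance_scale(level))
```

    In the rebalance loop, the dollar amount of each order would be multiplied by this value, and orders with a multiplier of zero skipped.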

    The full script with logging, error handling, and Telegram notifications is about 200 lines. Not a weekend project — more like a focused afternoon. The hard part isn’t the code. It’s deciding on your target allocation and sticking with it when markets get weird.

