Last month I noticed something odd: three C-suite executives at a mid-cap biotech all bought shares within the same week. The stock was down 30% from its high. Two weeks later, the company announced a partnership with Pfizer and the stock popped 45%. I didn’t catch it in time. I found out from a news alert like everyone else.
That bugged me. All that insider transaction data is public, filed with the SEC on Form 4. Anyone can look it up on EDGAR. The problem isn’t access — it’s that nobody wants to manually check thousands of filings every day. So I built a Python script that does it for me.
Why Insider Transactions Matter
Insider trading filings (the legal kind) are one of the few signals where incentives are perfectly aligned. When a CEO buys $2M of their own stock on the open market, they’re putting real money at risk. They know more about the business than any analyst. Studies from Seyhun (1986) and more recent work show that insider purchases outperform the market by 6-10% annually on average.
The signal isn’t in single transactions — it’s in clusters. When three or more insiders buy within a 30-day window, that’s worth paying attention to. One person buying might be tax planning. Three people buying is conviction.
The SEC EDGAR Full-Text Search API
Most people don’t know this, but the SEC launched a full-text search API (EFTS) that’s completely free. No API key, no registration. You just need to set a proper User-Agent header with your email (SEC requires this so they can contact you if you’re hammering their servers).
Here’s the base URL:
https://efts.sec.gov/LATEST/search-index?q=&dateRange=custom&startdt=2026-04-15&enddt=2026-04-22&forms=4
The search API is built for keyword queries, but it also filters by form type and date. To pull a single company’s Form 4s, I search for its CIK and then parse each filing’s XML for the structured data:
import requests
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# SEC fair-access policy: identify yourself with a name and contact email.
HEADERS = {"User-Agent": "YourName your.email@example.com"}

def get_recent_form4s(cik: str, days: int = 30):
    """Fetch recent Form 4 filings for a company by CIK."""
    today = datetime.now()
    params = {
        "q": f'"{cik}"',  # quoted so the CIK is matched as an exact phrase
        "forms": "4",
        "dateRange": "custom",
        "startdt": (today - timedelta(days=days)).strftime("%Y-%m-%d"),
        "enddt": today.strftime("%Y-%m-%d"),
    }
    resp = requests.get("https://efts.sec.gov/LATEST/search-index", params=params, headers=HEADERS)
    resp.raise_for_status()
    return resp.json().get("hits", {}).get("hits", [])

def parse_form4_xml(filing_url: str):
    """Parse a Form 4 XML filing to extract transaction details."""
    resp = requests.get(filing_url, headers=HEADERS)
    resp.raise_for_status()
    root = ET.fromstring(resp.text)

    owner = root.find(".//rptOwnerName")
    owner_name = owner.text if owner is not None else "Unknown"
    # detect_clusters groups by ticker, so carry the issuer's symbol along.
    ticker = root.findtext(".//issuerTradingSymbol", default="UNKNOWN")

    transactions = []
    for txn in root.findall(".//nonDerivativeTransaction"):
        code_elem = txn.find(".//transactionCoding/transactionCode")
        shares_elem = txn.find(".//transactionAmounts/transactionShares/value")
        price_elem = txn.find(".//transactionAmounts/transactionPricePerShare/value")
        if code_elem is not None:
            transactions.append({
                "owner": owner_name,
                "ticker": ticker,
                "date": txn.findtext(".//transactionDate/value"),  # YYYY-MM-DD, or None
                "code": code_elem.text,  # P=purchase, S=sale
                "shares": float(shares_elem.text) if shares_elem is not None else 0,
                "price": float(price_elem.text) if price_elem is not None else 0,
            })
    return transactions
The transactionCode field is what matters most: P means open-market purchase (bullish), S means sale, A means grant/award (ignore these — they’re compensation, not conviction).
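Here’s a minimal sketch of that filter as a lookup table. The names TRANSACTION_CODES, describe, and is_open_market_buy are made up for illustration, and the table only covers the handful of codes mentioned in this post, not the full list Form 4 defines.

```python
# Hypothetical helper: labels for the Form 4 transaction codes discussed in this post.
TRANSACTION_CODES = {
    "P": "purchase",
    "S": "sale",
    "A": "grant or award",
    "M": "exercise or conversion of a derivative security",
    "F": "shares withheld to pay an exercise price or taxes",
}

def describe(code: str) -> str:
    """Human-readable label for a transaction code."""
    return TRANSACTION_CODES.get(code, "other (not covered here)")

def is_open_market_buy(txn: dict) -> bool:
    """True only for code P with a positive share count and price."""
    return (
        txn.get("code") == "P"
        and txn.get("shares", 0) > 0
        and txn.get("price", 0) > 0
    )
```

Requiring a positive price is an extra guard I’d assume is useful, since a parsed filing with a missing price defaults to 0 and would otherwise contribute nothing to the dollar total.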
Detecting Cluster Buys
A single insider purchase is noise. A cluster is signal. Here’s my detection logic:
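from collections import defaultdict
from datetime import datetime, timedelta

def detect_clusters(transactions: list, window_days: int = 30, min_insiders: int = 3):
    """Find stocks where multiple insiders bought within a window.

    Each transaction needs "ticker", "owner", "code", "shares" and "price".
    A "date" (YYYY-MM-DD) is optional: dated buys older than window_days are
    dropped, undated ones are assumed to come from a fetch already limited
    to the window.
    """
    cutoff = (datetime.now() - timedelta(days=window_days)).strftime("%Y-%m-%d")
    buys_by_ticker = defaultdict(list)
    for txn in transactions:
        if txn["code"] != "P":  # Open-market purchases only
            continue
        if txn.get("date") and txn["date"] < cutoff:
            continue
        buys_by_ticker[txn["ticker"]].append(txn)

    clusters = []
    for ticker, buys in buys_by_ticker.items():
        unique_buyers = set(b["owner"] for b in buys)
        total_value = sum(b["shares"] * b["price"] for b in buys)
        if len(unique_buyers) >= min_insiders and total_value > 100_000:
            clusters.append({
                "ticker": ticker,
                "buyers": len(unique_buyers),
                "total_value": total_value,
                "transactions": buys,
            })
    return sorted(clusters, key=lambda x: x["total_value"], reverse=True)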
I filter for clusters where at least 3 unique insiders bought, with a combined value over $100K. Below that threshold, you get too much noise from directors buying token amounts for optics.
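If you want to scan a longer history rather than only the most recent 30 days, one option is a sliding window over each ticker’s buys sorted by date. This is a sketch under my own assumptions, not part of the script above: the function name sliding_window_clusters is hypothetical, and each buy is assumed to be a dict with "ticker", "owner", and an ISO "date" string.

```python
from collections import defaultdict
from datetime import date, timedelta

def sliding_window_clusters(buys, window_days=30, min_insiders=3):
    """Return {ticker: (first_date, last_date, owners)} for the earliest qualifying window."""
    by_ticker = defaultdict(list)
    for b in buys:
        # Assumption: "date" is an ISO string such as "2026-01-10".
        by_ticker[b["ticker"]].append((date.fromisoformat(b["date"]), b["owner"]))

    found = {}
    for ticker, rows in by_ticker.items():
        rows.sort()  # oldest buy first
        left = 0
        for right in range(len(rows)):
            # Advance the left edge until the window spans at most window_days.
            while rows[right][0] - rows[left][0] > timedelta(days=window_days):
                left += 1
            owners = {owner for _, owner in rows[left:right + 1]}
            if len(owners) >= min_insiders:
                found[ticker] = (
                    rows[left][0].isoformat(),
                    rows[right][0].isoformat(),
                    sorted(owners),
                )
                break
    return found
```

The dollar-value threshold is left out here to keep the window logic readable; it could be applied to the buys inside each window the same way as in detect_clusters.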
Running It Daily with Cron
I run this script every evening at 7 PM ET. EDGAR keeps accepting Form 4s until 10 PM ET, so anything filed later in the evening is picked up by the next day’s run. Each run checks the last 30 days of Form 4 filings, detects clusters, and sends me a notification via ntfy, a free, open-source push notification service.
import requests

def notify(clusters):
    for c in clusters:
        msg = f"🔔 {c['ticker']}: {c['buyers']} insiders bought ${c['total_value']:,.0f}"
        requests.post("https://ntfy.sh/your-topic", data=msg.encode())
The whole script is about 120 lines. No paid APIs. No subscriptions. Just Python, the SEC’s public data, and a cron job.
What I’ve Learned Running This for 3 Months
A few patterns I’ve noticed:
- Cluster buys in beaten-down stocks are the strongest signal. If insiders are buying after a 30%+ drawdown, they often know something the market doesn’t.
- CEO + CFO combos are more predictive than board member purchases. Board members sometimes buy for governance reasons. The CEO and CFO buying together is pure conviction.
- Ignore option exercises. Transaction code M is the exercise or conversion of a derivative, and F is shares withheld to cover an exercise price or tax bill. They tell you nothing about direction. It’s just compensation mechanics.
- Small-cap signals are stronger. In mega-caps like Apple or Microsoft, insider buys barely move the needle. In $500M-$5B companies, insider clusters have led to 15-25% moves within 90 days in my limited sample.
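To act on the CEO + CFO pattern, you need each buyer’s role. Form 4 XML carries it in the reportingOwnerRelationship element, next to the owner’s name. The sketch below is illustrative only: owner_roles and has_ceo_and_cfo are hypothetical names, and the title matching is a crude substring check that will miss unusual spellings.

```python
import xml.etree.ElementTree as ET

def owner_roles(form4_xml: str) -> list:
    """Return one dict per reporting owner with name, officer title and director flag."""
    root = ET.fromstring(form4_xml)
    roles = []
    for owner in root.findall(".//reportingOwner"):
        name = owner.findtext(".//rptOwnerName", default="Unknown")
        title = owner.findtext(".//officerTitle", default="") or ""
        flag = owner.findtext(".//isDirector", default="0") or "0"
        roles.append({
            "owner": name.strip(),
            "title": title.strip(),
            # Filings encode booleans as 1/0 or true/false.
            "is_director": flag.strip().lower() in ("1", "true"),
        })
    return roles

def has_ceo_and_cfo(roles: list) -> bool:
    """Crude title match across the combined roles from several filings."""
    titles = " | ".join(r["title"].lower() for r in roles)
    has_ceo = "ceo" in titles or "chief executive" in titles
    has_cfo = "cfo" in titles or "chief financial" in titles
    return has_ceo and has_cfo
```

Since each insider usually files their own Form 4, you would collect the roles from every filing in a cluster and pass the combined list to has_ceo_and_cfo.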
Rate Limits and Being a Good Citizen
The SEC asks for no more than 10 requests per second. I keep mine at 2/second with a simple sleep. They will throttle or ban your IP if you abuse it. Always include your email in the User-Agent — it’s not optional, it’s in their fair access policy.
import time

def rate_limited_get(url, headers, delay=0.5):
    time.sleep(delay)
    return requests.get(url, headers=headers)
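A fixed sleep before every request is simple but waits even when the previous request was long ago. As an alternative sketch (the Throttle class is my own illustrative name, not something from the script), you can track the time of the last call and sleep only for whatever is left of the interval:

```python
import time

class Throttle:
    """Enforce a minimum gap between calls, sleeping only for the remainder."""

    def __init__(self, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

You would create one Throttle(0.5) and call its wait() right before each requests.get, which keeps the rate at or below 2 requests per second.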
My Actual Setup
I run this on a Raspberry Pi 4 that also handles a few other automation scripts. Total power draw is about 5W — costs me maybe $4/year in electricity. If you don’t have a Pi, any VPS or even a scheduled GitHub Action would work.
For data analysis beyond the basic script, I dump everything into a SQLite database and use pandas. The book “Python for Data Analysis” by Wes McKinney is still the best reference if you want to go deeper with financial data wrangling. A simple groupby on ticker + 30-day rolling window gives you the cluster detection in one line.
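As a rough illustration of the SQLite side, here is a sketch that stores purchases and finds clusters with plain SQL instead of pandas. The table layout, column names, and the function store_and_find_clusters are assumptions of mine, and it expects the rows to be limited to the 30-day window already, so it does not do the rolling part.

```python
import sqlite3

def store_and_find_clusters(rows, min_insiders=3, min_value=100_000):
    """rows: iterable of (ticker, owner, txn_date, shares, price) purchase tuples."""
    conn = sqlite3.connect(":memory:")  # swap for a file path to persist the data
    conn.execute(
        "CREATE TABLE buys (ticker TEXT, owner TEXT, txn_date TEXT, shares REAL, price REAL)"
    )
    conn.executemany("INSERT INTO buys VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT ticker,
               COUNT(DISTINCT owner) AS buyers,
               SUM(shares * price)   AS total_value
        FROM buys
        GROUP BY ticker
        HAVING COUNT(DISTINCT owner) >= ? AND SUM(shares * price) > ?
        ORDER BY total_value DESC
    """
    clusters = conn.execute(query, (min_insiders, min_value)).fetchall()
    conn.close()
    return clusters
```

The GROUP BY with HAVING plays the same role as the groupby on ticker, with the distinct-owner count and dollar total as the two cluster thresholds.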
If you want a more polished monitoring setup, I’d also recommend a decent microSD card for your Pi — the cheap ones corrupt fast when you’re writing database files daily.
Limitations
This isn’t a trading system. It’s an alerting system. Insider cluster buys tell you something is probably happening, not when to buy. I use it as one input alongside technicals and fundamentals. Sometimes the cluster fires and the stock keeps dropping for months before turning. Timing is still hard.
Form 4 filings can be delayed up to 2 business days after the transaction. By the time you see it, the move might already be priced in for liquid names. The edge is strongest in smaller, less-followed stocks where nobody else is watching the filings.
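To get a feel for that lag, here is a small sketch that computes the latest date you would expect a filing to show up for a given trade. The helper form4_due_date is hypothetical, and it only skips weekends: market holidays are ignored, so treat the result as approximate.

```python
from datetime import date, timedelta

def form4_due_date(txn_date: date, business_days: int = 2) -> date:
    """Latest expected filing date, counting Monday-Friday only (holidays ignored)."""
    due = txn_date
    remaining = business_days
    while remaining > 0:
        due += timedelta(days=1)
        if due.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return due
```

A Thursday trade, for example, can show up as late as the following Monday, which is four calendar days of potential head start for anyone who already knows.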
Full disclosure: affiliate links above. The full script is on my tools page.