I had a context problem. Every morning I’d spend twenty minutes reconstructing what I’d been doing: scanning my calendar, checking open Linear tickets, scrolling through yesterday’s notes, trying to remember where I’d left off. And every Friday I’d spend the better part of an hour writing up what I’d accomplished that week, manually pulling from the same scattered sources.

The information existed. It just lived in five different places and required manual effort to assemble.

About a month ago I sat down and decided to automate it.

What I Was Working With

My work context is distributed across more systems than I’d like:

  • Linear for project and task tracking — open issues, cycle membership, completed work
  • Freshservice for IT support tickets — an always-moving queue of open items with ages and priorities
  • Google Calendar for meetings and deadlines
  • Git commits scattered across a dozen repos for anything I’ve actually shipped
  • Session diaries — timestamped log entries I write throughout the day using Claude Code

None of these talk to each other. Getting a coherent picture of what’s happening requires visiting all of them. That’s the problem I wanted to solve.

The Architecture

The system lives in a _control directory inside my notes folder and consists of three layers.

The first layer is data gathering: a set of scripts that pull from each source and write standardized Markdown files. sync_tickets.sh hits the Linear and Freshservice APIs and caches the results. refresh_calendar.sh queries my calendars via icalBuddy and generates today’s schedule plus 7- and 14-day lookaheads. gather_diaries.sh scans my session diary files for recent entries and completed tasks. These run on their own schedules — calendar every five minutes via a LaunchAgent, tickets when triggered.
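
To make that concrete, here’s a minimal sketch of the calendar refresh. The output paths are hypothetical; the icalBuddy invocations use its standard eventsToday and eventsFrom:…to:… commands, which accept relative dates.

    #!/bin/bash
    # refresh_calendar.sh -- minimal sketch. Output paths are hypothetical.
    OUT="$HOME/notes/_control/data"

    # Today's schedule, plus 7- and 14-day lookaheads, as Markdown snippets.
    { echo "# Today";        icalBuddy eventsToday;                  } > "$OUT/CalendarToday.md"
    { echo "# Next 7 days";  icalBuddy eventsFrom:today to:today+7;  } > "$OUT/CalendarWeek.md"
    { echo "# Next 14 days"; icalBuddy eventsFrom:today to:today+14; } > "$OUT/CalendarFortnight.md"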

The second layer is orchestration: review_runner.sh decides which review to generate based on time of day and what’s already been generated. Morning (6am–noon): a prioritized daily plan. Late afternoon/evening (4pm–9pm): an end-of-day summary. Friday evening and Monday morning: a weekly retrospective. The runner is idempotent — if a morning review has already been generated today, it does nothing. Run it twice and you get the same result.
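
A sketch of the dispatch and the idempotency guard. The marker-file scheme and the script names are my assumptions; the post only specifies the observable behavior (run twice, same result).

    #!/bin/bash
    # review_runner.sh -- sketch of time-window dispatch plus idempotency.
    STATE="$HOME/notes/_control/state"
    HOUR=$(date +%H)
    DOW=$(date +%u)    # 1 = Monday ... 7 = Sunday

    run_once() {       # run_once <name> <script>: skip if already done today
      local marker="$STATE/$1-$(date +%F).done"
      [ -e "$marker" ] && return 0
      "$2" && touch "$marker"
    }

    if   [ "$HOUR" -ge 6  ] && [ "$HOUR" -lt 12 ]; then run_once morning ./morning_review.sh
    elif [ "$HOUR" -ge 16 ] && [ "$HOUR" -lt 21 ]; then run_once eod     ./eod_review.sh
    fi

    # Weekly review: Friday evening or Monday morning.
    if { [ "$DOW" -eq 5 ] && [ "$HOUR" -ge 16 ]; } || { [ "$DOW" -eq 1 ] && [ "$HOUR" -lt 12 ]; }; then
      run_once weekly ./weekly_review.sh
    fi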

The third layer is synthesis: Claude. Each review script assembles context from the data files, writes a large prompt to a temp file, and pipes it into claude -p --model sonnet. The output is a self-contained HTML file with embedded CSS.

Getting the Context Windows Right

The part that took the most iteration was figuring out how much history to include and how to handle boundary cases.

For daily reviews, an 18-hour lookback window works better than “today.” If I stay up late writing code, those session diary entries should appear in the next morning’s review. A strict “since midnight” cutoff drops them.

For Monday morning, the lookback extends to 72 hours to cover Friday and the weekend. The prompt explicitly tells Claude to frame this as “what happened since Friday” rather than “what happened today.” Without that instruction, Claude would present Thursday’s work as if it happened this morning, which is disorienting.
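
In script terms, the window selection is only a few lines. This sketch uses BSD date offsets (this runs on macOS), with the diary filter simplified to a find over file modification times; the real gather_diaries.sh parses timestamps inside the files.

    #!/bin/bash
    # Morning-review lookback: 18 hours normally, 72 on Mondays to cover
    # Friday and the weekend. BSD/macOS date syntax; GNU date would use
    # `date -d '18 hours ago'`.
    if [ "$(date +%u)" -eq 1 ]; then
      HOURS=72; FRAME="what happened since Friday"
    else
      HOURS=18; FRAME="what happened today"
    fi
    SINCE=$(date -v-"${HOURS}"H +"%Y-%m-%d %H:%M")

    # Simplified glue: pick up diary files modified after the cutoff.
    find "$HOME/notes/diaries" -name '*.md' -newermt "$SINCE"

    echo "Frame this review as: $FRAME"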

For weekly reviews, I use a rolling 7-day window rather than “since last Monday.” The calendar-anchored approach sounds more logical but creates a real problem: if you run the weekly review on a Tuesday (because you forgot Monday), the rolling window still has data. A Monday-anchored window has essentially nothing until enough of the week has elapsed.

Linear’s “completed this week” query has the same issue. I maintain three separate completion files: CompletedTodayLinear.md, CompletedYesterdayLinear.md (which uses Friday’s data on Mondays to cover the full weekend), and CompletedThisWeekLinear.md with a rolling 7-day window. Each serves a different consumer: today’s file goes to the end-of-day summary, yesterday’s to the morning review and standup helper, the rolling week to the weekly retrospective.
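
For a sense of what one of those files costs to produce, here’s a hedged sketch of the rolling 7-day version. The GraphQL shape follows Linear’s public API; the env-var name and the output formatting are my assumptions.

    #!/bin/bash
    # Rolling 7-day Linear completions -> the cache file the weekly review reads.
    SINCE=$(date -u -v-7d +"%Y-%m-%dT%H:%M:%SZ")   # BSD date; GNU: date -u -d '7 days ago'

    # Build the request body with jq so the timestamp is quoted safely.
    BODY=$(jq -n --arg since "$SINCE" \
      '{query: "{ issues(filter: { completedAt: { gte: \($since | tojson) } }) { nodes { identifier title } } }"}')

    curl -s https://api.linear.app/graphql \
      -H "Authorization: $LINEAR_API_KEY" \
      -H "Content-Type: application/json" \
      --data "$BODY" |
      jq -r '.data.issues.nodes[] | "- \(.identifier): \(.title)"' \
        > CompletedThisWeekLinear.md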

The Milestone Tracker

One of the more useful pieces is milestone tracking. I maintain a _milestones.md file with upcoming deadlines — conferences, project deliverables, cycle ends — in a simple format that includes the date, type, a path to related prep work, and an optional linked Linear project.
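
The exact syntax doesn’t matter much, but for concreteness, a hypothetical entry in that spirit:

    ## 2026-03-12 | conference | Example conference talk
    prep: ~/notes/projects/conf-talk/
    linear: Conference Talk Prep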

parse_milestones.sh reads this file and calculates countdowns automatically. More importantly, it checks the prep-path for recent file activity and flags anything that hasn’t been touched in more than seven days. If a conference is three weeks away and the prep directory hasn’t been modified since last month, that surfaces in every review as a warning rather than staying invisible until the week before.
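
The staleness check itself reduces to a find over the prep path. The function name and warning format here are assumptions:

    #!/bin/bash
    # Warn when nothing under a milestone's prep path has been touched in
    # the last 7 days.
    check_prep_activity() {
      local prep_path=$1 name=$2
      # -mtime -7 matches files modified within 7 days; no match = stale.
      if [ -z "$(find "$prep_path" -type f -mtime -7 2>/dev/null | head -n 1)" ]; then
        echo "WARNING: $name -- no prep activity in $prep_path for over 7 days"
      fi
    }

    check_prep_activity ~/notes/projects/conf-talk/ "Example conference talk"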

The manual maintenance is a deliberate choice. Automatically pulling deadlines from the calendar or from Linear sounds appealing, but important deadlines come from many sources, and deciding that one matters is always a human call. The file is the source of truth; the script just keeps the arithmetic current.

Standup in Seconds

There’s a special case I didn’t anticipate needing until I needed it: standup in the morning when the VPN is slow and you just need to know what you did yesterday.

standup.sh generates an HTML summary with no API calls and no network dependencies. It pulls from three local sources: scratch notes from the previous day (lines marked - [x]), git commits from the last 24 hours, and the cached Linear completions file. The “what I’m working on today” section is extracted from the previous morning review’s HTML by looking for focus items with a specific CSS class. The whole thing takes under a second.
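
A condensed sketch of those three lookups. Every path here is hypothetical, and the real script also does the focus-item extraction described above:

    #!/bin/bash
    # standup.sh core -- local sources only, no network calls.
    YDAY=$(date -v-1d +%F)                         # BSD date

    echo "## Done yesterday (scratch notes)"
    grep -h -- '- \[x\]' "$HOME/notes/scratch/$YDAY.md" 2>/dev/null

    echo "## Commits in the last 24 hours"
    for repo in "$HOME"/code/*/; do
      git -C "$repo" log --since="24 hours ago" \
        --pretty="- $(basename "$repo"): %s" 2>/dev/null
    done

    echo "## Completed in Linear (cached)"
    cat "$HOME/notes/_control/data/CompletedYesterdayLinear.md" 2>/dev/null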

There’s also a “Copy for Slack” button. This required using document.execCommand('copy') rather than navigator.clipboard.writeText() because the output is displayed in MDMenuBar’s WKWebView, and navigator.clipboard requires a secure context that a local file doesn’t provide. Small detail, mattered in practice.

Network Resilience

A persistent problem with automating anything on a laptop is that the laptop sleeps and wakes, and when it wakes up the network isn’t immediately available. The first version of the system would fail silently when the hourly runner fired thirty seconds after wake, before Wi-Fi had reconnected.

The current version has a check_network() function at the top of the runner that pings a reliable host and retries three times with 8-second gaps before giving up. Eight seconds is enough for most post-wake Wi-Fi reconnections; three attempts covers the slow cases. If it gives up, it exits cleanly without leaving partial output or corrupting cached data. The API scripts have an additional guard: if they can’t reach the network, they exit 0 without touching the cached files, so stale data stays available rather than getting replaced with nothing.
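
The shape of that guard, with the ping target as an assumption:

    #!/bin/bash
    # Network guard: 3 attempts, 8-second gaps. The -t flag is BSD ping's
    # timeout in seconds (GNU ping uses -W).
    check_network() {
      for attempt in 1 2 3; do
        ping -c 1 -t 2 1.1.1.1 >/dev/null 2>&1 && return 0
        [ "$attempt" -lt 3 ] && sleep 8
      done
      return 1
    }

    # In the API scripts: bail out quietly and keep the cached files as-is.
    check_network || exit 0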

Velocity Tracking

The weekly review prompted a question I didn’t have a good answer to: how much am I actually getting done in a given cycle versus how much I thought I would?

velocity.sh queries Linear’s API for the last twelve cycles and records issue and point completion rates in an append-only JSON file. Each run deduplicates by cycle number, so it’s safe to run repeatedly. The weekly review prompt includes recent velocity history, and Claude uses it to contextualize the current week — “this is above your average pace” or “this is about consistent with recent cycles.”
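
The append-and-dedupe step is a small jq transform; the record shape here is an assumption:

    #!/bin/bash
    # Append this cycle's numbers, then keep one entry per cycle number
    # (last write wins), so repeated runs are idempotent.
    HISTORY="$HOME/notes/_control/data/velocity.json"
    NEW='{"cycle": 42, "issues_completed": 11, "points_completed": 23}'

    [ -s "$HISTORY" ] || echo '[]' > "$HISTORY"
    jq --argjson new "$NEW" \
      '. + [$new] | group_by(.cycle) | map(last)' \
      "$HISTORY" > "$HISTORY.tmp" && mv "$HISTORY.tmp" "$HISTORY"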

It’s a small addition but it changed how the weekly review reads. Instead of a raw list of what happened, there’s a bit of interpretation.

The Prompts

The prompts are large. A morning review context includes the 18-hour diary lookback, open Linear issues grouped by project, Freshservice tickets with ages, today’s calendar, the next 7-day calendar lookahead, outstanding scratch items, and the milestone summary. That’s usually between 4,000 and 8,000 tokens of context before Claude writes a single word.

One thing that became clear quickly: the prompts can’t be passed as CLI arguments. Shell argument length limits vary by platform, but in practice you’ll hit them with prompts of this size on some systems. Every review script writes the assembled prompt to /tmp/<name>-prompt.md and then pipes it via stdin: claude -p --model sonnet < /tmp/morning-prompt.md. This also has the useful side effect of making the prompts debuggable — I can look at exactly what Claude received if something looks wrong.
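
Concretely, the assembly step looks something like this. The section order and data file names are assumptions; the final pipe is the command from the post.

    #!/bin/bash
    # Morning-review prompt assembly.
    PROMPT=/tmp/morning-prompt.md
    {
      cat instructions/morning.md            # the fixed instructions
      echo "## Session diary (last 18h)";    cat data/DiaryRecent.md
      echo "## Open Linear issues";          cat data/OpenLinear.md
      echo "## Freshservice tickets";        cat data/FreshserviceOpen.md
      echo "## Today's calendar";            cat data/CalendarToday.md
      echo "## Milestones";                  cat data/Milestones.md
    } > "$PROMPT"

    claude -p --model sonnet < "$PROMPT" > "reviews/morning-$(date +%F).html"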

What It Feels Like to Use

In the morning I open MDMenuBar, switch to the Scratch tab, and the Dashboard is already rendered. The morning review has usually landed while I’m still in bed. It lists what’s in flight, what’s due soon, milestone warnings if relevant, and a prioritized recommendation for where to focus first.

At the end of the day there’s a summary waiting — what was completed, what carried over, any suggested Linear tasks to create based on things that came up but aren’t tracked yet. I don’t always agree with the suggestions, but they’re a useful prompt.

On Fridays there’s a weekly review that I use as the basis for the message I send my manager. It’s not written in a voice I’d use directly, but it gives me a complete list of what happened in a week so I’m not trying to reconstruct it from memory on a Friday afternoon.

The self-evaluation script is newer. At the end of a quarter it synthesizes weekly reviews, diary entries, and velocity data into a structured self-assessment. The first time I ran it I realized I’d significantly underestimated how much I’d shipped in Q1 — I’d forgotten things that felt small at the time but added up.

What It Cost to Build

About three weeks of evenings and weekends to get to a version I’d actually rely on. Most of that time was iteration: getting the time window logic right, handling edge cases around midnight and Monday mornings, making the HTML output good enough that I’d actually read it instead of skimming past it.

The ongoing maintenance cost is low. I add entries to _milestones.md when deadlines appear, and occasionally tweak a prompt when the output drifts from what I want. The scripts themselves have been stable for weeks.

The Claude API cost is minimal. Reviews are infrequent and the model is efficient on synthesis tasks. I’m not running this constantly — the idempotency logic ensures it runs at most three times a day.

What I’d Do Differently

The biggest architectural decision I’d reconsider is having each review script assemble its own context. There’s now significant duplication in how the scripts load and format data. A cleaner design would have a single context assembly layer with shared logic, and the review scripts would just specify which data sources to include and at what lookback window. I’ve resisted refactoring it because it works and refactoring would require touching eight scripts simultaneously, but it’s the debt I’m aware of.

I’d also have added search earlier. search_reviews.sh strips HTML from all the generated review files and does a text search with highlighted results. It took me six weeks to realize I wanted it and about two hours to build it. Should have been there from the start.
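
A version of that search is small enough to show whole. The sed tag-stripping is crude but adequate for HTML the system generated itself; flags and paths here are assumptions.

    #!/bin/bash
    # search_reviews.sh sketch: strip tags, then grep with line numbers
    # and highlighting.
    QUERY=$1
    for f in "$HOME"/notes/_control/reviews/*.html; do
      matches=$(sed 's/<[^>]*>//g' "$f" | grep --color=always -in -- "$QUERY")
      [ -n "$matches" ] && printf '\n== %s ==\n%s\n' "$f" "$matches"
    done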

The Bigger Picture

What I built is a personal information system that makes it easy to stay oriented in complex, multi-stream work. The underlying idea — aggregate context from authoritative sources, synthesize with a language model, keep humans in the loop for decisions — isn’t novel. But the specific application to personal work management is something I hadn’t seen done in a way that fit how I actually work.

If any of this sounds useful and you want to adapt it, the main things that will transfer directly are the time window logic, the network resilience approach, and the idempotent runner pattern. The specific API integrations will depend on what tools you use. The prompts are the part that will take the most tuning for your particular working style.

Happy to share more details if there are specific parts people want to dig into.