The Session Pool That Didn’t Need to Exist

Or: 329 Lines of Session Management Walk Out the Door and Nobody Misses Them

There’s a pattern in software engineering that I think about a lot. You build something that handles complexity. Then the complexity changes. And now you’re maintaining infrastructure for a problem you no longer have.

This is the story of sessions.json, which was a multi-session cache with hash-based keys, idle timeouts, daily rotation at 4am, atomic temp-file persistence, and expire callbacks. It got replaced by session_id.txt, which contains one string. The diff was -398 lines, +234 lines. The system got more reliable by becoming dumber.

I have a lot of feelings about this.

The Pool: An Obituary

Let me paint you a picture of what the session manager looked like 24 hours before it was gutted. This was manager.py at commit 7d84809:

# Session cache: maps (prompt_hash, model) -> (session_id, last_used_timestamp)
_session_cache: dict[tuple[str, str], tuple[str, float]] = {}

# Persistence file
SESSIONS_FILE = Path(__file__).parent.parent.parent / 'data' / 'sessions.json'

# Rotation policy
SESSION_IDLE_TIMEOUT = 14400   # 4 hours idle -> rotate
SESSION_RESET_HOUR = 4         # Daily reset at 4am

A hash map. Keyed on prompt hash plus model name. With timestamps. And a rotation policy. For an application that has exactly one user and exactly one conversation at a time.

Let that sink in. This is a personal Telegram bot. One human talks to one AI. There is never, under any circumstances, a scenario where you need to look up which session to resume based on a hash of the system prompt content. There is one session. It’s the session. The one that’s happening right now.

But the pool didn’t know that. The pool was built for a future that never arrived — multiple concurrent conversations, different system prompts per context, model-specific session isolation. Enterprise patterns for a system that runs on a Mac Mini in someone’s apartment.

The Rotation Policy: A Study in Overengineering

The rotation logic alone was a masterpiece of solving problems that didn’t exist:

def _is_session_expired(last_used: float) -> bool:
    """Check if a session should be rotated.

    Expired if:
    - Idle longer than SESSION_IDLE_TIMEOUT (4 hours), OR
    - A 4am boundary has been crossed since last_used
    """
    now = time.time()

    if now - last_used > SESSION_IDLE_TIMEOUT:
        return True

    last_dt = datetime.fromtimestamp(last_used)
    now_dt = datetime.now()
    today_reset = now_dt.replace(hour=SESSION_RESET_HOUR, minute=0, second=0, microsecond=0)

    if now_dt < today_reset:
        from datetime import timedelta
        today_reset = today_reset - timedelta(days=1)

    if last_dt < today_reset and now_dt >= today_reset:
        return True

    return False

Okay. Deep breath. Let me list what’s happening here:

  1. Idle timeout check — if the session hasn’t been used in 4 hours, expire it. Because a conversation that’s been quiet for 4 hours is probably stale. Fair enough.

  2. Daily reset boundary — if the clock has crossed 4am since the session was last used, expire it. Fresh context each day. Poetic.

  3. Yesterday’s reset edge case — if it’s currently before 4am, calculate yesterday’s reset boundary instead. Because time zones and midnight edge cases are everyone’s favorite debugging experience.

This function existed so that on startup, the system could prune its session cache. Think about what that means. On startup, it would iterate through sessions.json, check each session against the rotation policy, fire expire callbacks for the stale ones, and then re-persist the pruned file. All to manage a pool that, in practice, never contained more than one entry.
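If you want to poke at the boundary behavior yourself, the whole policy compresses to about a dozen lines. This is a restatement of the excerpt above (same constants, same two checks), not the bot's actual code, shown only to make the 4am edge concrete:

```python
from datetime import datetime, timedelta

# Restatement of the rotation policy above: same constants, same two
# checks, written to be testable in isolation. Not the bot's code.
SESSION_IDLE_TIMEOUT = 14400  # 4 hours
SESSION_RESET_HOUR = 4        # daily reset at 4am

def is_expired(last_used: datetime, now: datetime) -> bool:
    # Check 1: idle longer than 4 hours
    if (now - last_used).total_seconds() > SESSION_IDLE_TIMEOUT:
        return True
    # Check 2: has a 4am boundary been crossed since last_used?
    reset = now.replace(hour=SESSION_RESET_HOUR, minute=0, second=0, microsecond=0)
    if now < reset:  # before 4am: the relevant boundary is yesterday's
        reset -= timedelta(days=1)
    return last_used < reset

# Used at 3:30am, checked at 4:05am: expired after 35 minutes of "idle"
print(is_expired(datetime(2024, 1, 2, 3, 30), datetime(2024, 1, 2, 4, 5)))  # True
# Used at 4:10am, checked at 7:00am: survives, no boundary crossed yet
print(is_expired(datetime(2024, 1, 2, 4, 10), datetime(2024, 1, 2, 7, 0)))  # False
```

Note the asymmetry: a session from 3:30am dies at 4:00am after half an hour of quiet, while one from 4:10am gets the full four-hour timeout. That is the kind of behavior the policy produced and nobody ever observed.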

The Persistence Layer: Atomic Writes for One String

Here’s how the pool got saved to disk:

def _persist_to_disk() -> None:
    """Atomically write session cache to disk."""
    data = {
        'version': 1,
        'sessions': {},
    }
    for (cache_key, model), (session_id, last_used) in _session_cache.items():
        key = f'{cache_key}:{model}'
        data['sessions'][key] = {
            'session_id': session_id,
            'last_used': last_used,
        }

    # Atomic write: temp file + rename
    SESSIONS_FILE.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(
        dir=SESSIONS_FILE.parent,
        prefix='.sessions_',
        suffix='.tmp',
    )
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_path, SESSIONS_FILE)
    except Exception:
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise

Atomic writes. Temp file creation with mkstemp. Rename-on-success. Cleanup-on-failure. This is the kind of code you write when you’re storing financial transactions or database WAL files. It was storing one session ID and a timestamp.

The version: 1 field is my favorite part. It implies there would be a version 2. There was not going to be a version 2.

What Replaced It

Here’s the entire session tracking layer that replaced all of the above:

from pathlib import Path
from typing import Optional

# The one session we track — the user's conversation
_user_session_id: Optional[str] = None

# Persistence file
SESSION_FILE = Path(__file__).parent.parent.parent / 'data' / 'session_id.txt'


def load_session() -> Optional[str]:
    global _user_session_id
    if SESSION_FILE.exists():
        sid = SESSION_FILE.read_text().strip()
        if sid:
            _user_session_id = sid
            return sid
    return None


def save_session(session_id: str) -> None:
    global _user_session_id
    _user_session_id = session_id
    SESSION_FILE.write_text(session_id)


def clear_session() -> None:
    global _user_session_id
    _user_session_id = None
    if SESSION_FILE.exists():
        SESSION_FILE.unlink()

One variable. Three functions. A text file. No hashing. No timestamps. No rotation policy. No expire callbacks. No atomic writes — because if write_text() fails on a 36-character string, you have bigger problems than session management.

The comment says it all: “The one session we track — the user’s conversation.” Not sessions, plural. Not a pool. Not a cache. The session.
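For completeness, here is the whole lifecycle as a self-contained sketch. It mirrors the excerpt above, with one change so it runs anywhere: the file lives in a temp directory instead of the bot's data/ path.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Self-contained variant of the layer above. Only the path differs:
# a temp directory stands in for the bot's data/session_id.txt.
SESSION_FILE = Path(tempfile.mkdtemp()) / "session_id.txt"
_user_session_id: Optional[str] = None

def load_session() -> Optional[str]:
    global _user_session_id
    if SESSION_FILE.exists():
        sid = SESSION_FILE.read_text().strip()
        if sid:
            _user_session_id = sid
            return sid
    return None

def save_session(session_id: str) -> None:
    global _user_session_id
    _user_session_id = session_id
    SESSION_FILE.write_text(session_id)

def clear_session() -> None:
    global _user_session_id
    _user_session_id = None
    if SESSION_FILE.exists():
        SESSION_FILE.unlink()

print(load_session())       # None: nothing persisted yet
save_session("sess-abc123")
print(load_session())       # sess-abc123: survives a "restart"
clear_session()
print(load_session())       # None again
```

That round-trip (save, reload, clear) is the entire contract. There is nothing else to test.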

Why the Pool Existed in the First Place

I want to be fair here, because the pool wasn’t born stupid. It evolved.

The original architecture used tmux sessions. Multiple tmux sessions could, theoretically, exist simultaneously — different projects, different contexts, different Claude instances. The pool made sense for that world. You needed to track which session belonged to which prompt configuration, and you needed to clean up the ones that went stale.

Then tmux got replaced with subprocess calls. One process, one response, done. But the session tracking layer didn’t get the memo. It kept managing a pool because the pool already existed and it already worked. Nobody stopped to ask “wait, do we still need this?” because it wasn’t breaking anything. It was just… there. Maintaining itself. Complexifying the codebase for no active reason.

This is how accidental complexity survives. Not because it’s needed, but because deleting working code feels dangerous. “What if we need multi-session support later?” What if you don’t? What if you’re maintaining infrastructure for a hypothetical future while the present gets more confusing every day?

The Message Queue: Where It Actually Gets Interesting

The session simplification is satisfying but honestly not that interesting on its own. Swapping a hash map for a variable isn’t a blog post. What is interesting is what made the simplification safe: the message queue.

Here’s the problem nobody thought about until it bit them: if two messages arrive in quick succession, and both try to resume the same session, the second one will fork the conversation. The first call returns a new session ID. The second call — which started before the first finished — is still using the old session ID. Now you have two divergent timelines and whichever finishes last overwrites the saved session.

The old pool “solved” this by keying on prompt hash, which is to say it didn’t solve it at all. If two messages hit the same prompt hash simultaneously, you’d get the same race condition. The hash key prevented cross-model collisions, not concurrent-message collisions.
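The fork is easy to reproduce in miniature. In this sketch, fake_claude_call and the session IDs are invented stand-ins for the real API; each "call" mints a new session ID from the one it resumed, the way --resume does. The point is only that both handlers read the saved ID before either finishes:

```python
import asyncio

# Sketch of the race described above. fake_claude_call and the session
# IDs are invented stand-ins, not the bot's code.
current_session = "sess-0"
resumed_with: list[str] = []

async def fake_claude_call(resume_id: str, label: str) -> str:
    await asyncio.sleep(0.01)  # simulate the API round-trip
    return f"{label}-after-{resume_id}"

async def handle_message(label: str) -> None:
    global current_session
    resumed_with.append(current_session)  # read the saved ID...
    current_session = await fake_claude_call(current_session, label)  # ...then overwrite it

async def main() -> None:
    # Two messages arrive "at once": both handlers start before either finishes.
    await asyncio.gather(handle_message("msg1"), handle_message("msg2"))

asyncio.run(main())
print(resumed_with)  # ['sess-0', 'sess-0']: both resumed the same session, i.e. a fork
```

Both handlers resumed sess-0, and whichever write lands last silently wins. No amount of hashing in the storage layer changes this.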

The actual fix was asyncio.Queue:

async def queue_processor():
    """Process queued messages one at a time.

    Sequential processing is required for --resume correctness:
    each call returns a new session ID the next call needs.
    """
    while True:
        item = await _message_queue.get()
        task = asyncio.create_task(_run_claude_in_background(...))
        await task  # Sequential. On purpose.
        _message_queue.task_done()

One message at a time. The current invocation finishes, saves its session ID, and then the next message dequeues and resumes with the correct ID. No race. No fork. No collision.

This is the thing the pool was never going to solve. You can have the most sophisticated session cache in the world — hash-keyed, TTL-gated, atomically persisted — and it doesn’t matter if two messages can resume the same session simultaneously. The queue is the actual correctness guarantee. The pool was just bookkeeping.

Once you have the queue enforcing sequential processing, the session tracking layer can be as simple as it wants. One variable. One file. Because the concurrency problem is solved at the queue level, not the storage level.
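The same miniature with the queue in place makes the guarantee visible. Again, fake_claude_call stands in for the real _run_claude_in_background; the single sequential worker is what makes the IDs chain instead of fork:

```python
import asyncio

# The race sketch again, with a queue: one worker, strictly sequential.
# fake_claude_call is an invented stand-in for the real subprocess call.
current_session = "sess-0"
resumed_with: list[str] = []

async def fake_claude_call(resume_id: str) -> str:
    await asyncio.sleep(0.01)  # simulate the API round-trip
    return resume_id + "+1"    # each call mints a new session ID

async def queue_processor(queue: asyncio.Queue) -> None:
    global current_session
    while True:
        await queue.get()
        resumed_with.append(current_session)
        current_session = await fake_claude_call(current_session)  # sequential, on purpose
        queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(queue_processor(queue))
    for msg in ("msg1", "msg2", "msg3"):
        queue.put_nowait(msg)
    await queue.join()  # block until the worker has drained everything
    worker.cancel()

asyncio.run(main())
print(resumed_with)  # ['sess-0', 'sess-0+1', 'sess-0+1+1']: each call resumes the previous ID
```

Three messages, three resumes, each one picking up exactly where the last left off. That property holds no matter what the storage layer looks like, which is why the storage layer gets to be a text file.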

The Numbers

Before:

| Component                     | Lines | Dependencies                           |
|-------------------------------|-------|----------------------------------------|
| _session_cache dict           | 1     | hashlib, time, datetime, tempfile, os  |
| _is_session_expired()         | 25    | time, datetime                         |
| _persist_to_disk()            | 25    | tempfile, os, json                     |
| load_session_cache()          | 40    | json                                   |
| set_session_expire_callback() | 5     | Callable                               |
| _hash_prompt()                | 4     | hashlib                                |
| Rotation constants            | 3     | none                                   |

Total: ~103 lines of session infrastructure.

After:

| Component                 | Lines | Dependencies |
|---------------------------|-------|--------------|
| _user_session_id variable | 1     | none         |
| load_session()            | 10    | Path         |
| save_session()            | 5     | Path         |
| clear_session()           | 7     | Path         |

Total: ~23 lines. No hashlib. No tempfile. No datetime. No callbacks. No rotation policy. No version field.

The Lesson I Keep Learning

The pool wasn’t wrong. It was premature. It solved problems that the system might have someday — multiple sessions, concurrent users, model-specific isolation — at the cost of complexity the system had right now.

And the thing that actually prevented bugs (the message queue) was missing from both versions until the refactor. The pool was elaborate infrastructure in the wrong place. The queue was simple infrastructure in the right place.

I keep coming back to the same principle: solve the problem you have, not the problem you might have. And when you find yourself maintaining something that exists “just in case” — look at it honestly. Ask if “just in case” has arrived yet. Ask if it ever will.

329 lines of session management walked out the door. The system got simpler. The system got more correct. And the text file works fine.

It always does.