Future Roadmap for Mnemex

This document outlines potential future improvements and implementation approaches for Mnemex.

1. Spaced Repetition

What it is: A learning technique where review intervals increase exponentially (e.g., SuperMemo, Anki algorithms).

Current State in Mnemex:

- You have touch_memory(), which reinforces memories
- The decay algorithm reduces scores over time
- But there's no proactive suggestion of when to review

Potential Implementation:

# Calculate optimal review time based on current strength
# (beta is the use weight β; ** is exponentiation, ^ would be XOR in Python)
def calculate_next_review(memory):
    return current_time + memory.strength * base_interval * (memory.use_count ** beta)

# SM-2 inspired spacing, in days (keeps growing past 30)
intervals_days = [1, 3, 7, 14, 30]

Features to add:

1. Review scheduling - Track next_review_at timestamp
2. Review queue tool - get_review_queue() returns memories due for review
3. Review outcome tracking - Easy/medium/hard adjusts next interval
4. Adaptive intervals - Learn from user's actual recall patterns

Benefit: Memories you want to keep get reinforced just before they'd decay too much. More efficient than random touching.
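The scheduling features above can be sketched as follows. This is a minimal illustration, not an existing Mnemex API: the `ReviewState` class, the `record_review` helper, and the outcome multipliers are all assumptions chosen for the example, and the growth rule is only loosely inspired by SM-2.

```python
from dataclasses import dataclass

DAY = 86_400  # seconds

# Multipliers applied to the previous interval; the values are illustrative.
OUTCOME_FACTOR = {"hard": 1.2, "medium": 2.0, "easy": 3.0}


@dataclass
class ReviewState:
    """Hypothetical per-memory scheduling record."""
    memory_id: str
    interval: float = 1 * DAY    # current gap between reviews, in seconds
    next_review_at: float = 0.0  # Unix timestamp


def record_review(state: ReviewState, outcome: str, now: float) -> ReviewState:
    """Grow the interval according to the outcome and schedule the next review."""
    state.interval *= OUTCOME_FACTOR[outcome]
    state.next_review_at = now + state.interval
    return state


def get_review_queue(states: list[ReviewState], now: float) -> list[ReviewState]:
    """Return memories due for review, most overdue first."""
    due = [s for s in states if s.next_review_at <= now]
    return sorted(due, key=lambda s: s.next_review_at)
```

Passing `now` explicitly keeps the functions deterministic, which makes the schedule easy to test without patching the clock.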

2. Adaptive Decay Parameters

The Problem: Current λ (decay rate) and β (use weight) are fixed. But different memory types should decay differently:

- Preferences: slow decay
- Project context: medium decay
- Random facts: fast decay

Approaches:

A. Category-Based Adaptation

DECAY_PROFILES = {
    "preference": {"lambda": 5.7e-7, "beta": 0.3},  # 14-day half-life
    "decision": {"lambda": 1.15e-6, "beta": 0.5},   # 7-day half-life
    "context": {"lambda": 2.67e-6, "beta": 0.6},    # 3-day half-life (default)
    "fact": {"lambda": 8.02e-6, "beta": 0.8},       # 1-day half-life
}

Auto-detect category from tags or content analysis.
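As a rough sketch of how this could work, the snippet below derives λ from a half-life (λ = ln 2 / half-life in seconds, which reproduces the values in the table above) and picks a category from tags. The `TAG_HINTS` mapping and both function names are hypothetical; content analysis is not covered.

```python
import math

DAY = 86_400  # seconds


def half_life_to_lambda(days: float) -> float:
    """Decay rate (per second) at which a score halves after `days`."""
    return math.log(2) / (days * DAY)


# Hypothetical tag-to-category hints; a real mapping would come from usage data.
TAG_HINTS = {
    "preference": {"preference", "likes", "settings"},
    "decision": {"decision", "adr"},
    "fact": {"fact", "trivia"},
}


def detect_category(tags: list[str], default: str = "context") -> str:
    """Pick the first category whose hint tags overlap the memory's tags."""
    tag_set = {t.lower() for t in tags}
    for category, hints in TAG_HINTS.items():
        if tag_set & hints:
            return category
    return default
```

Falling back to "context" matches the profile marked as the default above, so untagged memories keep today's behavior.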

B. Usage-Pattern Learning

Track actual usage patterns and adjust:

DAY = 86_400  # seconds

if memory.use_count > 10 and time_since_last_use < 1 * DAY:
    # Frequently accessed → slow decay
    memory.custom_lambda *= 0.8
elif memory.use_count < 3 and time_since_last_use > 7 * DAY:
    # Rarely accessed → fast decay
    memory.custom_lambda *= 1.2
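One thing to watch with a multiplicative rule like this is drift: applied on every pass, λ can shrink or grow without bound. A possible guard, sketched here under the assumption that the 14-day and 1-day half-life rates from the category profiles are sensible limits, is to clamp the result. The function name and bounds are illustrative.

```python
DAY = 86_400          # seconds
LAMBDA_MIN = 5.7e-7   # about a 14-day half-life (assumed lower bound)
LAMBDA_MAX = 8.02e-6  # about a 1-day half-life (assumed upper bound)


def adjust_lambda(current: float, use_count: int, idle_seconds: float) -> float:
    """Apply the usage-pattern rule once, then clamp to [LAMBDA_MIN, LAMBDA_MAX]."""
    if use_count > 10 and idle_seconds < DAY:
        current *= 0.8  # frequently accessed: slow the decay
    elif use_count < 3 and idle_seconds > 7 * DAY:
        current *= 1.2  # rarely accessed: speed the decay up
    return min(LAMBDA_MAX, max(LAMBDA_MIN, current))
```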

C. Reinforcement Learning

- Track which memories get promoted vs forgotten
- Learn optimal parameters per memory type
- Requires more data but most powerful

Recommendation: Start with Category-Based (simple, immediate benefit), then add Usage-Pattern Learning (moderate complexity).

3. Clustering & Consolidation: LLM vs Algorithmic?

Current clustering (algorithmic):

- ✅ Embeddings-based similarity (cosine similarity)
- ✅ Duplicate detection (high threshold like 0.88+)
- ✅ Cluster formation (medium threshold like 0.78-0.83)
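To make the thresholds concrete, here is a small sketch of the similarity measure and a two-threshold classification. It uses plain Python lists rather than whatever vector type Mnemex uses internally, and the function names and default cut-offs (0.88 and 0.78, taken from the ranges above) are assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_pair(similarity: float,
                  duplicate_at: float = 0.88,
                  cluster_at: float = 0.78) -> str:
    """Map a similarity score onto the duplicate / cluster / unrelated bands."""
    if similarity >= duplicate_at:
        return "duplicate"
    if similarity >= cluster_at:
        return "cluster"
    return "unrelated"
```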

Consolidation Options:

Option A: Pure Algorithmic (No LLM)

def consolidate_algorithmic(cluster):
    # Assumed attribute: overall similarity (cohesion) of the cluster
    similarity = cluster.cohesion

    if similarity > 0.95:
        # Near-duplicates: keep newer, delete older
        return keep_newest(cluster)

    if similarity > 0.85:
        # High overlap: merge tags, combine entities
        return merge_metadata(cluster)

    if similarity > 0.75:
        # Related: just create relations, don't merge
        return link_memories(cluster)

    # Below 0.75: leave the memories untouched
    return None

Pros: Fast, deterministic, no external dependencies
Cons: Can't understand semantic nuance, might lose information

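Option B: LLM-Assisted (Hybrid)

def consolidate_with_llm(cluster):
    # 1. Algorithmic pre-filter (cohesion is an assumed cluster attribute)
    if cluster.cohesion < 0.75:
        return "no_action"

    memories = list(cluster)
    mem1, mem2 = memories[:2]

    # 2. LLM decides merge strategy
    prompt = f"""
    These memories are similar. Should they be:
    1. Merged (duplicates/redundant)
    2. Linked (related but distinct)
    3. Kept separate

    Answer with one word: merge, link, or separate.

    Memory 1: {mem1.content}
    Memory 2: {mem2.content}
    """

    decision = llm_call(prompt).strip().lower()

    # 3. If merge, LLM writes consolidated version
    if decision == "merge":
        contents = "\n".join(m.content for m in memories)
        merged_content = llm_call(f"Merge these:\n{contents}")
        return create_consolidated_memory(merged_content)

    # Otherwise report the decision ("link" or "separate") to the caller
    return decision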

Pros: Smart decisions, preserves semantic meaning
Cons: Slower, requires MCP client support, not deterministic

Option C: Algorithmic with Human Review

def consolidate_interactive(cluster):
    # Show side-by-side comparison
    preview = generate_merge_preview(cluster)

    # User approves/rejects/edits
    return {
        "action": "preview",
        "preview": preview,
        "original_memories": cluster,
        "suggested_merge": algorithmic_merge(cluster),
        "user_can_edit": True
    }

Pros: User control, no LLM needed, no data loss
Cons: Manual work required

Recommendation:

Start with Option C (Algorithmic + Human Review) because:

1. Safe - No automatic deletions, user confirms
2. Fast - No LLM calls needed
3. Flexible - User can edit merged content
4. MCP-friendly - Returns preview, client handles approval

Later, add Option B (LLM-assisted) as an opt-in feature for power users.

Implementation:

@mcp.tool()
def consolidate_memories(cluster_id: str, mode: str = "preview"):
    cluster = get_cluster(cluster_id)

    # Algorithmic merge, computed up front so that "apply" can use it too.
    # Assumes each memory exposes tags, entities, and strength.
    merged = {
        "content": merge_content_smart(cluster),
        "tags": sorted({t for m in cluster for t in m.tags}),
        "entities": sorted({e for m in cluster for e in m.entities}),
        "strength": max(m.strength for m in cluster) * 1.1,
        "original_ids": [m.id for m in cluster],
    }

    if mode == "preview":
        return {"preview": merged, "action": "awaiting_approval"}

    if mode == "apply":
        # User approved, do the merge
        new_mem = create_memory(merged)
        for old_mem in cluster:
            mark_as_consolidated(old_mem, new_mem.id)
        return {"success": True, "new_id": new_mem.id}

    raise ValueError(f"Unknown mode: {mode!r}")

4. Performance Improvements

Current Bottlenecks:

A. In-Memory Search (JSONL files)

- Every search reads the entire file
- O(n) for every query
- Gets slow at 10K+ memories

Solution:

# Option 1: Index by tags/entities
tag_index = {"typescript": [mem_id1, mem_id2, ...]}
entity_index = {"Claude": [mem_id3, mem_id4, ...]}

# Option 2: Bloom filter for quick "not found"
if not bloom_filter.might_contain(query):
    return []  # Fast path

# Option 3: Incremental compaction
if num_tombstones > 1000 or file_size > 10 * 1024 * 1024:  # 10 MB
    compact()  # hypothetical helper that rewrites the JSONL file
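Option 1 is the simplest of the three to prototype. The sketch below builds an inverted index from tags to memory IDs and answers "has all of these tags" queries by set intersection. It assumes memories are dicts with `id` and `tags` keys, which may not match the actual storage model, and both function names are hypothetical.

```python
from collections import defaultdict


def build_tag_index(memories: list[dict]) -> dict[str, set[str]]:
    """Map each lower-cased tag to the IDs of the memories that carry it."""
    index: dict[str, set[str]] = defaultdict(set)
    for mem in memories:
        for tag in mem.get("tags", []):
            index[tag.lower()].add(mem["id"])
    return index


def find_by_tags(index: dict[str, set[str]], tags: list[str]) -> set[str]:
    """IDs of memories that carry every requested tag."""
    id_sets = [index.get(t.lower(), set()) for t in tags]
    return set.intersection(*id_sets) if id_sets else set()
```

A lookup then costs time proportional to the matching ID sets instead of the whole file; the index has to be updated whenever a memory is saved, retagged, or deleted.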

B. Embedding Generation

- Slow for large batches
- Re-computes for duplicates

Solution:

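import hashlib

# Cache embeddings by a stable content hash (built-in hash() is salted per process)
embedding_cache[hashlib.sha256(content.encode("utf-8")).hexdigest()] = embedding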
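A slightly fuller sketch of the same idea wraps the embedding function so that callers never compute the same content twice. The `EmbeddingCache` class is hypothetical, and `embed` stands in for whatever model call Mnemex actually uses. The key is a SHA-256 digest rather than Python's built-in `hash()`, because string hashes are randomized per process and would not survive a restart if the cache were persisted.

```python
import hashlib
from typing import Callable

Embedding = list[float]


class EmbeddingCache:
    """In-memory cache keyed by a stable digest of the content."""

    def __init__(self, embed: Callable[[str], Embedding]):
        self._embed = embed
        self._cache: dict[str, Embedding] = {}

    @staticmethod
    def _key(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def get(self, content: str) -> Embedding:
        """Return the cached embedding, computing it on the first request only."""
        key = self._key(content)
        if key not in self._cache:
            self._cache[key] = self._embed(content)
        return self._cache[key]
```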

C. Decay Calculation

Calculates score for every memory on every search

Solution:

import time

# Pre-compute scores periodically (run in a background thread or task)
def refresh_scores_forever():
    while True:
        update_all_scores_cached()
        time.sleep(60)  # Refresh every minute

# Search uses cached scores
def search(query):
    candidates = filter_by_tags(query)
    # Use pre-computed scores, don't recalc; highest score first
    return sorted(candidates, key=lambda m: m.cached_score, reverse=True)
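If a background refresher turns out to be more machinery than needed, a lazy variant gives a similar saving: recompute a score only when the cached value is older than some maximum age. This is an alternative to the periodic approach above, not a description of it. The `ScoreCache` class is hypothetical, and `compute` stands in for the real decay-score function.

```python
import time
from typing import Callable


class ScoreCache:
    """Recompute a memory's score only when the cached value is older than max_age."""

    def __init__(self, compute: Callable[[str], float], max_age: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._compute = compute
        self._max_age = max_age
        self._clock = clock
        # memory id -> (score, time the score was computed)
        self._entries: dict[str, tuple[float, float]] = {}

    def get(self, memory_id: str) -> float:
        now = self._clock()
        entry = self._entries.get(memory_id)
        if entry is None or now - entry[1] > self._max_age:
            entry = (self._compute(memory_id), now)
            self._entries[memory_id] = entry
        return entry[0]
```

The injectable `clock` exists only so the expiry logic can be tested without sleeping.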

Benchmarking Plan:

# tests/performance/test_benchmarks.py
from time import perf_counter

def benchmark_search():
    for n in [100, 1000, 10000, 100000]:
        # Assumed to populate the store that search() reads from
        memories = generate_test_memories(n)
        start = perf_counter()
        search("typescript")  # any representative query
        print(f"n={n}: {perf_counter() - start:.4f}s")

def benchmark_decay():
    ...  # Measure score calculation speed

def benchmark_compaction():
    ...  # Measure JSONL rewrite performance

5. Other Improvements

A. Testing Coverage

Current gaps (likely):

- Edge cases in decay models
- LTM index updates
- Git backup failures
- Concurrent access

Plan:

# Generate coverage report
pytest --cov=mnemex --cov-report=html
open htmlcov/index.html

# Focus on <80% coverage modules
# Add integration tests for CLI tools

B. Production Hardening

- Error handling for corrupted JSONL
- Graceful degradation if embeddings fail
- File locking for concurrent access
- Backup before destructive operations
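For the first item, one plausible approach is a reader that skips unparseable lines and reports where they were, instead of failing the whole load. The sketch below uses only the standard library; the function name and the choice to treat non-object JSON values as corrupt are assumptions.

```python
import json
from pathlib import Path


def load_jsonl_tolerant(path: Path) -> tuple[list[dict], list[int]]:
    """Return (records, bad_line_numbers), skipping lines that are not JSON objects."""
    records: list[dict] = []
    bad_lines: list[int] = []
    with path.open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are harmless
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                bad_lines.append(lineno)
                continue
            if isinstance(obj, dict):
                records.append(obj)
            else:
                bad_lines.append(lineno)
    return records, bad_lines
```

Returning the bad line numbers lets the caller log them or offer a repair step, rather than silently dropping data.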

C. GitHub Release (v1.0.0)

- Tag the current commit
- Generate changelog
- Build wheel
- Publish to PyPI (optional)

D. More Examples

- Claude prompt templates for auto-save
- Different use cases (personal assistant, dev env, research)
- Integration with other tools (Raycast, Alfred, etc.)

Completed: UV Tool Install Migration ✅

Changes Made

Installation Simplified:

Before:

git clone https://github.com/simplemindedbot/mnemex.git
cd mnemex
uv pip install -e .
# Complex MCP config with paths and PYTHONPATH

After:

uv tool install git+https://github.com/simplemindedbot/mnemex.git
# Simple MCP config: {"command": "mnemex"}

MCP Config Updates

Before:

{
  "mcpServers": {
    "mnemex": {
      "command": "uv",
      "args": ["--directory", "/path/to/mnemex", "run", "mnemex"],
      "env": {"PYTHONPATH": "/path/to/mnemex/src"}
    }
  }
}

After:

{
  "mcpServers": {
    "mnemex": {
      "command": "mnemex"
    }
  }
}

Migration Guide for Users

For existing users switching from editable install:

# 1. Uninstall editable version
uv pip uninstall mnemex

# 2. Install as tool
uv tool install git+https://github.com/simplemindedbot/mnemex.git

# 3. Update Claude config to just: {"command": "mnemex"}
#    Remove the --directory, run, and PYTHONPATH settings

Your data is safe! This only changes how the command is installed. Your memories in ~/.config/mnemex/ are untouched.

Completed: Consolidation Tool ✅

Implementation Summary

Completed: Algorithmic consolidation with preview/apply modes

Files Added:

- src/mnemex/core/consolidation.py - Core merging logic
- tests/test_consolidation.py - Comprehensive test suite (15 tests, 100% coverage)

Features:

- Smart content merging (preserves unique information, detects duplicates)
- Tag and entity merging (union of all values)
- Strength calculation based on cluster cohesion
- Timestamp preservation (earliest created_at, latest last_used)
- Relation tracking (consolidated_from relations)
- Auto-detect mode (finds high-cohesion clusters automatically)
- Preview mode (dry-run to inspect before applying)

Usage:

# Auto-detect and preview
consolidate_memories(auto_detect=True, mode="preview", cohesion_threshold=0.75)

# Apply consolidation
consolidate_memories(auto_detect=True, mode="apply", cohesion_threshold=0.80)
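To illustrate the metadata rules listed under Features (union of tags and entities, earliest created_at, latest last_used, consolidated_from tracking), here is a small standalone sketch. It is not the code in consolidation.py; the dict-based memory shape and the `merge_metadata_sketch` name are assumptions made for the example.

```python
def merge_metadata_sketch(memories: list[dict]) -> dict:
    """Combine metadata from several memories into one consolidated record."""
    return {
        # Union of all values, sorted for stable output
        "tags": sorted({t for m in memories for t in m.get("tags", [])}),
        "entities": sorted({e for m in memories for e in m.get("entities", [])}),
        # Keep the full lifetime of the merged memories
        "created_at": min(m["created_at"] for m in memories),
        "last_used": max(m["last_used"] for m in memories),
        # Record where the consolidated memory came from
        "consolidated_from": [m["id"] for m in memories],
    }
```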

Test Results: All 15 tests passing, including:

- test_merge_tags, test_merge_entities, test_merge_metadata
- test_merge_content_duplicates, test_merge_content_distinct
- test_calculate_merged_strength
- test_generate_consolidation_preview
- test_execute_consolidation
- test_consolidation_preserves_timestamps

Priority Order

1. ~~Consolidation Tool (1-2 days) - Implement algorithmic merge with preview~~ ✅ DONE
2. Spaced Repetition (2-3 days) - Add review queue and scheduling
3. Adaptive Decay (3-4 days) - Category-based decay profiles
4. Performance (1-2 days) - Benchmarking and optimization
5. Production Hardening (ongoing) - Testing and error handling