
Admin Operations

Administrative endpoints require elevated privileges (ADMIN_API_TOKEN) for managing enrichment processing and embedding generation. These operations are intended for maintenance, debugging, and bulk data operations.

For standard memory operations (store, recall, update, delete), see Memory Operations. For consolidation scheduling, see Consolidation Operations.


Admin operations require dual authentication:

  1. Standard API Token (AUTOMEM_API_TOKEN) — Required for all endpoints except /health
  2. Admin Token (ADMIN_API_TOKEN) — Additional token for privileged operations

| Token Type | Header Methods | Query Parameter | Environment Variable |
| --- | --- | --- | --- |
| API Token | `Authorization: Bearer <token>` / `X-API-Key: <token>` | `?api_key=<token>` | AUTOMEM_API_TOKEN |
| Admin Token | `X-Admin-Token: <token>` / `X-Admin-Api-Key: <token>` | `?admin_token=<token>` | ADMIN_API_TOKEN |

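As a rough illustration of the dual-token scheme, the sketch below builds a request that carries both headers using only the Python standard library. The helper name `admin_request`, the base URL, and the token values are placeholders rather than part of AutoMem; the header names come from the token table above.

```python
import json
import urllib.request


def admin_request(base_url, path, api_token, admin_token, payload=None):
    """Build (but do not send) a request carrying both authentication layers."""
    headers = {
        "Authorization": f"Bearer {api_token}",  # layer 1: API token
        "X-Admin-Token": admin_token,            # layer 2: admin token
    }
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


# Placeholder values; substitute a real instance URL and tokens.
req = admin_request(
    "https://your-automem-instance",
    "/admin/reembed",
    api_token="YOUR_API_TOKEN",
    admin_token="YOUR_ADMIN_TOKEN",
    payload={"batch_size": 32},
)
```

Passing `req` to `urllib.request.urlopen` would send it. Headers are used instead of query parameters so that the tokens do not end up in URL logs.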
graph TB
    subgraph "Authentication Layers"
        Layer1["Layer 1: API Token<br/>AUTOMEM_API_TOKEN<br/>Guards all endpoints except /health"]
        Layer2["Layer 2: Admin Token<br/>ADMIN_API_TOKEN<br/>Guards privileged operations"]
    end

    subgraph "Protected Operations"
        Reprocess["/enrichment/reprocess"]
        Reembed["/admin/reembed"]
    end

    subgraph "Public"
        Health["/health<br/>(no auth)"]
    end

    subgraph "API Token Required (when AUTOMEM_API_TOKEN configured)"
        Status["/enrichment/status"]
    end

    Layer1-->Layer2
    Layer1-->Status
    Layer2-->Reprocess
    Layer2-->Reembed

| Status Code | Response | Meaning |
| --- | --- | --- |
| 401 | {"error": "Unauthorized"} | Missing or invalid AUTOMEM_API_TOKEN |
| 401 | {"error": "Admin authorization required"} | Missing or invalid ADMIN_API_TOKEN |
| 403 | {"error": "Admin token not configured"} | Server has no ADMIN_API_TOKEN environment variable set |
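
Because two of the failures share status code 401, a client has to inspect the response body to tell which token is at fault. The following sketch shows one way to do that; the function name `explain_auth_failure` and the advice strings are illustrative, while the status codes and error messages are the ones listed above.

```python
def explain_auth_failure(status_code, body):
    """Map an authentication failure to the token that needs attention."""
    error = (body or {}).get("error", "")
    if status_code == 401 and error == "Unauthorized":
        return "check AUTOMEM_API_TOKEN on the client"
    if status_code == 401 and error == "Admin authorization required":
        return "check ADMIN_API_TOKEN on the client"
    if status_code == 403 and error == "Admin token not configured":
        return "set ADMIN_API_TOKEN on the server"
    return "not an authentication failure listed in the table"
```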

GET /enrichment/status

Authentication: API token (when AUTOMEM_API_TOKEN is configured)

Purpose: Monitor the enrichment pipeline’s health and processing statistics. Unlike /health, this endpoint is protected by the global API token guard whenever AUTOMEM_API_TOKEN is set.

{
  "status": "running",
  "queue_size": 3,
  "pending": 2,
  "inflight": 1,
  "max_attempts": 3,
  "stats": {}
}

| Field | Type | Description |
| --- | --- | --- |
| status | string | "running" if the enrichment worker is active, "stopped" if the worker thread is dead |
| queue_size | integer | Current number of jobs in the enrichment queue |
| pending | integer | Count of memories waiting to be processed |
| inflight | integer | Count of memories currently being processed |
| max_attempts | integer | Maximum retry attempts per memory before marking failed (from ENRICHMENT_MAX_ATTEMPTS) |
| stats | object | Lifetime enrichment statistics (nested object with processing counters) |

curl "https://your-automem-instance/enrichment/status" \
-H "Authorization: Bearer YOUR_API_TOKEN"

| Observation | Likely Cause | Action |
| --- | --- | --- |
| status: "stopped" | Worker thread crashed | Check application logs for exceptions, restart service |
| queue_size increasing | Worker processing slower than intake | Monitor inflight, check for spaCy or OpenAI issues |
| High failure count in stats | Enrichment logic errors | Review application logs, check Qdrant connectivity |
| inflight stuck | Worker deadlocked | Restart enrichment worker or service |
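
The first, second, and fourth observations can be checked automatically by comparing two consecutive status payloads. The sketch below is one possible heuristic, not AutoMem code: the function name `diagnose` and the rule used for a "stuck" inflight count (non-zero, unchanged, and pending not shrinking) are assumptions, and the failure counters inside `stats` are left out because their key names are not documented here.

```python
def diagnose(current, previous=None):
    """Return warnings for an /enrichment/status payload.

    `previous` is an earlier payload from the same instance, used to spot trends.
    """
    warnings = []
    if current.get("status") == "stopped":
        warnings.append("worker stopped: check logs and restart the service")
    if previous is not None:
        if current.get("queue_size", 0) > previous.get("queue_size", 0):
            warnings.append("queue_size increasing: worker slower than intake")
        inflight = current.get("inflight", 0)
        # Assumed heuristic: the same non-zero inflight count while pending does not shrink.
        if (inflight > 0
                and inflight == previous.get("inflight", 0)
                and current.get("pending", 0) >= previous.get("pending", 0)):
            warnings.append("inflight may be stuck: consider restarting the worker")
    return warnings
```

Polling every minute or so and passing the last two payloads is enough to surface a growing queue. A single poll can only detect the stopped worker.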

POST /enrichment/reprocess

Authentication: API token + Admin token

Purpose: Force re-enrichment of specific memories. Useful after:

  • Updating enrichment logic or configuration
  • Adding spaCy model capabilities
  • Fixing corrupted enrichment metadata
  • Recovering from systematic enrichment failures

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| ids | array[string] | Yes | List of memory UUIDs to reprocess (non-empty) |

Reprocessing always forces re-queuing regardless of current pending/in-flight state.

curl -X POST https://your-automem-instance/enrichment/reprocess \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"ids": [
"a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"b2c3d4e5-f6a7-8901-bcde-f12345678901"
]
}'

| Field | Type | Description |
| --- | --- | --- |
| status | string | Always "queued" |
| count | integer | Number of memories successfully added to the enrichment queue |
| ids | array[string] | List of memory UUIDs that were queued |

{
  "status": "queued",
  "count": 2,
  "ids": [
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "b2c3d4e5-f6a7-8901-bcde-f12345678901"
  ]
}
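
Since an empty `ids` array is rejected, it can help to validate the list on the client before sending it. The sketch below assumes IDs are standard UUID strings, as in the examples; the helper name `reprocess_body` is hypothetical, and the server may accept or reject other formats differently.

```python
import json
import uuid


def reprocess_body(ids):
    """Return the JSON body for POST /enrichment/reprocess, or raise ValueError."""
    ids = list(ids)
    if not ids:
        raise ValueError("ids must be non-empty")
    for memory_id in ids:
        uuid.UUID(memory_id)  # raises ValueError on a malformed UUID string
    return json.dumps({"ids": ids})
```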
graph TB
    Request["POST /enrichment/reprocess<br/>{ids: [...]}"]
    Auth["require_admin_token()"]
    Validate["Validate Request"]

    subgraph "Queueing Loop"
        Enqueue["enqueue_enrichment()<br/>with forced=True"]
        IncrQueued["queued_count++"]
    end

    Response["Return 202:<br/>status, count, ids"]

    Request-->Auth
    Auth-->Validate
    Validate-->Enqueue
    Enqueue-->IncrQueued
    IncrQueued-->Response

The reprocessing operation performs these steps:

  1. Validation Phase — Validates that the ids array is non-empty
  2. Queueing — Calls enqueue_enrichment(memory_id, forced=True) which:
    • Acquires state.enrichment_lock
    • Adds memory ID to state.enrichment_pending
    • Puts an EnrichmentJob with forced=True in queue
  3. Background Processing — The enrichment_worker() thread picks up jobs and calls enrich_memory() which:
    • Extracts entities via spaCy (if installed)
    • Creates temporal PRECEDED_BY edges
    • Finds semantic neighbors via Qdrant
    • Creates SIMILAR_TO relationships
    • Detects patterns and creates EXEMPLIFIES edges
    • Updates metadata.enriched_at timestamp
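
The queueing step can be pictured with a small self-contained model. This is a sketch under assumptions, not the AutoMem source: `enrichment_lock`, `enrichment_pending`, `EnrichmentJob`, and the `forced` flag are named in the steps above, while `State`, `enrichment_inflight`, `enrichment_queue`, and the duplicate check for unforced jobs are guesses made for illustration.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class EnrichmentJob:
    memory_id: str
    forced: bool = False


class State:
    """Hypothetical container for the shared enrichment state."""

    def __init__(self):
        self.enrichment_lock = threading.Lock()
        self.enrichment_pending = set()
        self.enrichment_inflight = set()       # assumed name
        self.enrichment_queue = queue.Queue()  # assumed name; unbounded, so put() never blocks


def enqueue_enrichment(state, memory_id, forced=False):
    """Queue a memory for enrichment; forced jobs skip the duplicate check."""
    with state.enrichment_lock:
        already_tracked = (memory_id in state.enrichment_pending
                           or memory_id in state.enrichment_inflight)
        if already_tracked and not forced:
            return False
        state.enrichment_pending.add(memory_id)
        state.enrichment_queue.put(EnrichmentJob(memory_id, forced=forced))
        return True
```

In this model a forced job is always queued, which matches the statement that reprocessing re-queues regardless of pending or in-flight state. It also shows why repeated reprocess calls can pile up duplicate jobs.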

POST /admin/reembed

Authentication: API token + Admin token

Purpose: Regenerate embeddings for all memories in batches. Critical for:

  • Migrating to a different embedding model
  • Recovering from Qdrant data loss
  • Fixing corrupted embeddings
  • Bulk embedding generation after initial import

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| batch_size | integer | No | Embeddings per OpenAI API call. Default: 32. Max recommended: 100 |
| limit | integer | No | Max memories to process. If omitted, processes all memories in the database |
| force | boolean | No | Re-embed memories even if embeddings already exist. Default: false |

curl -X POST https://your-automem-instance/admin/reembed \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"batch_size": 32}'
graph TB
    Request["POST /admin/reembed<br/>{batch_size: 32, limit: 1000}"]
    Auth["require_admin_token()"]
    Init["Retrieve pre-initialized OpenAI client<br/>get_openai_client()"]

    FetchAll["Single FalkorDB query:<br/>MATCH (m:Memory)<br/>[WHERE m.content IS NOT NULL]<br/>RETURN id, content, tags, …<br/>ORDER BY timestamp DESC<br/>[LIMIT limit if set]"]

    subgraph "Batch Processing Loop"
        Slice["Slice next batch_size memories"]

        CallOpenAI["OpenAI API:<br/>embeddings.create()<br/>model=embedding_model<br/>input=[contents]"]

        UpdateQdrant["Qdrant: upsert()<br/>PointStruct with full payload<br/>(metadata_preserved=True)"]

        IncrCount["processed_count += batch_size"]
    end

    Response["Return summary:<br/>status, processed, failed,<br/>total, batch_size,<br/>metadata_preserved"]

    Request-->Auth
    Auth-->Init
    Init-->FetchAll
    FetchAll-->Slice
    Slice-->CallOpenAI
    CallOpenAI-->UpdateQdrant
    UpdateQdrant-->IncrCount
    IncrCount-->|More batches?|Slice
    IncrCount-->|Done|Response

| Field | Type | Description |
| --- | --- | --- |
| status | string | Result status |
| processed | integer | Number of memories successfully re-embedded |
| failed | integer | Number of memories that failed re-embedding |
| total | integer | Total memory count in the database at operation start |
| batch_size | integer | Batch size used (from request or default 32) |
| metadata_preserved | boolean | Whether existing metadata was preserved during re-embedding |

{
  "status": "complete",
  "processed": 1000,
  "failed": 0,
  "total": 1000,
  "batch_size": 32,
  "metadata_preserved": true
}

Phase 1: Memory Enumeration and Content Fetch

Fetches all memory data (or up to limit) in a single FalkorDB query:

MATCH (m:Memory)
WHERE m.content IS NOT NULL
RETURN m.id AS id, m.content AS content, m.tags AS tags,
m.importance AS importance, m.timestamp AS timestamp,
m.type AS type, m.confidence AS confidence,
m.metadata AS metadata, m.updated_at AS updated_at,
m.last_accessed AS last_accessed
ORDER BY m.timestamp DESC
// When the `limit` request parameter is set, a LIMIT clause is appended after this line

When force=true, the WHERE m.content IS NOT NULL filter is omitted from the query. The Python collection loop still applies an `if content:` check before adding a row to the processing list, so memories with null or empty content are excluded regardless of force. There is no separate per-batch content retrieval step: all memory data is loaded upfront.

Phase 2: OpenAI Embedding Generation

Generates embeddings for the entire batch in a single API call using the configured embedding_model. Dimension is determined by the VECTOR_SIZE environment variable (default 1024).

Phase 3: Qdrant Update

Embeddings are written to Qdrant only. Qdrant failures are logged but don’t halt the operation (graceful degradation). FalkorDB graph data is not modified by this operation.
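
The three phases can be modelled as a simple loop. The sketch below is illustrative rather than the actual implementation: `reembed`, `embed_batch`, and `upsert_batch` are hypothetical names, the two callables stand in for the OpenAI and Qdrant clients so the loop runs without either service, and counting a whole batch as failed on any error is an assumption. Here `total` is the number of rows the sketch considers, which may differ from the server's definition.

```python
import logging

logger = logging.getLogger("reembed_sketch")


def reembed(memories, embed_batch, upsert_batch, batch_size=32):
    """Re-embed `memories` (dicts with "id" and "content") in fixed-size batches.

    `embed_batch(texts)` returns one vector per text; `upsert_batch(points)` stores them.
    """
    rows = [m for m in memories if m.get("content")]  # mirrors the `if content:` check
    processed = failed = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        ids = [m["id"] for m in batch]
        try:
            vectors = embed_batch([m["content"] for m in batch])  # one call per batch
            points = [{"id": m["id"], "vector": v, "payload": m}
                      for m, v in zip(batch, vectors)]
            upsert_batch(points)
        except Exception:
            # Log with context and move on to the next batch instead of aborting.
            logger.exception("Failed to re-embed batch", extra={"batch_ids": ids})
            failed += len(batch)
            continue
        processed += len(batch)
    return {"processed": processed, "failed": failed,
            "total": len(rows), "batch_size": batch_size}
```

Injecting the two callables keeps the failure handling testable: a stub that raises on one batch shows that later batches still run.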

| Batch Size | OpenAI API Calls (1000 memories) | Approx Time | Cost (1000 memories) |
| --- | --- | --- | --- |
| 10 | 100 | ~5 minutes | $0.06 |
| 32 | 32 | ~2 minutes | $0.06 |
| 50 | 20 | ~1 minute | $0.06 |
| 100 | 10 | ~30 seconds | $0.06 |

Recommendations:

  • Default 32 balances API call overhead and failure blast radius
  • Use 100 for large migrations (>10K memories) with stable OpenAI access
  • Use 10 during testing or with rate-limited OpenAI keys
  • Monitor processed count to detect stalls mid-operation
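
The call counts in the table follow from a ceiling division of the memory count by the batch size. A short check (the function name `api_calls` is illustrative):

```python
def api_calls(memory_count, batch_size):
    """Number of embedding requests needed: ceiling of memory_count / batch_size."""
    return (memory_count + batch_size - 1) // batch_size


for size in (10, 32, 50, 100):
    print(size, api_calls(1000, size))
# 10 100
# 32 32
# 50 20
# 100 10
```

With a batch size of 32, the first 31 calls cover 992 memories and the final call carries the remaining 8. The cost column stays flat because the same content is embedded regardless of how it is batched.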

The operation continues even if individual batches fail:

| Error | Cause | Behavior |
| --- | --- | --- |
| OpenAI API rate limit | Exceeded quota | Retries with exponential backoff (handled by OpenAI SDK) |
| Missing memory content | Deleted between enumeration and fetch | Logged, skipped, processing continues |
| Qdrant connection failure | Network issue or Qdrant down | Logged, processing continues (graceful degradation); FalkorDB is not modified |
| Invalid content format | Null or non-string content | Logged, skipped |

All errors are logged with structured context:

logger.exception("Failed to generate embeddings for batch", extra={"batch_ids": ids})

POST /admin/sync

Authentication: API token + Admin token

Purpose: Perform non-destructive drift repair between FalkorDB and Qdrant. Detects and reconciles discrepancies without deleting data.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| batch_size | integer | No | Number of memories to process per batch. Default: 32 |
| dry_run | boolean | No | If true, report drift without making changes. Default: false |

curl -X POST https://your-automem-instance/admin/sync \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"batch_size": 32, "dry_run": false}'

Admin operations can:

  • Force expensive OpenAI API calls (re-embedding entire database)
  • Trigger resource-intensive enrichment reprocessing
  • Access operational metrics (enrichment statistics)

Without admin token protection, a compromised API token could:

  1. Generate thousands of dollars in OpenAI costs via repeated re-embedding
  2. Overload enrichment workers with duplicate jobs
  3. Enumerate all memory IDs via reprocess endpoint

| Practice | Rationale | Implementation |
| --- | --- | --- |
| Separate tokens | Limits blast radius of API token compromise | Use different values for AUTOMEM_API_TOKEN and ADMIN_API_TOKEN |
| Rotate periodically | Reduces window of exposure | Regenerate tokens monthly, update all clients |
| Restrict admin access | Minimize privilege escalation risk | Share admin token only with operations team |
| Use headers, not query params | Prevents token leakage in logs | Prefer Authorization: Bearer and X-Admin-Token headers |
| Monitor admin operations | Detect anomalous usage | Alert on high-frequency /admin/reembed calls |
| Audit admin calls | Forensic capability | Log admin operations with IP, timestamp, token hash |
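
For the audit practice, a deployment might log a short token fingerprint instead of the token itself, and compare tokens in constant time. The sketch below uses only the standard library; the function names and the record layout are illustrative assumptions, not AutoMem's logging format.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def token_fingerprint(token):
    """Short SHA-256 fingerprint that can be logged in place of the token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:12]


def tokens_match(provided, expected):
    """Compare tokens in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def audit_record(path, remote_ip, token):
    """Build a log entry with IP, timestamp, and token hash, but never the raw token."""
    return {
        "path": path,
        "ip": remote_ip,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "token_hash": token_fingerprint(token),
    }
```

A fingerprint lets operators tell which token made a call after a rotation, without the log becoming a credential store. This only holds when tokens are long random strings; a short or guessable token could be recovered from its hash.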

After spaCy model upgrades or enrichment logic changes:

# 1. Collect the IDs of memories to reprocess (this query returns at most 100)
MEMORY_IDS=$(curl -s "https://your-instance/recall?limit=100&query=*" \
-H "Authorization: Bearer $TOKEN" | jq -c '[.results[].memory.memory_id]')
# 2. Reprocess them
curl -X POST https://your-instance/enrichment/reprocess \
-H "Authorization: Bearer $TOKEN" \
-H "X-Admin-Token: $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"ids\": $MEMORY_IDS}"

When switching from text-embedding-3-small (1024-d, set by VECTOR_SIZE) to text-embedding-3-large (3072-d):

  1. Update VECTOR_SIZE environment variable
  2. Recreate the Qdrant collection with new dimensions
  3. Run /admin/reembed with batch_size=50 to regenerate all embeddings
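
Before step 3, it may be worth confirming that the configured dimension matches the target model, since vectors of the wrong size cannot be written to the recreated collection. A minimal pre-flight check, assuming VECTOR_SIZE is read from the environment with a default of 1024 as described for /admin/reembed (the function names are hypothetical):

```python
import os


def configured_vector_size(environ=os.environ):
    """Read VECTOR_SIZE, falling back to the documented default of 1024."""
    return int(environ.get("VECTOR_SIZE", "1024"))


def ready_for_model(expected_dim, environ=os.environ):
    """True when the configured dimension equals the target model's dimension."""
    return configured_vector_size(environ) == expected_dim
```

For a move to text-embedding-3-large, `ready_for_model(3072)` should return True before the re-embed is started.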

If Qdrant data is corrupted or lost but FalkorDB is intact:

curl -X POST https://your-automem-instance/admin/reembed \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "X-Admin-Token: YOUR_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{"batch_size": 50}'

See Operations / Health for complete recovery procedures.