- Add valid_at/invalid_at passthrough in triplet extraction prompt (both zh/en)
- Propagate temporal_validity to EntityEntityEdge in ExtractionOrchestrator
- Use coalesce() for valid_at/invalid_at in Neo4j cypher queries to handle NULLs
- Fix workspace_id/config_id UUID parsing in read_memory config resolution
- Downgrade verbose extraction pipeline logs from info to debug
- Remove UUID and short API key patterns from sensitive filter to reduce false positives
- Standardize log message format (use = spacing, end_user_id label)
- Fix misindented TODO comment in write_pipeline.py
- Rename StatementExtractionStep → StatementTemporalExtractionStep and
extract_statement.jinja2 → extract_statement_temporal.jinja2 to reflect
merged temporal extraction logic
- Move extraction_pipeline_orchestrator.py out of steps/ to engine root
- Move dedup_step.py into steps/ directory
- Introduce WriteMemoryRequest schema to replace positional args in write_memory()
- Extract _resolve_and_load_config, _preprocess_files, _write_neo4j, and
_invalidate_interest_cache as private helpers in MemoryAgentService
- Remove shadow pipeline and simplify NEW_PIPELINE_ENABLED branch
- Merge 类型归属/成员隶属/任职服务 relation types into single 归属身份关系 in triplet prompt
- Add alias merge logic (别名属于) in deduplication and MERGE_ALIAS_BELONGS_TO Cypher query
- Add StorageType, Language, MessageItem enums/models to memory_agent_schema
- Reduce AgentMemory_Long_Term.DEFAULT_SCOPE from 6 to 1
- Delete standalone extract_temporal.jinja2 (logic merged into statement step)
Introduce ExtractionStep abstraction with modular pipeline stages:
- Add base ExtractionStep class with render/call/parse lifecycle
- Implement StatementExtractionStep, TripletExtractionStep,
EmbeddingStep, EmotionStep, GraphBuildStep, and DedupStep
- Add SidecarStepFactory for hot-pluggable non-critical steps
- Define Pydantic I/O schemas for all pipeline stages
- Refactor WritePipeline to orchestrate new step-based flow
- Add NEW_PIPELINE_ENABLED env switch for old/new pipeline routing
- Add emotion_enabled config flag to MemoryConfig
- Fix workspace_id reference in get_end_user_connected_config
Introduce a layered pipeline architecture for the memory write flow:
- WritePipeline: orchestrates preprocess → extract → store → cluster → summarize
with deadlock retry, resource cleanup, and pilot-run support
- MemoryService: facade that delegates to WritePipeline, placeholder methods
for read/forget/reflect
- BearLogger: structured step-level logging with perf threshold alerts
- Shadow pipeline integration in MemoryAgentService (env-gated pilot run)
Also includes:
- Fix deprecated SQLAlchemy declarative_base import
- Extend Neo4j Entity fulltext index to cover description and aliases
- Migrate Pydantic schemas to v2 (ConfigDict, field_validator)
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
- Fix write_router to use actual_end_user_id instead of end_user_id
- Add task status tracking via Redis in scheduler
- Expose task_id in memory write response
- Fix logging import path in scheduler
- Replace direct memory agent service calls with unified MemoryService in read endpoint
- Update query preprocessor to use new prompt format and return structured queries
- Enhance MemorySearchResult model with filtering, merging, and ID tracking capabilities
- Add intermediate outputs display for problem split, perceptual retrieval, and search results
- Fix parameter alignment and remove unused history parameter in memory agent service
- Consolidate memory search services by removing separate content_search.py and perceptual_search.py
- Update model client handling in base_pipeline.py to use ModelApiKeyService for LLM client initialization
- Add new prompt files and modify existing services to support consolidated search architecture
- Refactor memory read pipeline and related services to use updated model client approach
- Use Literal['set', 'remove'] for MetadataFieldChange.action instead of str
- Simplify field_path description to reflect current schema
- Remove redundant isinstance check in extract_user_metadata_task
- Replace UserMetadata full-object overwrite with incremental MetadataFieldChange
operations (set/remove per field path)
- Convert profile.role and profile.domain from scalar strings to lists
- Remove UserMetadataBehavioralHints and knowledge_tags fields
- Update Jinja2 prompt to instruct LLM to output incremental changes
- Update extract_user_metadata_task to apply changes via deep-copy and
per-field mutation for proper SQLAlchemy change detection
- Minor lint: remove unnecessary f-string prefixes in tasks.py
- Consolidate memory search services by removing separate content_search.py and perceptual_search.py
- Update model client handling in base_pipeline.py to use ModelApiKeyService for LLM client initialization
- Add new prompt files and modify existing services to support consolidated search architecture
- Refactor memory read pipeline and related services to use updated model client approach
- Replace hardcoded user placeholder name lists in write_tools and
user_memory_service with shared _USER_PLACEHOLDER_NAMES constant
- Filter user placeholder names during alias merging in _merge_attribute
to prevent cross-role alias contamination on non-user entities
- Use toLower() in Cypher query for case-insensitive name matching
- Change PgSQL->Neo4j alias sync condition from 'if pg_aliases' to
'if info is not None' so empty aliases correctly clear stale data
- Skip alias merging for user entities during dedup (_merge_attribute and
_merge_entities_with_aliases) to prevent dirty data from overwriting
PgSQL authoritative aliases
- Add PgSQL→Neo4j alias sync after Neo4j write in write_tools to
ensure Neo4j user entities always reflect the PgSQL source
- Remove deduped_aliases (Neo4j history) from alias sync in
extraction_orchestrator, only append newly extracted aliases to PgSQL
- Guard Neo4j MERGE cypher to preserve existing aliases for user
entities (name IN ['用户','我','User','I'])
- Fix emotion_analytics_service query to use ExtractedEntity label
and entity_type property
- Replace storage_services/search with new read_services/memory_search structure
- Implement content_search and perceptual_search strategies
- Add query_preprocessor for search optimization
- Create memory_service as unified interface
- Update celery_app and graph_search for new architecture
- Add enums for memory operations
- Implement base_pipeline and memory_read pipeline patterns
- Make MetadataExtractor language param optional (default None) to
support auto-detection fallback when no language is explicitly set
- Refactor clean_metadata from walrus-operator dict comprehension to
explicit loop for correctness and readability
- Remove _replace_first_person_with_user from StatementExtractor to preserve
original user text for downstream metadata/alias extraction
- Delete metadata_utils.py module, inline clean_metadata into Celery task
- Remove unused imports and commented-out collect_user_raw_messages method
- Apply formatting cleanup across metadata models and extraction orchestrator
- Merge alias add/remove into MetadataExtractionResponse and Celery metadata task,
removing the separate sync step from extraction_orchestrator
- Replace first-person pronouns ("我") with "用户" in statement extraction to
preserve identity semantics for downstream metadata/alias extraction
- Update extract_statement.jinja2 prompt to enforce "用户" as subject for user
statements instead of resolving to real names
- Add alias change instructions (aliases_to_add/aliases_to_remove) to
extract_user_metadata.jinja2 with incremental merge logic
- Deduplicate special entities ("用户", "AI助手") in graph_saver by reusing
existing Neo4j node IDs per end_user_id
- Sync final aliases from PgSQL to Neo4j user entity nodes after metadata write
- Remove merge_metadata and its helper functions from metadata_utils.py
- Pass existing_metadata to MetadataExtractor.extract_metadata() as LLM context
- Add merge instructions to extract_user_metadata.jinja2 prompt (zh/en)
- Update Celery task to read existing metadata before extraction and overwrite
- Simplify field descriptions in UserMetadataProfile model
- Add _update_timestamps helper to track changed fields
- Replace version-based optimistic locking and retry loop with apoc.atomic.add/insert for concurrent safety
- Merge duplicate accesses within a batch before updating (access_count_delta)
- Simplify _calculate_update to only compute on new timestamps instead of full history rebuild
- Remove max_retries instance variable (kept as param for backward compat)
- Trim verbose docstrings and inline comments
- Increase max_retries from 3 to 5 for concurrent conflict recovery
- Add randomized exponential backoff between retries to reduce contention
- Merge duplicate node accesses in batch operations to avoid self-conflicts
- Support access_times parameter for merged batch access counting
- Add Community node label support in atomic update content field map
- Strengthen anti-hallucination rules in extract_triplet prompt to
enforce verbatim-only alias extraction, removing suggestive examples
- Add _extract_deduped_entity_aliases to sync historical aliases from
Neo4j two-stage dedup into PgSQL end_user_info
- Remove unused _fetch_neo4j_user_aliases; reuse injected connector
instead of instantiating new Neo4jConnector
- Simplify _would_merge_cross_role and reuse clean_cross_role_aliases
in _normalize_special_entity_names
- Reuse _USER_PLACEHOLDER_NAMES from dedup module to avoid duplication
- Extract fetch_neo4j_assistant_aliases() into deduped_and_disamb.py as
single source of truth, replacing inline Cypher in write_tools and
extraction_orchestrator
- Normalize USER_PLACEHOLDER_NAMES to lowercase and apply .lower() on
all comparisons to prevent case-variant names leaking into aliases
- Extract user aliases from raw dialog statements instead of post-dedup
entities to bypass merge pollution
- Add alias cross-cleaning step in _normalize_special_entity_names to
strip AI assistant aliases from user entities before dedup
- Call clean_cross_role_aliases after second-layer dedup to handle
historical dirty data merged from Neo4j
- Fix syntax error in prompt_utils.py (ontology_types variable assignment)
- Add speaker context to triplet extraction prompt to distinguish alias ownership
- Add explicit examples and rules in extract_triplet.jinja2 for user vs AI alias attribution
- Introduce cross-role merge protection in dedup (accurate, fuzzy, and LLM stages)
- Normalize special entity names (用户/AI助手) before deduplication
- Add clean_cross_role_aliases() to sanitize aliases before Neo4j write
- Refactor _update_end_user_other_name to merge aliases from PgSQL instead of Neo4j
- Filter AI assistant aliases from user alias extraction in orchestrator
- Create new memory_config_api_controller.py for dedicated memory configuration management
- Add /end_user/info GET endpoint to retrieve end user information (aliases, metadata)
- Add /end_user/info/update POST endpoint to update end user details
- Move /memory/configs endpoint from memory_api_controller to memory_config_api_controller
- Extract _get_current_user helper function to build user context from API key auth
- Support optional app_id parameter in end user creation with UUID validation
- Update service controller imports with alphabetical ordering and multi-line formatting
- Register memory_config_api_controller router in service module initialization
- Refactor memory_api_controller imports for consistency and clarity