Commit Graph

234 Commits

Author SHA1 Message Date
lanceyq
b0ddd12cc6 feat(memory): add emotion batch extraction task and improve extraction prompts
- Add extract_emotion_batch_task for async emotion extraction
- Refine Chinese entity types and relation types in extraction prompts
- Add STATEMENT_EMOTION_UPDATE Cypher query for Neo4j backfill
- Refactor statement_step and triplet_step implementations
2026-05-08 11:26:04 +08:00
lanceyq
a98011fc8a feat(memory): implement step-based extraction pipeline architecture
Introduce ExtractionStep abstraction with modular pipeline stages:
- Add base ExtractionStep class with render/call/parse lifecycle
- Implement StatementExtractionStep, TripletExtractionStep,
  EmbeddingStep, EmotionStep, GraphBuildStep, and DedupStep
- Add SidecarStepFactory for hot-pluggable non-critical steps
- Define Pydantic I/O schemas for all pipeline stages
- Refactor WritePipeline to orchestrate new step-based flow
- Add NEW_PIPELINE_ENABLED env switch for old/new pipeline routing
- Add emotion_enabled config flag to MemoryConfig
- Fix workspace_id reference in get_end_user_connected_config
2026-05-08 11:26:04 +08:00
lanceyq
41535c34e6 feat(memory): add WritePipeline and MemoryService facade
Introduce a layered pipeline architecture for the memory write flow:
- WritePipeline: orchestrates preprocess → extract → store → cluster → summarize
  with deadlock retry, resource cleanup, and pilot-run support
- MemoryService: facade that delegates to WritePipeline, placeholder methods
  for read/forget/reflect
- BearLogger: structured step-level logging with perf threshold alerts
- Shadow pipeline integration in MemoryAgentService (env-gated pilot run)

Also includes:
- Fix deprecated SQLAlchemy declarative_base import
- Extend Neo4j Entity fulltext index to cover description and aliases
- Migrate Pydantic schemas to v2 (ConfigDict, field_validator)
2026-05-08 11:26:04 +08:00
Eternity
cd34d5f5ce fix(api): convert config_id to string in write_router 2026-04-24 20:13:46 +08:00
Eternity
b6e27da7b0 fix(api): convert end_user_id to string in write_router 2026-04-24 19:56:55 +08:00
Eternity
422af69904 fix(api): correct import paths in memory_read and celery task command
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
2026-04-24 19:09:18 +08:00
Eternity
8dee2eae6a fix(api): correct import paths in memory_read and celery task command
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
2026-04-24 18:50:58 +08:00
Eternity
caef0fe44e fix(api): correct import paths in memory_read and celery task command
- Fix relative imports in memory_read.py to use absolute app paths
- Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`
2026-04-24 18:36:27 +08:00
Eternity
be10bab763 refactor(core): migrate task scheduler to per-user queue with dynamic sharding 2026-04-24 14:21:18 +08:00
Eternity
f93ec8d609 fix(core): fix end_user_id reference and add task status tracking
- Fix write_router to use actual_end_user_id instead of end_user_id
- Add task status tracking via Redis in scheduler
- Expose task_id in memory write response
- Fix logging import path in scheduler
2026-04-22 18:06:14 +08:00
Eternity
c5ae82c3c2 refactor(core): migrate memory write tasks to centralized scheduler 2026-04-22 16:50:06 +08:00
Eternity
dc3207b1d3 Merge branch 'develop' into refactor/memory_search
# Conflicts:
#	api/app/core/memory/storage_services/search/__init__.py
2026-04-20 18:07:07 +08:00
Eternity
688503a1ca refactor(memory): integrate unified memory service into agent controller
- Replace direct memory agent service calls with unified MemoryService in read endpoint
- Update query preprocessor to use new prompt format and return structured queries
- Enhance MemorySearchResult model with filtering, merging, and ID tracking capabilities
- Add intermediate outputs display for problem split, perceptual retrieval, and search results
- Fix parameter alignment and remove unused history parameter in memory agent service
2026-04-20 17:43:52 +08:00
Ke Sun
d4129edcf5 Merge pull request #923 from SuanmoSuanyangTechnology/feat/enduser-info-apikey
feat(memory): add V1 memory config management endpoints and memory read/write API
2026-04-17 21:03:10 +08:00
Eternity
749cf79581 refactor(memory): consolidate memory search services and update model client handling
- Consolidate memory search services by removing separate content_search.py and perceptual_search.py
- Update model client handling in base_pipeline.py to use ModelApiKeyService for LLM client initialization
- Add new prompt files and modify existing services to support consolidated search architecture
- Refactor memory read pipeline and related services to use updated model client approach
2026-04-17 10:35:45 +08:00
miao
0dd8cc5d43 Merge remote-tracking branch 'origin/develop' into feat/enduser-info-apikey 2026-04-17 10:21:26 +08:00
lanceyq
643f69bb90 refactor(memory): tighten metadata field types and clean up descriptions
- Use Literal['set', 'remove'] for MetadataFieldChange.action instead of str
- Simplify field_path description to reflect current schema
- Remove redundant isinstance check in extract_user_metadata_task
2026-04-16 17:29:00 +08:00
lanceyq
73fbc19747 refactor(memory): switch metadata extraction from full-replace to incremental changes
- Replace UserMetadata full-object overwrite with incremental MetadataFieldChange
  operations (set/remove per field path)
- Convert profile.role and profile.domain from scalar strings to lists
- Remove UserMetadataBehavioralHints and knowledge_tags fields
- Update Jinja2 prompt to instruct LLM to output incremental changes
- Update extract_user_metadata_task to apply changes via deep-copy and
  per-field mutation for proper SQLAlchemy change detection
- Minor lint: remove unnecessary f-string prefixes in tasks.py
2026-04-16 17:14:30 +08:00
Eternity
a01525e239 refactor(memory): consolidate memory search services and update model client handling
- Consolidate memory search services by removing separate content_search.py and perceptual_search.py
- Update model client handling in base_pipeline.py to use ModelApiKeyService for LLM client initialization
- Add new prompt files and modify existing services to support consolidated search architecture
- Refactor memory read pipeline and related services to use updated model client approach
2026-04-16 13:43:38 +08:00
Eternity
2716a55c7f feat(memory): implement quick search pipeline with Neo4j integration 2026-04-15 12:18:23 +08:00
lanceyq
49e0801d15 refactor(memory): unify user placeholder names and harden alias sync logic
- Replace hardcoded user placeholder name lists in write_tools and
  user_memory_service with shared _USER_PLACEHOLDER_NAMES constant
- Filter user placeholder names during alias merging in _merge_attribute
  to prevent cross-role alias contamination on non-user entities
- Use toLower() in Cypher query for case-insensitive name matching
- Change PgSQL->Neo4j alias sync condition from 'if pg_aliases' to
  'if info is not None' so empty aliases correctly clear stale data
2026-04-14 18:06:56 +08:00
lanceyq
811193dd75 fix(memory): make PgSQL the single source of truth for user entity aliases
- Skip alias merging for user entities during dedup (_merge_attribute and
  _merge_entities_with_aliases) to prevent dirty data from overwriting
  PgSQL authoritative aliases
- Add PgSQL→Neo4j alias sync after Neo4j write in write_tools to
  ensure Neo4j user entities always reflect the PgSQL source
- Remove deduped_aliases (Neo4j history) from alias sync in
  extraction_orchestrator, only append newly extracted aliases to PgSQL
- Guard Neo4j MERGE cypher to preserve existing aliases for user
  entities (name IN ['用户','我','User','I'])
- Fix emotion_analytics_service query to use ExtractedEntity label
  and entity_type property
2026-04-14 17:28:24 +08:00
Eternity
dca3173ed9 refactor(memory): restructure memory search architecture
- Replace storage_services/search with new read_services/memory_search structure
- Implement content_search and perceptual_search strategies
- Add query_preprocessor for search optimization
- Create memory_service as unified interface
- Update celery_app and graph_search for new architecture
- Add enums for memory operations
- Implement base_pipeline and memory_read pipeline patterns
2026-04-13 14:03:47 +08:00
lanceyq
cd018814fe fix(memory): improve metadata language detection and clean_metadata logic
- Make MetadataExtractor language param optional (default None) to
  support auto-detection fallback when no language is explicitly set
- Refactor clean_metadata from walrus-operator dict comprehension to
  explicit loop for correctness and readability
2026-04-10 00:42:11 +08:00
lanceyq
e0b7e95af6 refactor(memory): remove first-person pronoun replacement and inline metadata utils
- Remove _replace_first_person_with_user from StatementExtractor to preserve
  original user text for downstream metadata/alias extraction
- Delete metadata_utils.py module, inline clean_metadata into Celery task
- Remove unused imports and commented-out collect_user_raw_messages method
- Apply formatting cleanup across metadata models and extraction orchestrator
2026-04-10 00:29:18 +08:00
lanceyq
15a863b41a feat(memory): unify alias extraction into metadata pipeline and deduplicate user entity nodes
- Merge alias add/remove into MetadataExtractionResponse and Celery metadata task,
  removing the separate sync step from extraction_orchestrator
- Replace first-person pronouns ("我") with "用户" in statement extraction to
  preserve identity semantics for downstream metadata/alias extraction
- Update extract_statement.jinja2 prompt to enforce "用户" as subject for user
  statements instead of resolving to real names
- Add alias change instructions (aliases_to_add/aliases_to_remove) to
  extract_user_metadata.jinja2 with incremental merge logic
- Deduplicate special entities ("用户", "AI助手") in graph_saver by reusing
  existing Neo4j node IDs per end_user_id
- Sync final aliases from PgSQL to Neo4j user entity nodes after metadata write
2026-04-09 21:55:59 +08:00
lanceyq
e0546e01ef refactor(memory): delegate metadata merging to LLM instead of code-based merge
- Remove merge_metadata and its helper functions from metadata_utils.py
- Pass existing_metadata to MetadataExtractor.extract_metadata() as LLM context
- Add merge instructions to extract_user_metadata.jinja2 prompt (zh/en)
- Update Celery task to read existing metadata before extraction and overwrite
- Simplify field descriptions in UserMetadataProfile model
- Add _update_timestamps helper to track changed fields
2026-04-09 15:10:29 +08:00
lanceyq
f2d7479229 feat(memory): add async user metadata extraction pipeline
- Add MetadataExtractor to collect user-related statements post-dedup
  and extract profile/behavioral metadata via independent LLM call
- Add Celery task (extract_user_metadata) routed to memory_tasks queue
- Add metadata models (UserMetadata, UserMetadataProfile, etc.)
- Add metadata utility functions (clean, validate, merge with _op support)
- Add Jinja2 prompt template for metadata extraction (zh/en)
- Fix Lucene query parameter naming: rename `q` to `query` across all
  Cypher queries, graph_search functions, and callers
- Escape `/` in Lucene queries to prevent TokenMgrError
- Add `speaker` field to ChunkNode and persist it in Neo4j
- Remove unused imports (argparse, os, UUID) in search.py
- Fix unnecessary db context nesting in interest distribution task
2026-04-09 11:01:56 +08:00
Ke Sun
cfbf83f71e Merge pull request #787 from SuanmoSuanyangTechnology/fix/atomic-update
fix(memory): improve optimistic lock resilience in access history man…
2026-04-07 10:57:20 +08:00
lanceyq
99862db7a0 refactor(forgetting-engine): replace optimistic locking with APOC atomic operations in access history manager
- Replace version-based optimistic locking and retry loop with apoc.atomic.add/insert for concurrent safety
- Merge duplicate accesses within a batch before updating (access_count_delta)
- Simplify _calculate_update to only compute on new timestamps instead of full history rebuild
- Remove max_retries instance variable (kept as param for backward compat)
- Trim verbose docstrings and inline comments
2026-04-03 18:40:03 +08:00
lanceyq
00a8099857 changes:(api) Change the "jitter" to "tremble". 2026-04-03 16:55:53 +08:00
lanceyq
117e29fbe3 fix(memory): improve optimistic lock resilience in access history manager
- Increase max_retries from 3 to 5 for concurrent conflict recovery
- Add randomized exponential backoff between retries to reduce contention
- Merge duplicate node accesses in batch operations to avoid self-conflicts
- Support access_times parameter for merged batch access counting
- Add Community node label support in atomic update content field map
2026-04-03 16:46:09 +08:00
Ke Sun
bc5ea2d421 Merge pull request #784 from SuanmoSuanyangTechnology/fix/aliases-extract
feat(memory): prevent cross-role alias contamination between user and…
2026-04-03 15:26:31 +08:00
lanceyq
c4ff1a325b refactor(memory): harden alias extraction and sync PgSQL with Neo4j deduped aliases
- Strengthen anti-hallucination rules in extract_triplet prompt to
  enforce verbatim-only alias extraction, removing suggestive examples
- Add _extract_deduped_entity_aliases to sync historical aliases from
  Neo4j two-stage dedup into PgSQL end_user_info
- Remove unused _fetch_neo4j_user_aliases; reuse injected connector
  instead of instantiating new Neo4jConnector
- Simplify _would_merge_cross_role and reuse clean_cross_role_aliases
  in _normalize_special_entity_names
- Reuse _USER_PLACEHOLDER_NAMES from dedup module to avoid duplication
2026-04-03 14:38:55 +08:00
lanceyq
15b3ce3dd5 refactor(memory): deduplicate assistant alias query and fix case-sensitive placeholder matching
- Extract fetch_neo4j_assistant_aliases() into deduped_and_disamb.py as
  single source of truth, replacing inline Cypher in write_tools and
  extraction_orchestrator
- Normalize USER_PLACEHOLDER_NAMES to lowercase and apply .lower() on
  all comparisons to prevent case-variant names leaking into aliases
2026-04-03 13:15:57 +08:00
lanceyq
9cc19047b4 fix(memory): prevent cross-role alias contamination in entity dedup
- Extract user aliases from raw dialog statements instead of post-dedup
  entities to bypass merge pollution
- Add alias cross-cleaning step in _normalize_special_entity_names to
  strip AI assistant aliases from user entities before dedup
- Call clean_cross_role_aliases after second-layer dedup to handle
  historical dirty data merged from Neo4j
- Fix syntax error in prompt_utils.py (ontology_types variable assignment)
2026-04-03 12:34:04 +08:00
lanceyq
7890970a39 feat(memory): prevent cross-role alias contamination between user and AI entities
- Add speaker context to triplet extraction prompt to distinguish alias ownership
- Add explicit examples and rules in extract_triplet.jinja2 for user vs AI alias attribution
- Introduce cross-role merge protection in dedup (accurate, fuzzy, and LLM stages)
- Normalize special entity names (用户/AI助手) before deduplication
- Add clean_cross_role_aliases() to sanitize aliases before Neo4j write
- Refactor _update_end_user_other_name to merge aliases from PgSQL instead of Neo4j
- Filter AI assistant aliases from user alias extraction in orchestrator
2026-04-03 10:57:30 +08:00
Eternity
9cbe9d5edc feat(memory): add perceptual memory retrieval service with BM25+embedding fusion 2026-04-01 18:03:07 +08:00
Timebomb2018
264183cec2 feat(models): support reasoning_content streaming 2026-04-01 15:47:43 +08:00
Ke Sun
7ce29019f7 feat(memory): Add memory config API controller and end user info endpoints
- Create new memory_config_api_controller.py for dedicated memory configuration management
- Add /end_user/info GET endpoint to retrieve end user information (aliases, metadata)
- Add /end_user/info/update POST endpoint to update end user details
- Move /memory/configs endpoint from memory_api_controller to memory_config_api_controller
- Extract _get_current_user helper function to build user context from API key auth
- Support optional app_id parameter in end user creation with UUID validation
- Update service controller imports with alphabetical ordering and multi-line formatting
- Register memory_config_api_controller router in service module initialization
- Refactor memory_api_controller imports for consistency and clarity
2026-04-01 15:06:26 +08:00
Ke Sun
3ea42ac27f Merge remote-tracking branch 'origin/release/v0.2.9' into develop 2026-03-31 19:16:13 +08:00
lanceyq
c90b58bbcd [fix] The "write_tools" module actively shuts down the client, and it closes before the task event loop is completed. 2026-03-30 18:19:50 +08:00
lanceyq
052c7c19b3 [fix] Avoid unnecessary index creation costs 2026-03-30 17:44:02 +08:00
lanceyq
e15af5a2ba [fix] Create a complete index 2026-03-30 17:44:02 +08:00
Ke Sun
ea8db7cd90 Merge pull request #728 from SuanmoSuanyangTechnology/fix/aliases
[fix] Keep the placeholder names "用户" and "我" out of "other_name"
2026-03-30 17:26:22 +08:00
Eternity
8dd24533bf fix(memory,task): add Redis fair lock for ordered memory writes 2026-03-30 17:20:54 +08:00
lanceyq
dae7431075 [fix] Keep the placeholder names "用户" and "我" out of "other_name" 2026-03-30 15:39:53 +08:00
Eternity
7acb7045f0 feat(agent, memory): add agent-perceived memory writing 2026-03-30 13:39:49 +08:00
lanceyq
5c11da6a2e [changes] Semantic pruning now lets files pass through 2026-03-27 19:25:17 +08:00
lanceyq
289b1989e5 [changes] Semantic pruning now lets files pass through 2026-03-27 19:13:38 +08:00