MemoryBear

Author	SHA1	Message	Date
lanceyq	6419dcd932	[commit] Refactor write pipeline	2026-05-08 11:28:24 +08:00
lanceyq	9dc9b7aee7	refactor(memory): remove legacy extraction pipeline and add dialog_at temporal grounding - Delete ExtractionOrchestrator (~2500 lines) and write_tools legacy path; MemoryService/WritePipeline is now the sole write path - Remove NEW_PIPELINE_ENABLED feature flag from memory_agent_service - Simplify pilot_run_service to always use PilotWritePipeline - Add dialog_at field to statement and triplet extraction prompts as the primary reference time for resolving relative temporal expressions - Rewrite relative time phrases (e.g. 昨天 "yesterday", 下周 "next week") into concrete dates directly in statement_text when stably resolvable from dialog_at - Rename extracat_Pruning.jinja2 to extracat_pruning.jinja2; expand few-shot examples and update memory type enum (drop NULL, add agreement/repetition/other)	2026-05-08 11:28:24 +08:00
lanceyq	cf389bb978	refactor(memory): remove expired_at field and add dialog_at timestamp Remove the deprecated expired_at field from all graph models, Neo4j Cypher queries, repositories, and pipeline code. Replace with dialog_at on StatementNode to track the original dialog timestamp. - Strip expired_at from DialogueNode, ChunkNode, StatementNode, ExtractedEntityNode, edges, and all Cypher queries - Add dialog_at to MessageItem schema and propagate through extraction and graph build steps - Extract emotion/metadata async submission from WritePipeline into a generic _submit_celery_task helper - Add post_store_dedup_and_alias_merge Celery task for async alias merging and second-layer dedup after Neo4j write - Switch pytest async backend from anyio to asyncio_mode=auto	2026-05-08 11:27:59 +08:00
lanceyq	d66d601e41	refactor(memory): redesign metadata extraction as async pipeline step - Replace extract_user_metadata_task with entity-level extract_metadata_batch_task - Add MetadataExtractionStep following ExtractionStep pattern with Jinja2 prompts - Flatten MetadataExtractionResponse to 9-field schema (aliases, core_facts, etc.) - Add Cypher queries for incremental metadata writeback and alias edge redirection - Wire _extract_metadata into WritePipeline as Step 3.6 (fire-and-forget) - Add pilot_write() to MemoryService; refactor pilot_run_service to use it - Extract snapshot logic into WriteSnapshotRecorder	2026-05-08 11:27:51 +08:00
lanceyq	4af9b02815	feat(memory): propagate temporal validity fields through extraction pipeline - Add valid_at/invalid_at passthrough in triplet extraction prompt (both zh/en) - Propagate temporal_validity to EntityEntityEdge in ExtractionOrchestrator - Use coalesce() for valid_at/invalid_at in Neo4j cypher queries to handle NULLs - Fix workspace_id/config_id UUID parsing in read_memory config resolution - Downgrade verbose extraction pipeline logs from info to debug - Remove UUID and short API key patterns from sensitive filter to reduce false positives - Standardize log message format (use = spacing, end_user_id label) - Fix misindented TODO comment in write_pipeline.py	2026-05-08 11:26:24 +08:00
lanceyq	1f0c88a5f0	refactor(memory): consolidate write pipeline and rename statement extraction step - Rename StatementExtractionStep → StatementTemporalExtractionStep and extract_statement.jinja2 → extract_statement_temporal.jinja2 to reflect merged temporal extraction logic - Move extraction_pipeline_orchestrator.py out of steps/ to engine root - Move dedup_step.py into steps/ directory - Introduce WriteMemoryRequest schema to replace positional args in write_memory() - Extract _resolve_and_load_config, _preprocess_files, _write_neo4j, and _invalidate_interest_cache as private helpers in MemoryAgentService - Remove shadow pipeline and simplify NEW_PIPELINE_ENABLED branch - Merge 类型归属/成员隶属/任职服务 relation types into single 归属身份关系 in triplet prompt - Add alias merge logic (别名属于) in deduplication and MERGE_ALIAS_BELONGS_TO Cypher query - Add StorageType, Language, MessageItem enums/models to memory_agent_schema - Reduce AgentMemory_Long_Term.DEFAULT_SCOPE from 6 to 1 - Delete standalone extract_temporal.jinja2 (logic merged into statement step)	2026-05-08 11:26:24 +08:00
lanceyq	7747ed7ac1	refactor(memory): enhance extraction ontology and add assistant pruning graph support - Expand entity type ontology with detailed definitions, examples, and notes (merged types: 地点设施, 物品设备, 产品服务, 软件平台, 角色职业, 知识能力, 偏好习惯目标, 称呼别名, 智能体) - Add relation ontology taxonomy with 15 predicate categories and usage rules - Strengthen reference resolution rules: resolve pronouns before extraction, skip unresolvable references entirely - Add guidelines to avoid extracting abstract propositions, emotions, and low-value entities (effort/reward/success patterns) - Add 7 new extraction examples covering edge cases - Add AssistantOriginal/AssistantPruned node models and graph persistence (PRUNED_TO and BELONGS_TO_DIALOG edges, Neo4j indexes and constraints) - Add graph_build_step.py for building graph nodes/edges from DialogData - Update write_pipeline.py to pass assistant pruning nodes/edges to graph saver - Update data_pruning.py with related preprocessing changes	2026-05-08 11:26:24 +08:00
lanceyq	2355536b44	refactor(memory): add PilotWritePipeline and enrich extraction schema - Add dedicated PilotWritePipeline (statement → triplet → graph_build → layer-1 dedup, no Neo4j write) - Add type_description/predicate_description fields across entity and triplet models, Cypher queries, and graph builders - Refactor data_pruning with LRU cache and snapshot support; skip assistant chunks in extraction - Remove strict Predicate enum whitelist; support statement_text alias in legacy extractor - Wire PipelineSnapshot through preprocessing and emotion extraction for debug tracing - Add PILOT_RUN_USE_REFACTORED_PIPELINE env toggle for pipeline selection	2026-05-08 11:26:04 +08:00
lanceyq	b0ddd12cc6	feat(memory): add emotion batch extraction task and improve extraction prompts - Add extract_emotion_batch_task for async emotion extraction - Refine Chinese entity types and relation types in extraction prompts - Add STATEMENT_EMOTION_UPDATE Cypher query for Neo4j backfill - Refactor statement_step and triplet_step implementations	2026-05-08 11:26:04 +08:00
lanceyq	a98011fc8a	feat(memory): implement step-based extraction pipeline architecture Introduce ExtractionStep abstraction with modular pipeline stages: - Add base ExtractionStep class with render/call/parse lifecycle - Implement StatementExtractionStep, TripletExtractionStep, EmbeddingStep, EmotionStep, GraphBuildStep, and DedupStep - Add SidecarStepFactory for hot-pluggable non-critical steps - Define Pydantic I/O schemas for all pipeline stages - Refactor WritePipeline to orchestrate new step-based flow - Add NEW_PIPELINE_ENABLED env switch for old/new pipeline routing - Add emotion_enabled config flag to MemoryConfig - Fix workspace_id reference in get_end_user_connected_config	2026-05-08 11:26:04 +08:00
lanceyq	41535c34e6	feat(memory): add WritePipeline and MemoryService facade Introduce a layered pipeline architecture for the memory write flow: - WritePipeline: orchestrates preprocess → extract → store → cluster → summarize with deadlock retry, resource cleanup, and pilot-run support - MemoryService: facade that delegates to WritePipeline, placeholder methods for read/forget/reflect - BearLogger: structured step-level logging with perf threshold alerts - Shadow pipeline integration in MemoryAgentService (env-gated pilot run) Also includes: - Fix deprecated SQLAlchemy declarative_base import - Extend Neo4j Entity fulltext index to cover description and aliases - Migrate Pydantic schemas to v2 (ConfigDict, field_validator)	2026-05-08 11:26:04 +08:00
Eternity	e38a60e107	feat(core): add configurable SANDBOX_URL for code node sandbox requests	2026-04-29 20:24:10 +08:00
Timebomb2018	6f10296969	fix(workspace): deactivate user when removed from last active workspace	2026-04-28 18:34:06 +08:00
Timebomb2018	28694fefb0	fix(app): adjust thinking budget tokens default and validation range The default thinking budget tokens value was changed from 10000 to 1024 in base.py, and the minimum validation constraint was updated from 1024 to 1 in app_schema.py to allow smaller budgets while maintaining backward compatibility.	2026-04-28 16:10:44 +08:00
Timebomb2018	d3058ce379	fix(workspace): make delete workspace member async and invalidate user tokens	2026-04-28 15:04:13 +08:00
Ke Sun	7621321d1b	Revert "refactor(memory): replace raw dict responses with Pydantic schema mod…"	2026-04-27 18:50:26 +08:00
Ke Sun	0e29b0b2a5	Merge pull request #1016 from SuanmoSuanyangTechnology/feat/episodic-memory-detail-and-pagination refactor(memory): replace raw dict responses with Pydantic schema mod…	2026-04-27 18:43:53 +08:00
lanceyq	2fa4d29548	fix(memory): use explicit None checks and remove unnecessary Optional type - Replace truthiness checks with 'is not None' for data.message in graph_data and community_graph endpoints to handle empty string correctly - Remove Optional wrapper from GraphStatistics.edge_types since it already has a default_factory	2026-04-27 18:39:33 +08:00
lanceyq	9a5ce7f7c6	refactor(memory): replace raw dict responses with Pydantic schema models in user memory controllers - Add user_memory_schema.py with typed Pydantic models for all user memory API responses: MemoryInsightReportData, UserSummaryData, GraphData, MemoryTypeStatItem, cache result models, and RelationshipEvolutionData - Refactor user_memory_controllers.py to construct schema instances and return model_dump() instead of raw dicts - Remove unused imports (datetime, timestamp_to_datetime, EndUserInfoResponse, EndUserInfoCreate, EndUser)	2026-04-27 17:57:06 +08:00
Timebomb2018	531d785629	fix(multimodal): support HTML image tags in document extraction and chat responses - Replace plain image URLs with `<img src="..." data-url="...">` HTML tags in multimodal and document extractor services - Propagate citations from workflow end events to client responses - Update system prompts to instruct LLMs to render images using Markdown `![alt](url)` with strict UUID-preserving URL copying	2026-04-27 17:56:58 +08:00
山程漫悟	ce4a3daec7	Merge pull request #1012 from SuanmoSuanyangTechnology/fix/wxy-032 feat(workflow): augment logging queries and ameliorate error handling	2026-04-27 16:00:49 +08:00
Timebomb2018	98d8d7b261	fix(conversation_schema): refine citations field type to Dict[str, Any]	2026-04-27 15:49:21 +08:00
Timebomb2018	12a08a487d	fix(tool_controller): re-raise HTTPException to preserve original status codes	2026-04-27 15:47:34 +08:00
Timebomb2018	f7fa33c0c4	Merge remote-tracking branch 'origin/release/v0.3.2' into fix/Timebomb_032	2026-04-27 15:36:03 +08:00
Timebomb2018	faf8d1a51a	fix(workflow): add reasoning content, suggested questions, citations and audio status support - Introduce `reasoning_content`, `suggested_questions`, `citations`, and `audio_status` fields in conversation and app response schemas - Conditionally set `audio_status` to `"pending"` only when `audio_url` is present - Replace `model_dump` override with `@model_serializer(mode="wrap")` for cleaner serialization logic - Change knowledge base validation failure from `RuntimeError` to warning + `continue` to avoid halting retrieval on invalid KB	2026-04-27 15:35:26 +08:00
wxy	adb7f873b5	Merge remote-tracking branch 'origin/fix/wxy-032' into fix/wxy-032	2026-04-27 15:29:54 +08:00
wxy	b64bcc2c50	feat(workflow): augment logging queries and ameliorate error handling - Augment log search with app type filtering to enable keyword searching within workflow_executions. - Introduce execution sequence markers to ensure logs are displayed in the correct chronological order. - Ameliorate error handling to capture successful node outputs alongside failure details. - Rectify the processing of empty JSON bodies in HTTP request nodes.	2026-04-27 15:20:25 +08:00
山程漫悟	d9de96cffa	Merge pull request #1011 from wanxunyang/fix/wxy-032 fix(api_key): bypass publication check for SERVICE type API keys	2026-04-27 14:44:19 +08:00
wxy	546bfb9627	fix(api_key): bypass publication check for SERVICE type API keys - Exclude SERVICE type keys from application publication validation since their resource_id targets the workspace instead of an application.	2026-04-27 14:05:06 +08:00
Timebomb2018	a268d0f7f1	fix(multimodal_service): add '文档内容：' prefix to document text and simplify image placeholder text	2026-04-27 12:25:27 +08:00
Ke Sun	675c7faf32	Merge pull request #1004 from SuanmoSuanyangTechnology/fix/memory_search fix(api): convert config_id to string in write_router	2026-04-25 11:08:51 +08:00
Eternity	cd34d5f5ce	fix(api): convert config_id to string in write_router	2026-04-24 20:13:46 +08:00
Ke Sun	1403b38648	Merge pull request #1003 from SuanmoSuanyangTechnology/fix/memory_search fix(api): convert end_user_id to string in write_router	2026-04-24 19:59:24 +08:00
Eternity	b6e27da7b0	fix(api): convert end_user_id to string in write_router	2026-04-24 19:56:55 +08:00
山程漫悟	2c14344d3f	Merge pull request #1002 from SuanmoSuanyangTechnology/feature/agent-tool_xjn fix(multimodal_service)	2026-04-24 19:42:38 +08:00
Timebomb2018	141fd94513	fix(multimodal_service): refactor image processing to use intermediate list before extending result	2026-04-24 19:40:57 +08:00
Ke Sun	ed5f98a746	Merge pull request #1000 from SuanmoSuanyangTechnology/fix/memory_search fix(api): correct import paths in memory_read and celery task command	2026-04-24 19:11:23 +08:00
Eternity	422af69904	fix(api): correct import paths in memory_read and celery task command - Fix relative imports in memory_read.py to use absolute app paths - Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`	2026-04-24 19:09:18 +08:00
山程漫悟	6cb48664b7	Merge pull request #992 from wanxunyang/develop-wxy fix(workflow): rectify error handling and bolster execution logging	2026-04-24 18:58:40 +08:00
Eternity	8dee2eae6a	fix(api): correct import paths in memory_read and celery task command - Fix relative imports in memory_read.py to use absolute app paths - Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`	2026-04-24 18:50:58 +08:00
wxy	f63bcd6321	refactor(tool): flatten request body parameters for model exposure - Refactor the extraction logic in tool service to flatten request body parameters into independent arguments exposed to the model.	2026-04-24 18:49:55 +08:00
Eternity	caef0fe44e	fix(api): correct import paths in memory_read and celery task command - Fix relative imports in memory_read.py to use absolute app paths - Change celery scheduler command from `python app/celery_task_scheduler.py` to `python -m app.celery_task_scheduler`	2026-04-24 18:36:27 +08:00
wxy	21eb500680	refactor(workflow): streamline node execution handling and log service logic - Consolidate node data retrieval from workflow_executions.output_data to unify storage access. - Optimize the construction of messages and execution records to support opening suggestions. - Eliminate redundant queries and storage logic to simplify the overall codebase structure.	2026-04-24 18:20:14 +08:00
Ke Sun	c70f536acc	Merge pull request #986 from SuanmoSuanyangTechnology/feat/episodic-memory-detail-and-pagination feat:episodic memory detail and pagination	2026-04-24 18:19:11 +08:00
Ke Sun	5f96a6380e	Merge pull request #990 from SuanmoSuanyangTechnology/feature/celery-task-scheduler Feature/celery task scheduler	2026-04-24 18:19:00 +08:00
Timebomb2018	4b0afe867a	fix(app_chat_service,draft_run_service): move system_prompt augmentation before LangChainAgent instantiation	2026-04-24 18:00:44 +08:00
Timebomb2018	8f31236303	fix(app_chat_service,draft_run_service): move system_prompt augmentation before LangChainAgent instantiation	2026-04-24 17:48:15 +08:00
Timebomb2018	f2aedd29bc	refactor(http_request): simplify request handling and remove unused fields - Removed `last_request` field and related logic for storing raw request string - Replaced `_extract_output` and `_extract_extra_fields` to use `process_data` instead of `request` - Updated `_build_content` to directly parse JSON body without intermediate rendering step - Modified `execute` to generate `process_data` from actual HTTP request object instead of manual string building - Added `process_data` field to `HttpRequestNodeOutput` model for consistent debugging info	2026-04-24 17:09:01 +08:00
wwq	cf8db47389	feat(workflow): augment logging capabilities with execution status and loop support - Augment workflow logs with execution status fields and loop node information. - Refactor log service to handle distinct processing logic for workflows and agents. - Construct message and node logs derived from workflow_executions data.	2026-04-24 17:02:03 +08:00
Timebomb2018	74be09340c	feat(multimodal): support tenant-aware document image storage and improve image placeholder labeling - Pass workspace_id to multimodal_service.process_files across app_chat_service, draft_run_service - Fetch tenant_id from workspace in multimodal_service for proper file storage scoping - Update image placeholder format from "[第N页第M张图片]" to "[图片第N页第M张图片]" for clarity - Add strict URL preservation rules to system prompt for agents handling document images - Refactor _save_doc_image_to_storage to accept explicit tenant_id and workspace_id instead of inferring from FileMetadata	2026-04-24 15:56:06 +08:00

1478 Commits