Remove the deprecated expired_at field from all graph models, Neo4j
Cypher queries, repositories, and pipeline code. Replace with dialog_at
on StatementNode to track the original dialog timestamp.
- Strip expired_at from DialogueNode, ChunkNode, StatementNode,
ExtractedEntityNode, edges, and all Cypher queries
- Add dialog_at to MessageItem schema and propagate through extraction
and graph build steps
- Extract emotion/metadata async submission from WritePipeline into
a generic _submit_celery_task helper
- Add post_store_dedup_and_alias_merge Celery task for async alias
merging and second-layer dedup after Neo4j write
- Switch pytest async backend from anyio to asyncio_mode=auto
- Rename StatementExtractionStep → StatementTemporalExtractionStep and
extract_statement.jinja2 → extract_statement_temporal.jinja2 to reflect
merged temporal extraction logic
- Move extraction_pipeline_orchestrator.py out of steps/ to engine root
- Move dedup_step.py into steps/ directory
- Introduce WriteMemoryRequest schema to replace positional args in write_memory()
- Extract _resolve_and_load_config, _preprocess_files, _write_neo4j, and
_invalidate_interest_cache as private helpers in MemoryAgentService
- Remove shadow pipeline and simplify NEW_PIPELINE_ENABLED branch
- Merge 类型归属/成员隶属/任职服务 relation types into single 归属身份关系 in triplet prompt
- Add alias merge logic (别名属于) in deduplication and MERGE_ALIAS_BELONGS_TO Cypher query
- Add StorageType, Language, MessageItem enums/models to memory_agent_schema
- Reduce AgentMemory_Long_Term.DEFAULT_SCOPE from 6 to 1
- Delete standalone extract_temporal.jinja2 (logic merged into statement step)
Introduce ExtractionStep abstraction with modular pipeline stages:
- Add base ExtractionStep class with render/call/parse lifecycle
- Implement StatementExtractionStep, TripletExtractionStep,
EmbeddingStep, EmotionStep, GraphBuildStep, and DedupStep
- Add SidecarStepFactory for hot-pluggable non-critical steps
- Define Pydantic I/O schemas for all pipeline stages
- Refactor WritePipeline to orchestrate new step-based flow
- Add NEW_PIPELINE_ENABLED env switch for old/new pipeline routing
- Add emotion_enabled config flag to MemoryConfig
- Fix workspace_id reference in get_end_user_connected_config
Introduce a layered pipeline architecture for the memory write flow:
- WritePipeline: orchestrates preprocess → extract → store → cluster → summarize
with deadlock retry, resource cleanup, and pilot-run support
- MemoryService: facade that delegates to WritePipeline, placeholder methods
for read/forget/reflect
- BearLogger: structured step-level logging with perf threshold alerts
- Shadow pipeline integration in MemoryAgentService (env-gated pilot run)
Also includes:
- Fix deprecated SQLAlchemy declarative_base import
- Extend Neo4j Entity fulltext index to cover description and aliases
- Migrate Pydantic schemas to v2 (ConfigDict, field_validator)
The default thinking budget tokens value was changed from 10000 to 1024 in base.py, and the minimum validation constraint was updated from 1024 to 1 in app_schema.py to allow smaller budgets while maintaining backward compatibility.
- Replace truthiness checks with 'is not None' for data.message in graph_data and community_graph endpoints to handle empty string correctly
- Remove Optional wrapper from GraphStatistics.edge_types since it already has a default_factory
- Add user_memory_schema.py with typed Pydantic models for all user memory
API responses: MemoryInsightReportData, UserSummaryData, GraphData,
MemoryTypeStatItem, cache result models, and RelationshipEvolutionData
- Refactor user_memory_controllers.py to construct schema instances and
return model_dump() instead of raw dicts
- Remove unused imports (datetime, timestamp_to_datetime, EndUserInfoResponse,
EndUserInfoCreate, EndUser)
- Introduce `reasoning_content`, `suggested_questions`, `citations`, and `audio_status` fields in conversation and app response schemas
- Conditionally set `audio_status` to `"pending"` only when `audio_url` is present
- Replace `model_dump` override with `@model_serializer(mode="wrap")` for cleaner serialization logic
- Change knowledge base validation failure from `RuntimeError` to warning + `continue` to avoid halting retrieval on invalid KB
- Augment workflow logs with execution status fields and loop node information.
- Refactor log service to handle distinct processing logic for workflows and agents.
- Construct message and node logs derived from workflow_executions data.
Added `allow_download` flag to citation config and `download_url` field to citation output. Implemented `/citations/{document_id}/download` endpoint to serve original files when enabled. Removed unused `files` field and `HttpRequestDataProcessing` model from HTTP request node config.
Added document image extraction capability for PDF and DOCX files, including page/index metadata and storage integration. Extended `process_files` with `document_image_recognition` flag to conditionally enable vision-based image processing when model supports it. Updated knowledge repository and workflow node logic to enforce status=1 checks. Added PyMuPDF dependency.
- feat(http_request): augment debugging capabilities with raw request generation and improved error handling.
- feat(app_log): extend session filtering logic to support retrieving all session types.
- feat(log): add 'process' field to node execution records for better data tracking.
- Augment HTTP request node capabilities and add generated curl commands for easier debugging.
feat(log): implement workflow execution logs and search functionality
- Add detailed logging for workflow node execution and enable search capabilities within application logs.
feat(auth): introduce middleware to verify application publication status
- Add a check to ensure the application is published before allowing access.
fix(converter): rectify variable handling logic in Dify converter
- Correct issues related to processing variables within the Dify converter module.
refactor(model): remove quota check decorator from model update operations
- Decouple quota validation from the model update process to streamline the logic.
- Fix write_router to use actual_end_user_id instead of end_user_id
- Add task status tracking via Redis in scheduler
- Expose task_id in memory write response
- Fix logging import path in scheduler
feat(app): support file metadata in chat messages and DSL app overwrite
- Extended chat message file objects with `name`, `size`, and `file_type` fields across app_chat_service and workflow_service
- Added ability to overwrite existing app configurations via DSL import in app_dsl_service, including type validation and config update logic for AgentConfig, MultiAgentConfig, and WorkflowConfig
- Trigger opening statement on new conversation in run/run_stream
- Fix /opening endpoint to support workflow app type
- Fix features field missing in workflow config release snapshot
- Knowledge node returns citations alongside chunks
- Aggregate citations from all knowledge nodes in result builder
- Filter citations based on features.citation.enabled switch
- Fix WorkflowConfigCreate circular import in app_schema
- Rename endpoints from write_api_service/read_api_service to write/read for clarity
- Add async task-based endpoints (/write, /read) that dispatch to Celery with fair locking
- Add task status polling endpoints (/write/status, /read/status) to check async operation results
- Add synchronous endpoints (/write/sync, /read/sync) for blocking operations with direct results
- Introduce TaskStatusResponse schema for task status polling responses
- Add MemoryWriteSyncResponse and MemoryReadSyncResponse schemas for sync operations
- Implement write_memory_sync and read_memory_sync methods in MemoryAPIService
- Remove await from async service calls in task-based endpoints (now handled by Celery)
- Add Query parameter import for task_id in status endpoints
- Update docstrings to clarify async vs sync behavior and task polling workflow
- Integrate task_service for retrieving Celery task results
- Create new memory_config_api_controller.py for dedicated memory configuration management
- Add /end_user/info GET endpoint to retrieve end user information (aliases, metadata)
- Add /end_user/info/update POST endpoint to update end user details
- Move /memory/configs endpoint from memory_api_controller to memory_config_api_controller
- Extract _get_current_user helper function to build user context from API key auth
- Support optional app_id parameter in end user creation with UUID validation
- Update service controller imports with alphabetical ordering and multi-line formatting
- Register memory_config_api_controller router in service module initialization
- Refactor memory_api_controller imports for consistency and clarity
- Should be merged after v0.2.9
- Create new end_user_api_controller.py with POST /end_user/create endpoint
- Implement API Key authentication requirement with memory scope
- Add support for optional memory_config_id parameter with workspace default fallback
- Update memory_api_schema.py to remove workspace_id from request (now derived from API key auth)
- Add memory_config_id field to CreateEndUserResponse schema
- Register end_user_api_controller router in service module
- Migrate end user creation from unauthenticated to authenticated API flow