Release/v0.2.3 (#355)

* feat(web): add PageEmpty component
* feat(web): add PageTabs component
* feat(web): add PageEmpty component
* feat(web): add PageTabs component
* feat(prompt): add history tracking for prompt releases
* feat(web): add prompt menu
* refactor: The PageScrollList component supports two generic parameters
* feat(web): BodyWrapper component update PageLoading
* feat(web): add Ontology menu
* feat(web): memory management add scene
* feat(tasks): add celery task configuration for periodic jobs
  - Add ignore_result=True to prevent storing results for periodic tasks
  - Set max_retries=0 to skip failed periodic tasks without retry attempts
  - Configure acks_late=False for immediate acknowledgment in beat tasks
  - Add time_limit and soft_time_limit to regenerate_memory_cache task (3600s/3300s)
  - Add time_limit and soft_time_limit to workspace_reflection_task (300s/240s)
  - Add time_limit and soft_time_limit to run_forgetting_cycle_task (7200s/7000s)
  - Improve task reliability and resource management for scheduled jobs
* feat(sandbox): add Node.js code execution support to sandbox
* Release/v0.2.2 (#260)
* [modify] migration script
* [add] migration script
* fix(web): change form message
* fix(web): the memoryContent field is compatible with numbers and strings
* feat(web): code node hidden
* fix(model): 1. create a basic model to check if the name and provider are duplicated. 2. The result shows error models because the provider created API Keys for all matching models.
---------
Co-authored-by: Mark <zhuwenhui5566@163.com>
Co-authored-by: zhaoying <yzhao96@best-inc.com>
Co-authored-by: yingzhao <zhaoyingyz@126.com>
Co-authored-by: Timebomb2018 <18868801967@163.com>
* Feature/ontology class clean (#249)
* [add] Complete ontology engineering feature implementation
* [add] Add ontology feature integration and validation utilities
* [add] Add OWL validator and validation utilities
* [fix] Add missing render_ontology_extraction_prompt function
* [fix]Add dependencies, fix functionality
* [add] migration script
* feat(celery): add dedicated periodic tasks worker and queue (#261)
* fix(web): conflict resolve
* Fix/v022 bug (#263)
* [fix]Fix the issue of inconsistent language in explicit and episodic memory.
* [fix]Fix the issue of inconsistent language in explicit and episodic memory.
* [add]Add scene_id
* [fix]Fix the code based on the AI review
* Fix/develop memory reflex (#265)
* Missing history mapping
* Missing history mapping
* Handle reflection backend errors
* [add] migration script
* fix: chat conversation_id add node_start
* feat(web): show code node
* fix(web): Restructure the CustomSelect component, repair the interface that is called multiple times when the form is updated
* feat(web): RadioGroupCard support block mode
* feat(web): create space add icon
* feat(app and model): token consumption statistics
* Add/develop memory (#264)
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Missing history mapping
* Add long-term memory feature
* Add long-term memory feature
* Add long-term memory feature
* Redundant fields in knowledge base retrieval
* Long-term
* feat(app and model): token consumption statistics of the cluster
* memory_BUG_fix
* fix(web): prompt history remove pageLoading
* fix(prompt): remove hard-coded import of prompt file paths (#279)
* Fix/develop memory bug (#274)
* Missing history mapping
* Missing history mapping
* fix_timeline_memories
* fix(web): update retrieve_type key
* Fix/develop memory bug (#276)
* Missing history mapping
* Missing history mapping
* fix_timeline_memories
* fix_timeline_memories
* write_gragp/bug_fix
* write_gragp/bug_fix
* write_gragp/bug_fix
* chore(celery): disable periodic task scheduling
* fix(prompt): remove hard-coded import of prompt file paths
---------
Co-authored-by: lixinyue11 <94037597+lixinyue11@users.noreply.github.com>
Co-authored-by: zhaoying <yzhao96@best-inc.com>
Co-authored-by: yingzhao <zhaoyingyz@126.com>
Co-authored-by: Ke Sun <kesun5@illinois.edu>
* fix(web): remove delete confirm content
* refactor(workflow): relocate template directory into workflow
* feat(memory): add long-term storage task routing and batching
* fix(web): PageScrollList loading update
* fix(web): PageScrollList loading update
* Ontology v1 bug (#291)
* [changes]Add 'id' as the secondary sorting key, and 'scene_id' now returns a UUID object
* [fix]Fix the "end_user" return to be sorted by update time.
* [fix]Set the default values of the memory configuration model based on the spatial model.
* [fix]Remove the entity extraction check combination model, read the configuration list, and add the return of scene_id
* [fix]Fix the "end_user" return to be sorted by update time.
* [fix]
* fix(memory): add Redis session validation
  - Add macOS fork() safety configuration in celery_app.py to prevent initialization issues
  - Add null/False checks for Redis session queries in term_memory_save to handle missing sessions gracefully
  - Add null/False checks in memory_long_term_storage to prevent processing empty Redis results
  - Add null/False checks in aggregate_judgment before format_parsing to avoid errors on missing data
  - Initialize redis_messages variable in window_dialogue for consistency
  - Add debug logging when no existing session found in Redis for better troubleshooting
  - Add TODO comments for magic numbers (scope=6, time=5) to be extracted as constants
  - Improve error handling when Redis returns False or empty results instead of crashing
* fix(web): PageScrollList style update
* fix(workflow): fix argument passing in code execution nodes
* fix(web): prompt add disabled
* fix(web): space icon required
* feat(app): modify the key of the token
* fix(fix the key of the app's token):
* fix(workflow): switch code input encoding to base64+URL encoding
* [add]The main project adds multi-API Key load balancing.
* [changes]Attribute security access, secure numerical conversion, unified use of local variables
* fix(web): save add session update
* fix(web): language editor support paste
* [changes]Active status filtering logic, API Key selection strategy
* memory_BUG
* memory_BUG_long_term
* [changes]
* memory_BUG_long_term
* memory_BUG_long_term
* Fix/release memory bug (#306)
* memory_BUG_fix
* memory_BUG
* memory_BUG_long_term
* memory_BUG_long_term
* memory_BUG_long_term
* knowledge_retrieval/bug/fix
* knowledge_retrieval/bug/fix
* knowledge_retrieval/bug/fix
* [fix]1.The "read_all_config" interface returns "scene_name";2.Memory configuration for lightweight query ontology scenarios
* fix(web): replace code editor
* [changes]Modify the description of the time for the recent event
* [changes]Modify the code based on the AI review
* feat(web): update memory config ontology api
* fix(web): ui update
* knowledge_retrieval/bug/fix
* knowledge_retrieval/bug/fix
* knowledge_retrieval/bug/fix
* feat(workflow): add token usage statistics for question classifier and parameter extraction
* feat(web): move prompt menu
* Multiple independent transactions - single transaction
* Multiple independent transactions - single transaction
* Multiple independent transactions - single transaction
* Multiple independent transactions - single transaction
* Write Missing None (#321)
* Write Missing None
* Write Missing None
* Write Missing None
* Apply suggestion from @sourcery-ai[bot]
  Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Write Missing None
---------
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Fix/release memory bug (#324)
* Write Missing None
* Write Missing None
* Write Missing None
* Apply suggestion from @sourcery-ai[bot]
  Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Write Missing None
* redis update
* redis update
* redis update
* redis update
---------
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Fix/writer memory bug (#326)
* [fix]Fix the bug
* [fix]Fix the bug
* [fix]Correct the direction indication.
* fix(web): markdown table ui update
* Fix/release memory bug (#332)
* Write Missing None
* Write Missing None
* Write Missing None
* Apply suggestion from @sourcery-ai[bot]
  Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Write Missing None
* redis update
* redis update
* redis update
* redis update
* writer_dup_bug/fix
---------
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Fix/fact summary (#333)
* [fix]Disable the contents related to fact_summary
* [fix]Disable the contents related to fact_summary
* [fix]Modify the code based on the AI review
* Fix/release memory bug (#335)
* Write Missing None
* Write Missing None
* Write Missing None
* Apply suggestion from @sourcery-ai[bot]
  Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Write Missing None
* redis update
* redis update
* redis update
* redis update
* writer_dup_bug/fix
* writer_graph_bug/fix
* writer_graph_bug/fix
---------
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
* Revert "feat(web): move prompt menu"
  This reverts commit 9e6e8f50f8.
* fix(web): ui update
* fix(web): update text
* fix(web): ui update
* fix(model): change the "vl" model type of dashscope to "chat"
* fix(model): change the "vl" model type of dashscope to "chat"
---------
Co-authored-by: zhaoying <yzhao96@best-inc.com>
Co-authored-by: Eternity <1533512157@qq.com>
Co-authored-by: Mark <zhuwenhui5566@163.com>
Co-authored-by: yingzhao <zhaoyingyz@126.com>
Co-authored-by: Timebomb2018 <18868801967@163.com>
Co-authored-by: 乐力齐 <162269739+lanceyq@users.noreply.github.com>
Co-authored-by: lixinyue11 <94037597+lixinyue11@users.noreply.github.com>
Co-authored-by: lixinyue <2569494688@qq.com>
Co-authored-by: Eternity <61316157+myhMARS@users.noreply.github.com>
Co-authored-by: lanceyq <1982376970@qq.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2026-02-06 19:01:57 +08:00
parent eab7225d83
commit 79ab929fb0
187 changed files with 12252 additions and 1656 deletions
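
The periodic-task settings listed in the commit message can be summarized as plain keyword arguments. The sketch below is only an illustration: the dictionary names and the `task_options` helper are hypothetical, the mapping of each quoted pair to (`time_limit`, `soft_time_limit`) is an assumption based on the order in the message, and the Celery application object is left out so the snippet runs on its own. The option names themselves are standard Celery task options.

```python
# Options shared by every periodic (beat) task, as listed in the commit message.
PERIODIC_DEFAULTS = {
    "ignore_result": True,  # do not store results of periodic runs
    "max_retries": 0,       # a failed run is skipped, not retried
    "acks_late": False,     # acknowledge as soon as the worker receives the task
}

# (time_limit, soft_time_limit) in seconds; assumed order of the quoted pairs.
TIME_LIMITS = {
    "regenerate_memory_cache": (3600, 3300),
    "workspace_reflection_task": (300, 240),
    "run_forgetting_cycle_task": (7200, 7000),
}


def task_options(task_name: str) -> dict:
    """Hypothetical helper: merge shared defaults with per-task time limits."""
    options = dict(PERIODIC_DEFAULTS)
    if task_name in TIME_LIMITS:
        hard, soft = TIME_LIMITS[task_name]
        options.update(time_limit=hard, soft_time_limit=soft)
    return options


# Assumed usage with a Celery app object (not executed here):
#   @celery_app.task(name="workspace_reflection_task",
#                    **task_options("workspace_reflection_task"))
#   def workspace_reflection_task(): ...
```

Keeping the soft limit below the hard limit gives a task a window to clean up after the soft timeout is raised and before the worker terminates it.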
--- a/api/app/services/app_chat_service.py
+++ b/api/app/services/app_chat_service.py
@@ -171,7 +171,14 @@ class AppChatService:
        self.conversation_service.save_conversation_messages(
            conversation_id=conversation_id,
            user_message=message,
-            assistant_message=result["content"]
+            assistant_message=result["content"],
+            meta_data={
+                "usage": result.get("usage", {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": 0
+                })
+            }
        )

        elapsed_time = time.time() - start_time
@@ -310,6 +317,7 @@ class AppChatService:

            # Stream the Agent call
            full_content = ""
+            total_tokens = 0
            async for chunk in agent.chat_stream(
                    message=message,
                    history=history,
@@ -320,9 +328,12 @@ class AppChatService:
                    config_id=config_id,
                    memory_flag=memory_flag
            ):
-                full_content += chunk
-                # Send the message-chunk event
-                yield f"event: message\ndata: {json.dumps({'content': chunk}, ensure_ascii=False)}\n\n"
+                if isinstance(chunk, int):
+                    total_tokens = chunk
+                else:
+                    full_content += chunk
+                    # Send the message-chunk event
+                    yield f"event: message\ndata: {json.dumps({'content': chunk}, ensure_ascii=False)}\n\n"

            elapsed_time = time.time() - start_time

@@ -339,7 +350,7 @@ class AppChatService:
                content=full_content,
                meta_data={
                    "model": api_key_obj.model_name,
-                    "usage": {}
+                    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": total_tokens}
                }
            )

@@ -416,7 +427,11 @@ class AppChatService:
            meta_data={
                "mode": result.get("mode"),
                "elapsed_time": result.get("elapsed_time"),
-                "sub_results": result.get("sub_results")
+                "usage": result.get("usage", {
+                            "prompt_tokens": 0,
+                            "completion_tokens": 0,
+                            "total_tokens": 0
+                        })
            }
        )

@@ -458,6 +473,7 @@ class AppChatService:
            yield f"event: start\ndata: {json.dumps({'conversation_id': str(conversation_id)}, ensure_ascii=False)}\n\n"

            full_content = ""
+            total_tokens = 0

            # 2. Create the orchestrator
            orchestrator = MultiAgentOrchestrator(self.db, config)
@@ -474,16 +490,26 @@ class AppChatService:
                    storage_type=storage_type,
                    user_rag_memory_id=user_rag_memory_id
            ):
-                yield event
-                # Try to extract the content (used for saving)
-                if "data:" in event:
-                    try:
-                        data_line = event.split("data: ", 1)[1].strip()
-                        data = json.loads(data_line)
-                        if "content" in data:
-                            full_content += data["content"]
-                    except:
-                        pass
+                if "sub_usage" in event:
+                    if "data:" in event:
+                        try:
+                            data_line = event.split("data: ", 1)[1].strip()
+                            data = json.loads(data_line)
+                            if "total_tokens" in data:
+                                total_tokens += data["total_tokens"]
+                        except:
+                            pass
+                else:
+                    yield event
+                    # Try to extract the content (used for saving)
+                    if "data:" in event:
+                        try:
+                            data_line = event.split("data: ", 1)[1].strip()
+                            data = json.loads(data_line)
+                            if "content" in data:
+                                full_content += data["content"]
+                        except:
+                            pass

            elapsed_time = time.time() - start_time

@@ -499,7 +525,12 @@ class AppChatService:
                role="assistant",
                content=full_content,
                meta_data={
-                    "elapsed_time": elapsed_time
+                    "elapsed_time": elapsed_time,
+                    "usage": {
+                        "prompt_tokens": 0,
+                        "completion_tokens": 0,
+                        "total_tokens": total_tokens
+                    }
                }
            )
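
The multi-agent streaming hunk above intercepts `sub_usage` events, adds their `total_tokens` to a running total, and forwards every other event to the client. A minimal standalone sketch of that filtering step follows. `parse_sse` and `split_usage_events` are hypothetical names, the input is a plain list of SSE strings rather than the orchestrator stream, and the event type is read from the `event:` line instead of a substring test, which is a variation on the committed logic rather than a copy of it.

```python
import json
from typing import Iterable, List, Tuple


def parse_sse(raw: str) -> Tuple[str, dict]:
    """Split one SSE frame into (event name, decoded JSON payload)."""
    name, payload = "", {}
    for line in raw.strip().splitlines():
        if line.startswith("event:"):
            name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            try:
                decoded = json.loads(line[len("data:"):].strip())
            except json.JSONDecodeError:
                decoded = {}  # malformed payloads are ignored
            payload = decoded if isinstance(decoded, dict) else {}
    return name, payload


def split_usage_events(events: Iterable[str]) -> Tuple[List[str], str, int]:
    """Return (frames to forward, concatenated content, summed total_tokens)."""
    forwarded: List[str] = []
    content = ""
    total_tokens = 0
    for raw in events:
        name, payload = parse_sse(raw)
        if name == "sub_usage":
            # Usage frames are bookkeeping only; they are not sent to the client.
            total_tokens += int(payload.get("total_tokens", 0))
            continue
        forwarded.append(raw)
        content += str(payload.get("content", ""))
    return forwarded, content, total_tokens
```

Matching on the parsed event name avoids treating a message whose text happens to contain the word `sub_usage` as a usage frame.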

--- a/api/app/services/app_statistics_service.py
+++ b/api/app/services/app_statistics_service.py
@@ -187,7 +187,7 @@ class AppStatisticsService:
                daily_tokens[date_str] = 0
            daily_tokens[date_str] += int(tokens)
        
-        daily_data = [{"date": date, "tokens": tokens} for date, tokens in sorted(daily_tokens.items()) if tokens != 0]
-        total = sum(row["tokens"] for row in daily_data)
+        daily_data = [{"date": date, "count": tokens} for date, tokens in sorted(daily_tokens.items()) if tokens != 0]
+        total = sum(row["count"] for row in daily_data)
        
        return {"daily": daily_data, "total": total}
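
The hunk above renames the per-day key from `tokens` to `count` while keeping the overall `{"daily": ..., "total": ...}` shape. A small sketch of the resulting payload, assuming the input is a sequence of `(date_str, tokens)` pairs; the function name `summarize_daily_tokens` and the input format are illustrative, not taken from the service.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_daily_tokens(rows: Iterable[Tuple[str, int]]) -> Dict[str, object]:
    """Sum tokens per date, drop zero-usage days, and report the grand total."""
    daily_tokens: Dict[str, int] = defaultdict(int)
    for date_str, tokens in rows:
        daily_tokens[date_str] += int(tokens)

    daily = [
        {"date": date, "count": count}
        for date, count in sorted(daily_tokens.items())
        if count != 0
    ]
    return {"daily": daily, "total": sum(row["count"] for row in daily)}
```

Any consumer that previously read `row["tokens"]` would need to read `row["count"]` after this change.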
--- a/api/app/services/conversation_service.py
+++ b/api/app/services/conversation_service.py
@@ -1,4 +1,5 @@
 """Conversation service"""
+import os
 import uuid
 from datetime import datetime, timedelta
 from typing import Annotated
@@ -298,7 +299,8 @@ class ConversationService:
            self,
            conversation_id: uuid.UUID,
            user_message: str,
-            assistant_message: str
+            assistant_message: str,
+            meta_data: Optional[dict] = None
    ):
        """
        Save a pair of user and assistant messages to the conversation.
@@ -307,6 +309,7 @@ class ConversationService:
            conversation_id (uuid.UUID): Conversation UUID.
            user_message (str): User's message content.
            assistant_message (str): Assistant's response content.
+            meta_data (Optional[dict]): Optional metadata for the messages.
        """
        self.add_message(
            conversation_id=conversation_id,
@@ -317,7 +320,8 @@ class ConversationService:
        self.add_message(
            conversation_id=conversation_id,
            role="assistant",
-            content=assistant_message
+            content=assistant_message,
+            meta_data=meta_data
        )

        logger.debug(
@@ -526,12 +530,12 @@ class ConversationService:
                takeaways=[],
                info_score=0,
            )
-
-        with open('app/services/prompt/conversation_summary_system.jinja2', 'r', encoding='utf-8') as f:
+        prompt_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'prompt')
+        with open(os.path.join(prompt_path, 'conversation_summary_system.jinja2'), 'r', encoding='utf-8') as f:
            system_prompt = f.read()
        rendered_system_message = Template(system_prompt).render()

-        with open('app/services/prompt/conversation_summary_user.jinja2', 'r', encoding='utf-8') as f:
+        with open(os.path.join(prompt_path, 'conversation_summary_user.jinja2'), 'r', encoding='utf-8') as f:
            user_prompt = f.read()
        rendered_user_message = Template(user_prompt).render(
            language=language,
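
The last hunk above stops opening the templates through a path relative to the working directory and instead resolves them relative to the module file. A minimal sketch of that idea, with the module path passed in explicitly so it can be tried outside the project; `prompt_file` and `read_prompt` are hypothetical helper names, and in the service itself the first argument would be `__file__`.

```python
import os


def prompt_file(module_file: str, name: str) -> str:
    """Build the path to a template in the `prompt` folder next to `module_file`."""
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_dir, "prompt", name)


def read_prompt(module_file: str, name: str) -> str:
    """Read a template as UTF-8 text, independent of the current working directory."""
    with open(prompt_file(module_file, name), "r", encoding="utf-8") as f:
        return f.read()
```

With this approach the lookup no longer depends on the process being started from the `api` directory, which matters for workers launched from a different location.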
--- a/api/app/services/draft_run_service.py
+++ b/api/app/services/draft_run_service.py
@@ -110,6 +110,8 @@ def create_long_term_memory_tool(memory_config: Dict[str, Any], end_user_id: str
                result = task_service.get_task_memory_read_result(task.id)
                status = result.get("status")
                logger.info(f"读取任务状态：{status}")
+                if memory_content:
+                    memory_content = memory_content['answer']

            finally:
                db.close()
@@ -123,7 +125,6 @@ def create_long_term_memory_tool(memory_config: Dict[str, Any], end_user_id: str
                    "content_length": len(str(memory_content))
                }
            )
-
            return f"检索到以下历史记忆：\n\n{memory_content}"
        except Exception as e:
            logger.error("长期记忆检索失败", extra={"error": str(e), "error_type": type(e).__name__})
@@ -442,7 +443,14 @@ class DraftRunService:
                    user_message=message,
                    assistant_message=result["content"],
                    app_id=agent_config.app_id,
-                    user_id=user_id
+                    user_id=user_id,
+                    meta_data={
+                        "usage": result.get("usage", {
+                            "prompt_tokens": 0,
+                            "completion_tokens": 0,
+                            "total_tokens": 0
+                        })
+                    }
                )

            response = {
@@ -649,6 +657,7 @@ class DraftRunService:

            # 9. Stream the Agent call
            full_content = ""
+            total_tokens = 0
            async for chunk in agent.chat_stream(
                message=message,
                history=history,
@@ -659,14 +668,22 @@ class DraftRunService:
                user_rag_memory_id=user_rag_memory_id,
                memory_flag=memory_flag
            ):
-                full_content += chunk
-                # Send the message-chunk event
-                yield self._format_sse_event("message", {
-                    "content": chunk
-                })
+                if isinstance(chunk, int):
+                    total_tokens = chunk
+                else:
+                    full_content += chunk
+                    # Send the message-chunk event
+                    yield self._format_sse_event("message", {
+                        "content": chunk
+                    })

            elapsed_time = time.time() - start_time

+            if sub_agent:
+                yield self._format_sse_event("sub_usage", {
+                        "total_tokens": total_tokens
+                    })
+
            # 10. Save the conversation messages
            if not sub_agent and agent_config.memory and agent_config.memory.get("enabled"):
                await self._save_conversation_message(
@@ -674,7 +691,10 @@ class DraftRunService:
                    user_message=message,
                    assistant_message=full_content,
                    app_id=agent_config.app_id,
-                    user_id=user_id
+                    user_id=user_id,
+                    meta_data={
+                        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": total_tokens}
+                    }
                )

            # 11. Send the end event
@@ -898,6 +918,7 @@ class DraftRunService:
        conversation_id: str,
        user_message: str,
        assistant_message: str,
+        meta_data: dict,
        app_id: Optional[uuid.UUID] = None,
        user_id: Optional[str] = None
    ) -> None:
@@ -909,6 +930,7 @@ class DraftRunService:
            assistant_message: AI reply message
            app_id: Application ID (unused, kept for compatibility)
            user_id: User ID (unused, kept for compatibility)
+            meta_data: Token usage
        """
        try:
            from app.services.conversation_service import ConversationService
@@ -927,7 +949,8 @@ class DraftRunService:
            conversation_service.add_message(
                conversation_id=conv_uuid,
                role="assistant",
-                content=assistant_message
+                content=assistant_message,
+                meta_data=meta_data
            )

            logger.debug(
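
The streaming hunks in this file rely on a convention where `chat_stream` yields text chunks and, at some point, an `int` carrying the total token count; the consumer tells them apart with `isinstance(chunk, int)`. The sketch below reproduces that pattern with a fake stream. `fake_chat_stream`, `format_sse_event`, and `consume` are hypothetical stand-ins, and the assumption that the integer arrives last is mine, not something the diff states.

```python
import asyncio
import json
from typing import AsyncIterator, List, Tuple, Union


async def fake_chat_stream() -> AsyncIterator[Union[str, int]]:
    """Hypothetical stand-in for agent.chat_stream: text chunks, then a token total."""
    for piece in ("Hel", "lo"):
        yield piece
    yield 42  # assumed: a trailing int carries total_tokens


def format_sse_event(event: str, data: dict) -> str:
    """Format one server-sent event frame."""
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n"


async def consume(stream: AsyncIterator[Union[str, int]]) -> Tuple[str, int, List[str]]:
    """Collect text, record the token total, and build SSE message frames."""
    full_content = ""
    total_tokens = 0
    frames: List[str] = []
    async for chunk in stream:
        if isinstance(chunk, int):
            total_tokens = chunk  # usage marker, not forwarded as a message
        else:
            full_content += chunk
            frames.append(format_sse_event("message", {"content": chunk}))
    return full_content, total_tokens, frames


content, tokens, frames = asyncio.run(consume(fake_chat_stream()))
```

Mixing two types in one stream keeps the generator signature unchanged, at the cost of every consumer having to perform the type check; a small tagged object would be an alternative design.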
--- a/api/app/services/handoffs_service.py
+++ b/api/app/services/handoffs_service.py
@@ -4,7 +4,7 @@ import uuid
 from typing import List, Dict, Any, Optional, AsyncGenerator, Annotated
 from typing_extensions import TypedDict

-from langchain_core.messages import HumanMessage, AIMessage, BaseMessage
+from langchain_core.messages import HumanMessage, AIMessage, BaseMessage, AIMessageChunk
 from langgraph.graph import StateGraph, START, END
 from langgraph.types import Command
 from langgraph.checkpoint.memory import MemorySaver
@@ -727,9 +727,12 @@ class HandoffsService:
        
        # Extract the response
        response_content = ""
+        total_tokens = 0
        for msg in result.get("messages", []):
            if isinstance(msg, AIMessage):
                response_content = msg.content
+                response_meta = msg.response_metadata if hasattr(msg, 'response_metadata') else None
+                total_tokens = response_meta.get("token_usage", {}).get("total_tokens", 0) if response_meta else 0
                break
        
        return {
@@ -737,7 +740,12 @@ class HandoffsService:
            "active_agent": result.get("active_agent"),
            "response": response_content,
            "message_count": len(result.get("messages", [])),
-            "handoff_count": result.get("handoff_count", 0)
+            "handoff_count": result.get("handoff_count", 0),
+            "usage": {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": total_tokens
+                }
        }
    
    async def chat_stream(
@@ -830,6 +838,12 @@ class HandoffsService:
                
                # Capture the LLM end event and emit the collected tool calls
                elif kind == "on_chat_model_end":
+                    output_message = event.get("data", {}).get("output", {})
+                    if isinstance(output_message, AIMessageChunk):
+                        response_meta = output_message.response_metadata if hasattr(output_message, 'response_metadata') else None
+                        total_tokens = response_meta.get("token_usage", {}).get("total_tokens",
+                                                                                0) if response_meta else 0
+                        yield f"event: sub_usage\ndata: {json.dumps({'total_tokens': total_tokens}, ensure_ascii=False)}\n\n"
                    if collected_tool_calls:
                        # Find the transfer tool call with the most complete arguments
                        best_tc = None
--- a/api/app/services/memory_dashboard_service.py
+++ b/api/app/services/memory_dashboard_service.py
@@ -53,7 +53,10 @@ def get_workspace_end_users(
    workspace_id: uuid.UUID, 
    current_user: User
 ) -> List[EndUser]:
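-    """Get all end users (hosts) of a workspace (optimized version: fewer database queries)."""
+    """Get all end users (hosts) of a workspace (optimized version: fewer database queries).
+    
+    Results are sorted by updated_at from newest to oldest (NULL values last).
+    """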
    business_logger.info(f"获取工作空间宿主列表: workspace_id={workspace_id}, 操作者: {current_user.username}")
    
    try:        
@@ -68,9 +71,14 @@ def get_workspace_end_users(
        app_ids = [app.id for app in apps_orm]
        
        # Query all end_users in one batch (a single query instead of querying in a loop)
+        # Sort by updated_at descending with NULLs last; id is the secondary sort key for determinism
        from app.models.end_user_model import EndUser as EndUserModel
+        from sqlalchemy import desc, nullslast
        end_users_orm = db.query(EndUserModel).filter(
            EndUserModel.app_id.in_(app_ids)
+        ).order_by(
+            nullslast(desc(EndUserModel.updated_at)),
+            desc(EndUserModel.id)
        ).all()
        
        # Convert to Pydantic models (only when needed)
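
The query above orders end users by `updated_at` descending with NULLs last, then by `id` descending. To make the intended ordering concrete, here is the same rule expressed in plain Python over in-memory rows. `EndUserRow` and `sort_end_users` are hypothetical, and the integer `id` is a simplification of whatever key type the model actually uses.

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class EndUserRow(NamedTuple):
    id: int
    updated_at: Optional[datetime]


def sort_end_users(rows: List[EndUserRow]) -> List[EndUserRow]:
    """Newest updated_at first, rows without a timestamp last, ties broken by id descending."""
    dated = [r for r in rows if r.updated_at is not None]
    undated = [r for r in rows if r.updated_at is None]
    dated.sort(key=lambda r: (r.updated_at, r.id), reverse=True)
    undated.sort(key=lambda r: r.id, reverse=True)
    return dated + undated
```

The secondary key matters when several rows share a timestamp or have none: without it the database is free to return those rows in any order between calls.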
--- a/api/app/services/memory_reflection_service.py
+++ b/api/app/services/memory_reflection_service.py
@@ -89,7 +89,6 @@ class WorkspaceAppService:
        
        for release in app_releases:
            memory_content = self._extract_memory_content(release.config)
-            memory_content=resolve_config_id(memory_content, self.db)
            if memory_content and memory_content in processed_configs:
                continue

@@ -122,16 +121,12 @@ class WorkspaceAppService:
    def _get_memory_config(self, memory_content: str) -> Dict[str, Any]:
        """Retrieve memory_config information based on memory_content"""
        try:
-            memory_config_result = MemoryConfigRepository.query_reflection_config_by_id(self.db, int(memory_content))
-
-            # memory_config_query, memory_config_params = MemoryConfigRepository.build_select_reflection(memory_content)
-            # memory_config_result = self.db.execute(text(memory_config_query), memory_config_params).fetchone()
-            # if memory_config_result is None:
-            #     return None
+            memory_content = resolve_config_id(memory_content, self.db)
+            memory_config_result = MemoryConfigRepository.query_reflection_config_by_id(self.db, (memory_content))

            if memory_config_result:
                return {
-                    "config_id": memory_config_result.config_id,
+                    "config_id": memory_content,
                    "enable_self_reflexion": memory_config_result.enable_self_reflexion,
                    "iteration_period": memory_config_result.iteration_period,
                    "reflexion_range": memory_config_result.reflexion_range,
@@ -291,7 +286,7 @@ class MemoryReflectionService:
                # Check whether reflection needs to run
                should_execute = False
                hours_diff = 0
-                
+
                if current_reflection_time is None:
                    # First reflection run
                    should_execute = True
@@ -303,11 +298,11 @@ class MemoryReflectionService:
                            reflection_time = datetime.fromisoformat(current_reflection_time)
                        else:
                            reflection_time = current_reflection_time
-                        
+
                        current_time = datetime.now()
                        time_diff = current_time - reflection_time
                        hours_diff = int(time_diff.total_seconds() / 3600)
-                        
+
                        # Check whether the reflection period has elapsed
                        if hours_diff >= iteration_period:
                            should_execute = True
@@ -317,7 +312,7 @@ class MemoryReflectionService:
                    except (ValueError, TypeError) as e:
                        api_logger.warning(f"解析反思时间失败: {e}，将执行反思")
                        should_execute = True
-                
+
                if should_execute:
                    api_logger.info(f"与上次的反思时间间隔为: {hours_diff} 小时")
                    # 3. Run the reflection engine
@@ -350,7 +345,7 @@ class MemoryReflectionService:
                        "next_reflection_in_hours": iteration_period - hours_diff
                    }

-            
+
        except Exception as e:
            config_id = config_data.get("config_id", "unknown")
            api_logger.error(f"启动反思失败，config_id: {config_id}, end_user_id: {end_user_id}, 错误: {str(e)}")
@@ -361,7 +356,7 @@ class MemoryReflectionService:
                "end_user_id": end_user_id,
                "config_data": config_data
            }
-    
+
    def _create_reflection_config_from_data(self, config_data: Dict[str, Any]) -> ReflectionConfig:
        """Create reflective configuration objects from configuration data"""

@@ -369,12 +364,12 @@ class MemoryReflectionService:
        if reflexion_range_value is None or reflexion_range_value == "":
            reflexion_range_value = "partial"
        reflexion_range = ReflectionRange(reflexion_range_value)
-        
+
        baseline_value = config_data.get("baseline")
        if baseline_value is None or baseline_value == "":
            baseline_value = "TIME"
        baseline = ReflectionBaseline(baseline_value)
-        
+
        # iteration_period =
        iteration_period = config_data.get("iteration_period", 24)
        if isinstance(iteration_period, str):
@@ -382,7 +377,6 @@ class MemoryReflectionService:
                iteration_period = int(iteration_period)
            except (ValueError, TypeError):
                iteration_period = 24  # default: 24 hours
-        
        return ReflectionConfig(
            enabled=config_data.get("enable_self_reflexion", False),
            iteration_period=str(iteration_period),  # ReflectionConfig expects a string
--- a/api/app/services/memory_storage_service.py
+++ b/api/app/services/memory_storage_service.py
@@ -129,6 +129,12 @@ class DataConfigService: # Data configuration service class (PostgreSQL)
            if not params.rerank_id:
                params.rerank_id = configs.get('rerank')

+        # reflection_model_id and emotion_model_id default to llm_id
+        if not params.reflection_model_id:
+            params.reflection_model_id = params.llm_id
+        if not params.emotion_model_id:
+            params.emotion_model_id = params.llm_id
+
        config = MemoryConfigRepository.create(self.db, params)
        self.db.commit()
        return {"affected": 1, "config_id": config.config_id}
@@ -177,11 +183,11 @@ class DataConfigService: # Data configuration service class (PostgreSQL)

    # --- Read All ---
    def get_all(self, workspace_id = None) -> List[Dict[str, Any]]: # Get all configuration parameters
-        configs = MemoryConfigRepository.get_all(self.db, workspace_id)
+        results = MemoryConfigRepository.get_all(self.db, workspace_id)

        # Convert ORM objects to a list of dicts
        data_list = []
-        for config in configs:
+        for config, scene_name in results:
            # Safely convert user_id to int
            config_id_old = None
            if config.config_id_old:
@@ -203,6 +209,8 @@ class DataConfigService: # 数据配置服务类（PostgreSQL）
                "end_user_id": config.end_user_id,
                "config_id_old": config_id_old,
                "apply_id": config.apply_id,
+                "scene_id": str(config.scene_id) if config.scene_id else None,
+                "scene_name": scene_name,  # New: scene name
                "llm_id": config.llm_id,
                "embedding_id": config.embedding_id,
                "rerank_id": config.rerank_id,
@@ -628,10 +636,9 @@ async def analytics_recent_activity_stats() -> Dict[str, Any]:
                if m < 1:
                    latest_relative = "刚刚"
                elif m < 60:
-                    latest_relative = f"{m}分钟前"
+                    latest_relative = "一会前"
                else:
-                    h = int(m // 60)
-                    latest_relative = f"{h}小时前" if h < 24 else f"{int(h // 24)}天前"
+                    latest_relative = "较早前"
    except Exception:
        pass

--- a/api/app/services/multi_agent_orchestrator.py
+++ b/api/app/services/multi_agent_orchestrator.py
@@ -280,14 +280,22 @@ class MultiAgentOrchestrator:

            # 4. Extract the sub-agent's conversation_id (used for multi-turn conversations)
            sub_conversation_id = None
+            total_tokens = 0
+            
            if isinstance(results, dict):
                sub_conversation_id = results.get("conversation_id") or results.get("result", {}).get("conversation_id")
+                # Extract token usage (fall back to an empty dict when usage is missing or None)
+                usage = results.get("usage") or results.get("result", {}).get("usage") or {}
+                total_tokens += usage.get("total_tokens", 0)
            elif isinstance(results, list) and results:
+                # Sum every sub-agent's tokens up front, because the loop below breaks early
+                total_tokens += sum((item.get("usage") or item.get("result", {}).get("usage") or {})
+                                    .get("total_tokens", 0) for item in results)
                for item in results:
                    if "result" in item:
                        sub_conversation_id = item["result"].get("conversation_id")
                        if sub_conversation_id:
                            break

            logger.info(
                "多 Agent 任务完成",
@@ -301,9 +309,15 @@ class MultiAgentOrchestrator:
            return {
                "message": final_result,
                "conversation_id": sub_conversation_id,
+                "mode": OrchestrationMode.SUPERVISOR,
                "elapsed_time": elapsed_time,
                "strategy": routing_decision.get("collaboration_strategy", "single"),
-                "sub_results": results
+                "sub_results": results,
+                "usage": {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": total_tokens
+                }
            }

        except Exception as e:
@@ -1552,10 +1566,12 @@ class MultiAgentOrchestrator:
            return {
                "message": result.get("response", ""),
                "conversation_id": result.get("conversation_id"),
+                "mode": OrchestrationMode.COLLABORATION,
                "elapsed_time": elapsed_time,
                "strategy": "collaboration",
                "active_agent": result.get("active_agent"),
-                "sub_results": result
+                "sub_results": result,
+                "usage": result.get("usage")
            }

        except Exception as e:
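The token accounting added in the hunks above can be sketched in isolation. This is a minimal illustration, assuming each sub-result is a dict that carries `usage` either at the top level or nested under `result`; `sum_total_tokens` is a hypothetical helper name and does not exist in the repository.

```python
from typing import Any


def sum_total_tokens(results: Any) -> int:
    """Sum `usage.total_tokens` over one result dict or a list of result dicts."""
    # Normalise the two shapes handled by the orchestrator: a dict or a list of dicts.
    if isinstance(results, dict):
        items = [results]
    elif isinstance(results, list):
        items = results
    else:
        items = []

    total = 0
    for item in items:
        if not isinstance(item, dict):
            continue  # ignore anything that is not a result dict
        nested = item.get("result")
        nested_usage = nested.get("usage") if isinstance(nested, dict) else None
        # Prefer top-level usage, then nested usage, then an empty dict.
        usage = item.get("usage") or nested_usage or {}
        total += usage.get("total_tokens", 0)
    return total


if __name__ == "__main__":
    sample = [
        {"result": {"conversation_id": "c1", "usage": {"total_tokens": 3}}},
        {"usage": {"total_tokens": 4}},
        {"result": {}},
    ]
    print(sum_total_tokens(sample))
```

Every item is visited here, so finding a `conversation_id` early does not cut the token sum short.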
--- a/api/app/services/multi_agent_service.py
+++ b/api/app/services/multi_agent_service.py
@@ -1,5 +1,6 @@
 """Multi-agent configuration management service"""
 import uuid
+import json
 from typing import Optional, List, Tuple, Any, Annotated

 from fastapi import Depends
@@ -427,6 +428,23 @@ class MultiAgentService:
            memory=getattr(request, 'memory', True)  # memory feature flag
        )

+        await self._save_conversation_message(
+            conversation_id=request.conversation_id,
+            user_message=request.message,
+            assistant_message=result.get("message", ""),
+            app_id=app_id,
+            user_id=request.user_id,
+            meta_data={
+                "mode": result.get("mode"),
+                "elapsed_time": result.get("elapsed_time"),
+                "usage": result.get("usage", {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": 0
+                })
+            }
+        )
+
        return result

    async def run_stream(
@@ -451,11 +469,14 @@ class MultiAgentService:
            raise ResourceNotFoundException("多 Agent 配置", str(app_id))

        if not config.is_active:
-            raise BusinessException("多 Agent 配置已禁用", BizCode.RESOURCE_DISABLED)
+            raise BusinessException("多 Agent 配置已禁用", BizCode.NOT_FOUND)

        # 2. Create the orchestrator
        orchestrator = MultiAgentOrchestrator(self.db, config)

+        full_content = ""
+        total_tokens = 0
+
        # 3. Execute the task as a stream
        async for event in orchestrator.execute_stream(
            message=request.message,
@@ -468,7 +489,88 @@ class MultiAgentService:
            storage_type=storage_type,
            user_rag_memory_id=user_rag_memory_id
        ):
-            yield event
+            if "sub_usage" in event:
+                if "data:" in event:
+                    try:
+                        data_line = event.split("data: ", 1)[1].strip()
+                        data = json.loads(data_line)
+                        if "total_tokens" in data:
+                            total_tokens += data["total_tokens"]
+                    except Exception:
+                        pass
+            else:
+                yield event
+                if "data:" in event:
+                    try:
+                        data_line = event.split("data: ", 1)[1].strip()
+                        data = json.loads(data_line)
+                        if "content" in data:
+                            full_content += data["content"]
+                    except Exception:
+                        pass
+
+        await self._save_conversation_message(
+            conversation_id=request.conversation_id,
+            user_message=request.message,
+            assistant_message=full_content,
+            app_id=app_id,
+            user_id=request.user_id,
+            meta_data={
+                "usage": {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": total_tokens
+                }
+            }
+        )
+
+    async def _save_conversation_message(
+        self,
+        conversation_id: uuid.UUID,
+        user_message: str,
+        assistant_message: str,
+        meta_data: dict,
+        app_id: Optional[uuid.UUID] = None,
+        user_id: Optional[str] = None
+    ) -> None:
+        """Save the conversation messages.
+
+        Args:
+            conversation_id: Conversation ID
+            user_message: User message
+            assistant_message: Assistant reply message
+            meta_data: Metadata (including token usage)
+            app_id: Application ID
+            user_id: User ID
+        """
+        try:
+            from app.services.conversation_service import ConversationService
+
+            conversation_service = ConversationService(self.db)
+
+            conversation_service.add_message(
+                conversation_id=conversation_id,
+                role="user",
+                content=user_message
+            )
+            conversation_service.add_message(
+                conversation_id=conversation_id,
+                role="assistant",
+                content=assistant_message,
+                meta_data=meta_data
+            )
+
+            logger.debug(
+                "保存多 Agent 会话消息",
+                extra={
+                    "conversation_id": conversation_id,
+                    "user_message_length": len(user_message),
+                    "assistant_message_length": len(assistant_message)
+                }
+            )
+
+        except Exception as e:
+            logger.warning("保存会话消息失败", extra={"error": str(e)})

    # def add_sub_agent(
    #     self,
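The stream handling in `run_stream` above (swallow `sub_usage` frames to count tokens, forward everything else and accumulate `content`) can be sketched as a standalone function. This is a minimal sketch, assuming events are SSE-style strings with a single JSON `data:` line; the frame layout, `parse_sse_data`, and `collect` are illustrative assumptions, not code from the repository.

```python
import json
from typing import Iterable, List, Optional, Tuple


def parse_sse_data(event: str) -> Optional[dict]:
    """Return the JSON object from the first `data:` line of a frame, or None."""
    for line in event.splitlines():
        if line.startswith("data:"):
            try:
                payload = json.loads(line[len("data:"):].strip())
            except json.JSONDecodeError:
                return None
            return payload if isinstance(payload, dict) else None
    return None


def collect(events: Iterable[str]) -> Tuple[str, int, List[str]]:
    """Split a stream into (full_content, total_tokens, forwarded_events)."""
    full_content = ""
    total_tokens = 0
    forwarded: List[str] = []
    for event in events:
        data = parse_sse_data(event) or {}
        if "sub_usage" in event:
            # Usage frames are counted but not forwarded to the client.
            tokens = data.get("total_tokens")
            if isinstance(tokens, int) and not isinstance(tokens, bool):
                total_tokens += tokens
            continue
        forwarded.append(event)
        content = data.get("content")
        if isinstance(content, str):
            full_content += content
    return full_content, total_tokens, forwarded


if __name__ == "__main__":
    demo = [
        'event: message\ndata: {"content": "Hel"}\n\n',
        'event: sub_usage\ndata: {"total_tokens": 12}\n\n',
        'event: message\ndata: {"content": "lo"}\n\n',
    ]
    print(collect(demo)[:2])
```

Parsing line by line and catching only `json.JSONDecodeError` keeps malformed frames from interrupting the stream without hiding unrelated errors.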
--- a/api/app/services/ontology_service.py
+++ b/api/app/services/ontology_service.py
--- a/api/app/services/prompt_optimizer_service.py
+++ b/api/app/services/prompt_optimizer_service.py
@@ -1,3 +1,4 @@
+import os
 import re
 import uuid
 from typing import Any, AsyncGenerator
@@ -18,7 +19,8 @@ from app.models.prompt_optimizer_model import (
 )
 from app.repositories.model_repository import ModelConfigRepository, ModelApiKeyRepository
 from app.repositories.prompt_optimizer_repository import (
-    PromptOptimizerSessionRepository
+    PromptOptimizerSessionRepository,
+    PromptReleaseRepository
 )
 from app.schemas.prompt_optimizer_schema import OptimizePromptResult

@@ -28,6 +30,8 @@ logger = get_business_logger()
 class PromptOptimizerService:
    def __init__(self, db: Session):
        self.db = db
+        self.optim_repo = PromptOptimizerSessionRepository(self.db)
+        self.release_repo = PromptReleaseRepository(self.db)

    def get_model_config(
            self,
@@ -78,10 +82,12 @@ class PromptOptimizerService:
        Returns:
            PromptOptimzerSession: The newly created prompt optimization session.
        """
-        session = PromptOptimizerSessionRepository(self.db).create_session(
+        session = self.optim_repo.create_session(
            tenant_id=tenant_id,
            user_id=user_id
        )
+        self.db.commit()
+        self.db.refresh(session)
        return session

    def get_session_message_history(
@@ -106,7 +112,7 @@ class PromptOptimizerService:
                - role (str): The role of the message sender, e.g., 'system', 'user', or 'assistant'.
                - content (str): The content of the message.
        """
-        history = PromptOptimizerSessionRepository(self.db).get_session_history(
+        history = self.optim_repo.get_session_history(
            session_id=session_id,
            user_id=user_id
        )
@@ -177,11 +183,12 @@ class PromptOptimizerService:
            base_url=api_config.api_base
        ), type=ModelType(model_config.type))
        try:
-            with open('app/services/prompt/prompt_optimizer_system.jinja2', 'r', encoding='utf-8') as f:
+            prompt_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'prompt')
+            with open(os.path.join(prompt_path, 'prompt_optimizer_system.jinja2'), 'r', encoding='utf-8') as f:
                opt_system_prompt = f.read()
            rendered_system_message = Template(opt_system_prompt).render()

-            with open('app/services/prompt/prompt_optimizer_user.jinja2', 'r', encoding='utf-8') as f:
+            with open(os.path.join(prompt_path, 'prompt_optimizer_user.jinja2'), 'r', encoding='utf-8') as f:
                opt_user_prompt = f.read()
        except FileNotFoundError:
            raise BusinessException(message="System prompt template not found", code=BizCode.NOT_FOUND)
@@ -296,4 +303,165 @@ class PromptOptimizerService:
            role=role,
            content=content
        )
+        self.db.commit()
+        self.db.refresh(message)
        return message
+
+    def save_prompt(
+            self,
+            tenant_id: uuid.UUID,
+            session_id: uuid.UUID,
+            title: str,
+            prompt: str
+    ) -> dict:
+        """
+        Create and save a new prompt release for a given session.
+
+        Args:
+            tenant_id (uuid.UUID): The ID of the tenant owning the prompt.
+            session_id (uuid.UUID): The ID of the session to associate with this prompt.
+            title (str): The title of the prompt release.
+            prompt (str): The content of the prompt.
+
+        Returns:
+            dict: A dictionary containing:
+                - id (UUID): The unique ID of the created prompt release.
+                - session_id (UUID): The session ID linked to the release.
+                - title (str): The title of the prompt.
+                - prompt (str): The prompt content.
+                - created_at (int): Timestamp (in milliseconds) of when the prompt was created.
+
+        Raises:
+            BusinessException: If the session is missing or inaccessible, or a release already exists for it.
+        """
+        session = self.optim_repo.get_session_by_id(session_id)
+        if session is None or session.tenant_id != tenant_id:
+            raise BusinessException(
+                "Session does not exist or the current user has no access",
+                BizCode.BAD_REQUEST
+            )
+
+        if self.release_repo.get_prompt_by_session_id(session_id):
+            raise BusinessException(
+                "A release already exists for the current session",
+                BizCode.BAD_REQUEST
+            )
+
+        prompt_obj = self.release_repo.create_prompt_release(
+            tenant_id=tenant_id,
+            title=title,
+            session_id=session_id,
+            prompt=prompt
+        )
+        self.db.commit()
+        self.db.refresh(prompt_obj)
+        return {
+            "id": prompt_obj.id,
+            "session_id": prompt_obj.session_id,
+            "title": prompt_obj.title,
+            "prompt": prompt_obj.prompt,
+            "created_at": int(prompt_obj.created_at.timestamp() * 1000)
+        }
+
+    def delete_prompt(
+            self,
+            tenant_id: uuid.UUID,
+            prompt_id: uuid.UUID
+    ) -> None:
+        """
+        Soft delete a prompt release by prompt_id.
+
+        Args:
+            tenant_id (uuid.UUID): Tenant identifier.
+            prompt_id (uuid.UUID): Prompt identifier.
+
+        Raises:
+            BusinessException: If the prompt does not exist, has already been deleted, or belongs to another tenant.
+        """
+        prompt_obj = self.release_repo.get_prompt_by_id(prompt_id)
+        if not prompt_obj or prompt_obj.is_delete:
+            raise BusinessException(
+                "Prompt does not exist or has already been deleted",
+                BizCode.NOT_FOUND
+            )
+
+        if prompt_obj.tenant_id != tenant_id:
+            raise BusinessException(
+                "No permission to delete this prompt",
+                BizCode.FORBIDDEN
+            )
+
+        self.release_repo.soft_delete_prompt(prompt_obj)
+        self.db.commit()
+        logger.info(f"Prompt soft deleted, prompt_id={prompt_id}, tenant_id={tenant_id}")
+
+    def get_release_list(
+            self,
+            tenant_id: uuid.UUID,
+            page: int,
+            page_size: int,
+            filter_keyword: str | None = None
+    ) -> dict[str, int | list[Any]]:
+        """
+        Get paginated list of prompt releases with optional filter.
+
+        Args:
+            tenant_id (uuid.UUID): Tenant identifier.
+            page (int): Page number (starting from 1).
+            page_size (int): Number of items per page.
+            filter_keyword (str | None): Optional keyword to filter by title.
+
+        Returns:
+            dict: Contains total count, pagination info, and list of releases.
+        """
+        offset = (page - 1) * page_size
+
+        # Get total count and releases based on filter
+        if filter_keyword:
+            total = self.release_repo.count_prompts_by_keyword(tenant_id, filter_keyword)
+            releases = self.release_repo.search_prompts_paginated(
+                tenant_id=tenant_id,
+                keyword=filter_keyword,
+                offset=offset,
+                limit=page_size
+            )
+        else:
+            total = self.release_repo.count_prompts(tenant_id)
+            releases = self.release_repo.get_prompts_paginated(
+                tenant_id=tenant_id,
+                offset=offset,
+                limit=page_size
+            )
+
+        items = []
+        for release in releases:
+            # Get first user message from session
+            first_message = self.optim_repo.get_first_user_message(
+                session_id=release.session_id
+            )
+
+            items.append({
+                "id": release.id,
+                "title": release.title,
+                "prompt": release.prompt,
+                "created_at": int(release.created_at.timestamp() * 1000),
+                "first_message": first_message
+            })
+
+        log_msg = f"Retrieved {len(items)} prompt releases, page={page}, tenant_id={tenant_id}"
+        if filter_keyword:
+            log_msg += f", filter='{filter_keyword}'"
+        logger.info(log_msg)
+
+        result = {
+            "page": {
+                "total": total,
+                "page": page,
+                "page_size": page_size,
+                "hasnext": page * page_size < total
+            },
+            "keyword": filter_keyword,
+            "items": items
+        }
+
+        return result
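The pagination arithmetic used by `get_release_list` can be checked on its own. The sketch below assumes 1-based page numbers, as the docstring states; `page_window` is a hypothetical helper name used only for illustration.

```python
def page_window(page: int, page_size: int, total: int) -> dict:
    """Compute the query offset and the `hasnext` flag for a 1-based page."""
    offset = (page - 1) * page_size
    return {
        "offset": offset,
        "limit": page_size,
        # There is a next page only if items remain past the end of this one.
        "hasnext": page * page_size < total,
    }


if __name__ == "__main__":
    print(page_window(page=1, page_size=10, total=25))
    print(page_window(page=3, page_size=10, total=25))
```

With 25 items and a page size of 10, page 1 starts at offset 0 and has a next page, while page 3 starts at offset 20 and is the last one.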
--- a/api/app/services/shared_chat_service.py
+++ b/api/app/services/shared_chat_service.py
@@ -282,7 +282,14 @@ class SharedChatService:
        self.conversation_service.save_conversation_messages(
            conversation_id=conversation.id,
            user_message=message,
-            assistant_message=result["content"]
+            assistant_message=result["content"],
+            meta_data={
+                "usage": result.get("usage", {
+                    "prompt_tokens": 0,
+                    "completion_tokens": 0,
+                    "total_tokens": 0
+                })
+            }
        )
        # self.conversation_service.add_message(
        #     conversation_id=conversation.id,
@@ -469,6 +476,7 @@ class SharedChatService:
            
            # Stream the agent call
            full_content = ""
+            total_tokens = 0
            async for chunk in agent.chat_stream(
                message=message,
                history=history,
@@ -479,9 +487,12 @@ class SharedChatService:
                config_id=config_id,
                memory_flag=memory_flag
            ):
-                full_content += chunk
-                # Emit the message chunk event
-                yield f"event: message\ndata: {json.dumps({'content': chunk}, ensure_ascii=False)}\n\n"
+                if isinstance(chunk, int):
+                    total_tokens = chunk
+                else:
+                    full_content += chunk
+                    # Emit the message chunk event
+                    yield f"event: message\ndata: {json.dumps({'content': chunk}, ensure_ascii=False)}\n\n"
            
            elapsed_time = time.time() - start_time
            
@@ -498,7 +509,7 @@ class SharedChatService:
                content=full_content,
                meta_data={
                    "model": api_key_obj.model_name,
-                    "usage": {}
+                    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": total_tokens}
                }
            )

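The streaming hunk above treats an `int` chunk as the token total and any other chunk as text. A minimal sketch of that convention follows, assuming the agent yields text chunks followed by a single integer; `split_stream` is a hypothetical name, and the `bool` exclusion is an added safeguard that is not in the patch.

```python
from typing import Iterable, Tuple, Union


def split_stream(chunks: Iterable[Union[str, int]]) -> Tuple[str, int]:
    """Separate text chunks from an integer token count in a mixed stream."""
    parts = []
    total_tokens = 0
    for chunk in chunks:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(chunk, int) and not isinstance(chunk, bool):
            total_tokens = chunk  # the last integer seen wins
        else:
            parts.append(chunk)
    return "".join(parts), total_tokens


if __name__ == "__main__":
    print(split_stream(["Hi", " there", 42]))
```

Mixing payload types in one stream is compact but implicit; a tagged tuple or a small dataclass would make the token count harder to confuse with content.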
--- a/api/app/services/user_memory_service.py
+++ b/api/app/services/user_memory_service.py
@@ -15,6 +15,7 @@ from app.core.memory.utils.llm.llm_utils import MemoryClientFactory
 from app.db import get_db_context
 from app.repositories.conversation_repository import ConversationRepository
 from app.repositories.end_user_repository import EndUserRepository
+from app.repositories.neo4j.cypher_queries import Graph_Node_query
 from app.repositories.neo4j.neo4j_connector import Neo4jConnector
 from app.schemas.memory_episodic_schema import EmotionSubject, EmotionType, type_mapping
 from app.services.implicit_memory_service import ImplicitMemoryService
@@ -1508,7 +1509,6 @@ async def analytics_graph_data(
        user_uuid = uuid.UUID(end_user_id)
        repo = EndUserRepository(db)
        end_user = repo.get_by_id(user_uuid)
-        
        if not end_user:
            logger.warning(f"未找到 end_user_id 为 {end_user_id} 的用户")
            return {
@@ -1562,21 +1562,11 @@ async def analytics_graph_data(
            }
        else:
            # Query all nodes
-            node_query = """
-            MATCH (n)
-            WHERE n.end_user_id = $end_user_id
-            RETURN 
-                elementId(n) as id,
-                labels(n)[0] as label,
-                properties(n) as properties
-            LIMIT $limit
-            """
+            node_query = Graph_Node_query
            node_params = {
                "end_user_id": end_user_id,
                "limit": limit
            }
-
-
        # Run the node query
        node_results = await _neo4j_connector.execute_query(node_query, **node_params)
        
@@ -1587,9 +1577,9 @@ async def analytics_graph_data(
        
        for record in node_results:
            node_id = record["id"]
-            node_label = record["label"]
+            node_labels = record.get("labels", [])
+            node_label = node_labels[0] if node_labels else "Unknown"
            node_props = record["properties"]
-            
            # Extract the required property fields based on the node type
            filtered_props = await _extract_node_properties(node_label, node_props,node_id)