%% MemoryBear online retrieval sequence diagram (Query Pipeline)
%% Start: user query. End: LLM-generated answer.
sequenceDiagram
autonumber
participant User as User/API
participant WF as Workflow Engine<br/>(workflow/nodes/knowledge/node.py)
participant Config as config.py<br/>KnowledgeRetrievalNodeConfig
participant Retriever as nlp/search.py<br/>knowledge_retrieval()
participant Dealer as nlp/search.py:349<br/>Dealer.search()
participant Qryr as nlp/query.py<br/>Query understanding
participant ESVec as ESVector<br/>elasticsearch_vector.py
participant Graph as graphrag/search.py<br/>KGSearch.retrieval()
participant Rerank as models/rerank.py<br/>RedBearRerank
participant Prompt as prompts/generator.py
participant LLM as llm/chat_model.py<br/>Base.chat()
participant Cache as utils/redis_conn.py
Note over User,Cache: === Phase 1: Query preparation ===
User->>WF: User submits query
WF->>WF: _render_template(query, variable_pool)
WF->>Config: Read knowledge_bases[]<br/>reranker_id / retrieve_type
Note over Retriever,ESVec: === Phase 2: Multi-knowledge-base retrieval ===
loop Each knowledge base
WF->>Retriever: knowledge_retrieval(query, config)
Retriever->>DB: Validate KB status (chunk_num>0, status=1)
alt RetrieveType == PARTICIPLE
Retriever->>ESVec: search_by_full_text(query, top_k)
ESVec->>ESVec: match on page_content (ik_max_word)
ESVec-->>Retriever: List[DocumentChunk]
else RetrieveType == SEMANTIC
Retriever->>ESVec: search_by_vector(query, top_k)
ESVec->>ESVec: script_score cosineSimilarity
ESVec-->>Retriever: List[DocumentChunk]
else RetrieveType == HYBRID
par
Retriever->>ESVec: search_by_vector()
ESVec-->>Retriever: rs1
and
Retriever->>ESVec: search_by_full_text()
ESVec-->>Retriever: rs2
end
Retriever->>Retriever: _deduplicate_docs(rs1, rs2)
Retriever->>Rerank: rerank(query, docs, top_k)
Rerank->>Rerank: similarity() cross-encoder scoring
Rerank-->>Retriever: sorted docs
else RetrieveType == GRAPH
par
Retriever->>ESVec: search_by_vector()
ESVec-->>Retriever: rs1
and
Retriever->>ESVec: search_by_full_text()
ESVec-->>Retriever: rs2
end
Retriever->>Retriever: dedup + rerank
Retriever->>Graph: kg_retriever.retrieval(question)
Graph->>Graph: query_rewrite() → keywords + entities
Graph->>ESVec: get_relevant_ents_by_keywords()
Graph->>ESVec: get_relevant_relations_by_txt()
Graph->>Graph: n_hop_with_weight path expansion
Graph->>Graph: Score = pagerank * sim
Graph->>Graph: _community_retrieval_()
Graph-->>Retriever: Entity+Relation+CommunityReport chunk
Retriever->>Retriever: insert(0, graph_result)
end
Retriever-->>WF: List[DocumentChunk]
end
WF->>WF: _deduplicate_docs(all_results)
alt reranker_id configured
WF->>Rerank: rerank(query, all_results, reranker_top_k)
Rerank-->>WF: reranked chunks
end
Note over Prompt,Cache: === Phase 3: Prompt assembly + LLM generation ===
WF->>WF: Return {"chunks": [...], "citations": [...]}
WF->>Prompt: citation_prompt(chunks)
Prompt->>Prompt: Assemble system prompt + retrieved context
Prompt->>Cache: get_llm_cache(model, prompt)
alt cache miss
Prompt->>LLM: chat(system, history, gen_conf)
LLM-->>Prompt: answer, tokens
Prompt->>Cache: set_llm_cache(model, prompt, answer)
else cache hit
Cache-->>Prompt: cached answer
end
Note over User,Cache: === Phase 4: Post-processing ===
Prompt->>Dealer: insert_citations(answer, chunks, chunk_v)
Dealer->>Dealer: Locate citation positions via pagerank*sim
Dealer-->>Prompt: answer_with_citations, cited_ids
Prompt-->>User: Final answer (with citation markers)