Commit Graph

59 Commits

Author SHA1 Message Date
Mark
f8d1ed51a7 [fix] system prompt fit error 2026-05-07 19:37:34 +08:00
Mark
9fa83ed01e [modify] QA pair 2026-05-07 19:04:19 +08:00
Mark
ad2e885f72 [fix] index_not_found_exception 2026-05-06 18:34:07 +08:00
Mark
70c6d161c8 [fix] delete chunk refresh index 2026-05-06 15:19:46 +08:00
Mark
f85c0594c9 [fix] es vector 2026-04-29 15:24:25 +08:00
Mark
64e640d882 [add] batch chunk. qa_prompt set 2026-04-28 15:33:44 +08:00
Mark
140311048a [modify] rag qa chunk 2026-04-28 14:04:36 +08:00
Mark
2997558bc8 Merge branch 'release/v0.3.2' into feature/rag2
* release/v0.3.2: (245 commits)
  fix(conversation_schema): refine citations field type to Dict[str, Any]
  fix(tool_controller): re-raise HTTPException to preserve original status codes
  fix(workflow): add reasoning content, suggested questions, citations and audio status support
  feat(workflow): augment logging queries and ameliorate error handling
  fix(api_key): bypass publication check for SERVICE type API keys
  fix(multimodal_service): add '文档内容:' prefix to document text and simplify image placeholder text
  fix(api): convert config_id to string in write_router
  fix(api): convert end_user_id to string in write_router
  fix(multimodal_service): refactor image processing to use intermediate list before extending result
  fix(web): node status ui
  fix(api): correct import paths in memory_read and celery task command
  fix(api): correct import paths in memory_read and celery task command
  refactor(tool): flatten request body parameters for model exposure
  fix(api): correct import paths in memory_read and celery task command
  refactor(workflow): streamline node execution handling and log service logic
  feat(web): http request add process
  feat(web): workflow app logs
  fix(app_chat_service,draft_run_service): move system_prompt augmentation before LangChainAgent instantiation
  fix(app_chat_service,draft_run_service): move system_prompt augmentation before LangChainAgent instantiation
  refactor(http_request): simplify request handling and remove unused fields
  ...

# Conflicts:
#	api/app/controllers/file_controller.py
#	api/app/tasks.py
2026-04-27 16:13:57 +08:00
Mark
30cdf229de [modify] rag file system 2026-04-27 16:05:27 +08:00
Mark
1ea0f308ba [fix] celery task 2026-04-22 11:47:32 +08:00
Mark
aecb0f6497 Merge branch 'feature/rag2' into release/v0.3.1
* feature/rag2:
  [modify] fix
  [modify] Optimize ES connections and add rerank security checks
2026-04-21 13:44:39 +08:00
Timebomb2018
441b21774d fix(rag): replace semicolon separators with newlines in Excel parser output 2026-04-14 17:56:30 +08:00
Timebomb2018
75e95bab01 refactor(rag): simplify Excel parsing logic and remove redundant chunk_token_num assignment 2026-04-14 17:10:52 +08:00
Mark
3b359df02f [modify] fix 2026-04-14 17:02:11 +08:00
Mark
fcf3071cb0 [modify] Optimize ES connections and add rerank security checks 2026-04-14 16:46:57 +08:00
Timebomb2018
e3265e4ba3 fix(http-request,embedding,naive): tighten form-data validation, reduce truncation length to 8000, and disable chunking for Excel
The form-data validation now ensures all items in the list are of type HttpFormData. Truncation length for embedding inputs is reduced from 8191 to 8000 to accommodate tokenizer differences and avoid overflow. Excel parsing now disables chunking by setting chunk_token_num to 0, aligning with intended behavior for structured file ingestion.
2026-04-14 16:14:01 +08:00
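The three changes described in e3265e4ba3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the `HttpFormData` fields, the helper names `validate_form_data`, `truncate_for_embedding` and `chunk_token_num_for`, and the default chunk size of 128 are all hypothetical. The sketch also assumes the truncation limit is applied to a pre-tokenized list; the real code may measure length differently.

```python
from dataclasses import dataclass
from typing import Any, List

MAX_EMBEDDING_LENGTH = 8000  # reduced from 8191, per the commit message


@dataclass
class HttpFormData:
    """Hypothetical stand-in for the project's form-data item type."""
    key: str
    value: str


def validate_form_data(items: Any) -> List[HttpFormData]:
    """Reject the payload unless it is a list whose items are all HttpFormData."""
    if not isinstance(items, list):
        raise TypeError("form-data must be a list")
    for index, item in enumerate(items):
        if not isinstance(item, HttpFormData):
            raise TypeError(f"form-data item {index} is not HttpFormData")
    return items


def truncate_for_embedding(tokens: List[str],
                           limit: int = MAX_EMBEDDING_LENGTH) -> List[str]:
    """Keep at most `limit` units of the input (tokens here, by assumption)."""
    return tokens[:limit]


def chunk_token_num_for(file_name: str, default: int = 128) -> int:
    """Return 0 (chunking disabled) for Excel files, else the default size."""
    return 0 if file_name.lower().endswith((".xls", ".xlsx")) else default
```

Leaving headroom below 8191 means an input that a slightly different tokenizer counts as longer is still unlikely to exceed the model's limit.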
Timebomb2018
ca4f7aa65d refactor(rag/nlp): refactor reranking logic to apply post-deduplication and remove debug log 2026-04-09 19:35:43 +08:00
Timebomb2018
130684cac0 refactor(rag/nlp): standardize knowledge graph retrieval to use DocumentChunk and add debug logging
The knowledge graph retrieval logic in `search.py` was updated to consistently return `DocumentChunk` instances instead of raw dictionaries, improving type safety and alignment with the RAG pipeline's expected data structure. Additionally, debug logging was enhanced in `draft_run_service.py` to log the full `retrieve_chunks_result` before extracting page content, aiding troubleshooting.
2026-04-09 19:07:53 +08:00
Timebomb2018
a7b8ba0c66 fix(rag): fix pdfplumber concurrency issue and add debug logging
The pdfplumber parser now uses a global lock to prevent concurrent access issues during PDF image rendering. A warning log was also added to trace knowledge retrieval results for debugging. A syntax fix in the knowledge node's match-case statement ensures correct pattern matching.

BREAKING CHANGE: The pdfplumber parser now requires LOCK_KEY_pdfplumber to be defined in sys.modules for thread safety.

Closes #841
2026-04-09 17:48:16 +08:00
Timebomb2018
70aab94fc3 feat(knowledge): support graph retrieval type with dynamic API key selection 2026-04-09 15:00:49 +08:00
Timebomb2018
54cff5861a feat(model): add volcano model 2026-03-25 11:45:49 +08:00
lixiangcheng1
4bc030c1ef Merge branch 'feature/knowledge_lxc' into develop 2026-03-19 15:10:45 +08:00
lixiangcheng1
86a0aa1f9f [fix] Nested query for folder knowledge base retrieval 2026-03-19 15:08:50 +08:00
lixiangcheng1
d77220a603 Merge branch 'feature/knowledge_lxc' into develop 2026-03-19 08:19:24 +08:00
lixiangcheng1
f52b681133 [fix] Nested query of folder knowledge base 2026-03-19 08:17:58 +08:00
lixiangcheng1
b31e526e4d Merge branch 'feature/knowledge_lxc' into develop 2026-02-10 14:09:52 +08:00
lixiangcheng1
26abf7b586 [fix] parse excel 2026-02-10 14:05:01 +08:00
Mark
7b72bf0cd0 Merge branch 'release/v0.2.3' into develop
# Conflicts:
#	api/app/core/agent/langchain_agent.py
#	api/app/core/memory/agent/langgraph_graph/write_graph.py
#	api/app/repositories/neo4j/graph_saver.py
#	api/app/services/draft_run_service.py
2026-02-06 14:48:50 +08:00
lixiangcheng1
db46c186aa [ADD] Third-party synchronization
1. Third-party website data access: website synchronization
Build a knowledge base by crawling web pages in batches.
Website synchronization uses a crawler that starts from a single entry URL and automatically captures the pages under the same domain name. It currently supports up to 200 subpages. For compliance and security reasons, only static sites can be crawled. The feature is mainly intended for quickly building knowledge bases from documentation sites.
2. Feishu knowledge base
By configuring Feishu document permissions, a knowledge base can be built from Feishu documents. The documents are not stored a second time.
3. Yuque knowledge base
By configuring Yuque document permissions, a knowledge base can be built from Yuque documents. The documents are not stored a second time.
2026-02-06 12:18:40 +08:00
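The crawl behaviour described in db46c186aa (single entry URL, same domain only, at most 200 subpages) can be sketched as a breadth-first traversal. This is an illustration, not the project's crawler: `crawl_same_domain` and the injected `fetch_links` callable are hypothetical, and the sketch assumes the 200-page cap counts every visited page, including the entry page.

```python
from collections import deque
from typing import Callable, Iterable, List
from urllib.parse import urljoin, urlparse

MAX_SUBPAGES = 200  # limit stated in the commit message


def crawl_same_domain(entry_url: str,
                      fetch_links: Callable[[str], Iterable[str]],
                      max_pages: int = MAX_SUBPAGES) -> List[str]:
    """Breadth-first crawl from entry_url, staying on its host, up to max_pages."""
    host = urlparse(entry_url).netloc
    seen = {entry_url}
    queue = deque([entry_url])
    visited: List[str] = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in fetch_links(url):
            # Resolve relative links and drop any #fragment.
            absolute = urljoin(url, href).split("#", 1)[0]
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing `fetch_links` in as a parameter keeps the traversal logic independent of how a page is downloaded and parsed, so it can be exercised with an in-memory link map instead of live HTTP requests.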
lixinyue
7922fc3b0e knowledge_retrieval/bug/fix 2026-02-04 15:53:13 +08:00
lixinyue
514c19a247 knowledge_retrieval/bug/fix 2026-02-04 15:51:13 +08:00
lixinyue
41550d4a41 knowledge_retrieval/bug/fix 2026-02-04 15:44:26 +08:00
lixiangcheng1
3aa2cdd754 Merge branch 'feature/knowledge_lxc' into develop 2026-01-27 18:30:56 +08:00
lixiangcheng1
d93d52cf10 [fix]remove aspose-slides 2026-01-27 18:30:27 +08:00
lixinyue11
3601737869 Fix/memory bug fix (#171) 2026-01-26 11:53:34 +08:00
lixiangcheng1
eb58e0ea63 [ADD]transcribing the content of MP4 video files into text and precisely marking the timestamps 2026-01-19 15:27:54 +08:00
lixiangcheng1
46752420da [ADD]transcribing the content of MP3 audio files into text and precisely marking the timestamps 2026-01-19 13:33:06 +08:00
lixiangcheng1
7165d53982 [fix]Clearly debug the model API key 2026-01-07 20:45:23 +08:00
lixiangcheng1
742d54342b [fix]parsed excel document error:float division by zero 2025-12-31 12:58:30 +08:00
lixiangcheng1
c78dc1fd47 [fix]parsed excel document error:float division by zero 2025-12-31 09:51:00 +08:00
lixiangcheng1
07bcb54ed3 [fix]entity_resolution.py:199: SyntaxWarning: invalid escape sequence '\d' 2025-12-31 08:54:36 +08:00
lixiangcheng1
37f72f919f [fix]parsed excel document error:float division by zero 2025-12-30 19:31:54 +08:00
lixiangcheng1
775d36b16b [fix]parsed excel document error:float division by zero 2025-12-30 19:05:29 +08:00
lixiangcheng1
909c536b47 [fix]parsed excel document error:float division by zero 2025-12-30 18:20:34 +08:00
lixiangcheng1
724eb4f801 [ADD]Intelligent physical examination inquiry knowledge base adds usage graph options 2025-12-30 15:21:48 +08:00
lixiangcheng1
0078028992 [ADD]Support graph search 2025-12-30 11:53:16 +08:00
lixiangcheng1
6defcaf982 [fix] PNG image failed to parse after uploading
[TAPD] ID: 1004154
2025-12-29 17:18:51 +08:00
lixiangcheng1
34fa178f11 [fix]build_graphrag_for_kb 2025-12-29 11:55:17 +08:00
lixiangcheng1
cd12844a7c [fix]build knowledge graph 2025-12-27 17:12:04 +08:00
lixiangcheng1
a0c362244e [ADD] Add functions related to knowledge base graph:
1. Entity type generation
2. Knowledge base graph acquisition
3. Hard deletion of knowledge base graph
4. Knowledge base graph reconstruction (asynchronous)
2025-12-27 13:53:10 +08:00