50 Commits

Author SHA1 Message Date
Mark
1ea0f308ba [fix] celery task 2026-04-22 11:47:32 +08:00
Mark
aecb0f6497 Merge branch 'feature/rag2' into release/v0.3.1
* feature/rag2:
  [modify] fix
  [modify] Optimize ES connections and add rerank security checks
2026-04-21 13:44:39 +08:00
Timebomb2018
441b21774d fix(rag): replace semicolon separators with newlines in Excel parser output 2026-04-14 17:56:30 +08:00
Timebomb2018
75e95bab01 refactor(rag): simplify Excel parsing logic and remove redundant chunk_token_num assignment 2026-04-14 17:10:52 +08:00
Mark
3b359df02f [modify] fix 2026-04-14 17:02:11 +08:00
Mark
fcf3071cb0 [modify] Optimize ES connections and add rerank security checks 2026-04-14 16:46:57 +08:00
Timebomb2018
e3265e4ba3 fix(http-request,embedding,naive): tighten form-data validation, reduce truncation length to 8000, and disable chunking for Excel
The form-data validation now ensures all items in the list are of type HttpFormData. Truncation length for embedding inputs is reduced from 8191 to 8000 to accommodate tokenizer differences and avoid overflow. Excel parsing now disables chunking by setting chunk_token_num to 0, aligning with intended behavior for structured file ingestion.
2026-04-14 16:14:01 +08:00
Timebomb2018
ca4f7aa65d refactor(rag/nlp): refactor reranking logic to apply post-deduplication and remove debug log 2026-04-09 19:35:43 +08:00
Timebomb2018
130684cac0 refactor(rag/nlp): standardize knowledge graph retrieval to use DocumentChunk and add debug logging
The knowledge graph retrieval logic in `search.py` was updated to consistently return `DocumentChunk` instances instead of raw dictionaries, improving type safety and alignment with the RAG pipeline's expected data structure. Additionally, debug logging was enhanced in `draft_run_service.py` to log the full `retrieve_chunks_result` before extracting page content, aiding troubleshooting.
2026-04-09 19:07:53 +08:00
Timebomb2018
a7b8ba0c66 fix(rag): fix pdfplumber concurrency issue and add debug logging
The pdfplumber parser now uses a global lock to prevent concurrent access issues during PDF image rendering. Additionally, added a warning log to trace knowledge retrieval results for debugging purposes. The syntax fix in knowledge node's match case ensures correct pattern matching behavior.

BREAKING CHANGE: The pdfplumber parser now requires LOCK_KEY_pdfplumber to be defined in sys.modules for thread safety.

Closes #841
2026-04-09 17:48:16 +08:00
Timebomb2018
70aab94fc3 feat(knowledge): support graph retrieval type with dynamic API key selection 2026-04-09 15:00:49 +08:00
Timebomb2018
54cff5861a feat(model): add volcano model 2026-03-25 11:45:49 +08:00
lixiangcheng1
4bc030c1ef Merge branch 'feature/knowledge_lxc' into develop 2026-03-19 15:10:45 +08:00
lixiangcheng1
86a0aa1f9f [fix] Nested query of folder knowledge base retrieval 2026-03-19 15:08:50 +08:00
lixiangcheng1
d77220a603 Merge branch 'feature/knowledge_lxc' into develop 2026-03-19 08:19:24 +08:00
lixiangcheng1
f52b681133 [fix] Nested query of folder knowledge base 2026-03-19 08:17:58 +08:00
lixiangcheng1
b31e526e4d Merge branch 'feature/knowledge_lxc' into develop 2026-02-10 14:09:52 +08:00
lixiangcheng1
26abf7b586 [fix] parse excel 2026-02-10 14:05:01 +08:00
Mark
7b72bf0cd0 Merge branch 'release/v0.2.3' into develop
# Conflicts:
#	api/app/core/agent/langchain_agent.py
#	api/app/core/memory/agent/langgraph_graph/write_graph.py
#	api/app/repositories/neo4j/graph_saver.py
#	api/app/services/draft_run_service.py
2026-02-06 14:48:50 +08:00
lixiangcheng1
db46c186aa [ADD] Third-party synchronization
1. Third-party website data access: website synchronization
Builds a knowledge base by crawling web pages in batches.
Website synchronization uses a crawler that, starting from a single entry URL, automatically captures the pages under the same domain name. It currently supports up to 200 subpages. For compliance and security reasons, only static sites can be crawled; the feature is mainly intended for quickly building knowledge bases from documentation sites.
2. Feishu knowledge base
Once Feishu document permissions are configured, a knowledge base can be built from Feishu documents; the documents are not stored a second time.
3. Yuque knowledge base
Once Yuque document permissions are configured, a knowledge base can be built from Yuque documents; the documents are not stored a second time.
2026-02-06 12:18:40 +08:00
lixinyue
7922fc3b0e knowledge_retrieval/bug/fix 2026-02-04 15:53:13 +08:00
lixinyue
514c19a247 knowledge_retrieval/bug/fix 2026-02-04 15:51:13 +08:00
lixinyue
41550d4a41 knowledge_retrieval/bug/fix 2026-02-04 15:44:26 +08:00
lixiangcheng1
3aa2cdd754 Merge branch 'feature/knowledge_lxc' into develop 2026-01-27 18:30:56 +08:00
lixiangcheng1
d93d52cf10 [fix]remove aspose-slides 2026-01-27 18:30:27 +08:00
lixinyue11
3601737869 Fix/memory bug fix (#171) 2026-01-26 11:53:34 +08:00
lixiangcheng1
eb58e0ea63 [ADD] Transcribe MP4 video files into text with precise timestamps 2026-01-19 15:27:54 +08:00
lixiangcheng1
46752420da [ADD] Transcribe MP3 audio files into text with precise timestamps 2026-01-19 13:33:06 +08:00
lixiangcheng1
7165d53982 [fix]Clearly debug the model API key 2026-01-07 20:45:23 +08:00
lixiangcheng1
742d54342b [fix]parsed excel document error:float division by zero 2025-12-31 12:58:30 +08:00
lixiangcheng1
c78dc1fd47 [fix]parsed excel document error:float division by zero 2025-12-31 09:51:00 +08:00
lixiangcheng1
07bcb54ed3 [fix]entity_resolution.py:199: SyntaxWarning: invalid escape sequence '\d' 2025-12-31 08:54:36 +08:00
lixiangcheng1
37f72f919f [fix]parsed excel document error:float division by zero 2025-12-30 19:31:54 +08:00
lixiangcheng1
775d36b16b [fix]parsed excel document error:float division by zero 2025-12-30 19:05:29 +08:00
lixiangcheng1
909c536b47 [fix]parsed excel document error:float division by zero 2025-12-30 18:20:34 +08:00
lixiangcheng1
724eb4f801 [ADD]Intelligent physical examination inquiry knowledge base adds usage graph options 2025-12-30 15:21:48 +08:00
lixiangcheng1
0078028992 [ADD]Support graph search 2025-12-30 11:53:16 +08:00
lixiangcheng1
6defcaf982 [fix] PNG image failed to parse after uploading
[TAPD] ID: 1004154
2025-12-29 17:18:51 +08:00
lixiangcheng1
34fa178f11 [fix]build_graphrag_for_kb 2025-12-29 11:55:17 +08:00
lixiangcheng1
cd12844a7c [fix]build knowledge graph 2025-12-27 17:12:04 +08:00
lixiangcheng1
a0c362244e [ADD] Add functions related to knowledge base graph:
1. Entity type generation
2. Knowledge base graph acquisition
3. Hard deletion of knowledge base graph
4. Knowledge base graph reconstruction (asynchronous)
2025-12-27 13:53:10 +08:00
lixiangcheng1
6338edda11 [ADD]Support parsing of unstructured data MP3, MP4, etc 2025-12-24 17:50:03 +08:00
朱文辉
0c3486248f Merge #41 into develop from feature/20251219_myh
feat(workflow): implement a workflow node for knowledge base retrieval

* feature/20251219_myh: (3 commits)
  feat(workflow): support multi-variable assignment in assigner node
  feat(workflow): implement a workflow node for knowledge base retrieval
  fix(template): remove default initial model in templates

Signed-off-by: Eternity <1533512157@qq.com>
Reviewed-by: zhuwenhui5566@163.com <zhuwenhui5566@163.com>
Merged-by: zhuwenhui5566@163.com <zhuwenhui5566@163.com>

CR-link: https://codeup.aliyun.com/redbearai/python/redbear-mem-open/change/41
2025-12-24 12:21:14 +08:00
mengyonghao
8c4d31e4d5 feat(workflow): implement a workflow node for knowledge base retrieval 2025-12-24 12:10:52 +08:00
lixiangcheng1
879b3da7ef [fix] mineru parser 2025-12-24 11:13:03 +08:00
lixiangcheng1
c0d6604981 [fix]document chunk QA 2025-12-18 18:51:32 +08:00
lixiangcheng1
64ecd8cabc [fix]Add knowledge graph functionality to document parsing configuration 2025-12-18 15:54:31 +08:00
Mark
7386ea32f1 [modify] ignore 2025-12-16 17:39:08 +08:00
Mark
a4e276ab27 [MODIFY] Code optimization 2025-12-15 14:09:43 +08:00
Ke Sun
c1adc62ec6 feat: Add base project structure with API and web components 2025-12-02 20:28:01 +08:00