Commit Graph

31 Commits

Author SHA1 Message Date
Timebomb2018
a268d0f7f1 fix(multimodal_service): add '文档内容:' ("Document content:") prefix to document text and simplify image placeholder text 2026-04-27 12:25:27 +08:00
Timebomb2018
141fd94513 fix(multimodal_service): refactor image processing to use intermediate list before extending result 2026-04-24 19:40:57 +08:00
Timebomb2018
74be09340c feat(multimodal): support tenant-aware document image storage and improve image placeholder labeling
- Pass workspace_id to multimodal_service.process_files across app_chat_service, draft_run_service
- Fetch tenant_id from workspace in multimodal_service for proper file storage scoping
- Update image placeholder format from "[第N页 第M张图片]" (roughly "[Page N, image M]") to "[图片 第N页 第M张图片]" (roughly "[Image: page N, image M]") for clarity
- Add strict URL preservation rules to system prompt for agents handling document images
- Refactor _save_doc_image_to_storage to accept explicit tenant_id and workspace_id instead of inferring from FileMetadata
2026-04-24 15:56:06 +08:00
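The placeholder change in 74be09340c can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from multimodal_service; the sketch also assumes that N (page) and M (image index within the page) are 1-based, which the commit message does not state.

```python
def old_image_placeholder(page: int, index: int) -> str:
    """Placeholder format before the change: "[第N页 第M张图片]"."""
    return f"[第{page}页 第{index}张图片]"


def new_image_placeholder(page: int, index: int) -> str:
    """Placeholder format after the change, with a leading "图片" (image) label."""
    return f"[图片 第{page}页 第{index}张图片]"


if __name__ == "__main__":
    # Show both formats for the third image on page 2 (1-based, by assumption).
    print(old_image_placeholder(2, 3))
    print(new_image_placeholder(2, 3))
```

The leading "图片" label makes the placeholder recognizable as an image reference even when the page and index are read out of context.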
Timebomb2018
767eb5e6f2 feat(multimodal): support document image extraction and inline vision processing
Added document image extraction capability for PDF and DOCX files, including page/index metadata and storage integration. Extended `process_files` with `document_image_recognition` flag to conditionally enable vision-based image processing when model supports it. Updated knowledge repository and workflow node logic to enforce status=1 checks. Added PyMuPDF dependency.
2026-04-24 11:18:50 +08:00
Ke Sun
3ea42ac27f Merge remote-tracking branch 'origin/release/v0.2.9' into develop 2026-03-31 19:16:13 +08:00
Timebomb2018
f485398768 fix(workflow):
Parsing of DOC files
2026-03-27 19:13:51 +08:00
Eternity
bca43fcc75 perf(workflow): expose extract_document_text as instance method, optimize knowledge base parallel search
- Change extract_document_text from private to instance method in multimodal service for external access
- Optimize knowledge base search logic to improve parallel retrieval performance
2026-03-27 12:23:18 +08:00
Timebomb2018
68489f1b28 feat(workflow): Document extraction node 2026-03-26 16:05:24 +08:00
Timebomb2018
def7367e33 Merge branch 'refs/heads/feature/agent-tool_xjn' into feature/20260105_xjn 2026-03-25 11:48:42 +08:00
Timebomb2018
54cff5861a feat(model): add volcano model 2026-03-25 11:45:49 +08:00
Eternity
89d188fbf3 Merge branch 'develop' into feature/multimodel_memory
# Conflicts:
#	api/app/core/memory/storage_services/extraction_engine/knowledge_extraction/embedding_generation.py
#	api/app/repositories/neo4j/add_nodes.py
#	api/app/repositories/neo4j/cypher_queries.py
#	api/app/repositories/neo4j/graph_saver.py
#	api/app/services/memory_agent_service.py
#	api/app/services/multimodal_service.py
2026-03-24 14:15:18 +08:00
Eternity
2ff81ba101 feat(memory): support perception-aware memory writing in workflow and Neo4j nodes 2026-03-23 16:33:25 +08:00
Timebomb2018
240f1d431b fix(app): Multimodal file storage 2026-03-20 19:45:41 +08:00
Timebomb2018
a51e34852c fix(app features): Support for xls and doc files 2026-03-19 21:41:45 +08:00
Timebomb2018
6105d46198 fix(bug): bug fix 2026-03-19 17:54:32 +08:00
Timebomb2018
7056865726 fix(agent features):
1. Historical multimodal message writing is incorporated into the conversation context;
2. Resolve the issues where csv, json, and txt files cannot be recognized due to encoding problems;
3. File quantity limit;
4. Error details
2026-03-19 17:25:44 +08:00
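Item 2 of 7056865726 concerns csv, json, and txt files that fail to load because of their encoding. One common way to handle this is to try a short list of candidate encodings in order. The sketch below is an assumption about the general technique, not the code from this commit; the function name decode_text and the candidate list are hypothetical.

```python
from typing import Iterable


def decode_text(data: bytes, encodings: Iterable[str] = ("utf-8-sig", "gb18030")) -> str:
    """Decode raw file bytes by trying each candidate encoding in order.

    "utf-8-sig" handles UTF-8 with or without a BOM; "gb18030" is a
    superset of GBK/GB2312, which is common for Chinese-language files.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: never raise, substitute undecodable bytes instead.
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_text("名称,值".encode("gbk")))   # GBK-encoded CSV header
    print(decode_text(b"\xef\xbb\xbfa,b"))        # UTF-8 with BOM
```

The order matters: strict UTF-8 rejects most non-UTF-8 input, so it is tried first, while more permissive legacy encodings come later.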
Timebomb2018
f6efa0d711 fix(agent): Reading of docx multimodal files; Multimodal attachment history record 2026-03-18 22:29:10 +08:00
Ke Sun
da3f875555 Merge pull request #590 from SuanmoSuanyangTechnology/fix/perceptual-filename
fix(multimodel): gate perceptual memory writes on provider support
2026-03-18 12:00:48 +08:00
Timebomb2018
7e5e1609b0 fix(app): Restore bug fixes from the previous version that were later rolled back. 2026-03-18 11:50:17 +08:00
Timebomb2018
7b99a32a1e fix(app):
1. The end users are still bound to the app.
2. Multi-modal file support includes xlsx, csv, and json.
3. The file routing protocol is consistent with the page routing.
2026-03-18 10:46:55 +08:00
Eternity
262a9ddc48 fix(multimodel): filter unsupported files during perception memory write 2026-03-17 17:20:51 +08:00
Timebomb2018
dfcc85a466 fix(app): Experience sharing: Adding 'features' to agent_config parameters 2026-03-17 14:58:28 +08:00
Eternity
73aee97be5 fix(multimodel): handle 302 redirect when downloading files 2026-03-13 16:46:03 +08:00
Eternity
b71bc1f875 feat(multimodel): support multimodal memory display and improve code style 2026-03-13 14:47:56 +08:00
Eternity
99e94b3567 feat(workflow,app): add MIME-based file handling and HTTP response files 2026-03-10 18:28:16 +08:00
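As a rough illustration of the MIME-based handling mentioned in 99e94b3567, a file can be routed by the top-level part of its guessed MIME type. This is a sketch under assumptions: classify_file and its category names are hypothetical and do not come from the repository; only the standard-library mimetypes module is used.

```python
import mimetypes


def classify_file(filename: str) -> str:
    """Map a filename to a coarse category based on its guessed MIME type."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "unknown"
    top_level = mime.split("/", 1)[0]
    if top_level in {"image", "audio", "video"}:
        return top_level
    # text/* and application/* (pdf, json, ...) are treated as documents here.
    return "document"


if __name__ == "__main__":
    for name in ("photo.png", "song.mp3", "clip.mp4", "report.pdf"):
        print(name, "->", classify_file(name))
```

Guessing from the filename is only a first pass; a service that downloads files would typically also consult the Content-Type header of the HTTP response.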
Timebomb2018
590ec3a446 feat(model and app):
1. Expand support for vision models and multimodal models;
2. The application and workflow can input various multimodal files such as images, documents, audio, and videos.
2026-03-05 09:55:54 +08:00
Mark
73a432879a [modify] local_file bug fix 2026-02-06 18:56:22 +08:00
Eternity
b3f39eedac feat(workflow, skill): add multimodal image support to workflows and skill prompt generation 2026-02-05 12:25:53 +08:00
Mark
3f42ea2c61 [add] bedrock claude support 2026-02-03 12:05:39 +08:00
Mark
a6c5c44ed8 [modify] agent call tools strategy 2026-02-02 20:21:16 +08:00
Mark
3f389d685a [add] multimodal 2026-02-02 19:52:51 +08:00