Merge branch 'feature/rag2' into develop

* feature/rag2:
  [fix] system prompt fit error
  [modify] QA pair
Mark
2026-05-07 19:49:24 +08:00
2 changed files with 8 additions and 13 deletions

@@ -149,13 +149,12 @@ def qa_proposal(chat_mdl, content, topn=3, custom_prompt=None):
     """
     if custom_prompt:
         template = PROMPT_JINJA_ENV.from_string(custom_prompt)
+        sys_prompt = template.render(topn=topn)
     else:
-        template = PROMPT_JINJA_ENV.from_string(QUESTION_PROMPT_TEMPLATE)
-    rendered_prompt = template.render(content=content, topn=topn)
-    msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
+        sys_prompt = QUESTION_PROMPT_TEMPLATE
+    msg = [{"role": "system", "content": sys_prompt}, {"role": "user", "content": content}]
     _, msg = message_fit_in(msg, getattr(chat_mdl, 'max_length', 8096))
-    raw = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
+    raw = chat_mdl.chat(sys_prompt, msg[1:], {"temperature": 0.2})
     if isinstance(raw, tuple):
         raw = raw[0]
     raw = re.sub(r"^.*</think>", "", raw, flags=re.DOTALL)
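
The last lines of this hunk normalize the model reply: a tuple result is reduced to its first element, then everything up to and including a `</think>` tag is removed. The sketch below isolates that step so its behavior is easy to check. The function name `normalize_reply` is hypothetical and does not appear in the repository; only the tuple check and the regular expression are taken from the hunk.

```python
import re


def normalize_reply(raw):
    """Reduce a chat result to plain text without any leading reasoning block.

    Hypothetical helper: mirrors the tuple check and re.sub call in the hunk.
    """
    # Some chat backends appear to return (text, extra); keep only the text.
    if isinstance(raw, tuple):
        raw = raw[0]
    # With re.DOTALL, ".*" also spans newlines, and because it is greedy the
    # match runs from the start of the string to the LAST "</think>" tag.
    return re.sub(r"^.*</think>", "", raw, flags=re.DOTALL)


reply = ("<think>plan the pairs\nover two lines</think>Q: What is RAG? A: Retrieval-augmented generation.", 42)
cleaned = normalize_reply(reply)
```

Two properties follow from the pattern: a reply without a `</think>` tag passes through unchanged, and whitespace that follows the tag is kept, so a caller may still want to strip the result.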

@@ -2,23 +2,19 @@
 You are a text analyzer and knowledge extraction expert.
 ## Task
-Generate {{ topn }} question-answer pairs from the given text content.
+Generate question-answer pairs from the given text content.
 ## Requirements
-- Understand and summarize the text content, and generate the top {{ topn }} important question-answer pairs.
+- Understand and summarize the text content, then generate up to {{ topn }} important question-answer pairs.
 - Each question-answer pair MUST be on a single line, formatted as: Q: <question> A: <answer>
 - The questions SHOULD NOT have overlapping meanings.
 - The questions SHOULD cover the main content of the text as much as possible.
 - The answers MUST be concise, accurate, and directly derived from the text content.
 - The answers SHOULD be self-contained and understandable without additional context.
 - Both questions and answers MUST be in the same language as the given text content.
-- Output question-answer pairs ONLY, no extra explanation.
+- If the text is too short or lacks substantive content, generate fewer pairs rather than padding.
+- Output question-answer pairs ONLY, no extra explanation or commentary.
 ## Example Output
 Q: What is the capital of France? A: The capital of France is Paris.
 Q: When was the Eiffel Tower built? A: The Eiffel Tower was built in 1889.
----
-## Text Content
-{{ content }}
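
The prompt requires each pair on a single line in the form `Q: <question> A: <answer>`, which allows the reply to be parsed line by line. The following is a minimal sketch of such a parser, not the repository's implementation: `QA_LINE` and `parse_qa_pairs` are hypothetical names, and the sketch assumes the first ` A:` on a line separates the question from the answer.

```python
import re

# One pair per line: "Q: <question> A: <answer>".
# The question is matched lazily so it ends at the first " A:" on the line.
QA_LINE = re.compile(r"^Q:\s*(?P<question>.+?)\s+A:\s*(?P<answer>.+)$")


def parse_qa_pairs(text, topn=3):
    """Return at most `topn` (question, answer) tuples found in `text`."""
    pairs = []
    for line in text.splitlines():
        match = QA_LINE.match(line.strip())
        if match:  # lines that do not follow the format are skipped
            pairs.append((match["question"], match["answer"]))
    # The prompt asks for "up to topn" pairs; enforce the cap on our side too.
    return pairs[:topn]


sample = (
    "Q: What is the capital of France? A: The capital of France is Paris.\n"
    "Some stray commentary\n"
    "Q: When was the Eiffel Tower built? A: The Eiffel Tower was built in 1889.\n"
)
pairs = parse_qa_pairs(sample)
```

Skipping non-matching lines keeps the parser tolerant of a model that adds commentary despite the "pairs ONLY" requirement, and the final slice guards against a reply that exceeds `topn`.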