[modify] QA pair

Mark
2026-05-07 19:04:19 +08:00
parent e222490bce
commit 9fa83ed01e
2 changed files with 11 additions and 14 deletions


@@ -149,13 +149,14 @@ def qa_proposal(chat_mdl, content, topn=3, custom_prompt=None):
     """
     if custom_prompt:
         template = PROMPT_JINJA_ENV.from_string(custom_prompt)
+        rendered_user = template.render(content=content, topn=topn)
+        msg = [{"role": "user", "content": rendered_user}]
+        sys_prompt = ""
     else:
-        template = PROMPT_JINJA_ENV.from_string(QUESTION_PROMPT_TEMPLATE)
-    rendered_prompt = template.render(content=content, topn=topn)
-
-    msg = [{"role": "system", "content": rendered_prompt}, {"role": "user", "content": "Output: "}]
-    _, msg = message_fit_in(msg, getattr(chat_mdl, 'max_length', 8096))
-    raw = chat_mdl.chat(rendered_prompt, msg[1:], {"temperature": 0.2})
+        sys_prompt = QUESTION_PROMPT_TEMPLATE
+        msg = [{"role": "user", "content": f"## Text Content (topn: {topn})\n\n{content}"}]
+    _, msg = message_fit_in([{"role": "system", "content": sys_prompt}] + msg, getattr(chat_mdl, 'max_length', 8096))
+    raw = chat_mdl.chat(sys_prompt, msg, {"temperature": 0.2})
     if isinstance(raw, tuple):
         raw = raw[0]
     raw = re.sub(r"^.*</think>", "", raw, flags=re.DOTALL)
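The unchanged tail of this hunk unwraps a tuple reply and strips any reasoning prefix before the answer is parsed. The following is a minimal standalone sketch of that step, not the repository's code: `strip_reasoning` is a hypothetical name, and treating the tuple's first element as the reply text is an assumption taken from the `raw[0]` line above.

```python
import re


def strip_reasoning(raw):
    """Return the reply text with everything up to the last </think> removed."""
    # Assumption: when the chat call returns a tuple, element 0 is the text.
    if isinstance(raw, tuple):
        raw = raw[0]
    # DOTALL lets '.' cross newlines; the greedy '.*' extends to the LAST
    # '</think>', so the whole reasoning prefix is dropped in one pass.
    return re.sub(r"^.*</think>", "", raw, flags=re.DOTALL)


if __name__ == "__main__":
    reply = ("<think>\nPick two facts.\n</think>\nQ: What is X? A: X is Y.", 42)
    print(strip_reasoning(reply).strip())
```

Because the pattern is anchored at the start and greedy, a reply with no `</think>` tag passes through unchanged, and a reply with several tags keeps only the text after the final one.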


@@ -2,23 +2,19 @@
 You are a text analyzer and knowledge extraction expert.
 ## Task
-Generate {{ topn }} question-answer pairs from the given text content.
+Generate question-answer pairs from the given text content.
 ## Requirements
-- Understand and summarize the text content, and generate the top {{ topn }} important question-answer pairs.
+- Understand and summarize the text content, then generate up to {{ topn }} important question-answer pairs.
 - Each question-answer pair MUST be on a single line, formatted as: Q: <question> A: <answer>
 - The questions SHOULD NOT have overlapping meanings.
 - The questions SHOULD cover the main content of the text as much as possible.
 - The answers MUST be concise, accurate, and directly derived from the text content.
 - The answers SHOULD be self-contained and understandable without additional context.
 - Both questions and answers MUST be in the same language as the given text content.
-- Output question-answer pairs ONLY, no extra explanation.
+- If the text is too short or lacks substantive content, generate fewer pairs rather than padding.
+- Output question-answer pairs ONLY, no extra explanation or commentary.
 ## Example Output
 Q: What is the capital of France? A: The capital of France is Paris.
 Q: When was the Eiffel Tower built? A: The Eiffel Tower was built in 1889.
----
-## Text Content
-{{ content }}
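The prompt asks for one `Q: <question> A: <answer>` pair per line and, after this change, allows fewer than `topn` pairs. A tolerant reader for that format might look like the sketch below. It is illustrative only: `parse_qa_lines` and `QA_LINE` are hypothetical names, not the parsing code used in this repository.

```python
import re

# One pair per line: "Q: <question> A: <answer>". Requiring whitespace before
# "A:" avoids splitting inside a word that merely ends in "A:".
QA_LINE = re.compile(r"^Q:\s*(?P<q>.+?)\s+A:\s*(?P<a>.+?)\s*$")


def parse_qa_lines(text, topn=3):
    """Collect up to topn (question, answer) tuples, skipping malformed lines."""
    pairs = []
    for line in text.splitlines():
        match = QA_LINE.match(line.strip())
        if match:
            pairs.append((match.group("q"), match.group("a")))
    # The prompt permits fewer pairs than topn, so only the upper bound is enforced.
    return pairs[:topn]


if __name__ == "__main__":
    sample = (
        "Q: What is the capital of France? A: The capital of France is Paris.\n"
        "Q: When was the Eiffel Tower built? A: The Eiffel Tower was built in 1889."
    )
    for question, answer in parse_qa_lines(sample):
        print(question, "->", answer)
```

Skipping lines that do not match, rather than raising, keeps the reader robust to stray commentary the model may emit despite the "pairs ONLY" requirement.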