refactor(memory): enhance extraction ontology and add assistant pruning graph support
- Expand entity type ontology with detailed definitions, examples, and notes (merged types: 地点设施, 物品设备, 产品服务, 软件平台, 角色职业, 知识能力, 偏好习惯目标, 称呼别名, 智能体)
- Add relation ontology taxonomy with 15 predicate categories and usage rules
- Strengthen reference resolution rules: resolve pronouns before extraction, skip unresolvable references entirely
- Add guidelines to avoid extracting abstract propositions, emotions, and low-value entities (effort/reward/success patterns)
- Add 7 new extraction examples covering edge cases
- Add AssistantOriginal/AssistantPruned node models and graph persistence (PRUNED_TO and BELONGS_TO_DIALOG edges, Neo4j indexes and constraints)
- Add graph_build_step.py for building graph nodes/edges from DialogData
- Update write_pipeline.py to pass assistant pruning nodes/edges to graph saver
- Update data_pruning.py with related preprocessing changes
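
The node and edge layout named in the message can be pictured with a minimal sketch. Only the labels AssistantOriginal and AssistantPruned and the edge types PRUNED_TO and BELONGS_TO_DIALOG come from the commit. The field names, the ID scheme, the edge directions, and the helper `build_pruning_graph` are assumptions made for illustration and may differ from what graph_build_step.py actually does.

```python
import uuid
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AssistantOriginal:
    id: str
    dialog_id: str  # assumed field linking the node to its dialog
    text: str       # full assistant message before pruning


@dataclass(frozen=True)
class AssistantPruned:
    id: str
    dialog_id: str  # assumed field
    text: str       # shortened text kept after pruning


# (source_id, edge_type, target_id); a hypothetical in-memory edge format
Edge = Tuple[str, str, str]


def build_pruning_graph(
    dialog_id: str, original_text: str, pruned_text: str
) -> Tuple[AssistantOriginal, AssistantPruned, List[Edge]]:
    """Create one original/pruned node pair and the edges that connect them."""
    original = AssistantOriginal(
        id=f"orig-{uuid.uuid4().hex}", dialog_id=dialog_id, text=original_text
    )
    pruned = AssistantPruned(
        id=f"pruned-{uuid.uuid4().hex}", dialog_id=dialog_id, text=pruned_text
    )
    edges: List[Edge] = [
        (original.id, "PRUNED_TO", pruned.id),             # assumed direction
        (original.id, "BELONGS_TO_DIALOG", dialog_id),
        (pruned.id, "BELONGS_TO_DIALOG", dialog_id),
    ]
    return original, pruned, edges
```

Keeping both nodes attached to the dialog would let a reader traverse from either text back to its conversation, although the commit does not say which nodes carry the BELONGS_TO_DIALOG edge.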
@@ -46,6 +46,12 @@ async def create_fulltext_indexes():
                 OPTIONS { indexConfig: { `fulltext.analyzer`: 'cjk' } }
             """)
 
+        # Create a full-text index on the AssistantPruned pruned text
+        await connector.execute_query("""
+            CREATE FULLTEXT INDEX assistantPrunedFulltext IF NOT EXISTS FOR (p:AssistantPruned) ON EACH [p.text]
+            OPTIONS { indexConfig: { `fulltext.analyzer`: 'cjk' } }
+        """)
+
     finally:
         await connector.close()
 
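
A minimal sketch of how the new index might be queried, assuming the caller runs the returned Cypher through whatever connector the project uses. `db.index.fulltext.queryNodes` is Neo4j's built-in full-text procedure and the index name comes from the hunk above. The helper names `escape_lucene` and `build_fulltext_query` and the returned columns are hypothetical.

```python
import re
from typing import Any, Dict, Tuple

# Characters and operators with special meaning in Lucene query syntax.
_LUCENE_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_lucene(text: str) -> str:
    """Backslash-escape Lucene operators so user text is matched literally."""
    return _LUCENE_SPECIAL.sub(r"\\\1", text)


def build_fulltext_query(user_text: str, limit: int = 10) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher statement and parameters for assistantPrunedFulltext."""
    cypher = (
        "CALL db.index.fulltext.queryNodes('assistantPrunedFulltext', $query) "
        "YIELD node, score "
        "RETURN node.id AS id, node.text AS text, score "
        "ORDER BY score DESC LIMIT $limit"
    )
    return cypher, {"query": escape_lucene(user_text), "limit": limit}
```

Escaping matters because the query string is parsed as Lucene syntax, so unescaped input such as `C++ (v2)` could raise a parse error or be read as operators.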
@@ -135,6 +141,17 @@ async def create_vector_indexes():
                 `vector.similarity_function`: 'cosine'
             }}
         """)
+
+        # AssistantPruned text embedding index (optional, for semantic search on pruned hints)
+        await connector.execute_query("""
+            CREATE VECTOR INDEX assistant_pruned_embedding_index IF NOT EXISTS
+            FOR (p:AssistantPruned)
+            ON p.text_embedding
+            OPTIONS {indexConfig: {
+                `vector.dimensions`: 1024,
+                `vector.similarity_function`: 'cosine'
+            }}
+        """)
     finally:
         await connector.close()
 
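
For the vector index, a nearest-neighbour lookup could be sketched as follows. The index name and the 1024-dimension setting come from the hunk above, and `db.index.vector.queryNodes` is Neo4j's built-in procedure for vector indexes. The helper `build_vector_query` and the returned columns are hypothetical, and producing the embedding is left to the caller.

```python
from typing import Any, Dict, List, Tuple

EMBEDDING_DIM = 1024  # must match `vector.dimensions` in the index definition


def build_vector_query(embedding: List[float], k: int = 5) -> Tuple[str, Dict[str, Any]]:
    """Return Cypher and parameters for a top-k search on AssistantPruned."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}"
        )
    if k < 1:
        raise ValueError("k must be at least 1")
    cypher = (
        "CALL db.index.vector.queryNodes("
        "'assistant_pruned_embedding_index', $k, $embedding) "
        "YIELD node, score "
        "RETURN node.id AS id, node.text AS text, score"
    )
    return cypher, {"k": k, "embedding": embedding}
```

Checking the dimension on the client side gives a clearer error than a failed query when the embedding model and the index configuration drift apart.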
@@ -179,6 +196,22 @@ async def create_unique_constraints():
             """
         )
 
+        # AssistantOriginal.id unique
+        await connector.execute_query(
+            """
+            CREATE CONSTRAINT assistant_original_id_unique IF NOT EXISTS
+            FOR (o:AssistantOriginal) REQUIRE o.id IS UNIQUE
+            """
+        )
+
+        # AssistantPruned.id unique
+        await connector.execute_query(
+            """
+            CREATE CONSTRAINT assistant_pruned_id_unique IF NOT EXISTS
+            FOR (p:AssistantPruned) REQUIRE p.id IS UNIQUE
+            """
+        )
+
     finally:
         await connector.close()
 
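
With `id` unique on both labels, a writer can upsert nodes by matching on `id` alone. The sketch below shows one way this might look. It is not the commit's graph saver: the `text` property on AssistantOriginal, the direction of PRUNED_TO, and the helper `upsert_params` are assumptions.

```python
from typing import Dict

# MERGE on the constrained `id` property makes repeated writes idempotent:
# rerunning the statement with the same ids creates no duplicate nodes or edges.
UPSERT_PRUNING_PAIR = """
MERGE (o:AssistantOriginal {id: $original_id})
  ON CREATE SET o.text = $original_text
MERGE (p:AssistantPruned {id: $pruned_id})
  ON CREATE SET p.text = $pruned_text
MERGE (o)-[:PRUNED_TO]->(p)
"""


def upsert_params(
    original_id: str, original_text: str, pruned_id: str, pruned_text: str
) -> Dict[str, str]:
    """Validate ids and build the parameter map for UPSERT_PRUNING_PAIR."""
    if not original_id or not pruned_id:
        raise ValueError("both node ids must be non-empty")
    if original_id == pruned_id:
        raise ValueError("original and pruned nodes need distinct ids")
    return {
        "original_id": original_id,
        "original_text": original_text,
        "pruned_id": pruned_id,
        "pruned_text": pruned_text,
    }
```

The uniqueness constraints also create backing indexes, so the `MERGE` lookups by `id` avoid a label scan.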