
optimize_query_mapper

Optimize queries in question-answer pairs to make them more specific and detailed.

This mapper refines the question in a QA pair, making it more specific and detailed while ensuring that the original answer can still address the optimized question. A predefined system prompt guides the optimization, and the optimized query is extracted from the raw model output by stripping leading and trailing whitespace. The mapper runs on a CUDA accelerator for faster processing.

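As a sketch of how this operator might appear in a Data-Juicer recipe, the fragment below configures it under its documented name with a few of the parameters from the table in this page (the `process` list layout is the standard Data-Juicer recipe convention; the specific values are illustrative, with the model name being the documented default):

```yaml
# Illustrative recipe fragment (not an official example).
process:
  - optimize_query_mapper:
      api_or_hf_model: 'Qwen/Qwen2.5-7B-Instruct'  # documented default
      is_hf_model: true        # load from Hugging Face rather than calling an API
      enable_vllm: false       # set true to accelerate inference with vLLM
      sampling_params:
        temperature: 0.9
        top_p: 0.95
```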

Type: mapper

Tags: gpu, vllm, hf, api

🔧 Parameter Configuration

| name | type | default | description |
| --- | --- | --- | --- |
| api_or_hf_model | str | 'Qwen/Qwen2.5-7B-Instruct' | API or Hugging Face model name. |
| is_hf_model | bool | True | If true, use a Hugging Face model; otherwise, use the API. |
| api_endpoint | Optional[str] | None | URL endpoint for the API. |
| response_path | Optional[str] | None | Path to extract content from the API response. Defaults to 'choices.0.message.content'. |
| system_prompt | Optional[str] | None | System prompt guiding the optimization task. |
| input_template | Optional[str] | None | Template for building the model input. Must contain one '{}' placeholder, which is filled with the question-answer pair produced by qa_pair_template. |
| qa_pair_template | Optional[str] | None | Template for formatting the question and answer pair. Must contain two '{}' placeholders, one for the question and one for the answer. |
| output_pattern | Optional[str] | None | Regular expression pattern used to extract the question and answer from the model response. |
| try_num | int (> 0) | 3 | Number of retry attempts when an API call error or output parsing error occurs. |
| enable_vllm | bool | False | Whether to use vLLM for inference acceleration. |
| model_params | Optional[Dict] | None | Parameters for initializing the model. |
| sampling_params | Optional[Dict] | None | Sampling parameters for text generation (e.g., {'temperature': 0.9, 'top_p': 0.95}). |
| kwargs | | '' | Extra keyword arguments. |
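The placeholder contract of input_template and qa_pair_template can be illustrated with plain Python string formatting. This is a standalone sketch, not Data-Juicer source code; the template strings, question, and answer below are hypothetical examples:

```python
# Standalone illustration of how the template parameters compose
# (hypothetical templates, not Data-Juicer defaults).

# qa_pair_template must contain two '{}' placeholders: question, then answer.
qa_pair_template = "Question: {}\nAnswer: {}"
# input_template must contain one '{}' placeholder for the formatted QA pair.
input_template = "Rewrite the question to be more specific and detailed:\n{}"

question = "What is a mapper?"
answer = "An operator that transforms each sample in a dataset."

# Build the final model input in two steps.
qa_pair = qa_pair_template.format(question, answer)
model_input = input_template.format(qa_pair)

# Per the description above, the optimized query is taken from the raw
# model output with leading/trailing whitespace stripped.
raw_output = "  What does a mapper operator do to each dataset sample?  \n"
optimized_query = raw_output.strip()
```

The two-level templating keeps the QA-pair layout reusable: qa_pair_template controls how the pair is rendered, while input_template wraps it with the task instruction sent to the model.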

🔗 Related Links