autogen_ext.tools.azure#
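- pydantic model AzureAISearchConfig[source]#
Bases: BaseModel
Validated configuration for Azure AI Search.
This class defines the configuration parameters for the Azure AI Search tool, including authentication, search behavior, caching, and embedding settings.
Note
This class requires the azure extra of the autogen-ext package: pip install -U "autogen-ext[azure]"
Note
Prerequisites:
1. An Azure AI Search service must be created in your Azure subscription.
2. The search index must be configured for your use case:
- Vector search: the index must contain vector fields
- Semantic search: the index must have a semantic configuration
- Hybrid search: both vector fields and text fields must be configured
3. Required packages:
- Base functionality: azure-search-documents>=11.4.0
- Azure OpenAI embeddings: openai azure-identity
- OpenAI embeddings: openai
- Example usage: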
    from azure.core.credentials import AzureKeyCredential
    from autogen_ext.tools.azure import AzureAISearchConfig

    # Basic configuration for full-text search
    config = AzureAISearchConfig(
        name="doc-search",
        endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
        index_name="<your-index>",  # Your search index name
        credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
        query_type="simple",
        search_fields=["content", "title"],  # Update with your searchable fields
        top=5,
    )

    # Vector search configuration using Azure OpenAI embeddings
    vector_config = AzureAISearchConfig(
        name="vector-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        query_type="vector",
        vector_fields=["embedding"],  # Update with your vector field name
        embedding_provider="azure_openai",
        embedding_model="text-embedding-ada-002",
        openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
        openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
        top=5,
    )

    # Hybrid search configuration with semantic ranking
    hybrid_config = AzureAISearchConfig(
        name="hybrid-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        query_type="semantic",
        semantic_config_name="<your-semantic-config>",  # Your semantic configuration name
        search_fields=["content", "title"],  # Update with your search fields
        vector_fields=["embedding"],  # Update with your vector field name
        embedding_provider="openai",
        embedding_model="text-embedding-ada-002",
        openai_api_key="<your-openai-key>",  # Your OpenAI API key
        top=5,
    )
- Fields:
api_version (str)
cache_ttl_seconds (int)
credential (azure.core.credentials.AzureKeyCredential | azure.core.credentials_async.AsyncTokenCredential)
description (str | None)
embedding_model (str | None)
embedding_provider (str | None)
enable_caching (bool)
endpoint (str)
filter (str | None)
index_name (str)
name (str)
openai_api_key (str | None)
openai_api_version (str | None)
openai_endpoint (str | None)
query_type (Literal['simple', 'full', 'semantic', 'vector'])
search_fields (List[str] | None)
select_fields (List[str] | None)
semantic_config_name (str | None)
top (int | None)
vector_fields (List[str] | None)
- Validators:
normalize_query_type
»query_type
validate_endpoint
»endpoint
validate_interdependent_fields
»all fields
validate_top
»top
- field api_version: str = '2023-10-01-preview'#
Azure AI Search API version to use. Defaults to 2023-10-01-preview.
- Validated by:
validate_interdependent_fields
- field cache_ttl_seconds: int = 300#
How long to cache results in seconds
- Validated by:
validate_interdependent_fields
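To make the interaction of enable_caching and cache_ttl_seconds concrete, here is a minimal sketch of a time-to-live cache. It is not the tool's internal implementation; TTLCache and its injectable clock are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (illustrative only)."""

    def __init__(self, ttl_seconds: int = 300, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is older than the TTL: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock(), value)
```

With the default of 300 seconds, a repeated query inside that window would be answered from the cache, and a later one would go back to the search service.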
- field credential: AzureKeyCredential | AsyncTokenCredential [Required]#
Azure credential for authentication (API key or token)
- Validated by:
validate_interdependent_fields
- field description: str | None = None#
Description explaining the tool's purpose
- Validated by:
validate_interdependent_fields
- field embedding_model: str | None = None#
Model name for client-side embeddings
- Validated by:
validate_interdependent_fields
- field embedding_provider: str | None = None#
Name of embedding provider for client-side embeddings
- Validated by:
validate_interdependent_fields
- field enable_caching: bool = False#
Whether to cache search results
- Validated by:
validate_interdependent_fields
- field endpoint: str [Required]#
The full URL of your Azure AI Search service
- Validated by:
validate_endpoint
validate_interdependent_fields
- field filter: str | None = None#
OData filter expression to refine search results
- Validated by:
validate_interdependent_fields
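The filter value is a plain OData expression string, such as language eq 'en'. The sketch below shows one way to build such strings safely; odata_eq and odata_and are hypothetical helpers, not part of autogen_ext. The only OData rule it relies on is that a single quote inside a string literal is escaped by doubling it.

```python
def odata_eq(field: str, value: str) -> str:
    """Build "<field> eq '<value>'", doubling single quotes inside the literal."""
    escaped = value.replace("'", "''")
    return f"{field} eq '{escaped}'"


def odata_and(*clauses: str) -> str:
    """Join clauses with 'and', parenthesizing each one to keep precedence explicit."""
    return " and ".join(f"({clause})" for clause in clauses)


# Example: restrict results to English documents in a hypothetical 'category' field.
example_filter = odata_and(odata_eq("language", "en"), odata_eq("category", "docs"))
```

The resulting string can be passed as the filter argument; the field names must exist in your index and be marked filterable.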
- field index_name: str [Required]#
Name of the search index to query
- Validated by:
validate_interdependent_fields
- field name: str [Required]#
The name of this tool instance
- Validated by:
validate_interdependent_fields
- field openai_api_key: str | None = None#
API key for OpenAI/Azure OpenAI embeddings
- Validated by:
validate_interdependent_fields
- field openai_api_version: str | None = None#
API version for Azure OpenAI embeddings
- Validated by:
validate_interdependent_fields
- field openai_endpoint: str | None = None#
Endpoint URL for Azure OpenAI embeddings
- Validated by:
validate_interdependent_fields
- field query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple'#
Type of search to perform: simple, full, semantic, or vector
- Validated by:
normalize_query_type
validate_interdependent_fields
- field search_fields: List[str] | None = None#
Fields to search within documents
- Validated by:
validate_interdependent_fields
- field select_fields: List[str] | None = None#
Fields to return in search results
- Validated by:
validate_interdependent_fields
- field semantic_config_name: str | None = None#
Semantic configuration name for enhanced results
- Validated by:
validate_interdependent_fields
- field top: int | None = None#
Maximum number of results to return. For vector searches, acts as k in k-NN.
- Validated by:
validate_interdependent_fields
validate_top
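For vector searches, top plays the role of k in k-nearest-neighbor retrieval. The brute-force sketch below illustrates only that idea on toy data; it makes no claim about how the service indexes or scores vectors, and cosine_similarity and top_k are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], docs: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


toy_docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = top_k([1.0, 0.0], toy_docs, k=2)
```

Here nearest holds the two closest documents, which is what top=2 would request from a vector query.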
- class AzureAISearchTool(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], description: str | None = None, api_version: str = DEFAULT_API_VERSION, query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple', search_fields: List[str] | None = None, select_fields: List[str] | None = None, vector_fields: List[str] | None = None, top: int | None = None, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None)[source]#
Bases: EmbeddingProviderMixin, BaseAzureAISearchTool
Azure AI Search tool for querying Azure search indexes.
This tool provides a simplified interface for querying Azure AI Search indexes with several search methods. The recommended way to create an instance tailored to a specific search type is through the factory methods:
- Full-text search: for traditional keyword-based search, Lucene queries, or semantically re-ranked results. Use AzureAISearchTool.create_full_text_search(). Supported query_type values: "simple" (keyword), "full" (Lucene), "semantic" (semantic).
- Vector search: for pure similarity search based on vector embeddings. Use AzureAISearchTool.create_vector_search().
- Hybrid search: combines vector search with full-text or semantic search to get the benefits of both. Use AzureAISearchTool.create_hybrid_search(). The text component can be set to "simple", "full", or "semantic" through the query_type parameter.
Each factory method configures appropriate defaults and validation for the chosen search strategy.
Warning
If you set query_type="semantic", you must also provide a valid semantic_config_name. This configuration must already be set up in the Azure AI Search index.
- component_provider_override: ClassVar[str | None] = 'autogen_ext.tools.azure.AzureAISearchTool'#
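Override the provider string for the component. This should be used to prevent internal module names from becoming part of the module name.
- classmethod create_full_text_search(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], description: str | None = None, api_version: str | None = None, query_type: Literal['simple', 'full', 'semantic'] = 'simple', search_fields: List[str] | None = None, select_fields: List[str] | None = None, top: int | None = 5, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300) AzureAISearchTool [source]#
Create a tool for traditional text search.
This factory method creates an AzureAISearchTool optimized for full-text search, supporting keyword matching, Lucene syntax, and semantic search capabilities.
- Parameters:
name -- The name of this tool instance
endpoint -- The full URL of your Azure AI Search service
index_name -- Name of the search index to query
credential -- Azure credential for authentication (API key or token)
description -- Optional description explaining the tool's purpose
api_version -- Azure AI Search API version to use
query_type --
Type of text search to perform:
simple: Basic keyword search that matches exact terms and their variations
full: Advanced search using Lucene query syntax for complex queries
semantic: AI-powered search that understands meaning and context, providing enhanced relevance ranking
search_fields -- Fields to search within documents
select_fields -- Fields to return in search results
top -- Maximum number of results to return (default: 5)
filter -- OData filter expression to refine search results
semantic_config_name -- Semantic configuration name (required for the semantic query_type)
enable_caching -- Whether to cache search results
cache_ttl_seconds -- How long to cache results, in seconds
- Returns:
An initialized AzureAISearchTool for full-text search
Example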
    from azure.core.credentials import AzureKeyCredential
    from autogen_ext.tools.azure import AzureAISearchTool

    # Basic keyword search
    tool = AzureAISearchTool.create_full_text_search(
        name="doc-search",
        endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
        index_name="<your-index>",  # Your search index name
        credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
        query_type="simple",  # Enable keyword search
        search_fields=["content", "title"],  # Required: fields to search
        select_fields=["content", "title", "url"],  # Optional: fields to return
        top=5,
    )

    # Full-text (Lucene query) search
    full_text_tool = AzureAISearchTool.create_full_text_search(
        name="doc-search",
        endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
        index_name="<your-index>",  # Your search index name
        credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
        query_type="full",  # Enable Lucene query syntax
        search_fields=["content", "title"],  # Required: fields to search
        select_fields=["content", "title", "url"],  # Optional: fields to return
        top=5,
    )

    # Semantic search with re-ranking
    # Note: make sure a semantic configuration is enabled on your index
    semantic_tool = AzureAISearchTool.create_full_text_search(
        name="semantic-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        query_type="semantic",  # Enable semantic ranking
        semantic_config_name="<your-semantic-config>",  # Required for semantic search
        search_fields=["content", "title"],  # Required: fields to search
        select_fields=["content", "title", "url"],  # Optional: fields to return
        top=5,
    )

    # The search tool can be used with an Agent
    # assistant = Agent("assistant", tools=[semantic_tool])
- classmethod create_hybrid_search(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], vector_fields: List[str], search_fields: List[str], description: str | None = None, api_version: str | None = None, query_type: Literal['simple', 'full', 'semantic'] = 'simple', select_fields: List[str] | None = None, top: int = 5, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None) AzureAISearchTool [source]#
Create a tool that combines vector and text search capabilities.
This factory method creates an AzureAISearchTool configured for hybrid search, which combines the benefits of vector similarity and traditional text search.
- Parameters:
name -- The name of this tool instance
endpoint -- The full URL of your Azure AI Search service
index_name -- Name of the search index to query
credential -- Azure credential for authentication (API key or token)
vector_fields -- Fields to use for vector search (required)
search_fields -- Fields to use for text search (required)
description -- Optional description explaining the tool's purpose
api_version -- Azure AI Search API version to use
query_type --
Type of text search to perform:
simple: Basic keyword search that matches exact terms and their variations
full: Advanced search using Lucene query syntax for complex queries
semantic: AI-powered search that understands meaning and context, providing enhanced relevance ranking
select_fields -- Fields to return in search results
top -- Maximum number of results to return (default: 5)
filter -- OData filter expression to refine search results
semantic_config_name -- Semantic configuration name (required when query_type="semantic")
enable_caching -- Whether to cache search results
cache_ttl_seconds -- How long to cache results, in seconds
embedding_provider -- Provider for client-side embeddings (e.g., 'azure_openai', 'openai')
embedding_model -- Model for client-side embeddings (e.g., 'text-embedding-ada-002')
openai_api_key -- API key for OpenAI/Azure OpenAI embeddings
openai_api_version -- API version for Azure OpenAI embeddings
openai_endpoint -- Endpoint URL for Azure OpenAI embeddings
- Returns:
An initialized AzureAISearchTool for hybrid search
- Raises:
ValueError -- If vector_fields or search_fields is empty
ValueError -- If query_type is "semantic" without semantic_config_name
ValueError -- If embedding_provider is 'azure_openai' without openai_endpoint
ValueError -- If required parameters are missing or invalid
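The documented ValueError conditions can be read as a small set of interdependent checks. The sketch below restates them as a standalone function so they can be tried before calling the factory; hybrid_arg_errors is a hypothetical helper, and the library's own validation may differ in wording and coverage.

```python
from typing import List, Optional


def hybrid_arg_errors(
    vector_fields: List[str],
    search_fields: List[str],
    query_type: str = "simple",
    semantic_config_name: Optional[str] = None,
    embedding_provider: Optional[str] = None,
    openai_endpoint: Optional[str] = None,
) -> List[str]:
    """Return a message for every documented condition that the arguments violate."""
    errors: List[str] = []
    if not vector_fields:
        errors.append("vector_fields must not be empty")
    if not search_fields:
        errors.append("search_fields must not be empty")
    if query_type == "semantic" and not semantic_config_name:
        errors.append("semantic_config_name is required when query_type is 'semantic'")
    if embedding_provider == "azure_openai" and not openai_endpoint:
        errors.append("openai_endpoint is required when embedding_provider is 'azure_openai'")
    return errors


# A caller could raise on the first problem, mirroring the documented behavior.
problems = hybrid_arg_errors(vector_fields=["content_vector"], search_fields=[], query_type="semantic")
```

In this call, problems contains two messages: one for the empty search_fields and one for the missing semantic_config_name.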
Example
    from azure.core.credentials import AzureKeyCredential
    from autogen_ext.tools.azure import AzureAISearchTool

    # Basic hybrid search with service-side vectorization
    tool = AzureAISearchTool.create_hybrid_search(
        name="hybrid-search",
        endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
        index_name="<your-index>",  # Your search index name
        credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
        vector_fields=["content_vector"],  # Your vector field name
        search_fields=["content", "title"],  # Your searchable fields
        top=5,
    )

    # Hybrid search with semantic ranking and Azure OpenAI embeddings
    semantic_tool = AzureAISearchTool.create_hybrid_search(
        name="semantic-hybrid-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        vector_fields=["content_vector"],
        search_fields=["content", "title"],
        query_type="semantic",  # Enable semantic ranking
        semantic_config_name="<your-semantic-config>",  # Your semantic configuration name
        embedding_provider="azure_openai",  # Use Azure OpenAI for embeddings
        embedding_model="text-embedding-ada-002",  # Embedding model to use
        openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
        openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
        openai_api_version="2024-02-15-preview",  # Azure OpenAI API version
        select_fields=["content", "title", "url"],  # Fields to return in results
        filter="language eq 'en'",  # Optional OData filter
        top=5,
    )

    # The search tool can be used with an Agent
    # assistant = Agent("assistant", tools=[semantic_tool])
- classmethod create_vector_search(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], vector_fields: List[str], description: str | None = None, api_version: str | None = None, select_fields: List[str] | None = None, top: int = 5, filter: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None) AzureAISearchTool [source]#
Create a tool for pure vector/similarity search.
This factory method creates an AzureAISearchTool optimized for vector search, allowing matching by semantic similarity using vector embeddings.
- Parameters:
name -- The name of this tool instance
endpoint -- The full URL of your Azure AI Search service
index_name -- Name of the search index to query
credential -- Azure credential for authentication (API key or token)
vector_fields -- Fields to use for vector search (required)
description -- Optional description explaining the tool's purpose
api_version -- Azure AI Search API version to use
select_fields -- Fields to return in search results
top -- Maximum number of results to return / k in k-NN (default: 5)
filter -- OData filter expression to refine search results
enable_caching -- Whether to cache search results
cache_ttl_seconds -- How long to cache results, in seconds
embedding_provider -- Provider for client-side embeddings (e.g., 'azure_openai', 'openai')
embedding_model -- Model for client-side embeddings (e.g., 'text-embedding-ada-002')
openai_api_key -- API key for OpenAI/Azure OpenAI embeddings
openai_api_version -- API version for Azure OpenAI embeddings
openai_endpoint -- Endpoint URL for Azure OpenAI embeddings
- Returns:
An initialized AzureAISearchTool for vector search
- Raises:
ValueError -- If vector_fields is empty
ValueError -- If embedding_provider is 'azure_openai' without openai_endpoint
ValueError -- If required parameters are missing or invalid
- Example Usage:
    from azure.core.credentials import AzureKeyCredential
    from autogen_ext.tools.azure import AzureAISearchTool

    # Vector search with service-side vectorization
    tool = AzureAISearchTool.create_vector_search(
        name="vector-search",
        endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
        index_name="<your-index>",  # Your search index name
        credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
        vector_fields=["content_vector"],  # Your vector field name
        select_fields=["content", "title", "url"],  # Fields to return in results
        top=5,
    )

    # Vector search with Azure OpenAI embeddings
    azure_openai_tool = AzureAISearchTool.create_vector_search(
        name="azure-openai-vector-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        vector_fields=["content_vector"],
        embedding_provider="azure_openai",  # Use Azure OpenAI for embeddings
        embedding_model="text-embedding-ada-002",  # Embedding model to use
        openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
        openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
        openai_api_version="2024-02-15-preview",  # Azure OpenAI API version
        select_fields=["content", "title", "url"],  # Fields to return in results
        top=5,
    )

    # Vector search with OpenAI embeddings
    openai_tool = AzureAISearchTool.create_vector_search(
        name="openai-vector-search",
        endpoint="https://your-search.search.windows.net",
        index_name="<your-index>",
        credential=AzureKeyCredential("<your-key>"),
        vector_fields=["content_vector"],
        embedding_provider="openai",  # Use OpenAI for embeddings
        embedding_model="text-embedding-ada-002",  # Embedding model to use
        openai_api_key="<your-openai-key>",  # Your OpenAI API key
        select_fields=["content", "title", "url"],  # Fields to return in results
        top=5,
    )

    # Use the tool with an Agent
    # assistant = Agent("assistant", tools=[azure_openai_tool])
- class BaseAzureAISearchTool(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], description: str | None = None, api_version: str = DEFAULT_API_VERSION, query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple', search_fields: List[str] | None = None, select_fields: List[str] | None = None, vector_fields: List[str] | None = None, top: int | None = None, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None)[source]#
Bases: BaseTool[SearchQuery, SearchResults], Component[AzureAISearchConfig], EmbeddingProvider, ABC
Abstract base class for Azure AI Search tools.
This class defines the common interface and functionality for all Azure AI Search tools. It handles configuration management, client initialization, and the abstract methods that subclasses must implement.
- Attributes:
search_config: Configuration parameters for the search service.
- Note:
This is an abstract base class and should not be instantiated directly. Use a concrete implementation or the factory methods on AzureAISearchTool.
- component_config_schema#
- component_provider_override: ClassVar[str | None] = 'autogen_ext.tools.azure.BaseAzureAISearchTool'#
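Override the provider string for the component. This should be used to prevent internal module names from becoming part of the module name.
- return_value_as_string(value: SearchResults) str [source]#
Convert the search results to a string representation.
- async run(args: str | Dict[str, Any] | SearchQuery, cancellation_token: CancellationToken | None = None) SearchResults [source]#
Execute a search against the Azure AI Search index.
- Parameters:
args -- Search query text or a SearchQuery object
cancellation_token -- Optional token for cancelling the operation
- Returns:
SearchResults -- Container with the search results and metadata
- Raises:
ValueError -- If the search query is empty or invalid
ValueError -- If there is an authentication error or another search problem
CancelledError -- If the operation is cancelled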
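Because run accepts a string, a dict, or a SearchQuery, the first step is necessarily to reduce all three to one query string. The sketch below illustrates that normalization with a stand-in dataclass so it runs without autogen_ext installed; QueryStandIn and extract_query_text are hypothetical names, and the "query" dict key is assumed from the SearchQuery field name.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class QueryStandIn:
    """Stand-in for SearchQuery: a single required query string."""

    query: str


def extract_query_text(args: Union[str, Dict[str, Any], QueryStandIn]) -> str:
    """Reduce the three accepted input shapes to a non-empty query string."""
    if isinstance(args, str):
        text: Any = args
    elif isinstance(args, dict):
        text = args.get("query", "")  # assumed key, matching the SearchQuery field
    else:
        text = args.query
    if not isinstance(text, str) or not text.strip():
        # Mirrors the documented ValueError for empty or invalid queries.
        raise ValueError("Search query cannot be empty")
    return text
```

All three of extract_query_text("vector search"), extract_query_text({"query": "vector search"}), and extract_query_text(QueryStandIn("vector search")) yield the same string.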
- property schema: ToolSchema#
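Return the schema for this tool.
- pydantic model SearchQuery[source]#
Bases: BaseModel
Search query parameters.
This simplified interface requires only a search query string. All other parameters (top, filters, vector fields, etc.) are specified when the tool is created rather than at query time, which makes it easier for language models to generate structured output.
- Parameters:
query (str) -- The search query text.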
- Fields:
query (str)
- pydantic model SearchResult[source]#
Bases: BaseModel
A search result.
- Parameters:
score (float) -- The search score.
content (ContentDict) -- The document content.
metadata (MetadataDict) -- Additional metadata about the document.
- Fields:
content (Dict[str, Any])
metadata (Dict[str, Any])
score (float)
- field content: ContentDict [Required]#
The document content
- field metadata: MetadataDict [Required]#
Additional metadata about the document
- pydantic model SearchResults[source]#
Bases: BaseModel
Container for search results.
- Parameters:
results (List[SearchResult]) -- List of search results.
- Fields:
results (List[autogen_ext.tools.azure._ai_search.SearchResult])
- field results: List[SearchResult] [Required]#
List of search results
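Each result carries a score, a content dict, and a metadata dict, so post-processing is ordinary list handling. The sketch below uses a stand-in dataclass with the same three fields so it runs without the library; ResultStandIn, summarize, and the 'title' content key are illustrative assumptions, not part of the documented API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ResultStandIn:
    """Stand-in mirroring the documented SearchResult fields."""

    score: float
    content: Dict[str, Any]
    metadata: Dict[str, Any] = field(default_factory=dict)


def summarize(results: List[ResultStandIn], min_score: float = 0.0) -> List[str]:
    """Keep results at or above min_score and render them best-first as 'score title'."""
    kept = [r for r in results if r.score >= min_score]
    kept.sort(key=lambda r: r.score, reverse=True)
    return [f"{r.score:.2f} {r.content.get('title', '<untitled>')}" for r in kept]


sample = [
    ResultStandIn(0.5, {"title": "B"}),
    ResultStandIn(0.9, {"title": "A"}),
    ResultStandIn(0.1, {}),
]
lines = summarize(sample, min_score=0.3)
```

The same loop would work on the results list of a real SearchResults object, since it only reads the score and content attributes.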
- class VectorizableTextQuery(*, text: str, k_nearest_neighbors: int | None = None, fields: str | None = None, exhaustive: bool | None = None, oversampling: float | None = None, weight: float | None = None, **kwargs: Any)[source]#
Bases: VectorQuery
The query parameters to use for vector search when a text value that needs to be vectorized is provided.
All required parameters must be populated in order to send to server.
- Variables:
kind (str or VectorQueryKind) -- The kind of vector query being performed. Required. Known values are: "vector" and "text".
k_nearest_neighbors (int) -- Number of nearest neighbors to return as top hits.
fields (str) -- Vector Fields of type Collection(Edm.Single) to be included in the vector searched.
exhaustive (bool) -- When true, triggers an exhaustive k-nearest neighbor search across all vectors within the vector index. Useful for scenarios where exact matches are critical, such as determining ground truth values.
oversampling (float) -- Oversampling factor. Minimum value is 1. It overrides the 'defaultOversampling' parameter configured in the index definition. It can be set only when 'rerankWithOriginalVectors' is true. This parameter is only permitted when a compression method is used on the underlying vector field.
weight (float) -- Relative weight of the vector query when compared to other vector query and/or the text query within the same search request. This value is used when combining the results of multiple ranking lists produced by the different vector queries and/or the results retrieved through the text query. The higher the weight, the higher the documents that matched that query will be in the final ranking. Default is 1.0 and the value needs to be a positive number larger than zero.
text (str) -- The text to be vectorized to perform a vector search query. Required.
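The weight description says that a higher weight moves a query's matches up when several ranked lists are combined. The text above does not say which fusion formula is used, so the sketch below assumes a weighted Reciprocal Rank Fusion with the conventional constant 60, purely to illustrate the effect; weighted_rrf is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def weighted_rrf(rankings: List[Tuple[List[str], float]], c: int = 60) -> List[str]:
    """Fuse ranked ID lists; each list contributes weight / (c + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked_ids, weight in rankings:
        if weight <= 0:
            raise ValueError("weight must be a positive number")
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


text_ranking = ["a", "b", "c"]    # order returned by a text query
vector_ranking = ["c", "b", "a"]  # order returned by a vector query
fused = weighted_rrf([(text_ranking, 1.0), (vector_ranking, 2.0)])
```

With the vector list weighted twice as heavily, its top hit ends up first in fused, which is the behavior the weight parameter describes.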