autogen_ext.tools.azure#

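pydantic model AzureAISearchConfig[source]#

Bases: BaseModel

Azure AI Search configuration with validation.

This class defines the configuration parameters for the Azure AI Search tool, including authentication, search behavior, caching, and embedding settings.

Note

This class requires the azure extra for the autogen-ext package.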

pip install -U "autogen-ext[azure]"

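Note

Prerequisites:

  1. An Azure AI Search service must be created in your Azure subscription.

  2. The search index must be configured for your use case:

    • Vector search: the index must contain vector fields

    • Semantic search: the index must have a semantic configuration

    • Hybrid search: both vector fields and text fields must be configured

  3. Required packages:

    • Base functionality: azure-search-documents>=11.4.0

    • Azure OpenAI embeddings: openai azure-identity

    • OpenAI embeddings: openai

Example usage: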
from azure.core.credentials import AzureKeyCredential
from autogen_ext.tools.azure import AzureAISearchConfig

# Basic configuration for full-text search
config = AzureAISearchConfig(
    name="doc-search",
    endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
    index_name="<your-index>",  # Name of your search index
    credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
    query_type="simple",
    search_fields=["content", "title"],  # Update with your searchable fields
    top=5,
)

# Configuration for vector search with Azure OpenAI embeddings
vector_config = AzureAISearchConfig(
    name="vector-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    query_type="vector",
    vector_fields=["embedding"],  # Update with your vector field name
    embedding_provider="azure_openai",
    embedding_model="text-embedding-ada-002",
    openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
    openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
    top=5,
)

# Configuration for hybrid search with semantic ranking
hybrid_config = AzureAISearchConfig(
    name="hybrid-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    query_type="semantic",
    semantic_config_name="<your-semantic-config>",  # Name of your semantic configuration
    search_fields=["content", "title"],  # Update with your search fields
    vector_fields=["embedding"],  # Update with your vector field name
    embedding_provider="openai",
    embedding_model="text-embedding-ada-002",
    openai_api_key="<your-openai-key>",  # Your OpenAI API key
    top=5,
)

Show JSON schema
{
   "title": "AzureAISearchConfig",
   "description": "Azure AI Search configuration with validation.\n\nThis class defines the configuration parameters for the Azure AI Search tool, including\nauthentication, search behavior, caching, and embedding settings.\n\n.. note::\n    This class requires the ``azure`` extra for the ``autogen-ext`` package.\n\n    .. code-block:: bash\n\n        pip install -U \"autogen-ext[azure]\"\n\n.. note::\n    **Prerequisites:**\n\n    1. An Azure AI Search service must be created in your Azure subscription\n    2. The search index must be configured for your use case:\n\n       - Vector search: the index must contain vector fields\n       - Semantic search: the index must have a semantic configuration\n       - Hybrid search: both vector fields and text fields must be configured\n    3. Required packages:\n\n       - Base functionality: ``azure-search-documents>=11.4.0``\n       - Azure OpenAI embeddings: ``openai azure-identity``\n       - OpenAI embeddings: ``openai``\n\nExample usage:\n    .. code-block:: python\n\n        from azure.core.credentials import AzureKeyCredential\n        from autogen_ext.tools.azure import AzureAISearchConfig\n\n        # Basic configuration for full-text search\n        config = AzureAISearchConfig(\n            name=\"doc-search\",\n            endpoint=\"https://your-search.search.windows.net\",  # Your Azure AI Search endpoint\n            index_name=\"<your-index>\",  # Name of your search index\n            credential=AzureKeyCredential(\"<your-key>\"),  # Your Azure AI Search admin key\n            query_type=\"simple\",\n            search_fields=[\"content\", \"title\"],  # Update with your searchable fields\n            top=5,\n        )\n\n        # Configuration for vector search with Azure OpenAI embeddings\n        vector_config = AzureAISearchConfig(\n            name=\"vector-search\",\n            endpoint=\"https://your-search.search.windows.net\",\n            index_name=\"<your-index>\",\n            credential=AzureKeyCredential(\"<your-key>\"),\n            query_type=\"vector\",\n            vector_fields=[\"embedding\"],  # Update with your vector field name\n            embedding_provider=\"azure_openai\",\n            embedding_model=\"text-embedding-ada-002\",\n            openai_endpoint=\"https://your-openai.openai.azure.com\",  # Your Azure OpenAI endpoint\n            openai_api_key=\"<your-openai-key>\",  # Your Azure OpenAI key\n            top=5,\n        )\n\n        # Configuration for hybrid search with semantic ranking\n        hybrid_config = AzureAISearchConfig(\n            name=\"hybrid-search\",\n            endpoint=\"https://your-search.search.windows.net\",\n            index_name=\"<your-index>\",\n            credential=AzureKeyCredential(\"<your-key>\"),\n            query_type=\"semantic\",\n            semantic_config_name=\"<your-semantic-config>\",  # Name of your semantic configuration\n            search_fields=[\"content\", \"title\"],  # Update with your search fields\n            vector_fields=[\"embedding\"],  # Update with your vector field name\n            embedding_provider=\"openai\",\n            embedding_model=\"text-embedding-ada-002\",\n            openai_api_key=\"<your-openai-key>\",  # Your OpenAI API key\n            top=5,\n        )",
   "type": "object",
   "properties": {
      "name": {
         "description": "The name of this tool instance",
         "title": "Name",
         "type": "string"
      },
      "description": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Description explaining the tool's purpose",
         "title": "Description"
      },
      "endpoint": {
         "description": "The full URL of your Azure AI Search service",
         "title": "Endpoint",
         "type": "string"
      },
      "index_name": {
         "description": "Name of the search index to query",
         "title": "Index Name",
         "type": "string"
      },
      "credential": {
         "anyOf": [],
         "description": "Azure credential for authentication (API key or token)",
         "title": "Credential"
      },
      "api_version": {
         "default": "2023-10-01-preview",
         "description": "Azure AI Search API version to use. Defaults to 2023-10-01-preview.",
         "title": "Api Version",
         "type": "string"
      },
      "query_type": {
         "default": "simple",
         "description": "Type of search to perform: simple, full, semantic, or vector",
         "enum": [
            "simple",
            "full",
            "semantic",
            "vector"
         ],
         "title": "Query Type",
         "type": "string"
      },
      "search_fields": {
         "anyOf": [
            {
               "items": {
                  "type": "string"
               },
               "type": "array"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Fields to search within documents",
         "title": "Search Fields"
      },
      "select_fields": {
         "anyOf": [
            {
               "items": {
                  "type": "string"
               },
               "type": "array"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Fields to return in search results",
         "title": "Select Fields"
      },
      "vector_fields": {
         "anyOf": [
            {
               "items": {
                  "type": "string"
               },
               "type": "array"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Fields to use for vector search",
         "title": "Vector Fields"
      },
      "top": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Maximum number of results to return. For vector searches, acts as k in k-NN.",
         "title": "Top"
      },
      "filter": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "OData filter expression to refine search results",
         "title": "Filter"
      },
      "semantic_config_name": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Semantic configuration name for enhanced results",
         "title": "Semantic Config Name"
      },
      "enable_caching": {
         "default": false,
         "description": "Whether to cache search results",
         "title": "Enable Caching",
         "type": "boolean"
      },
      "cache_ttl_seconds": {
         "default": 300,
         "description": "How long to cache results in seconds",
         "title": "Cache Ttl Seconds",
         "type": "integer"
      },
      "embedding_provider": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Name of embedding provider for client-side embeddings",
         "title": "Embedding Provider"
      },
      "embedding_model": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Model name for client-side embeddings",
         "title": "Embedding Model"
      },
      "openai_api_key": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "API key for OpenAI/Azure OpenAI embeddings",
         "title": "Openai Api Key"
      },
      "openai_api_version": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "API version for Azure OpenAI embeddings",
         "title": "Openai Api Version"
      },
      "openai_endpoint": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "description": "Endpoint URL for Azure OpenAI embeddings",
         "title": "Openai Endpoint"
      }
   },
   "required": [
      "name",
      "endpoint",
      "index_name",
      "credential"
   ]
}
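
The `required` array in the schema lists the four fields that have no default. As a rough illustration only, a plain-Python pre-check over keyword arguments could look like the sketch below. `REQUIRED_FIELDS` and `missing_required` are hypothetical names that do not exist in autogen_ext; the real validation happens in pydantic when `AzureAISearchConfig` is constructed.

```python
from typing import Any, Dict, List

# Hypothetical constant mirroring the "required" array of the JSON schema.
REQUIRED_FIELDS = ("name", "endpoint", "index_name", "credential")


def missing_required(kwargs: Dict[str, Any]) -> List[str]:
    """Return the required field names absent from kwargs, in schema order."""
    return [field for field in REQUIRED_FIELDS if field not in kwargs]


candidate = {
    "name": "doc-search",
    "endpoint": "https://your-search.search.windows.net",
    "query_type": "simple",  # optional field, defaults to "simple"
}
print(missing_required(candidate))
```

A check like this can give a friendlier message before the configuration is built, for example when the values come from a settings file.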

Fields:
  • api_version (str)

  • cache_ttl_seconds (int)

  • credential (azure.core.credentials.AzureKeyCredential | azure.core.credentials_async.AsyncTokenCredential)

  • description (str | None)

  • embedding_model (str | None)

  • embedding_provider (str | None)

  • enable_caching (bool)

  • endpoint (str)

  • filter (str | None)

  • index_name (str)

  • name (str)

  • openai_api_key (str | None)

  • openai_api_version (str | None)

  • openai_endpoint (str | None)

  • query_type (Literal['simple', 'full', 'semantic', 'vector'])

  • search_fields (List[str] | None)

  • select_fields (List[str] | None)

  • semantic_config_name (str | None)

  • top (int | None)

  • vector_fields (List[str] | None)

Validators:
  • normalize_query_type » query_type

  • validate_endpoint » endpoint

  • validate_interdependent_fields » all fields

  • validate_top » top

field api_version: str = '2023-10-01-preview'#

Azure AI Search API version to use. Defaults to 2023-10-01-preview.

Validated by:
  • validate_interdependent_fields

field cache_ttl_seconds: int = 300#

How long to cache results in seconds

Validated by:
  • validate_interdependent_fields

field credential: AzureKeyCredential | AsyncTokenCredential [Required]#

Azure credential for authentication (API key or token)

Validated by:
  • validate_interdependent_fields

field description: str | None = None#

Description explaining the tool's purpose

Validated by:
  • validate_interdependent_fields

field embedding_model: str | None = None#

Model name for client-side embeddings

Validated by:
  • validate_interdependent_fields

field embedding_provider: str | None = None#

Name of embedding provider for client-side embeddings

Validated by:
  • validate_interdependent_fields

field enable_caching: bool = False#

Whether to cache search results

Validated by:
  • validate_interdependent_fields
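
To make the interaction between `enable_caching` and `cache_ttl_seconds` concrete, here is a minimal sketch of a time-to-live cache. It is not the library's implementation, which this page does not describe; `TTLCache`, its `clock` parameter, and the cache key format are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory cache: an entry expires ttl_seconds after it is stored."""

    def __init__(self, ttl_seconds: int = 300, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            # Expired: drop the entry so the caller runs the search again.
            del self._entries[key]
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)


# Simulate time with a fake clock instead of waiting five minutes.
now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])
cache.put("query:azure pricing", ["doc-1", "doc-2"])
now[0] = 299.0
hit = cache.get("query:azure pricing")   # still inside the TTL window
now[0] = 300.0
miss = cache.get("query:azure pricing")  # TTL reached, entry is discarded
```

With the default of 300 seconds, repeated identical queries within five minutes would be answered from memory in a design like this, at the cost of possibly stale results.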

field endpoint: str [Required]#

The full URL of your Azure AI Search service

Validated by:
  • validate_endpoint

  • validate_interdependent_fields

field filter: str | None = None#

OData filter expression to refine search results

Validated by:
  • validate_interdependent_fields
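
The `filter` value is passed as an OData expression string. The sketch below shows one way such a string might be assembled safely; `odata_eq` and `odata_and` are hypothetical helpers, and `language` and `author` are assumed to be filterable fields in your index. In OData, string literals use single quotes and an embedded single quote is escaped by doubling it.

```python
def odata_eq(field: str, value: str) -> str:
    """Build `field eq 'value'`, doubling single quotes inside the literal."""
    escaped = value.replace("'", "''")
    return f"{field} eq '{escaped}'"


def odata_and(*clauses: str) -> str:
    """Join clauses with `and`, parenthesizing each to keep precedence explicit."""
    return " and ".join(f"({clause})" for clause in clauses)


language_filter = odata_eq("language", "en")
combined = odata_and(language_filter, odata_eq("author", "O'Brien"))
print(language_filter)
print(combined)
```

The resulting string can then be supplied as the `filter` argument when the configuration or tool is created.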

field index_name: str [Required]#

Name of the search index to query

Validated by:
  • validate_interdependent_fields

field name: str [Required]#

The name of this tool instance

Validated by:
  • validate_interdependent_fields

field openai_api_key: str | None = None#

API key for OpenAI/Azure OpenAI embeddings

Validated by:
  • validate_interdependent_fields

field openai_api_version: str | None = None#

API version for Azure OpenAI embeddings

Validated by:
  • validate_interdependent_fields

field openai_endpoint: str | None = None#

Endpoint URL for Azure OpenAI embeddings

Validated by:
  • validate_interdependent_fields

field query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple'#

Type of search to perform: simple, full, semantic, or vector

Validated by:
  • normalize_query_type

  • validate_interdependent_fields

field search_fields: List[str] | None = None#

Fields to search within documents

Validated by:
  • validate_interdependent_fields

field select_fields: List[str] | None = None#

Fields to return in search results

Validated by:
  • validate_interdependent_fields

field semantic_config_name: str | None = None#

Semantic configuration name for enhanced results

Validated by:
  • validate_interdependent_fields

field top: int | None = None#

Maximum number of results to return. For vector searches, acts as k in k-NN.

Validated by:
  • validate_interdependent_fields

  • validate_top
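
For readers unfamiliar with the "k in k-NN" wording, the following sketch shows what selecting the k nearest neighbors means, using a brute-force scan with cosine similarity. This is only an illustration: the similarity metric is an assumption, the function names are hypothetical, and a search service typically uses an approximate index rather than scanning every vector.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], docs: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


docs = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
nearest = top_k([1.0, 0.1], docs, k=2)
print([doc_id for doc_id, _ in nearest])
```

Setting `top=5` on a vector search therefore asks for the five closest vectors, in the same way `k=2` keeps the two closest documents here.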

field vector_fields: List[str] | None = None#

Fields to use for vector search

Validated by:
  • validate_interdependent_fields

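validator normalize_query_type  »  query_type[source]#

Normalize the query type to its canonical value.

validator validate_endpoint  »  endpoint[source]#

Validate that the endpoint is a valid URL.

validator validate_interdependent_fields  »  all fields[source]#

Validate interdependent fields after all fields have been parsed.

validator validate_top  »  top[source]#

If top is provided, ensure it is a positive integer.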
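
The four validators can be pictured with the standard-library sketch below. It mirrors their documented intent only and is not the library's code: lower-casing as the normalization rule, the accepted URL schemes, and the error messages are all assumptions, and only the semantic-configuration dependency is shown for the interdependent check.

```python
from typing import Optional
from urllib.parse import urlparse

ALLOWED_QUERY_TYPES = ("simple", "full", "semantic", "vector")


def normalize_query_type(value: str) -> str:
    """Assumed normalization: trim whitespace and lower-case, then check membership."""
    normalized = value.strip().lower()
    if normalized not in ALLOWED_QUERY_TYPES:
        raise ValueError(f"query_type must be one of {ALLOWED_QUERY_TYPES}")
    return normalized


def validate_endpoint(value: str) -> str:
    """Accept only URLs with an http(s) scheme and a host (assumed rule)."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("endpoint must be a valid URL")
    return value


def validate_top(value: Optional[int]) -> Optional[int]:
    """None means 'not provided'; any provided value must be positive."""
    if value is not None and value <= 0:
        raise ValueError("top must be a positive integer")
    return value


def validate_interdependent(query_type: str, semantic_config_name: Optional[str]) -> None:
    """Cross-field rule: semantic queries need a semantic configuration name."""
    if query_type == "semantic" and not semantic_config_name:
        raise ValueError("semantic_config_name is required when query_type is 'semantic'")
```

Field-level checks run on single values, while the cross-field check needs every field to be parsed first, which is why it is listed against all fields.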

class AzureAISearchTool(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], description: str | None = None, api_version: str = DEFAULT_API_VERSION, query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple', search_fields: List[str] | None = None, select_fields: List[str] | None = None, vector_fields: List[str] | None = None, top: int | None = None, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None)[source]#

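Bases: EmbeddingProviderMixin, BaseAzureAISearchTool

Azure AI Search tool for querying Azure search indexes.

This tool provides a simplified interface for querying Azure AI Search indexes using several search methods. The recommended way to create an instance tailored to a specific search type is through the factory methods:

  1. Full-text search: for traditional keyword-based searches, Lucene queries, or semantically re-ranked results.

    • Use AzureAISearchTool.create_full_text_search()

    • Supported query_type values: "simple" (keyword), "full" (Lucene), "semantic" (semantic ranking)

  2. Vector search: for pure similarity searches based on vector embeddings.

    • Use AzureAISearchTool.create_vector_search()

  3. Hybrid search: combines vector search with full-text or semantic search to get the benefits of both.

    • Use AzureAISearchTool.create_hybrid_search()

    • The text component can be set to "simple", "full", or "semantic" through the query_type parameter.

Each factory method configures appropriate defaults and validation for the chosen search strategy.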
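
The choice between the three factory methods follows from which kinds of fields you intend to query. The helper below is a hypothetical decision aid, not part of autogen_ext; it only returns the name of the factory method that matches the fields you supply.

```python
from typing import List, Optional


def choose_factory(search_fields: Optional[List[str]], vector_fields: Optional[List[str]]) -> str:
    """Hypothetical helper: name the AzureAISearchTool factory that fits the given fields."""
    if vector_fields and search_fields:
        return "create_hybrid_search"      # text and vector fields together
    if vector_fields:
        return "create_vector_search"      # similarity search only
    if search_fields:
        return "create_full_text_search"   # keyword, Lucene, or semantic text search
    raise ValueError("provide search_fields, vector_fields, or both")


print(choose_factory(["content", "title"], None))
print(choose_factory(None, ["content_vector"]))
print(choose_factory(["content", "title"], ["content_vector"]))
```

The `query_type` argument then refines the text side of the search for the full-text and hybrid factories.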

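Warning

If you set query_type="semantic", you must also provide a valid semantic_config_name. This configuration must already be set up in your Azure AI Search index.

component_provider_override: ClassVar[str | None] = 'autogen_ext.tools.azure.AzureAISearchTool'#

Override the provider string for the component. This should be used to prevent internal module names from becoming part of the module name.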

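classmethod create_full_text_search

Create a tool for traditional text-based searches.

This factory method creates an AzureAISearchTool optimized for full-text search, supporting keyword matching, Lucene syntax, and semantic search capabilities.

Parameters:
  • name -- The name of this tool instance

  • endpoint -- The full URL of your Azure AI Search service

  • index_name -- Name of the search index to query

  • credential -- Azure credential for authentication (API key or token)

  • description -- Optional description explaining the tool's purpose

  • api_version -- Azure AI Search API version to use

  • query_type --

    Type of text search to perform:

    • simple: Basic keyword search that matches exact terms and their variations

    • full: Advanced search using Lucene query syntax for complex queries

    • semantic: AI-powered search that understands meaning and context, providing enhanced relevance ranking

  • search_fields -- Fields to search within documents

  • select_fields -- Fields to return in search results

  • top -- Maximum number of results to return (default: 5)

  • filter -- OData filter expression to refine search results

  • semantic_config_name -- Semantic configuration name (required when query_type is "semantic")

  • enable_caching -- Whether to cache search results

  • cache_ttl_seconds -- How long to cache results, in seconds

Returns:

An initialized AzureAISearchTool for full-text search

Example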

from azure.core.credentials import AzureKeyCredential
from autogen_ext.tools.azure import AzureAISearchTool

# Basic keyword search
tool = AzureAISearchTool.create_full_text_search(
    name="doc-search",
    endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
    index_name="<your-index>",  # Name of your search index
    credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
    query_type="simple",  # Enable keyword search
    search_fields=["content", "title"],  # Required: fields to search within
    select_fields=["content", "title", "url"],  # Optional: fields to return
    top=5,
)

# Full-text (Lucene query) search
full_text_tool = AzureAISearchTool.create_full_text_search(
    name="doc-search",
    endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
    index_name="<your-index>",  # Name of your search index
    credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
    query_type="full",  # Enable Lucene query syntax
    search_fields=["content", "title"],  # Required: fields to search within
    select_fields=["content", "title", "url"],  # Optional: fields to return
    top=5,
)

# Semantic search with re-ranking
# Note: make sure your index has a semantic configuration enabled
semantic_tool = AzureAISearchTool.create_full_text_search(
    name="semantic-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    query_type="semantic",  # Enable semantic ranking
    semantic_config_name="<your-semantic-config>",  # Required for semantic search
    search_fields=["content", "title"],  # Required: fields to search within
    select_fields=["content", "title", "url"],  # Optional: fields to return
    top=5,
)

# The search tool can be used with an Agent
# assistant = Agent("assistant", tools=[semantic_tool])

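classmethod create_hybrid_search

Create a tool that combines vector and text search capabilities.

This factory method creates an AzureAISearchTool configured for hybrid search, which combines the benefits of vector similarity and traditional text search.

Parameters:
  • name -- The name of this tool instance

  • endpoint -- The full URL of your Azure AI Search service

  • index_name -- Name of the search index to query

  • credential -- Azure credential for authentication (API key or token)

  • vector_fields -- Fields to use for vector search (required)

  • search_fields -- Fields to use for text search (required)

  • description -- Optional description explaining the tool's purpose

  • api_version -- Azure AI Search API version to use

  • query_type --

    Type of text search to perform:

    • simple: Basic keyword search that matches exact terms and their variations

    • full: Advanced search using Lucene query syntax for complex queries

    • semantic: AI-powered search that understands meaning and context, providing enhanced relevance ranking

  • select_fields -- Fields to return in search results

  • top -- Maximum number of results to return (default: 5)

  • filter -- OData filter expression to refine search results

  • semantic_config_name -- Semantic configuration name (required when query_type="semantic")

  • enable_caching -- Whether to cache search results

  • cache_ttl_seconds -- How long to cache results, in seconds

  • embedding_provider -- Provider for client-side embeddings (e.g. 'azure_openai', 'openai')

  • embedding_model -- Model for client-side embeddings (e.g. 'text-embedding-ada-002')

  • openai_api_key -- API key for OpenAI/Azure OpenAI embeddings

  • openai_api_version -- API version for Azure OpenAI embeddings

  • openai_endpoint -- Endpoint URL for Azure OpenAI embeddings

Returns:

An initialized AzureAISearchTool for hybrid search

Raises:
  • ValueError -- If vector_fields or search_fields is empty

  • ValueError -- If query_type is "semantic" without semantic_config_name

  • ValueError -- If embedding_provider is 'azure_openai' without openai_endpoint

  • ValueError -- If required parameters are missing or invalid

Example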

from azure.core.credentials import AzureKeyCredential
from autogen_ext.tools.azure import AzureAISearchTool

# Basic hybrid search with service-side vectorization
tool = AzureAISearchTool.create_hybrid_search(
    name="hybrid-search",
    endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
    index_name="<your-index>",  # Name of your search index
    credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
    vector_fields=["content_vector"],  # Your vector field name
    search_fields=["content", "title"],  # Your searchable fields
    top=5,
)

# Hybrid search with semantic ranking and Azure OpenAI embeddings
semantic_tool = AzureAISearchTool.create_hybrid_search(
    name="semantic-hybrid-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    vector_fields=["content_vector"],
    search_fields=["content", "title"],
    query_type="semantic",  # Enable semantic ranking
    semantic_config_name="<your-semantic-config>",  # Name of your semantic configuration
    embedding_provider="azure_openai",  # Use Azure OpenAI for embeddings
    embedding_model="text-embedding-ada-002",  # Embedding model to use
    openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
    openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
    openai_api_version="2024-02-15-preview",  # Azure OpenAI API version
    select_fields=["content", "title", "url"],  # Fields to return in results
    filter="language eq 'en'",  # Optional OData filter
    top=5,
)

# The search tool can be used with an Agent
# assistant = Agent("assistant", tools=[semantic_tool])

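classmethod create_vector_search

Create a tool for pure vector/similarity search.

This factory method creates an AzureAISearchTool optimized for vector search, allowing matching by semantic similarity using vector embeddings.

Parameters:
  • name -- The name of this tool instance

  • endpoint -- The full URL of your Azure AI Search service

  • index_name -- Name of the search index to query

  • credential -- Azure credential for authentication (API key or token)

  • vector_fields -- Fields to use for vector search (required)

  • description -- Optional description explaining the tool's purpose

  • api_version -- Azure AI Search API version to use

  • select_fields -- Fields to return in search results

  • top -- Maximum number of results to return / k in k-NN (default: 5)

  • filter -- OData filter expression to refine search results

  • enable_caching -- Whether to cache search results

  • cache_ttl_seconds -- How long to cache results, in seconds

  • embedding_provider -- Provider for client-side embeddings (e.g. 'azure_openai', 'openai')

  • embedding_model -- Model for client-side embeddings (e.g. 'text-embedding-ada-002')

  • openai_api_key -- API key for OpenAI/Azure OpenAI embeddings

  • openai_api_version -- API version for Azure OpenAI embeddings

  • openai_endpoint -- Endpoint URL for Azure OpenAI embeddings

Returns:

An initialized AzureAISearchTool for vector search

Raises:
  • ValueError -- If vector_fields is empty

  • ValueError -- If embedding_provider is 'azure_openai' without openai_endpoint

  • ValueError -- If required parameters are missing or invalid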

Example Usage:
from azure.core.credentials import AzureKeyCredential
from autogen_ext.tools.azure import AzureAISearchTool

# Vector search with service-side vectorization
tool = AzureAISearchTool.create_vector_search(
    name="vector-search",
    endpoint="https://your-search.search.windows.net",  # Your Azure AI Search endpoint
    index_name="<your-index>",  # Name of your search index
    credential=AzureKeyCredential("<your-key>"),  # Your Azure AI Search admin key
    vector_fields=["content_vector"],  # Your vector field name
    select_fields=["content", "title", "url"],  # Fields to return in results
    top=5,
)

# Vector search with Azure OpenAI embeddings
azure_openai_tool = AzureAISearchTool.create_vector_search(
    name="azure-openai-vector-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    vector_fields=["content_vector"],
    embedding_provider="azure_openai",  # Use Azure OpenAI for embeddings
    embedding_model="text-embedding-ada-002",  # Embedding model to use
    openai_endpoint="https://your-openai.openai.azure.com",  # Your Azure OpenAI endpoint
    openai_api_key="<your-openai-key>",  # Your Azure OpenAI key
    openai_api_version="2024-02-15-preview",  # Azure OpenAI API version
    select_fields=["content", "title", "url"],  # Fields to return in results
    top=5,
)

# Vector search with OpenAI embeddings
openai_tool = AzureAISearchTool.create_vector_search(
    name="openai-vector-search",
    endpoint="https://your-search.search.windows.net",
    index_name="<your-index>",
    credential=AzureKeyCredential("<your-key>"),
    vector_fields=["content_vector"],
    embedding_provider="openai",  # Use OpenAI for embeddings
    embedding_model="text-embedding-ada-002",  # Embedding model to use
    openai_api_key="<your-openai-key>",  # Your OpenAI API key
    select_fields=["content", "title", "url"],  # Fields to return in results
    top=5,
)

# Use the tool with an Agent
# assistant = Agent("assistant", tools=[azure_openai_tool])

class BaseAzureAISearchTool(name: str, endpoint: str, index_name: str, credential: AzureKeyCredential | AsyncTokenCredential | Dict[str, str], description: str | None = None, api_version: str = DEFAULT_API_VERSION, query_type: Literal['simple', 'full', 'semantic', 'vector'] = 'simple', search_fields: List[str] | None = None, select_fields: List[str] | None = None, vector_fields: List[str] | None = None, top: int | None = None, filter: str | None = None, semantic_config_name: str | None = None, enable_caching: bool = False, cache_ttl_seconds: int = 300, embedding_provider: str | None = None, embedding_model: str | None = None, openai_api_key: str | None = None, openai_api_version: str | None = None, openai_endpoint: str | None = None)[source]#

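Bases: BaseTool[SearchQuery, SearchResults], Component[AzureAISearchConfig], EmbeddingProvider, ABC

Abstract base class for Azure AI Search tools.

This class defines the common interface and functionality for all Azure AI Search tools. It handles configuration management, client initialization, and the abstract methods that subclasses must implement.

Attributes:

search_config: Configuration parameters for the search service.

Note:

This is an abstract base class and should not be instantiated directly. Use a concrete implementation or the factory methods in AzureAISearchTool.

async close() → None[source]#

Explicitly close the Azure SearchClient if cleanup is needed.

component_config_schema#

alias of AzureAISearchConfig

component_provider_override: ClassVar[str | None] = 'autogen_ext.tools.azure.BaseAzureAISearchTool'#

Override the provider string for the component. This should be used to prevent internal module names from becoming part of the module name.

return_value_as_string(value: SearchResults) → str[source]#

Convert the search results to a string representation.

async run(args: str | Dict[str, Any] | SearchQuery, cancellation_token: CancellationToken | None = None) → SearchResults[source]#

Execute a search against the Azure AI Search index.

Parameters:
  • args -- Search query text or a SearchQuery object

  • cancellation_token -- Optional token for cancelling the operation

Returns:

SearchResults -- Container with the search results and metadata

Raises:

property schema: ToolSchema#

Return the schema for this tool.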

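pydantic model SearchQuery[source]#

Bases: BaseModel

Search query parameters.

This simplified interface only requires a search query string. All other parameters (top, filters, vector fields, etc.) are specified when the tool is created rather than at query time, which makes it easier for language models to generate structured output.

Parameters:

query (str) -- The search query text.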
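
Because `run` accepts a plain string, a dictionary, or a `SearchQuery`, some coercion to a single shape has to happen before the search executes. The sketch below shows one way that could work. The dataclass is a stand-in for the pydantic model, `to_search_query` is a hypothetical name, and the exact coercion rules used by the library are an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class SearchQuery:
    """Stand-in for the pydantic model documented here; only `query` is modelled."""

    query: str


def to_search_query(args: Union[str, Dict[str, Any], SearchQuery]) -> SearchQuery:
    """Coerce the three accepted argument shapes into a SearchQuery (assumed behaviour)."""
    if isinstance(args, SearchQuery):
        return args
    if isinstance(args, str):
        return SearchQuery(query=args)
    if isinstance(args, dict) and "query" in args:
        return SearchQuery(query=str(args["query"]))
    raise ValueError("args must be a string, a dict with a 'query' key, or a SearchQuery")


print(to_search_query("azure pricing"))
print(to_search_query({"query": "azure pricing"}))
```

Keeping the model to a single `query` field means a language model calling the tool only has to produce one string.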

Show JSON schema
{
   "title": "SearchQuery",
   "description": "Search query parameters.\n\nThis simplified interface only requires a search query string.\nAll other parameters (top, filters, vector fields, etc.) are specified when the tool is created\nrather than at query time, which makes it easier for language models to generate structured output.\n\nArgs:\n    query (str): The search query text.",
   "type": "object",
   "properties": {
      "query": {
         "description": "Search query text",
         "title": "Query",
         "type": "string"
      }
   },
   "required": [
      "query"
   ]
}

Fields:
  • query (str)

field query: str [Required]#

Search query text

pydantic model SearchResult[source]#

Bases: BaseModel

Search result.

Parameters:
  • score (float) -- The search score.

  • content (ContentDict) -- The document content.

  • metadata (MetadataDict) -- Additional metadata about the document.

Show JSON schema
{
   "title": "SearchResult",
   "description": "Search result.\n\nArgs:\n    score (float): The search score.\n    content (ContentDict): The document content.\n    metadata (MetadataDict): Additional metadata about the document.",
   "type": "object",
   "properties": {
      "score": {
         "description": "The search score",
         "title": "Score",
         "type": "number"
      },
      "content": {
         "description": "The document content",
         "title": "Content",
         "type": "object"
      },
      "metadata": {
         "description": "Additional metadata about the document",
         "title": "Metadata",
         "type": "object"
      }
   },
   "required": [
      "score",
      "content",
      "metadata"
   ]
}

Fields:
  • content (Dict[str, Any])

  • metadata (Dict[str, Any])

  • score (float)

field content: ContentDict [Required]#

The document content

field metadata: MetadataDict [Required]#

Additional metadata about the document

field score: float [Required]#

The search score

pydantic model SearchResults[source]#

Bases: BaseModel

Container for search results.

Parameters:

results (List[SearchResult]) -- List of search results.

Show JSON schema
{
   "title": "SearchResults",
   "description": "Container for search results.\n\nArgs:\n    results (List[SearchResult]): List of search results.",
   "type": "object",
   "properties": {
      "results": {
         "description": "List of search results",
         "items": {
            "$ref": "#/$defs/SearchResult"
         },
         "title": "Results",
         "type": "array"
      }
   },
   "$defs": {
      "SearchResult": {
         "description": "Search result.\n\nArgs:\n    score (float): The search score.\n    content (ContentDict): The document content.\n    metadata (MetadataDict): Additional metadata about the document.",
         "properties": {
            "score": {
               "description": "The search score",
               "title": "Score",
               "type": "number"
            },
            "content": {
               "description": "The document content",
               "title": "Content",
               "type": "object"
            },
            "metadata": {
               "description": "Additional metadata about the document",
               "title": "Metadata",
               "type": "object"
            }
         },
         "required": [
            "score",
            "content",
            "metadata"
         ],
         "title": "SearchResult",
         "type": "object"
      }
   },
   "required": [
      "results"
   ]
}

Fields:
  • results (List[autogen_ext.tools.azure._ai_search.SearchResult])

field results: List[SearchResult] [Required]#

List of search results
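
To show how the `SearchResults` container and its `SearchResult` items fit together, the sketch below models both with dataclasses and renders them as text. The dataclasses are stand-ins for the pydantic models, `format_results` is a hypothetical function, and the `title` key inside `content` is an assumed field of the index; the output format of the library's own `return_value_as_string` is not documented here and may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SearchResult:
    """Stand-in with the three documented fields."""

    score: float
    content: Dict[str, Any]
    metadata: Dict[str, Any]


@dataclass
class SearchResults:
    """Stand-in container holding a list of results."""

    results: List[SearchResult] = field(default_factory=list)


def format_results(found: SearchResults) -> str:
    """Render one numbered line per result, or a fallback when there are none."""
    if not found.results:
        return "No results found."
    lines = []
    for rank, result in enumerate(found.results, start=1):
        title = result.content.get("title", "<untitled>")
        lines.append(f"{rank}. {title} (score={result.score:.2f})")
    return "\n".join(lines)


found = SearchResults(results=[
    SearchResult(score=0.91, content={"title": "Pricing overview"}, metadata={}),
    SearchResult(score=0.47, content={"title": "Quotas and limits"}, metadata={}),
])
print(format_results(found))
```

Which keys appear in `content` depends on the fields selected through `select_fields` when the tool is created.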

class VectorizableTextQuery(*, text: str, k_nearest_neighbors: int | None = None, fields: str | None = None, exhaustive: bool | None = None, oversampling: float | None = None, weight: float | None = None, **kwargs: Any)[source]#

Bases: VectorQuery

The query parameters to use for vector search when a text value that needs to be vectorized is provided.

All required parameters must be populated in order to send to server.

Variables:
  • kind (str or VectorQueryKind) -- The kind of vector query being performed. Required. Known values are: "vector" and "text".

  • k_nearest_neighbors (int) -- Number of nearest neighbors to return as top hits.

  • fields (str) -- Vector Fields of type Collection(Edm.Single) to be included in the vector searched.

  • exhaustive (bool) -- When true, triggers an exhaustive k-nearest neighbor search across all vectors within the vector index. Useful for scenarios where exact matches are critical, such as determining ground truth values.

  • oversampling (float) -- Oversampling factor. Minimum value is 1. It overrides the 'defaultOversampling' parameter configured in the index definition. It can be set only when 'rerankWithOriginalVectors' is true. This parameter is only permitted when a compression method is used on the underlying vector field.

  • weight (float) -- Relative weight of the vector query when compared to other vector query and/or the text query within the same search request. This value is used when combining the results of multiple ranking lists produced by the different vector queries and/or the results retrieved through the text query. The higher the weight, the higher the documents that matched that query will be in the final ranking. Default is 1.0 and the value needs to be a positive number larger than zero.

  • text (str) -- The text to be vectorized to perform a vector search query. Required.
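
The `weight` description refers to combining several ranked lists into one final ranking. One common way to do this is Reciprocal Rank Fusion (RRF), sketched below with a per-list weight. This is an illustration under stated assumptions: the function name, the constant `k=60`, and applying the weight as a simple multiplier are choices made for the example, not a statement of how the service computes its scores.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def weighted_rrf(rankings: List[Tuple[List[str], float]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked ID lists: each list adds weight / (k + rank) to a document's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked_ids, weight in rankings:
        if weight <= 0:
            raise ValueError("weight must be a positive number")
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


text_ranking = ["a", "b", "c"]    # order returned by the text query
vector_ranking = ["c", "a", "d"]  # order returned by the vector query
fused = weighted_rrf([(text_ranking, 1.0), (vector_ranking, 2.0)])
print([doc_id for doc_id, _ in fused])
```

Doubling the weight of the vector list lifts document `c`, which the vector query ranked first, above `a` in the fused ordering, which matches the documented intent that a higher weight pushes that query's matches up.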