Download links for NVIDIA drivers, CUDA, cuDNN, and ONNX Runtime versions

NVIDIA driver downloads: https://www.nvidia.cn/drivers/lookup/

CUDA downloads:

Latest CUDA: https://developer.nvidia.com/cuda-downloads/

Archive of other CUDA versions: https://developer.nvidia.com/cuda-toolkit-archive

cuDNN downloads:

Latest cuDNN: https://developer.nvidia.com/cudnn-downloads

Archive of other cuDNN versions: https://developer.nvidia.com/rdp/cudnn-archive

**GPU driver / CUDA / cuDNN version correspondence**
Component Name		Version Information	Supported Architectures	Supported Platforms
CUDA C++ Core Compute Libraries	Thrust	3.0.1	x86_64, arm64-sbsa	Linux, Windows
	CUB	3.0.1
	libcu++	3.0.1
	Cooperative Groups	13.0.85
CUDA Application Compiler (crt)		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA Compilation Optimizer (ctadvisor)		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA Runtime (cudart)		13.0.96	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA culibos		13.0.85	x86_64, arm64-sbsa	Linux
CUDA cuobjdump		13.0.85	x86_64, arm64-sbsa	Linux, Windows
CUPTI		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuxxfilt (demangler)		13.0.85	x86_64, arm64-sbsa	Linux, Windows
CUDA Documentation		13.0.85	x86_64	Linux, Windows
CUDA GDB		13.0.85	x86_64, arm64-sbsa	Linux, WSL
CUDA Nsight Eclipse Plugin		13.0.85	x86_64	Linux
CUDA NVCC		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvdisasm		13.0.85	x86_64, arm64-sbsa	Linux, Windows
CUDA NVML Headers		13.0.87	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvprune		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA NVRTC		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA NVTX		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA OpenCL		13.0.85	x86_64	Linux, Windows
CUDA Profiler API		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA Sandbox dev		13.0.85	x86_64, arm64-sbsa	Linux, WSL
CUDA Compute Sanitizer API		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuBLAS		13.1.0.3	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuFFT		12.0.0.61	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuFile		1.15.1.6	x86_64, arm64-sbsa	Linux
CUDA cuRAND		10.4.0.35	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuSOLVER		12.0.4.66	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA cuSPARSE		12.6.3.3	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA NPP		13.0.1.2	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvFatbin		13.0.85	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvJitLink		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvJPEG		13.0.1.86	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvptxcompiler		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
CUDA nvsdm		580.95.05	x86_64	Linux, Windows, WSL
CUDA nvvm		13.0.88	x86_64, arm64-sbsa	Linux, Windows, WSL
Nsight Compute		2025.3.1.4	x86_64, arm64-sbsa	Windows, WSL (Windows 11)
Nsight Systems		2025.3.2.474	x86_64, arm64-sbsa	Linux, Windows, WSL
Nsight Visual Studio Edition (VSE)		2025.3.1.25227	x86_64 (Windows)	Windows
nvidia_fs [1]		2.26.6	x86_64, arm64-sbsa	Linux
Visual Studio Integration		13.0.85	x86_64 (Windows)	Windows
NVIDIA Linux Driver		580.95.05	x86_64, arm64-sbsa	Linux

2.2. CUDA Driver

Running a CUDA application requires a system with at least one CUDA-capable GPU and a driver that is compatible with the CUDA Toolkit. See Table 3. For more information on the various GPU products that are CUDA-capable, visit https://developer.nvidia.com/cuda-gpus.
Each release of the CUDA Toolkit requires a minimum version of the CUDA driver. The CUDA driver is backward compatible, meaning that applications compiled against a particular version of CUDA will continue to work on subsequent (later) driver releases.
More information on compatibility can be found at https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#cuda-compatibility-and-upgrades.
Note: Starting with CUDA 11.0, the toolkit components are individually versioned, and the toolkit itself is versioned as shown in the table below.
The minimum required driver version for CUDA minor version compatibility is shown below. CUDA minor version compatibility is described in detail in https://docs.nvidia.com/deploy/cuda-compatibility/index.html

Table 2 CUDA Toolkit and Minimum Required Driver Version for CUDA Minor Version Compatibility
CTK Version	Driver Range for Minor Version Compatibility
	Min	Max
13.x	>= 580	N/A
12.x	>= 525	< 580
11.x	>= 450	< 525

* Using a Minimum Required Version that is different from Toolkit Driver Version could be allowed in compatibility mode – please read the CUDA Compatibility Guide for details.
** Starting with CUDA 13.0, the Windows display driver is no longer bundled with the CUDA Toolkit package. Users must download and install the appropriate NVIDIA driver separately from the official driver download page.
For more information on supported driver versions, see the CUDA Compatibility Guide for drivers.
*** CUDA 11.0 was released with an earlier driver version, but by upgrading to Tesla Recommended Drivers 450.80.02 (Linux) / 452.39 (Windows), minor version compatibility is possible across the CUDA 11.x family of toolkits.
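
As a rough illustration of Table 2, the Python sketch below maps an installed driver version to the CUDA major versions whose minimum driver branch it satisfies. The dictionary values are copied from the Min column of the table; the function name `supported_cuda_majors` is made up for this example, and the check deliberately ignores the compatibility-mode exceptions mentioned in the footnotes.

```python
# Minimum driver branch per CUDA major version, copied from the Min column of Table 2.
MIN_DRIVER_BRANCH = {13: 580, 12: 525, 11: 450}


def supported_cuda_majors(driver_version):
    """Return the CUDA major versions whose minimum driver branch is met.

    driver_version is a dotted string such as "580.95.05".
    Only the leading branch number is compared, which is a simplification.
    """
    branch = int(driver_version.split(".")[0])
    return sorted(major for major, minimum in MIN_DRIVER_BRANCH.items() if branch >= minimum)


print(supported_cuda_majors("535.104.05"))  # [11, 12]
```

Because the driver is backward compatible, a newer branch such as 580 satisfies the minimum for 11.x and 12.x as well as 13.x.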
The version of the development NVIDIA GPU Driver packaged in each CUDA Toolkit release is shown below.

[1] Only available on select Linux distros (applies to nvidia_fs in the component table above).

Table 3 CUDA Toolkit and Corresponding Driver Versions
CUDA Toolkit	Toolkit Driver Version
	Linux x86_64 Driver Version	Windows x86_64 Driver Version
CUDA 13.0 Update 2	>=580.95.05	N/A
CUDA 13.0 Update 1	>=580.82.07	N/A
CUDA 13.0 GA	>=580.65.06	N/A
CUDA 12.9 Update 1	>=575.57.08	>=576.57
CUDA 12.9 GA	>=575.51.03	>=576.02
CUDA 12.8 Update 1	>=570.124.06	>=572.61
CUDA 12.8 GA	>=570.26	>=570.65
CUDA 12.6 Update 3	>=560.35.05	>=561.17
CUDA 12.6 Update 2	>=560.35.03	>=560.94
CUDA 12.6 Update 1	>=560.35.03	>=560.94
CUDA 12.6 GA	>=560.28.03	>=560.76
CUDA 12.5 Update 1	>=555.42.06	>=555.85
CUDA 12.5 GA	>=555.42.02	>=555.85
CUDA 12.4 Update 1	>=550.54.15	>=551.78
CUDA 12.4 GA	>=550.54.14	>=551.61
CUDA 12.3 Update 1	>=545.23.08	>=546.12
CUDA 12.3 GA	>=545.23.06	>=545.84
CUDA 12.2 Update 2	>=535.104.05	>=537.13
CUDA 12.2 Update 1	>=535.86.09	>=536.67
CUDA 12.2 GA	>=535.54.03	>=536.25
CUDA 12.1 Update 1	>=530.30.02	>=531.14
CUDA 12.1 GA	>=530.30.02	>=531.14
CUDA 12.0 Update 1	>=525.85.12	>=528.33
CUDA 12.0 GA	>=525.60.13	>=527.41
CUDA 11.8 GA	>=520.61.05	>=520.06
CUDA 11.7 Update 1	>=515.48.07	>=516.31
CUDA 11.7 GA	>=515.43.04	>=516.01
CUDA 11.6 Update 2	>=510.47.03	>=511.65
CUDA 11.6 Update 1	>=510.47.03	>=511.65
CUDA 11.6 GA	>=510.39.01	>=511.23
CUDA 11.5 Update 2	>=495.29.05	>=496.13
CUDA 11.5 Update 1	>=495.29.05	>=496.13
CUDA 11.5 GA	>=495.29.05	>=496.04
CUDA 11.4 Update 4	>=470.82.01	>=472.50
CUDA 11.4 Update 3	>=470.82.01	>=472.50
CUDA 11.4 Update 2	>=470.57.02	>=471.41
CUDA 11.4 Update 1	>=470.57.02	>=471.41
CUDA 11.4.0 GA	>=470.42.01	>=471.11
CUDA 11.3.1 Update 1	>=465.19.01	>=465.89
CUDA 11.3.0 GA	>=465.19.01	>=465.89
CUDA 11.2.2 Update 2	>=460.32.03	>=461.33
CUDA 11.2.1 Update 1	>=460.32.03	>=461.09
CUDA 11.2.0 GA	>=460.27.03	>=460.82
CUDA 11.1.1 Update 1	>=455.32	>=456.81
CUDA 11.1 GA	>=455.23	>=456.38
CUDA 11.0.3 Update 1	>= 450.51.06	>= 451.82
CUDA 11.0.2 GA	>= 450.51.05	>= 451.48
CUDA 11.0.1 RC	>= 450.36.06	>= 451.22
CUDA 10.2.89	>= 440.33	>= 441.22
CUDA 10.1 (10.1.105 general release, and updates)	>= 418.39	>= 418.96
CUDA 10.0.130	>= 410.48	>= 411.31
CUDA 9.2 (9.2.148 Update 1)	>= 396.37	>= 398.26
CUDA 9.2 (9.2.88)	>= 396.26	>= 397.44
CUDA 9.1 (9.1.85)	>= 390.46	>= 391.29
CUDA 9.0 (9.0.76)	>= 384.81	>= 385.54
CUDA 8.0 (8.0.61 GA2)	>= 375.26	>= 376.51
CUDA 8.0 (8.0.44)	>= 367.48	>= 369.30
CUDA 7.5 (7.5.16)	>= 352.31	>= 353.66
CUDA 7.0 (7.0.28)	>= 346.46	>= 347.62
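
Driver versions in Table 3 are dotted numbers, so comparing them as plain strings can give the wrong answer. The sketch below, which is only an illustration, parses each version into a tuple of integers before comparing. The three rows in `LINUX_MIN_DRIVER` are copied from the table; the helper names and the driver string "570.9" are hypothetical.

```python
# A few Linux x86_64 rows copied from Table 3; extend as needed.
LINUX_MIN_DRIVER = {
    "CUDA 13.0 Update 2": "580.95.05",
    "CUDA 12.8 GA": "570.26",
    "CUDA 11.8 GA": "520.61.05",
}


def parse_version(text):
    """Turn "570.124.06" into (570, 124, 6) so that fields compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_meets_toolkit(installed_driver, toolkit):
    """True if the installed driver is at least the version listed for the toolkit."""
    return parse_version(installed_driver) >= parse_version(LINUX_MIN_DRIVER[toolkit])


print(driver_meets_toolkit("570.124.06", "CUDA 12.8 GA"))  # True
print(driver_meets_toolkit("570.9", "CUDA 12.8 GA"))       # False (9 < 26)
```

A string comparison would rank "570.9" above "570.26", which is why the fields are converted to integers first.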

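ONNX Runtime compatibility tables:
https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements
Refer to the tables below for the official GPU package dependencies of the ONNX Runtime inference packages. Note that ONNX Runtime Training is aligned with PyTorch CUDA versions; see the "Optimize Training" tab on onnxruntime.ai for supported versions.
Because of NVIDIA CUDA minor version compatibility, ONNX Runtime built with CUDA 11.8 is compatible with any CUDA 11.x version, and ONNX Runtime built with CUDA 12.x is compatible with any CUDA 12.x version.
ONNX Runtime built with cuDNN 8.x is not compatible with cuDNN 9.x, and vice versa. Choose the package whose CUDA and cuDNN major versions match your runtime environment (for example, PyTorch 2.3 uses cuDNN 8.x, while PyTorch 2.4 or later uses cuDNN 9.x).
Note: Starting with version 1.19, CUDA 12.x is the default version for ONNX Runtime GPU packages distributed on PyPI.
To reduce the need for manual installation of CUDA and cuDNN, and to ensure seamless integration between ONNX Runtime and PyTorch, the onnxruntime-gpu Python package provides an API to load the CUDA and cuDNN dynamic link libraries (DLLs) appropriately. For details, see the "Compatibility with PyTorch" and "Preload DLLs" sections.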

CUDA 12.x

ONNX Runtime	CUDA	cuDNN	Notes
1.20.x	12.x	9.x	Available on PyPI. Compatible with PyTorch >= 2.4.0 for CUDA 12.x.
1.19.x	12.x	9.x	Available on PyPI. Compatible with PyTorch >= 2.4.0 for CUDA 12.x.
1.18.1	12.x	9.x	cuDNN 9 is required. No Java package.
1.18.0	12.x	8.x	Java package added.
1.17.x	12.x	8.x	Only C++/C# NuGet and Python packages are released. No Java package.
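
The CUDA 12.x table switches from cuDNN 8.x to 9.x between releases 1.18.0 and 1.18.1. The following sketch encodes just that table as a lookup; the function name is invented for this example, and versions outside the listed range (1.17.x to 1.20.x) are rejected rather than guessed.

```python
def cudnn_major_for_cuda12_build(ort_version):
    """Return the cuDNN major version listed for a CUDA 12.x build of ONNX Runtime.

    Covers only the releases in the CUDA 12.x table (1.17.x to 1.20.x).
    """
    parts = tuple(int(p) for p in ort_version.split("."))
    parts = parts + (0,) * (3 - len(parts))  # pad "1.19" to (1, 19, 0)
    if not (1, 17, 0) <= parts < (1, 21, 0):
        raise ValueError(f"{ort_version} is not covered by the table")
    # The table lists cuDNN 9.x from 1.18.1 onward and 8.x before that.
    return 9 if parts >= (1, 18, 1) else 8


print(cudnn_major_for_cuda12_build("1.18.0"))  # 8
print(cudnn_major_for_cuda12_build("1.18.1"))  # 9
```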

CUDA 11.x

ONNX Runtime	CUDA	cuDNN	Notes
1.20.x	11.8	8.x	Not available on PyPI. See Install ORT for details. Compatible with PyTorch <= 2.3.1 for CUDA 11.8.
1.19.x	11.8	8.x	Not available on PyPI. See Install ORT for details. Compatible with PyTorch <= 2.3.1 for CUDA 11.8.
1.18.x	11.8	8.x	Available on PyPI.
1.17, 1.16, 1.15	11.8	8.2.4 (Linux), 8.5.0.96 (Windows)	Tested with CUDA versions from 11.6 to 11.8 and cuDNN versions from 8.2 to 8.9.
1.14, 1.13	11.6	8.2.4 (Linux), 8.5.0.96 (Windows)	libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4
1.12, 1.11	11.4	8.2.4 (Linux), 8.2.2.26 (Windows)	libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4
1.10	11.4	8.2.4 (Linux), 8.2.2.26 (Windows)	libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4
1.9	11.4	8.2.4 (Linux), 8.2.2.26 (Windows)	libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4
1.8	11.0.3	8.0.4 (Linux), 8.0.2.39 (Windows)	libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4
1.7	11.0.3	8.0.4 (Linux), 8.0.2.39 (Windows)	libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4

CUDA 10.x

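ONNX Runtime	CUDA	cuDNN	Notes
1.5-1.6	10.2	8.0.3	CUDA 11 can be built from source.
1.2-1.4	10.1	7.6.5	Requires cublas10-10.2.1.243; cublas 10.1.x will not work.
1.0-1.1	10.0	7.6.4	CUDA versions from 9.1 to 10.1 and cuDNN versions from 7.1 to 7.4 should also work with Visual Studio 2017.

For older versions, refer to the README and build pages on the release branch.

Build

For build instructions, see the build page.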

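Compatibility with PyTorch

The onnxruntime-gpu package is designed to work seamlessly with PyTorch, provided both are built against the same major versions of CUDA and cuDNN. A CUDA-enabled PyTorch installation includes the necessary CUDA and cuDNN DLLs, so no separate installation of the CUDA Toolkit or cuDNN is needed.
To make sure ONNX Runtime uses the DLLs installed by PyTorch, preload these libraries before creating an inference session. You can do this either by importing PyTorch or by calling the onnxruntime.preload_dlls() function.
Example 1: Importing PyTorch

# Importing torch preloads the necessary DLLs. It must be done before creating the session.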
import torch
import onnxruntime

# Create an inference session with CUDA execution provider
session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])

Example 2: Using the preload_dlls function

import onnxruntime

# Preload necessary DLLs
onnxruntime.preload_dlls()

# Create an inference session with CUDA execution provider
session = onnxruntime.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
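
Both examples request only the CUDA execution provider. As a hedged sketch of a more defensive setup, the helper below keeps the CPU provider as a fallback and reports which providers the session actually uses. The names `choose_providers` and `create_session` are invented for illustration; `get_available_providers()` and `InferenceSession.get_providers()` are the ONNX Runtime calls this assumes.

```python
def choose_providers(available):
    """Prefer the CUDA execution provider, keeping CPU as a fallback."""
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def create_session(model_path):
    """Create a session with whichever providers this installation offers."""
    import onnxruntime  # imported lazily so choose_providers works without it

    providers = choose_providers(onnxruntime.get_available_providers())
    session = onnxruntime.InferenceSession(model_path, providers=providers)
    # If CUDA or cuDNN DLLs failed to load, only the CPU provider shows up here.
    print("Active providers:", session.get_providers())
    return session
```

Checking the active providers after session creation is a quick way to notice a silent fallback to CPU when the CUDA or cuDNN versions do not match.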

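Preload DLLs

Starting with version 1.21.0, the onnxruntime-gpu package provides the preload_dlls function to preload CUDA, cuDNN, and Microsoft Visual C++ (MSVC) runtime DLLs. The function lets you specify which libraries to load and from which directories.
Function signature: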

onnxruntime.preload_dlls(cuda=True, cudnn=True, msvc=True, directory=None)

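Parameters:

cuda (bool): if True, preload the CUDA DLLs.
cudnn (bool): if True, preload the cuDNN DLLs.
msvc (bool): if True, preload the MSVC runtime DLLs.
directory (str or None): the directory to load the DLLs from.
- None: search the default directories.
- "" (empty string): search the NVIDIA site packages.
- A specific path: load the DLLs from that directory.

Default search order:
When directory=None, the function searches for CUDA and cuDNN DLLs in the following order:

1. On Windows, the lib directory under the PyTorch installation.
2. The Python site-packages directories of the NVIDIA CUDA or cuDNN libraries (for example, nvidia_cuda_runtime_cu12 and nvidia_cudnn_cu12).
3. Fall back to the default DLL loading behavior.

Preloading the necessary DLLs with the default search order ensures that ONNX Runtime works seamlessly with PyTorch.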
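
Since preload_dlls only exists from version 1.21.0 onward, code that must also run on older releases may want to detect it first. The sketch below uses simple feature detection; the helper name `preload_if_available` is hypothetical, and the module is passed in as an argument so that the helper itself has no hard dependency on onnxruntime.

```python
def preload_if_available(ort_module, **kwargs):
    """Call ort_module.preload_dlls(**kwargs) if that function exists.

    Returns True when the preload call was made, False when the installed
    package predates preload_dlls and nothing was done.
    """
    preload = getattr(ort_module, "preload_dlls", None)
    if preload is None:
        return False
    preload(**kwargs)
    return True


# Usage (requires onnxruntime-gpu to be installed):
#   import onnxruntime
#   preload_if_available(onnxruntime)                # default search order
#   preload_if_available(onnxruntime, directory="")  # NVIDIA site packages only
```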
Installing CUDA and cuDNN via onnxruntime-gpu:
You can use pip to install the necessary CUDA and cuDNN runtime DLLs together with the onnxruntime-gpu package:

pip install onnxruntime-gpu[cuda,cudnn]

Preloading DLLs from the NVIDIA site packages:
To preload the CUDA and cuDNN DLLs from the NVIDIA site packages and print debug information:

import onnxruntime

# Preload DLLs from NVIDIA site packages
onnxruntime.preload_dlls(directory="")

# Print debug information
onnxruntime.print_debug_info()

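Loading DLLs from a specific directory:
To load DLLs from a specific location, set the directory parameter to an absolute path or to a path relative to the ONNX Runtime package root.
Example: load CUDA from the system installation and cuDNN from the NVIDIA site package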

import os
import onnxruntime

# Load CUDA DLLs from system installation
cuda_path = os.path.join(os.environ["CUDA_PATH"], "bin")
onnxruntime.preload_dlls(cuda=True, cudnn=False, directory=cuda_path)

# Load cuDNN DLLs from NVIDIA site package
onnxruntime.preload_dlls(cuda=False, cudnn=True, directory="..\\nvidia\\cudnn\\bin")

# Print debug information
onnxruntime.print_debug_info()
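
The example above reads os.environ["CUDA_PATH"] directly, which raises KeyError when the variable is not set. A small guard such as the following sketch avoids that; the helper name `cuda_bin_dir` is an assumption made for illustration, not part of ONNX Runtime.

```python
import os


def cuda_bin_dir(environ=os.environ):
    """Return <CUDA_PATH>/bin if CUDA_PATH is set and that directory exists, else None."""
    cuda_path = environ.get("CUDA_PATH")
    if not cuda_path:
        return None
    bin_dir = os.path.join(cuda_path, "bin")
    return bin_dir if os.path.isdir(bin_dir) else None


# Usage (requires onnxruntime-gpu 1.21.0 or later):
#   import onnxruntime
#   bin_dir = cuda_bin_dir()
#   if bin_dir is not None:
#       onnxruntime.preload_dlls(cuda=True, cudnn=False, directory=bin_dir)
```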

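Configuration options

The CUDA Execution Provider supports the following configuration options.

device_id

The device ID.
Default value: 0

user_compute_stream

Defines the compute stream on which inference runs. It implicitly sets the has_user_compute_stream option. It cannot be set through UpdateCUDAProviderOptions, only through UpdateCUDAProviderOptionsWithValue. It cannot be used in combination with an external allocator.
Example Python usage: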

import torch
import onnxruntime as ort

# Run inference on PyTorch's current CUDA device and stream
providers = [("CUDAExecutionProvider", {"device_id": torch.cuda.current_device(),
                                        "user_compute_stream": str(torch.cuda.current_stream().cuda_stream)})]
sess_options = ort.SessionOptions()
sess = ort.InferenceSession("my_model.onnx", sess_options=sess_options, providers=providers)

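To take advantage of a user compute stream, it is recommended to use I/O binding to bind inputs and outputs to tensors on the device.

do_copy_in_default_stream

Whether to perform copies in the default stream or in a separate stream. The recommended setting is true. Setting it to false may improve performance but introduces race conditions.
Default value: true

use_ep_level_unified_stream

Uses the same CUDA stream for all threads of the CUDA EP. This is implicitly enabled by has_user_compute_stream, by enable_cuda_graph, or when an external allocator is used.
Default value: false

gpu_mem_limit

The size limit of the device memory arena, in bytes. This limit applies only to the execution provider's arena; total device memory usage may be higher.
Default value: the maximum value of the C++ size_t type (effectively unlimited)
Note: overridden by the contents of default_memory_arena_cfg, if specified.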
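
Because gpu_mem_limit is expressed in bytes, it is easy to be off by a factor of 1024. The sketch below shows one way to build the value from GiB; the helper `gib_to_bytes` and the 2 GiB figure are illustrative choices, and the providers list assumes the option is passed through the Python provider-options dictionary.

```python
def gib_to_bytes(gib):
    """Convert GiB to the byte count that gpu_mem_limit expects."""
    return int(gib * 1024 ** 3)


# Cap the CUDA EP arena at 2 GiB on device 0, with CPU as a fallback provider.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0, "gpu_mem_limit": gib_to_bytes(2)}),
    "CPUExecutionProvider",
]
print(providers[0][1]["gpu_mem_limit"])  # 2147483648
```

The list can then be passed as the providers argument of InferenceSession; the limit covers only the arena, so overall GPU memory use can still exceed it.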

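arena_extend_strategy

The strategy for extending the device memory arena.

Value	Description
kNextPowerOfTwo (0)	Subsequent extensions grow by larger amounts (multiplied by powers of two)
kSameAsRequested (1)	Extend by the requested amount

Default value: kNextPowerOfTwo
Note: overridden by the contents of default_memory_arena_cfg, if specified.

cudnn_conv_algo_search

The type of search performed for cuDNN convolution algorithms.

Value	Description
EXHAUSTIVE (0)	Expensive exhaustive benchmarking using cudnnFindConvolutionForwardAlgorithmEx
HEURISTIC (1)	Lightweight heuristic-based search using cudnnGetConvolutionForwardAlgorithm_v7
DEFAULT (2)	Default algorithm, using CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM

Default value: EXHAUSTIVE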
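
The two enumerated options can be set by name from Python. The following sketch, which assumes the Python API accepts the enum names as strings, validates the names against the tables for arena_extend_strategy and cudnn_conv_algo_search before building the options dictionary. The function `cuda_provider_options` is a hypothetical helper.

```python
# Names copied from the arena_extend_strategy and cudnn_conv_algo_search tables.
ARENA_EXTEND_STRATEGIES = ("kNextPowerOfTwo", "kSameAsRequested")
CUDNN_CONV_ALGO_SEARCHES = ("EXHAUSTIVE", "HEURISTIC", "DEFAULT")


def cuda_provider_options(arena_extend_strategy="kNextPowerOfTwo",
                          cudnn_conv_algo_search="EXHAUSTIVE"):
    """Build a CUDA provider options dict, rejecting names not in the tables."""
    if arena_extend_strategy not in ARENA_EXTEND_STRATEGIES:
        raise ValueError(f"unknown arena_extend_strategy: {arena_extend_strategy!r}")
    if cudnn_conv_algo_search not in CUDNN_CONV_ALGO_SEARCHES:
        raise ValueError(f"unknown cudnn_conv_algo_search: {cudnn_conv_algo_search!r}")
    return {
        "arena_extend_strategy": arena_extend_strategy,
        "cudnn_conv_algo_search": cudnn_conv_algo_search,
    }


# HEURISTIC trades the exhaustive benchmark for a lighter search.
providers = [("CUDAExecutionProvider", cuda_provider_options(cudnn_conv_algo_search="HEURISTIC"))]
```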

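cudnn_conv_use_max_workspace

See "Tuning performance for convolution-heavy models" in the ONNX Runtime documentation for details on what this flag does. When using the C API, this flag is supported only by the V2 version of the provider options struct (a sample appears in the upstream documentation).
Default value: 1 for versions 1.14 and later; 0 for earlier versions

cudnn_conv1d_pad_to_nc1d

See "Convolution input padding in the CUDA EP" in the ONNX Runtime documentation for details on what this flag does. When using the C API, this flag is supported only by the V2 version of the provider options struct (a sample appears in the upstream documentation).
Default value: 0

enable_cuda_graph

See "Using CUDA Graphs in the CUDA EP" in the ONNX Runtime documentation for details on what this flag does. When using the C API, this flag is supported only by the V2 version of the provider options struct (a sample appears in the upstream documentation).
Default value: 0

enable_skip_layer_norm_strict_mode

Whether to use strict mode in the SkipLayerNormalization CUDA implementation. The default and recommended setting is false. Enabling it can be expected to improve accuracy at the cost of performance. When using the C API, this flag is supported only by the V2 version of the provider options struct (a sample appears in the upstream documentation).
Default value: 0

use_tf32

TF32 is a math mode available on NVIDIA GPUs since Ampere. It allows certain float32 matrix multiplications and convolutions to run faster on tensor cores at reduced TensorFloat-32 precision: float32 inputs are rounded to 10 bits of mantissa, and results are accumulated in float32 precision.
Default value: 1
TensorFloat-32 is enabled by default. Starting with ONNX Runtime 1.18, this flag can be used to disable it for an inference session.
Example Python usage: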

import onnxruntime as ort

# Disable TF32 for this inference session
providers = [("CUDAExecutionProvider", {"use_tf32": 0})]
sess_options = ort.SessionOptions()
sess = ort.InferenceSession("my_model.onnx", sess_options=sess_options, providers=providers)

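When using the C API, this flag is supported only by the V2 version of the provider options struct (a sample appears in the upstream documentation).

gpu_external_[alloc|free|empty_cache]

The gpu_external_* options are used to pass an external allocator. Example Python usage: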

from onnxruntime.training.ortmodule.torch_cpp_extensions import torch_gpu_allocator

# Collect the allocator function addresses as strings, keyed by option name
provider_option_map = {}
provider_option_map["gpu_external_alloc"] = str(torch_gpu_allocator.gpu_caching_allocator_raw_alloc_address())
provider_option_map["gpu_external_free"] = str(torch_gpu_allocator.gpu_caching_allocator_raw_delete_address())
provider_option_map["gpu_external_empty_cache"] = str(torch_gpu_allocator.gpu_caching_allocator_empty_cache_address())

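Default value: 0

prefer_nhwc

This option is available since ONNX Runtime 1.20, where builds have onnxruntime_USE_CUDA_NHWC_OPS=ON by default.
If this option is enabled, the execution provider prefers NHWC operators over NCHW, and the necessary layout transformations are applied to the model automatically. Because NVIDIA tensor cores operate more efficiently with the NHWC layout, enabling this option can improve performance when the model consists of many supported operators and does not require excessive additional transpose operations. Wider operator support for NHWC is planned for future releases.
When using the C API, this flag is supported only by the V2 version of the provider options struct. The V2 provider options struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions.
Default value: 0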
