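I will try to speed up the translation. If you have questions or comments, please leave a message at http://blog.csdn.net/cloudqiu or send an email to knowthyself.cn at gmail. I will reply as soon as I can.
If you are interested in helping with the translation, please contact me so that tasks can be assigned. The git repo is at https://github.com/cloudqiu1110/Vulkan-Docs-CN.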
Contributor     Chapters

knowthyself     1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13

How to translate:
1. git clone https://github.com/cloudqiu1110/Vulkan-Docs-CN, then git checkout 1.0-CN
2. Spend twenty minutes getting familiar with the asciidoc guide at https://github.com/stanzgy/wiki/blob/master/markup/asciidoc-guide.asciidoc
3. Pick the txt file for your chapter in Vulkan-Docs-CN/doc/specs/vulkan/chapters and start translating
4. Push and submit a pull request
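Setting up the build environment:
I tried briefly on Windows and found it quite difficult, so a simple Linux environment is the better choice: Ubuntu 16 installed in VirtualBox.
   sudo apt install cmake make gcc asciidoctor ruby ruby-dev git
   sudo gem install coderay
On CentOS-family systems, also run:
   gem install asciidoctor
   sudo yum install python34
Then build:
   cd Vulkan-Docs-CN/doc/specs/vulkan
   git checkout 1.0-CN
   make html

The generated html files are in Vulkan-Docs-CN/out/1.0/html/; the single-file html is about 5 MB.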

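1. Introduction

This chapter is informative except for the sections on Terminology and Normative References.

This document, referred to as the “Vulkan Specification” or just the “Specification” hereafter, describes the Vulkan graphics system: what it is, how it acts, and what is required to implement it. We assume that the reader has at least a rudimentary understanding of computer graphics. This means familiarity with the essentials of computer graphics algorithms and terminology as well as with modern GPUs (Graphic Processing Units).

The canonical version of the Specification is available in the official Vulkan Registry, located at the URL below.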

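1.1. What is the Vulkan Graphics System?

Vulkan is an API (Application Programming Interface) for graphics and compute hardware. The API consists of many commands that allow a programmer to specify shader programs, compute kernels, objects, and the operations involved in producing high-quality graphical images, specifically color images of three-dimensional objects.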

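1.1.1. The Programmer's View of Vulkan

To the programmer, Vulkan is a set of commands that allow the specification of shader programs or shaders, kernels, data used by kernels or shaders, and state controlling aspects of Vulkan outside of shader execution. Typically, the data represents geometry in two or three dimensions and texture images, while the shaders and kernels control the processing of the data, rasterization of the geometry, and the lighting and shading of fragments generated by rasterization, resulting in the rendering of geometry into the framebuffer.

A typical Vulkan program begins with platform-specific calls to open a window or otherwise prepare a display device onto which the program will draw. Then, calls are made to open queues to which command buffers are submitted. The command buffers contain lists of commands which will be executed by the underlying hardware. The application can also allocate device memory, associate resources with memory, and refer to these resources from within command buffers. Drawing commands cause application-defined shader programs to be invoked, which can then consume the data in the resources and use them to produce graphical images. To display the resulting images, further platform-specific commands are made to transfer the resulting image to a display device or window.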

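1.1.2. The Implementor's View of Vulkan

To the implementor, Vulkan is a set of commands that allow the construction and submission of command buffers to a device. Modern devices accelerate virtually all Vulkan operations, storing data and framebuffer images in high-speed memory and executing shaders in dedicated GPU processing resources. The implementor's task is to provide a software library on the host which implements the Vulkan API, while mapping the work for each Vulkan command to the graphics hardware as appropriate for the capabilities of the device.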

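1.1.3. Our View of Vulkan

We view Vulkan as a pipeline having some programmable stages and some state-driven fixed-function stages that are invoked by a set of specific drawing operations. We expect this model to result in a specification that satisfies the needs of both programmers and implementors. It does not, however, necessarily provide a model for implementation. An implementation must produce results conforming to those produced by the specified methods, but may carry out particular computations in ways that are more efficient than the one specified.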

1.2. Filing Bug Reports

Issues with and bug reports on the Vulkan Specification and the API Registry can be filed in the Khronos Vulkan GitHub repository, located at URL

Please tag issues with appropriate labels, such as “Specification”, “Ref Pages” or “Registry”, to help us triage and assign them appropriately. Unfortunately, GitHub does not currently let users who do not have write access to the repository set GitHub labels on issues. In the meantime, they can be added to the title line of the issue set in brackets, e.g. “[Specification]”.

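1.3. Terminology

The key words must, required, shall, should, recommend, may, and optional in this document are to be interpreted as described in RFC 2119.

must

When used alone in this document, this word, or the adjective required, means that the item is an absolute requirement.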

should

When used alone, this word, or the adjective recommended, means that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. When followed by not (“should not”), the phrase means that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

may

This word, or the adjective optional, means that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option must be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option must be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides).

The additional terms can and cannot are to be interpreted as follows:

can

This word means that the particular behavior described is a valid choice for an application, and is never used to refer to implementation behavior.

cannot

This word means that the particular behavior described is not achievable by an application. For example, an entry point does not exist, or shader code is not capable of expressing an operation.

Note

There is an important distinction between cannot and must not, as used in this Specification. Cannot means something the application literally is unable to express or accomplish through the API, while must not means something that the application is capable of expressing through the API, but that the consequences of doing so are undefined and potentially unrecoverable for the implementation.

editing-note

TODO (Jon) - We might need to augment the RFC 2119 definition of must not to include some of the previous note, since at present it is defined solely in terms of implementation behavior. See Gitlab issue #9.

1.4. Normative References

Normative references are references to external documents or resources to which implementers of Vulkan must comply.

IEEE Standard for Floating-Point Arithmetic, IEEE Std 754-2008, http://dx.doi.org/10.1109/IEEESTD.2008.4610935, August, 2008.

A. Garrard, Khronos Data Format Specification, version 1.1, https://www.khronos.org/registry/dataformat/specs/1.1/dataformat.1.1.html, June, 2016.

J. Kessenich, SPIR-V Extended Instructions for GLSL, Version 1.00, https://www.khronos.org/registry/spir-v/, February 10, 2016.

J. Kessenich and B. Ouriel, The Khronos SPIR-V Specification, Version 1.00, https://www.khronos.org/registry/spir-v/, February 10, 2016.

J. Leech and T. Hector, Vulkan Documentation and Extensions: Procedures and Conventions, https://www.khronos.org/registry/vulkan/, July 11, 2016

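2. Fundamentals

This chapter introduces fundamental concepts including the Vulkan architecture and execution model, API syntax, queues, pipeline configurations, numeric representation, state and state queries, and the different types of objects and shaders. It provides a framework for interpreting the more specific descriptions of commands and behavior in the remainder of the Specification.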

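2.1. Architecture Model

Vulkan is designed for, and the API is written for, CPU, GPU, and other hardware accelerator architectures with the following properties:

  • Runtime support for 8, 16, 32 and 64-bit signed and unsigned integers, all addressable at the granularity of their size.

  • Runtime support for 32- and 64-bit floating-point types satisfying the range and precision constraints in the Floating Point Computation section.

  • The representation and endianness of these types must be identical for the host and the devices.

Note

Since a variety of data types and structures in Vulkan may be mapped back and forth between host and device memory, host and device architectures must both be able to access such data efficiently in order to write portable and performant applications.

Where the Specification leaves choices open that would affect ABI (Application Binary Interface) compatibility on a given platform supporting Vulkan, those choices are usually made to be compliant with the preferred ABI defined by the platform vendor. Some choices, such as function calling conventions, may be made in platform-specific portions of the vk_platform.h header file.

Note

For example, the Android ABI is defined by Google, and the Linux ABI is defined by a combination of GCC defaults, distribution vendor choices, and external standards such as the Linux Standard Base.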

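2.2. Execution Model

This section outlines the execution model of a Vulkan system.

Vulkan exposes one or more devices, each of which exposes one or more queues which may process work asynchronously to one another. The set of queues supported by a device is partitioned into families. Each family supports one or more types of functionality and may contain multiple queues with similar characteristics. Queues within a single family are considered compatible with one another, and work produced for a family of queues can be executed on any queue within that family. This Specification defines four types of functionality that queues may support: graphics, compute, transfer, and sparse memory management.

Note

A single device may report multiple similar queue families rather than, or as well as, reporting multiple members of one or more of those families. This indicates that while members of those families have similar capabilities, they are not directly compatible with one another.

Device memory is explicitly managed by the application. Each device may advertise one or more heaps, representing different areas of memory. Memory heaps are either device local or host local, but are always visible to the device. Further detail about memory heaps is exposed via the memory types available on that heap. Examples of memory areas that may be available on an implementation include:

  • device local is memory that is physically connected to the device.

  • device local, host visible is device local memory that is visible to the host.

  • host local, host visible is memory that is local to the host and visible to the device and host.

On other architectures, there may only be a single heap that can be used for any purpose.

A Vulkan application controls a set of devices through the submission of command buffers which have recorded device commands issued via Vulkan library calls. The content of command buffers is specific to the underlying hardware and is opaque to the application. Once constructed, a command buffer can be submitted once or many times to a queue for execution. Multiple command buffers can be built in parallel by employing multiple threads within the application.

Command buffers submitted to different queues may execute in parallel or even out of order with respect to one another. Command buffers submitted to a single queue respect submission order, as described further in the synchronization chapter. Command buffer execution by the device is also asynchronous to host execution. Once a command buffer is submitted to a queue, control may return to the application immediately. Synchronization between the device and host, and between different queues, is the responsibility of the application.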

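2.2.1. Queue Operation

Vulkan queues provide an interface to the execution engines of a device. Commands for these execution engines are recorded into command buffers ahead of execution time. These command buffers are then submitted to queues with a queue submission command for execution in a number of batches. Once submitted to a queue, these commands will begin and complete execution without further application intervention, though the order of this execution is dependent on a number of implicit and explicit ordering constraints.

Work is submitted to queues using queue submission commands that typically take the form vkQueue* (e.g. vkQueueSubmit, vkQueueBindSparse), and optionally take a list of semaphores upon which to wait before work begins and a list of semaphores to signal once work has completed. The work itself, as well as signaling and waiting on the semaphores, are all queue operations.

Queue operations on different queues have no implicit ordering constraints, and may execute in any order. Explicit ordering constraints between queues can be expressed with semaphores and fences.

Command buffer submissions to a single queue respect submission order and other implicit ordering guarantees, but otherwise may overlap or execute out of order. Other types of batches and queue submissions against a single queue have no implicit ordering constraints with any other queue submission or batch. Additional explicit ordering constraints between queue submissions and individual batches can be expressed with semaphores and fences.

Before a fence or semaphore is signaled, it is guaranteed that any previously submitted queue operations have completed execution, and that memory writes from those queue operations are available to future queue operations. Waiting on a signaled semaphore or fence guarantees that previous writes that are available are also visible to subsequent commands.

Command buffer boundaries, both between primary command buffers of the same or different batches or submissions as well as between primary and secondary command buffers, do not introduce any additional ordering constraints. In other words, submitting the set of command buffers (which can include executing secondary command buffers) between any semaphore or fence operations executes the recorded commands as if they had all been recorded into a single primary command buffer, except that the current state is reset on each boundary. Explicit ordering constraints can be expressed with explicit synchronization primitives.

There are a few implicit ordering guarantees between commands within a command buffer, but they cover only a subset of execution. Additional explicit ordering constraints can be expressed with the various explicit synchronization primitives.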

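Note

Implementations have significant freedom to overlap execution of work submitted to a queue, and this is common due to deep pipelining and parallelism in Vulkan devices.

Commands recorded in command buffers either perform actions (draw, dispatch, clear, copy, query/timestamp operations, begin/end subpass operations), set state (bind pipelines, descriptor sets, and buffers, set dynamic state, push constants, set render pass/subpass state), or perform synchronization (set/wait events, pipeline barriers, render pass/subpass dependencies). Some commands perform more than one of these tasks. State setting commands update the current state of the command buffer. Some commands that perform actions (e.g. draw/dispatch) do so based on the current state set cumulatively since the start of the command buffer. The work involved in performing action commands is often allowed to overlap or to be reordered, but doing so must not alter the state to be used by each action command. In general, action commands are those commands that alter framebuffer attachments, read/write buffer or image memory, or write to query pools.

Synchronization commands introduce explicit execution and memory dependencies between two sets of action commands, where the second set of commands depends on the first set of commands. These dependencies enforce both that the execution of certain pipeline stages in the later set occurs after the execution of certain stages in the source set, and that the effects of memory accesses performed by certain pipeline stages occur in order and are visible to each other. When not enforced by an explicit dependency or implicit ordering guarantees, action commands may overlap execution or execute out of order, and may not see the side effects of each other's memory accesses.

The device executes queue operations asynchronously with respect to the host. Control is returned to the application immediately following command buffer submission to a queue. The application must synchronize work between the host and device as needed.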

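2.3. Object Model

The devices, queues, and other entities in Vulkan are represented by Vulkan objects. At the API level, all objects are referred to by handles. There are two classes of handles, dispatchable and non-dispatchable. Dispatchable handle types are a pointer to an opaque type. This pointer may be used by layers as part of intercepting API commands, and thus each API command takes a dispatchable type as its first parameter. Each object of a dispatchable type must have a unique handle value during its lifetime.

Non-dispatchable handle types are a 64-bit integer type whose meaning is implementation-dependent, and may encode object information directly in the handle rather than pointing to a software structure. Objects of a non-dispatchable type may not have unique handle values within a type or across types. If handle values are not unique, then destroying one such handle must not cause identical handles of other types to become invalid, and must not cause identical handles of the same type to become invalid if that handle value has been created more times than it has been destroyed.

All objects created or allocated from a VkDevice (i.e. with a VkDevice as the first parameter) are private to that device, and must not be used on other devices.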

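2.3.1. Object Lifetime

Objects are created or allocated by vkCreate* and vkAllocate* commands, respectively. Once an object is created or allocated, its “structure” is considered to be immutable, though the contents of certain object types are still free to change. Objects are destroyed or freed by vkDestroy* and vkFree* commands, respectively.

Objects that are allocated (rather than created) take resources from an existing pool object or memory heap, and when freed return resources to that pool or heap. While object creation and destruction are generally expected to be low-frequency occurrences during runtime, allocating and freeing objects can occur at high frequency. Pool objects help accommodate improved performance of the allocations and frees.

It is an application's responsibility to track the lifetime of Vulkan objects, and not to destroy them while they are still in use.

Application-owned memory is immediately consumed by any Vulkan command it is passed into. The application can alter or free this memory as soon as the commands that consume it have returned.

The following object types are consumed when they are passed into a Vulkan command and are not further accessed by the objects they are used to create. They must not be destroyed in the duration of any API command they are passed into: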

  • VkShaderModule

  • VkPipelineCache

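A VkPipelineLayout object must not be destroyed while any command buffer that uses it is in the recording state.

VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout, and those descriptor sets must not be updated with vkUpdateDescriptorSets after the descriptor set layout has been destroyed. Otherwise, descriptor set layouts can be destroyed any time they are not in use by an API command.

The application must not destroy any other type of Vulkan object until all uses of that object by the device (such as via command buffer execution) have completed.

The following Vulkan objects must not be destroyed while any command buffers using the object are recording or pending execution: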

  • VkEvent

  • VkQueryPool

  • VkBuffer

  • VkBufferView

  • VkImage

  • VkImageView

  • VkPipeline

  • VkSampler

  • VkDescriptorPool

  • VkFramebuffer

  • VkRenderPass

  • VkCommandPool

  • VkDeviceMemory

  • VkDescriptorSet

The following Vulkan objects must not be destroyed while any queue is executing commands that use the object:

  • VkFence

  • VkSemaphore

  • VkCommandBuffer

  • VkCommandPool

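In general, objects can be destroyed or freed in any order, even if the object being freed is involved in the use of another object (e.g. use of a resource in a view, use of a view in a descriptor set, use of an object in a command buffer, binding of a memory allocation to a resource), as long as any object that uses the freed object is not further used in any way except to be destroyed or to be reset in such a way that it no longer uses the other object (such as resetting a command buffer). If the object has been reset, then it can be used as if it never used the freed object. An exception to this is when there is a parent/child relationship between objects. In this case, the application must not destroy a parent object before its children, except when the parent is explicitly defined to free its children when it is destroyed (e.g. for pool objects, as defined below).

VkCommandPool objects are parents of VkCommandBuffer objects. VkDescriptorPool objects are parents of VkDescriptorSet objects. VkDevice objects are parents of many object types (all that take a VkDevice as a parameter to their creation).

The following Vulkan objects have specific restrictions for when they can be destroyed:

  • VkQueue objects cannot be explicitly destroyed. Instead, they are implicitly destroyed when the VkDevice object they are retrieved from is destroyed.

  • Destroying a pool object implicitly frees all objects allocated from that pool. Specifically, destroying a VkCommandPool frees all VkCommandBuffer objects that were allocated from it, and destroying a VkDescriptorPool frees all VkDescriptorSet objects that were allocated from it.

  • VkDevice objects can be destroyed when all VkQueue objects retrieved from them are idle, and all objects created from them have been destroyed. This includes the following objects: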

    • VkFence

    • VkSemaphore

    • VkEvent

    • VkQueryPool

    • VkBuffer

    • VkBufferView

    • VkImage

    • VkImageView

    • VkShaderModule

    • VkPipelineCache

    • VkPipeline

    • VkPipelineLayout

    • VkSampler

    • VkDescriptorSetLayout

    • VkDescriptorPool

    • VkFramebuffer

    • VkRenderPass

    • VkCommandPool

    • VkCommandBuffer

    • VkDeviceMemory

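  • VkPhysicalDevice objects cannot be explicitly destroyed. Instead, they are implicitly destroyed when the VkInstance object they are retrieved from is destroyed.

  • VkInstance objects can be destroyed once all VkDevice objects created from any of its VkPhysicalDevice objects have been destroyed.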

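2.4. Command Syntax and Duration

The Specification describes Vulkan commands as functions or procedures using C99 syntax. Language bindings for other languages such as C++ and JavaScript may allow for stricter parameter passing, or object-oriented interfaces. Vulkan uses the standard C types for the base type of scalar parameters (e.g. types from stdint.h), with exceptions described below, or elsewhere in the text when appropriate:

VkBool32 represents boolean True and False values, since C does not have a sufficiently portable built-in boolean type: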

typedef uint32_t VkBool32;

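VK_TRUE represents a boolean True (integer 1) value, and VK_FALSE a boolean False (integer 0) value.

All values returned from a Vulkan implementation in a VkBool32 will be either VK_TRUE or VK_FALSE.

Applications must not pass any values other than VK_TRUE or VK_FALSE into a Vulkan implementation where a VkBool32 is expected.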
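Because arbitrary non-zero C truth values are not valid VkBool32 inputs, applications often normalize them before passing them on. The following is a minimal sketch of that idea; the typedef and macros mirror the definitions in this section so that it compiles without the Vulkan headers, and `to_vk_bool` is a hypothetical helper name, not part of the API.

```c
#include <stdint.h>

/* Local mirrors of the definitions above, so the sketch compiles without vulkan.h. */
typedef uint32_t VkBool32;
#define VK_TRUE  1
#define VK_FALSE 0

/* Hypothetical helper (not part of the API): collapse any C truth value
 * to exactly VK_TRUE or VK_FALSE before handing it to Vulkan. */
static VkBool32 to_vk_bool(int condition)
{
    return condition ? VK_TRUE : VK_FALSE;
}
```

A call such as `to_vk_bool(flags & 0x10)` would then yield VK_TRUE rather than the raw value 0x10.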

VkDeviceSize represents device memory size and offset values:

typedef uint64_t VkDeviceSize;

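Commands that create Vulkan objects are of the form vkCreate* and take Vk*CreateInfo structures with the parameters needed to create the object. These Vulkan objects are destroyed with commands of the form vkDestroy*.

The last in-parameter to each command that creates or destroys a Vulkan object is pAllocator. The pAllocator parameter can be set to a non-NULL value such that allocations for the given object are delegated to an application-provided callback; refer to the Memory Allocation chapter for further details.

Commands that allocate Vulkan objects owned by pool objects are of the form vkAllocate*, and take Vk*AllocateInfo structures. These Vulkan objects are freed with commands of the form vkFree*. These objects do not take allocators; if host memory is needed, they will use the allocator that was specified when their parent pool was created.

Commands are recorded into a command buffer by calling API commands of the form vkCmd*. Each such command may have different restrictions on where it can be used: in a primary and/or secondary command buffer, inside and/or outside a render pass, and in one or more of the supported queue types. These restrictions are documented together with the definition of each such command.

The duration of a Vulkan command refers to the interval between calling the command and its return to the caller.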

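2.4.1. Lifetime of Retrieved Results

Information is retrieved from the implementation with commands of the form vkGet* and vkEnumerate*.

Unless otherwise specified for an individual command, the results are invariant; that is, they will remain unchanged when retrieved again by calling the same command with the same parameters, so long as those parameters themselves all remain valid.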

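2.5. Threading Behavior

Vulkan is intended to provide scalable performance when used on multiple host threads. All commands support being called concurrently from multiple threads, but certain parameters, or components of parameters, are defined to be externally synchronized. This means that the caller must guarantee that no more than one thread is using such a parameter at a given time.

More precisely, Vulkan commands use simple stores to update the state of Vulkan objects. A parameter declared as externally synchronized may have its contents updated at any time during the host execution of the command. If two commands operate on the same object and at least one of the commands declares the object to be externally synchronized, then the caller must guarantee not only that the commands do not execute simultaneously, but also that the two commands are separated by an appropriate memory barrier (if needed).

Note

Memory barriers are particularly relevant on the ARM CPU architecture, which is more weakly ordered than many developers are accustomed to from x86/x64 programming. Fortunately, most higher-level synchronization primitives (like the pthread library) perform memory barriers as a part of mutual exclusion, so mutexing Vulkan objects via these primitives will have the desired effect.

Many object types are immutable, meaning the objects cannot change once they have been created. These types of objects never need external synchronization, except that they must not be destroyed while they are in use on another thread. In certain special cases, mutable object parameters are internally synchronized such that they do not require external synchronization. One example of this is the use of a VkPipelineCache in vkCreateGraphicsPipelines and vkCreateComputePipelines, where external synchronization around such a heavyweight command would be impractical. The implementation must internally synchronize the cache in this example. Additionally, certain objects related to a command's parameters (e.g. command pools and descriptor pools) may be affected by a command, and must also be externally synchronized. These implicit parameters are documented as described below.

Parameters of commands that are externally synchronized are listed below.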

Externally Synchronized Parameters

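There are also a few instances where a command can take in a user-allocated list whose contents are externally synchronized parameters. In these cases, the caller must guarantee that at most one thread is using a given element within the list at a given time. These parameters are listed below.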

Externally Synchronized Parameter Lists
  • Each element of the pWaitSemaphores member of each element of the pSubmits parameter in vkQueueSubmit

  • Each element of the pSignalSemaphores member of each element of the pSubmits parameter in vkQueueSubmit

  • Each element of the pWaitSemaphores member of each element of the pBindInfo parameter in vkQueueBindSparse

  • Each element of the pSignalSemaphores member of each element of the pBindInfo parameter in vkQueueBindSparse

  • The buffer member of each element of the pBufferBinds member of each element of the pBindInfo parameter in vkQueueBindSparse

  • The image member of each element of the pImageOpaqueBinds member of each element of the pBindInfo parameter in vkQueueBindSparse

  • The image member of each element of the pImageBinds member of each element of the pBindInfo parameter in vkQueueBindSparse

  • Each element of the pFences parameter in vkResetFences

  • Each element of the pDescriptorSets parameter in vkFreeDescriptorSets

  • The dstSet member of each element of the pDescriptorWrites parameter in vkUpdateDescriptorSets

  • The dstSet member of each element of the pDescriptorCopies parameter in vkUpdateDescriptorSets

  • Each element of the pCommandBuffers parameter in vkFreeCommandBuffers

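In addition, there are some implicit parameters that need to be externally synchronized. For example, all commandBuffer parameters that need to be externally synchronized imply that the commandPool that was passed in when creating that command buffer also needs to be externally synchronized. The implicit parameters and their associated objects are listed below.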

Implicit Externally Synchronized Parameters

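2.6. Errors

Vulkan is a layered API. The lowest layer is the core Vulkan layer, as defined by this Specification. The application can use additional layers above the core for debugging, validation, and other purposes.

One of the core principles of Vulkan is that building and submitting command buffers should be highly efficient. Thus error checking and validation of state in the core layer is minimal, although more rigorous validation can be enabled through the use of layers.

The core layer assumes applications are using the API correctly. Except as documented elsewhere in the Specification, the behavior of the core layer to an application using the API incorrectly is undefined, and may include program termination. However, implementations must ensure that incorrect usage by an application does not affect the integrity of the operating system, the Vulkan implementation, or other Vulkan client applications in the system, and does not allow one application to access data belonging to another application. Applications can request stronger robustness guarantees by enabling the robustBufferAccess feature as described in Features, Limits, and Formats.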

Validation of correct API usage is left to validation layers. Applications should be developed with validation layers enabled, to help catch and eliminate errors. Once validated, released applications should not enable validation layers by default.

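2.6.1. Valid Usage

Valid usage defines a set of conditions which must be met in order to achieve well-defined run-time behavior in an application. These conditions depend only on Vulkan state, and the parameters or objects whose usage is constrained by the condition.

Some valid usage conditions have dependencies on run-time limits or feature availability. It is possible to validate these conditions against Vulkan's minimum supported values for these limits and features, or some subset of other known values.

Valid usage conditions do not cover conditions where well-defined behavior (including returning an error code) exists.

Valid usage conditions should apply to the command or structure where complete information about the condition would be known during execution of an application. This is such that a validation layer or linter can be written directly against these statements at the point they are specified.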

Note

This does lead to some non-obvious places for valid usage statements. For instance, the valid values for a structure might depend on a separate value in the calling command. In this case, the structure itself will not reference this valid usage as it is impossible to determine validity from the structure that it is invalid - instead this valid usage would be attached to the calling command.

Another example is draw state - the state setters are independent, and can cause a legitimately invalid state configuration between draw calls; so the valid usage statements are attached to the place where all state needs to be valid - at the draw command.

Valid usage conditions are described in a block labelled “Valid Usage” following each command or structure they apply to.

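2.6.2. Implicit Valid Usage

Some valid usage conditions apply to all commands and structures in the API, unless explicitly denoted otherwise for a specific command or structure. These conditions are considered implicit, and are described in a block labelled “Valid Usage (Implicit)” following each command or structure they apply to. Implicit valid usage conditions are described in detail below.

Valid Usage for Object Handles

Any input parameter to a command that is an object handle must be a valid object handle, unless otherwise specified. An object handle is valid if:

  • It has been created or allocated by a previous, successful call to the API. Such calls are noted in the Specification.

  • It has not been deleted or freed by a previous call to the API. Such calls are noted in the Specification.

  • Any objects used by that object, either as part of creation or execution, must also be valid.

The reserved values VK_NULL_HANDLE and NULL can be used in place of valid non-dispatchable handles and dispatchable handles, respectively, when explicitly called out in the Specification. Any command that creates an object successfully must not return these values. It is valid to pass these values to vkDestroy* or vkFree* commands, which will silently ignore these values.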

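Valid Usage for Pointers

Any parameter that is a pointer must be a valid pointer. A pointer is valid if it points at memory containing values of the type expected by the command, and all fundamental types accessed through the pointer (e.g. as elements of an array or as members of a structure) satisfy the alignment requirements of the host processor.

Valid Usage for Enumerated Types

Any parameter of an enumerated type must be a valid enumerant for that type. An enumerant is valid if:

  • The enumerant is defined as part of the enumerated type.

  • The enumerant is not one of the special values defined for the enumerated type, which are suffixed with _BEGIN_RANGE, _END_RANGE, _RANGE_SIZE or _MAX_ENUM.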

Valid Usage for Flags

A collection of flags is represented by a bitmask using the type VkFlags:

typedef uint32_t VkFlags;

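Bitmasks are passed to many commands and structures to compactly represent options, but VkFlags is not used directly in the API. Instead, a Vk*Flags type is used, which is an alias of VkFlags and whose name matches the corresponding Vk*FlagBits that are valid for that type. These aliases are described in the Flag Types appendix of the Specification.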

Any Vk*Flags member or parameter used in the API must be a valid combination of bit flags. A valid combination is either zero or the bitwise OR of valid bit flags. A bit flag is valid if:

  • The bit flag is defined as part of the Vk*FlagBits type, where the bits type is obtained by taking the flag type and replacing the trailing Flags with FlagBits. For example, a flag value of type VkColorComponentFlags must contain only bit flags defined by VkColorComponentFlagBits.

  • The flag is allowed in the context in which it is being used. For example, in some cases, certain bit flags or combinations of bit flags are mutually exclusive.
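The first condition above (only bits defined by the matching Vk*FlagBits type may be set) can be checked mechanically. The sketch below is an illustration only: the types and the four bit values mirror VkColorComponentFlagBits so the block compiles without the Vulkan headers, and `color_component_flags_defined` is a hypothetical helper. The second, context-dependent condition is not covered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirrors of the API types, so the sketch compiles without vulkan.h. */
typedef uint32_t VkFlags;
typedef VkFlags  VkColorComponentFlags;

enum {
    VK_COLOR_COMPONENT_R_BIT = 0x00000001,
    VK_COLOR_COMPONENT_G_BIT = 0x00000002,
    VK_COLOR_COMPONENT_B_BIT = 0x00000004,
    VK_COLOR_COMPONENT_A_BIT = 0x00000008
};

/* Hypothetical helper: true if `flags` is zero or a bitwise OR of defined bits only. */
static bool color_component_flags_defined(VkColorComponentFlags flags)
{
    const VkFlags defined_bits = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
                                 VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
    return (flags & ~defined_bits) == 0;
}
```

A validation layer would apply the same mask test per flag type; whether a given combination is allowed in context still has to be checked against the command or structure it is passed to.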

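Valid Usage for Structure Types

Any parameter that is a structure containing an sType member must have a value of sType which is a valid VkStructureType value matching the type of the structure. As a general rule, the name of this value is obtained by taking the structure name, stripping the leading Vk, prefixing each capital letter with _, converting the entire resulting string to upper case, and prefixing it with VK_STRUCTURE_TYPE_. For example, structures of type VkImageCreateInfo must have an sType value of VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO.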
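The naming rule can be expressed as a small string transformation. This is a sketch under stated assumptions: `stype_name` is a hypothetical helper, it handles only names made of ASCII letters, and it treats the underscore inserted before the first capital letter as the separator after VK_STRUCTURE_TYPE. Since the text calls this a general rule, some real structure names may not follow it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: derive the sType enumerant name from a structure name.
 * Returns 0 on success, -1 if the name lacks the "Vk" prefix or `out` is too small. */
static int stype_name(const char *struct_name, char *out, size_t out_size)
{
    /* The '_' added before the first capital letter serves as the separator. */
    static const char prefix[] = "VK_STRUCTURE_TYPE";
    size_t n = sizeof prefix - 1;

    if (strncmp(struct_name, "Vk", 2) != 0 || out_size < sizeof prefix)
        return -1;
    memcpy(out, prefix, n);

    for (const char *p = struct_name + 2; *p != '\0'; ++p) {
        unsigned char ch = (unsigned char)*p;
        if (isupper(ch)) {
            if (n + 1 >= out_size)
                return -1;
            out[n++] = '_';              /* prefix each capital letter with '_' */
        }
        if (n + 1 >= out_size)
            return -1;
        out[n++] = (char)toupper(ch);    /* convert to upper case */
    }
    out[n] = '\0';
    return 0;
}
```

Applied to "VkImageCreateInfo", the helper produces the name given in the example above.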

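The values VK_STRUCTURE_TYPE_LOADER_INSTANCE_CREATE_INFO and VK_STRUCTURE_TYPE_LOADER_DEVICE_CREATE_INFO are reserved for internal use by the loader, and do not have corresponding Vulkan structures in this Specification.

The list of supported structure types is defined in an appendix.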

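Valid Usage for Structure Pointer Chains

Any parameter that is a structure containing a void* pNext member must have a value of pNext that is either NULL, or points to a valid structure defined by an extension, containing sType and pNext members as described in the Vulkan Documentation and Extensions document in the section “Extension Interactions”. If that extension is supported by the implementation, then it must be enabled. Any component of the implementation (the loader, any enabled layers, and drivers) must skip over, without processing, any chained structures with sType values not defined by extensions supported by that component.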
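The "skip over unknown structures" rule amounts to walking a singly linked list and looking only at the two common leading members. The following sketch assumes every chained structure begins with an sType followed by a pNext pointer; `ChainHeader` and `find_in_chain` are hypothetical names, and a plain int32_t stands in for VkStructureType.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the leading members shared by chainable structures. */
typedef struct ChainHeader {
    int32_t     sType;   /* stands in for VkStructureType */
    const void *pNext;
} ChainHeader;

/* Return the first structure in the chain whose sType equals `wanted`, or NULL.
 * Structures with any other sType are skipped without being processed. */
static const ChainHeader *find_in_chain(const void *pNext, int32_t wanted)
{
    for (const ChainHeader *h = pNext; h != NULL; h = h->pNext) {
        if (h->sType == wanted)
            return h;
    }
    return NULL;
}
```

A component that understands only one extension structure would call this with that structure's sType and ignore everything else in the chain.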

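Extension structures are not described in the base Vulkan Specification, but either in layered specifications incorporating those extensions, or in separate vendor-provided documents.

Valid Usage for Nested Structures

The above conditions also apply recursively to members of structures provided as input to a command, either as a direct argument to the command, or themselves a member of another structure.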

Specifics on valid usage of each command are covered in their individual sections.

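2.6.3. Return Codes

While the core Vulkan API is not designed to capture incorrect usage, some circumstances still require return codes. Commands in Vulkan return their status via return codes that are in one of two categories: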

  • Successful completion codes are returned when a command needs to communicate success or status information. All successful completion codes are non-negative values.

  • Run time error codes are returned when a command needs to communicate a failure that could only be detected at run time. All run time error codes are negative values.

All return codes in Vulkan are reported via VkResult return values. The possible codes are:

typedef enum VkResult {
    VK_SUCCESS = 0,
    VK_NOT_READY = 1,
    VK_TIMEOUT = 2,
    VK_EVENT_SET = 3,
    VK_EVENT_RESET = 4,
    VK_INCOMPLETE = 5,
    VK_ERROR_OUT_OF_HOST_MEMORY = -1,
    VK_ERROR_OUT_OF_DEVICE_MEMORY = -2,
    VK_ERROR_INITIALIZATION_FAILED = -3,
    VK_ERROR_DEVICE_LOST = -4,
    VK_ERROR_MEMORY_MAP_FAILED = -5,
    VK_ERROR_LAYER_NOT_PRESENT = -6,
    VK_ERROR_EXTENSION_NOT_PRESENT = -7,
    VK_ERROR_FEATURE_NOT_PRESENT = -8,
    VK_ERROR_INCOMPATIBLE_DRIVER = -9,
    VK_ERROR_TOO_MANY_OBJECTS = -10,
    VK_ERROR_FORMAT_NOT_SUPPORTED = -11,
    VK_ERROR_FRAGMENTED_POOL = -12,
} VkResult;
Success Codes
  • VK_SUCCESS Command successfully completed

  • VK_NOT_READY A fence or query has not yet completed

  • VK_TIMEOUT A wait operation has not completed in the specified time

  • VK_EVENT_SET An event is signaled

  • VK_EVENT_RESET An event is unsignaled

  • VK_INCOMPLETE A return array was too small for the result

Error codes
  • VK_ERROR_OUT_OF_HOST_MEMORY A host memory allocation has failed.

  • VK_ERROR_OUT_OF_DEVICE_MEMORY A device memory allocation has failed.

  • VK_ERROR_INITIALIZATION_FAILED Initialization of an object could not be completed for implementation-specific reasons.

  • VK_ERROR_DEVICE_LOST The logical or physical device has been lost. See Lost Device

  • VK_ERROR_MEMORY_MAP_FAILED Mapping of a memory object has failed.

  • VK_ERROR_LAYER_NOT_PRESENT A requested layer is not present or could not be loaded.

  • VK_ERROR_EXTENSION_NOT_PRESENT A requested extension is not supported.

  • VK_ERROR_FEATURE_NOT_PRESENT A requested feature is not supported.

  • VK_ERROR_INCOMPATIBLE_DRIVER The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons.

  • VK_ERROR_TOO_MANY_OBJECTS Too many objects of the type have already been created.

  • VK_ERROR_FORMAT_NOT_SUPPORTED A requested format is not supported on this device.

  • VK_ERROR_FRAGMENTED_POOL A requested pool allocation has failed due to fragmentation of the pool’s memory.

If a command returns a run time error, it will leave any result pointers unmodified, unless other behavior is explicitly defined in the specification.

Out of memory errors do not damage any currently existing Vulkan objects. Objects that have already been successfully created can still be used by the application.

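Performance-critical commands generally do not have return codes. If a run time error occurs in such commands, the implementation will defer reporting the error until a specified point. For commands that record into command buffers (vkCmd*), run time errors are reported by vkEndCommandBuffer.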
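Because successful completion codes are non-negative and run time error codes are negative, callers can classify a result by sign alone. The sketch below uses a local subset of the VkResult values listed in this section so that it compiles without the Vulkan headers; the two helper functions are hypothetical.

```c
#include <stdbool.h>

/* Local subset of VkResult, with the values listed in this section. */
typedef enum VkResult {
    VK_SUCCESS                  = 0,
    VK_NOT_READY                = 1,
    VK_INCOMPLETE               = 5,
    VK_ERROR_OUT_OF_HOST_MEMORY = -1,
    VK_ERROR_DEVICE_LOST        = -4
} VkResult;

/* Hypothetical helper: true for any run time error code (negative values). */
static bool result_is_error(VkResult r)
{
    return r < 0;
}

/* Hypothetical helper: true for success codes that carry extra status,
 * such as VK_NOT_READY or VK_INCOMPLETE (positive values). */
static bool result_is_status(VkResult r)
{
    return r > 0;
}
```

Note that comparing against VK_SUCCESS alone would treat VK_INCOMPLETE or VK_NOT_READY as failures, which is why the sign test is the more general check.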

2.7. Numeric Representation and Computation

Implementations normally perform computations in floating-point, and must meet the range and precision requirements defined under “Floating-Point Computation” below.

These requirements only apply to computations performed in Vulkan operations outside of shader execution, such as texture image specification and sampling, and per-fragment operations. Range and precision requirements during shader execution differ and are specified by the Precision and Operation of SPIR-V Instructions section.

In some cases, the representation and/or precision of operations is implicitly limited by the specified format of vertex or texel data consumed by Vulkan. Specific floating-point formats are described later in this section.

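2.7.1. Floating-Point Computation

Most floating-point computation is performed in SPIR-V shader modules. The properties of computation within shaders are constrained as defined by the Precision and Operation of SPIR-V Instructions section.

Some floating-point computation is performed outside of shaders, such as viewport and depth range calculations. For these computations, we do not specify how floating-point numbers are to be represented, or the details of how operations on them are performed, but only place minimal requirements on representation and precision as described in the remainder of this section.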

editing-note

(Jon, Bug 14966) This is a rat’s nest of complexity, both in terms of describing/enumerating places such computation may take place (other than “not shader code”) and in how implementations may do it. We have consciously deferred the resolution of this issue to post-1.0, and in the meantime, the following language inherited from the OpenGL Specification is inserted as a placeholder. Hopefully it can be tightened up considerably.

We require simply that numbers' floating-point parts contain enough bits and that their exponent fields are large enough so that individual results of floating-point operations are accurate to about 1 part in 10^5. The maximum representable magnitude for all floating-point values must be at least 2^32.

x × 0 = 0 × x = 0 for any non-infinite and non-NaN x.

1 × x = x × 1 = x.

x + 0 = 0 + x = x.

0^0 = 1.

Occasionally, further requirements will be specified. Most single-precision floating-point formats meet these requirements.

The special values Inf and -Inf encode values with magnitudes too large to be represented; the special value NaN encodes “Not A Number” values resulting from undefined arithmetic operations such as 0 / 0. Implementations may support Inf and NaN in their floating-point computations.

Any representable floating-point value is legal as input to a Vulkan command that requires floating-point data. The result of providing a value that is not a floating-point number to such a command is unspecified, but must not lead to Vulkan interruption or termination. In IEEE 754 arithmetic, for example, providing a negative zero or a denormalized number to a Vulkan command must yield deterministic results, while providing a NaN or Inf yields unspecified results.

2.7.2. 16-Bit Floating-Point Numbers

16-bit floating point numbers are defined in the “16-bit floating point numbers” section of the Khronos Data Format Specification.

Any representable 16-bit floating-point value is legal as input to a Vulkan command that accepts 16-bit floating-point data. The result of providing a value that is not a floating-point number (such as Inf or NaN) to such a command is unspecified, but must not lead to Vulkan interruption or termination. Providing a denormalized number or negative zero to Vulkan must yield deterministic results.

2.7.3. Unsigned 11-Bit Floating-Point Numbers

Unsigned 11-bit floating point numbers are defined in the “Unsigned 11-bit floating point numbers” section of the Khronos Data Format Specification.

When a floating-point value is converted to an unsigned 11-bit floating-point representation, finite values are rounded to the closest representable finite value.

While less accurate, implementations are allowed to always round in the direction of zero. This means negative values are converted to zero. Likewise, finite positive values greater than 65024 (the maximum finite representable unsigned 11-bit floating-point value) are converted to 65024. Additionally: negative infinity is converted to zero; positive infinity is converted to positive infinity; and both positive and negative NaN are converted to positive NaN.

Any representable unsigned 11-bit floating-point value is legal as input to a Vulkan command that accepts 11-bit floating-point data. The result of providing a value that is not a floating-point number (such as Inf or NaN) to such a command is unspecified, but must not lead to Vulkan interruption or termination. Providing a denormalized number to Vulkan must yield deterministic results.

2.7.4. Unsigned 10-Bit Floating-Point Numbers

Unsigned 10-bit floating point numbers are defined in the “Unsigned 10-bit floating point numbers” section of the Khronos Data Format Specification.

When a floating-point value is converted to an unsigned 10-bit floating-point representation, finite values are rounded to the closest representable finite value.

While less accurate, implementations are allowed to always round in the direction of zero. This means negative values are converted to zero. Likewise, finite positive values greater than 64512 (the maximum finite representable unsigned 10-bit floating-point value) are converted to 64512. Additionally: negative infinity is converted to zero; positive infinity is converted to positive infinity; and both positive and negative NaN are converted to positive NaN.

Any representable unsigned 10-bit floating-point value is legal as input to a Vulkan command that accepts 10-bit floating-point data. The result of providing a value that is not a floating-point number (such as Inf or NaN) to such a command is unspecified, but must not lead to Vulkan interruption or termination. Providing a denormalized number to Vulkan must yield deterministic results.
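The two maximum finite values quoted above (65024 and 64512) follow from the bit layouts of these formats. The sketch below assumes the layouts given in the Khronos Data Format Specification: no sign bit, a 5-bit exponent with bias 15 whose all-ones value is reserved for Inf and NaN, and 6 or 5 mantissa bits for the 11-bit and 10-bit formats respectively. `max_finite_small_float` is a hypothetical helper name.

```c
#include <assert.h>

/* Largest finite value of an unsigned small float with a 5-bit exponent
 * (bias 15, exponent 31 reserved) and `mantissa_bits` bits of mantissa.
 * Computes 2^15 * (1 + (2^m - 1) / 2^m) using integer arithmetic only. */
static unsigned max_finite_small_float(unsigned mantissa_bits)
{
    assert(mantissa_bits >= 1 && mantissa_bits <= 10);
    const unsigned max_exponent = 30u - 15u;                   /* largest unbiased exponent */
    const unsigned max_mantissa = (1u << mantissa_bits) - 1u;  /* all mantissa bits set */
    return (1u << max_exponent) + ((max_mantissa << max_exponent) >> mantissa_bits);
}
```

With 6 mantissa bits this gives 32768 × (1 + 63/64) = 65024, and with 5 mantissa bits 32768 × (1 + 31/32) = 64512, matching the clamping values in the text.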

2.7.5. General Requirements

Some calculations require division. In such cases (including implied divisions performed by vector normalization), division by zero produces an unspecified result but must not lead to Vulkan interruption or termination.

2.8. Fixed-Point Data Conversions

When generic vertex attributes and pixel color or depth components are represented as integers, they are often (but not always) considered to be normalized. Normalized integer values are treated specially when being converted to and from floating-point values, and are usually referred to as normalized fixed-point.

In the remainder of this section, b denotes the bit width of the fixed-point integer representation. When the integer is one of the types defined by the API, b is the bit width of that type. When the integer comes from an image containing color or depth component texels, b is the number of bits allocated to that component in its specified image format.

The signed and unsigned fixed-point representations are assumed to be b-bit binary two’s-complement integers and binary unsigned integers, respectively.

2.8.1. Conversion from Normalized Fixed-Point to Floating-Point

Unsigned normalized fixed-point integers represent numbers in the range [0,1]. The conversion from an unsigned normalized fixed-point value c to the corresponding floating-point value f is defined as

\[f = { c \over { 2^b - 1 } }\]

Signed normalized fixed-point integers represent numbers in the range [-1,1]. The conversion from a signed normalized fixed-point value c to the corresponding floating-point value f is performed using

\[f = \max\left( {c \over {2^{b-1} - 1}}, -1.0 \right)\]

Only the range [-2^(b-1) + 1, 2^(b-1) - 1] is used to represent signed fixed-point values in the range [-1,1]. For example, if b = 8, then the integer value -127 corresponds to -1.0 and the value 127 corresponds to 1.0. Note that while zero is exactly expressible in this representation, one value (-128 in the example) is outside the representable range, and must be clamped before use. This equation is used everywhere that signed normalized fixed-point values are converted to floating-point.
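The two conversions in this subsection translate directly into code. This is a minimal sketch, not an implementation requirement: the function names are hypothetical, double is used as the floating-point type for convenience, and b is assumed to be at most 32.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned normalized: f = c / (2^b - 1). */
static double unorm_to_float(uint32_t c, unsigned b)
{
    assert(b >= 1 && b <= 32);
    const double max = (double)((UINT64_C(1) << b) - 1);
    return (double)c / max;
}

/* Signed normalized: f = max(c / (2^(b-1) - 1), -1.0). */
static double snorm_to_float(int32_t c, unsigned b)
{
    assert(b >= 2 && b <= 32);
    const double max = (double)((UINT64_C(1) << (b - 1)) - 1);
    const double f = (double)c / max;
    return f < -1.0 ? -1.0 : f;   /* clamps the one out-of-range value, e.g. -128 for b = 8 */
}
```

For b = 8, both -128 and -127 map to -1.0, which is the clamping behaviour described in the paragraph above.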

2.8.2. Conversion from Floating-Point to Normalized Fixed-Point

The conversion from a floating-point value f to the corresponding unsigned normalized fixed-point value c is defined by first clamping f to the range [0,1], then computing

c = convertFloatToUint(f × (2^b - 1), b)

where convertFloatToUint(r,b) returns one of the two unsigned binary integer values with exactly b bits which are closest to the floating-point value r. Implementations should round to nearest. If r is equal to an integer, then that integer value must be returned. In particular, if f is equal to 0.0 or 1.0, then c must be assigned 0 or 2^b - 1, respectively.

The conversion from a floating-point value f to the corresponding signed normalized fixed-point value c is performed by clamping f to the range [-1,1], then computing

c = convertFloatToInt(f × (2^(b-1) - 1), b)

where convertFloatToInt(r,b) returns one of the two signed two’s-complement binary integer values with exactly b bits which are closest to the floating-point value r. Implementations should round to nearest. If r is equal to an integer, then that integer value must be returned. In particular, if f is equal to -1.0, 0.0, or 1.0, then c must be assigned -(2^(b-1) - 1), 0, or 2^(b-1) - 1, respectively.

This equation is used everywhere that floating-point values are converted to signed normalized fixed-point.
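A sketch of the clamp-then-scale conversions follows. The function names are hypothetical; rounding is to nearest with ties away from zero, which is one permissible choice since the text only says implementations should round to nearest. Mapping NaN to 0 is an assumption made here to keep the C code well defined, not something this section requires.

```c
#include <assert.h>
#include <stdint.h>

/* Float to unsigned normalized: clamp to [0,1], scale by 2^b - 1, round to nearest. */
static uint32_t float_to_unorm(double f, unsigned b)
{
    assert(b >= 1 && b <= 32);
    if (!(f > 0.0)) f = 0.0;      /* clamps negatives; also maps NaN to 0 (assumption) */
    if (f > 1.0)    f = 1.0;
    const double max = (double)((UINT64_C(1) << b) - 1);
    return (uint32_t)(f * max + 0.5);
}

/* Float to signed normalized: clamp to [-1,1], scale by 2^(b-1) - 1, round to nearest. */
static int32_t float_to_snorm(double f, unsigned b)
{
    assert(b >= 2 && b <= 32);
    if (f != f)   f = 0.0;        /* NaN mapped to 0 (assumption) */
    if (f < -1.0) f = -1.0;
    if (f > 1.0)  f = 1.0;
    const double scaled = f * (double)((UINT64_C(1) << (b - 1)) - 1);
    return (int32_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

Exact inputs such as -1.0, 0.0 and 1.0 land on integers after scaling, so they convert to the required end points regardless of the tie-breaking rule.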

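2.9. API Version Numbers and Semantics

Vulkan version numbers are used in several places in the API. In each such use, the API major version number, minor version number, and patch version number are packed into a 32-bit integer as follows:

  • The major version number is a 10-bit integer packed into bits 31-22.

  • The minor version number is a 10-bit integer packed into bits 21-12.

  • The patch version number is a 12-bit integer packed into bits 11-0.

Any difference between two Vulkan version numbers means that the API has changed in some way; each part of the version number indicates a different scope of change.

A difference in patch version numbers indicates that some usually small aspect of the specification or header has been modified, typically to fix a bug, and this may affect the behavior of existing functionality. A difference in this number should not affect either full compatibility or backwards compatibility between two versions, and should not add interfaces to the API.

A difference in minor version numbers indicates that some amount of new functionality has been added. This usually includes new interfaces in the header, and may also include behavior changes and bug fixes. Functionality may be deprecated in a minor revision, but will not be removed. When a new minor version is introduced, the patch version is reset to 0, and each minor revision maintains its own set of patch versions. A difference in minor version should not affect backwards compatibility, but will affect full compatibility.

A difference in major version numbers indicates a large set of changes to the API, potentially including new functionality and header interfaces, behavioral changes, removal of deprecated features, and modification or outright replacement of any feature, and is thus very likely to break compatibility. Differences in major version will typically require significant modification to an application in order for it to function.

C language macros for manipulating version numbers are defined in the Version Number Macros appendix.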
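
As a rough illustration of the bit layout described in this section, the following C sketch packs and unpacks a version number. The function names are invented for this example; the macros actually provided by the API are the ones defined in the Version Number Macros appendix.

```c
#include <assert.h>
#include <stdint.h>

/* Pack major (10 bits), minor (10 bits) and patch (12 bits) into 32 bits. */
static uint32_t pack_version(uint32_t major, uint32_t minor, uint32_t patch) {
    assert(major < (1u << 10) && minor < (1u << 10) && patch < (1u << 12));
    return (major << 22) | (minor << 12) | patch;
}

/* Extract the individual fields again. */
static uint32_t version_major(uint32_t v) { return v >> 22; }
static uint32_t version_minor(uint32_t v) { return (v >> 12) & 0x3ffu; }
static uint32_t version_patch(uint32_t v) { return v & 0xfffu; }
```

For example, version 1.0.3 packs to 0x00400003, and the three accessors recover 1, 0 and 3 from that value.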

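2.10. Common Object Types

Some types of Vulkan objects are used in many different structures and command parameters, and are described here. These types include offsets, extents, and rectangles.

2.10.1. Offsets

Offsets are used to describe a pixel location within an image or framebuffer, as an (x,y) location for two-dimensional images, or an (x,y,z) location for three-dimensional images.

A two-dimensional offset is defined by the structure: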

typedef struct VkOffset2D {
    int32_t    x;
    int32_t    y;
} VkOffset2D;

A three-dimensional offset is defined by the structure:

typedef struct VkOffset3D {
    int32_t    x;
    int32_t    y;
    int32_t    z;
} VkOffset3D;

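2.10.2. Extents

Extents are used to describe the size of a rectangular region of pixels within an image or framebuffer, as (width,height) for two-dimensional images, or as (width,height,depth) for three-dimensional images.

A two-dimensional extent is defined by the structure: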

typedef struct VkExtent2D {
    uint32_t    width;
    uint32_t    height;
} VkExtent2D;

A three-dimensional extent is defined by the structure:

typedef struct VkExtent3D {
    uint32_t    width;
    uint32_t    height;
    uint32_t    depth;
} VkExtent3D;

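2.10.3. Rectangles

Rectangles are used to describe a specified rectangular region of pixels within an image or framebuffer. A rectangle includes both an offset and an extent of the same dimensionality.

A two-dimensional rectangle is defined by the structure: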

typedef struct VkRect2D {
    VkOffset2D    offset;
    VkExtent2D    extent;
} VkRect2D;
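
To make the offset/extent pairing concrete, here is a small C sketch that tests whether a pixel lies inside a rectangle. The types mirror the structures above under different names so that the example compiles without the Vulkan headers. The half-open interpretation (the offset is included; offset + extent is excluded) is an assumption of this sketch, not a statement from this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for VkOffset2D, VkExtent2D and VkRect2D. */
typedef struct { int32_t x, y; } Offset2D;
typedef struct { uint32_t width, height; } Extent2D;
typedef struct { Offset2D offset; Extent2D extent; } Rect2D;

/* True if pixel (px, py) lies inside r, treating r as half-open.
 * 64-bit arithmetic avoids overflow when adding a uint32_t extent
 * to an int32_t offset. */
static bool rect_contains(Rect2D r, int32_t px, int32_t py) {
    int64_t x0 = r.offset.x;
    int64_t y0 = r.offset.y;
    int64_t x1 = x0 + (int64_t)r.extent.width;
    int64_t y1 = y0 + (int64_t)r.extent.height;
    return px >= x0 && px < x1 && py >= y0 && py < y1;
}
```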

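3. Initialization

Before using Vulkan, an application must initialize it by loading the Vulkan commands and creating a VkInstance object.

3.1. Command Function Pointers

Vulkan commands are not necessarily exposed statically on a platform. Function pointers for all Vulkan commands can be obtained with the command: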

PFN_vkVoidFunction vkGetInstanceProcAddr(
    VkInstance                                  instance,
    const char*                                 pName);
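  • instance is the instance that the function pointer will be compatible with, or NULL for commands not dependent on any instance.

  • pName is the name of the command to obtain.

vkGetInstanceProcAddr itself is obtained in a platform- and loader-specific manner. Typically, the loader library will export this command as a function symbol, so applications can link against the loader library, or load it dynamically and look up the symbol using platform-specific APIs. Loaders are encouraged to export function symbols for all other core Vulkan commands as well; if this is done, then applications that use only the core Vulkan commands have no need to use vkGetInstanceProcAddr.

The table below defines the various use cases for vkGetInstanceProcAddr and the expected return value for each case ("fp" means "function pointer").

The returned function pointer is of type PFN_vkVoidFunction, and must be cast to the type of the command being queried.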

Table 1. vkGetInstanceProcAddr behavior

instance          pName                                                  return value
----------------  -----------------------------------------------------  ------------
*                 NULL                                                   undefined
invalid instance  *                                                      undefined
NULL              vkEnumerateInstanceExtensionProperties                 fp
NULL              vkEnumerateInstanceLayerProperties                     fp
NULL              vkCreateInstance                                       fp
NULL              * (any pName not covered above)                        NULL
instance          core Vulkan command                                    fp [1]
instance          enabled instance extension commands for instance       fp [1]
instance          available device extension [2] commands for instance   fp [1]
instance          * (any pName not covered above)                        NULL

[1] The returned function pointer must only be called with a dispatchable object (the first parameter) that is instance or a child of instance. e.g. VkInstance, VkPhysicalDevice, VkDevice, VkQueue, or VkCommandBuffer.

[2] An “available extension” is an extension function supported by any of the loader, driver or layer.

Valid Usage (Implicit)
  • If instance is not NULL, instance must be a valid VkInstance handle

  • pName must be a null-terminated string

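In order to support systems with multiple Vulkan implementations comprising heterogeneous collections of hardware and software, the function pointers returned by vkGetInstanceProcAddr may point to dispatch code, which calls a different real implementation for different VkDevice objects (and objects created from them). The overhead of this internal dispatch can be avoided by obtaining device-specific function pointers for any commands that use a device or device-child object as their dispatchable object. Such function pointers can be obtained with the command: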

PFN_vkVoidFunction vkGetDeviceProcAddr(
    VkDevice                                    device,
    const char*                                 pName);

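The table below defines the various use cases for vkGetDeviceProcAddr and the expected return value for each case.

The returned function pointer is of type PFN_vkVoidFunction, and must be cast to the type of the command being queried.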

Table 2. vkGetDeviceProcAddr behavior

device          pName                              return value
--------------  ---------------------------------  ------------
NULL            *                                  undefined
invalid device  *                                  undefined
device          NULL                               undefined
device          core Vulkan command                fp [1]
device          enabled extension commands         fp [1]
device          * (any pName not covered above)    NULL

[1] The returned function pointer must only be called with a dispatchable object (the first parameter) that is device or a child of device. e.g. VkDevice, VkQueue, or VkCommandBuffer.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pName must be a null-terminated string

The definition of PFN_vkVoidFunction is:

typedef void (VKAPI_PTR *PFN_vkVoidFunction)(void);
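
The need to cast the generic pointer back to the concrete command type can be illustrated without the Vulkan headers. In the sketch below, every name is hypothetical: VoidFunction plays the role of PFN_vkVoidFunction, lookup stands in for vkGetInstanceProcAddr or vkGetDeviceProcAddr, and add_one stands in for a command implementation.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type, analogous to PFN_vkVoidFunction. */
typedef void (*VoidFunction)(void);

/* Concrete pointer type of the (made-up) command being queried. */
typedef int (*PFN_addOne)(int);

static int add_one(int x) { return x + 1; }

/* Hypothetical resolver: returns a generic pointer, or NULL for unknown names. */
static VoidFunction lookup(const char *name) {
    if (strcmp(name, "addOne") == 0) {
        return (VoidFunction)add_one;
    }
    return NULL;
}

/* The generic pointer must be cast back to its real type before the call. */
static int call_add_one(int x) {
    PFN_addOne fp = (PFN_addOne)lookup("addOne");
    return fp != NULL ? fp(x) : -1;
}
```

Calling through the generic type directly would be undefined behavior in C; only the round trip back to the original function type is guaranteed to work.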

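3.2. Instances

There is no global state in Vulkan; all per-application state is stored in a VkInstance object. Creating a VkInstance object initializes the Vulkan library and allows the application to pass information about itself to the implementation.

Instances are represented by VkInstance handles:

VK_DEFINE_HANDLE(VkInstance)

To create an instance object, call: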

VkResult vkCreateInstance(
    const VkInstanceCreateInfo*                 pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkInstance*                                 pInstance);
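  • pCreateInfo points to an instance of VkInstanceCreateInfo controlling creation of the instance.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pInstance points to a VkInstance handle in which the resulting instance is returned.

vkCreateInstance creates the instance, then enables and initializes the global layers and extensions requested by the application. If an extension is provided by a layer, both the layer and the extension must be specified at vkCreateInstance time. If a specified layer cannot be found, no VkInstance will be created and the function will return VK_ERROR_LAYER_NOT_PRESENT. Likewise, if a specified extension cannot be found, the call will return VK_ERROR_EXTENSION_NOT_PRESENT.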

Valid Usage (Implicit)
  • pCreateInfo must be a pointer to a valid VkInstanceCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pInstance must be a pointer to a VkInstance handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INITIALIZATION_FAILED

  • VK_ERROR_LAYER_NOT_PRESENT

  • VK_ERROR_EXTENSION_NOT_PRESENT

  • VK_ERROR_INCOMPATIBLE_DRIVER

The VkInstanceCreateInfo structure is defined as:

typedef struct VkInstanceCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkInstanceCreateFlags       flags;
    const VkApplicationInfo*    pApplicationInfo;
    uint32_t                    enabledLayerCount;
    const char* const*          ppEnabledLayerNames;
    uint32_t                    enabledExtensionCount;
    const char* const*          ppEnabledExtensionNames;
} VkInstanceCreateInfo;
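  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • pApplicationInfo is NULL or a pointer to an instance of VkApplicationInfo. If not NULL, this information helps implementations recognize behavior inherent to classes of applications. VkApplicationInfo is defined in detail below.

  • enabledLayerCount is the number of global layers to enable.

  • ppEnabledLayerNames is a pointer to an array of enabledLayerCount null-terminated UTF-8 strings containing the names of layers to enable for the created instance. See the Layers section for further details.

  • enabledExtensionCount is the number of global extensions to enable.

  • ppEnabledExtensionNames is a pointer to an array of enabledExtensionCount null-terminated UTF-8 strings containing the names of extensions to enable.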

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • If pApplicationInfo is not NULL, pApplicationInfo must be a pointer to a valid VkApplicationInfo structure

  • If enabledLayerCount is not 0, ppEnabledLayerNames must be a pointer to an array of enabledLayerCount null-terminated strings

  • If enabledExtensionCount is not 0, ppEnabledExtensionNames must be a pointer to an array of enabledExtensionCount null-terminated strings

The VkApplicationInfo structure is defined as:

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion;
} VkApplicationInfo;
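  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • pApplicationName is a pointer to a null-terminated UTF-8 string containing the name of the application.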

  • applicationVersion is an unsigned integer variable containing the developer-supplied version number of the application.

  • pEngineName is a pointer to a null-terminated UTF-8 string containing the name of the engine (if any) used to create the application.

  • engineVersion is an unsigned integer variable containing the developer-supplied version number of the engine used to create the application.

  • apiVersion is the version of the Vulkan API against which the application expects to run, encoded as described in the API Version Numbers and Semantics section. If apiVersion is 0 the implementation must ignore it, otherwise if the implementation does not support the requested apiVersion it must return VK_ERROR_INCOMPATIBLE_DRIVER. The patch version number specified in apiVersion is ignored when creating an instance object. Only the major and minor versions of the instance must match those requested in apiVersion.
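
The apiVersion rule can be sketched as a small check. This is a simplification for illustration only: it assumes a hypothetical implementation that supports exactly one major.minor version, and both function names are invented here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Drop the 12-bit patch field, keeping only major and minor. */
static uint32_t strip_patch(uint32_t version) {
    return version >> 12;
}

/* A requested apiVersion of 0 is ignored; otherwise only the major and
 * minor fields are compared and the patch field is disregarded. A false
 * result corresponds to returning VK_ERROR_INCOMPATIBLE_DRIVER. */
static bool api_version_acceptable(uint32_t requested, uint32_t supported) {
    if (requested == 0) {
        return true;
    }
    return strip_patch(requested) == strip_patch(supported);
}
```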

Valid Usage
  • apiVersion must be zero, or otherwise it must be a version that the implementation supports, or supports an effective substitute for

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_APPLICATION_INFO

  • pNext must be NULL

  • If pApplicationName is not NULL, pApplicationName must be a null-terminated string

  • If pEngineName is not NULL, pEngineName must be a null-terminated string

To destroy an instance, call:

void vkDestroyInstance(
    VkInstance                                  instance,
    const VkAllocationCallbacks*                pAllocator);
  • instance is the handle of the instance to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All child objects created using instance must have been destroyed prior to destroying instance

  • If VkAllocationCallbacks were provided when instance was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when instance was created, pAllocator must be NULL

Valid Usage (Implicit)
  • If instance is not NULL, instance must be a valid VkInstance handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

Host Synchronization
  • Host access to instance must be externally synchronized

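4. Devices and Queues

Once Vulkan is initialized, devices and queues are the primary objects used to interact with a Vulkan implementation.

Vulkan separates the concept of physical and logical devices. A physical device usually represents a single device in a system (perhaps made up of several individual hardware devices working together), of which there are a finite number. A logical device represents an application’s view of the device.

Physical devices are represented by VkPhysicalDevice handles: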

VK_DEFINE_HANDLE(VkPhysicalDevice)

4.1. Physical Devices

To retrieve a list of physical device objects representing the physical devices installed in the system, call:

VkResult vkEnumeratePhysicalDevices(
    VkInstance                                  instance,
    uint32_t*                                   pPhysicalDeviceCount,
    VkPhysicalDevice*                           pPhysicalDevices);
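  • instance is a handle to a Vulkan instance previously created with vkCreateInstance.

  • pPhysicalDeviceCount is a pointer to an integer related to the number of physical devices available or queried, as described below.

  • pPhysicalDevices is either NULL or a pointer to an array of VkPhysicalDevice handles.

If pPhysicalDevices is NULL, then the number of physical devices available is returned in pPhysicalDeviceCount. Otherwise, pPhysicalDeviceCount must point to a variable set by the user to the number of elements in the pPhysicalDevices array, and on return the variable is overwritten with the number of handles actually written to pPhysicalDevices. If pPhysicalDeviceCount is less than the number of physical devices available, at most pPhysicalDeviceCount handles will be written, and VK_INCOMPLETE will be returned instead of VK_SUCCESS to indicate that not all the available physical devices were returned.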
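
The count-then-fill protocol described above is usually driven with two calls. The sketch below shows the pattern against a mock enumerator so that it compiles without the Vulkan headers. Everything in it is hypothetical: mock_enumerate imitates the counting behavior of vkEnumeratePhysicalDevices, MockDevice stands in for VkPhysicalDevice, and the result codes are arbitrary values chosen for this example.

```c
#include <stdint.h>
#include <stdlib.h>

enum { MOCK_SUCCESS = 0, MOCK_INCOMPLETE = 1 };  /* illustrative values only */

typedef int MockDevice;                          /* stand-in for a device handle */

static const MockDevice g_devices[3] = { 101, 102, 103 };

/* Imitates the enumeration contract: with a NULL array only the count is
 * returned; otherwise at most *pCount entries are written and *pCount is
 * overwritten with the number actually written. */
static int mock_enumerate(uint32_t *pCount, MockDevice *pDevices) {
    const uint32_t available = 3;
    if (pDevices == NULL) {
        *pCount = available;
        return MOCK_SUCCESS;
    }
    uint32_t n = *pCount < available ? *pCount : available;
    for (uint32_t i = 0; i < n; ++i) {
        pDevices[i] = g_devices[i];
    }
    *pCount = n;
    return n < available ? MOCK_INCOMPLETE : MOCK_SUCCESS;
}

/* Two-call idiom: query the count, allocate, then fetch the entries.
 * Returns a malloc'd array that the caller frees, or NULL on failure. */
static MockDevice *enumerate_all(uint32_t *outCount) {
    uint32_t count = 0;
    *outCount = 0;
    if (mock_enumerate(&count, NULL) != MOCK_SUCCESS || count == 0) {
        return NULL;
    }
    MockDevice *devices = malloc(count * sizeof *devices);
    if (devices == NULL) {
        return NULL;
    }
    if (mock_enumerate(&count, devices) != MOCK_SUCCESS) {
        free(devices);
        return NULL;
    }
    *outCount = count;
    return devices;
}
```

A caller that passes a fixed-size array instead should be prepared for the incomplete result and treat the returned count as the number of valid entries.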

Valid Usage (Implicit)
  • instance must be a valid VkInstance handle

  • pPhysicalDeviceCount must be a pointer to a uint32_t value

  • If the value referenced by pPhysicalDeviceCount is not 0, and pPhysicalDevices is not NULL, pPhysicalDevices must be a pointer to an array of pPhysicalDeviceCount VkPhysicalDevice handles

Return Codes
Success
  • VK_SUCCESS

  • VK_INCOMPLETE

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INITIALIZATION_FAILED

To query general properties of physical devices once enumerated, call:

void vkGetPhysicalDeviceProperties(
    VkPhysicalDevice                            physicalDevice,
    VkPhysicalDeviceProperties*                 pProperties);
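  • physicalDevice is the handle to the physical device whose properties will be queried.

  • pProperties points to an instance of the VkPhysicalDeviceProperties structure, which will be filled with the returned information.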

Valid Usage (Implicit)
  • physicalDevice must be a valid VkPhysicalDevice handle

  • pProperties must be a pointer to a VkPhysicalDeviceProperties structure

The VkPhysicalDeviceProperties structure is defined as:

typedef struct VkPhysicalDeviceProperties {
    uint32_t                            apiVersion;
    uint32_t                            driverVersion;
    uint32_t                            vendorID;
    uint32_t                            deviceID;
    VkPhysicalDeviceType                deviceType;
    char                                deviceName[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
    uint8_t                             pipelineCacheUUID[VK_UUID_SIZE];
    VkPhysicalDeviceLimits              limits;
    VkPhysicalDeviceSparseProperties    sparseProperties;
} VkPhysicalDeviceProperties;
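  • apiVersion is the version of Vulkan supported by the device, encoded as described in the API Version Numbers and Semantics section.

  • driverVersion is the vendor-specified version of the driver.

  • vendorID is a unique identifier for the vendor (see below) of the physical device.

  • deviceID is a unique identifier for the physical device among devices available from the vendor.

  • deviceType is a VkPhysicalDeviceType specifying the type of device.

  • deviceName is a null-terminated UTF-8 string containing the name of the device.

  • pipelineCacheUUID is an array of size VK_UUID_SIZE, containing 8-bit values that represent a universally unique identifier for the device.

  • limits is the VkPhysicalDeviceLimits structure which specifies device-specific limits of the physical device. See Limits for details.

  • sparseProperties is the VkPhysicalDeviceSparseProperties structure which specifies various sparse-related properties of the physical device. See Sparse Properties for details.

The vendorID and deviceID fields are provided to allow applications to adapt to device characteristics that are not adequately exposed by other Vulkan queries. These may include performance profiles, hardware errata, or other characteristics. In PCI-based implementations, the low sixteen bits of vendorID and deviceID must contain (respectively) the PCI vendor and device IDs associated with the hardware device, and the remaining bits must be set to zero. In non-PCI implementations, the choice of what values to return may be dictated by operating system or platform policies. It is otherwise at the discretion of the implementer, subject to the following constraints and guidelines: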

  • For purposes of physical device identification, the vendor of a physical device is the entity responsible for the most salient characteristics of the hardware represented by the physical device handle. In the case of a discrete GPU, this should be the GPU chipset vendor. In the case of a GPU or other accelerator integrated into a system-on-chip (SoC), this should be the supplier of the silicon IP used to create the GPU or other accelerator.

  • If the vendor of the physical device has a valid PCI vendor ID issued by PCI-SIG, that ID should be used to construct vendorID as described above for PCI-based implementations. Implementations that do not return a PCI vendor ID in vendorID must return a valid Khronos vendor ID, obtained as described in the Vulkan Documentation and Extensions document in the section “Registering a Vendor ID with Khronos”. Khronos vendor IDs are allocated starting at 0x10000, to distinguish them from the PCI vendor ID namespace.

  • The vendor of the physical device is responsible for selecting deviceID. The value selected should uniquely identify both the device version and any major configuration options (for example, core count in the case of multicore devices). The same device ID should be used for all physical implementations of that device version and configuration. For example, all uses of a specific silicon IP GPU version and configuration should use the same device ID, even if those uses occur in different SoCs.

The physical device types are:

typedef enum VkPhysicalDeviceType {
    VK_PHYSICAL_DEVICE_TYPE_OTHER = 0,
    VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU = 1,
    VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU = 2,
    VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU = 3,
    VK_PHYSICAL_DEVICE_TYPE_CPU = 4,
} VkPhysicalDeviceType;
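  • VK_PHYSICAL_DEVICE_TYPE_OTHER The device does not match any other available types.

  • VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU The device is typically one embedded in or tightly coupled with the host.

  • VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU The device is typically a separate processor connected to the host via an interlink.

  • VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU The device is typically a virtual node in a virtualization environment.

  • VK_PHYSICAL_DEVICE_TYPE_CPU The device is typically running on the same processors as the host.

The physical device type is advertised for informational purposes only, and does not directly affect the operation of the system. However, the device type may correlate with other advertised properties or capabilities of the system, such as how many memory heaps there are.

To query properties of queues available on a physical device, call: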

void vkGetPhysicalDeviceQueueFamilyProperties(
    VkPhysicalDevice                            physicalDevice,
    uint32_t*                                   pQueueFamilyPropertyCount,
    VkQueueFamilyProperties*                    pQueueFamilyProperties);
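  • physicalDevice is the handle to the physical device whose properties will be queried.

  • pQueueFamilyPropertyCount is a pointer to an integer related to the number of queue families available or queried, as described below.

  • pQueueFamilyProperties is either NULL or a pointer to an array of VkQueueFamilyProperties structures.

If pQueueFamilyProperties is NULL, then the number of queue families available is returned in pQueueFamilyPropertyCount. Otherwise, pQueueFamilyPropertyCount must point to a variable set by the user to the number of elements in the pQueueFamilyProperties array, and on return the variable is overwritten with the number of structures actually written to pQueueFamilyProperties. If pQueueFamilyPropertyCount is less than the number of queue families available, at most pQueueFamilyPropertyCount structures will be written.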

Valid Usage (Implicit)
  • physicalDevice must be a valid VkPhysicalDevice handle

  • pQueueFamilyPropertyCount must be a pointer to a uint32_t value

  • If the value referenced by pQueueFamilyPropertyCount is not 0, and pQueueFamilyProperties is not NULL, pQueueFamilyProperties must be a pointer to an array of pQueueFamilyPropertyCount VkQueueFamilyProperties structures

The VkQueueFamilyProperties structure is defined as:

typedef struct VkQueueFamilyProperties {
    VkQueueFlags    queueFlags;
    uint32_t        queueCount;
    uint32_t        timestampValidBits;
    VkExtent3D      minImageTransferGranularity;
} VkQueueFamilyProperties;
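  • queueFlags contains flags indicating the capabilities of the queues in this queue family.

  • queueCount is the number of queues in the queue family.

  • timestampValidBits is the number of meaningful bits in the timestamps written via vkCmdWriteTimestamp. The valid range for the count is 36 to 64 bits, or a value of 0, indicating no support for timestamps. Bits outside the valid range are guaranteed to be zero.

  • minImageTransferGranularity is the minimum granularity supported for image transfer operations on the queues in this queue family.

The bits specified in queueFlags are: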

typedef enum VkQueueFlagBits {
    VK_QUEUE_GRAPHICS_BIT = 0x00000001,
    VK_QUEUE_COMPUTE_BIT = 0x00000002,
    VK_QUEUE_TRANSFER_BIT = 0x00000004,
    VK_QUEUE_SPARSE_BINDING_BIT = 0x00000008,
} VkQueueFlagBits;
  • if VK_QUEUE_GRAPHICS_BIT is set, then the queues in this queue family support graphics operations.

  • if VK_QUEUE_COMPUTE_BIT is set, then the queues in this queue family support compute operations.

  • if VK_QUEUE_TRANSFER_BIT is set, then the queues in this queue family support transfer operations.

  • if VK_QUEUE_SPARSE_BINDING_BIT is set, then the queues in this queue family support sparse memory management operations (see Sparse Resources). If any of the sparse resource features are enabled, then at least one queue family must support this bit.

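If an implementation exposes any queue family that supports graphics operations, then at least one queue family of at least one physical device exposed by the implementation must support both graphics and compute operations.

Note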

All commands that are allowed on a queue that supports transfer operations are also allowed on a queue that supports either graphics or compute operations thus if the capabilities of a queue family include VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT then reporting the VK_QUEUE_TRANSFER_BIT capability separately for that queue family is optional.

For further details see Queues.
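
A common use of queueFlags is picking a queue family that offers every capability an application needs. The sketch below uses local stand-ins for the flag bits and for VkQueueFamilyProperties so that it compiles without the Vulkan headers; the helper names are hypothetical. It also encodes the note above: a family reporting graphics or compute is treated as transfer-capable even if it does not set the transfer bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the VkQueueFlagBits values listed above. */
#define QUEUE_GRAPHICS_BIT       0x00000001u
#define QUEUE_COMPUTE_BIT        0x00000002u
#define QUEUE_TRANSFER_BIT       0x00000004u
#define QUEUE_SPARSE_BINDING_BIT 0x00000008u

/* Reduced stand-in for VkQueueFamilyProperties. */
typedef struct {
    uint32_t queueFlags;
    uint32_t queueCount;
} QueueFamily;

/* Transfer support is implied by graphics or compute support. */
static bool supports_transfer(uint32_t flags) {
    return (flags & (QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT | QUEUE_TRANSFER_BIT)) != 0;
}

/* Returns the index of the first family with at least one queue and all
 * bits in `required` set, or -1 if there is none. */
static int32_t find_queue_family(const QueueFamily *families, uint32_t count,
                                 uint32_t required) {
    for (uint32_t i = 0; i < count; ++i) {
        if (families[i].queueCount > 0 &&
            (families[i].queueFlags & required) == required) {
            return (int32_t)i;
        }
    }
    return -1;
}
```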

The value returned in minImageTransferGranularity has a unit of compressed texel blocks for images having a block-compressed format, and a unit of texels otherwise.

Possible values of minImageTransferGranularity are:

  • (0,0,0) which indicates that only whole mip levels must be transferred using the image transfer operations on the corresponding queues. In this case, the following restrictions apply to all offset and extent parameters of image transfer operations:

    • The x, y, and z members of a VkOffset3D parameter must always be zero.

    • The width, height, and depth members of a VkExtent3D parameter must always match the width, height, and depth of the image subresource corresponding to the parameter, respectively.

  • (Ax, Ay, Az) where Ax, Ay, and Az are all integer powers of two. In this case the following restrictions apply to all image transfer operations:

    • x, y, and z of a VkOffset3D parameter must be integer multiples of Ax, Ay, and Az, respectively.

    • width of a VkExtent3D parameter must be an integer multiple of Ax, or else x + width must equal the width of the image subresource corresponding to the parameter.

    • height of a VkExtent3D parameter must be an integer multiple of Ay, or else y + height must equal the height of the image subresource corresponding to the parameter.

    • depth of a VkExtent3D parameter must be an integer multiple of Az, or else z + depth must equal the depth of the image subresource corresponding to the parameter.

    • If the format of the image corresponding to the parameters is one of the block-compressed formats then for the purposes of the above calculations the granularity must be scaled up by the compressed texel block dimensions.

Queues supporting graphics and/or compute operations must report (1,1,1) in minImageTransferGranularity, meaning that there are no additional restrictions on the granularity of image transfer operations for these queues. Other queues supporting image transfer operations are only required to support whole mip level transfers, thus minImageTransferGranularity for queues belonging to such queue families may be (0,0,0).
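
The granularity rules above can be expressed as a per-axis check. The following C sketch is illustrative only: the type and function names are invented, offsets are assumed to be non-negative (so unsigned integers are used throughout), and the extra scaling for block-compressed formats is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Three unsigned components; used here for offsets, extents,
 * granularities and subresource sizes alike. */
typedef struct { uint32_t x, y, z; } Vec3u;

/* Checks one axis. A granularity of 0 means "whole mip level only":
 * the offset must be 0 and the extent must equal the subresource size.
 * Otherwise the offset must be a multiple of the granularity, and the
 * extent must either be a multiple too or reach the subresource edge. */
static bool axis_ok(uint32_t offset, uint32_t extent,
                    uint32_t granularity, uint32_t size) {
    if (granularity == 0) {
        return offset == 0 && extent == size;
    }
    if (offset % granularity != 0) {
        return false;
    }
    return extent % granularity == 0 || (uint64_t)offset + extent == size;
}

/* Applies the per-axis rule to x/width, y/height and z/depth. */
static bool transfer_region_ok(Vec3u offset, Vec3u extent,
                               Vec3u granularity, Vec3u size) {
    return axis_ok(offset.x, extent.x, granularity.x, size.x) &&
           axis_ok(offset.y, extent.y, granularity.y, size.y) &&
           axis_ok(offset.z, extent.z, granularity.z, size.z);
}
```

For a 100x64x1 subresource and a granularity of (8,8,1), a region at x = 96 with width 4 is accepted because it ends exactly at the subresource edge, while the same width at x = 4 is rejected.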

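The Device Memory section describes memory properties queried from the physical device.

For physical device feature queries see the Features chapter.

4.2. Devices

Device objects represent logical connections to physical devices. Each device exposes a number of queue families, each having one or more queues. All queues in a queue family support the same operations.

As described in Physical Devices, a Vulkan application will first query for all physical devices in a system. Each physical device can then be queried for its capabilities, including its queue and queue family properties. Once an acceptable physical device is identified, an application will create a corresponding logical device. An application must create a separate logical device for each physical device it will use. The created logical device is then the primary interface to the physical device.

How to enumerate the physical devices in a system and query those physical devices for their queue family properties is described in the Physical Device Enumeration section above.

4.2.1. Device Creation

Logical devices are represented by VkDevice handles:

VK_DEFINE_HANDLE(VkDevice)

A logical device is created as a connection to a physical device. To create a logical device, call: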

VkResult vkCreateDevice(
    VkPhysicalDevice                            physicalDevice,
    const VkDeviceCreateInfo*                   pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkDevice*                                   pDevice);
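  • physicalDevice must be one of the device handles returned from a call to vkEnumeratePhysicalDevices (see Physical Device Enumeration).

  • pCreateInfo is a pointer to a VkDeviceCreateInfo structure containing information about how to create the device.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pDevice points to a handle in which the created VkDevice is returned.

Multiple logical devices can be created from the same physical device. Logical device creation may fail due to lack of device-specific resources (in addition to the other errors). If that occurs, vkCreateDevice will return VK_ERROR_TOO_MANY_OBJECTS.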

Valid Usage (Implicit)
  • physicalDevice must be a valid VkPhysicalDevice handle

  • pCreateInfo must be a pointer to a valid VkDeviceCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pDevice must be a pointer to a VkDevice handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_INITIALIZATION_FAILED

  • VK_ERROR_EXTENSION_NOT_PRESENT

  • VK_ERROR_FEATURE_NOT_PRESENT

  • VK_ERROR_TOO_MANY_OBJECTS

  • VK_ERROR_DEVICE_LOST

The VkDeviceCreateInfo structure is defined as:

typedef struct VkDeviceCreateInfo {
    VkStructureType                    sType;
    const void*                        pNext;
    VkDeviceCreateFlags                flags;
    uint32_t                           queueCreateInfoCount;
    const VkDeviceQueueCreateInfo*     pQueueCreateInfos;
    uint32_t                           enabledLayerCount;
    const char* const*                 ppEnabledLayerNames;
    uint32_t                           enabledExtensionCount;
    const char* const*                 ppEnabledExtensionNames;
    const VkPhysicalDeviceFeatures*    pEnabledFeatures;
} VkDeviceCreateInfo;
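  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • queueCreateInfoCount is the size of the pQueueCreateInfos array. Refer to the Queue Creation section below for further details.

  • pQueueCreateInfos is a pointer to an array of VkDeviceQueueCreateInfo structures describing the queues that are requested to be created along with the logical device. Refer to the Queue Creation section below for further details.

  • enabledLayerCount is deprecated and ignored.

  • ppEnabledLayerNames is deprecated and ignored. See Device Layer Deprecation.

  • enabledExtensionCount is the number of device extensions to enable.

  • ppEnabledExtensionNames is a pointer to an array of enabledExtensionCount null-terminated UTF-8 strings containing the names of extensions to enable for the created device. See the Extensions section for further details.

  • pEnabledFeatures is NULL or a pointer to a VkPhysicalDeviceFeatures structure that contains boolean indicators of all the features to be enabled. Refer to the Features section for further details.

Valid Usage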
  • The queueFamilyIndex member of any given element of pQueueCreateInfos must be unique within pQueueCreateInfos

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • pQueueCreateInfos must be a pointer to an array of queueCreateInfoCount valid VkDeviceQueueCreateInfo structures

  • If enabledLayerCount is not 0, ppEnabledLayerNames must be a pointer to an array of enabledLayerCount null-terminated strings

  • If enabledExtensionCount is not 0, ppEnabledExtensionNames must be a pointer to an array of enabledExtensionCount null-terminated strings

  • If pEnabledFeatures is not NULL, pEnabledFeatures must be a pointer to a valid VkPhysicalDeviceFeatures structure

  • queueCreateInfoCount must be greater than 0

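4.2.2. Device Use

The following is a high-level list of VkDevice uses along with references on where to find more information:

4.2.3. Lost Device

A logical device may become lost because of hardware errors, execution timeouts, power management events and/or platform-specific events. This may cause pending and future command execution to fail and cause hardware resources to be corrupted. When this happens, certain commands will return VK_ERROR_DEVICE_LOST (see Error Codes). After any such event, the logical device is considered lost. It is not possible to reset the logical device to a non-lost state; however, the lost state is specific to a logical device (VkDevice), and the corresponding physical device (VkPhysicalDevice) may be otherwise unaffected. In some cases, the physical device may also be lost, and attempting to create a new logical device will fail, returning VK_ERROR_DEVICE_LOST. This is usually indicative of a problem with the underlying hardware, or its connection to the host. If the physical device has not been lost, and a new logical device is successfully created from that physical device, it must be in the non-lost state.

Note

Whilst logical device loss may be recoverable, in the case of physical device loss it is unlikely that an application will be able to recover unless additional, unaffected physical devices exist on the system. The error is largely informational and intended only to inform the user that their hardware has probably developed a fault or become physically disconnected, and should be investigated further. In many cases, physical device loss may cause other more serious issues such as the operating system crashing, in which case it may not be reported via the Vulkan API.

Note

Undefined behavior caused by an application error may cause a device to become lost. However, such undefined behavior may also cause unrecoverable damage to the process, and it is then not guaranteed that the API objects, including the VkPhysicalDevice or the VkInstance, are still valid or that the error is recoverable.

When a device is lost, its child objects are not implicitly destroyed and their handles are still valid. Those objects must still be destroyed before their parents or the device can be destroyed (see the Object Lifetime section). The host address space corresponding to device memory mapped using vkMapMemory is still valid, and host memory accesses to these mapped regions are still valid, but the contents are undefined. It is still legal to call any API command on the device and child objects.

Once a device is lost, command execution may fail, and commands that return a VkResult may return VK_ERROR_DEVICE_LOST. Commands that do not allow run-time errors must still operate correctly for valid usage and, if applicable, return valid data.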

Commands that wait indefinitely for device execution (namely vkDeviceWaitIdle, vkQueueWaitIdle, vkWaitForFences with a maximum timeout, and vkGetQueryPoolResults with the VK_QUERY_RESULT_WAIT_BIT bit set in flags) must return in finite time even in the case of a lost device, and return either VK_SUCCESS or VK_ERROR_DEVICE_LOST. For any command that may return VK_ERROR_DEVICE_LOST, for the purpose of determining whether a command buffer is pending execution, or whether resources are considered in-use by the device, a return value of VK_ERROR_DEVICE_LOST is equivalent to VK_SUCCESS.

4.2.4. Device Destruction

To destroy a device, call:

void vkDestroyDevice(
    VkDevice                                    device,
    const VkAllocationCallbacks*                pAllocator);
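  • device is the logical device to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

To ensure that no work is active on the device, vkDeviceWaitIdle can be used to gate the destruction of the device. Prior to destroying a device, an application is responsible for destroying or freeing any Vulkan objects that were created using that device as the first parameter of the corresponding vkCreate* or vkAllocate* command.

Note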

The lifetime of each of these objects is bound by the lifetime of the VkDevice object. Therefore, to avoid resource leaks, it is critical that an application explicitly free all of these resources prior to calling vkDestroyDevice.

Valid Usage
  • All child objects created on device must have been destroyed prior to destroying device

  • If VkAllocationCallbacks were provided when device was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when device was created, pAllocator must be NULL

Valid Usage (Implicit)
  • If device is not NULL, device must be a valid VkDevice handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

Host Synchronization
  • Host access to device must be externally synchronized

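4.3. Queues

4.3.1. Queue Family Properties

As discussed in the Physical Device Enumeration section above, the vkGetPhysicalDeviceQueueFamilyProperties command is used to retrieve details about the queue families and queues supported by a device.

Each index in the pQueueFamilyProperties array returned by vkGetPhysicalDeviceQueueFamilyProperties describes a unique queue family on that physical device. These indices are used when creating queues, and they correspond directly with the queueFamilyIndex that is passed to the vkCreateDevice command via the VkDeviceQueueCreateInfo structure, as described in the Queue Creation section below.

Grouping of queue families within a physical device is implementation-dependent.

Note

The general expectation is that a physical device groups all queues of matching capabilities into a single family. However, this is only a recommendation to implementations, and it is possible that a physical device may return two separate queue families with the same capabilities.

Once an application has identified a physical device with the queue(s) that it desires to use, it will create those queues in conjunction with a logical device. This is described in the following section.

4.3.2. Queue Creation

Creating a logical device also creates the queues associated with that device. The queues to create are described by a set of VkDeviceQueueCreateInfo structures that are passed to vkCreateDevice in pQueueCreateInfos.

Queues are represented by VkQueue handles: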

VK_DEFINE_HANDLE(VkQueue)

The VkDeviceQueueCreateInfo structure is defined as:

typedef struct VkDeviceQueueCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkDeviceQueueCreateFlags    flags;
    uint32_t                    queueFamilyIndex;
    uint32_t                    queueCount;
    const float*                pQueuePriorities;
} VkDeviceQueueCreateInfo;
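  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • queueFamilyIndex is an unsigned integer indicating the index of the queue family to create on this device. This index corresponds to the index of an element of the pQueueFamilyProperties array that was returned by vkGetPhysicalDeviceQueueFamilyProperties.

  • queueCount is an unsigned integer specifying the number of queues to create in the queue family indicated by queueFamilyIndex.

  • pQueuePriorities is an array of queueCount normalized floating-point values, specifying priorities of work that will be submitted to each created queue. See Queue Priority for more information.

Valid Usage
  • queueFamilyIndex must be less than pQueueFamilyPropertyCount returned by vkGetPhysicalDeviceQueueFamilyProperties

  • queueCount must be less than or equal to the queueCount member of the VkQueueFamilyProperties structure, as returned by vkGetPhysicalDeviceQueueFamilyProperties in pQueueFamilyProperties[queueFamilyIndex]

  • Each element of pQueuePriorities must be between 0.0 and 1.0 inclusive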

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • pQueuePriorities must be a pointer to an array of queueCount float values

  • queueCount must be greater than 0

To retrieve a handle to a VkQueue object, call:

void vkGetDeviceQueue(
    VkDevice                                    device,
    uint32_t                                    queueFamilyIndex,
    uint32_t                                    queueIndex,
    VkQueue*                                    pQueue);
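  • device is the logical device that owns the queue.

  • queueFamilyIndex is the index of the queue family to which the queue belongs.

  • queueIndex is the index within this queue family of the queue to retrieve.

  • pQueue is a pointer to a VkQueue object that will be filled with the handle for the requested queue.

Valid Usage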
  • queueFamilyIndex must be one of the queue family indices specified when device was created, via the VkDeviceQueueCreateInfo structure

  • queueIndex must be less than the number of queues created for the specified queue family index when device was created, via the queueCount member of the VkDeviceQueueCreateInfo structure

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pQueue must be a pointer to a VkQueue handle

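4.3.3. Queue Family Index

The queue family index is used in multiple places in Vulkan in order to tie operations to a specific family of queues.

When retrieving a handle to the queue via vkGetDeviceQueue, the queue family index is used to select which queue family to retrieve the VkQueue handle from, as described in the previous section.

When creating a VkCommandPool object (see Command Pools), a queue family index is specified in the VkCommandPoolCreateInfo structure. Command buffers from this pool can only be submitted on queues corresponding to this queue family.

When creating VkImage (see Images) and VkBuffer (see Buffers) resources, a set of queue families is included in the VkImageCreateInfo and VkBufferCreateInfo structures to specify the queue families that can access the resource.

When inserting a VkBufferMemoryBarrier or VkImageMemoryBarrier (see Events), a source and destination queue family index is specified to allow the ownership of a buffer or image to be transferred from one queue family to another. See the Resource Sharing section for details.

4.3.4. Queue Priority

Each queue is assigned a priority, as set in the VkDeviceQueueCreateInfo structures when creating the device. The priority of each queue is a normalized floating-point value between 0.0 and 1.0, which is then translated to a discrete priority level by the implementation. Higher values indicate a higher priority, with 0.0 being the lowest priority and 1.0 being the highest. Within the same device, queues with higher priority may be allotted more processing time than queues with lower priority. The implementation makes no guarantees with regards to ordering or scheduling among queues with the same priority, other than the constraints defined by any explicit synchronization primitives. The implementation makes no guarantees with regards to queues across different devices.

An implementation may allow a higher-priority queue to starve a lower-priority queue on the same VkDevice until the higher-priority queue has no further commands to execute. The relationship of queue priorities must not cause queues on one VkDevice to starve queues on another VkDevice.

No specific guarantees are made about higher-priority queues receiving more processing time or better quality of service than lower-priority queues.

4.3.5. Queue Submission

Work is submitted to a queue via queue submission commands such as vkQueueSubmit. Queue submission commands define a set of queue operations to be executed by the underlying physical device, including synchronization with semaphores and fences.

Submission commands take as parameters a target queue, zero or more batches of work, and an optional fence to signal upon completion. Each batch consists of three distinct parts: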

  1. Zero or more semaphores to wait on before execution of the rest of the batch.

  2. Zero or more work items to execute.

    • If present, these describe a queue operation matching the work described.

  3. Zero or more semaphores to signal upon completion of the work items.

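If a fence is present in a queue submission, it describes a fence signal operation.

All work described by a queue submission command must be submitted to the queue before the command returns.

Sparse Memory Binding

In Vulkan it is possible to sparsely bind memory to buffers and images, as described in the Sparse Resource chapter. Sparse memory binding is a queue operation. A queue whose flags include VK_QUEUE_SPARSE_BINDING_BIT must be able to support the mapping of a virtual address to a physical address on the device. This causes an update to the page table mappings on the device. This update must be synchronized on a queue to avoid corrupting page table mappings during execution of graphics commands. By binding the sparse memory resources on queues, all commands that are dependent on the updated bindings are synchronized to only execute after the binding is updated. See the Synchronization and Cache Control chapter for how this synchronization is accomplished.

4.3.6. Queue Destruction

Queues are created along with a logical device during vkCreateDevice. All queues associated with a logical device are destroyed when vkDestroyDevice is called on that device.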

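5. Command Buffers

Command buffers are objects used to record commands which can be subsequently submitted to a device queue for execution. There are two levels of command buffers: primary command buffers, which can execute secondary command buffers and which are submitted to queues, and secondary command buffers, which can be executed by primary command buffers and which are not directly submitted to queues.

Command buffers are represented by VkCommandBuffer handles: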

VK_DEFINE_HANDLE(VkCommandBuffer)

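Recorded commands include commands to bind pipelines and descriptor sets to the command buffer, commands to modify dynamic state, commands to draw, commands to dispatch, commands to execute secondary command buffers, commands to copy buffers and images, and other commands.

Each command buffer manages state independently of other command buffers. There is no inheritance of state across primary and secondary command buffers, or between secondary command buffers. When a command buffer begins recording, all state in that command buffer is undefined. When secondary command buffers are recorded to execute on a primary command buffer, the secondary command buffer inherits no state from the primary command buffer, and all state of the primary command buffer is undefined after an execute secondary command buffer command is recorded. There is one exception to this rule: if the primary command buffer is inside a render pass instance, then the render pass and subpass state is not disturbed by executing secondary command buffers. Whenever the state of a command buffer is undefined, the application must set all relevant state on the command buffer before any state-dependent commands such as draws and dispatches are recorded, otherwise the behavior of executing that command buffer is undefined.

Unless otherwise specified, and without explicit synchronization, the various commands submitted to a queue via command buffers may execute in arbitrary order relative to each other, and/or concurrently. Also, the memory side-effects of those commands may not be directly visible to other commands without explicit memory dependencies. This is true within a command buffer, and across command buffers submitted to a given queue. See the synchronization chapter for information on implicit and explicit synchronization between commands.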

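Each command buffer is always in one of three states:

  • Initial state: vkBeginCommandBuffer has never been called, or the command buffer has been reset since it last recorded commands.

  • Recording state: between vkBeginCommandBuffer and vkEndCommandBuffer. The command buffer can record commands.

  • Executable state: after vkEndCommandBuffer. The command buffer has finished recording commands and can be executed.

Resetting a command buffer is an operation that discards any previously recorded commands and puts a command buffer in the initial state. Resetting occurs as a result of vkResetCommandBuffer or vkResetCommandPool, or as part of vkBeginCommandBuffer (which additionally puts the command buffer in the recording state).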
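
The lifecycle through the initial, recording, and executable states can be sketched as a small state machine. This is only a model: the CbState and CbModel types and the cb_* functions are hypothetical names invented for this example, they stand in for vkBeginCommandBuffer, vkEndCommandBuffer and vkResetCommandBuffer without calling them, and the pending-execution restrictions are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only; these are not Vulkan types. */
typedef enum { CB_INITIAL, CB_RECORDING, CB_EXECUTABLE } CbState;

typedef struct {
    CbState state;
    /* Models a pool created with
     * VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT. */
    bool pool_allows_individual_reset;
} CbModel;

/* Models vkBeginCommandBuffer; returns false where the call would be invalid. */
bool cb_begin(CbModel *cb)
{
    assert(cb);
    if (cb->state == CB_RECORDING)
        return false; /* already recording */
    if (cb->state == CB_EXECUTABLE && !cb->pool_allows_individual_reset)
        return false; /* the implicit reset is not permitted for this pool */
    cb->state = CB_RECORDING; /* implicit reset (if needed), then record */
    return true;
}

/* Models vkEndCommandBuffer: recording -> executable. */
bool cb_end(CbModel *cb)
{
    assert(cb);
    if (cb->state != CB_RECORDING)
        return false;
    cb->state = CB_EXECUTABLE;
    return true;
}

/* Models vkResetCommandBuffer: any state -> initial, if the pool allows it. */
bool cb_reset(CbModel *cb)
{
    assert(cb);
    if (!cb->pool_allows_individual_reset)
        return false;
    cb->state = CB_INITIAL;
    return true;
}
```

Returning false stands in for a valid usage violation. A real application would get undefined behavior or a validation layer message instead of a return value.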

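5.1. Command Pools

Command pools are opaque objects that command buffer memory is allocated from, and which allow the implementation to amortize the cost of resource creation across multiple command buffers. Command pools are externally synchronized, meaning that a command pool must not be used concurrently in multiple threads. That includes use via recording commands on any command buffers allocated from the pool, as well as operations that allocate, free, and reset command buffers or the pool itself.

Command pools are represented by VkCommandPool handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkCommandPool)

To create a command pool, call: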

VkResult vkCreateCommandPool(
    VkDevice                                    device,
    const VkCommandPoolCreateInfo*              pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkCommandPool*                              pCommandPool);
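  • device is the logical device that creates the command pool.

  • pCreateInfo contains information used to create the command pool.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pCommandPool points to a VkCommandPool handle in which the created pool is returned.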

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkCommandPoolCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pCommandPool must be a pointer to a VkCommandPool handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkCommandPoolCreateInfo structure is defined as:

typedef struct VkCommandPoolCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkCommandPoolCreateFlags    flags;
    uint32_t                    queueFamilyIndex;
} VkCommandPoolCreateInfo;
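  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is a bitmask indicating usage behavior for the pool and command buffers allocated from it. Bits which can be set include: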

    typedef enum VkCommandPoolCreateFlagBits {
        VK_COMMAND_POOL_CREATE_TRANSIENT_BIT = 0x00000001,
        VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT = 0x00000002,
    } VkCommandPoolCreateFlagBits;
    • VK_COMMAND_POOL_CREATE_TRANSIENT_BIT indicates that command buffers allocated from the pool will be short-lived, meaning that they will be reset or freed in a relatively short timeframe. This flag may be used by the implementation to control memory allocation behavior within the pool.

    • VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT controls whether command buffers allocated from the pool can be individually reset. If this flag is set, individual command buffers allocated from the pool can be reset either explicitly, by calling vkResetCommandBuffer, or implicitly, by calling vkBeginCommandBuffer on an executable command buffer. If this flag is not set, then vkResetCommandBuffer and vkBeginCommandBuffer (on an executable command buffer) must not be called on the command buffers allocated from the pool, and they can only be reset in bulk by calling vkResetCommandPool.

  • queueFamilyIndex designates a queue family as described in section Queue Family Properties. All command buffers allocated from this command pool must be submitted on queues from the same queue family.

Valid Usage
  • queueFamilyIndex must be the index of a queue family available in the calling command’s device parameter

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO

  • pNext must be NULL

  • flags must be a valid combination of VkCommandPoolCreateFlagBits values

To reset a command pool, call:

VkResult vkResetCommandPool(
    VkDevice                                    device,
    VkCommandPool                               commandPool,
    VkCommandPoolResetFlags                     flags);
  • device is the logical device that owns the command pool.

  • commandPool is the command pool to reset.

  • flags contains additional flags controlling the behavior of the reset. Bits which can be set include:

    typedef enum VkCommandPoolResetFlagBits {
        VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT = 0x00000001,
    } VkCommandPoolResetFlagBits;

    If flags includes VK_COMMAND_POOL_RESET_RELEASE_RESOURCES_BIT, resetting a command pool recycles all of the resources from the command pool back to the system.

Resetting a command pool recycles all of the resources from all of the command buffers allocated from the command pool back to the command pool. All command buffers that have been allocated from the command pool are put in the initial state.

Valid Usage
  • All VkCommandBuffer objects allocated from commandPool must not currently be pending execution

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • commandPool must be a valid VkCommandPool handle

  • flags must be a valid combination of VkCommandPoolResetFlagBits values

  • commandPool must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to commandPool must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

To destroy a command pool, call:

void vkDestroyCommandPool(
    VkDevice                                    device,
    VkCommandPool                               commandPool,
    const VkAllocationCallbacks*                pAllocator);
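  • device is the logical device that destroys the command pool.

  • commandPool is the handle of the command pool to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

When a pool is destroyed, all command buffers allocated from the pool are implicitly freed and become invalid. Command buffers allocated from a given pool do not need to be freed before destroying that command pool.

Valid Usage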
  • All VkCommandBuffer objects allocated from commandPool must not be pending execution

  • If VkAllocationCallbacks were provided when commandPool was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when commandPool was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If commandPool is not VK_NULL_HANDLE, commandPool must be a valid VkCommandPool handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If commandPool is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to commandPool must be externally synchronized

5.2. Command Buffer Allocation and Management

To allocate command buffers, call:

VkResult vkAllocateCommandBuffers(
    VkDevice                                    device,
    const VkCommandBufferAllocateInfo*          pAllocateInfo,
    VkCommandBuffer*                            pCommandBuffers);
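  • device is the logical device that owns the command pool.

  • pAllocateInfo is a pointer to an instance of the VkCommandBufferAllocateInfo structure describing parameters of the allocation.

  • pCommandBuffers is a pointer to an array of VkCommandBuffer handles in which the resulting command buffer objects are returned. The array must be at least the length specified by the commandBufferCount member of pAllocateInfo. Each allocated command buffer begins in the initial state.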

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pAllocateInfo must be a pointer to a valid VkCommandBufferAllocateInfo structure

  • pCommandBuffers must be a pointer to an array of pAllocateInfo::commandBufferCount VkCommandBuffer handles

Host Synchronization
  • Host access to pAllocateInfo::commandPool must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkCommandBufferAllocateInfo structure is defined as:

typedef struct VkCommandBufferAllocateInfo {
    VkStructureType         sType;
    const void*             pNext;
    VkCommandPool           commandPool;
    VkCommandBufferLevel    level;
    uint32_t                commandBufferCount;
} VkCommandBufferAllocateInfo;
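  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • commandPool is the command pool from which the command buffers are allocated.

  • level determines whether the command buffers are primary or secondary command buffers. Possible values include: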

    typedef enum VkCommandBufferLevel {
        VK_COMMAND_BUFFER_LEVEL_PRIMARY = 0,
        VK_COMMAND_BUFFER_LEVEL_SECONDARY = 1,
    } VkCommandBufferLevel;
  • commandBufferCount is the number of command buffers to allocate from the pool.

Valid Usage
  • commandBufferCount must be greater than 0

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO

  • pNext must be NULL

  • commandPool must be a valid VkCommandPool handle

  • level must be a valid VkCommandBufferLevel value

To reset command buffers, call:

VkResult vkResetCommandBuffer(
    VkCommandBuffer                             commandBuffer,
    VkCommandBufferResetFlags                   flags);
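  • commandBuffer is the command buffer to reset. The command buffer can be in any state, and is put in the initial state.

  • flags is a bitmask controlling the reset operation. Bits which can be set include:

    typedef enum VkCommandBufferResetFlagBits {
        VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT = 0x00000001,
    } VkCommandBufferResetFlagBits;

    If flags includes VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT, then most or all memory resources currently owned by the command buffer should be returned to the parent command pool. If this flag is not set, then the command buffer may hold onto memory resources and reuse them when recording commands.

Valid Usage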
  • commandBuffer must not currently be pending execution

  • commandBuffer must have been allocated from a pool that was created with the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT

Valid Usage (Implicit)
Host Synchronization
  • Host access to commandBuffer must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

To free command buffers, call:

void vkFreeCommandBuffers(
    VkDevice                                    device,
    VkCommandPool                               commandPool,
    uint32_t                                    commandBufferCount,
    const VkCommandBuffer*                      pCommandBuffers);
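  • device is the logical device that owns the command pool.

  • commandPool is the command pool that the command buffers were allocated from.

  • commandBufferCount is the length of the pCommandBuffers array.

  • pCommandBuffers is an array of handles of command buffers to free.

Valid Usage
  • All elements of pCommandBuffers must not be pending execution

  • pCommandBuffers must be a pointer to an array of commandBufferCount VkCommandBuffer handles, each element of which must either be a valid handle or NULL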

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • commandPool must be a valid VkCommandPool handle

  • commandBufferCount must be greater than 0

  • commandPool must have been created, allocated, or retrieved from device

  • Each element of pCommandBuffers that is a valid handle must have been created, allocated, or retrieved from commandPool

Host Synchronization
  • Host access to commandPool must be externally synchronized

  • Host access to each member of pCommandBuffers must be externally synchronized

5.3. Command Buffer Recording

To begin recording a command buffer, call:

VkResult vkBeginCommandBuffer(
    VkCommandBuffer                             commandBuffer,
    const VkCommandBufferBeginInfo*             pBeginInfo);
  • commandBuffer is the handle of the command buffer which is to be put in the recording state.

  • pBeginInfo is an instance of the VkCommandBufferBeginInfo structure, which defines additional information about how the command buffer begins recording.

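Valid Usage
  • commandBuffer must not be in the recording state

  • commandBuffer must not currently be pending execution

  • If commandBuffer was allocated from a VkCommandPool which did not have the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag set, commandBuffer must be in the initial state

  • If commandBuffer is a secondary command buffer, the pInheritanceInfo member of pBeginInfo must be a valid VkCommandBufferInheritanceInfo structure

  • If commandBuffer is a secondary command buffer and either the occlusionQueryEnable member of the pInheritanceInfo member of pBeginInfo is VK_FALSE, or the precise occlusion queries feature is not enabled, the queryFlags member of the pInheritanceInfo member of pBeginInfo must not contain VK_QUERY_CONTROL_PRECISE_BIT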

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pBeginInfo must be a pointer to a valid VkCommandBufferBeginInfo structure

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkCommandBufferBeginInfo structure is defined as:

typedef struct VkCommandBufferBeginInfo {
    VkStructureType                          sType;
    const void*                              pNext;
    VkCommandBufferUsageFlags                flags;
    const VkCommandBufferInheritanceInfo*    pInheritanceInfo;
} VkCommandBufferBeginInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is a bitmask indicating usage behavior for the command buffer. Bits which can be set include:

    typedef enum VkCommandBufferUsageFlagBits {
        VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT = 0x00000001,
        VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT = 0x00000002,
        VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT = 0x00000004,
    } VkCommandBufferUsageFlagBits;
    • VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT indicates that each recording of the command buffer will only be submitted once, and the command buffer will be reset and recorded again between each submission.

    • VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT indicates that a secondary command buffer is considered to be entirely inside a render pass. If this is a primary command buffer, then this bit is ignored.

    • Setting VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT allows the command buffer to be resubmitted to a queue or recorded into a primary command buffer while it is pending execution.

  • pInheritanceInfo is a pointer to a VkCommandBufferInheritanceInfo structure, which is used if commandBuffer is a secondary command buffer. If this is a primary command buffer, then this value is ignored.

Valid Usage
  • If flags contains VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, the renderPass member of pInheritanceInfo must be a valid VkRenderPass

  • If flags contains VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, the subpass member of pInheritanceInfo must be a valid subpass index within the renderPass member of pInheritanceInfo

  • If flags contains VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT, the framebuffer member of pInheritanceInfo must be either VK_NULL_HANDLE, or a valid VkFramebuffer that is compatible with the renderPass member of pInheritanceInfo

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO

  • pNext must be NULL

  • flags must be a valid combination of VkCommandBufferUsageFlagBits values

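If the command buffer is a secondary command buffer, then the VkCommandBufferInheritanceInfo structure defines any state that will be inherited from the primary command buffer: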

typedef struct VkCommandBufferInheritanceInfo {
    VkStructureType                  sType;
    const void*                      pNext;
    VkRenderPass                     renderPass;
    uint32_t                         subpass;
    VkFramebuffer                    framebuffer;
    VkBool32                         occlusionQueryEnable;
    VkQueryControlFlags              queryFlags;
    VkQueryPipelineStatisticFlags    pipelineStatistics;
} VkCommandBufferInheritanceInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • renderPass is a VkRenderPass object defining which render passes the VkCommandBuffer will be compatible with and can be executed within. If the VkCommandBuffer will not be executed within a render pass instance, renderPass is ignored.

  • subpass is the index of the subpass within the render pass instance that the VkCommandBuffer will be executed within. If the VkCommandBuffer will not be executed within a render pass instance, subpass is ignored.

  • framebuffer optionally refers to the VkFramebuffer object that the VkCommandBuffer will be rendering to if it is executed within a render pass instance. It can be VK_NULL_HANDLE if the framebuffer is not known, or if the VkCommandBuffer will not be executed within a render pass instance.

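    Note

    Specifying the exact framebuffer that the secondary command buffer will be executed with may result in better performance at command buffer execution time.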

  • occlusionQueryEnable indicates whether the command buffer can be executed while an occlusion query is active in the primary command buffer. If this is VK_TRUE, then this command buffer can be executed whether the primary command buffer has an occlusion query active or not. If this is VK_FALSE, then the primary command buffer must not have an occlusion query active.

  • queryFlags indicates the query flags that can be used by an active occlusion query in the primary command buffer when this secondary command buffer is executed. If this value includes the VK_QUERY_CONTROL_PRECISE_BIT bit, then the active query can return boolean results or actual sample counts. If this bit is not set, then the active query must not use the VK_QUERY_CONTROL_PRECISE_BIT bit.

  • pipelineStatistics indicates the set of pipeline statistics that can be counted by an active query in the primary command buffer when this secondary command buffer is executed. If this value includes a given bit, then this command buffer can be executed whether the primary command buffer has a pipeline statistics query active that includes this bit or not. If this value excludes a given bit, then the active pipeline statistics query must not be from a query pool that counts that statistic.

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_INFO

  • pNext must be NULL

  • Both of framebuffer, and renderPass that are valid handles must have been created, allocated, or retrieved from the same VkDevice

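A primary command buffer is considered to be pending execution from the time it is submitted via vkQueueSubmit until that submission completes.

A secondary command buffer is considered to be pending execution from the time its execution is recorded into a primary buffer (via vkCmdExecuteCommands) until the final time that primary buffer's submission to a queue completes. If, after the primary buffer completes, the secondary command buffer is recorded to execute on a different primary buffer, the first primary buffer must not be resubmitted until after it is reset with vkResetCommandBuffer, unless the secondary command buffer was recorded with VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT.

If VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT is not set on a secondary command buffer, that command buffer must not be used more than once in a given primary command buffer. Furthermore, if a secondary command buffer without VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT set is recorded to execute in a primary command buffer with VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT set, the primary command buffer must not be pending execution more than once at a time.

Note

On some implementations, not using the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT bit enables command buffers to be patched in-place if needed, rather than creating a copy of the command buffer.

If a command buffer is in the executable state and the command buffer was allocated from a command pool with the VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag set, then vkBeginCommandBuffer implicitly resets the command buffer, behaving as if vkResetCommandBuffer had been called with VK_COMMAND_BUFFER_RESET_RELEASE_RESOURCES_BIT not set. It then puts the command buffer in the recording state.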

Once recording starts, an application records a sequence of commands (vkCmd*) to set state in the command buffer, draw, dispatch, and other commands.

To complete recording of a command buffer, call:

VkResult vkEndCommandBuffer(
    VkCommandBuffer                             commandBuffer);
  • commandBuffer is the command buffer to complete recording. The command buffer must have been in the recording state, and is moved to the executable state.

If there was an error during recording, the application will be notified by an unsuccessful return code returned by vkEndCommandBuffer. If the application wishes to further use the command buffer, the command buffer must be reset.

Valid Usage
  • commandBuffer must be in the recording state

  • If commandBuffer is a primary command buffer, there must not be an active render pass instance

  • All queries made active during the recording of commandBuffer must have been made inactive

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

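When a command buffer is in the executable state, it can be submitted to a queue for execution.

5.4. Command Buffer Submission

To submit command buffers to a queue, call: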

VkResult vkQueueSubmit(
    VkQueue                                     queue,
    uint32_t                                    submitCount,
    const VkSubmitInfo*                         pSubmits,
    VkFence                                     fence);
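  • queue is the queue that the command buffers will be submitted to.

  • submitCount is the number of elements in the pSubmits array.

  • pSubmits is a pointer to an array of VkSubmitInfo structures, each specifying a command buffer submission batch.

  • fence is an optional handle to a fence to be signaled. If fence is not VK_NULL_HANDLE, it defines a fence signal operation.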

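Note

Submission can be a high overhead operation, and applications should attempt to batch work together into as few calls to vkQueueSubmit as possible.

vkQueueSubmit is a queue submission command, with each batch defined by an element of pSubmits as an instance of the VkSubmitInfo structure. Batches begin execution in the order they appear in pSubmits, but may complete out of order.

Fence and semaphore operations submitted with vkQueueSubmit have additional ordering constraints compared to other submission commands, with dependencies involving previous and subsequent queue operations. Information about these additional constraints can be found in the semaphore and fence sections of the synchronization chapter.

Details on the interaction of pWaitDstStageMask with synchronization are described in the semaphore wait operation section of the synchronization chapter.

The order that batches appear in pSubmits is used to determine submission order, and thus all the implicit ordering guarantees that respect it. Other than these implicit ordering guarantees and any explicit synchronization primitives, these batches may overlap or otherwise execute out of order.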

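Valid Usage
  • If fence is not VK_NULL_HANDLE, fence must be unsignaled

  • If fence is not VK_NULL_HANDLE, fence must not be associated with any other queue command that has not yet completed execution on that queue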

  • Any calls to vkCmdSetEvent, vkCmdResetEvent or vkCmdWaitEvents that have been recorded into any of the command buffer elements of the pCommandBuffers member of any element of pSubmits, must not reference any VkEvent that is referenced by any of those commands that is pending execution on another queue.

  • Any stage flag included in any element of the pWaitDstStageMask member of any element of pSubmits must be a pipeline stage supported by one of the capabilities of queue, as specified in the table of supported pipeline stages.

  • Any given element of the pSignalSemaphores member of any element of pSubmits must be unsignaled when the semaphore signal operation it defines is executed on the device

  • When a semaphore unsignal operation defined by any element of the pWaitSemaphores member of any element of pSubmits executes on queue, no other queue must be waiting on the same semaphore.

  • All elements of the pWaitSemaphores member of all elements of pSubmits must be semaphores that are signaled, or have semaphore signal operations previously submitted for execution.

Valid Usage (Implicit)
  • queue must be a valid VkQueue handle

  • If submitCount is not 0, pSubmits must be a pointer to an array of submitCount valid VkSubmitInfo structures

  • If fence is not VK_NULL_HANDLE, fence must be a valid VkFence handle

  • Both of fence, and queue that are valid handles must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to queue must be externally synchronized

  • Host access to pSubmits[].pWaitSemaphores[] must be externally synchronized

  • Host access to pSubmits[].pSignalSemaphores[] must be externally synchronized

  • Host access to fence must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

The VkSubmitInfo structure is defined as:

typedef struct VkSubmitInfo {
    VkStructureType                sType;
    const void*                    pNext;
    uint32_t                       waitSemaphoreCount;
    const VkSemaphore*             pWaitSemaphores;
    const VkPipelineStageFlags*    pWaitDstStageMask;
    uint32_t                       commandBufferCount;
    const VkCommandBuffer*         pCommandBuffers;
    uint32_t                       signalSemaphoreCount;
    const VkSemaphore*             pSignalSemaphores;
} VkSubmitInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • waitSemaphoreCount is the number of semaphores upon which to wait before executing the command buffers for the batch.

  • pWaitSemaphores is a pointer to an array of semaphores upon which to wait before the command buffers for this batch begin execution. If semaphores to wait on are provided, they define a semaphore wait operation.

  • pWaitDstStageMask is a pointer to an array of pipeline stages at which each corresponding semaphore wait will occur.

  • commandBufferCount is the number of command buffers to execute in the batch.

  • pCommandBuffers is a pointer to an array of command buffers to execute in the batch.

  • signalSemaphoreCount is the number of semaphores to be signaled once the commands specified in pCommandBuffers have completed execution.

  • pSignalSemaphores is a pointer to an array of semaphores which will be signaled when the command buffers for this batch have completed execution. If semaphores to be signaled are provided, they define a semaphore signal operation.

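The order that command buffers appear in pCommandBuffers is used to determine submission order, and thus all the implicit ordering guarantees that respect it. Other than these implicit ordering guarantees and any explicit synchronization primitives, these command buffers may overlap or otherwise execute out of order.

Valid Usage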
  • Any given element of pCommandBuffers must either have been recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT, or not currently be executing on the device

  • Any given element of pCommandBuffers must be in the executable state

  • If any given element of pCommandBuffers contains commands that execute secondary command buffers, those secondary command buffers must have been recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT, or not currently be executing on the device

  • If any given element of pCommandBuffers was recorded with VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, it must not have been previously submitted without re-recording that command buffer

  • If any given element of pCommandBuffers contains commands that execute secondary command buffers recorded with VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, each such secondary command buffer must not have been previously submitted without re-recording that command buffer

  • Any given element of pCommandBuffers must not contain commands that execute a secondary command buffer, if that secondary command buffer has been recorded in another primary command buffer after it was recorded into this VkCommandBuffer

  • Any given element of pCommandBuffers must have been allocated from a VkCommandPool that was created for the same queue family that the calling command’s queue belongs to

  • Any given element of pCommandBuffers must not have been allocated with VK_COMMAND_BUFFER_LEVEL_SECONDARY

  • If the geometry shaders feature is not enabled, any given element of pWaitDstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, any given element of pWaitDstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • Any given element of pWaitDstStageMask must not include VK_PIPELINE_STAGE_HOST_BIT.
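
Some of the pCommandBuffers rules above can be tracked by the application itself. The sketch below assumes a hypothetical per-command-buffer record (CmdBufInfo) and a hypothetical may_submit check; neither exists in Vulkan. The flag values are copied from VkCommandBufferUsageFlagBits and redefined locally so that the example compiles without the Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of VkCommandBufferUsageFlagBits values. */
#define USAGE_ONE_TIME_SUBMIT_BIT  0x00000001u
#define USAGE_SIMULTANEOUS_USE_BIT 0x00000004u

/* Hypothetical bookkeeping an application might keep per command buffer. */
typedef struct {
    uint32_t usage_flags;            /* flags used when recording began */
    bool     executable;             /* recording ended successfully */
    bool     executing_on_device;    /* an earlier submission has not completed */
    bool     submitted_since_record; /* submitted at least once since last recording */
} CmdBufInfo;

/* Returns true if the tracked state satisfies three of the submission rules:
 * executable state, simultaneous use, and one-time submit. */
bool may_submit(const CmdBufInfo *cb)
{
    assert(cb);
    if (!cb->executable)
        return false;
    if (cb->executing_on_device &&
        !(cb->usage_flags & USAGE_SIMULTANEOUS_USE_BIT))
        return false;
    if ((cb->usage_flags & USAGE_ONE_TIME_SUBMIT_BIT) &&
        cb->submitted_since_record)
        return false;
    return true;
}
```

The check covers only the three rules named in the comment. The queue family, secondary command buffer, and stage mask rules would need additional tracked state.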

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SUBMIT_INFO

  • pNext must be NULL

  • If waitSemaphoreCount is not 0, pWaitSemaphores must be a pointer to an array of waitSemaphoreCount valid VkSemaphore handles

  • If waitSemaphoreCount is not 0, pWaitDstStageMask must be a pointer to an array of waitSemaphoreCount valid combinations of VkPipelineStageFlagBits values

  • Each element of pWaitDstStageMask must not be 0

  • If commandBufferCount is not 0, pCommandBuffers must be a pointer to an array of commandBufferCount valid VkCommandBuffer handles

  • If signalSemaphoreCount is not 0, pSignalSemaphores must be a pointer to an array of signalSemaphoreCount valid VkSemaphore handles

  • Each of the elements of pCommandBuffers, the elements of pSignalSemaphores, and the elements of pWaitSemaphores that are valid handles must have been created, allocated, or retrieved from the same VkDevice
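
Two of the pWaitDstStageMask rules (each element must not be 0, and must not include VK_PIPELINE_STAGE_HOST_BIT) are simple enough to check before calling vkQueueSubmit. The function name wait_stage_masks_valid is an assumption made for this sketch, and the constant is a local copy of the VK_PIPELINE_STAGE_HOST_BIT value so that the example builds without the Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the VK_PIPELINE_STAGE_HOST_BIT value. */
#define PIPELINE_STAGE_HOST_BIT 0x00004000u

/* Hypothetical pre-submission check over a pWaitDstStageMask-style array. */
bool wait_stage_masks_valid(const uint32_t *masks, uint32_t count)
{
    assert(masks || count == 0);
    for (uint32_t i = 0; i < count; ++i) {
        if (masks[i] == 0)
            return false; /* each element must not be 0 */
        if (masks[i] & PIPELINE_STAGE_HOST_BIT)
            return false; /* the host stage is not allowed here */
    }
    return true;
}
```

The feature-dependent rules for geometry and tessellation stages are not covered. They would need the enabled device features as extra input.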

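5.5. Queue Forward Progress

The application must ensure that command buffer submissions will be able to complete without any subsequent operations by the application on any queue. After any call to vkQueueSubmit, for every queued wait on a semaphore there must be a prior signal of that semaphore that will not be consumed by a different wait on the semaphore.

Command buffers in the submission can include vkCmdWaitEvents commands that wait on events that will not be signaled by earlier commands in the queue. Such events must be signaled by the application using vkSetEvent, and the vkCmdWaitEvents commands that wait upon them must not be inside a render pass instance. Implementations may have limits on how long the command buffer will wait, in order to avoid interfering with progress of other clients of the device. If the event is not signaled within these limits, results are undefined and may include device loss.

5.6. Secondary Command Buffer Execution

A secondary command buffer must not be directly submitted to a queue. Instead, secondary command buffers are recorded to execute as part of a primary command buffer with the command: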

void vkCmdExecuteCommands(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    commandBufferCount,
    const VkCommandBuffer*                      pCommandBuffers);
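  • commandBuffer is a handle to a primary command buffer that the secondary command buffers are executed in.

  • commandBufferCount is the length of the pCommandBuffers array.

  • pCommandBuffers is an array of secondary command buffer handles, which are recorded to execute in the primary command buffer in the order they are listed in the array.

Once vkCmdExecuteCommands has been called, any prior executions of the secondary command buffers specified by pCommandBuffers in any other primary command buffer become invalidated, unless those secondary command buffers were recorded with VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT.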

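Valid Usage
  • commandBuffer must have been allocated with a level of VK_COMMAND_BUFFER_LEVEL_PRIMARY

  • Any given element of pCommandBuffers must have been allocated with a level of VK_COMMAND_BUFFER_LEVEL_SECONDARY

  • Any given element of pCommandBuffers must not be already pending execution in commandBuffer, or appear twice in pCommandBuffers, unless it was recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT flag

  • Any given element of pCommandBuffers must not be already pending execution in any other VkCommandBuffer, unless it was recorded with the VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT flag

  • Any given element of pCommandBuffers must be in the executable state

  • Any given element of pCommandBuffers must have been allocated from a VkCommandPool that was created for the same queue family as the VkCommandPool from which commandBuffer was allocated

  • If vkCmdExecuteCommands is being called within a render pass instance, that render pass instance must have been begun with the contents parameter of vkCmdBeginRenderPass set to VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS

  • If vkCmdExecuteCommands is being called within a render pass instance, any given element of pCommandBuffers must have been recorded with the VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT

  • If vkCmdExecuteCommands is being called within a render pass instance, any given element of pCommandBuffers must have been recorded with VkCommandBufferInheritanceInfo::subpass set to the index of the subpass which the given command buffer will be executed in

  • If vkCmdExecuteCommands is being called within a render pass instance, the render pass specified by the pBeginInfo::pInheritanceInfo::renderPass member of the vkBeginCommandBuffer command used to begin recording each element of pCommandBuffers must be compatible with the current render pass

  • If vkCmdExecuteCommands is being called within a render pass instance, any given element of pCommandBuffers that was recorded with a VkCommandBufferInheritanceInfo::framebuffer other than VK_NULL_HANDLE must have that VkFramebuffer match the VkFramebuffer used in the current render pass instance

  • If vkCmdExecuteCommands is not being called within a render pass instance, any given element of pCommandBuffers must not have been recorded with the VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT

  • If the inherited queries feature is not enabled, commandBuffer must not have any queries active

  • If commandBuffer has a VK_QUERY_TYPE_OCCLUSION query active, then each element of pCommandBuffers must have been recorded with VkCommandBufferInheritanceInfo::occlusionQueryEnable set to VK_TRUE

  • If commandBuffer has a VK_QUERY_TYPE_OCCLUSION query active, then each element of pCommandBuffers must have been recorded with VkCommandBufferInheritanceInfo::queryFlags having all bits set that are set for the query

  • If commandBuffer has a VK_QUERY_TYPE_PIPELINE_STATISTICS query active, then each element of pCommandBuffers must have been recorded with VkCommandBufferInheritanceInfo::pipelineStatistics having all bits set that are set in the VkQueryPool the query uses

  • Any given element of pCommandBuffers must not begin any query types that are active in commandBuffer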
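
The render pass related rules for vkCmdExecuteCommands pair up: inside a render pass instance a secondary command buffer needs VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT and a matching subpass index, and outside one it must not carry that bit. A minimal sketch of that pairing follows. The function secondary_fits_scope is a hypothetical helper, and the flag is a local copy of the Vulkan value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT value. */
#define USAGE_RENDER_PASS_CONTINUE_BIT 0x00000002u

/* Hypothetical check: does a secondary command buffer, recorded with the
 * given usage flags and inherited subpass index, fit the scope in which
 * vkCmdExecuteCommands is about to be recorded? */
bool secondary_fits_scope(bool inside_render_pass,
                          uint32_t current_subpass,
                          uint32_t usage_flags,
                          uint32_t inherited_subpass)
{
    bool continues = (usage_flags & USAGE_RENDER_PASS_CONTINUE_BIT) != 0;

    if (!inside_render_pass)
        return !continues; /* the bit must not be set outside a render pass */

    /* Inside a render pass: the bit is required and the subpass must match. */
    return continues && inherited_subpass == current_subpass;
}
```

Render pass compatibility, framebuffer matching, and the query rules are left out because they depend on objects this sketch does not model.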

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pCommandBuffers must be a pointer to an array of commandBufferCount valid VkCommandBuffer handles

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support transfer, graphics, or compute operations

  • commandBuffer must be a primary VkCommandBuffer

  • commandBufferCount must be greater than 0

  • Both of commandBuffer, and the elements of pCommandBuffers must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types          Pipeline Type
Primary                  Both                 Transfer, graphics, compute

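6. Synchronization and Cache Control

Synchronization of access to resources is primarily the responsibility of the application in Vulkan. The order of execution of commands with respect to the host and other commands on the device has few implicit guarantees, and needs to be explicitly specified. Memory caches and other optimizations are also explicitly managed, requiring that the flow of data through the system is largely under application control.

Whilst some implicit guarantees exist between commands, four explicit synchronization primitives are exposed by Vulkan:

Fences

Fences can be used to communicate to the host that execution of some task on the device has completed.

Semaphores

Semaphores can be used to control resource access across multiple queues.

Events

Events provide a fine-grained synchronization primitive which can be signaled either within a command buffer or by the host, and can be waited upon within a command buffer or queried on the host.

Pipeline Barriers

Pipeline barriers also provide synchronization control within a command buffer, but at a single point, rather than with separate signal and wait operations.

In addition to the base primitives provided here, Render Passes provide a useful synchronization framework for most rendering tasks, built upon the concepts in this chapter. Many cases that would otherwise need an application to use the synchronization primitives in this chapter can be expressed more efficiently as part of a render pass.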

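6.1. Execution and Memory Dependencies

An operation is an arbitrary amount of work to be executed on the host, a device, or an external entity such as a presentation engine. Synchronization commands introduce explicit execution dependencies and memory dependencies between two sets of operations defined by the command's two synchronization scopes.

The synchronization scopes define which other operations a synchronization command is able to create execution dependencies with. Any type of operation that is not in a synchronization command's synchronization scopes will not be included in the resulting dependency. For example, for many synchronization commands, the synchronization scopes can be limited to just operations executing in specific pipeline stages, which allows other pipeline stages to be excluded from a dependency. Other scoping options are possible, depending on the particular command.

An execution dependency is a guarantee that for two sets of operations, the first set must happen-before the second set. If an operation happens-before another operation, then the first operation must complete before the second operation is initiated. More precisely: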

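  • Let A and B be separate sets of operations.

  • Let S be a synchronization command.

  • Let AS and BS be the synchronization scopes of S.

  • Let A' be the intersection of sets A and AS.

  • Let B' be the intersection of sets B and BS.

  • Submitting A, S and B for execution, in that order, will result in execution dependency E between A' and B'.

  • Execution dependency E guarantees that A' happens-before B'.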
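
The set definitions above can be made concrete by representing each set of operations as a bitmask, so that an intersection becomes a bitwise AND. The OpSet and ExecDependency types and the make_dependency function are hypothetical names chosen for this sketch. Vulkan itself exposes no such objects.

```c
#include <assert.h>
#include <stdint.h>

/* Each bit stands for one operation (or one class of operations). */
typedef uint32_t OpSet;

/* The two sides of an execution dependency: A' happens-before B'. */
typedef struct {
    OpSet first;  /* A' = A intersected with AS */
    OpSet second; /* B' = B intersected with BS */
} ExecDependency;

/* Build the dependency created by submitting A, S, B in that order, where
 * AS and BS are the two synchronization scopes of S. */
ExecDependency make_dependency(OpSet A, OpSet AS, OpSet B, OpSet BS)
{
    ExecDependency e;
    e.first = A & AS;
    e.second = B & BS;
    return e;
}
```

Operations of A that fall outside AS, and operations of B that fall outside BS, do not appear in the result. This mirrors the rule that operations outside the synchronization scopes are not part of the dependency.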

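An execution dependency chain is a sequence of execution dependencies that form a happens-before relation between the first dependency's A' and the final dependency's B'. For each consecutive pair of execution dependencies, a chain exists if the intersection of BS in the first dependency and AS in the second dependency is not an empty set. The formation of a single execution dependency from an execution dependency chain can be described by substituting the following in the description of execution dependencies:

  • Let S be a set of synchronization commands that generate an execution dependency chain.

  • Let AS be the first synchronization scope of the first command in S.

  • Let BS be the second synchronization scope of the last command in S.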
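
The chain rule can be sketched with scopes represented as bitmasks: consecutive dependencies chain when the second scope of one overlaps the first scope of the next, and the combined dependency takes its first scope from the first command and its second scope from the last. The SyncScopes type and chain_scopes function are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The two synchronization scopes of one synchronization command, as bitmasks. */
typedef struct {
    uint32_t AS; /* first synchronization scope */
    uint32_t BS; /* second synchronization scope */
} SyncScopes;

/* Returns true if deps[0..count-1] form an execution dependency chain and,
 * if so, writes the scopes of the equivalent single dependency to *out. */
bool chain_scopes(const SyncScopes *deps, size_t count, SyncScopes *out)
{
    assert(deps && out);
    if (count == 0)
        return false;

    /* Every consecutive pair must overlap: BS(i) intersect AS(i+1) != empty. */
    for (size_t i = 0; i + 1 < count; ++i) {
        if ((deps[i].BS & deps[i + 1].AS) == 0)
            return false;
    }

    out->AS = deps[0].AS;         /* first scope of the first command */
    out->BS = deps[count - 1].BS; /* second scope of the last command */
    return true;
}
```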

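Note

An execution dependency is inherently also multiple execution dependencies: a dependency exists between each subset of A' and each subset of B', and the same is true for execution dependency chains. For example, a synchronization command with multiple pipeline stages in its stage masks effectively generates one dependency between each source stage and each destination stage. This can be useful to think about when considering how execution chains are formed if they do not involve all parts of a synchronization command's dependency. Similarly, any set of adjacent dependencies in an execution dependency chain can be considered an execution dependency chain in its own right.

Execution dependencies alone are not sufficient to guarantee that values resulting from writes in one set of operations can be read from another set of operations.

Two additional types of operation are used to control memory access. Availability operations cause the values generated by specified memory write accesses to become available for future access. Any available value remains available until a subsequent write to the same memory location occurs (whether it is made available or not) or the memory is freed. Visibility operations cause any available values to become visible to specified memory accesses.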

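A memory dependency is an execution dependency which includes availability and visibility operations such that:
  • The first set of operations happens-before the availability operation.

  • The availability operation happens-before the visibility operation.

  • The visibility operation happens-before the second set of operations.

Once written values are made visible to a particular type of memory access, they can be read or written by that type of memory access. Most synchronization commands in Vulkan define a memory dependency.

The specific memory accesses that are made available and visible are defined by the access scopes of a memory dependency. Any type of access that is in a memory dependency's first access scope and occurs in A' is made available. Any type of access that is in a memory dependency's second access scope and occurs in B' has any available writes made visible to it. Any type of operation that is not in a synchronization command's access scopes will not be included in the resulting dependency.

A memory dependency enforces availability and visibility of memory accesses and execution order between two sets of operations. Adding to the description of execution dependency chains: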

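  • Let a be the set of memory accesses performed by A'.

  • Let b be the set of memory accesses performed by B'.

  • Let a_S be the first access scope of the first command in S.

  • Let b_S be the second access scope of the last command in S.

  • Let a' be the intersection of sets a and a_S.

  • Let b' be the intersection of sets b and b_S.

  • Submitting A, S and B for execution, in that order, will result in a memory dependency m between A' and B'.

  • Memory dependency m guarantees that:

    • Memory writes in a' are made available.

    • Available memory writes, including those from a', are made visible to b'.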

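Note

Execution and memory dependencies are used to solve data hazards, i.e. to ensure that read and write operations occur in a well-defined order. Write-after-read hazards can be solved with just an execution dependency, but read-after-write and write-after-write hazards need appropriate memory dependencies to be included between them. If an application does not include dependencies to solve these hazards, the results and execution orders of memory accesses are undefined.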

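6.1.1. Image Layout Transitions

Image subresources can be transitioned from one layout to another as part of a memory dependency (e.g. by using an image memory barrier). When a layout transition is specified in a memory dependency, it happens-after the availability operations in the memory dependency, and happens-before the visibility operations. Image layout transitions may perform read and write accesses on all memory bound to the image subresource range, so applications must ensure that all memory writes have been made available before a layout transition is executed. Available memory is automatically made visible to a layout transition, and writes performed by a layout transition are automatically made available.

Layout transitions always apply to a particular image subresource range, and specify both an old layout and a new layout. If the old layout does not match the new layout, a transition occurs. The old layout must match the current layout of the image subresource range, with one exception. The old layout can always be specified as VK_IMAGE_LAYOUT_UNDEFINED, though doing so invalidates the contents of the image subresource range.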

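Note

Setting the old layout to VK_IMAGE_LAYOUT_UNDEFINED implies that the contents of the image subresource need not be preserved. Implementations may use this information to avoid performing expensive data transition operations.

Note

Applications must ensure that layout transitions happen-after all operations accessing the image with the old layout, and happen-before any operations that will access the image with the new layout. Layout transitions are potentially read/write operations, so not defining appropriate memory dependencies to guarantee this will result in a data race.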

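The contents of any portion of another resource which aliases memory that is bound to the transitioned image subresource range are undefined after an image layout transition.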
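
The rule for the old layout can be illustrated with a minimal sketch. It does not use the Vulkan headers: the SkLayout enum and both function names are hypothetical stand-ins introduced here, and only the UNDEFINED entry corresponds to a layout named in the text above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a few VkImageLayout values; a real program
 * would take these from <vulkan/vulkan.h>. */
typedef enum SkLayout {
    SK_LAYOUT_UNDEFINED = 0,            /* stands in for VK_IMAGE_LAYOUT_UNDEFINED */
    SK_LAYOUT_GENERAL,
    SK_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
    SK_LAYOUT_TRANSFER_DST_OPTIMAL
} SkLayout;

/* The old layout must match the current layout, except that UNDEFINED
 * is always accepted. */
bool sk_old_layout_is_valid(SkLayout current, SkLayout old_layout)
{
    return old_layout == current || old_layout == SK_LAYOUT_UNDEFINED;
}

/* Specifying UNDEFINED as the old layout invalidates the contents of the
 * image subresource range. */
bool sk_contents_preserved(SkLayout old_layout)
{
    return old_layout != SK_LAYOUT_UNDEFINED;
}
```

An application-side validation layer could apply a check of this shape before recording an image memory barrier; the memory dependency itself still has to be provided by the barrier.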

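6.1.2. Pipeline Stages

The work performed by an action command consists of multiple operations, which are performed by a sequence of logically independent execution units known as pipeline stages. The exact pipeline stages executed depend on the particular action command that is used, and on the current command buffer state when the action command was recorded. Drawing commands, dispatching commands, copy commands, and clear commands all execute in different sets of pipeline stages.

Execution of operations across pipeline stages must adhere to implicit ordering guarantees, particularly including pipeline stage order. Otherwise, execution across pipeline stages may overlap or execute out of order with regards to other stages, unless otherwise enforced by an execution dependency.

Several of the synchronization commands include pipeline stage parameters, restricting the synchronization scopes for that command to just those stages. This allows fine-grained control over the exact execution dependencies and accesses performed by action commands. Implementations should use these pipeline stages to avoid unnecessary stalls or cache flushing.

Bits which can be set, specifying pipeline stages, are: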

typedef enum VkPipelineStageFlagBits {
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT = 0x00000001,
    VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT = 0x00000002,
    VK_PIPELINE_STAGE_VERTEX_INPUT_BIT = 0x00000004,
    VK_PIPELINE_STAGE_VERTEX_SHADER_BIT = 0x00000008,
    VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT = 0x00000010,
    VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT = 0x00000020,
    VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT = 0x00000040,
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT = 0x00000080,
    VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT = 0x00000100,
    VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT = 0x00000200,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT = 0x00000400,
    VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT = 0x00000800,
    VK_PIPELINE_STAGE_TRANSFER_BIT = 0x00001000,
    VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT = 0x00002000,
    VK_PIPELINE_STAGE_HOST_BIT = 0x00004000,
    VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT = 0x00008000,
    VK_PIPELINE_STAGE_ALL_COMMANDS_BIT = 0x00010000,
} VkPipelineStageFlagBits;

The meaning of each bit is:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT: Stage of the pipeline where any commands are initially received by the queue.

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT: Stage of the pipeline where Draw/DispatchIndirect data structures are consumed.

  • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT: Stage of the pipeline where vertex and index buffers are consumed.

  • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT: Vertex shader stage.

  • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT: Tessellation control shader stage.

  • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT: Tessellation evaluation shader stage.

  • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT: Geometry shader stage.

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT: Fragment shader stage.

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT: Stage of the pipeline where early fragment tests (depth and stencil tests before fragment shading) are performed. This stage also includes subpass load operations for framebuffer attachments with a depth/stencil format.

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT: Stage of the pipeline where late fragment tests (depth and stencil tests after fragment shading) are performed. This stage also includes subpass store operations for framebuffer attachments with a depth/stencil format.

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT: Stage of the pipeline after blending where the final color values are output from the pipeline. This stage also includes subpass load and store operations and multisample resolve operations for framebuffer attachments with a color format.

  • VK_PIPELINE_STAGE_TRANSFER_BIT: Execution of copy commands. This includes the operations resulting from all copy commands, clear commands (with the exception of vkCmdClearAttachments), and vkCmdCopyQueryPoolResults.

  • VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT: Execution of a compute shader.

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT: Final stage in the pipeline where operations generated by all commands complete execution.

  • VK_PIPELINE_STAGE_HOST_BIT: A pseudo-stage indicating execution on the host of reads/writes of device memory. This stage is not invoked by any commands recorded in a command buffer.

  • VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT: Execution of all graphics pipeline stages. Equivalent to the logical or of:

    • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

    • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

    • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

    • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT

    • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT

    • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

    • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

    • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

    • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

    • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

    • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

    • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

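  • VK_PIPELINE_STAGE_ALL_COMMANDS_BIT: Equivalent to the logical or of every other pipeline stage flag that is supported on the queue it is used with.

Note

An execution dependency with only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT in the destination stage mask will only prevent that stage from executing in subsequently submitted commands. As this stage does not perform any actual execution, this is not observable - in effect, it does not delay processing of subsequent commands. Similarly, an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in the source stage mask will effectively not wait for any prior commands to complete.

When defining a memory dependency, using only VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT or VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT would never make any accesses available and/or visible, because these stages do not access memory.

VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT and VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT are useful for accomplishing layout transitions and queue ownership transfer operations when the required execution dependency is satisfied by other means - for example, semaphore operations between queues.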

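If a synchronization command includes a source stage mask, its first synchronization scope only includes execution of the pipeline stages specified in that mask, as well as any logically earlier stages. If a synchronization command includes a destination stage mask, its second synchronization scope only includes execution of the pipeline stages specified in that mask, as well as any logically later stages.

Access scopes are affected in a similar way. If a synchronization command includes a source stage mask, its first access scope only includes memory accesses performed by the pipeline stages specified in that mask. If a synchronization command includes a destination stage mask, its second access scope only includes memory accesses performed by the pipeline stages specified in that mask.

Note

Implementations may not support synchronization at every pipeline stage for every synchronization operation. If a pipeline stage that an implementation does not support synchronization for appears in a source stage mask, then it may substitute that stage for any logically later stage. If a pipeline stage that an implementation does not support synchronization for appears in a destination stage mask, then it may substitute that stage for any logically earlier stage.

For example, if an implementation is unable to signal an event immediately after vertex shader execution is complete, it may instead signal the event after color attachment output has completed. If an implementation makes such a substitution, it must not affect the semantics of execution or memory dependencies, or of image and buffer memory barriers.

Certain pipeline stages are only available on queues that support a particular set of operations. The following table lists, for each pipeline stage flag, which queue capability flag must be supported by the queue. When multiple flags are enumerated in the second column of the table, the pipeline stage is supported on the queue if it supports any of the listed capability flags. For further details on queue capabilities see Physical Device Enumeration and Queues.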

Table 3. Supported pipeline stage flags
Pipeline stage flag | Required queue capability flag
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT | None required
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_VERTEX_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT | VK_QUEUE_COMPUTE_BIT
VK_PIPELINE_STAGE_TRANSFER_BIT | VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT or VK_QUEUE_TRANSFER_BIT
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT | None required
VK_PIPELINE_STAGE_HOST_BIT | None required
VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT | VK_QUEUE_GRAPHICS_BIT
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT | None required
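
As a rough illustration of how Table 3 might be applied, the sketch below checks a stage mask against a queue's capability flags. It is self-contained and does not include the Vulkan headers: the SK_ constants are local copies of the corresponding flag values, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the flag values; real code would use <vulkan/vulkan.h>. */
enum {
    SK_QUEUE_GRAPHICS = 0x1, SK_QUEUE_COMPUTE = 0x2, SK_QUEUE_TRANSFER = 0x4
};
enum {
    SK_STAGE_TOP_OF_PIPE    = 0x00000001,
    SK_STAGE_DRAW_INDIRECT  = 0x00000002,
    SK_STAGE_VERTEX_SHADER  = 0x00000008,
    SK_STAGE_COMPUTE_SHADER = 0x00000800,
    SK_STAGE_TRANSFER       = 0x00001000,
    SK_STAGE_BOTTOM_OF_PIPE = 0x00002000,
    SK_STAGE_HOST           = 0x00004000,
    SK_STAGE_ALL_COMMANDS   = 0x00010000
};

/* Returns the queue capability flags of which at least one is required
 * for a single stage bit; 0 means "None required". */
static uint32_t sk_required_queue_caps(uint32_t stage_bit)
{
    switch (stage_bit) {
    case SK_STAGE_TOP_OF_PIPE:
    case SK_STAGE_BOTTOM_OF_PIPE:
    case SK_STAGE_HOST:
    case SK_STAGE_ALL_COMMANDS:
        return 0;
    case SK_STAGE_DRAW_INDIRECT:
        return SK_QUEUE_GRAPHICS | SK_QUEUE_COMPUTE;
    case SK_STAGE_COMPUTE_SHADER:
        return SK_QUEUE_COMPUTE;
    case SK_STAGE_TRANSFER:
        return SK_QUEUE_GRAPHICS | SK_QUEUE_COMPUTE | SK_QUEUE_TRANSFER;
    default:
        return SK_QUEUE_GRAPHICS;   /* every remaining row of the table */
    }
}

/* True if every stage bit set in stage_mask is usable on a queue with
 * the given capability flags. */
bool sk_stage_mask_supported(uint32_t stage_mask, uint32_t queue_flags)
{
    for (uint32_t bit = 1; bit <= SK_STAGE_ALL_COMMANDS; bit <<= 1) {
        if ((stage_mask & bit) == 0)
            continue;
        uint32_t need = sk_required_queue_caps(bit);
        if (need != 0 && (need & queue_flags) == 0)
            return false;
    }
    return true;
}
```

For instance, a transfer-only queue accepts SK_STAGE_TRANSFER but rejects SK_STAGE_VERTEX_SHADER.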

Pipeline stages that execute as a result of a command logically complete execution in a specific order, such that completion of a logically later pipeline stage must not happen-before completion of a logically earlier stage. This means that including any given stage in the source stage mask for a particular synchronization command also implies that any logically earlier stages are included in A_S for that command.

Similarly, initiation of a logically earlier pipeline stage must not happen-after initiation of a logically later pipeline stage. Including any given stage in the destination stage mask for a particular synchronization command also implies that any logically later stages are included in B_S for that command.

Note

Logically earlier and later stages are not included when defining the access scopes of a memory barrier.

The order of pipeline stages depends on the particular pipeline: graphics, compute, transfer or host.

For the graphics pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

  • VK_PIPELINE_STAGE_VERTEX_INPUT_BIT

  • VK_PIPELINE_STAGE_VERTEX_SHADER_BIT

  • VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT

  • VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For the compute pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT

  • VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For the transfer pipeline, the following stages occur in this order:

  • VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT

  • VK_PIPELINE_STAGE_TRANSFER_BIT

  • VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT

For host operations, only one pipeline stage occurs, so no order is guaranteed:

  • VK_PIPELINE_STAGE_HOST_BIT
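
The way logically earlier and later stages are pulled into the synchronization scopes can be sketched for the graphics pipeline as follows. This is an illustrative model only: the array uses the bit values from VkPipelineStageFlagBits in the order listed above, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Graphics pipeline stages in logical order (note that early fragment
 * tests precede the fragment shader, unlike the numeric bit order). */
static const uint32_t sk_graphics_order[] = {
    0x00000001u, /* TOP_OF_PIPE */
    0x00000002u, /* DRAW_INDIRECT */
    0x00000004u, /* VERTEX_INPUT */
    0x00000008u, /* VERTEX_SHADER */
    0x00000010u, /* TESSELLATION_CONTROL_SHADER */
    0x00000020u, /* TESSELLATION_EVALUATION_SHADER */
    0x00000040u, /* GEOMETRY_SHADER */
    0x00000100u, /* EARLY_FRAGMENT_TESTS */
    0x00000080u, /* FRAGMENT_SHADER */
    0x00000200u, /* LATE_FRAGMENT_TESTS */
    0x00000400u, /* COLOR_ATTACHMENT_OUTPUT */
    0x00002000u  /* BOTTOM_OF_PIPE */
};
#define SK_GRAPHICS_STAGE_COUNT \
    (sizeof sk_graphics_order / sizeof sk_graphics_order[0])

/* Stages covered by the first synchronization scope: the stages in the
 * source mask plus every logically earlier stage. */
uint32_t sk_first_scope_stages(uint32_t src_stage_mask)
{
    uint32_t scope = 0;
    int seen = 0;
    /* Walk from the last stage backwards; once a requested stage is
     * found, it and everything before it belong to the scope. */
    for (size_t i = SK_GRAPHICS_STAGE_COUNT; i-- > 0;) {
        if (src_stage_mask & sk_graphics_order[i])
            seen = 1;
        if (seen)
            scope |= sk_graphics_order[i];
    }
    return scope;
}

/* Stages covered by the second synchronization scope: the stages in the
 * destination mask plus every logically later stage. */
uint32_t sk_second_scope_stages(uint32_t dst_stage_mask)
{
    uint32_t scope = 0;
    int seen = 0;
    for (size_t i = 0; i < SK_GRAPHICS_STAGE_COUNT; ++i) {
        if (dst_stage_mask & sk_graphics_order[i])
            seen = 1;
        if (seen)
            scope |= sk_graphics_order[i];
    }
    return scope;
}
```

The same expansion does not apply to access scopes, which only consider the stages named in the mask.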

6.1.3. Access Types

Memory in Vulkan can be accessed from within shader invocations and via some fixed-function stages of the pipeline. The access type is a function of the descriptor type used, or how a fixed-function stage accesses memory. Each access type corresponds to a bit flag in VkAccessFlagBits.

Some synchronization commands take sets of access types as parameters to define the access scopes of a memory dependency. If a synchronization command includes a source access mask, its first access scope only includes accesses via the access types specified in that mask. Similarly, if a synchronization command includes a destination access mask, its second access scope only includes accesses via the access types specified in that mask.

Access types that can be set in an access mask include:

typedef enum VkAccessFlagBits {
    VK_ACCESS_INDIRECT_COMMAND_READ_BIT = 0x00000001,
    VK_ACCESS_INDEX_READ_BIT = 0x00000002,
    VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT = 0x00000004,
    VK_ACCESS_UNIFORM_READ_BIT = 0x00000008,
    VK_ACCESS_INPUT_ATTACHMENT_READ_BIT = 0x00000010,
    VK_ACCESS_SHADER_READ_BIT = 0x00000020,
    VK_ACCESS_SHADER_WRITE_BIT = 0x00000040,
    VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080,
    VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100,
    VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT = 0x00000200,
    VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT = 0x00000400,
    VK_ACCESS_TRANSFER_READ_BIT = 0x00000800,
    VK_ACCESS_TRANSFER_WRITE_BIT = 0x00001000,
    VK_ACCESS_HOST_READ_BIT = 0x00002000,
    VK_ACCESS_HOST_WRITE_BIT = 0x00004000,
    VK_ACCESS_MEMORY_READ_BIT = 0x00008000,
    VK_ACCESS_MEMORY_WRITE_BIT = 0x00010000,
} VkAccessFlagBits;
  • VK_ACCESS_INDIRECT_COMMAND_READ_BIT: Read access to an indirect command structure read as part of an indirect drawing or dispatch command.

  • VK_ACCESS_INDEX_READ_BIT: Read access to an index buffer as part of an indexed drawing command, bound by vkCmdBindIndexBuffer.

  • VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT: Read access to a vertex buffer as part of a drawing command, bound by vkCmdBindVertexBuffers.

  • VK_ACCESS_UNIFORM_READ_BIT: Read access to a uniform buffer.

  • VK_ACCESS_INPUT_ATTACHMENT_READ_BIT: Read access to an input attachment within a renderpass during fragment shading.

  • VK_ACCESS_SHADER_READ_BIT: Read access to a storage buffer, uniform texel buffer, storage texel buffer, sampled image, or storage image.

  • VK_ACCESS_SHADER_WRITE_BIT: Write access to a storage buffer, storage texel buffer, or storage image.

  • VK_ACCESS_COLOR_ATTACHMENT_READ_BIT: Read access to a color attachment, such as via blending, logic operations, or via certain subpass load operations.

  • VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT: Write access to a color or resolve attachment during a render pass or via certain subpass load and store operations.

  • VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT: Read access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load operations.

  • VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT: Write access to a depth/stencil attachment, via depth or stencil operations or via certain subpass load and store operations.

  • VK_ACCESS_TRANSFER_READ_BIT: Read access to an image or buffer in a copy operation.

  • VK_ACCESS_TRANSFER_WRITE_BIT: Write access to an image or buffer in a clear or copy operation.

  • VK_ACCESS_HOST_READ_BIT: Read access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.

  • VK_ACCESS_HOST_WRITE_BIT: Write access by a host operation. Accesses of this type are not performed through a resource, but directly on memory.

  • VK_ACCESS_MEMORY_READ_BIT: Read access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a destination access mask, makes all available writes visible to all future read accesses on entities known to the Vulkan device.

  • VK_ACCESS_MEMORY_WRITE_BIT: Write access via non-specific entities. These entities include the Vulkan device and host, but may also include entities external to the Vulkan device or otherwise not part of the core Vulkan pipeline. When included in a source access mask, all writes that are performed by entities known to the Vulkan device are made available. When included in a destination access mask, makes all available writes visible to all future write accesses on entities known to the Vulkan device.

Certain access types are only performed by a subset of pipeline stages. Any synchronization command that takes both stage masks and access masks uses both to define the access scopes - only the specified access types performed by the specified stages are included in the access scope. An application must not specify an access flag in a synchronization command if it does not include a pipeline stage in the corresponding stage mask that is able to perform accesses of that type. The following table lists, for each access flag, which pipeline stages can perform that type of access.

Table 4. Supported access types
Access flag | Supported pipeline stages
VK_ACCESS_INDIRECT_COMMAND_READ_BIT | VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
VK_ACCESS_INDEX_READ_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT
VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT
VK_ACCESS_UNIFORM_READ_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
VK_ACCESS_INPUT_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT
VK_ACCESS_SHADER_READ_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
VK_ACCESS_SHADER_WRITE_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, or VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT | VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, or VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT
VK_ACCESS_TRANSFER_READ_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT
VK_ACCESS_TRANSFER_WRITE_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT
VK_ACCESS_HOST_READ_BIT | VK_PIPELINE_STAGE_HOST_BIT
VK_ACCESS_HOST_WRITE_BIT | VK_PIPELINE_STAGE_HOST_BIT
VK_ACCESS_MEMORY_READ_BIT | N/A
VK_ACCESS_MEMORY_WRITE_BIT | N/A
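
A table-driven check of the rule above (an access flag needs at least one stage in the matching stage mask that can perform it) might look like the following sketch. It is self-contained and uses the numeric values from VkAccessFlagBits and VkPipelineStageFlagBits directly; the type and function names are hypothetical. Two simplifications are assumptions of this sketch: the N/A rows are treated as unrestricted, and VK_PIPELINE_STAGE_ALL_COMMANDS_BIT is treated as covering every stage regardless of queue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* VERTEX | TESS_CONTROL | TESS_EVAL | GEOMETRY | FRAGMENT | COMPUTE shader stages. */
#define SK_SHADER_STAGES 0x000008F8u

typedef struct SkAccessRule {
    uint32_t access;  /* one access flag bit */
    uint32_t stages;  /* stages able to perform that access */
} SkAccessRule;

static const SkAccessRule sk_access_rules[] = {
    {0x00000001u, 0x00000002u},      /* INDIRECT_COMMAND_READ: DRAW_INDIRECT */
    {0x00000002u, 0x00000004u},      /* INDEX_READ: VERTEX_INPUT */
    {0x00000004u, 0x00000004u},      /* VERTEX_ATTRIBUTE_READ: VERTEX_INPUT */
    {0x00000008u, SK_SHADER_STAGES}, /* UNIFORM_READ */
    {0x00000010u, 0x00000080u},      /* INPUT_ATTACHMENT_READ: FRAGMENT_SHADER */
    {0x00000020u, SK_SHADER_STAGES}, /* SHADER_READ */
    {0x00000040u, SK_SHADER_STAGES}, /* SHADER_WRITE */
    {0x00000080u, 0x00000400u},      /* COLOR_ATTACHMENT_READ: COLOR_ATTACHMENT_OUTPUT */
    {0x00000100u, 0x00000400u},      /* COLOR_ATTACHMENT_WRITE: COLOR_ATTACHMENT_OUTPUT */
    {0x00000200u, 0x00000300u},      /* DEPTH_STENCIL_ATTACHMENT_READ: EARLY or LATE tests */
    {0x00000400u, 0x00000300u},      /* DEPTH_STENCIL_ATTACHMENT_WRITE: EARLY or LATE tests */
    {0x00000800u, 0x00001000u},      /* TRANSFER_READ: TRANSFER */
    {0x00001000u, 0x00001000u},      /* TRANSFER_WRITE: TRANSFER */
    {0x00002000u, 0x00004000u},      /* HOST_READ: HOST */
    {0x00004000u, 0x00004000u}       /* HOST_WRITE: HOST */
};

/* True if every access flag in access_mask has at least one capable
 * stage present in stage_mask. */
bool sk_access_mask_valid(uint32_t access_mask, uint32_t stage_mask)
{
    if (stage_mask & 0x00010000u)        /* ALL_COMMANDS (assumed to cover all) */
        return true;
    if (stage_mask & 0x00008000u)        /* ALL_GRAPHICS expands to its listed stages */
        stage_mask |= 0x000027FFu;
    for (size_t i = 0; i < sizeof sk_access_rules / sizeof sk_access_rules[0]; ++i) {
        const SkAccessRule *rule = &sk_access_rules[i];
        if ((access_mask & rule->access) && !(stage_mask & rule->stages))
            return false;
    }
    return true;
}
```

For example, VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT paired only with VK_PIPELINE_STAGE_VERTEX_SHADER_BIT is rejected, because no stage in the mask can perform that access.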

6.1.4. Framebuffer Region Dependencies

Pipeline stages that operate on, or with respect to, the framebuffer are collectively the framebuffer-space pipeline stages. These stages are:

  • VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT

  • VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT

  • VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT

For these pipeline stages, an execution or memory dependency from the first set of operations to the second set can either be a single framebuffer-global dependency, or split into multiple framebuffer-local dependencies.

A dependency with non-framebuffer-space pipeline stages is neither framebuffer-global nor framebuffer-local.

A framebuffer region is a set of sample (x, y, layer, sample) coordinates that is a subset of the entire framebuffer.

A single framebuffer-local dependency guarantees that, only for a single framebuffer region, the first set of operations and the availability operations happen-before the visibility operations and the second set of operations. No ordering guarantees are made between framebuffer regions for a framebuffer-local dependency.

A framebuffer-global dependency guarantees that the first set of operations for all framebuffer regions happens-before the second set of operations for any framebuffer region.

Note

Since fragment invocations are not specified to run in any particular groupings, the size of a framebuffer region is implementation-dependent, not known to the application, and must be assumed to be no larger than a single sample.

If a synchronization command includes a dependencyFlags parameter, and specifies the VK_DEPENDENCY_BY_REGION_BIT flag, then it defines framebuffer-local dependencies for the framebuffer-space pipeline stages in that synchronization command, for all framebuffer regions. If no dependencyFlags parameter is included, or the VK_DEPENDENCY_BY_REGION_BIT flag is not specified, then a framebuffer-global dependency is specified for those stages. The VK_DEPENDENCY_BY_REGION_BIT flag does not affect the dependencies between non-framebuffer-space pipeline stages, nor does it affect the dependencies between framebuffer-space and non-framebuffer-space pipeline stages.

Note

Framebuffer-local dependencies are more optimal for most architectures; particularly tile-based architectures - which can keep framebuffer-regions entirely in on-chip registers and thus avoid external bandwidth across such a dependency. Including a framebuffer-global dependency in your rendering will usually force all implementations to flush data to memory, or to a higher level cache, breaking any potential locality optimizations.

6.2. Implicit Synchronization Guarantees

A small number of implicit ordering guarantees are provided by Vulkan, ensuring that the order in which commands are submitted is meaningful, and avoiding unnecessary complexity in common operations.

Submission order is a fundamental ordering in Vulkan, giving meaning to the order in which action and synchronization commands are recorded and submitted to a single queue. Explicit and implicit ordering guarantees between commands in Vulkan all work on the premise that this ordering is meaningful.

Submission order for any given set of commands is based on the order in which they were recorded to command buffers and then submitted. This order is determined as follows:

  1. The initial order is determined by the order in which vkQueueSubmit commands are executed on the host, for a single queue, from first to last.

  2. The order in which VkSubmitInfo structures are specified in the pSubmits parameter of vkQueueSubmit, from lowest index to highest.

  3. The order in which command buffers are specified in the pCommandBuffers member of VkSubmitInfo, from lowest index to highest.

  4. The order in which commands were recorded to a command buffer on the host, from first to last:

    • For commands recorded outside a render pass, this includes all other commands recorded outside a renderpass, including vkCmdBeginRenderPass and vkCmdEndRenderPass commands; it does not directly include commands inside a render pass.

    • For commands recorded inside a render pass, this includes all other commands recorded inside the same subpass, including the vkCmdBeginRenderPass and vkCmdEndRenderPass commands that delimit the same renderpass instance; it does not include commands recorded to other subpasses.
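
One way to picture the four ordering levels above is as a lexicographic comparison of four indices. The sketch below is a simplified model under that assumption; the SkSubmissionKey type and its field names are hypothetical, and the render pass and subpass rules of step 4 are deliberately not modeled.

```c
/* Position of one recorded command on a single queue, described by the
 * four levels of submission order. */
typedef struct SkSubmissionKey {
    unsigned submit_call;    /* order of the vkQueueSubmit call on the host */
    unsigned submit_info;    /* index into pSubmits */
    unsigned command_buffer; /* index into pCommandBuffers */
    unsigned command;        /* recording order within the command buffer */
} SkSubmissionKey;

static int sk_cmp_unsigned(unsigned a, unsigned b)
{
    return (a > b) - (a < b);
}

/* Returns a negative value if a is earlier in submission order than b,
 * zero if they are the same command, and a positive value otherwise. */
int sk_submission_order_cmp(const SkSubmissionKey *a, const SkSubmissionKey *b)
{
    int r = sk_cmp_unsigned(a->submit_call, b->submit_call);
    if (r == 0) r = sk_cmp_unsigned(a->submit_info, b->submit_info);
    if (r == 0) r = sk_cmp_unsigned(a->command_buffer, b->command_buffer);
    if (r == 0) r = sk_cmp_unsigned(a->command, b->command);
    return r;
}
```

Under this model, a command recorded late in the first command buffer of a batch still precedes the first command of the second command buffer in the same batch.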

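Action and synchronization commands recorded to a command buffer execute the VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT pipeline stage in submission order - forming an implicit execution dependency between this stage in each command.

State commands do not execute any operations on the device; instead they set the state of the command buffer when they execute on the host, in the order that they are recorded. Action commands consume the current state of the command buffer when they are recorded, and will execute state changes on the device as required to match the recorded state.

The execution of pipeline stages within a given command also has a loose ordering, dependent only on a single command.

6.3. Fences

Fences are a synchronization primitive that can be used to insert a dependency from a queue to the host. Fences have two states - signaled and unsignaled. A fence can be signaled as part of the execution of a queue submission command. Fences can be unsignaled on the host with vkResetFences. Fences can be waited on by the host with the vkWaitForFences command, and the current state can be queried with vkGetFenceStatus.

Fences are represented by VkFence handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFence)

To create a fence, call: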

VkResult vkCreateFence(
    VkDevice                                    device,
    const VkFenceCreateInfo*                    pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkFence*                                    pFence);
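  • device is the logical device that creates the fence.

  • pCreateInfo is a pointer to an instance of the VkFenceCreateInfo structure which contains information about how the fence is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pFence points to a handle in which the resulting fence object is returned.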

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkFenceCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pFence must be a pointer to a VkFence handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkFenceCreateInfo structure is defined as:

typedef struct VkFenceCreateInfo {
    VkStructureType       sType;
    const void*           pNext;
    VkFenceCreateFlags    flags;
} VkFenceCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags defines the initial state and behavior of the fence. Bits which can be set include:

    typedef enum VkFenceCreateFlagBits {
        VK_FENCE_CREATE_SIGNALED_BIT = 0x00000001,
    } VkFenceCreateFlagBits;

    If flags contains VK_FENCE_CREATE_SIGNALED_BIT then the fence object is created in the signaled state. Otherwise it is created in the unsignaled state.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_FENCE_CREATE_INFO

  • pNext must be NULL

  • flags must be a valid combination of VkFenceCreateFlagBits values

To destroy a fence, call:

void vkDestroyFence(
    VkDevice                                    device,
    VkFence                                     fence,
    const VkAllocationCallbacks*                pAllocator);
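  • device is the logical device that destroys the fence.

  • fence is the handle of the fence to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All queue submission commands that refer to fence must have completed execution

  • If VkAllocationCallbacks were provided when fence was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when fence was created, pAllocator must be NULL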

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If fence is not VK_NULL_HANDLE, fence must be a valid VkFence handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If fence is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to fence must be externally synchronized

To query the status of a fence from the host, call:

VkResult vkGetFenceStatus(
    VkDevice                                    device,
    VkFence                                     fence);
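  • device is the logical device that owns the fence.

  • fence is the handle of the fence to query.

Upon success, vkGetFenceStatus returns the status of the fence object, with the following return codes:

Table 5. Fence Object Status Codes
Status | Meaning
VK_SUCCESS | The fence specified by fence is signaled.
VK_NOT_READY | The fence specified by fence is unsignaled.

If a queue submission command is pending execution, then the value returned by this command may immediately be out of date.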

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • fence must be a valid VkFence handle

  • fence must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_NOT_READY

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To set the state of fences to unsignaled from the host, call:

VkResult vkResetFences(
    VkDevice                                    device,
    uint32_t                                    fenceCount,
    const VkFence*                              pFences);
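  • device is the logical device that owns the fences.

  • fenceCount is the number of fences to reset.

  • pFences is a pointer to an array of fence handles to reset.

When vkResetFences is executed on the host, it defines a fence unsignal operation for each fence, which resets the fence to the unsignaled state.

If any member of pFences is already in the unsignaled state when vkResetFences is executed, then vkResetFences has no effect on that fence.

Valid Usage
  • Any given element of pFences must not currently be associated with any queue command that has not yet completed execution on that queue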

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pFences must be a pointer to an array of fenceCount valid VkFence handles

  • fenceCount must be greater than 0

  • Each element of pFences must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to each member of pFences must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

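When a fence is submitted to a queue as part of a queue submission command, it defines a memory dependency on the batches that were submitted as part of that command, and defines a fence signal operation which sets the fence to the signaled state.

The first synchronization scope includes every batch submitted in the same queue submission command. Fence signal operations that are defined by vkQueueSubmit additionally include in the first synchronization scope all previous queue submissions to the same queue via vkQueueSubmit.

The second synchronization scope only includes the fence signal operation.

The first access scope includes all memory access performed by the device.

The second access scope is empty.

To wait for one or more fences to enter the signaled state on the host, call: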

VkResult vkWaitForFences(
    VkDevice                                    device,
    uint32_t                                    fenceCount,
    const VkFence*                              pFences,
    VkBool32                                    waitAll,
    uint64_t                                    timeout);
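  • device is the logical device that owns the fences.

  • fenceCount is the number of fences to wait on.

  • pFences is a pointer to an array of fenceCount fence handles.

  • waitAll is the condition that must be satisfied to successfully unblock the wait. If waitAll is VK_TRUE, then the condition is that all fences in pFences are signaled. Otherwise, the condition is that at least one fence in pFences is signaled.

  • timeout is the timeout period in units of nanoseconds. timeout is adjusted to the closest value allowed by the implementation-dependent timeout accuracy, which may be substantially longer than one nanosecond, and may be longer than the requested period.

If the condition is satisfied when vkWaitForFences is called, then vkWaitForFences returns immediately. If the condition is not satisfied at the time vkWaitForFences is called, then vkWaitForFences will block and wait up to timeout nanoseconds for the condition to become satisfied.

If timeout is zero, then vkWaitForFences does not wait, but simply returns the current state of the fences. VK_TIMEOUT will be returned in this case if the condition is not satisfied, even though no actual wait was performed.

If the specified timeout period expires before the condition is satisfied, vkWaitForFences returns VK_TIMEOUT. If the condition is satisfied before timeout nanoseconds has expired, vkWaitForFences returns VK_SUCCESS.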
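
The waitAll condition and the zero-timeout behaviour can be modeled without a device, as in the sketch below. It operates on a plain array of booleans rather than VkFence handles; the SkWaitResult enum and both function names are hypothetical, with the two result values chosen to mirror VK_SUCCESS and VK_TIMEOUT.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the two success codes of vkWaitForFences. */
typedef enum SkWaitResult {
    SK_SUCCESS = 0, /* mirrors VK_SUCCESS */
    SK_TIMEOUT = 2  /* mirrors VK_TIMEOUT */
} SkWaitResult;

/* Evaluates the unblock condition: all fences signaled when wait_all is
 * true, at least one fence signaled otherwise. count is assumed > 0,
 * matching the fenceCount requirement. */
bool sk_wait_condition_met(const bool *signaled, size_t count, bool wait_all)
{
    size_t signaled_count = 0;
    for (size_t i = 0; i < count; ++i) {
        if (signaled[i])
            ++signaled_count;
    }
    return wait_all ? (signaled_count == count) : (signaled_count > 0);
}

/* Models a call with timeout == 0: no waiting is performed, and the
 * current state is reported as success or timeout. */
SkWaitResult sk_poll_fences(const bool *signaled, size_t count, bool wait_all)
{
    return sk_wait_condition_met(signaled, count, wait_all) ? SK_SUCCESS
                                                            : SK_TIMEOUT;
}
```

With three fences of which two are signaled, polling with wait_all false reports success, while polling with wait_all true reports a timeout.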

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pFences must be a pointer to an array of fenceCount valid VkFence handles

  • fenceCount must be greater than 0

  • Each element of pFences must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_SUCCESS

  • VK_TIMEOUT

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

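An execution dependency is defined by waiting for a fence to become signaled, either via vkWaitForFences or by polling on vkGetFenceStatus.

The first synchronization scope includes only the fence signal operation.

The second synchronization scope includes the host operations of vkWaitForFences or vkGetFenceStatus indicating that the fence has become signaled.

Note

Signaling a fence and waiting on the host does not guarantee that the results of memory accesses will be visible to the host. To provide that guarantee, the application must insert a memory barrier between the device writes and the end of the submission that will signal the fence, with dstAccessMask having the VK_ACCESS_HOST_READ_BIT bit set, with dstStageMask having the VK_PIPELINE_STAGE_HOST_BIT bit set, and with the appropriate srcStageMask and srcAccessMask members set to guarantee completion of the writes. If the memory was allocated without VK_MEMORY_PROPERTY_HOST_COHERENT_BIT set, then vkInvalidateMappedMemoryRanges must be called after the fence is signaled in order to ensure the writes are visible to the host, as described in Host Access to Device Memory Objects.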

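6.4. Semaphores

Semaphores are a synchronization primitive that can be used to insert a dependency between batches submitted to queues. Semaphores have two states - signaled and unsignaled. The state of a semaphore can be signaled after execution of a batch of commands is completed. A batch can wait for a semaphore to become signaled before it begins execution, and the semaphore is also unsignaled before the batch begins execution.

Semaphores are represented by VkSemaphore handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkSemaphore)

To create a semaphore, call: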

VkResult vkCreateSemaphore(
    VkDevice                                    device,
    const VkSemaphoreCreateInfo*                pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkSemaphore*                                pSemaphore);
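  • device is the logical device that creates the semaphore.

  • pCreateInfo is a pointer to an instance of the VkSemaphoreCreateInfo structure which contains information about how the semaphore is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pSemaphore points to a handle in which the resulting semaphore object is returned.

When created, the semaphore is in the unsignaled state.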

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkSemaphoreCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pSemaphore must be a pointer to a VkSemaphore handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkSemaphoreCreateInfo structure is defined as:

typedef struct VkSemaphoreCreateInfo {
    VkStructureType           sType;
    const void*               pNext;
    VkSemaphoreCreateFlags    flags;
} VkSemaphoreCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SEMAPHORE_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

To destroy a semaphore, call:

void vkDestroySemaphore(
    VkDevice                                    device,
    VkSemaphore                                 semaphore,
    const VkAllocationCallbacks*                pAllocator);
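  • device is the logical device that destroys the semaphore.

  • semaphore is the handle of the semaphore to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted batches that refer to semaphore must have completed execution

  • If VkAllocationCallbacks were provided when semaphore was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when semaphore was created, pAllocator must be NULL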

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If semaphore is not VK_NULL_HANDLE, semaphore must be a valid VkSemaphore handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If semaphore is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to semaphore must be externally synchronized

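6.4.1. Semaphore Signaling

When a batch is submitted to a queue via a queue submission, and it includes semaphores to be signaled, it defines a memory dependency on the batch, and defines semaphore signal operations which set the semaphores to the signaled state.

The first synchronization scope includes every command submitted in the same batch. Semaphore signal operations that are defined by vkQueueSubmit additionally include all batches previously submitted to the same queue via vkQueueSubmit, including batches that are submitted in the same queue submission command, but at a lower index within the array of batches.

The second synchronization scope includes only the semaphore signal operation.

The first access scope includes all memory access performed by the device.

The second access scope is empty.

6.4.2. Semaphore Waiting & Unsignaling

When a batch is submitted to a queue via a queue submission, and it includes semaphores to be waited on, it defines a memory dependency between prior semaphore signal operations and the batch, and defines semaphore unsignal operations which set the semaphores to the unsignaled state.

The first synchronization scope includes all semaphore signal operations that operate on semaphores waited on in the same batch, and that happen-before the wait completes.

The second synchronization scope includes every command submitted in the same batch.

In the case of vkQueueSubmit, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by the corresponding element of pWaitDstStageMask. Also, in the case of vkQueueSubmit, the second synchronization scope additionally includes all batches subsequently submitted to the same queue via vkQueueSubmit, including batches that are submitted in the same queue submission command, but at a higher index within the array of batches.

The first access scope is empty.

The second access scope includes all memory access performed by the device.

The semaphore unsignal operation happens-after the first set of operations in the execution dependency, and happens-before the second set of operations in the execution dependency.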

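Note

Unlike fences or events, the act of waiting for a semaphore also unsignals that semaphore. If two operations are separately specified to wait for the same semaphore, and there are no other execution dependencies between those operations, the behavior is undefined. An execution dependency must be present that guarantees that the semaphore unsignal operation for the first of those waits happens-before the semaphore is signaled again, and before the second unsignal operation. Semaphore waits and signals should thus occur in discrete 1:1 pairs.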
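
The 1:1 pairing of signals and waits can be illustrated with a toy host-side model of the two semaphore states. This is only a sketch of the state transitions described above: the SkSemaphore type and its functions are hypothetical, and a real VkSemaphore is signaled and waited on by queue operations, not by host calls like these.

```c
#include <stdbool.h>

/* Toy model of the two semaphore states. */
typedef struct SkSemaphore {
    bool signaled;
} SkSemaphore;

/* Semaphore signal operation: sets the signaled state. */
void sk_semaphore_signal(SkSemaphore *semaphore)
{
    semaphore->signaled = true;
}

/* Models a wait: returns false while the semaphore is unsignaled (the
 * waiting batch could not start yet). When the wait completes, the
 * semaphore is unsignaled again as part of the same operation. */
bool sk_semaphore_try_wait(SkSemaphore *semaphore)
{
    if (!semaphore->signaled)
        return false;
    semaphore->signaled = false;
    return true;
}
```

In this model, a second wait with no intervening signal cannot complete, which is why each wait needs its own matching signal.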

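6.5. Events

Events are a synchronization primitive that can be used to insert a fine-grained dependency between commands submitted to the same queue, or between the host and a queue. Events have two states - signaled and unsignaled. An application can signal an event, or unsignal it, on either the host or the device. A device can wait for an event to become signaled before executing further operations. No command exists to wait for an event to become signaled on the host, but the current state of an event can be queried.

Events are represented by VkEvent handles:

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkEvent)

To create an event, call: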

VkResult vkCreateEvent(
    VkDevice                                    device,
    const VkEventCreateInfo*                    pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkEvent*                                    pEvent);
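  • device is the logical device that creates the event.

  • pCreateInfo is a pointer to an instance of the VkEventCreateInfo structure which contains information about how the event is to be created.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pEvent points to a handle in which the resulting event object is returned.

When created, the event object is in the unsignaled state.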

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkEventCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pEvent must be a pointer to a VkEvent handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkEventCreateInfo structure is defined as:

typedef struct VkEventCreateInfo {
    VkStructureType       sType;
    const void*           pNext;
    VkEventCreateFlags    flags;
} VkEventCreateInfo;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_EVENT_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

To destroy an event, call:

void vkDestroyEvent(
    VkDevice                                    device,
    VkEvent                                     event,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the event.

  • event is the handle of the event to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to event must have completed execution

  • If VkAllocationCallbacks were provided when event was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when event was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If event is not VK_NULL_HANDLE, event must be a valid VkEvent handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If event is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

To query the state of an event from the host, call:

VkResult vkGetEventStatus(
    VkDevice                                    device,
    VkEvent                                     event);
  • device is the logical device that owns the event.

  • event is the handle of the event to query.

Upon success, vkGetEventStatus returns the state of the event object with the following return codes:

Table 6. Event Object Status Codes
Status            Meaning
VK_EVENT_SET      The event specified by event is signaled.
VK_EVENT_RESET    The event specified by event is unsignaled.

If a vkCmdSetEvent or vkCmdResetEvent command is pending execution, then the value returned by this command may immediately be out of date.

The state of an event can be updated by the host. The state of the event is immediately changed, and subsequent calls to vkGetEventStatus will return the new state. If an event is already in the requested state, then updating it to the same state has no effect.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Return Codes
Success
  • VK_EVENT_SET

  • VK_EVENT_RESET

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To set the state of an event to signaled from the host, call:

VkResult vkSetEvent(
    VkDevice                                    device,
    VkEvent                                     event);
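  • device is the logical device that owns the event.

  • event is the event to set.

When vkSetEvent is executed on the host, it defines an event signal operation which sets the event to the signaled state.

If event is already in the signaled state when vkSetEvent is executed, then vkSetEvent has no effect, and no event signal operation occurs.
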
Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

To set the state of an event to unsignaled from the host, call:

VkResult vkResetEvent(
    VkDevice                                    device,
    VkEvent                                     event);
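  • device is the logical device that owns the event.

  • event is the event to reset.

When vkResetEvent is executed on the host, it defines an event unsignal operation which resets the event to the unsignaled state. If event is already in the unsignaled state when vkResetEvent is executed, then vkResetEvent has no effect, and no event unsignal operation occurs.

Valid Usage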
  • event must not be waited on by a vkCmdWaitEvents command that is currently executing

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • event must be a valid VkEvent handle

  • event must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to event must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY
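
The host-side behavior described above for vkSetEvent, vkResetEvent and vkGetEventStatus can be pictured as a two-state machine. The following is only a minimal sketch of that state machine, not Vulkan code: every name prefixed with Sketch or sketch_ is a hypothetical stand-in introduced for illustration, and no device or driver is involved.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are NOT the Vulkan types. */
typedef enum SketchEventStatus {
    SKETCH_EVENT_RESET = 0, /* unsignaled */
    SKETCH_EVENT_SET = 1    /* signaled */
} SketchEventStatus;

typedef struct SketchEvent {
    SketchEventStatus status;
} SketchEvent;

/* A newly created event starts in the unsignaled state. */
SketchEvent sketch_create_event(void) {
    SketchEvent e = { SKETCH_EVENT_RESET };
    return e;
}

/* Returns true if an event signal operation occurred,
 * false if the event was already signaled (no effect). */
bool sketch_set_event(SketchEvent *e) {
    if (e->status == SKETCH_EVENT_SET) {
        return false;
    }
    e->status = SKETCH_EVENT_SET;
    return true;
}

/* Returns true if an event unsignal operation occurred,
 * false if the event was already unsignaled (no effect). */
bool sketch_reset_event(SketchEvent *e) {
    if (e->status == SKETCH_EVENT_RESET) {
        return false;
    }
    e->status = SKETCH_EVENT_RESET;
    return true;
}

/* Queries the current state; the host cannot block on it. */
SketchEventStatus sketch_get_event_status(const SketchEvent *e) {
    return e->status;
}
```

Because the host has no command that blocks until an event is signaled, an application that needs to observe an event from the host would poll the status query in the same way this model is queried.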

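The state of an event can also be updated on the device by commands inserted in command buffers.

To set an event to the signaled state on the device, call: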

void vkCmdSetEvent(
    VkCommandBuffer                             commandBuffer,
    VkEvent                                     event,
    VkPipelineStageFlags                        stageMask);
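  • commandBuffer is the command buffer into which the command is recorded.

  • event is the event that will be signaled.

  • stageMask specifies the source stage mask used to determine when the event is signaled.

When vkCmdSetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event signal operation which sets the event to the signaled state.

The first synchronization scope includes every command previously submitted to the same queue, including those in the same command buffer and batch. The synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by stageMask.

The second synchronization scope includes only the event signal operation.

If event is already in the signaled state when vkCmdSetEvent is executed on the device, then vkCmdSetEvent has no effect, no event signal operation occurs, and no execution dependency is generated.

Valid Usage
  • stageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If the geometry shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT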

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • event must be a valid VkEvent handle

  • stageMask must be a valid combination of VkPipelineStageFlagBits values

  • stageMask must not be 0

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Both of commandBuffer, and event must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary, Secondary

  • Render Pass Scope: Outside

  • Supported Queue Types: Graphics, Compute

  • Pipeline Type: none listed

To set an event to the unsignaled state on the device, call:

void vkCmdResetEvent(
    VkCommandBuffer                             commandBuffer,
    VkEvent                                     event,
    VkPipelineStageFlags                        stageMask);
  • commandBuffer is the command buffer into which the command is recorded.

  • event is the event that will be unsignaled.

  • stageMask specifies the source stage mask used to determine when the event is unsignaled.

When vkCmdResetEvent is submitted to a queue, it defines an execution dependency on commands that were submitted before it, and defines an event unsignal operation which resets the event to the unsignaled state.

The first synchronization scope includes every command previously submitted to the same queue, including those in the same command buffer and batch. The synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by stageMask.

The second synchronization scope includes only the event unsignal operation.

If event is already in the unsignaled state when vkCmdResetEvent is executed on the device, then vkCmdResetEvent has no effect, no event unsignal operation occurs, and no execution dependency is generated.

Valid Usage
  • stageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If the geometry shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, stageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • When this command executes, event must not be waited on by a vkCmdWaitEvents command that is currently executing

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • event must be a valid VkEvent handle

  • stageMask must be a valid combination of VkPipelineStageFlagBits values

  • stageMask must not be 0

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

  • Both of commandBuffer, and event must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary, Secondary

  • Render Pass Scope: Outside

  • Supported Queue Types: Graphics, Compute

  • Pipeline Type: none listed

To wait for one or more events to enter the signaled state on a device, call:

void vkCmdWaitEvents(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    eventCount,
    const VkEvent*                              pEvents,
    VkPipelineStageFlags                        srcStageMask,
    VkPipelineStageFlags                        dstStageMask,
    uint32_t                                    memoryBarrierCount,
    const VkMemoryBarrier*                      pMemoryBarriers,
    uint32_t                                    bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier*                pBufferMemoryBarriers,
    uint32_t                                    imageMemoryBarrierCount,
    const VkImageMemoryBarrier*                 pImageMemoryBarriers);
  • commandBuffer is the command buffer into which the command is recorded.

  • eventCount is the length of the pEvents array.

  • pEvents is an array of event object handles to wait on.

  • srcStageMask is the source stage mask.

  • dstStageMask is the destination stage mask.

  • memoryBarrierCount is the length of the pMemoryBarriers array.

  • pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.

  • bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.

  • pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.

  • imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.

  • pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.

When vkCmdWaitEvents is submitted to a queue, it defines a memory dependency between prior event signal operations and subsequent commands.

The first synchronization scope only includes event signal operations that operate on members of pEvents, and the operations that happened-before the event signal operations. Event signal operations performed by vkCmdSetEvent that were previously submitted to the same queue are included in the first synchronization scope, if the logically latest pipeline stage in their stageMask parameter is logically earlier than or equal to the logically latest pipeline stage in srcStageMask. Event signal operations performed by vkSetEvent are only included in the first synchronization scope if VK_PIPELINE_STAGE_HOST_BIT is included in srcStageMask.

The second synchronization scope includes commands subsequently submitted to the same queue, including those in the same command buffer and batch. The second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by dstStageMask.

The first access scope is limited to access in the pipeline stages determined by the source stage mask specified by srcStageMask. Within that, the first access scope only includes the first access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the first access scope includes no accesses.

The second access scope is limited to access in the pipeline stages determined by the destination stage mask specified by dstStageMask. Within that, the second access scope only includes the second access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the second access scope includes no accesses.

Note

vkCmdWaitEvents is used with vkCmdSetEvent to define a memory dependency between two sets of action commands, roughly in the same way as pipeline barriers, but split into two commands such that work between the two may execute unhindered.

Note

Applications should be careful to avoid race conditions when using events. There is no direct ordering guarantee between a vkCmdResetEvent command and a vkCmdWaitEvents command submitted after it, so some other execution dependency must be included between these commands (e.g. a semaphore).

Valid Usage
  • srcStageMask must be the bitwise OR of the stageMask parameter used in previous calls to vkCmdSetEvent with any of the members of pEvents and VK_PIPELINE_STAGE_HOST_BIT if any of the members of pEvents was set using vkSetEvent

  • If the geometry shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the geometry shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If the tessellation shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If pEvents includes one or more events that will be signaled by vkSetEvent after commandBuffer has been submitted to a queue, then vkCmdWaitEvents must not be called inside a render pass instance

  • Any pipeline stage included in srcStageMask or dstStageMask must be supported by the capabilities of the queue family specified by the queueFamilyIndex member of the VkCommandPoolCreateInfo structure that was used to create the VkCommandPool that commandBuffer was allocated from, as specified in the table of supported pipeline stages.

  • Any given element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its srcAccessMask member if that bit is not supported by any of the pipeline stages in srcStageMask, as specified in the table of supported access types.

  • Any given element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its dstAccessMask member if that bit is not supported by any of the pipeline stages in dstStageMask, as specified in the table of supported access types.
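
The first rule in the list above, that srcStageMask is the bitwise OR of every stageMask passed to vkCmdSetEvent for the waited events plus the host bit when vkSetEvent was used, can be sketched as a small helper. This is only an illustration under stated assumptions: the function name and the SKETCH_ constants are hypothetical, and the constant values are chosen to match the corresponding VK_PIPELINE_STAGE_* bits as commonly defined in the Vulkan headers, which should be checked against the headers in use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VK_PIPELINE_STAGE_* bits (values assumed to
 * match the Vulkan headers; verify against vulkan_core.h). */
#define SKETCH_STAGE_VERTEX_SHADER_BIT   0x00000008u
#define SKETCH_STAGE_FRAGMENT_SHADER_BIT 0x00000080u
#define SKETCH_STAGE_HOST_BIT            0x00004000u

/* Builds the srcStageMask for a wait on a set of events.
 * setStageMasks: the stageMask values used by earlier device-side set calls.
 * anySetOnHost:  true if any waited event is set from the host. */
uint32_t sketch_wait_src_stage_mask(const uint32_t *setStageMasks,
                                    size_t count,
                                    bool anySetOnHost) {
    uint32_t mask = 0;
    for (size_t i = 0; i < count; ++i) {
        mask |= setStageMasks[i]; /* accumulate every device-side stage */
    }
    if (anySetOnHost) {
        mask |= SKETCH_STAGE_HOST_BIT; /* host-set events need the host stage */
    }
    return mask;
}
```

An application would pass the resulting mask as srcStageMask to vkCmdWaitEvents; tracking the per-event stage masks at record time is left to the caller in this sketch.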

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pEvents must be a pointer to an array of eventCount valid VkEvent handles

  • srcStageMask must be a valid combination of VkPipelineStageFlagBits values

  • srcStageMask must not be 0

  • dstStageMask must be a valid combination of VkPipelineStageFlagBits values

  • dstStageMask must not be 0

  • If memoryBarrierCount is not 0, pMemoryBarriers must be a pointer to an array of memoryBarrierCount valid VkMemoryBarrier structures

  • If bufferMemoryBarrierCount is not 0, pBufferMemoryBarriers must be a pointer to an array of bufferMemoryBarrierCount valid VkBufferMemoryBarrier structures

  • If imageMemoryBarrierCount is not 0, pImageMemoryBarriers must be a pointer to an array of imageMemoryBarrierCount valid VkImageMemoryBarrier structures

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • eventCount must be greater than 0

  • Both of commandBuffer, and the elements of pEvents must have been created, allocated, or retrieved from the same VkDevice

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary, Secondary

  • Render Pass Scope: Both

  • Supported Queue Types: Graphics, Compute

  • Pipeline Type: none listed

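6.6. Pipeline Barriers

vkCmdPipelineBarrier is a synchronization command that inserts a dependency between commands submitted to the same queue, or between commands in the same subpass.

To record a pipeline barrier, call: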

void vkCmdPipelineBarrier(
    VkCommandBuffer                             commandBuffer,
    VkPipelineStageFlags                        srcStageMask,
    VkPipelineStageFlags                        dstStageMask,
    VkDependencyFlags                           dependencyFlags,
    uint32_t                                    memoryBarrierCount,
    const VkMemoryBarrier*                      pMemoryBarriers,
    uint32_t                                    bufferMemoryBarrierCount,
    const VkBufferMemoryBarrier*                pBufferMemoryBarriers,
    uint32_t                                    imageMemoryBarrierCount,
    const VkImageMemoryBarrier*                 pImageMemoryBarriers);
  • commandBuffer is the command buffer into which the command is recorded.

  • srcStageMask defines a source stage mask.

  • dstStageMask defines a destination stage mask.

  • dependencyFlags is a bitmask of VkDependencyFlagBits. The bits that can be included in dependencyFlags are:

    typedef enum VkDependencyFlagBits {
        VK_DEPENDENCY_BY_REGION_BIT = 0x00000001,
    } VkDependencyFlagBits;
  • memoryBarrierCount is the length of the pMemoryBarriers array.

  • pMemoryBarriers is a pointer to an array of VkMemoryBarrier structures.

  • bufferMemoryBarrierCount is the length of the pBufferMemoryBarriers array.

  • pBufferMemoryBarriers is a pointer to an array of VkBufferMemoryBarrier structures.

  • imageMemoryBarrierCount is the length of the pImageMemoryBarriers array.

  • pImageMemoryBarriers is a pointer to an array of VkImageMemoryBarrier structures.

When vkCmdPipelineBarrier is submitted to a queue, it defines a memory dependency between commands that were submitted before it and those submitted after it.

If vkCmdPipelineBarrier was recorded outside a render pass instance, the first synchronization scope includes every command submitted to the same queue before it, including those in the same command buffer and batch. If vkCmdPipelineBarrier was recorded inside a render pass instance, the first synchronization scope includes only commands submitted before it within the same subpass. In either case, the first synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by srcStageMask.

If vkCmdPipelineBarrier was recorded outside a render pass instance, the second synchronization scope includes every command submitted to the same queue after it, including those in the same command buffer and batch. If vkCmdPipelineBarrier was recorded inside a render pass instance, the second synchronization scope includes only commands submitted after it within the same subpass. In either case, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by dstStageMask.

The first access scope is limited to access in the pipeline stages determined by the source stage mask specified by srcStageMask. Within that, the first access scope only includes the first access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the first access scope includes no accesses.

The second access scope is limited to access in the pipeline stages determined by the destination stage mask specified by dstStageMask. Within that, the second access scope only includes the second access scopes defined by elements of the pMemoryBarriers, pBufferMemoryBarriers and pImageMemoryBarriers arrays, which each define a set of memory barriers. If no memory barriers are specified, then the second access scope includes no accesses.

If dependencyFlags includes VK_DEPENDENCY_BY_REGION_BIT, then any dependency between framebuffer-space pipeline stages is framebuffer-local; otherwise it is framebuffer-global.

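Valid Usage
  • If the geometry shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the geometry shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If the tessellation shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If vkCmdPipelineBarrier is called within a render pass instance, the render pass must have been created with a VkSubpassDependency that expresses a dependency from the current subpass to itself. Additionally: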

    • srcStageMask must contain a subset of the bit values in the srcStageMask member of that instance of VkSubpassDependency

    • dstStageMask must contain a subset of the bit values in the dstStageMask member of that instance of VkSubpassDependency

    • The srcAccessMask of any element of pMemoryBarriers or pImageMemoryBarriers must contain a subset of the bit values in the srcAccessMask member of that instance of VkSubpassDependency

    • The dstAccessMask of any element of pMemoryBarriers or pImageMemoryBarriers must contain a subset of the bit values in the dstAccessMask member of that instance of VkSubpassDependency

    • dependencyFlags must be equal to the dependencyFlags member of that instance of VkSubpassDependency

  • If vkCmdPipelineBarrier is called within a render pass instance, bufferMemoryBarrierCount must be 0

  • If vkCmdPipelineBarrier is called within a render pass instance, the image member of any element of pImageMemoryBarriers must be equal to one of the elements of pAttachments that the current framebuffer was created with, that is also referred to by one of the elements of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with

  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of any element of pImageMemoryBarriers must be equal to the layout member of an element of the pColorAttachments, pResolveAttachments or pDepthStencilAttachment members of the VkSubpassDescription instance that the current subpass was created with, that refers to the same image

  • If vkCmdPipelineBarrier is called within a render pass instance, the oldLayout and newLayout members of an element of pImageMemoryBarriers must be equal

  • If vkCmdPipelineBarrier is called within a render pass instance, the srcQueueFamilyIndex and dstQueueFamilyIndex members of any element of pImageMemoryBarriers must be VK_QUEUE_FAMILY_IGNORED

  • Any pipeline stage included in srcStageMask or dstStageMask must be supported by the capabilities of the queue family specified by the queueFamilyIndex member of the VkCommandPoolCreateInfo structure that was used to create the VkCommandPool that commandBuffer was allocated from, as specified in the table of supported pipeline stages.

  • Any given element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its srcAccessMask member if that bit is not supported by any of the pipeline stages in srcStageMask, as specified in the table of supported access types.

  • Any given element of pMemoryBarriers, pBufferMemoryBarriers or pImageMemoryBarriers must not have any access flag included in its dstAccessMask member if that bit is not supported by any of the pipeline stages in dstStageMask, as specified in the table of supported access types.

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • srcStageMask must be a valid combination of VkPipelineStageFlagBits values

  • srcStageMask must not be 0

  • dstStageMask must be a valid combination of VkPipelineStageFlagBits values

  • dstStageMask must not be 0

  • dependencyFlags must be a valid combination of VkDependencyFlagBits values

  • If memoryBarrierCount is not 0, pMemoryBarriers must be a pointer to an array of memoryBarrierCount valid VkMemoryBarrier structures

  • If bufferMemoryBarrierCount is not 0, pBufferMemoryBarriers must be a pointer to an array of bufferMemoryBarrierCount valid VkBufferMemoryBarrier structures

  • If imageMemoryBarrierCount is not 0, pImageMemoryBarriers must be a pointer to an array of imageMemoryBarrierCount valid VkImageMemoryBarrier structures

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support transfer, graphics, or compute operations

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
  • Command Buffer Levels: Primary, Secondary

  • Render Pass Scope: Both

  • Supported Queue Types: Transfer, Graphics, Compute

  • Pipeline Type: none listed

6.6.1. Subpass Self-dependency

If vkCmdPipelineBarrier is called inside a render pass instance, the following restrictions apply. For a given subpass to allow a pipeline barrier, the render pass must declare a self-dependency from that subpass to itself. That is, there must exist a VkSubpassDependency in the subpass dependency list for the render pass with srcSubpass and dstSubpass equal to that subpass index. More than one self-dependency can be declared for each subpass. Self-dependencies must only include pipeline stage bits that are graphics stages. Self-dependencies must not have any earlier pipeline stages depend on any later pipeline stages. More precisely, this means that whatever is the last pipeline stage in srcStageMask must be no later than whatever is the first pipeline stage in dstStageMask (the latest source stage can be equal to the earliest destination stage). If the source and destination stage masks both include framebuffer-space stages, then dependencyFlags must include VK_DEPENDENCY_BY_REGION_BIT.

A vkCmdPipelineBarrier command inside a render pass instance must be a subset of one of the self-dependencies of the subpass it is used in, meaning that the stage masks and access masks must each include only a subset of the bits of the corresponding mask in that self-dependency. If the self-dependency has VK_DEPENDENCY_BY_REGION_BIT set, then so must the pipeline barrier. Pipeline barriers within a render pass instance can only be types VkMemoryBarrier or VkImageMemoryBarrier. If a VkImageMemoryBarrier is used, the image and image subresource range specified in the barrier must be a subset of one of the image views used by the framebuffer in the current subpass. Additionally, oldLayout must be equal to newLayout, and both the srcQueueFamilyIndex and dstQueueFamilyIndex must be VK_QUEUE_FAMILY_IGNORED.
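
The subset requirement described above reduces to a bit test: a barrier mask is a subset of the self-dependency mask when it has no bits outside it. The following is a minimal sketch of that check only, covering the stage masks, access masks and the by-region flag; it does not cover the image, layout or queue family restrictions. The struct and function names are hypothetical stand-ins, not Vulkan types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal VK_DEPENDENCY_BY_REGION_BIT (0x00000001). */
#define SKETCH_DEPENDENCY_BY_REGION_BIT 0x00000001u

/* Hypothetical container for the masks shared by a pipeline barrier
 * and a subpass self-dependency. */
typedef struct SketchDependencyMasks {
    uint32_t srcStageMask;
    uint32_t dstStageMask;
    uint32_t srcAccessMask;
    uint32_t dstAccessMask;
    uint32_t dependencyFlags;
} SketchDependencyMasks;

/* True when every bit set in 'bits' is also set in 'allowed'. */
static bool sketch_is_subset(uint32_t bits, uint32_t allowed) {
    return (bits & ~allowed) == 0;
}

/* Checks whether a barrier recorded inside a render pass instance fits
 * within one declared self-dependency of the current subpass. */
bool sketch_barrier_fits_self_dependency(const SketchDependencyMasks *barrier,
                                         const SketchDependencyMasks *selfDep) {
    if (!sketch_is_subset(barrier->srcStageMask, selfDep->srcStageMask) ||
        !sketch_is_subset(barrier->dstStageMask, selfDep->dstStageMask) ||
        !sketch_is_subset(barrier->srcAccessMask, selfDep->srcAccessMask) ||
        !sketch_is_subset(barrier->dstAccessMask, selfDep->dstAccessMask)) {
        return false;
    }
    /* If the self-dependency is by-region, the barrier must be too. */
    if ((selfDep->dependencyFlags & SKETCH_DEPENDENCY_BY_REGION_BIT) != 0 &&
        (barrier->dependencyFlags & SKETCH_DEPENDENCY_BY_REGION_BIT) == 0) {
        return false;
    }
    return true;
}
```

Since more than one self-dependency can be declared per subpass, a caller would run this check against each declared self-dependency and accept the barrier if any one of them contains it.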

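6.7. Memory Barriers

Memory barriers are used to explicitly control access to buffer and image subresource ranges. Memory barriers are used to transfer ownership between queue families, change image layouts, and define availability and visibility operations. They explicitly define the access types and the buffer and image subresource ranges that are included in the access scopes of a memory dependency created by a synchronization command that includes them.

6.7.1. Global Memory Barriers

Global memory barriers apply to memory accesses involving all memory objects that exist at the time of their execution.

The VkMemoryBarrier structure is defined as: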

typedef struct VkMemoryBarrier {
    VkStructureType    sType;
    const void*        pNext;
    VkAccessFlags      srcAccessMask;
    VkAccessFlags      dstAccessMask;
} VkMemoryBarrier;
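  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • srcAccessMask defines a source access mask.

  • dstAccessMask defines a destination access mask.

The first access scope is limited to access types in the source access mask specified by srcAccessMask.

The second access scope is limited to access types in the destination access mask specified by dstAccessMask.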

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_MEMORY_BARRIER

  • pNext must be NULL

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

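6.7.2. Buffer Memory Barriers

Buffer memory barriers only apply to memory accesses involving a specific buffer range. That is, a memory dependency formed from a buffer memory barrier is scoped to access via the specified buffer range. Buffer memory barriers can also be used to define a queue family ownership transfer for the specified buffer range.

The VkBufferMemoryBarrier structure is defined as: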

typedef struct VkBufferMemoryBarrier {
    VkStructureType    sType;
    const void*        pNext;
    VkAccessFlags      srcAccessMask;
    VkAccessFlags      dstAccessMask;
    uint32_t           srcQueueFamilyIndex;
    uint32_t           dstQueueFamilyIndex;
    VkBuffer           buffer;
    VkDeviceSize       offset;
    VkDeviceSize       size;
} VkBufferMemoryBarrier;
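  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • srcAccessMask defines a source access mask.

  • dstAccessMask defines a destination access mask.

  • srcQueueFamilyIndex is the source queue family for a queue family ownership transfer.

  • dstQueueFamilyIndex is the destination queue family for a queue family ownership transfer.

  • buffer is a handle to the buffer whose backing memory is affected by the barrier.

  • offset is an offset in bytes into the backing memory for buffer; this is relative to the base offset as bound to the buffer (see vkBindBufferMemory).

  • size is a size in bytes of the affected area of backing memory for buffer, or VK_WHOLE_SIZE to use the range from offset to the end of the buffer.

The first access scope is limited to access to memory through the specified buffer range, via access types in the source access mask specified by srcAccessMask. If srcAccessMask includes VK_ACCESS_HOST_WRITE_BIT, memory writes performed by that access type are also made visible, as that access type is not performed through a resource.

The second access scope is limited to access to memory through the specified buffer range, via access types in the destination access mask specified by dstAccessMask. If dstAccessMask includes VK_ACCESS_HOST_WRITE_BIT or VK_ACCESS_HOST_READ_BIT, available memory writes are also made visible to accesses of those types, as those access types are not performed through a resource.

If srcQueueFamilyIndex is not equal to dstQueueFamilyIndex, and srcQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family release operation for the specified buffer range, and the second access scope includes no access, as if dstAccessMask was 0.

If dstQueueFamilyIndex is not equal to srcQueueFamilyIndex, and dstQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family acquire operation for the specified buffer range, and the first access scope includes no access, as if srcAccessMask was 0.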
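
The two rules above decide whether a barrier acts as the release half or the acquire half of an ownership transfer, based only on the two queue family indices and the family of the queue executing the barrier. A minimal sketch of that decision follows. The enum and function names are hypothetical stand-ins, and SKETCH_QUEUE_FAMILY_IGNORED is assumed to have the same value as VK_QUEUE_FAMILY_IGNORED.

```c
#include <stdint.h>

/* Assumed to equal VK_QUEUE_FAMILY_IGNORED (~0U). */
#define SKETCH_QUEUE_FAMILY_IGNORED UINT32_MAX

/* Hypothetical classification of a barrier's ownership role. */
typedef enum SketchOwnershipOp {
    SKETCH_OWNERSHIP_NONE = 0,
    SKETCH_OWNERSHIP_RELEASE = 1,
    SKETCH_OWNERSHIP_ACQUIRE = 2
} SketchOwnershipOp;

/* Classifies a barrier given its source and destination queue family
 * indices and the family of the queue that executes the barrier. */
SketchOwnershipOp sketch_classify_barrier(uint32_t srcQueueFamilyIndex,
                                          uint32_t dstQueueFamilyIndex,
                                          uint32_t currentQueueFamily) {
    if (srcQueueFamilyIndex == dstQueueFamilyIndex) {
        /* Includes the case where both are IGNORED: no transfer. */
        return SKETCH_OWNERSHIP_NONE;
    }
    if (srcQueueFamilyIndex == currentQueueFamily) {
        return SKETCH_OWNERSHIP_RELEASE; /* second access scope is empty */
    }
    if (dstQueueFamilyIndex == currentQueueFamily) {
        return SKETCH_OWNERSHIP_ACQUIRE; /* first access scope is empty */
    }
    /* Neither index names the executing family; treated as no operation
     * in this sketch, since valid usage does not allow this combination. */
    return SKETCH_OWNERSHIP_NONE;
}
```

A complete transfer consists of one barrier classified as a release on the source family's queue and a matching barrier classified as an acquire on the destination family's queue, ordered by an execution dependency such as a semaphore.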

Valid Usage
  • offset must be less than the size of buffer

  • If size is not equal to VK_WHOLE_SIZE, size must be greater than 0

  • If size is not equal to VK_WHOLE_SIZE, size must be less than or equal to the size of buffer minus offset

  • If buffer was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, srcQueueFamilyIndex and dstQueueFamilyIndex must both be VK_QUEUE_FAMILY_IGNORED

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, srcQueueFamilyIndex and dstQueueFamilyIndex must either both be VK_QUEUE_FAMILY_IGNORED, or both be a valid queue family (see Queue Family Properties)

  • If buffer was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, and srcQueueFamilyIndex and dstQueueFamilyIndex are valid queue families, at least one of them must be the same as the family of the queue that will execute this barrier
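
The first three rules in the list above constrain offset and size against the size of the buffer. As an illustration, they can be sketched as one predicate. The function name is hypothetical, and SKETCH_WHOLE_SIZE is assumed to have the same value as VK_WHOLE_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal VK_WHOLE_SIZE (~0ULL). */
#define SKETCH_WHOLE_SIZE UINT64_MAX

/* Returns true when (offset, size) describes a valid range of a buffer
 * of bufferSize bytes according to the three range rules above. */
bool sketch_buffer_barrier_range_valid(uint64_t bufferSize,
                                       uint64_t offset,
                                       uint64_t size) {
    if (offset >= bufferSize) {
        return false; /* offset must be less than the buffer size */
    }
    if (size == SKETCH_WHOLE_SIZE) {
        return true; /* range runs from offset to the end of the buffer */
    }
    /* offset < bufferSize here, so the subtraction cannot wrap around. */
    return size > 0 && size <= bufferSize - offset;
}
```

The comparison is written as size <= bufferSize - offset rather than offset + size <= bufferSize so that a large size cannot overflow the sum.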

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER

  • pNext must be NULL

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

  • buffer must be a valid VkBuffer handle

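6.7.3. Image Memory Barriers

Image memory barriers only apply to memory accesses involving a specific image subresource range. That is, a memory dependency formed from an image memory barrier is scoped to access via the specified image subresource range. Image memory barriers can also be used to define image layout transitions or a queue family ownership transfer for the specified image subresource range.

The VkImageMemoryBarrier structure is defined as: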

typedef struct VkImageMemoryBarrier {
    VkStructureType            sType;
    const void*                pNext;
    VkAccessFlags              srcAccessMask;
    VkAccessFlags              dstAccessMask;
    VkImageLayout              oldLayout;
    VkImageLayout              newLayout;
    uint32_t                   srcQueueFamilyIndex;
    uint32_t                   dstQueueFamilyIndex;
    VkImage                    image;
    VkImageSubresourceRange    subresourceRange;
} VkImageMemoryBarrier;

The first access scope is limited to access to memory through the specified image subresource range, via access types in the source access mask specified by srcAccessMask. If srcAccessMask includes VK_ACCESS_HOST_WRITE_BIT, memory writes performed by that access type are also made visible, as that access type is not performed through a resource.

The second access scope is limited to access to memory through the specified image subresource range, via access types in the destination access mask specified by dstAccessMask. If dstAccessMask includes VK_ACCESS_HOST_WRITE_BIT or VK_ACCESS_HOST_READ_BIT, available memory writes are also made visible to accesses of those types, as those access types are not performed through a resource.

If srcQueueFamilyIndex is not equal to dstQueueFamilyIndex, and srcQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family release operation for the specified image subresource range, and the second access scope includes no access, as if dstAccessMask was 0.

If dstQueueFamilyIndex is not equal to srcQueueFamilyIndex, and dstQueueFamilyIndex is equal to the current queue family, then the memory barrier defines a queue family acquire operation for the specified image subresource range, and the first access scope includes no access, as if srcAccessMask was 0.

If oldLayout is not equal to newLayout, then the memory barrier defines an image layout transition for the specified image subresource range.

Layout transitions that are performed via image memory barriers execute in their entirety in submission order, relative to other image layout transitions submitted to the same queue, including those performed by render passes. In effect there is an implicit execution dependency from each such layout transition to all layout transitions previously submitted to the same queue.

Valid Usage
  • oldLayout must be VK_IMAGE_LAYOUT_UNDEFINED or the current layout of the image subresources affected by the barrier

  • newLayout must not be VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED

  • If image was created with a sharing mode of VK_SHARING_MODE_CONCURRENT, srcQueueFamilyIndex and dstQueueFamilyIndex must both be VK_QUEUE_FAMILY_IGNORED

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, srcQueueFamilyIndex and dstQueueFamilyIndex must either both be VK_QUEUE_FAMILY_IGNORED, or both be a valid queue family (see Queue Family Properties)

  • If image was created with a sharing mode of VK_SHARING_MODE_EXCLUSIVE, and srcQueueFamilyIndex and dstQueueFamilyIndex are valid queue families, at least one of them must be the same as the family of the queue that will execute this barrier

  • subresourceRange must be a valid image subresource range for the image (see Image Views)

  • If image has a depth/stencil format with both depth and stencil components, then the aspectMask member of subresourceRange must include both VK_IMAGE_ASPECT_DEPTH_BIT and VK_IMAGE_ASPECT_STENCIL_BIT

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL then image must have been created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL then image must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL then image must have been created with VK_IMAGE_USAGE_SAMPLED_BIT or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL then image must have been created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT set

  • If either oldLayout or newLayout is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL then image must have been created with VK_IMAGE_USAGE_TRANSFER_DST_BIT set
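The layout-to-usage rules in the list above can be read as a lookup table. The following is a minimal sketch of such a check, not part of the API. The enum names, `required_usage_for_layout` and `barrier_layouts_allowed` are hypothetical, and the numeric values are assumed to match the Vulkan 1.0 headers so that the sketch builds without them. It covers only the newLayout restriction and the usage-bit rules, and does not check that oldLayout matches the current layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; values assumed to match VkImageLayout in Vulkan 1.0. */
enum sketch_layout {
    LAYOUT_UNDEFINED = 0,
    LAYOUT_GENERAL = 1,
    LAYOUT_COLOR_ATTACHMENT_OPTIMAL = 2,
    LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL = 3,
    LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL = 4,
    LAYOUT_SHADER_READ_ONLY_OPTIMAL = 5,
    LAYOUT_TRANSFER_SRC_OPTIMAL = 6,
    LAYOUT_TRANSFER_DST_OPTIMAL = 7,
    LAYOUT_PREINITIALIZED = 8
};

/* Local stand-ins; values assumed to match VkImageUsageFlagBits. */
enum sketch_usage {
    USAGE_TRANSFER_SRC = 0x01,
    USAGE_TRANSFER_DST = 0x02,
    USAGE_SAMPLED = 0x04,
    USAGE_COLOR_ATTACHMENT = 0x10,
    USAGE_DEPTH_STENCIL_ATTACHMENT = 0x20,
    USAGE_INPUT_ATTACHMENT = 0x80
};

/* Usage bits of which at least one must be set for the given layout.
 * Returns 0 when the list above places no usage requirement on it. */
uint32_t required_usage_for_layout(int layout)
{
    /* The sketch only knows the nine core layouts. */
    assert(layout >= LAYOUT_UNDEFINED && layout <= LAYOUT_PREINITIALIZED);
    switch (layout) {
    case LAYOUT_COLOR_ATTACHMENT_OPTIMAL:
        return USAGE_COLOR_ATTACHMENT;
    case LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL:
    case LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL:
        return USAGE_DEPTH_STENCIL_ATTACHMENT;
    case LAYOUT_SHADER_READ_ONLY_OPTIMAL:
        return USAGE_SAMPLED | USAGE_INPUT_ATTACHMENT;
    case LAYOUT_TRANSFER_SRC_OPTIMAL:
        return USAGE_TRANSFER_SRC;
    case LAYOUT_TRANSFER_DST_OPTIMAL:
        return USAGE_TRANSFER_DST;
    default:
        return 0;
    }
}

/* Checks the newLayout restriction and the usage-bit rules for one barrier. */
bool barrier_layouts_allowed(int old_layout, int new_layout, uint32_t image_usage)
{
    uint32_t need_old = required_usage_for_layout(old_layout);
    uint32_t need_new = required_usage_for_layout(new_layout);

    if (new_layout == LAYOUT_UNDEFINED || new_layout == LAYOUT_PREINITIALIZED)
        return false;
    if (need_old != 0 && (image_usage & need_old) == 0)
        return false;
    if (need_new != 0 && (image_usage & need_new) == 0)
        return false;
    return true;
}
```

In real code the same table would be keyed on the VkImageLayout and VkImageUsageFlagBits values from the Vulkan headers; validation layers perform checks of this kind more completely.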

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER

  • pNext must be NULL

  • srcAccessMask must be a valid combination of VkAccessFlagBits values

  • dstAccessMask must be a valid combination of VkAccessFlagBits values

  • oldLayout must be a valid VkImageLayout value

  • newLayout must be a valid VkImageLayout value

  • image must be a valid VkImage handle

  • subresourceRange must be a valid VkImageSubresourceRange structure

6.7.4. Queue Family Ownership Transfer

Resources created with a VkSharingMode of VK_SHARING_MODE_EXCLUSIVE must have their ownership explicitly transferred from one queue family to another in order to access their content in a well-defined manner on a queue in a different queue family. If memory dependencies are correctly expressed between uses of such a resource between two queues in different families, but no ownership transfer is defined, the contents of that resource are undefined for any read accesses performed by the second queue family.

Note

If an application does not need the contents of a resource to remain valid when transferring from one queue family to another, then the ownership transfer should be skipped.

A queue family ownership transfer consists of two distinct parts:

  1. Release exclusive ownership from the source queue family

  2. Acquire exclusive ownership for the destination queue family

An application must ensure that these operations occur in the correct order by defining an execution dependency between them, e.g. using a semaphore.

A release operation is used to release exclusive ownership of a range of a buffer or image subresource range. A release operation is defined by executing a buffer memory barrier (for a buffer range) or an image memory barrier (for an image subresource range), on a queue from the source queue family. The srcQueueFamilyIndex parameter of the barrier must be set to the source queue family index, and the dstQueueFamilyIndex parameter to the destination queue family index. dstStageMask is ignored for such a barrier, such that no visibility operation is executed - the value of this mask does not affect the validity of the barrier. The release operation happens-after the availability operation.

An acquire operation is used to acquire exclusive ownership of a range of a buffer or image subresource range. An acquire operation is defined by executing a buffer memory barrier (for a buffer range) or an image memory barrier (for an image subresource range), on a queue from the destination queue family. The srcQueueFamilyIndex parameter of the barrier must be set to the source queue family index, and the dstQueueFamilyIndex parameter to the destination queue family index. srcStageMask is ignored for such a barrier, such that no availability operation is executed - the value of this mask does not affect the validity of the barrier. The acquire operation happens-before the visibility operation.

Note

Whilst it is not invalid to provide destination or source access masks for memory barriers used for release or acquire operations, respectively, they have no practical effect. Access after a release operation has undefined results, and so visibility for those accesses has no practical effect. Similarly, write access before an acquire operation will produce undefined results for future access, so availability of those writes has no practical use. In an earlier version of the specification, these were required to match on both sides - but this was subsequently relaxed. It is now recommended that these masks are simply set to 0.

If the transfer is via an image memory barrier, and an image layout transition is desired, then the values of oldLayout and newLayout in the release memory barrier must be equal to the values of oldLayout and newLayout in the acquire memory barrier. Although the image layout transition is submitted twice, it will only be executed once. A layout transition specified in this way happens-after the release operation and happens-before the acquire operation.

If the values of srcQueueFamilyIndex and dstQueueFamilyIndex are equal, no ownership transfer is performed, and the barrier operates as if they were both set to VK_QUEUE_FAMILY_IGNORED.
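As an illustration of how the queue family indices select the role of a barrier, the following sketch classifies a barrier on an exclusively shared resource as a release, an acquire, or neither. It is an assumption-laden reading of the rules in this section rather than an API: `classify_barrier`, `ownership_kind` and `FAMILY_IGNORED` are hypothetical names, and `FAMILY_IGNORED` is assumed to have the same value as VK_QUEUE_FAMILY_IGNORED.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to equal VK_QUEUE_FAMILY_IGNORED (~0U) from the Vulkan headers. */
#define FAMILY_IGNORED 0xFFFFFFFFu

typedef enum {
    OWNERSHIP_NONE,    /* no ownership transfer is performed            */
    OWNERSHIP_RELEASE, /* barrier runs on a queue of the source family  */
    OWNERSHIP_ACQUIRE, /* barrier runs on a queue of the destination    */
    OWNERSHIP_INVALID  /* combination not permitted by the valid usage  */
} ownership_kind;

/* src and dst are the barrier's srcQueueFamilyIndex and dstQueueFamilyIndex;
 * current is the family of the queue that executes the barrier. */
ownership_kind classify_barrier(uint32_t src, uint32_t dst, uint32_t current)
{
    /* The executing queue always belongs to a real family. */
    assert(current != FAMILY_IGNORED);

    if (src == dst)
        return OWNERSHIP_NONE;      /* equal indices behave as if both ignored */
    if (src == FAMILY_IGNORED || dst == FAMILY_IGNORED)
        return OWNERSHIP_INVALID;   /* must be both ignored or both valid */
    if (src == current)
        return OWNERSHIP_RELEASE;
    if (dst == current)
        return OWNERSHIP_ACQUIRE;
    return OWNERSHIP_INVALID;       /* neither index matches the executing queue */
}
```

A complete transfer submits the same index pair twice: once on a queue of the source family (release) and once on a queue of the destination family (acquire), ordered by an execution dependency such as a semaphore.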

Queue family ownership transfers may perform read and write accesses on all memory bound to the image subresource or buffer range, so applications must ensure that all memory writes have been made available before a queue family ownership transfer is executed. Available memory is automatically made visible to queue family release and acquire operations, and writes performed by those operations are automatically made available.

Once a queue family has acquired ownership of a buffer range or image subresource range of a VK_SHARING_MODE_EXCLUSIVE resource, its contents are undefined to other queue families unless ownership is transferred.

The contents of any portion of another resource which aliases memory that is bound to the transferred buffer or image subresource range are undefined after a release or acquire operation.

6.8. Wait Idle Operations

To wait on the host for the completion of outstanding queue operations for a given queue, call:

VkResult vkQueueWaitIdle(
    VkQueue                                     queue);
  • queue is the queue on which to wait.

vkQueueWaitIdle is equivalent to submitting a fence to a queue and waiting with an infinite timeout for that fence to signal.

Valid Usage (Implicit)
  • queue must be a valid VkQueue handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

To wait on the host for the completion of outstanding queue operations for all queues on a given logical device, call:

VkResult vkDeviceWaitIdle(
    VkDevice                                    device);
  • device is the logical device to idle.

vkDeviceWaitIdle is equivalent to calling vkQueueWaitIdle for all queues owned by device.

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

Host Synchronization
  • Host access to all VkQueue objects created from device must be externally synchronized

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

  • VK_ERROR_DEVICE_LOST

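6.9. Host Write Ordering Guarantees

When batches of command buffers are submitted to a queue via vkQueueSubmit, the submission defines a memory dependency between prior host operations and the execution of the command buffers submitted to the queue.

The first synchronization scope is defined by the host execution model, but includes execution of vkQueueSubmit on the host and anything that happened-before it.

The second synchronization scope includes every command submitted in the same queue submission, and all future submissions to the same queue.

The first access scope includes all host writes to mappable device memory that are either coherent, or have been flushed with vkFlushMappedMemoryRanges.

The second access scope includes all memory access performed by the device.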

7. Render Pass

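A render pass represents a collection of attachments, subpasses, and dependencies between the subpasses, and describes how the attachments are used over the course of the subpasses. The use of a render pass in a command buffer is a render pass instance.

Render passes are represented by VkRenderPass handles: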

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkRenderPass)

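An attachment description describes the properties of an attachment including its format, sample count, and how its contents are treated at the beginning and end of each render pass instance.

A subpass represents a phase of rendering that reads and writes a subset of the attachments in a render pass. Rendering commands are recorded into a particular subpass of a render pass instance.

A subpass description describes the subset of attachments that is involved in the execution of a subpass. Each subpass can read from some attachments as input attachments, write to some as color attachments or depth/stencil attachments, and perform multisample resolve operations to resolve attachments. A subpass description can also include a set of preserve attachments, which are attachments that are not read or written by the subpass but whose contents must be preserved throughout the subpass.

A subpass uses an attachment if the attachment is a color, depth/stencil, resolve, or input attachment for that subpass (as determined by the pColorAttachments, pDepthStencilAttachment, pResolveAttachments, and pInputAttachments members of VkSubpassDescription, respectively). A subpass does not use an attachment if that attachment is preserved by the subpass. The first use of an attachment is in the lowest numbered subpass that uses that attachment. Similarly, the last use of an attachment is in the highest numbered subpass that uses that attachment.

The subpasses in a render pass all render to the same dimensions, and fragments for pixel (x,y,layer) in one subpass can only read attachment contents written by previous subpasses at that same (x,y,layer) location.

Note

By describing a complete set of subpasses in advance, render passes provide the implementation an opportunity to optimize the storage and transfer of attachment data between subpasses.

In practice, this means that subpasses with a simple framebuffer-space dependency may be merged into a single tiled rendering pass, keeping the attachment data on-chip for the duration of a render pass instance. However, it is also quite common for a render pass to only contain a single subpass.

Subpass dependencies describe execution and memory dependencies between subpasses.

A subpass dependency chain is a sequence of subpass dependencies in a render pass, where the source subpass of each subpass dependency (after the first) equals the destination subpass of the previous dependency.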
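The chain definition can be checked mechanically. The following is a small sketch under stated assumptions: `dep_edge` and `is_dependency_chain` are hypothetical names, each edge stands for the srcSubpass and dstSubpass of one dependency, and an empty sequence is treated as not being a chain (a choice made here, not stated by the specification).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One subpass dependency reduced to its two subpass indices. */
typedef struct {
    uint32_t src_subpass;
    uint32_t dst_subpass;
} dep_edge;

/* True if every dependency after the first starts at the subpass
 * where the previous dependency ended. */
bool is_dependency_chain(const dep_edge *deps, size_t count)
{
    assert(deps != NULL || count == 0);
    for (size_t i = 1; i < count; ++i) {
        if (deps[i].src_subpass != deps[i - 1].dst_subpass)
            return false;
    }
    return count > 0; /* assumption: an empty sequence is not a chain */
}
```

For example, the dependencies 0→1 and 1→3 form a chain from subpass 0 to subpass 3, whereas 0→1 and 2→3 do not.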

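Execution of subpasses may overlap or execute out of order with regards to other subpasses, unless otherwise enforced by an execution dependency. Each subpass only respects submission order for commands recorded in the same subpass, and the vkCmdBeginRenderPass and vkCmdEndRenderPass commands that delimit the render pass - commands within other subpasses are not included. This affects most other implicit ordering guarantees.

A render pass describes the structure of subpasses and attachments independent of any specific image views for the attachments. The specific image views that will be used for the attachments, and their dimensions, are specified in VkFramebuffer objects. Framebuffers are created with respect to a specific render pass that the framebuffer is compatible with (see Render Pass Compatibility). Collectively, a render pass and a framebuffer define the complete render target state for one or more subpasses as well as the algorithmic dependencies between the subpasses.

The various pipeline stages of the drawing commands for a given subpass may execute concurrently and/or out of order, both within and across drawing commands, whilst still respecting pipeline order. However, for a given (x,y,layer,sample) sample location, certain per-sample operations are performed in rasterization order.

7.1. Render Pass Creation

To create a render pass, call: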

VkResult vkCreateRenderPass(
    VkDevice                                    device,
    const VkRenderPassCreateInfo*               pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkRenderPass*                               pRenderPass);
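  • device is the logical device that creates the render pass.

  • pCreateInfo is a pointer to an instance of the VkRenderPassCreateInfo structure that describes the parameters of the render pass.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pRenderPass points to a VkRenderPass handle in which the resulting render pass object is returned.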

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkRenderPassCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pRenderPass must be a pointer to a VkRenderPass handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkRenderPassCreateInfo structure is defined as:

typedef struct VkRenderPassCreateInfo {
    VkStructureType                   sType;
    const void*                       pNext;
    VkRenderPassCreateFlags           flags;
    uint32_t                          attachmentCount;
    const VkAttachmentDescription*    pAttachments;
    uint32_t                          subpassCount;
    const VkSubpassDescription*       pSubpasses;
    uint32_t                          dependencyCount;
    const VkSubpassDependency*        pDependencies;
} VkRenderPassCreateInfo;
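  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • attachmentCount is the number of attachments used by this render pass, or zero indicating no attachments. Attachments are referred to by zero-based indices in the range [0,attachmentCount).

  • pAttachments points to an array of attachmentCount VkAttachmentDescription structures describing properties of the attachments, or NULL if attachmentCount is zero.

  • subpassCount is the number of subpasses to create for this render pass. Subpasses are referred to by zero-based indices in the range [0,subpassCount). A render pass must have at least one subpass.

  • pSubpasses points to an array of subpassCount VkSubpassDescription structures describing properties of the subpasses.

  • dependencyCount is the number of dependencies between pairs of subpasses, or zero indicating no dependencies.

  • pDependencies points to an array of dependencyCount VkSubpassDependency structures describing dependencies between pairs of subpasses, or NULL if dependencyCount is zero.

Valid Usage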
  • If any two subpasses operate on attachments with overlapping ranges of the same VkDeviceMemory object, and at least one subpass writes to that area of VkDeviceMemory, a subpass dependency must be included (either directly or via some intermediate subpasses) between them

  • If the attachment member of any element of pInputAttachments, pColorAttachments, pResolveAttachments or pDepthStencilAttachment, or the attachment indexed by any element of pPreserveAttachments in any given element of pSubpasses is bound to a range of a VkDeviceMemory object that overlaps with any other attachment in any subpass (including the same subpass), the VkAttachmentDescription structures describing them must include VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT in flags

  • If the attachment member of any element of pInputAttachments, pColorAttachments, pResolveAttachments or pDepthStencilAttachment, or any element of pPreserveAttachments in any given element of pSubpasses is not VK_ATTACHMENT_UNUSED, it must be less than attachmentCount

  • The value of any element of the pPreserveAttachments member in any given element of pSubpasses must not be VK_ATTACHMENT_UNUSED

  • For any member of pAttachments with a loadOp equal to VK_ATTACHMENT_LOAD_OP_CLEAR, the first use of that attachment must not specify a layout equal to VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL or VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL.

  • For any element of pDependencies, if the srcSubpass is not VK_SUBPASS_EXTERNAL, all stage flags included in the srcStageMask member of that dependency must be a pipeline stage supported by the pipeline identified by the pipelineBindPoint member of the source subpass.

  • For any element of pDependencies, if the dstSubpass is not VK_SUBPASS_EXTERNAL, all stage flags included in the dstStageMask member of that dependency must be a pipeline stage supported by the pipeline identified by the pipelineBindPoint member of the destination subpass.
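The index rules for attachment references in the list above (an index is either VK_ATTACHMENT_UNUSED or less than attachmentCount, and preserve attachments may never be VK_ATTACHMENT_UNUSED) can be sketched as a single loop. `attachment_refs_valid` and `ATTACHMENT_UNUSED` are hypothetical names for illustration; `ATTACHMENT_UNUSED` is assumed to have the same value as VK_ATTACHMENT_UNUSED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to equal VK_ATTACHMENT_UNUSED (~0U) from the Vulkan headers. */
#define ATTACHMENT_UNUSED 0xFFFFFFFFu

/* refs holds the attachment indices taken from one reference array of a
 * subpass. Pass allow_unused = false for pPreserveAttachments, which
 * must not contain ATTACHMENT_UNUSED. */
bool attachment_refs_valid(const uint32_t *refs, uint32_t ref_count,
                           uint32_t attachment_count, bool allow_unused)
{
    assert(refs != NULL || ref_count == 0);
    for (uint32_t i = 0; i < ref_count; ++i) {
        if (refs[i] == ATTACHMENT_UNUSED) {
            if (!allow_unused)
                return false;
            continue; /* unused slots carry no index to range-check */
        }
        if (refs[i] >= attachment_count)
            return false;
    }
    return true;
}
```

An application could run such a check over each reference array of every element of pSubpasses before calling vkCreateRenderPass; it does not replace the validation layers.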

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • If attachmentCount is not 0, pAttachments must be a pointer to an array of attachmentCount valid VkAttachmentDescription structures

  • pSubpasses must be a pointer to an array of subpassCount valid VkSubpassDescription structures

  • If dependencyCount is not 0, pDependencies must be a pointer to an array of dependencyCount valid VkSubpassDependency structures

  • subpassCount must be greater than 0

The VkAttachmentDescription structure is defined as:

typedef struct VkAttachmentDescription {
    VkAttachmentDescriptionFlags    flags;
    VkFormat                        format;
    VkSampleCountFlagBits           samples;
    VkAttachmentLoadOp              loadOp;
    VkAttachmentStoreOp             storeOp;
    VkAttachmentLoadOp              stencilLoadOp;
    VkAttachmentStoreOp             stencilStoreOp;
    VkImageLayout                   initialLayout;
    VkImageLayout                   finalLayout;
} VkAttachmentDescription;
  • flags is a bitmask describing additional properties of the attachment. Bits which can be set include:

    typedef enum VkAttachmentDescriptionFlagBits {
        VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT = 0x00000001,
    } VkAttachmentDescriptionFlagBits;
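  • format is a VkFormat value specifying the format of the image that will be used for the attachment.

  • samples is the number of samples of the image as defined in VkSampleCountFlagBits.

  • loadOp specifies how the contents of color and depth components of the attachment are treated at the beginning of the subpass where it is first used: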

    typedef enum VkAttachmentLoadOp {
        VK_ATTACHMENT_LOAD_OP_LOAD = 0,
        VK_ATTACHMENT_LOAD_OP_CLEAR = 1,
        VK_ATTACHMENT_LOAD_OP_DONT_CARE = 2,
    } VkAttachmentLoadOp;
    • VK_ATTACHMENT_LOAD_OP_LOAD means the previous contents of the image within the render area will be preserved. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_READ_BIT.

    • VK_ATTACHMENT_LOAD_OP_CLEAR means the contents within the render area will be cleared to a uniform value, which is specified when a render pass instance is begun. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

    • VK_ATTACHMENT_LOAD_OP_DONT_CARE means the previous contents within the area need not be preserved; the contents of the attachment will be undefined inside the render area. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

  • storeOp specifies how the contents of color and depth components of the attachment are treated at the end of the subpass where it is last used:

    typedef enum VkAttachmentStoreOp {
        VK_ATTACHMENT_STORE_OP_STORE = 0,
        VK_ATTACHMENT_STORE_OP_DONT_CARE = 1,
    } VkAttachmentStoreOp;
    • VK_ATTACHMENT_STORE_OP_STORE means the contents generated during the render pass and within the render area are written to memory. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

    • VK_ATTACHMENT_STORE_OP_DONT_CARE means the contents within the render area are not needed after rendering, and may be discarded; the contents of the attachment will be undefined inside the render area. For attachments with a depth/stencil format, this uses the access type VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT. For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

  • stencilLoadOp specifies how the contents of stencil components of the attachment are treated at the beginning of the subpass where it is first used, and must be one of the same values allowed for loadOp above.

  • stencilStoreOp specifies how the contents of stencil components of the attachment are treated at the end of the last subpass where it is used, and must be one of the same values allowed for storeOp above.

  • initialLayout is the layout the attachment image subresource will be in when a render pass instance begins.

  • finalLayout is the layout the attachment image subresource will be transitioned to when a render pass instance ends. During a render pass instance, an attachment can use a different layout in each subpass, if desired.

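If the attachment uses a color format, then loadOp and storeOp are used, and stencilLoadOp and stencilStoreOp are ignored. If the format has depth and/or stencil components, loadOp and storeOp apply only to the depth data, while stencilLoadOp and stencilStoreOp define how the stencil data is handled. loadOp and stencilLoadOp define the load operations that execute as part of the first subpass that uses the attachment. storeOp and stencilStoreOp define the store operations that execute as part of the last subpass that uses the attachment.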
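The way the attachment format selects which operation pair has an effect can be summarized in a few lines. This is an illustrative sketch only: `ops_used_for_format` and `attachment_op_usage` are hypothetical names, and the three flags stand for properties that an application would derive from the VkFormat of the attachment.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool load_store_used;         /* loadOp / storeOp have an effect               */
    bool stencil_load_store_used; /* stencilLoadOp / stencilStoreOp have an effect */
} attachment_op_usage;

/* has_color, has_depth and has_stencil describe the attachment format. */
attachment_op_usage ops_used_for_format(bool has_color, bool has_depth,
                                        bool has_stencil)
{
    attachment_op_usage usage;

    /* A format is either a color format or a depth/stencil format. */
    assert(!(has_color && (has_depth || has_stencil)));

    /* loadOp/storeOp cover color data, or the depth data of a
     * depth/stencil format; the stencil pair covers stencil data only. */
    usage.load_store_used = has_color || has_depth;
    usage.stencil_load_store_used = has_stencil;
    return usage;
}
```

For a combined depth/stencil format both pairs apply, each to its own aspect; for a color format only loadOp and storeOp matter.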

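The load operation for each value in an attachment used by a subpass happens-before any command recorded into that subpass reads from that value. Load operations for attachments with a depth/stencil format execute in the VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT pipeline stage. Load operations for attachments with a color format execute in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage.

Store operations for each value in an attachment used by a subpass happen-after any command recorded into that subpass writes to that value. Store operations for attachments with a depth/stencil format execute in the VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT pipeline stage. Store operations for attachments with a color format execute in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage.

If an attachment is not used by any subpass, then loadOp, storeOp, stencilStoreOp, and stencilLoadOp are ignored, and the attachment’s memory contents will not be modified by execution of a render pass instance.

During a render pass instance, input/color attachments with color formats that have a component size of 8, 16, or 32 bits must be represented in the attachment’s format throughout the instance. Attachments with other floating- or fixed-point color formats, or with depth components, may be represented in a format with a precision higher than the attachment format, but must be represented with the same range. When such a component is loaded via the loadOp, it will be converted into an implementation-dependent format used by the render pass.

Such components must be converted from the render pass format to the format of the attachment before they are resolved or stored at the end of a render pass instance via storeOp. Conversions occur as described in Numeric Representation and Computation and Fixed-Point Data Conversions.

If flags includes VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT, then the attachment is treated as if it shares physical memory with another attachment in the same render pass. This information limits the ability of the implementation to reorder certain operations (like layout transitions and the loadOp) such that it is not improperly reordered against other uses of the same physical memory via a different attachment. This is described in more detail below.

Valid Usage
  • finalLayout must not be VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED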

Valid Usage (Implicit)

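If a render pass uses multiple attachments that alias the same device memory, those attachments must each include the VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT bit in their attachment description flags. Attachments aliasing the same memory occurs in multiple ways: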

  • Multiple attachments being assigned the same image view as part of framebuffer creation.

  • Attachments using distinct image views that correspond to the same image subresource of an image.

  • Attachments using views of distinct image subresources which are bound to overlapping memory ranges.

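Note

Render passes must include subpass dependencies (either directly or via a subpass dependency chain) between any two subpasses that operate on the same attachment or aliasing attachments, if at least one of those subpasses writes to one of the aliases. Those subpass dependencies must include execution and memory dependencies separating the uses of the aliases. These dependencies must not include VK_DEPENDENCY_BY_REGION_BIT if the aliases are views of distinct image subresources which overlap in memory.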

Multiple attachments that alias the same memory must not be used in a single subpass. A given attachment index must not be used multiple times in a single subpass, with one exception: two subpass attachments can use the same attachment index if at least one use is as an input attachment and neither use is as a resolve or preserve attachment. In other words, the same view can be used simultaneously as an input and color or depth/stencil attachment, but must not be used as multiple color or depth/stencil attachments nor as resolve or preserve attachments. The precise set of valid scenarios is described in more detail below.

If a set of attachments alias each other, then all except the first to be used in the render pass must use an initialLayout of VK_IMAGE_LAYOUT_UNDEFINED, since the earlier uses of the other aliases make their contents undefined. Once an alias has been used and a different alias has been used after it, the first alias must not be used in any later subpasses. However, an application can assign the same image view to multiple aliasing attachment indices, which allows that image view to be used multiple times even if other aliases are used in between.

Note

Once an attachment needs the VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT bit, there should be no additional cost of introducing additional aliases, and using these additional aliases may allow more efficient clearing of the attachments on multiple uses via VK_ATTACHMENT_LOAD_OP_CLEAR.

The VkSubpassDescription structure is defined as:

typedef struct VkSubpassDescription {
    VkSubpassDescriptionFlags       flags;
    VkPipelineBindPoint             pipelineBindPoint;
    uint32_t                        inputAttachmentCount;
    const VkAttachmentReference*    pInputAttachments;
    uint32_t                        colorAttachmentCount;
    const VkAttachmentReference*    pColorAttachments;
    const VkAttachmentReference*    pResolveAttachments;
    const VkAttachmentReference*    pDepthStencilAttachment;
    uint32_t                        preserveAttachmentCount;
    const uint32_t*                 pPreserveAttachments;
} VkSubpassDescription;
  • flags is reserved for future use.

  • pipelineBindPoint is a VkPipelineBindPoint value specifying whether this is a compute or graphics subpass. Currently, only graphics subpasses are supported.

  • inputAttachmentCount is the number of input attachments.

  • pInputAttachments is an array of VkAttachmentReference structures (defined below) that lists which of the render pass’s attachments can be read in the shader during the subpass, and what layout each attachment will be in during the subpass. Each element of the array corresponds to an input attachment unit number in the shader, i.e. if the shader declares an input variable layout(input_attachment_index=X, set=Y, binding=Z) then it uses the attachment provided in pInputAttachments[X]. Input attachments must also be bound to the pipeline with a descriptor set, with the input attachment descriptor written in the location (set=Y, binding=Z).

  • colorAttachmentCount is the number of color attachments.

  • pColorAttachments is an array of colorAttachmentCount VkAttachmentReference structures that lists which of the render pass’s attachments will be used as color attachments in the subpass, and what layout each attachment will be in during the subpass. Each element of the array corresponds to a fragment shader output location, i.e. if the shader declared an output variable layout(location=X) then it uses the attachment provided in pColorAttachments[X].

  • pResolveAttachments is NULL or an array of colorAttachmentCount VkAttachmentReference structures that lists which of the render pass’s attachments are resolved to at the end of the subpass, and what layout each attachment will be in during the multisample resolve operation. If pResolveAttachments is not NULL, each of its elements corresponds to a color attachment (the element in pColorAttachments at the same index), and a multisample resolve operation is defined for each attachment. At the end of each subpass, multisample resolve operations read the subpass’s color attachments, and resolve the samples for each pixel to the same pixel location in the corresponding resolve attachments, unless the resolve attachment index is VK_ATTACHMENT_UNUSED. If the first use of an attachment in a render pass is as a resolve attachment, then the loadOp is effectively ignored as the resolve is guaranteed to overwrite all pixels in the render area.

  • pDepthStencilAttachment is a pointer to a VkAttachmentReference specifying which attachment will be used for depth/stencil data and the layout it will be in during the subpass. Setting the attachment index to VK_ATTACHMENT_UNUSED or leaving this pointer as NULL indicates that no depth/stencil attachment will be used in the subpass.

  • preserveAttachmentCount is the number of preserved attachments.

  • pPreserveAttachments is an array of preserveAttachmentCount render pass attachment indices describing the attachments that are not used by a subpass, but whose contents must be preserved throughout the subpass.

The contents of an attachment within the render area become undefined at the start of a subpass S if all of the following conditions are true:

  • The attachment is used as a color, depth/stencil, or resolve attachment in any subpass in the render pass.

  • There is a subpass S1 that uses or preserves the attachment, and a subpass dependency from S1 to S.

  • The attachment is not used or preserved in subpass S.

Once the contents of an attachment become undefined in subpass S, they remain undefined for subpasses in subpass dependency chains starting with subpass S until they are written again. However, they remain valid for subpasses in other subpass dependency chains starting with subpass S1 if those subpasses use or preserve the attachment.

Valid Usage
  • pipelineBindPoint must be VK_PIPELINE_BIND_POINT_GRAPHICS

  • colorAttachmentCount must be less than or equal to VkPhysicalDeviceLimits::maxColorAttachments

  • If the first use of an attachment in this render pass is as an input attachment, and the attachment is not also used as a color or depth/stencil attachment in the same subpass, then loadOp must not be VK_ATTACHMENT_LOAD_OP_CLEAR

  • If pResolveAttachments is not NULL, for each resolve attachment that does not have the value VK_ATTACHMENT_UNUSED, the corresponding color attachment must not have the value VK_ATTACHMENT_UNUSED

  • If pResolveAttachments is not NULL, the sample count of each element of pColorAttachments must be anything other than VK_SAMPLE_COUNT_1_BIT

  • Any given element of pResolveAttachments must have a sample count of VK_SAMPLE_COUNT_1_BIT

  • Any given element of pResolveAttachments must have the same VkFormat as its corresponding color attachment

  • All attachments in pColorAttachments and pDepthStencilAttachment that are not VK_ATTACHMENT_UNUSED must have the same sample count

  • If any input attachments are VK_ATTACHMENT_UNUSED, then any pipelines bound during the subpass must not access those input attachments from the fragment shader

  • The attachment member of any element of pPreserveAttachments must not be VK_ATTACHMENT_UNUSED

  • Any given element of pPreserveAttachments must not also be an element of any other member of the subpass description

  • If any attachment is used as both an input attachment and a color or depth/stencil attachment, then each use must use the same layout

Valid Usage (Implicit)
  • flags must be 0

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • If inputAttachmentCount is not 0, pInputAttachments must be a pointer to an array of inputAttachmentCount valid VkAttachmentReference structures

  • If colorAttachmentCount is not 0, pColorAttachments must be a pointer to an array of colorAttachmentCount valid VkAttachmentReference structures

  • If colorAttachmentCount is not 0, and pResolveAttachments is not NULL, pResolveAttachments must be a pointer to an array of colorAttachmentCount valid VkAttachmentReference structures

  • If pDepthStencilAttachment is not NULL, pDepthStencilAttachment must be a pointer to a valid VkAttachmentReference structure

  • If preserveAttachmentCount is not 0, pPreserveAttachments must be a pointer to an array of preserveAttachmentCount uint32_t values

The VkAttachmentReference structure is defined as:

typedef struct VkAttachmentReference {
    uint32_t         attachment;
    VkImageLayout    layout;
} VkAttachmentReference;
  • attachment is the index of the attachment of the render pass, and corresponds to the index of the corresponding element in the pAttachments array of the VkRenderPassCreateInfo structure. If any color or depth/stencil attachments are VK_ATTACHMENT_UNUSED, then no writes occur for those attachments.

  • layout is a VkImageLayout value specifying the layout the attachment uses during the subpass.

Valid Usage
  • layout must not be VK_IMAGE_LAYOUT_UNDEFINED or VK_IMAGE_LAYOUT_PREINITIALIZED

Valid Usage (Implicit)

The VkSubpassDependency structure is defined as:

typedef struct VkSubpassDependency {
    uint32_t                srcSubpass;
    uint32_t                dstSubpass;
    VkPipelineStageFlags    srcStageMask;
    VkPipelineStageFlags    dstStageMask;
    VkAccessFlags           srcAccessMask;
    VkAccessFlags           dstAccessMask;
    VkDependencyFlags       dependencyFlags;
} VkSubpassDependency;

If srcSubpass is equal to dstSubpass then the VkSubpassDependency describes a subpass self-dependency, and only constrains the pipeline barriers allowed within a subpass instance. Otherwise, when a render pass instance which includes a subpass dependency is submitted to a queue, it defines a memory dependency between the subpasses identified by srcSubpass and dstSubpass.

If srcSubpass is equal to VK_SUBPASS_EXTERNAL, the first synchronization scope includes commands submitted to the queue before the render pass instance began. Otherwise, the first set of commands includes all commands submitted as part of the subpass instance identified by srcSubpass and any load, store or multisample resolve operations on attachments used in srcSubpass. In either case, the first synchronization scope is limited to operations on the pipeline stages determined by the source stage mask specified by srcStageMask.

If dstSubpass is equal to VK_SUBPASS_EXTERNAL, the second synchronization scope includes commands submitted after the render pass instance is ended. Otherwise, the second set of commands includes all commands submitted as part of the subpass instance identified by dstSubpass and any load, store or multisample resolve operations on attachments used in dstSubpass. In either case, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by dstStageMask.

The first access scope is limited to access in the pipeline stages determined by the source stage mask specified by srcStageMask. It is also limited to access types in the source access mask specified by srcAccessMask.

The second access scope is limited to access in the pipeline stages determined by the destination stage mask specified by dstStageMask. It is also limited to access types in the destination access mask specified by dstAccessMask.

The availability and visibility operations defined by a subpass dependency affect the execution of image layout transitions within the render pass.

Valid Usage
  • If srcSubpass is not VK_SUBPASS_EXTERNAL, srcStageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If dstSubpass is not VK_SUBPASS_EXTERNAL, dstStageMask must not include VK_PIPELINE_STAGE_HOST_BIT

  • If the geometry shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the geometry shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT

  • If the tessellation shaders feature is not enabled, srcStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • If the tessellation shaders feature is not enabled, dstStageMask must not contain VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT or VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT

  • srcSubpass must be less than or equal to dstSubpass, unless one of them is VK_SUBPASS_EXTERNAL, to avoid cyclic dependencies and ensure a valid execution order

  • srcSubpass and dstSubpass must not both be equal to VK_SUBPASS_EXTERNAL

  • If srcSubpass is equal to dstSubpass, srcStageMask and dstStageMask must only contain one of VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT, VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_CONTROL_SHADER_BIT, VK_PIPELINE_STAGE_TESSELLATION_EVALUATION_SHADER_BIT, VK_PIPELINE_STAGE_GEOMETRY_SHADER_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT, VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT, VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, or VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT

  • If srcSubpass is equal to dstSubpass, the logically latest pipeline stage in srcStageMask must be logically earlier than or equal to the logically earliest pipeline stage in dstStageMask

  • Any access flag included in srcAccessMask must be supported by one of the pipeline stages in srcStageMask, as specified in the table of supported access types.

  • Any access flag included in dstAccessMask must be supported by one of the pipeline stages in dstStageMask, as specified in the table of supported access types.
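The two ordering rules on srcSubpass and dstSubpass above can be checked mechanically. The following is a minimal sketch rather than part of the API: the macro SKETCH_SUBPASS_EXTERNAL and the function sketch_dependency_order_is_valid are hypothetical names introduced only for illustration, and the macro is assumed to mirror the value of VK_SUBPASS_EXTERNAL (~0U) so that the example compiles without the Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VK_SUBPASS_EXTERNAL (assumed to be ~0U). */
#define SKETCH_SUBPASS_EXTERNAL (~0U)

/* Returns true when the (srcSubpass, dstSubpass) pair satisfies the two
 * ordering rules from the list above:
 *   - the two indices are not both external, and
 *   - src <= dst unless one of them is external. */
bool sketch_dependency_order_is_valid(uint32_t src, uint32_t dst)
{
    const bool src_external = (src == SKETCH_SUBPASS_EXTERNAL);
    const bool dst_external = (dst == SKETCH_SUBPASS_EXTERNAL);

    if (src_external && dst_external) {
        return false; /* both external is never allowed */
    }
    if (src_external || dst_external) {
        return true;  /* the index comparison does not apply */
    }
    return src <= dst; /* equal indices describe a self-dependency */
}
```

The sketch covers only the index rules. The stage-mask and access-mask rules in the same list would need the full set of pipeline stage and access flags and are not modeled here.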

Valid Usage (Implicit)
editing-note

The following two alleged implicit dependencies are practically no-ops, as the operations they describe are already guaranteed by semaphores and submission order (so they’re almost entirely no-ops on their own). The only reason they exist is because it simplifies reasoning about where automatic layout transitions happen. Further rewrites of this chapter could potentially remove the need for these.

If there is no subpass dependency from VK_SUBPASS_EXTERNAL to the first subpass that uses an attachment, then an implicit subpass dependency exists from VK_SUBPASS_EXTERNAL to the first subpass it is used in. The subpass dependency operates as if defined with the following parameters:

VkSubpassDependency implicitDependency = {
    .srcSubpass = VK_SUBPASS_EXTERNAL,
    .dstSubpass = firstSubpass, // First subpass attachment is used in
    .srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};

Similarly, if there is no subpass dependency from the last subpass that uses an attachment to VK_SUBPASS_EXTERNAL, then an implicit subpass dependency exists from the last subpass it is used in to VK_SUBPASS_EXTERNAL. The subpass dependency operates as if defined with the following parameters:

VkSubpassDependency implicitDependency = {
    .srcSubpass = lastSubpass, // Last subpass attachment is used in
    .dstSubpass = VK_SUBPASS_EXTERNAL,
    .srcStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
    .srcAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = 0,
    .dependencyFlags = 0,
};

As subpasses may overlap or execute out of order with regard to other subpasses unless a subpass dependency chain describes otherwise, the layout transitions required between subpasses cannot be known to an application. Instead, an application provides the layout that each attachment must be in at the start and end of a render pass, and the layout it must be in during each subpass it is used in. The implementation then must execute layout transitions between subpasses in order to guarantee that the images are in the layouts required by each subpass, and in the final layout at the end of the render pass.

Automatic layout transitions away from the layout used in a subpass happen-after the availability operations for all dependencies with that subpass as the srcSubpass.

Automatic layout transitions into the layout used in a subpass happen-before the visibility operations for all dependencies with that subpass as the dstSubpass.

Automatic layout transitions away from initialLayout happen-after the availability operations for all dependencies with a srcSubpass equal to VK_SUBPASS_EXTERNAL, where dstSubpass uses the attachment that will be transitioned. For attachments created with VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT, automatic layout transitions away from initialLayout happen-after the availability operations for all dependencies with a srcSubpass equal to VK_SUBPASS_EXTERNAL, where dstSubpass uses any aliased attachment.

Automatic layout transitions into finalLayout happen-before the visibility operations for all dependencies with a dstSubpass equal to VK_SUBPASS_EXTERNAL, where srcSubpass uses the attachment that will be transitioned. For attachments created with VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT, automatic layout transitions into finalLayout happen-before the visibility operations for all dependencies with a dstSubpass equal to VK_SUBPASS_EXTERNAL, where srcSubpass uses any aliased attachment.

If two subpasses use the same attachment in different layouts, and both layouts are read-only, no subpass dependency needs to be specified between those subpasses. If an implementation treats those layouts separately, it must insert an implicit subpass dependency between those subpasses to separate the uses in each layout. The subpass dependency operates as if defined with the following parameters:

// Used for input attachments
VkPipelineStageFlags inputAttachmentStages = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
VkAccessFlags inputAttachmentAccess = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;

// Used for depth/stencil attachments
VkPipelineStageFlags depthStencilAttachmentStages = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT | VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT;
VkAccessFlags depthStencilAttachmentAccess = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT;

VkSubpassDependency implicitDependency = {
    .srcSubpass = firstSubpass,
    .dstSubpass = secondSubpass,
    .srcStageMask = inputAttachmentStages | depthStencilAttachmentStages,
    .dstStageMask = inputAttachmentStages | depthStencilAttachmentStages,
    .srcAccessMask = inputAttachmentAccess | depthStencilAttachmentAccess,
    .dstAccessMask = inputAttachmentAccess | depthStencilAttachmentAccess,
    .dependencyFlags = 0,
};

If a subpass uses the same attachment as both an input attachment and either a color attachment or a depth/stencil attachment, writes via the color or depth/stencil attachment are not automatically made visible to reads via the input attachment, causing a feedback loop, except in any of the following conditions:

  • If the color components or depth/stencil components read by the input attachment are mutually exclusive with the components written by the color or depth/stencil attachments, then there is no feedback loop. This requires the graphics pipelines used by the subpass to disable writes to color components that are read as inputs via the colorWriteMask, and to disable writes to depth/stencil components that are read as inputs via depthWriteEnable or stencilTestEnable.

  • If the attachment is used as an input attachment and depth/stencil attachment only, and the depth/stencil attachment is not written to.

  • If a memory dependency is inserted between when the attachment is written and when it is subsequently read by later fragments. Pipeline barriers expressing a subpass self-dependency are the only way to achieve this, and one must be inserted every time a fragment will read values at a particular sample (x, y, layer, sample) coordinate, if those values have been written since the most recent pipeline barrier; or since the start of the subpass if there have been no pipeline barriers since the start of the subpass.

An attachment used as both an input attachment and a color attachment must be in the VK_IMAGE_LAYOUT_GENERAL layout. An attachment used as an input attachment and depth/stencil attachment must be in either VK_IMAGE_LAYOUT_GENERAL or VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL. An attachment must not be used as both a depth/stencil attachment and a color attachment.

To destroy a render pass, call:

void vkDestroyRenderPass(
    VkDevice                                    device,
    VkRenderPass                                renderPass,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the render pass.

  • renderPass is the handle of the render pass to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to renderPass must have completed execution

  • If VkAllocationCallbacks were provided when renderPass was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when renderPass was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If renderPass is not VK_NULL_HANDLE, renderPass must be a valid VkRenderPass handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If renderPass is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to renderPass must be externally synchronized

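7.2. Render Pass Compatibility

Framebuffers and graphics pipelines are created based on a specific render pass object. They must only be used with that render pass object, or one compatible with it.

Two attachment references are compatible if they have matching format and sample count, or are both VK_ATTACHMENT_UNUSED, or the pointer that would contain the reference is NULL.

Two arrays of attachment references are compatible if all corresponding pairs of attachments are compatible. If the arrays are of different lengths, attachment references not present in the smaller array are treated as VK_ATTACHMENT_UNUSED.

Two render passes are compatible if their corresponding color, input, resolve, and depth/stencil attachment references are compatible and if they are otherwise identical except for:

  • Initial and final image layout in attachment descriptions

  • Load and store operations in attachment descriptions

  • Image layout in attachment references

A framebuffer is compatible with a render pass if it was created using the same render pass or a compatible render pass.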
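The attachment reference rules in this section can be sketched as a pair of comparison functions. This is an illustrative model under stated assumptions, not an implementation of any Vulkan entry point: SketchAttachmentRef, SKETCH_ATTACHMENT_UNUSED and the sketch_* functions are hypothetical, the format and sample count are assumed to have already been looked up from the matching attachment description, and a NULL pointer is assumed to behave like an unused reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VK_ATTACHMENT_UNUSED (assumed to be ~0U). */
#define SKETCH_ATTACHMENT_UNUSED (~0U)

/* Hypothetical, already-resolved view of one attachment reference. */
typedef struct SketchAttachmentRef {
    uint32_t attachment; /* attachment index, or SKETCH_ATTACHMENT_UNUSED */
    int      format;     /* stands in for VkFormat */
    int      samples;    /* stands in for the sample count */
} SketchAttachmentRef;

/* Assumption: a NULL pointer is treated like an unused reference. */
static bool sketch_ref_is_unused(const SketchAttachmentRef *ref)
{
    return ref == NULL || ref->attachment == SKETCH_ATTACHMENT_UNUSED;
}

/* Two references are compatible if both are unused, or if both are used
 * and agree on format and sample count. */
bool sketch_refs_compatible(const SketchAttachmentRef *a,
                            const SketchAttachmentRef *b)
{
    const bool a_unused = sketch_ref_is_unused(a);
    const bool b_unused = sketch_ref_is_unused(b);

    if (a_unused || b_unused) {
        return a_unused && b_unused;
    }
    return a->format == b->format && a->samples == b->samples;
}

/* Arrays are compared element by element; entries missing from the
 * shorter array are treated as unused. */
bool sketch_ref_arrays_compatible(const SketchAttachmentRef *a, size_t a_count,
                                  const SketchAttachmentRef *b, size_t b_count)
{
    const size_t longest = (a_count > b_count) ? a_count : b_count;

    for (size_t i = 0; i < longest; ++i) {
        const SketchAttachmentRef *ra = (i < a_count) ? &a[i] : NULL;
        const SketchAttachmentRef *rb = (i < b_count) ? &b[i] : NULL;
        if (!sketch_refs_compatible(ra, rb)) {
            return false;
        }
    }
    return true;
}
```

Note that the attachment index itself is not compared: two references that point at different attachment slots are still compatible as long as format and sample count match.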

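7.3. Framebuffers

Render passes operate in conjunction with framebuffers. Framebuffers represent a collection of specific memory attachments that a render pass instance uses.

Framebuffers are represented by VkFramebuffer handles: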

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkFramebuffer)

To create a framebuffer, call:

VkResult vkCreateFramebuffer(
    VkDevice                                    device,
    const VkFramebufferCreateInfo*              pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkFramebuffer*                              pFramebuffer);
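  • device is the logical device that creates the framebuffer.

  • pCreateInfo points to a VkFramebufferCreateInfo structure which describes additional information about framebuffer creation.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pFramebuffer points to a VkFramebuffer handle in which the resulting framebuffer object is returned.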

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkFramebufferCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pFramebuffer must be a pointer to a VkFramebuffer handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkFramebufferCreateInfo structure is defined as:

typedef struct VkFramebufferCreateInfo {
    VkStructureType             sType;
    const void*                 pNext;
    VkFramebufferCreateFlags    flags;
    VkRenderPass                renderPass;
    uint32_t                    attachmentCount;
    const VkImageView*          pAttachments;
    uint32_t                    width;
    uint32_t                    height;
    uint32_t                    layers;
} VkFramebufferCreateInfo;
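  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • renderPass is a render pass that defines what render passes the framebuffer will be compatible with. See Render Pass Compatibility for details.

  • attachmentCount is the number of attachments.

  • pAttachments is an array of VkImageView handles, each of which will be used as the corresponding attachment in a render pass instance.

  • width, height and layers define the dimensions of the framebuffer.

Image subresources used as attachments must not be used via any non-attachment usage for the duration of a render pass instance.

Note

This restriction means that the render pass has full knowledge of all uses of all of the attachments, so that the implementation is able to make correct decisions about when and how to perform layout transitions, when to overlap execution of subpasses, etc.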

It is legal for a subpass to use no color or depth/stencil attachments, and rather use shader side effects such as image stores and atomics to produce an output. In this case, the subpass continues to use the width, height, and layers of the framebuffer to define the dimensions of the rendering area, and the rasterizationSamples from each pipeline’s VkPipelineMultisampleStateCreateInfo to define the number of samples used in rasterization; however, if VkPhysicalDeviceFeatures::variableMultisampleRate is VK_FALSE, then all pipelines to be bound with a given zero-attachment subpass must have the same value for VkPipelineMultisampleStateCreateInfo::rasterizationSamples.

Valid Usage
  • attachmentCount must be equal to the attachment count specified in renderPass

  • Any given element of pAttachments that is used as a color attachment or resolve attachment by renderPass must have been created with a usage value including VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT

  • Any given element of pAttachments that is used as a depth/stencil attachment by renderPass must have been created with a usage value including VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT

  • Any given element of pAttachments that is used as an input attachment by renderPass must have been created with a usage value including VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT

  • Any given element of pAttachments must have been created with a VkFormat value that matches the VkFormat specified by the corresponding VkAttachmentDescription in renderPass

  • Any given element of pAttachments must have been created with a samples value that matches the samples value specified by the corresponding VkAttachmentDescription in renderPass

  • Any given element of pAttachments must have dimensions at least as large as the corresponding framebuffer dimension

  • Any given element of pAttachments must only specify a single mip level

  • Any given element of pAttachments must have been created with the identity swizzle

  • width must be greater than 0.

  • width must be less than or equal to VkPhysicalDeviceLimits::maxFramebufferWidth

  • height must be greater than 0.

  • height must be less than or equal to VkPhysicalDeviceLimits::maxFramebufferHeight

  • layers must be greater than 0.

  • layers must be less than or equal to VkPhysicalDeviceLimits::maxFramebufferLayers
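The six dimension rules at the end of the list above reduce to a range check against the device limits. A minimal sketch follows, assuming the three limits have already been read from VkPhysicalDeviceLimits; the struct SketchFramebufferLimits and the function sketch_framebuffer_extent_is_valid are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the three relevant VkPhysicalDeviceLimits members. */
typedef struct SketchFramebufferLimits {
    uint32_t max_width;  /* maxFramebufferWidth */
    uint32_t max_height; /* maxFramebufferHeight */
    uint32_t max_layers; /* maxFramebufferLayers */
} SketchFramebufferLimits;

/* Returns true when width, height and layers are all greater than zero
 * and do not exceed the corresponding limit. */
bool sketch_framebuffer_extent_is_valid(uint32_t width, uint32_t height,
                                        uint32_t layers,
                                        const SketchFramebufferLimits *limits)
{
    assert(limits != NULL);

    return width  > 0 && width  <= limits->max_width  &&
           height > 0 && height <= limits->max_height &&
           layers > 0 && layers <= limits->max_layers;
}
```

A check like this only covers the numeric rules. The per-attachment rules (usage flags, format, sample count, single mip level, identity swizzle) depend on how each image view was created and are not modeled.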

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • renderPass must be a valid VkRenderPass handle

  • If attachmentCount is not 0, pAttachments must be a pointer to an array of attachmentCount valid VkImageView handles

  • Both of renderPass, and the elements of pAttachments that are valid handles must have been created, allocated, or retrieved from the same VkDevice

To destroy a framebuffer, call:

void vkDestroyFramebuffer(
    VkDevice                                    device,
    VkFramebuffer                               framebuffer,
    const VkAllocationCallbacks*                pAllocator);
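  • device is the logical device that destroys the framebuffer.

  • framebuffer is the handle of the framebuffer to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage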
  • All submitted commands that refer to framebuffer must have completed execution

  • If VkAllocationCallbacks were provided when framebuffer was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when framebuffer was created, pAllocator must be NULL

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If framebuffer is not VK_NULL_HANDLE, framebuffer must be a valid VkFramebuffer handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If framebuffer is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to framebuffer must be externally synchronized

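7.4. Render Pass Commands

An application records the commands for a render pass instance one subpass at a time, by beginning a render pass instance, iterating over the subpasses to record commands for each subpass, and then ending the render pass instance.

To begin a render pass instance, call: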

void vkCmdBeginRenderPass(
    VkCommandBuffer                             commandBuffer,
    const VkRenderPassBeginInfo*                pRenderPassBegin,
    VkSubpassContents                           contents);
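  • commandBuffer is the command buffer in which to record the command.

  • pRenderPassBegin is a pointer to a VkRenderPassBeginInfo structure (defined below) which indicates the render pass to begin an instance of, and the framebuffer the instance uses.

  • contents specifies how the commands in the first subpass will be provided, and is one of the values: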

    typedef enum VkSubpassContents {
        VK_SUBPASS_CONTENTS_INLINE = 0,
        VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS = 1,
    } VkSubpassContents;

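    If contents is VK_SUBPASS_CONTENTS_INLINE, the contents of the subpass will be recorded inline in the primary command buffer, and secondary command buffers must not be executed within the subpass. If contents is VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS, the contents are recorded in secondary command buffers that will be called from the primary command buffer, and vkCmdExecuteCommands is the only valid command on the command buffer until vkCmdNextSubpass or vkCmdEndRenderPass.

After beginning a render pass instance, the command buffer is ready to record the commands for the first subpass of that render pass.

Valid Usage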
  • If any of the initialLayout or finalLayout member of the VkAttachmentDescription structures or the layout member of the VkAttachmentReference structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL then the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin must have been created with VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT set

  • If any of the initialLayout or finalLayout member of the VkAttachmentDescription structures or the layout member of the VkAttachmentReference structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL or VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL then the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin must have been created with VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT set

  • If any of the initialLayout or finalLayout member of the VkAttachmentDescription structures or the layout member of the VkAttachmentReference structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL then the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin must have been created with VK_IMAGE_USAGE_SAMPLED_BIT or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT set

  • If any of the initialLayout or finalLayout member of the VkAttachmentDescription structures or the layout member of the VkAttachmentReference structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL then the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin must have been created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT set

  • If any of the initialLayout or finalLayout member of the VkAttachmentDescription structures or the layout member of the VkAttachmentReference structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL then the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin must have been created with VK_IMAGE_USAGE_TRANSFER_DST_BIT set

  • If any of the initialLayout members of the VkAttachmentDescription structures specified when creating the render pass specified in the renderPass member of pRenderPassBegin is not VK_IMAGE_LAYOUT_UNDEFINED, then each such initialLayout must be equal to the current layout of the corresponding attachment image subresource of the framebuffer specified in the framebuffer member of pRenderPassBegin

  • The srcStageMask and dstStageMask members of any element of the pDependencies member of VkRenderPassCreateInfo used to create renderPass must be supported by the capabilities of the queue family identified by the queueFamilyIndex member of the VkCommandPoolCreateInfo used to create the command pool which commandBuffer was allocated from.

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pRenderPassBegin must be a pointer to a valid VkRenderPassBeginInfo structure

  • contents must be a valid VkSubpassContents value

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics operations

  • This command must only be called outside of a render pass instance

  • commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary                  Outside              Graphics                 Graphics

The VkRenderPassBeginInfo structure is defined as:

typedef struct VkRenderPassBeginInfo {
    VkStructureType        sType;
    const void*            pNext;
    VkRenderPass           renderPass;
    VkFramebuffer          framebuffer;
    VkRect2D               renderArea;
    uint32_t               clearValueCount;
    const VkClearValue*    pClearValues;
} VkRenderPassBeginInfo;
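  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • renderPass is the render pass to begin an instance of.

  • framebuffer is the framebuffer containing the attachments that are used with the render pass.

  • renderArea is the render area that is affected by the render pass instance, and is described in more detail below.

  • clearValueCount is the number of elements in pClearValues.

  • pClearValues is an array of VkClearValue structures that contains clear values for each attachment, if the attachment uses a loadOp value of VK_ATTACHMENT_LOAD_OP_CLEAR or if the attachment has a depth/stencil format and uses a stencilLoadOp value of VK_ATTACHMENT_LOAD_OP_CLEAR. Only elements corresponding to cleared attachments are used. Other elements of pClearValues are ignored.

renderArea is the render area that is affected by the render pass instance. The effects of attachment load, store and multisample resolve operations are restricted to the pixels whose x and y coordinates fall within the render area on all attachments. The render area extends to all layers of framebuffer. The application must ensure (using scissor if necessary) that all rendering is contained within the render area, otherwise the pixels outside of the render area become undefined and shader side effects may occur for fragments outside the render area. The render area must be contained within the framebuffer dimensions.

Note

There may be a performance cost for using a render area smaller than the framebuffer, unless it matches the render area granularity for the render pass.

Valid Usage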
  • clearValueCount must be greater than the largest attachment index in renderPass that specifies a loadOp (or stencilLoadOp, if the attachment has a depth/stencil format) of VK_ATTACHMENT_LOAD_OP_CLEAR

  • If clearValueCount is not 0, pClearValues must be a pointer to an array of clearValueCount valid VkClearValue unions

  • renderPass must be compatible with the renderPass member of the VkFramebufferCreateInfo structure specified when creating framebuffer.
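Because pClearValues is indexed by attachment number, the smallest clearValueCount that satisfies the first rule above is one more than the highest attachment index that is cleared. The sketch below computes that value from a simplified attachment description; SketchLoadOp, SketchAttachment and sketch_min_clear_value_count are hypothetical names, and the has_depth_stencil_format flag is assumed to have been derived from the attachment's format beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the VkAttachmentLoadOp values. */
typedef enum SketchLoadOp {
    SKETCH_LOAD_OP_LOAD,
    SKETCH_LOAD_OP_CLEAR,
    SKETCH_LOAD_OP_DONT_CARE
} SketchLoadOp;

/* Hypothetical, reduced view of one attachment description. */
typedef struct SketchAttachment {
    SketchLoadOp load_op;
    SketchLoadOp stencil_load_op;
    bool         has_depth_stencil_format;
} SketchAttachment;

/* Returns the smallest clearValueCount that is greater than the largest
 * index of any cleared attachment, or 0 if nothing is cleared. */
uint32_t sketch_min_clear_value_count(const SketchAttachment *attachments,
                                      uint32_t count)
{
    uint32_t required = 0;

    assert(attachments != NULL || count == 0);

    for (uint32_t i = 0; i < count; ++i) {
        const bool clears_color_or_depth =
            attachments[i].load_op == SKETCH_LOAD_OP_CLEAR;
        const bool clears_stencil =
            attachments[i].has_depth_stencil_format &&
            attachments[i].stencil_load_op == SKETCH_LOAD_OP_CLEAR;

        if (clears_color_or_depth || clears_stencil) {
            required = i + 1; /* later matches overwrite earlier ones */
        }
    }
    return required;
}
```

Entries of pClearValues below the returned count that correspond to attachments which are not cleared still have to be present in the array, but their contents are ignored.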

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO

  • pNext must be NULL

  • renderPass must be a valid VkRenderPass handle

  • framebuffer must be a valid VkFramebuffer handle

  • Both of framebuffer, and renderPass must have been created, allocated, or retrieved from the same VkDevice

To query the render area granularity, call:

void vkGetRenderAreaGranularity(
    VkDevice                                    device,
    VkRenderPass                                renderPass,
    VkExtent2D*                                 pGranularity);
  • device is the logical device that owns the render pass.

  • renderPass is a handle to a render pass.

  • pGranularity points to a VkExtent2D structure in which the granularity is returned.

The conditions leading to an optimal renderArea are:

  • the offset.x member in renderArea is a multiple of the width member of the returned VkExtent2D (the horizontal granularity).

  • the offset.y member in renderArea is a multiple of the height of the returned VkExtent2D (the vertical granularity).

  • either the extent.width member in renderArea is a multiple of the horizontal granularity or offset.x+extent.width is equal to the width of the framebuffer in the VkRenderPassBeginInfo.

  • either the extent.height member in renderArea is a multiple of the vertical granularity or offset.y+extent.height is equal to the height of the framebuffer in the VkRenderPassBeginInfo.
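The four conditions above can be combined into a single predicate. The following is a sketch under stated assumptions: SketchRect mirrors the offset and extent of a VkRect2D, SketchExtent mirrors VkExtent2D, and sketch_render_area_is_optimal is a hypothetical helper. It assumes the granularity returned by vkGetRenderAreaGranularity is non-zero in both dimensions and that the render area offset is non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkExtent2D. */
typedef struct SketchExtent {
    uint32_t width;
    uint32_t height;
} SketchExtent;

/* Hypothetical stand-in for VkRect2D (offset.x, offset.y, extent). */
typedef struct SketchRect {
    int32_t  x;
    int32_t  y;
    uint32_t width;
    uint32_t height;
} SketchRect;

/* Returns true when the render area meets all four alignment conditions
 * for the given granularity and framebuffer size. */
bool sketch_render_area_is_optimal(SketchRect area, SketchExtent granularity,
                                   SketchExtent framebuffer)
{
    /* Assumptions of this sketch, not guarantees taken from the text. */
    assert(granularity.width > 0 && granularity.height > 0);
    assert(area.x >= 0 && area.y >= 0);

    const uint64_t x = (uint64_t)area.x;
    const uint64_t y = (uint64_t)area.y;

    const bool x_aligned = (x % granularity.width) == 0;
    const bool y_aligned = (y % granularity.height) == 0;

    /* Width and height may instead reach exactly to the framebuffer edge. */
    const bool width_ok = (area.width % granularity.width) == 0 ||
                          x + area.width == framebuffer.width;
    const bool height_ok = (area.height % granularity.height) == 0 ||
                           y + area.height == framebuffer.height;

    return x_aligned && y_aligned && width_ok && height_ok;
}
```

A render area covering the whole framebuffer passes this check for any granularity, since both offsets are zero and both extents reach the framebuffer edge.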

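Subpass dependencies are not affected by the render area, and apply to the entire image subresources attached to the framebuffer. Similarly, pipeline barriers are valid even if their effect extends outside the render area.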

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • renderPass must be a valid VkRenderPass handle

  • pGranularity must be a pointer to a VkExtent2D structure

  • renderPass must have been created, allocated, or retrieved from device

To transition to the next subpass in the render pass instance after recording the commands for a subpass, call:

void vkCmdNextSubpass(
    VkCommandBuffer                             commandBuffer,
    VkSubpassContents                           contents);
  • commandBuffer is the command buffer in which to record the command.

  • contents specifies how the commands in the next subpass will be provided, in the same fashion as the corresponding parameter of vkCmdBeginRenderPass.

The subpass index for a render pass begins at zero when vkCmdBeginRenderPass is recorded, and increments each time vkCmdNextSubpass is recorded.

Moving to the next subpass automatically performs any multisample resolve operations in the subpass being ended. End-of-subpass multisample resolves are treated as color attachment writes for the purposes of synchronization. That is, they are considered to execute in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage and their writes are synchronized with VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT. Synchronization between rendering within a subpass and any resolve operations at the end of the subpass occurs automatically, without need for explicit dependencies or pipeline barriers. However, if the resolve attachment is also used in a different subpass, an explicit dependency is needed.

After transitioning to the next subpass, the application can record the commands for that subpass.

Valid Usage
  • The current subpass index must be less than the number of subpasses in the render pass minus one

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • contents must be a valid VkSubpassContents value

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics operations

  • This command must only be called inside of a render pass instance

  • commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary                  Inside               Graphics                 Graphics

To record a command to end a render pass instance after recording the commands for the last subpass, call:

void vkCmdEndRenderPass(
    VkCommandBuffer                             commandBuffer);
  • commandBuffer is the command buffer in which to end the current render pass instance.

Ending a render pass instance performs any multisample resolve operations on the final subpass.

Valid Usage
  • The current subpass index must be equal to the number of subpasses in the render pass minus one
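Taken together, the subpass index rules for vkCmdBeginRenderPass, vkCmdNextSubpass and vkCmdEndRenderPass describe a small state machine. The sketch below models only that bookkeeping and records no commands; SketchRenderPassRecorder and the three sketch_* functions are hypothetical, and the requirement that a render pass has at least one subpass is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracker for the subpass index of one command buffer. */
typedef struct SketchRenderPassRecorder {
    uint32_t subpass_count;
    uint32_t current_subpass;
    bool     inside_render_pass;
} SketchRenderPassRecorder;

/* Models vkCmdBeginRenderPass: only valid outside a render pass instance.
 * The subpass index starts at zero. */
bool sketch_begin_render_pass(SketchRenderPassRecorder *rec,
                              uint32_t subpass_count)
{
    assert(rec != NULL);
    if (rec->inside_render_pass || subpass_count == 0) {
        return false;
    }
    rec->subpass_count = subpass_count;
    rec->current_subpass = 0;
    rec->inside_render_pass = true;
    return true;
}

/* Models vkCmdNextSubpass: the current index must be less than the
 * number of subpasses minus one. */
bool sketch_next_subpass(SketchRenderPassRecorder *rec)
{
    assert(rec != NULL);
    if (!rec->inside_render_pass ||
        rec->current_subpass >= rec->subpass_count - 1) {
        return false;
    }
    rec->current_subpass += 1;
    return true;
}

/* Models vkCmdEndRenderPass: the current index must equal the number of
 * subpasses minus one. */
bool sketch_end_render_pass(SketchRenderPassRecorder *rec)
{
    assert(rec != NULL);
    if (!rec->inside_render_pass ||
        rec->current_subpass != rec->subpass_count - 1) {
        return false;
    }
    rec->inside_render_pass = false;
    return true;
}
```

For a render pass with N subpasses, a valid recording is therefore one begin, exactly N - 1 calls to advance the subpass, and one end; each function returns false for a call that would violate the rules.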

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics operations

  • This command must only be called inside of a render pass instance

  • commandBuffer must be a primary VkCommandBuffer

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels    Render Pass Scope    Supported Queue Types    Pipeline Type
Primary                  Inside               Graphics                 Graphics

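8. Shaders

A shader specifies programmable operations that execute for each vertex, control point, tessellated vertex, primitive, fragment, or workgroup in the corresponding stage(s) of the graphics and compute pipelines.

Graphics pipelines include vertex shader execution as a result of primitive assembly, followed, if enabled, by tessellation control and evaluation shaders operating on patches, geometry shaders, if enabled, operating on primitives, and fragment shaders operating on fragments generated by Rasterization. In this specification, vertex, tessellation control, tessellation evaluation and geometry shaders are collectively referred to as vertex processing stages and occur in the logical pipeline before rasterization. The fragment shader runs after rasterization.

Only the compute shader stage is included in a compute pipeline. Compute shaders operate on compute invocations in a workgroup.

Shaders can read from input variables, and read from and write to output variables. Input and output variables can be used to transfer data between shader stages, or to allow the shader to interact with values that exist in the execution environment. Similarly, the execution environment provides constants that describe capabilities.

Shader variables are associated with execution environment-provided inputs and outputs using built-in decorations in the shader. The available decorations for each stage are documented in the following subsections.

8.1. Shader Modules

Shader modules contain shader code and one or more entry points. Shaders are selected from a shader module by specifying an entry point as part of pipeline creation. The stages of a pipeline can use shaders that come from different modules. The shader code defining a shader module must be in the SPIR-V format, as described by the Vulkan Environment for SPIR-V appendix.

Shader modules are represented by VkShaderModule handles: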

VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkShaderModule)

To create a shader module, call:

VkResult vkCreateShaderModule(
    VkDevice                                    device,
    const VkShaderModuleCreateInfo*             pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkShaderModule*                             pShaderModule);
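  • device is the logical device that creates the shader module.

  • pCreateInfo is a pointer to an instance of the VkShaderModuleCreateInfo structure.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pShaderModule points to a VkShaderModule handle in which the resulting shader module object is returned.

Once a shader module has been created, any entry points it contains can be used in pipeline shader stages as described in Compute Pipelines and Graphics Pipelines.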

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • pCreateInfo must be a pointer to a valid VkShaderModuleCreateInfo structure

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • pShaderModule must be a pointer to a VkShaderModule handle

Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkShaderModuleCreateInfo structure is defined as:

typedef struct VkShaderModuleCreateInfo {
    VkStructureType              sType;
    const void*                  pNext;
    VkShaderModuleCreateFlags    flags;
    size_t                       codeSize;
    const uint32_t*              pCode;
} VkShaderModuleCreateInfo;
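  • sType is the type of this structure.

  • pNext is NULL or a pointer to an extension-specific structure.

  • flags is reserved for future use.

  • codeSize is the size, in bytes, of the code pointed to by pCode.

  • pCode points to code that is used to create the shader module. The type and format of the code is determined from the content of the memory addressed by pCode.

Valid Usage
  • codeSize must be greater than 0

  • codeSize must be a multiple of 4. If the VK_NV_glsl_shader extension is enabled and pCode references GLSL code, codeSize can be a multiple of 1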

  • pCode must point to valid SPIR-V code, formatted and packed as described by the Khronos SPIR-V Specification. If the VK_NV_glsl_shader extension is enabled pCode can instead reference valid GLSL code and must be written to the GL_KHR_vulkan_glsl extension specification

  • pCode must adhere to the validation rules described by the Validation Rules within a Module section of the SPIR-V Environment appendix. If the VK_NV_glsl_shader extension is enabled pCode can be valid GLSL code with respect to the GL_KHR_vulkan_glsl GLSL extension specification

  • pCode must declare the Shader capability for SPIR-V code

  • pCode must not declare any capability that is not supported by the API, as described by the Capabilities section of the SPIR-V Environment appendix

  • If pCode declares any of the capabilities that are listed as not required by the implementation, the relevant feature must be enabled, as listed in the SPIR-V Environment appendix
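The size rules above, together with the requirement that pCode holds SPIR-V, allow a cheap sanity check before calling vkCreateShaderModule. The sketch below is not a validator: it only tests the byte count and the first word of the blob against the SPIR-V magic number 0x07230203, and it ignores the VK_NV_glsl_shader case. The names SKETCH_SPIRV_MAGIC and sketch_spirv_blob_looks_valid are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First word of every SPIR-V module, in the module's native word order. */
#define SKETCH_SPIRV_MAGIC 0x07230203u

/* Cheap pre-check of a candidate (pCode, codeSize) pair:
 *   - the pointer is non-NULL and the size is greater than zero,
 *   - the size is a multiple of 4 bytes,
 *   - the first 32-bit word is the SPIR-V magic number.
 * Passing this check does not imply the module is valid SPIR-V. */
bool sketch_spirv_blob_looks_valid(const uint32_t *code, size_t code_size_bytes)
{
    if (code == NULL || code_size_bytes == 0) {
        return false;
    }
    if (code_size_bytes % sizeof(uint32_t) != 0) {
        return false;
    }
    return code[0] == SKETCH_SPIRV_MAGIC;
}
```

A check of this kind is mainly useful for catching a truncated file or a file loaded with the wrong path; the remaining rules in the list (capabilities, validation rules within a module) need a real SPIR-V validator.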

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO

  • pNext must be NULL

  • flags must be 0

  • pCode must be a pointer to an array of \(codeSize \over 4\) uint32_t values

To destroy a shader module, call:

void vkDestroyShaderModule(
    VkDevice                                    device,
    VkShaderModule                              shaderModule,
    const VkAllocationCallbacks*                pAllocator);
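  • device is the logical device that destroys the shader module.

  • shaderModule is the handle of the shader module to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

A shader module can be destroyed while pipelines created using its shaders are still in use.

Valid Usage
  • If VkAllocationCallbacks were provided when shaderModule was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when shaderModule was created, pAllocator must be NULL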

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If shaderModule is not VK_NULL_HANDLE, shaderModule must be a valid VkShaderModule handle

  • If pAllocator is not NULL, pAllocator must be a pointer to a valid VkAllocationCallbacks structure

  • If shaderModule is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to shaderModule must be externally synchronized

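8.2. Shader Execution

At each stage of the pipeline, multiple invocations of a shader may execute simultaneously. Further, invocations of a single shader produced as the result of different commands may execute simultaneously. The relative execution order of invocations of the same shader type is undefined. Shader invocations may complete in a different order than that in which the primitives they originated from were drawn or dispatched by the application. However, fragment shader outputs are written to attachments in rasterization order.

The relative order of invocations of different shader types is largely undefined. However, when invoking a shader whose inputs are generated from a previous pipeline stage, the shader invocations from the previous stage are guaranteed to have executed far enough to generate input values for all required inputs.

8.3. Shader Memory Access Ordering

The order in which image or buffer memory is read or written by shaders is largely undefined. For some shader types (vertex, tessellation evaluation, and in some cases, fragment), even the number of shader invocations that may perform loads and stores is undefined.

In particular, the following rules apply:

  • Vertex and tessellation evaluation shaders will be invoked at least once for each unique vertex.

  • Fragment shaders will be invoked zero or more times.

  • The relative order of invocations of the same shader type is undefined. A store issued by a shader when working on primitive B might complete prior to a store for primitive A, even if primitive A is specified prior to primitive B. This applies even to fragment shaders; while fragment shader outputs are always written to the framebuffer in rasterization order, stores executed by fragment shader invocations are not.

  • The relative order of invocations of different shader types is largely undefined.

Note

The above limitations on shader invocation order make some forms of synchronization between shader invocations within a single set of primitives unimplementable. For example, having one invocation poll memory written by another invocation assumes that the other invocation has been launched and will complete its writes in finite time.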

Stores issued to different memory locations within a single shader invocation may not be visible to other invocations, or may not become visible in the order they were performed.

The OpMemoryBarrier instruction can be used to provide stronger ordering of reads and writes performed by a single invocation. OpMemoryBarrier guarantees that any memory transactions issued by the shader invocation prior to the instruction complete prior to the memory transactions issued after the instruction. Memory barriers are needed for algorithms that require multiple invocations to access the same memory and require the operations to be performed in a partially-defined relative order. For example, if one shader invocation does a series of writes, followed by an OpMemoryBarrier instruction, followed by another write, then the results of the series of writes before the barrier become visible to other shader invocations at a time earlier or equal to when the results of the final write become visible to those invocations. In practice it means that another invocation that sees the results of the final write would also see the previous writes. Without the memory barrier, the final write may be visible before the previous writes.

Writes that are the result of shader stores through a variable decorated with Coherent automatically have available writes to the same buffer, buffer view, or image view made visible to them, and are themselves automatically made available to access by the same buffer, buffer view, or image view. Reads that are the result of shader loads through a variable decorated with Coherent automatically have available writes to the same buffer, buffer view, or image view made visible to them. The order that coherent writes to different locations become available is undefined, unless enforced by a memory barrier instruction or other memory dependency.

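Note

Explicit memory dependencies must still be used to guarantee availability and visibility for access via other buffers, buffer views, or image views.

The built-in atomic memory transaction instructions can be used to read and write a given memory address atomically. While built-in atomic functions issued by multiple shader invocations are executed in undefined order relative to each other, these functions perform both a read and a write of a memory address and guarantee that no other memory transaction will write to the underlying memory between the read and the write. Atomic operations ensure availability and visibility for writes and reads in the same way as Coherent variables.

Note

Memory accesses performed on different resource descriptors with the same memory backing may not be well-defined, even with the Coherent decoration or via atomics, due to things such as image layouts or ownership of the resource, as described in the Synchronization and Cache Control chapter.

Note

Atomics allow shaders to use shared global addresses for mutual exclusion or as counters, among other uses.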
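
The counter use case can be illustrated with a host-side analogy. The sketch below uses C11 atomics rather than SPIR-V: the OutputAllocator type and reserve_slot function are hypothetical names, and atomic_fetch_add stands in for a shader atomic add on a word in buffer memory. It shows only the read-modify-write guarantee, not Vulkan's availability and visibility rules.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared counter, playing the role of a buffer word that many shader
 * invocations update with an atomic add (assumption: host-side analogy). */
typedef struct {
    atomic_uint next_slot;
} OutputAllocator;

/* Reserve one output slot. atomic_fetch_add returns the value held before
 * the add, and no other write can land between its read and its write, so
 * concurrent callers each receive a distinct index. The order in which
 * callers receive indices is not defined. */
static uint32_t reserve_slot(OutputAllocator *a)
{
    return (uint32_t)atomic_fetch_add(&a->next_slot, 1u);
}
```

Each caller can then write its result at the returned index without colliding with another caller, which is the same pattern a shader would use to append to a shared output buffer.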

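8.4. Shader Inputs and Outputs

Data is passed into and out of shaders using variables with input or output storage class, respectively. User-defined inputs and outputs are connected between stages by matching their Location decorations. Additionally, data can be provided by or communicated to special functions provided by the execution environment using BuiltIn decorations.

In many cases, the same BuiltIn decoration can be used in multiple shader stages with similar meaning. The specific behavior of variables decorated as BuiltIn is documented in the following sections.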

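8.5. Vertex Shaders

Each vertex shader invocation operates on one vertex and its associated vertex attribute data, and outputs one vertex and associated data. Graphics pipelines must include a vertex shader, and the vertex shader stage is always the first shader stage in the graphics pipeline.

8.5.1. Vertex Shader Execution

A vertex shader must be executed at least once for each vertex specified by a draw command. During execution, the shader is presented with the index of the vertex and instance for which it has been invoked. Input variables declared in the vertex shader are filled by the implementation with the values of the vertex attributes associated with the invocation being executed.

If the same vertex is specified multiple times in a draw command, the implementation may reuse the results of vertex shading if it can determine that the vertex shader invocations will produce identical results.

Note

It is implementation-dependent when and if results of vertex shading are reused, and thus how many times the vertex shader will be executed. This is true also if the vertex shader contains stores or atomic operations (see vertexPipelineStoresAndAtomics).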

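8.6. Tessellation Control Shaders

The tessellation control shader is used to read an input patch provided by the application and to produce an output patch. Each tessellation control shader invocation operates on an input patch (after all control points in the patch are processed by a vertex shader) and its associated data, and outputs a single control point of the output patch and its associated data, and can also output additional per-patch data. The input patch is sized according to the patchControlPoints member of VkPipelineTessellationStateCreateInfo, as part of input assembly. The size of the output patch is controlled by the OpExecutionMode OutputVertices specified in the tessellation control or tessellation evaluation shaders, which must be specified in at least one of the shaders. The size of the input and output patches must each be greater than zero and less than or equal to VkPhysicalDeviceLimits::maxTessellationPatchSize.

8.6.1. Tessellation Control Shader Execution

A tessellation control shader is invoked at least once for each output vertex in a patch.

Inputs to the tessellation control shader are generated by the vertex shader. Each invocation of the tessellation control shader can read the attributes of any incoming vertices and their associated data. The invocations corresponding to a given patch execute logically in parallel, with undefined relative execution order. However, the OpControlBarrier instruction can be used to provide limited control of the execution order by synchronizing invocations within a patch, effectively dividing tessellation control shader execution into a set of phases. Tessellation control shaders will read undefined values if one invocation reads a per-vertex or per-patch attribute written by another invocation at any point during the same phase, or if two invocations attempt to write different values to the same per-patch output in a single phase.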

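8.7. Tessellation Evaluation Shaders

The tessellation evaluation shader operates on an input patch of control points and their associated data, and on a single input barycentric coordinate indicating the invocation's relative position within the subdivided patch, and outputs a single vertex and its associated data.

8.7.1. Tessellation Evaluation Shader Execution

A tessellation evaluation shader is invoked at least once for each unique vertex generated by the tessellator.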

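8.8. Geometry Shaders

The geometry shader operates on a group of vertices and their associated data assembled from a single input primitive, and emits zero or more output primitives and the group of vertices and their associated data required for each output primitive.

8.8.1. Geometry Shader Execution

A geometry shader is invoked at least once for each primitive produced by the tessellation stages, or at least once for each primitive generated by primitive assembly when tessellation is not in use. The number of geometry shader invocations per input primitive is determined from the invocation count of the geometry shader specified by the OpExecutionMode Invocations in the geometry shader. If the invocation count is not specified, then a default of one invocation is executed.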

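8.9. Fragment Shaders

Fragment shaders are invoked as the result of rasterization in a graphics pipeline. Each fragment shader invocation operates on a single fragment and its associated data. With few exceptions, fragment shader invocations do not have access to data associated with other fragments and are considered to execute in isolation of fragment shader invocations associated with other fragments.

8.9.1. Fragment Shader Execution

For each fragment generated by rasterization, a fragment shader may be invoked. A fragment shader must not be invoked if the Early Per-Fragment Tests cause it to have no coverage. Furthermore, if it is determined that a fragment generated by rasterizing a first primitive will have its outputs entirely overwritten by a fragment generated by rasterizing a second primitive in the same subpass, and the fragment shader used for the fragment has no other side effects, then the fragment shader may not be executed for the fragment from the first primitive.

Relative ordering of execution of different fragment shader invocations is not defined.

The number of fragment shader invocations produced per pixel is determined as follows:

  • If per-sample shading is enabled, the fragment shader is invoked once per covered sample.

  • Otherwise, the fragment shader is invoked at least once per fragment but no more than once per covered sample.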
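
The two rules above can be restated as bounds on the invocation count for one fragment. The sketch below is illustrative only: the InvocationBounds type and fragment_invocation_bounds function are hypothetical, the coverage mask is assumed to hold one bit per covered sample, and helper invocations as well as the case where a fragment is skipped because it is entirely overwritten are deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower and upper bound on fragment shader invocations for one fragment. */
typedef struct {
    unsigned min;
    unsigned max;
} InvocationBounds;

/* Count the set bits of a coverage mask (one bit per covered sample). */
static unsigned count_bits(uint32_t mask)
{
    unsigned n = 0;
    while (mask != 0) {
        mask &= mask - 1; /* clear the lowest set bit */
        ++n;
    }
    return n;
}

/* Hypothetical helper: bounds implied by the per-pixel rules, assuming the
 * fragment is shaded at all. */
static InvocationBounds fragment_invocation_bounds(uint32_t coverage_mask,
                                                   bool per_sample_shading)
{
    InvocationBounds b = {0, 0};
    unsigned covered = count_bits(coverage_mask);

    if (covered == 0)
        return b;              /* no coverage: the shader is not invoked */
    if (per_sample_shading) {
        b.min = covered;       /* exactly once per covered sample */
        b.max = covered;
    } else {
        b.min = 1;             /* at least once per fragment */
        b.max = covered;       /* at most once per covered sample */
    }
    return b;
}
```

For a fragment covering three of four samples, this gives exactly three invocations with per-sample shading and between one and three without it.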

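In addition to the conditions outlined above for the invocation of a fragment shader, a fragment shader invocation may be produced as a helper invocation. A helper invocation is a fragment shader invocation that is created solely for the purposes of evaluating derivatives for use in non-helper fragment shader invocations. Stores and atomics performed by helper invocations must not have any effect on memory, and values returned by atomic instructions in helper invocations are undefined.

8.9.2. Early Fragment Tests

An explicit control is provided to allow fragment shaders to enable early fragment tests. If the fragment shader specifies the EarlyFragmentTests OpExecutionMode, the per-fragment tests described in Early Fragment Test Mode are performed prior to fragment shader execution. Otherwise, they are performed after fragment shader execution.

8.10. Compute Shaders

Compute shaders are invoked via the vkCmdDispatch and vkCmdDispatchIndirect commands. In general, they have access to similar resources as shader stages executing as part of a graphics pipeline.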

Compute workloads are formed from groups of work items called workgroups and processed by the compute shader in the current compute pipeline. A workgroup is a collection of shader invocations that execute the same shader, potentially in parallel. Compute shaders execute in global workgroups which are divided into a number of local workgroups with a size that can be set by assigning a value to the LocalSize execution mode or via an object decorated by the WorkgroupSize decoration. An invocation within a local workgroup can share data with other members of the local workgroup through shared variables and issue memory and control flow barriers to synchronize with other members of the local workgroup.
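
The relationship between the global workgroup and its local workgroups can be sketched with a little index arithmetic. The code below is a host-side illustration, not shader code: the UVec3 type and both function names are hypothetical, and the formulas follow the usual definitions of the GlobalInvocationId and LocalInvocationIndex built-ins.

```c
#include <stdint.h>

/* Three-component unsigned vector, used for IDs and sizes. */
typedef struct {
    uint32_t x, y, z;
} UVec3;

/* Position of one invocation in the global workgroup:
 * global = workgroup_id * local_size + local_id, per component. */
static UVec3 global_invocation_id(UVec3 workgroup_id, UVec3 local_size,
                                  UVec3 local_id)
{
    UVec3 g;
    g.x = workgroup_id.x * local_size.x + local_id.x;
    g.y = workgroup_id.y * local_size.y + local_id.y;
    g.z = workgroup_id.z * local_size.z + local_id.z;
    return g;
}

/* Flattened index of an invocation inside its local workgroup,
 * with x varying fastest. */
static uint32_t local_invocation_index(UVec3 local_size, UVec3 local_id)
{
    return local_id.z * local_size.x * local_size.y
         + local_id.y * local_size.x
         + local_id.x;
}
```

With a local size of 8 x 4 x 1, the invocation at local position (3, 2, 0) in workgroup (2, 1, 0) sits at global position (19, 6, 0) and has local index 19.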

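8.11. Interpolation Decorations

Interpolation decorations control the behavior of attribute interpolation in the fragment shader stage. Interpolation decorations can be applied to Input storage class variables in the fragment shader stage's interface, and control the interpolation behavior of those variables.

Inputs that could be interpolated can be decorated by at most one of the following decorations:

  • Flat: no interpolation

  • NoPerspective: linear interpolation (for lines and polygons).

Fragment input variables decorated with neither Flat nor NoPerspective use perspective-correct interpolation (for lines and polygons).

The presence of and type of interpolation is controlled by the above interpolation decorations as well as the auxiliary decorations Centroid and Sample.

A variable decorated with Flat will not be interpolated. Instead, it will have the same value for every fragment within a triangle. This value will come from a single provoking vertex. A variable decorated with Flat can also be decorated with Centroid or Sample, which will mean the same thing as decorating it only as Flat.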
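
The difference between the three behaviors can be made concrete for the simplest case, a line segment between two vertices a and b. The sketch below is illustrative and its function names are hypothetical: t is assumed to be the screen-space parameter in [0, 1] from a to b, and wa and wb are assumed to be the clip-space w coordinates of the two vertices.

```c
/* Flat: every fragment receives the provoking vertex's value. */
static float interp_flat(float f_provoking)
{
    return f_provoking;
}

/* NoPerspective: plain linear interpolation in screen space. */
static float interp_noperspective(float fa, float fb, float t)
{
    return (1.0f - t) * fa + t * fb;
}

/* Default: perspective-correct interpolation. Each value is weighted by
 * 1/w of its vertex, then the result is renormalized:
 *   f = ((1-t) fa/wa + t fb/wb) / ((1-t)/wa + t/wb)              */
static float interp_perspective(float fa, float wa, float fb, float wb,
                                float t)
{
    float num = (1.0f - t) * fa / wa + t * fb / wb;
    float den = (1.0f - t) / wa + t / wb;
    return num / den;
}
```

With fa = 0, fb = 1, wa = 1, wb = 3 and t = 0.5, linear interpolation yields 0.5 while perspective-correct interpolation yields 0.25, because the nearer vertex (smaller w) carries more weight at the screen-space midpoint.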

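For fragment shader input variables decorated with neither Centroid nor Sample, the assigned variable may be interpolated anywhere within the pixel and a single value may be assigned to each sample within the pixel.

Centroid and Sample can be used to control the location and frequency of the sampling of the decorated fragment shader input. If a fragment shader input is decorated with Centroid, a single value may be assigned to that variable for all samples in the pixel, but that value must be interpolated to a location that lies in both the pixel and in the primitive being rendered, including any of the pixel's samples covered by the primitive. Because the location at which the variable is interpolated may be different in neighboring pixels, and derivatives may be computed by computing differences between neighboring pixels, derivatives of centroid-sampled inputs may be less accurate than those for non-centroid interpolated variables. If a fragment shader input is decorated with Sample, a separate value must be assigned to that variable for each covered sample in the pixel, and that value must be sampled at the location of the individual sample. When rasterizationSamples is VK_SAMPLE_COUNT_1_BIT, the pixel center must be used for Centroid, Sample, and undecorated attribute interpolation.

Fragment shader inputs that are signed or unsigned integers, integer vectors, or any double-precision floating-point type must be decorated with Flat.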

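8.12. Static Use

A SPIR-V module declares a global object in memory using the OpVariable instruction, which results in a pointer x to that object. A specific entry point in a SPIR-V module is said to statically use that object if that entry point's call tree contains a function that contains a memory instruction or image instruction with x as an id operand. See the “Memory Instructions” and “Image Instructions” subsections of section 3 “Binary Form” of the SPIR-V specification for the complete list of SPIR-V memory instructions.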
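
Static use is a property of the call tree, not of what actually executes, so it can be decided by a simple reachability search. The sketch below works on a toy model rather than on real SPIR-V: the ModuleModel type, its fields, and the function names are all hypothetical, and functions are identified by small integer indices.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8

/* Toy model of a module (assumption: not a real SPIR-V representation). */
typedef struct {
    bool calls[MAX_FUNCS][MAX_FUNCS]; /* calls[f][g]: f contains a call to g */
    bool accesses_x[MAX_FUNCS];       /* f has a memory or image instruction
                                         with x as an id operand */
    size_t func_count;                /* number of functions, <= MAX_FUNCS */
} ModuleModel;

/* Depth-first walk of the call tree rooted at f. */
static bool visit(const ModuleModel *m, size_t f, bool *seen)
{
    if (seen[f])
        return false;          /* already explored, avoids revisiting */
    seen[f] = true;
    if (m->accesses_x[f])
        return true;
    for (size_t g = 0; g < m->func_count; ++g) {
        if (m->calls[f][g] && visit(m, g, seen))
            return true;
    }
    return false;
}

/* True when the entry point's call tree contains a function that accesses
 * x, regardless of whether that access would run at execution time. */
static bool statically_uses(const ModuleModel *m, size_t entry_point)
{
    bool seen[MAX_FUNCS] = { false };
    return visit(m, entry_point, seen);
}
```

An access guarded by a condition that is never true still counts, since the search looks only at which instructions are present in reachable functions.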

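Static use is not used to control the behavior of variables with Input and Output storage. The effects of those variables are applied based only on whether they are present in a shader entry point's interface.

8.13. Invocation and Derivative Groups

An invocation group (see the subsection “Control Flow” of section 2 of the SPIR-V specification) for a compute shader is the set of invocations in a single local workgroup. For graphics shaders, an invocation group is an implementation-dependent subset of the set of shader invocations of a given shader stage which are produced by a single drawing command. For indirect drawing commands with drawCount greater than one, invocations from separate draws are in distinct invocation groups.

Note

Because the partitioning of invocations into invocation groups is implementation-dependent and not observable, applications generally need to assume the worst case of all invocations in a draw belonging to a single invocation group.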

A derivative group (see the subsection “Control Flow” of section 2 of the SPIR-V 1.00 Revision 4 specification) for a fragment shader is the set of invocations generated by a single primitive (point, line, or triangle), including any helper invocations generated by that primitive. Derivatives are undefined for a sampled image instruction if the instruction is in flow control that is not uniform across the derivative group.

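9. Pipelines

The following figure shows a block diagram of the Vulkan pipelines. Some Vulkan commands specify geometric objects to be drawn or computational work to be performed, while others specify state controlling how objects are handled by the various pipeline stages, or control data transfer between memory organized as images and buffers. Commands are effectively sent through a processing pipeline, either a graphics pipeline or a compute pipeline.

The first stage of the graphics pipeline (Input Assembler) assembles vertices to form geometric primitives such as points, lines, and triangles, based on a requested primitive topology. In the next stage (Vertex Shader) vertices can be transformed, computing positions and attributes for each vertex. If tessellation and/or geometry shaders are supported, they can then generate multiple primitives from a single input primitive, possibly changing the primitive topology or generating additional attribute data in the process.

The final resulting primitives are clipped to a clip volume in preparation for the next stage, Rasterization. The rasterizer produces a series of framebuffer addresses and values using a two-dimensional description of a point, line segment, or triangle. Each fragment so produced is fed to the next stage (Fragment Shader), which performs operations on individual fragments before they finally alter the framebuffer. These operations include conditional updates into the framebuffer based on incoming and previously stored depth values (to effect depth buffering), blending of incoming fragment colors with stored colors, as well as masking, stenciling, and other logical operations on fragment values.

Framebuffer operations read and write the color and depth/stencil attachments of the framebuffer for a given subpass of a render pass instance. The attachments can be used as input attachments in the fragment shader in a later subpass of the same render pass.

The compute pipeline is separate from the graphics pipeline. It operates on one-, two-, or three-dimensional workgroups, which can read from and write to buffer and image memory.

This ordering is meant only as a tool for describing Vulkan, not as a strict rule of how Vulkan is implemented, and we present it only as a means to organize the various operations of the pipelines. Actual ordering guarantees between pipeline stages are explained in detail in the synchronization chapter.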