The Torch-TensorRT Core is the main graph analysis library. It processes a TorchScript module, converting method graphs to TensorRT engines and returning a new, equivalent module which, when run, feeds inputs through a TensorRT engine.
A basic rule of thumb for organization: if the output of a component is a modified block, it belongs in lowering; if the output is a TRT engine block, it belongs in conversion.
A set of passes over the IR lowers the graph into a block of convertible nodes.
First, the graph goes through the lowering passes used in LibTorch. These lower it to a graph where all attribute accesses are replaced with explicit inputs to the graph (i.e. graph parameters instead of prim::GetAttr).
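As a schematic illustration (not actual pass output), consider a module whose forward method reads a weight through the module object. After lowering, the attribute access becomes an explicit graph input:

```
# Before lowering: the weight is fetched from the module instance
graph(%self : __torch__.MyModule, %x : Tensor):
  %w : Tensor = prim::GetAttr[name="weight"](%self)
  %y : Tensor = aten::matmul(%x, %w)
  return (%y)

# After lowering: the weight is an explicit graph parameter
graph(%x : Tensor, %w : Tensor):
  %y : Tensor = aten::matmul(%x, %w)
  return (%y)
```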
Graphs from prim::CallMethod nodes need to be inlined into the graph or used to segment the graph into convertible subgraphs.
To simplify conversion, we can use the PyTorch JIT Subgraph Rewriter to shrink the set of subgraphs that need explicit TensorRT converters. This means we can aim for closer to 1-to-1 op conversion instead of searching for applicable subgraphs, limiting the number of converters and reducing the size of each one.
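As a hypothetical example of how such a rewrite works (this is not one of the actual lowering passes), the Subgraph Rewriter is given a pattern graph to match and a replacement graph with the same inputs to swap in, so that a multi-node idiom only needs one converter:

```
# Pattern: a matmul followed by a bias add
graph(%input, %weight, %bias):
  %1 : int = prim::Constant[value=1]()
  %mm : Tensor = aten::matmul(%input, %weight)
  %out : Tensor = aten::add(%mm, %bias, %1)
  return (%out)

# Replacement: a single node with one dedicated converter
graph(%input, %weight, %bias):
  %out : Tensor = aten::linear(%input, %weight, %bias)
  return (%out)
```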
Once the graph has been simplified to a form that is easy to convert, we set up a conversion context to manage the construction of a TensorRT INetworkDefinition from the block's nodes. The conversion context records the set of converted nodes, block inputs and outputs, and other information about the conversion of the graph. This data is then used to help converters link layers together, and it also holds build-time information, like weights, required to construct the engine.

After the context is created, the block converter starts iterating through the list of nodes. For each node, the converter looks at its inputs and assembles a dictionary of resources to pass to the converter. Inputs can be in a couple of states:
Some nodes contain static data and act as resources for operations. These can be evaluated at conversion time so that their values are available when converting other nodes. In principle, any node kind can have a conversion-time evaluator as long as it produces a static IValue. The IValue is stored in the conversion context so it can be consumed by any node that takes the evaluated node as an input.
See the README in //core/conversion/converters for more information.