Halide 20.0.0
Halide compiler and libraries
| doc | |
| ► src | |
| ► autoschedulers | |
| ► adams2019 | |
| AutoSchedule.h | |
| Cache.h | |
| cost_model_schedule.h | |
| CostModel.h | |
| DefaultCostModel.h | |
| Featurization.h | |
| FunctionDAG.h | |
| LoopNest.h | |
| NetworkSize.h | |
| State.h | |
| Timer.h | |
| Weights.h | |
| ► anderson2021 | |
| ► test | |
| test.h | |
| AutoSchedule.h | |
| cost_model_schedule.h | |
| CostModel.h | |
| DefaultCostModel.h | |
| Featurization.h | |
| FunctionDAG.h | |
| GPULoopInfo.h | Data structure containing information about the current GPU loop nest hierarchy of blocks, threads, etc |
| GPUMemInfo.h | Data structures that help track memory access information |
| LoopNest.h | |
| LoopNestParser.h | |
| NetworkSize.h | |
| SearchSpace.h | |
| SearchSpaceOptions.h | |
| State.h | |
| Statistics.h | |
| ThreadInfo.h | Data structure containing information about GPU threads for a particular location in the loop nest and its surrounding block |
| Tiling.h | |
| Weights.h | |
| ► common | |
| ASLog.h | |
| cmdline.h | |
| Errors.h | |
| HalidePlugin.h | |
| ParamParser.h | |
| PerfectHashMap.h | |
| ► runtime | |
| ► hexagon_remote | |
| ► bin | |
| ► src | |
| halide_hexagon_remote.h | |
| ► qurt | |
| known_symbols.h | |
| log.h | |
| sim_protocol.h | |
| ► internal | |
| block_allocator.h | |
| block_storage.h | |
| linked_list.h | |
| memory_arena.h | |
| memory_resources.h | |
| pointer_table.h | |
| region_allocator.h | |
| string_storage.h | |
| string_table.h | |
| android_ioctl.h | |
| cl_functions.h | |
| constants.h | This file contains private constants shared between the Halide library and the Halide runtime |
| cpu_features.h | |
| cuda_functions.h | |
| device_buffer_utils.h | |
| device_interface.h | |
| gpu_context_common.h | |
| HalideBuffer.h | Defines a Buffer type that wraps halide_buffer_t and adds functionality, and methods for more conveniently iterating over the samples in a halide_buffer_t outside of Halide code |
| HalidePyTorchCudaHelpers.h | Override Halide's CUDA hooks so that the Halide code called from PyTorch uses the correct GPU device and stream |
| HalidePyTorchHelpers.h | Set of utility functions to wrap PyTorch tensors into Halide buffers, making sure the data is on the correct device (CPU/GPU) |
| HalideRuntime.h | This file declares the routines used by Halide internally in its runtime |
| HalideRuntimeCuda.h | Routines specific to the Halide CUDA runtime |
| HalideRuntimeD3D12Compute.h | Routines specific to the Halide Direct3D 12 Compute runtime |
| HalideRuntimeHexagonDma.h | Routines specific to the Halide Hexagon DMA host-side runtime |
| HalideRuntimeHexagonHost.h | Routines specific to the Halide Hexagon host-side runtime |
| HalideRuntimeMetal.h | Routines specific to the Halide Metal runtime |
| HalideRuntimeOpenCL.h | Routines specific to the Halide OpenCL runtime |
| HalideRuntimeQurt.h | Routines specific to the Halide QuRT runtime |
| HalideRuntimeVulkan.h | Routines specific to the Halide Vulkan runtime |
| HalideRuntimeWebGPU.h | Routines specific to the Halide WebGPU runtime |
| hashmap.h | |
| hexagon_dma_pool.h | |
| metal_objc_platform_dependent.h | |
| mini_cl.h | |
| mini_cuda.h | |
| mini_d3d12.h | |
| mini_hexagon_dma.h | |
| mini_qurt.h | |
| mini_qurt_vtcm.h | |
| mini_webgpu.h | |
| objc_support.h | |
| posix_timeval.h | |
| printer.h | |
| runtime_atomics.h | |
| runtime_internal.h | |
| scoped_mutex_lock.h | |
| scoped_spin_lock.h | |
| synchronization_common.h | |
| thread_pool_common.h | |
| vulkan_context.h | |
| vulkan_extensions.h | |
| vulkan_functions.h | |
| vulkan_interface.h | |
| vulkan_internal.h | |
| vulkan_memory.h | |
| vulkan_resources.h | |
| AbstractGenerator.h | |
| AddAtomicMutex.h | Defines the lowering pass that inserts mutex allocation code and locks for the atomic nodes that require mutex locks |
| AddImageChecks.h | Defines the lowering pass that adds the assertions that validate input and output buffers |
| AddParameterChecks.h | Defines the lowering pass that adds the assertions that validate scalar parameters |
| AddSplitFactorChecks.h | Defines the lowering pass that adds the assertions that all split factors are strictly positive |
| AlignLoads.h | Defines a lowering pass that rewrites unaligned loads into sequences of aligned loads |
| AllocationBoundsInference.h | Defines the lowering pass that determines how large internal allocations should be |
| ApplySplit.h | Defines a method that returns a list of let stmts, substitutions, and predicates to be added given a split schedule |
| Argument.h | Defines a type used for expressing the type signature of a generated halide pipeline |
| AssociativeOpsTable.h | Tables listing associative operators and their identities |
| Associativity.h | Methods for extracting an associative operator from a Func's update definition, if there is one, and computing the identity of that operator |
| AsyncProducers.h | Defines the lowering pass that injects task parallelism for producers that are scheduled as async |
| AutoScheduleUtils.h | Defines utility functions used by the auto scheduler |
| BoundaryConditions.h | Support for imposing boundary conditions on Halide::Funcs |
| BoundConstantExtentLoops.h | Defines the lowering pass that enforces a constant extent on all vectorized or unrolled loops |
| Bounds.h | Methods for computing the upper and lower bounds of an expression, and the regions of a function read or written by a statement |
| BoundsInference.h | Defines the bounds_inference lowering pass |
| BoundSmallAllocations.h | Defines the lowering pass that attempts to rewrite small allocations to have constant size |
| Buffer.h | |
| Callable.h | Defines the front-end class representing a jitted, callable Halide pipeline |
| CanonicalizeGPUVars.h | Defines the lowering pass that canonicalizes the GPU var names |
| ClampUnsafeAccesses.h | Defines the clamp_unsafe_accesses lowering pass |
| Closure.h | Provides Closure class |
| CodeGen_C.h | Defines an IRPrinter that emits C++ code equivalent to a halide stmt |
| CodeGen_D3D12Compute_Dev.h | Defines the code-generator for producing D3D12-compatible HLSL kernel code |
| CodeGen_GPU_Dev.h | Defines the code-generator interface for producing GPU device code |
| CodeGen_Internal.h | Defines functionality that's useful to multiple target-specific CodeGen paths, but shouldn't live in CodeGen_LLVM.h (because that's the front-end-facing interface to CodeGen) |
| CodeGen_LLVM.h | Defines the base-class for all architecture-specific code generators that use llvm |
| CodeGen_Metal_Dev.h | Defines the code-generator for producing Apple Metal shading language kernel code |
| CodeGen_OpenCL_Dev.h | Defines the code-generator for producing OpenCL C kernel code |
| CodeGen_Posix.h | Defines a base-class for code-generators on POSIX-like CPU platforms |
| CodeGen_PTX_Dev.h | Defines the code-generator for producing PTX kernel code for CUDA devices |
| CodeGen_PyTorch.h | Defines an IRPrinter that emits C++ code that: |
| CodeGen_Targets.h | Provides constructors for code generators for various targets |
| CodeGen_Vulkan_Dev.h | Defines the code-generator for producing SPIR-V binary modules for use with the Vulkan runtime |
| CodeGen_WebGPU_Dev.h | Defines the code-generator for producing WebGPU shader code (WGSL) |
| CompilerLogger.h | Defines an interface used to gather and log compile-time information, stats, etc for use in evaluating internal Halide compilation rules and efficiency |
| ConciseCasts.h | Defines concise cast and saturating cast operators to make it easier to read cast-heavy code |
| ConstantBounds.h | Methods for computing compile-time constant int64_t upper and lower bounds of an expression |
| ConstantInterval.h | Defines the ConstantInterval class, and operators on it |
| CPlusPlusMangle.h | A simple function to get a C++ mangled function name for a function |
| CSE.h | Defines a pass for introducing let expressions to wrap common sub-expressions |
| Debug.h | Defines functions for debug logging during code generation |
| DebugArguments.h | Defines a lowering pass that injects debug statements inside a LoweredFunc |
| DebugToFile.h | Defines the lowering pass that injects code at the end of every realization to dump functions to a file for debugging |
| Definition.h | Defines the internal representation of a halide function's definition and related classes |
| Deinterleave.h | Defines methods for splitting up a vector into the even lanes and the odd lanes |
| Derivative.h | Automatic differentiation |
| DerivativeUtils.h | |
| Deserialization.h | |
| DeviceAPI.h | Defines DeviceAPI |
| DeviceArgument.h | Defines helpers for passing arguments to separate devices, such as GPUs |
| DeviceInterface.h | Methods for managing device allocations when jitting |
| Dimension.h | Defines the Dimension utility class for Halide pipelines |
| DistributeShifts.h | A tool to distribute shifts as multiplies, useful for some backends |
| EarlyFree.h | Defines the lowering pass that injects markers just after the last use of each buffer so that they can potentially be freed earlier |
| Elf.h | |
| EliminateBoolVectors.h | Method to eliminate vectors of booleans from IR |
| EmulateFloat16Math.h | Methods for dealing with float16 arithmetic using float32 math, by casting back and forth with bit tricks |
| Error.h | |
| Expr.h | Base classes for Halide expressions (Halide::Expr) and statements (Halide::Internal::Stmt) |
| ExprUsesVar.h | Defines a method to determine if an expression depends on some variables |
| Extern.h | Convenience macros that lift functions that take C types into functions that take and return exprs, and call the original function at runtime under the hood |
| ExternFuncArgument.h | Defines the internal representation of a halide ExternFuncArgument |
| ExtractTileOperations.h | Defines the lowering pass that injects calls to tile intrinsics that support AMX instructions |
| FastIntegerDivide.h | |
| FindCalls.h | Defines analyses to extract the functions called by a function |
| FindIntrinsics.h | Tools to replace common patterns with more readily recognizable intrinsics |
| FlattenNestedRamps.h | Defines the lowering pass that flattens nested ramps and broadcasts |
| Float16.h | |
| Func.h | Defines Func - the front-end handle on a halide function, and related classes |
| Function.h | Defines the internal representation of a halide function and related classes |
| FunctionPtr.h | |
| FuseGPUThreadLoops.h | Defines the lowering pass that fuses and normalizes loops over gpu threads to target CUDA, OpenCL, and Metal |
| FuzzFloatStores.h | Defines a lowering pass that messes with floating point stores |
| Generator.h | Generator is a class used to encapsulate the building of Funcs in user pipelines |
| HexagonAlignment.h | Class for analyzing Alignment of loads and stores for Hexagon |
| HexagonOffload.h | Defines a lowering pass to pull loops marked with the Hexagon device API to a separate module, and call them through the Hexagon host runtime module |
| HexagonOptimize.h | Tools for optimizing IR for Hexagon |
| ImageParam.h | Classes for declaring image parameters to halide pipelines |
| InferArguments.h | Interface for a visitor to infer arguments used in a body Stmt |
| InjectHostDevBufferCopies.h | Defines the lowering passes that deal with host and device buffer flow |
| Inline.h | Methods for replacing calls to functions with their definitions |
| InlineReductions.h | Defines some inline reductions: sum, product, minimum, maximum |
| IntegerDivisionTable.h | Tables telling us how to do integer division via fixed-point multiplication for various small constants |
| Interval.h | Defines the Interval class |
| IntrusivePtr.h | Support classes for reference-counting via intrusive shared pointers |
| IR.h | Subtypes for Halide expressions (Halide::Expr) and statements (Halide::Internal::Stmt) |
| IREquality.h | Methods to test Exprs and Stmts for equality of value |
| IRMatch.h | Defines a method to match a fragment of IR against a pattern containing wildcards |
| IRMutator.h | Defines a base class for passes over the IR that modify it |
| IROperator.h | Defines various operator overloads and utility functions that make it more pleasant to work with Halide expressions |
| IRPrinter.h | This header file defines operators that let you dump a Halide expression, statement, or type directly into an output stream in a human readable form |
| IRVisitor.h | Defines the base class for things that recursively walk over the IR |
| JITModule.h | Defines the struct representing lifetime and dependencies of a JIT compiled halide pipeline |
| Lambda.h | Convenience functions for creating small anonymous Halide functions |
| Lerp.h | Defines methods for converting a lerp intrinsic into Halide IR |
| LICM.h | Methods for lifting loop invariants out of inner loops |
| LLVM_Headers.h | |
| LLVM_Output.h | |
| LLVM_Runtime_Linker.h | Support for linking LLVM modules that comprise the runtime |
| LoopCarry.h | |
| LoopPartitioningDirective.h | Defines the Partition enum |
| Lower.h | Defines the function that generates a statement that computes a Halide function using its schedule |
| LowerParallelTasks.h | Support for platform independent lowering of Halide parallel and async mechanisms |
| LowerWarpShuffles.h | Defines the lowering pass that injects CUDA warp shuffle instructions to access storage outside of a GPULane loop |
| MainPage.h | This file only exists to contain the front-page of the documentation |
| Memoization.h | Defines the interface to the pass that injects support for compute_cached roots |
| Module.h | Defines Module, an IR container that fully describes a Halide program |
| ModulusRemainder.h | Routines for statically determining what expressions are divisible by |
| Monotonic.h | Methods for computing whether expressions are monotonic |
| ObjectInstanceRegistry.h | Provides a single global registry of Generators, GeneratorParams, and Params indexed by this pointer |
| OffloadGPULoops.h | Defines a lowering pass to pull loops marked with GPU device APIs to a separate module, and call them through the appropriate host runtime module |
| OptimizeShuffles.h | Defines a lowering pass that replaces indirect loads with dynamic_shuffle intrinsics where possible |
| OutputImageParam.h | Classes for declaring output image parameters to halide pipelines |
| ParallelRVar.h | Method for checking if it's safe to parallelize an update definition across a reduction variable |
| Param.h | Classes for declaring scalar parameters to halide pipelines |
| Parameter.h | Defines the internal representation of parameters to halide pipelines |
| PartitionLoops.h | Defines a lowering pass that partitions loop bodies into three to handle boundary conditions: a prologue, a simplified steady state, and an epilogue |
| Pipeline.h | Defines the front-end class representing an entire Halide imaging pipeline |
| Prefetch.h | Defines the lowering pass that injects prefetch calls when prefetching appears in the schedule |
| PrefetchDirective.h | Defines the PrefetchDirective struct |
| PrintLoopNest.h | Defines methods to print out the loop nest corresponding to a schedule |
| Profiling.h | Defines the lowering pass that injects print statements when profiling is turned on |
| PurifyIndexMath.h | Removes side-effects in integer math |
| PythonExtensionGen.h | |
| Qualify.h | Defines methods for prefixing names in an expression with a prefix string |
| Random.h | Defines deterministic random functions, and methods to redirect front-end calls to random_float and random_int to use them |
| RDom.h | Defines the front-end syntax for reduction domains and reduction variables |
| Realization.h | Defines Realization - a vector of Buffer for use in pipelines with multiple outputs |
| RealizationOrder.h | Defines the lowering pass that determines the order in which realizations are injected and groups functions with fused computation loops |
| RebaseLoopsToZero.h | Defines the lowering pass that rewrites loop mins to be 0 |
| Reduction.h | Defines internal classes related to Reduction Domains |
| RegionCosts.h | Defines RegionCosts - used by the auto scheduler to query the cost of computing some function regions |
| RemoveDeadAllocations.h | Defines the lowering pass that removes allocate and free nodes that are not used |
| RemoveExternLoops.h | Defines a lowering pass that removes placeholder loops for extern stages |
| RemoveUndef.h | Defines a lowering pass that elides stores that depend on uninitialized values |
| Schedule.h | Defines the internal representation of the schedule for a function |
| ScheduleFunctions.h | Defines the function that does initial lowering of Halide Functions into a loop nest using its schedule |
| Scope.h | Defines the Scope class, which is used for keeping track of names in a scope while traversing IR |
| SelectGPUAPI.h | Defines a lowering pass that selects which GPU API to use for each GPU for loop |
| Serialization.h | |
| Simplify.h | Methods for simplifying halide statements and expressions |
| Simplify_Internal.h | The simplifier is separated into multiple compilation units with this single shared header to speed up the build |
| SimplifyCorrelatedDifferences.h | Defines a simplification pass for handling differences of correlated expressions |
| SimplifySpecializations.h | Defines a pass that tries to simplify the RHS/LHS of a function's definition based on its specializations |
| SkipStages.h | Defines a pass that dynamically avoids realizing unnecessary stages |
| SlidingWindow.h | Defines the sliding_window lowering optimization pass, which avoids computing provably-already-computed values |
| Solve.h | |
| SpirvIR.h | Defines methods for constructing and encoding instructions into the Khronos format specification known as the Standard Portable Intermediate Representation for Vulkan (SPIR-V) |
| SplitTuples.h | Defines the lowering pass that breaks up Tuple-valued realizations and productions into several scalar-valued ones |
| StageStridedLoads.h | Defines the compiler pass that converts strided loads into dense loads followed by shuffles |
| StmtToHTML.h | Defines a function to dump an HTML-formatted visualization to a file |
| StorageFlattening.h | Defines the lowering pass that flattens multi-dimensional storage into single-dimensional array access |
| StorageFolding.h | Defines the lowering optimization pass that reduces large buffers down to smaller circular buffers when possible |
| StrictifyFloat.h | Defines a lowering pass to make all floating-point strict for all top-level Exprs |
| StripAsserts.h | Defines the lowering pass that strips asserts when NoAsserts is set |
| Substitute.h | Defines methods for substituting out variables in expressions and statements |
| Target.h | Defines the structure that describes a Halide target |
| TargetQueryOps.h | Defines a lowering pass to lower all target_is() and target_has() helpers |
| Tracing.h | Defines the lowering pass that injects print statements when tracing is turned on |
| TrimNoOps.h | Defines a lowering pass that truncates loops to the region over which they actually do something |
| Tuple.h | Defines Tuple - the front-end handle on small arrays of expressions |
| Type.h | Defines halide types |
| UnifyDuplicateLets.h | Defines the lowering pass that coalesces redundant let statements |
| UniquifyVariableNames.h | Defines the lowering pass that renames all variables to have unique names |
| UnpackBuffers.h | Defines the lowering pass that unpacks buffer arguments onto the symbol table |
| UnrollLoops.h | Defines the lowering pass that unrolls loops marked as such |
| UnsafePromises.h | Defines the lowering pass that removes unsafe promises |
| Util.h | Various utility functions used internally by Halide |
| Var.h | Defines Var - the front-end variable |
| VectorizeLoops.h | Defines the lowering pass that vectorizes loops marked as such |
| WasmExecutor.h | Support for running Halide-compiled Wasm code in-process |
| WrapCalls.h | Defines a pass to replace calls to wrapped Functions with their wrappers |
| ► test | |
| ► autoschedulers | |
| ► adams2019 | |
| included_schedule_file.schedule.h | |
| ► anderson2021 | |
| included_schedule_file.schedule.h | |
| ► common | |
| check_call_graphs.h | |
| gpu_context.h | |
| gpu_object_lifetime_tracker.h | |
| halide_test_dirs.h | |
| test_sharding.h | |
| ► correctness | |
| simd_op_check.h | |
| ► fuzz | |
| fuzz_helpers.h | |
| ► runtime | |
| common.h |