Fixes a bunch of freezes with SPIR-V on AMD hardware, and validation errors. Note: Obviously assumes input offsets are constant, which they currently are.
Rather than allocating a large array of all registers for each block in the shader, allocate one array of all registers and clear it between blocks. Reduces allocations in the shader translator.
- Pools for Instructions and LiteralIntegers. Can be passed in when creating the generator module.
- NewInstruction is called instead of new Instruction()
- Ryujinx SpirvGenerator passes in some pools that are static. The idea is for these to be shared between threads eventually.
- Estimate code size when creating the output MemoryStream
- LiteralInteger pools using ThreadStatic pools that are initialized before and after creation... not sure of a better way since the way these are created is via implicit cast.
Also, cache delegates for Spv.Generator for functions that are passed around to GenerateBinary etc, since passing the function raw creates a delegate on each call.
TODO: update python spv cs generator to make the coregrammar with NewInstruction and the `params` overloads.
- Dictionary for lookups of type declarations, constants, extinst
- LiteralInteger internal data format -> ushort
- Deterministic HashCode implementation to avoid spirv result not being the same between runs
- Inline operand list instead of List<T>, falls back to array if many operands. (large performance boost)
TODO: improve instruction allocation, structured program creator, ssa?
Passes BinaryWriter instead of the stream to Write and WriteOperand
- Removes creation of BinaryWriter for each instruction
- Removes allocations for literal string
May avoid issues with drivers with NVIDIA on linux/older gpus on windows when using large buffers (?)
Also some performance things and fixes issues with opengl games loading textures weird.
Write->Read barrier for src image (we want to wait for a write to read it)
Write->Read barrier for dst image (we want to wait for the copy to complete before use)