* Remove usage of Mono.Posix.NETStandard in Ryujinx project
* Remove usage of Mono.Posix.NETStandard in ARMeilleure project
* Remove usage of Mono.Posix.NETStandard in Ryujinx.Memory project
* Address gdkchan's comments
* infra: Migrate to .NET 6
* Rollback version naming change
* Work around .NET 6 ZipArchive API issues
* ci: Switch to VS 2022 for AppVeyor
CI is now ready for .NET 6
* Suppress WebClient warning in DoUpdateWithMultipleThreads
* Attempt to work around System.Drawing.Common changes on 6.0.0
* Change keyboard rendering from System.Drawing to ImageSharp
* Make the software keyboard renderer multithreaded
* Bump ImageSharp version to 1.0.4 to fix a bug in Image.Load
* Add fallback fonts to the keyboard renderer
* Fix warnings
* Address caian's comment
* Clean up Linux workaround as it's unneeded now
* Update readme
Co-authored-by: Caian Benedicto <caianbene@gmail.com>
* Add an early `TailMerge` pass
Some translations can have a lot of guest calls, and each guest call
has a call guard which may return. This can produce a lot of epilogue
code for returns. This pass merges the epilogues into a single
block.
```
Using filter 'hcq'.
Using metric 'code size'.
Total diff: -1648111 (-7.19 %) (bytes):
Base: 22913847
Diff: 21265736
Improved: 4567, regressed: 14, unchanged: 144
```
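A rough illustration of the idea, as a Python sketch rather than the
actual C# pass: blocks are modeled as lists of instruction strings, and
the `tail_merge` function and the `exit` label are hypothetical names
used only here. Every block that ends with the shared epilogue is
rewritten to jump to a single copy of it.

```python
def tail_merge(blocks, epilogue):
    """Replace every trailing copy of `epilogue` with a jump to one shared block.

    blocks:   dict mapping block name -> list of instruction strings
    epilogue: non-empty list of instruction strings ending in a return
    Assumes no existing block is already named "exit".
    """
    n = len(epilogue)
    returning = {name for name, code in blocks.items() if code[-n:] == epilogue}
    if len(returning) < 2:
        return {name: list(code) for name, code in blocks.items()}

    merged = {}
    for name, code in blocks.items():
        if name in returning:
            # Drop the duplicated epilogue and branch to the shared one.
            merged[name] = code[:-n] + ["jmp exit"]
        else:
            merged[name] = list(code)
    merged["exit"] = list(epilogue)
    return merged


epilogue = ["add rsp, 8", "pop rbx", "ret"]
blocks = {
    "a": ["call f"] + epilogue,
    "b": ["call g"] + epilogue,
    "c": ["call h"] + epilogue,
}
merged = tail_merge(blocks, epilogue)
```

With three returning blocks and a three-instruction epilogue, the
instruction count in this toy model drops from 12 to 9.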
* Set PTC version
* Address feedback
* Handle `void` returning functions
* Actually handle `void` returning functions
* Fix `RegisterToLocal` logging
* Optimize `TryAllocateRegWithoutSpill` a bit
* Add a fast path for when all registers are live.
* Do not query `GetOverlapPosition` if the register is already in use
(i.e. free position is 0).
* Do not allocate the child split list if the interval is not a parent
* Turn `LiveRange` into a reference struct
`LiveRange` is now a reference wrapping struct like `Operand` and
`Operation`.
It has also been changed into a singly linked list. In micro-benchmarks,
traversing the linked list was faster than binary search on `List<T>`,
surprisingly even for quite large input sizes (e.g. 1,000,000).
This could be because the code gen for traversing the linked list is
much cleaner and there is no virtual dispatch happening when checking
whether intervals overlap.
* Turn `LiveInterval` into an iterator
The LSRA allocates in forward order and never inspects previous
`LiveInterval`s once they have expired. Something similar can be done
for the `LiveRange`s within the `LiveInterval`s themselves.
The `LiveInterval` is turned into an iterator which expires the
`LiveRange`s within it. The iterator is moved forward along with the
interval walking code, i.e. AllocateInterval(context, interval, cIndex).
* Remove `LinearScanAllocator.Sources`
Local methods are less likely to cause allocations than lambdas.
* Optimize `GetOverlapPosition(interval)` a bit
Time complexity should be in O(n+m) instead of O(nm) now.
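One way to get O(n+m) here is a two-pointer walk over both sorted range
lists. The Python sketch below illustrates that technique only; it is
not the actual C# code. Modeling ranges as half-open `(start, end)`
tuples and returning `None` when nothing overlaps are assumptions.

```python
def get_overlap_position(a, b):
    """Return the first position where two sorted range lists overlap.

    a, b: lists of half-open (start, end) ranges, each sorted by start
    and non-overlapping within itself. Returns None if nothing overlaps.
    """
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i]
        b_start, b_end = b[j]
        if a_start < b_end and b_start < a_end:
            return max(a_start, b_start)
        # Advance whichever range ends first; it cannot overlap anything later.
        if a_end <= b_start:
            i += 1
        else:
            j += 1
    return None
```

Each iteration advances one of the two cursors, so the loop runs at
most n+m times instead of comparing every pair of ranges.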
* Optimize `NumberLocals` a bit
Use the same idea as in `HybridAllocator` to store the visited state
in the MSB of the Operand's value instead of using a `HashSet<T>`.
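The visited-bit trick can be sketched as follows. This is a Python
illustration, not the actual C# code: the `Local` class, the 64-bit
width and the `number_locals` function are assumptions made for the
example.

```python
VISITED = 1 << 63            # most significant bit of a 64-bit value
NUMBER_MASK = VISITED - 1    # low 63 bits hold the assigned number


class Local:
    """Stand-in for a local operand that carries a 64-bit value field."""

    def __init__(self):
        self.value = 0


def number_locals(uses):
    """Assign 1-based numbers to distinct locals in first-use order.

    The visited state lives in the MSB of `value`, so no separate set
    of already-seen locals is needed.
    """
    count = 0
    for local in uses:
        if local.value & VISITED:
            continue  # already numbered
        count += 1
        local.value = VISITED | count
    return count


a, b = Local(), Local()
total = number_locals([a, b, a, b, a])
```

The check for "already seen" becomes a single bit test on the operand
itself rather than a hash lookup.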
* Optimize `InsertSplitCopies` a bit
Avoid allocating a redundant `CopyResolver`.
* Optimize `InsertSplitCopiesAtEdges` a bit
Avoid redundant allocations of `CopyResolver`.
* Use stack allocation for `freePositions`
Avoid redundant computations.
* Add `UseList`
Replace `SortedIntegerList` with an even more specialized data
structure. It allocates memory on the arena allocators and does not
require copying use positions when splitting it.
* Turn `LiveInterval` into a reference struct
`LiveInterval` is now a reference wrapping struct like `Operand` and
`Operation`.
The rationale for turning this into a reference wrapping struct is
that a `LiveInterval` is associated with each local variable, and
these intervals may themselves be split further. I've seen translations
having up to 8000 local variables.
To make the `LiveInterval` unmanaged, a new data structure called
`LiveIntervalList` was added to store child splits. This differs from
`SortedList<,>` because it can contain intervals with the same start
position.
Really wish we had something more like C++ templates in C#. :^(
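The duplicate-key point can be illustrated with a small sorted
container that keeps intervals ordered by start position while allowing
equal starts (`SortedList<,>` rejects a key that is already present).
This is a Python sketch with hypothetical method names, not the actual
unmanaged C# structure.

```python
import bisect


class LiveIntervalList:
    """Keeps intervals sorted by start position; equal starts are allowed."""

    def __init__(self):
        self._starts = []     # sorted start positions
        self._intervals = []  # intervals, parallel to _starts

    def add(self, start, interval):
        # bisect_right places a new entry after existing entries with the
        # same start, so insertion order is preserved among duplicates.
        index = bisect.bisect_right(self._starts, start)
        self._starts.insert(index, start)
        self._intervals.insert(index, interval)

    def __iter__(self):
        return iter(self._intervals)

    def __len__(self):
        return len(self._intervals)


splits = LiveIntervalList()
splits.add(4, "b")
splits.add(0, "a")
splits.add(4, "c")  # same start position as "b", still accepted
```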
* Optimize `GetChildSplit` a bit
No need to inspect the remaining ranges if we've reached a range which
starts after position, since the split list is ordered.
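The early exit can be sketched like this. It is a Python illustration
that assumes child splits are half-open `(start, end)` tuples sorted by
start; the real code works on `LiveInterval`s.

```python
def get_child_split(splits, position):
    """Return the split covering `position`, or None if there is none.

    splits: list of half-open (start, end) ranges sorted by start.
    """
    for start, end in splits:
        if start > position:
            # Every remaining split starts even later, so stop here.
            break
        if position < end:
            return (start, end)
    return None
```

For example, looking up position 12 in `[(0, 4), (4, 10), (20, 30)]`
stops as soon as the split starting at 20 is reached.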
* Optimize `CopyResolver` a bit
Lazily allocate the fill, spill and parallel copy structures since most
of the time only one of them is needed.
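A minimal sketch of the lazy allocation pattern, in Python and with
hypothetical member names (the real `CopyResolver` is C# and differs in
detail): each backing list is only created the first time it is used.

```python
class CopyResolver:
    """Defers creating the fill, spill and parallel copy lists until needed."""

    def __init__(self):
        self._fills = None
        self._spills = None
        self._parallel_copies = None

    def add_fill(self, register, stack_offset):
        if self._fills is None:
            self._fills = []
        self._fills.append((register, stack_offset))

    def add_spill(self, register, stack_offset):
        if self._spills is None:
            self._spills = []
        self._spills.append((register, stack_offset))

    def add_copy(self, dest, source):
        if self._parallel_copies is None:
            self._parallel_copies = []
        self._parallel_copies.append((dest, source))

    @property
    def has_data(self):
        return any(
            items is not None
            for items in (self._fills, self._spills, self._parallel_copies)
        )


resolver = CopyResolver()
resolver.add_spill("r1", 8)  # only the spill list gets allocated
```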
* Optimize `BitMap.Enumerator` a bit
Marking `MoveNext` as `AggressiveInlining` allows RyuJIT to promote the
`Enumerator` struct into registers completely, reducing load/store code
a lot since it does not have to store the struct on the stack for ABI
purposes.
* Use stack allocation for `use/blockedPositions`
* Optimize `AllocateWithSpill` a bit
* Address feedback
* Make `LiveInterval.AddRange(,)` more conservative
Produces no diff against master, but just for good measure.
* Add `Operand.Label` support to `Assembler`
This adds label support to `Assembler` and enables branch tightening
when compiling with relocatables. Jump management and patching have been
moved to the `Assembler`.
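To illustrate what label support with jump patching looks like, here is
a heavily simplified Python sketch. It is not the actual `Assembler`
API; the class and method names are hypothetical. It only covers x86
`jmp`, patches forward jumps once the label is known, and picks the
short `rel8` form only for backward jumps, which is a much weaker form
of tightening than a real assembler would do.

```python
import struct


class Assembler:
    def __init__(self):
        self.code = bytearray()
        self.labels = {}   # label name -> code offset
        self.pending = []  # (offset of rel32 field, label name)

    def mark_label(self, name):
        self.labels[name] = len(self.code)

    def nop(self):
        self.code.append(0x90)

    def jmp(self, name):
        if name in self.labels:
            # Backward jump: the target is known, so try the 2-byte form.
            rel8 = self.labels[name] - (len(self.code) + 2)
            if rel8 >= -128:
                self.code += bytes([0xEB, rel8 & 0xFF])
                return
            rel32 = self.labels[name] - (len(self.code) + 5)
            self.code += b"\xE9" + struct.pack("<i", rel32)
        else:
            # Forward jump: emit the 5-byte form and patch it in finish().
            self.code.append(0xE9)
            self.pending.append((len(self.code), name))
            self.code += b"\x00\x00\x00\x00"

    def finish(self):
        for offset, name in self.pending:
            # rel32 is relative to the end of the 4-byte displacement.
            rel32 = self.labels[name] - (offset + 4)
            struct.pack_into("<i", self.code, offset, rel32)
        return bytes(self.code)


asm = Assembler()
asm.mark_label("top")
asm.nop()
asm.jmp("end")   # forward: near jump, patched later
asm.jmp("top")   # backward: fits in a short jump
asm.mark_label("end")
code = asm.finish()
```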
* Move instruction table to `Assembler.Table`
* Set PTC internal version
* Rename `Assembler.Table` to `AssemblerTable`