Commit graph

254 commits

Author SHA1 Message Date
riperiperi 12a7a2ead8
Inherit buffer tracking handles rather than recreating on resize (#2330)
This greatly speeds up games that constantly resize buffers, and removes stuttering on games that resize large buffers occasionally:

- Large improvement on Super Mario 3D All-Stars (#1663 needed for best performance)
- Improvement to Hyrule Warriors: AoC, and UE4 games. These games can still stutter due to texture creation/loading.
- Small improvement to other games, potential 1-frame stutters avoided.

`ForceSynchronizeMemory`, which was added with POWER, is no longer needed. Some tests have been added for the MultiRegionHandle.
2021-06-24 01:31:26 +02:00
gdkchan c71ae9c85c
Fix shader texture LOD query (#2397) 2021-06-23 23:31:14 +02:00
gdkchan 49edf14a3e
Pass all inputs when geometry shader passthrough is enabled (#2362)
* Pass all inputs when geometry shader passthrough is enabled

* Shader cache version bump
2021-06-23 23:04:59 +02:00
gdkchan 65fee49e8a
Fix separate bindless sampler at offset 0 (#2360) 2021-06-20 20:48:12 +02:00
riperiperi 7ff1f9aa12
End shader decoding when reaching a block that starts with an infinite loop (after BRX) (#2367)
* End shader decoding when reaching an infinite loop

The NV shader compiler puts these at the end of shaders.

* Update shader cache version
2021-06-15 02:09:59 +02:00
Mary afd68d4c6c
GAL: Fix sampler leaks on exit (#2353)
Before this, all samplers instance were leaking on exit because the
dispose method was never getting called.

This fix this issue by making TextureBindingsManager disposable and
calling the dispose method in the TextureManager.
2021-06-09 01:00:28 +02:00
Mary 60cf3dfebc
Do not clear gpu subchannel state on BindChannel (#2348)
This fixes a regression caused by #980, that was causing a crash on New
Super Lucky's Tale.

As always, this need feedback on possible regression on any games.

Fix #2343.
2021-06-09 00:50:18 +02:00
gdkchan 02e2e561ac
Support bindless textures with separate constant buffers for texture and sampler (#2339) 2021-06-09 00:42:25 +02:00
gdkchan 3b90adcd1d
Fix shaders with mixed PBK and SSY addresses on the stack (#2329)
* Fix shaders with mixed PBK and SSY addresses on the stack

* Address PR feedback and nits
2021-06-03 01:41:53 +02:00
gdkchan b84ba43406
Fix texture blit off-by-one errors (#2335) 2021-06-03 01:30:48 +02:00
gdkchan 79b3243f54
Do not attempt to normalize SNORM image buffers on shaders (#2317)
* Do not attempt to normalize SNORM image buffers on shaders

* Shader cache version bump
2021-05-31 21:59:23 +02:00
riperiperi 54ea2285f0
POWER - Performance Optimizations With Extensive Ramifications (#2286)
* Refactoring of KMemoryManager class

* Replace some trivial uses of DRAM address with VA

* Get rid of GetDramAddressFromVa

* Abstracting more operations on derived page table class

* Run auto-format on KPageTableBase

* Managed to make TryConvertVaToPa private, few uses remains now

* Implement guest physical pages ref counting, remove manual freeing

* Make DoMmuOperation private and call new abstract methods only from the base class

* Pass pages count rather than size on Map/UnmapMemory

* Change memory managers to take host pointers

* Fix a guest memory leak and simplify KPageTable

* Expose new methods for host range query and mapping

* Some refactoring of MapPagesFromClientProcess to allow proper page ref counting and mapping without KPageLists

* Remove more uses of AddVaRangeToPageList, now only one remains (shared memory page checking)

* Add a SharedMemoryStorage class, will be useful for host mapping

* Sayonara AddVaRangeToPageList, you served us well

* Start to implement host memory mapping (WIP)

* Support memory tracking through host exception handling

* Fix some access violations from HLE service guest memory access and CPU

* Fix memory tracking

* Fix mapping list bugs, including a race and a error adding mapping ranges

* Simple page table for memory tracking

* Simple "volatile" region handle mode

* Update UBOs directly (experimental, rough)

* Fix the overlap check

* Only set non-modified buffers as volatile

* Fix some memory tracking issues

* Fix possible race in MapBufferFromClientProcess (block list updates were not locked)

* Write uniform update to memory immediately, only defer the buffer set.

* Fix some memory tracking issues

* Pass correct pages count on shared memory unmap

* Armeilleure Signal Handler v1 + Unix changes

Unix currently behaves like windows, rather than remapping physical

* Actually check if the host platform is unix

* Fix decommit on linux.

* Implement windows 10 placeholder shared memory, fix a buffer issue.

* Make PTC version something that will never match with master

* Remove testing variable for block count

* Add reference count for memory manager, fix dispose

Can still deadlock with OpenAL

* Add address validation, use page table for mapped check, add docs

Might clean up the page table traversing routines.

* Implement batched mapping/tracking.

* Move documentation, fix tests.

* Cleanup uniform buffer update stuff.

* Remove unnecessary assignment.

* Add unsafe host mapped memory switch

On by default. Would be good to turn this off for untrusted code (homebrew, exefs mods) and give the user the option to turn it on manually, though that requires some UI work.

* Remove C# exception handlers

They have issues due to current .NET limitations, so the meilleure one fully replaces them for now.

* Fix MapPhysicalMemory on the software MemoryManager.

* Null check for GetHostAddress, docs

* Add configuration for setting memory manager mode (not in UI yet)

* Add config to UI

* Fix type mismatch on Unix signal handler code emit

* Fix 6GB DRAM mode.

The size can be greater than `uint.MaxValue` when the DRAM is >4GB.

* Address some feedback.

* More detailed error if backing memory cannot be mapped.

* SetLastError on all OS functions for consistency

* Force pages dirty with UBO update instead of setting them directly.

Seems to be much faster across a few games. Need retesting.

* Rebase, configuration rework, fix mem tracking regression

* Fix race in FreePages

* Set memory managers null after decrementing ref count

* Remove readonly keyword, as this is now modified.

* Use a local variable for the signal handler rather than a register.

* Fix bug with buffer resize, and index/uniform buffer binding.

Should fix flickering in games.

* Add InvalidAccessHandler to MemoryTracking

Doesn't do anything yet

* Call invalid access handler on unmapped read/write.

Same rules as the regular memory manager.

* Make unsafe mapped memory its own MemoryManagerType

* Move FlushUboDirty into UpdateState.

* Buffer dirty cache, rather than ubo cache

Much cleaner, may be reusable for Inline2Memory updates.

* This doesn't return anything anymore.

* Add sigaction remove methods, correct a few function signatures.

* Return empty list of physical regions for size 0.

* Also on AddressSpaceManager

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-05-24 22:52:44 +02:00
riperiperi 79092310fa
Compare aligned size for largest mip level when considering sampler resize (#2306)
* Compare aligned size for largest mip level when considering sampler resize

When selecting a texture that's a view for a sampler resize, we should take care that resizing it doesn't change the aligned size of any larger mip levels.

This PR covers two cases:
- When creating a view of the texture, we check that the aligned size of the view shifted up to level 0 still matches the aligned size of the container. If it does not, a copy dependency is created rather than resizing.
- When searching for a texture for sampler, textures that do _not_ match our aligned size when both are shifted up by its base level are not considered an exact match, as resizing the found texture will cause the mip 0 aligned size to change. It will create a copy dependency view instead.

Fixes graphical errors and crashes (on flush) in various Unity games that use render-to-texture.

* Move shared code to its own method.
2021-05-24 17:35:26 +10:00
gdkchan e9c15d32cb
Use a different method for out of bounds blit (#2302)
* Use a different method for out of bounds blit

* This is not needed
2021-05-22 01:26:49 +02:00
gdkchan a03bbef4d6
Add another Depth32F texture format (#2304) 2021-05-22 01:15:08 +02:00
gdkchan f0add129e2
Fix non-independent blend state not being updated (#2303)
* Fix non-independent blend state not being updated

* Actually, this is not needed
2021-05-22 01:08:00 +02:00
riperiperi 5271cfe70b
Fix dimensions check for scale eligibility (#2301) 2021-05-21 01:09:18 +02:00
gdkchan 12533e5c9d
Fix buffer and texture uses not being propagated for vertex A/B shaders (#2300)
* Fix buffer and texture uses not being propagated for vertex A/B shaders

* Shader cache version bump
2021-05-20 21:43:23 +02:00
gdkchan b34c0a47b4
Fix constant buffer array size when indexing is used and other buffer descriptor and resolution scale regressions (#2298)
* Fix constant buffer array size when indexing is used

* Change default QueryConstantBufferUse value

* Fix more regressions

* Ensure proper order
2021-05-20 15:12:15 -03:00
gdkchan 49745cfa37
Move shader resource descriptor creation out of the backend (#2290)
* Move shader resource descriptor creation out of the backend

* Remove now unused code, and other nits

* Shader cache version bump

* Nits

* Set format for bindless image load/store

* Fix buffer write flag
2021-05-19 23:15:26 +02:00
EmulationFanatic b5c72b44de
Merge pull request #2177 from riperiperi/feature/parallel-shader-cache
Allow parallel shader compilation when loading a shader cache
2021-05-19 11:39:19 -07:00
riperiperi 0129250c2e
Pass CbufSlot when getting info from the texture descriptor (#2291)
* Pass CbufSlot when getting info from the texture descriptor

Fixes some issues with bindless textures, when CbufSlot is not equal to the current TextureBufferIndex.

Specifically fixes a random chance of full screen colour flickering in Super Mario Party.

* Apply suggestions from code review

Oops

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2021-05-19 20:05:43 +02:00
riperiperi 212e472c9f
Use copy dependencies for the Intel/AMD view format workaround (#2144)
* This might help AMD a bit

* Removal of old workaround.
2021-05-16 20:43:27 +02:00
Mary 7aed7808be
misc: Fix default value for GraphicsConfig.MaxAnisotropy (#2274)
As title say.
Doesn't change anything as the Ryujinx project set it.
2021-05-07 13:18:23 -03:00
gdkchan 0e9823d7e6
Fix shader buffer write flag on atomic instructions (#2261)
* Fix shader buffer write flag on atomic instructions

* Shader cache version bump
2021-05-01 20:46:21 +02:00
gdkchan 4770cfa920
Only enable clip distance if written to on shader (#2217)
* Only enable clip distance if written to on shader

* Signal InstanceId use through FeatureFlags

* Shader cache version bump
2021-04-20 12:33:54 +02:00
riperiperi 9e68f5026e Fix skipping missing shaders 2021-04-18 17:34:01 +01:00
riperiperi b1c3e01691 Nit 2021-04-18 17:34:00 +01:00
riperiperi 35eac315ab The task isn't required for loading compute binary. 2021-04-18 17:33:59 +01:00
riperiperi a0aa09912c Use event to wake the main thread on task completion 2021-04-18 17:33:59 +01:00
riperiperi 20d560e3f9 The new host program needs to be saved even if it isn't valid. 2021-04-18 17:33:59 +01:00
riperiperi ddf4b92a9c Implement parallel host shader cache compilation. 2021-04-18 17:33:58 +01:00
gdkchan 40e276c9b5
Improve shader global memory to storage pass (#2200)
* Improve shader global memory to storage pass

* Formatting and more comments

* Shader cache version bump
2021-04-18 12:31:39 +02:00
gdkchan f665e1b409
Hold reference for render targets in use (#2156) 2021-04-02 16:33:39 +02:00
gdkchan 524fe3bea4
Implement shader HelperThreadNV (#2163)
* Implement shader HelperThreadNV

* Bump shader cache version

* Use gl_HelperInvocation since its supported across all vendors

* Nit
2021-04-02 21:50:35 +11:00
gdkchan a0b4799f19
Fix ZN flags set for shader instructions using RZ.CC dest (#2147)
* Fix ZN flags set for shader instructions using RZ.CC dest

* Shader cache version bump and nits
2021-03-27 22:59:05 +01:00
mageven a5d5ca0635
Shader Cache: Move bindless checking from translation to decode (#2145) 2021-03-27 00:50:26 +01:00
mageven 69f8722e79
Fix inconsistencies in progress reporting (#2129)
Signal and setup events correctly
Eliminate possible races
Use a single event
Mark volatiles and reduce scope of waithandles
Common handler
100ms -> 50ms
2021-03-22 19:40:07 +01:00
Mary aef25980a7
Salieri: Detect and avoid caching shaders using bindless textures (#2097)
* Salieri: Add blacklist system and blacklist shaders using bindless

Currently the shader cache doesn't have the right format to support
bindless textures correctly and may cache shaders that it cannot rebuild
after host invalidation.

This PR address the issue by blacklisting shaders using bindless
textures.

THis also support detection of already cached broken shader and handle removal
of those.

* Move to a feature flags design to avoid intrusive changes in the translator

This remove the auto correct behaviour

* Reduce diff on TranslationFlags

* Reduce comma on last entry of TranslationFlags

* Fix inverted logic and remove leftovers

* remove debug edits oops
2021-03-19 20:07:37 +01:00
riperiperi 9b7335a63b
Improve linear texture compatibility rules (#2099)
* Improve linear texture compatibility rules

Fixes an issue where small or width-aligned (rather than byte aligned) textures would fail to create a view of existing data. Creates a copy dependency as size change may be risky.

* Minor cleanup

* Remove Size Change for Copy Depenedencies

The copy to the target (potentially different sized) texture can properly deal with cropping by itself.

* Move StrideAlignment and GobAlignment into Constants
2021-03-19 02:17:38 +01:00
riperiperi 1623ab524f
Improve Buffer Textures and flush Image Stores (#2088)
* Improve Buffer Textures and flush Image Stores

Fixes a number of issues with buffer textures:

- Reworked Buffer Textures to create their buffers in the TextureManager, then bind them with the BufferManager later.
  - Fixes an issue where a buffer texture's buffer could be invalidated after it is bound, but before use.
- Fixed width unpacking for large buffer textures. The width is now 32-bit rather than 16.
- Force buffer textures to be rebound whenever any buffer is created, as using the handle id wasn't reliable, and the cost of binding isn't too high.

Fixes vertex explosions and flickering animations in UE4 games.

* Set ImageStore flag... for ImageStore.

* Check the offset and size.
2021-03-08 18:43:39 -03:00
riperiperi 8d36681bf1
Improve handling for unmapped GPU resources (#2083)
* Improve handling for unmapped GPU resources

- Fixed a memory tracking bug that would set protection on empty PTEs
- When a texture's memory is (partially) unmapped, all pool references are forcibly removed and the texture must be rediscovered to draw with it. This will also force the texture discovery to always compare the texture's range for a match.
- RegionHandles now know if they are unmapped, and automatically unset their dirty flag when unmapped.
- Partial texture sync now loads only the region of texture that has been modified. Unmapped memory tracking handles cause dirty flags for a texture group handle to be ignored.

This greatly improves the emulator's stability for newer UE4 games.

* Address feedback, fix MultiRange slice

Fixed an issue where the size of the multi-range slice would be miscalculated.

* Update Ryujinx.Memory/Range/MultiRange.cs (feedback)

Co-authored-by: Mary <thog@protonmail.com>

Co-authored-by: Mary <thog@protonmail.com>
2021-03-06 11:43:55 -03:00
mageven ca5d8e58dd
Add progress reporting to PTC and Shader Cache (#2057)
* UI changes

* Add progress reporting to PTC & ShaderCache

* Account for null events and expand docs

Co-authored-by: Joshi234 <46032261+Joshi234@users.noreply.github.com>
2021-03-03 01:39:36 +01:00
riperiperi b530f0e110
Texture Cache: "Texture Groups" and "Texture Dependencies" (#2001)
* Initial implementation (3d tex mips broken)

This works rather well for most games, just need to fix 3d texture mips.

* Cleanup

* Address feedback

* Copy Dependencies and various other fixes

* Fix layer/level offset for copy from view<->view.

* Remove dirty flag from dependency

The dirty flag behaviour is not needed - DeferredCopy is all we need.

* Fix tracking mip slices.

* Propagate granularity (fix astral chain)

* Address Feedback pt 1

* Save slice sizes as part of SizeInfo

* Fix nits

* Fix disposing multiple dependencies causing a crash

This list is obviously modified when removing dependencies, so create a copy of it.
2021-03-02 19:30:54 -03:00
gdkchan 4047477866
Simplify handling of shader vertex A (#1999)
* Simplify handling of shader vertex A

* Theres no transformation feedback, its transform

* Merge TextureHandlesForCache
2021-02-08 10:42:17 +11:00
gdkchan 7016d95eb1
Implement ETC2 (RGB) texture format (#2000)
* Implement ETC2 format

* Fix component counts for compressed formats
2021-02-08 10:23:56 +11:00
gdkchan 67033ed8e0
Do not flush multisample textures (#1973) 2021-02-01 08:30:16 +01:00
gdkchan f93089a64f
Implement geometry shader passthrough (#1961)
* Implement geometry shader passthrough

* Cache version change
2021-01-29 14:38:51 +11:00
riperiperi c30504e3b3
Use a descriptor cache for faster pool invalidation. (#1977)
* Use a descriptor cache for faster pool invalidation.

* Speed up comparison by casting to Vector256

Now we never need to worry about this ever again
2021-01-29 14:19:06 +11:00
gdkchan 4b7c7dab9e
Support multiple destination operands on shader IR and shuffle predicates (#1964)
* Support multiple destination operands on shader IR and shuffle predicates

* Cache version change
2021-01-28 10:59:47 +11:00