* Improve Buffer Textures and flush Image Stores
Fixes a number of issues with buffer textures:
- Reworked Buffer Textures to create their buffers in the TextureManager, then bind them with the BufferManager later.
- Fixes an issue where a buffer texture's buffer could be invalidated after it is bound, but before use.
- Fixed width unpacking for large buffer textures. The width is now 32-bit rather than 16-bit.
- Force buffer textures to be rebound whenever any buffer is created, as using the handle id wasn't reliable, and the cost of binding isn't too high.
Fixes vertex explosions and flickering animations in UE4 games.
* Set ImageStore flag... for ImageStore.
* Check the offset and size.
* Improve handling for unmapped GPU resources
- Fixed a memory tracking bug that would set protection on empty PTEs
- When a texture's memory is (partially) unmapped, all pool references are forcibly removed and the texture must be rediscovered to draw with it. This will also force the texture discovery to always compare the texture's range for a match.
- RegionHandles now know if they are unmapped, and automatically unset their dirty flag when unmapped.
- Partial texture sync now loads only the region of texture that has been modified. Unmapped memory tracking handles cause dirty flags for a texture group handle to be ignored.
This greatly improves the emulator's stability for newer UE4 games.
* Address feedback, fix MultiRange slice
Fixed an issue where the size of the multi-range slice would be miscalculated.
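A minimal sketch of slicing a multi-range, in Python for brevity rather than the project's C#. It assumes a multi-range is a list of (address, size) sub-ranges laid end to end; `slice_multi_range` is a hypothetical name and not the MultiRange API. The sizes of the returned pieces always add up to the requested size, which is the property the fix is about.

```python
def slice_multi_range(ranges, offset, size):
    """Return the sub-ranges covering [offset, offset + size) of the
    flattened space formed by laying `ranges` end to end.

    `ranges` is a list of (address, size) tuples (assumed layout).
    """
    if offset < 0 or size < 0:
        raise ValueError("offset and size must be non-negative")

    end = offset + size
    result = []
    pos = 0  # start of the current sub-range in the flattened space

    for address, length in ranges:
        # Clamp the requested window to the current sub-range.
        start_in = max(offset, pos)
        end_in = min(end, pos + length)
        if start_in < end_in:
            result.append((address + (start_in - pos), end_in - start_in))
        pos += length

    return result
```

For example, a 0x100-byte slice starting 0x80 bytes into two sub-ranges takes the tail of the first and the head of the second.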
* Update Ryujinx.Memory/Range/MultiRange.cs (feedback)
Co-authored-by: Mary <thog@protonmail.com>
* Initial implementation (3d tex mips broken)
This works rather well for most games, just need to fix 3d texture mips.
* Cleanup
* Address feedback
* Copy Dependencies and various other fixes
* Fix layer/level offset for copy from view<->view.
* Remove dirty flag from dependency
The dirty flag behaviour is not needed - DeferredCopy is all we need.
* Fix tracking mip slices.
* Propagate granularity (fix astral chain)
* Address Feedback pt 1
* Save slice sizes as part of SizeInfo
* Fix nits
* Fix disposing multiple dependencies causing a crash
This list is obviously modified when removing dependencies, so create a copy of it.
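A minimal sketch of the copy-before-iterating pattern, in Python for brevity rather than the project's C#. The `Dependency` class and `dispose_all` function are hypothetical. In C#, removing from a List while enumerating it with foreach throws InvalidOperationException; in Python the same mistake silently skips elements, so the copy is needed either way.

```python
class Dependency:
    """Hypothetical dependency that unregisters itself on dispose."""

    def __init__(self, owner_list, name):
        self.name = name
        self._owner_list = owner_list
        owner_list.append(self)

    def dispose(self):
        # Disposing mutates the list that the caller is walking.
        self._owner_list.remove(self)


def dispose_all(dependencies):
    # Iterate over a snapshot so removals do not disturb the iteration.
    for dep in list(dependencies):
        dep.dispose()
```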
* Initial implementation of buffer flush (VERY WIP)
* Host shaders need to be rebuilt for the SSBO write flag.
* New approach with reserved regions and gl sync
* Fix a ton of buffer issues.
* Remove unused buffer unmapped behaviour
* Revert "Remove unused buffer unmapped behaviour"
This reverts commit f1700e52fb8760180ac5e0987a07d409d1e70ece.
* Delete modified ranges on unmap
Fixes potential crashes in Super Smash Bros, where a previously modified range could lie on either side of an unmap.
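A minimal sketch of dropping modified ranges on unmap, in Python for brevity rather than the project's C#. It assumes modified ranges are (address, size) tuples and that a range straddling the unmapped region should be clipped to the parts on either side; the actual change may simply delete overlapping ranges. `remove_unmapped` is a hypothetical name.

```python
def remove_unmapped(modified, unmap_address, unmap_size):
    """Return `modified` with everything inside the unmapped region removed.

    A range that straddles the region is split into the pieces that
    remain on either side of it (assumed behaviour).
    """
    unmap_end = unmap_address + unmap_size
    kept = []

    for address, size in modified:
        end = address + size

        # No overlap with the unmapped region: keep as-is.
        if end <= unmap_address or address >= unmap_end:
            kept.append((address, size))
            continue

        # Keep the part before the unmapped region, if any.
        if address < unmap_address:
            kept.append((address, unmap_address - address))

        # Keep the part after the unmapped region, if any.
        if end > unmap_end:
            kept.append((unmap_end, end - unmap_end))

    return kept
```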
* Cache some more delegates.
* Dispose Sync on Close
* Also create host sync for GPFifo syncpoint increment.
* Copy buffer optimization, add docs
* Fix race condition with OpenGL Sync
* Enable read tracking on CommandBuffer, insert syncpoint on WaitForIdle
* Performance: Only flush individual pages of SSBO at a time
This avoids flushing large amounts of data when only a small amount is actually used.
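A minimal sketch of page-granular flushing, in Python for brevity rather than the project's C#. It assumes a 4 KiB page size; `pages_to_flush` and the align helpers are hypothetical names. Only the pages touched by an access are returned, instead of the whole buffer.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages
PAGE_MASK = PAGE_SIZE - 1


def page_align_down(address):
    return address & ~PAGE_MASK


def page_align_up(address):
    return (address + PAGE_MASK) & ~PAGE_MASK


def pages_to_flush(address, size):
    """Return the start address of every page touched by the access."""
    if size <= 0:
        return []
    start = page_align_down(address)
    end = page_align_up(address + size)
    return list(range(start, end, PAGE_SIZE))
```

A 0x200-byte access that crosses a page boundary flushes two pages, however large the containing buffer is.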
* Signal Modified rather than flushing after clear
* Fix some docs and code style.
* Introduce a new test for tracking memory protection.
Successfully demonstrates that the bug causing write protection to be cleared by a read action has been fixed. (these tests fail on master)
* Address Comments
* Add host sync for SetReference
This ensures that any indirect draws will correctly flush any related buffer data written before them. Fixes some flashing and misplaced world geometry in MH rise.
* Make PageAlign static
* Re-enable read tracking, for reads.
* Support for resources on non-contiguous GPU memory regions
* Implement MultiRange physical addresses, only used with a single range for now
* Actually use non-contiguous ranges
* GetPhysicalRegions fixes
* Documentation and remove Address property from TextureInfo
* Finish implementing GetWritableRegion
* Fix typo
* Implement shader CC mode for ISCADD, X mode for ISETP and fix STS/STG with RZ
* Fix STG too and bump shader cache version
* Fix wrong name
* Fix Carry being inverted on comparison
* Interrupt GPU command processing when a frame's fence is reached.
* Accumulate times rather than %s
* Accurate timer for vsync
Spin-wait for the last 0.667 ms of a frame. Avoids issues caused by signalling 16 ms vsync. (periodic stutters in SMO)
* Use event wait for better timing.
* Fix lazy wait
Windows doesn't seem to want to do 1ms consistently, so force a spin if we're less than 2ms.
* A bit more efficiency on frame waits.
Should now sometimes wait only the remaining 0.6667 ms instead of 1.6667 ms (odd waits above 1 ms are reliable, unlike 1 ms waits)
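A minimal sketch of the sleep-then-spin idea from the frame timing commits above, in Python for brevity rather than the project's C#. The 2 ms spin threshold comes from the "Fix lazy wait" note; `wait_until` and its injectable `clock`/`sleep` parameters are hypothetical and only there to make the sketch testable.

```python
import time

SPIN_THRESHOLD = 0.002  # seconds; below this, sleeping is assumed unreliable


def wait_until(deadline, clock=time.perf_counter, sleep=time.sleep):
    """Block until clock() reaches `deadline` (in seconds)."""
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return
        if remaining > SPIN_THRESHOLD:
            # Coarse wait: leave the last SPIN_THRESHOLD for spinning.
            sleep(remaining - SPIN_THRESHOLD)
        # Otherwise fall through and poll the clock again (busy spin).
```

Usage: `wait_until(time.perf_counter() + 1 / 60)` waits for roughly one 60 Hz frame, sleeping for most of it and spinning at the end.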
* Better swap interval 0 solution
737 fps without breaking a sweat. Downside: Vsync can no longer be disabled on games that use the event heavily (link's awakening - which is ok since it breaks anyways)
* Fix comment.
* Address Comments.
* Implement TreeMap from scratch.
Begin implementation of MemoryBlockManager
* Implement GetFreePosition using MemoryBlocks
* Implementation of Memory Management using a Tree.
Still some issues to work around, but promising thus far.
* Resolved invalid mapping issue.
Performance appears promising.
* Add tick metrics
* Use the logger instead
* Use debug logging instead of info.
* Remove unnecessary code. Add descriptions of added functions.
* Improve memory allocation even further. As well as improve speed of position fetching.
* Add TreeDictionary to Ryujinx Commons
Removed unnecessary usings
* Add a Performance Profiler + Improve ReserveFixed
* Begin transition to allocation in nvdrv
* Create singleton nvmemallocator
* Moved Allocation into Nv Related Files
As requested by gdkchan, any allocation of memory has been moved into the driver files.
Mapping remains in the GPU MemoryManager.
* Remove unnecessary usings
* Add missing descriptions
* Correct descriptions
* Fix formatting.
* Remove unnecessary whitespace
* Formatting / Convention Updates
* Changes / Fixes
Made syntax and convention changes as requested by gdkchan.
Fixed an issue where IsRegionUsed would return the wrong boolean.
Fixed an issue where GetFreePosition was asked for an address instead of a size.
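A minimal sketch of the two queries fixed above, in Python for brevity rather than the project's C#. It uses a sorted list with `bisect` as a stand-in for the tree and is not the TreeDictionary or NvMemoryAllocator code; `RegionAllocator` and its method names are hypothetical. Note that `get_free_position` takes a size, not an address.

```python
import bisect


class RegionAllocator:
    """First-fit allocator over [start, end) using sorted start addresses."""

    def __init__(self, start, end):
        self._start = start
        self._end = end
        self._starts = []  # sorted allocation start addresses
        self._sizes = {}   # start address -> allocation size

    def is_region_used(self, address, size):
        i = bisect.bisect_right(self._starts, address)
        # The allocation starting at or before `address` may reach into it.
        if i > 0:
            prev = self._starts[i - 1]
            if prev + self._sizes[prev] > address:
                return True
        # The next allocation may start inside the queried region.
        return i < len(self._starts) and self._starts[i] < address + size

    def get_free_position(self, size):
        cursor = self._start
        for start in self._starts:
            if start - cursor >= size:
                return cursor
            cursor = start + self._sizes[start]
        return cursor if self._end - cursor >= size else None

    def allocate(self, size):
        position = self.get_free_position(size)
        if position is None:
            raise MemoryError("no free region of the requested size")
        bisect.insort(self._starts, position)
        self._sizes[position] = size
        return position

    def free(self, address):
        self._starts.remove(address)
        del self._sizes[address]
```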
* Undo commenting of Assert in shader cache
* Update Ryujinx.Common/Collections/TreeDictionary.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Resolved many suggestions
* Implement Improved TreeDictionary
Based on pseudocode and custom implementations.
* Rename _set to _dictionary
* Remove unused code
* Remove unused code.
* Remove unnecessary MapLow function.
* Resolve data-structure based issues
* Make adjustments to memory management.
Deactivate de-allocation for now, it causes more harm than good.
* Minor refactorings + Re-implement deallocation
Also cleaned up unnecessary code.
* Add Tests for TreeDictionary
* Update data structure to properly balance the tree
* Experimental Implementation:
1. Reduce Time to Next Node to O(1) Runtime
2. Reduce While Loop Count To 2 (In Most Cases)
* Address issues w/ Deallocating Memory
* Final Build
+ Fully Implement Dictionary Interface for new Data Structure
+ Cover All Memory Allocation Edge Cases, particularly w/ Games that De-Allocate a lot.
* Minor Corrections
Give TreeDictionary its own count (do not depend on inner dictionary)
Properly remove adjacent allocations
* Add AsList
* Fix bug where internal dictionary wasn't being updated w/ new node for overwritten key.
* Address comments in review.
* Fix issue where block wouldn't break out (Fixes UE4 issues)
* Update descriptions
* Update descriptions
* Reduce Node visibility to protect TreeDictionary Integrity + Remove usage of struct.
* Update tests to use new TreeDictionary implementation.
* Remove usage of dictionary in TreeDictionary
* Refactoring / Renaming
* Remove unneeded memoryblock class.
* Add space for while
* Add space for if
* Formatting / descriptions
* Clarified some descriptions
* Reduce visibility of memory allocator
* Edit method names to make more sense as memory blocks are no longer in use.
* Make names consistent.
* Protect against NPE when SuccessorOf is called against keys that don't exist. (Not in use by the memory manager; this is for other PRs that might use this data structure)
* Possible edge-case resolve
* Update Ryujinx.Common/Collections/TreeDictionary.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update Ryujinx.HLE/HOS/Services/Nv/NvMemoryAllocator.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Reduce # of unnecessary duplicate variables / Reduce visibility of variables only internally used.
* Rename count to _count
* Update Description of Add method.
* Fix copypasta
* Address comments
* Address comments
* Remove whitespace
* Address comments, condense variables.
* Consolidate vars
* Fix whitespace.
* Nit
* Fix exception msg
* Fix arrayIndex check
* Fix arrayIndex check + indexer
* Remove whitespace from cast
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* IPC refactor part 2: Use ReplyAndReceive on HLE services and remove special handling from kernel
* Fix for applet transfer memory + some nits
* Keep handles if possible to avoid server handle table exhaustion
* Fix IPC ZeroFill bug
* am: Correctly implement CreateManagedDisplayLayer and implement CreateManagedDisplaySeparableLayer
CreateManagedDisplaySeparableLayer is required since 10.x+ when appletResourceUserId != 0
* Make it exit properly
* Make ServiceNotImplementedException show the full message again
* Allow yielding execution to avoid starving other threads
* Only wait if active
* Merge IVirtualMemoryManager and IAddressSpaceManager
* Fix Ro loading data from the wrong process
Co-authored-by: Thog <me@thog.eu>
This adds the guest GPU accessor to the hash computation.
As this changes all the hashes in the cache, I added some migration
logic.
This is required for #1755.
* Allow copy destination to have a different scale from source
Will result in more scaled copy destinations, but allows scaling in some games that copy textures to the output framebuffer.
* Support copying multiple levels/layers
Uses glFramebufferTextureLayer to copy multiple layers, copies levels individually (and scales the regions).
Remove CopyArrayScaled, since the backend copy handles it now.
* shader cache: Fix Linux boot issues
This rolls the init logic back to its previous state and replicates the
way PTC handles initialization.
* shader cache: set default state of ready for translation event to false
* Fix cpu unit tests
* shader cache: Fix possible race causing crashes on manifest at startup
This fixes a misplaced function call that could cause two writes
to cache.info at the same time.
* shader cache: Make RemoveManifestEntries async too, to be sure all operations are performed before starting the game
* shader cache: Fix invalid virtual address clean up
This fixes an issue where the virtual address of texture descriptors was
not cleaned up when caching; the texture format and swizzle were cleaned instead.
This should fix the high duplication in the cache for certain
games and possible texture corruption issues.
**THIS WILL INVALIDATE ALL SHADER CACHE LEVELS CONSIDERING THE NATURE OF THE ISSUE**
* shader cache: Address gdk's comment
* infra: Migrate to .NET 5
This migrates projects and CI to .NET 5
* Remove language version restrictions (now on 9.0 by default)
* infra: pin .NET 5 to avoid later issues
* infra: Cleanup csproj files
* infra: update dependencies
* infra: Add temporary workaround for a bug in Vector128.Create
see https://github.com/dotnet/runtime/issues/44704 for more information
"Screen scissor" is the minimum size of all render targets, and is set when any render target is bound on NVN or OpenGL. Since it works on all active textures' real sizes, it is more reliable than viewport 0's width, and is actually set before clear.
This fixes a regression with Hyrule Warriors: Age Of Calamity's cubemaps, which did not set viewport dimensions before clear. This resulted in attempting to create a cubemap with rectangular sides, which is logically and physically impossible. (also it just fails)
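A minimal sketch of the "minimum size of all render targets" rule, in Python for brevity rather than the project's C#. It assumes each render target is a (width, height) tuple and that the minimum is taken per dimension; `screen_scissor` is a hypothetical name.

```python
def screen_scissor(render_targets):
    """Return (width, height) as the per-dimension minimum over all
    bound render targets, or None if nothing is bound."""
    if not render_targets:
        return None
    width = min(w for w, _ in render_targets)
    height = min(h for _, h in render_targets)
    return (width, height)
```

With square cubemap faces bound, the result is square too, regardless of what viewport 0 reports.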
Here comes Salieri, my implementation of a disk shader cache!
"I'm sure you know why I named it that."
"It doesn't really mean anything."
This implementation collects shaders at runtime and caches them so they can be compiled later when starting a game.
* Size hints for copy regions and viewport dimensions to avoid data loss
* Reword comment.
* Use info for the rule rather than calculating aligned size.
* Reorder min/max, remove spaces