First special instruction
Start Load/Store implementation
Start TextureSample
Sample progress
I/O Load/Store Progress
Rest of load/store
TODO: Currently, the generator still assumes the GLSL style of I/O attributres. On MSL, the vertex function should output a struct which contains a float4 with the required position attribute.
TextureSize and VectorExtract
Fix UserDefined IO Vars
Fix stage input struct names
Mostly copied from GLSL since in terms of syntax within blocks they’re pretty similar. Likely the result will need tweaking…
Isn’t that conveniant?
“Do the simd_shuffle”
atomics
Remaining instructions
Remove removed special instructions
Getting somewhere…
Ryujinx was not hinting application name, so on some platforms (e.g.
Linux) volume control shows Ryujinx as 'SDL Application'. This can cause
confusion.
This commit fixes name in volume control applets on some platforms.
see: https://wiki.libsdl.org/SDL2/SDL_HINT_APP_NAME
* Add Texture Size Capacity and 8GB Dram Build
* Update AutoDeleteCache.cs
* Dynamic Texture Cache (WIP)
* Change to float Multiplier, in-case it needs fine-tuning.
* Delete src/src.sln
* Update AutoDeleteCache.cs
* Format
* Fix Formatting
* Add DefaultTextureSizeCapacity and MemoryScaleFactor
- Also remove redundant New Lines
* Fix 4GB dram crashing
* Format newline
* Refractor
- Added Initialize() function to TextureCache and AutoDeleteCache
- Removed GetMaxTextureCapacity() function and instead added _maxCacheMemoryUsage
- Added private const MaxTextureSizeCapacity to AutoDelete Cache
- Added TextureCache.Initialize() to MemoryManager in order to fetch MaxGpuMemory at the right time.
- Moved and Changed Logger.Info for Gpu Memory to Logger.Notice and Moved it to PrintGpuInformation function.
- Opted to use a ternary operator for the Initialize function, I think it looks cleaner than bunch of if statements.
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* maxMemory to CacheMemory, use Clamp instead of Ternary. Changed MinTextureCapacity 1GiB to 512 MiB
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Format comment
* comment context
* Increase TextureSize capacity for OpenGL back to 1024
- Added a new const ulong for OpenGLTextureSizeCapacity
* Fix changes from last commit.
* Adjust last OpenGL changes.
* Remove garbage VSC file
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update src/Ryujinx.Graphics.Gpu/Image/AutoDeleteCache.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* refactor(perf): pass MemoryOwner<byte> around as itself rather than IMemoryOwner<byte>
* fix(perf): get span via MemoryOwner<byte>.Span property instead of through Memory property
* fix: incorrect comment change
* Allow creating texture aliases on texture pool
* Delete old image format override code
* New format incompatible alias
* Missing bounds check
* GetForBinding now takes FormatInfo
* Make FormatInfo struct more compact
This way exceptions thrown during the execution of CheckLaunchState()
will correctly invoke the unhandled exception handler
and cause Ryujinx to crash.
* Add area sampling scaler to allow for super-sampled anti-aliasing.
* Area scaling filter doesn't have a scaling level.
* Add further clarification to the tooltip on how to achieve supersampling.
* ShaderHelper: Merge the two CompileProgram functions.
* Convert tabs to spaces in area scaling shaders
* Fixup Vulkan and OpenGL project files.
* AreaScaling: Replace texture() by texelFetch() and use integer vectors.
No functional difference, but it cleans up the code a bit.
* AreaScaling: Delete unused sharpening level member.
Also rename _scale to _sharpeningLevel for clarity and consistency.
* AreaScaling: Delete unused scaleX/scaleY uniforms.
* AreaScaling: Force the alpha to 1 when storing the pixel.
* AreaScaling: Remove left-over sharpening buffer.
* Vulkan: Feedback loop improvements
This PR allows the Vulkan backend to detect attachment feedback loops. These are currently used in the following ways:
- Partial use of VK_EXT_attachment_feedback_loop_layout
- All renderable textures have AttachmentFeedbackLoopBitExt
- Compile pipelines with Color/DepthStencil feedback loop flags when present
- Support using FragmentBarrier for feedback loops (fixes regressions from https://github.com/Ryujinx/Ryujinx/pull/7012 )
TODO:
- AMD GPUs may need layout transitions for it to properly allow textures to be used in feedback loops.
- Use dynamic state for feedback loops. The background pipeline will always miss since feedback loop state isn't known on the GPU project.
- How is the barrier dependency flag used? (DXVK just ignores it, there's no vulkan validation...)
- Improve subpass dependencies to fix validation errors
* Mark field readonly
* Add feedback loop dynamic state
* fix: add MoltenVK resolver workaround
fix: add MoltenVK resolver workaround
* Formatting
* Fix more complaints
* RADV dcc workaround
* Use dynamic state properly, cleanup.
* Use aspects flags in more places
TryDequeue checks for _disposed before taking the lock. If another
thread calls Dispose before it takes the lock, it won't get woken up by
the PulseAll call, and will deadlock in Monitor.Wait.
Double-checking _disposed with the lock taken should avoid this.
* Fix arbitrary sorting by "Favorite" in the UI by making it the same as sorting alphabetically while giving favorites priority.
* Use a more engineered solution rather than string hacks.
* Address code style warnings. Add null checking. Make title name comparison case insensitive.
* one more style fix
---------
Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
* refactor: replace usage of ByteMemoryPool with MemoryOwner<byte>
* refactor: delete unused ByteMemoryPool and ByteMemoryPool.ByteMemoryPoolBuffer types
* refactor: change IMemoryOwner<byte> return types to MemoryOwner<byte>
* fix(perf): get span via `MemoryOwner<T>.Span` directly instead of `MemoryOwner<T>.Memory.Span`
* fix(perf): get span via MemoryOwner<T>.Span directly instead of `MemoryOwner<T>.Memory.Span`
* fix(perf): get span via MemoryOwner<T>.Span directly instead of `MemoryOwner<T>.Memory.Span`
* optimization: Load application metadata only for applications with IDs
* Load applications when necessary
This prevents loading applications when launching an application
directly from the command line (or a shortcut).
Instead, applications will be loaded after the emulation was stopped by the user.
* Show the title in the configured language when launching an application
* Rename DesiredTitleLanguage to DesiredLanguage
* chore: replace `ByteMemoryPool` usage with `MemoryOwner<byte>`
* refactor: `PixelConverter.ConvertR4G4ToR4G4B4A4()` - rename old `outputSpan` to `outputSpanUInt16`, reuse same output `Span<byte>` as newly-freed name `outputSpan`
* eliminate temporary buffer allocations
* chore, perf: use MemoryOwner<byte> instead of IMemoryOwner<byte>
* Don't load files from hidden subdirectories
* Catch FileNotFoundException in TryGetApplicationsFromFile()
* Skip non-existent files and bad symlinks when loading applications
Vulkan spec states that input topology should always be PatchList when a tessellation pipeline is present. The AMD GPU on windows crashes so hard it BSODs the machine if this isn't the case, so it's forced here just in case.
I'm not sure what providing a different topology here would even do, as you'd think it would always be a patch list input.
This barrier has always been missing, but it only became apparent when #7012 merged.
I also added some barriers in case the target buffer used here is used by other commands, though right now it isn't.
Fixes a regression where water would turn white on AMD GPUs with the proprietary driver. May fix other issues on this driver.
* Fix checking for the wrong update metadata file
* Apply the same fix for dlc.json
* Use the base application ids for updates and DLCs in the GUI too
This shouldn't actually change anything, since the program index part of the application id
should always be 0 for all applications currently seen by the GUI.
This was just done for completeness.
* More guarantees for buffer correct placement, defer guest requested buffers
* Split RP on indirect barrier rn
* Better handling for feedback loops.
* Qualcomm barriers suck too
* Fix condition
* Remove unused field
* Allow render pass barriers on turnip for now
* Add default values to ApplicationData directly
* Refactor application loading
It should now be possible to load multi game XCIs.
Included updates won't be detected for now.
Opening a game from the command line currently only opens the first one.
* Only include program NCAs where at least one tuple item is not null
* Get application data by title id and add programIndex check back
* Refactor application loading again and remove duplicate code
* Actually use patch ncas for updates
* Fix number of applications found with multi game xcis
* Don't load bundled updates from multi game xcis
* Change ApplicationData.TitleId type to ulong & Add TitleIdString property
* Use cnmt files and ContentCollection to load programs
* Ava: Add updates and DLCs from gamecarts
* Get the cnmt file from its NCA
* Ava: Identify bundled updates in updater window
* Fix the (hopefully) last few bugs
* Add idOffset parameter to GetNcaByType
* Handle missing file for dlc.json
* Ava: Shorten error message for invalid files
* Gtk: Add additional string for bundled updates in TitleUpdateWindow
* Hopefully fix DLC issues
* Apply formatting
* Finally fix DLC issues
* Adjust property names and fileSize field
* Read the correct update file
* Fix wrong casing for application id strings
* Rename TitleId to ApplicationId
* Address review comments
* Apply suggestions from code review
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Gracefully fail when loading pfs for update and dlc window
* Fix applications with multiple programs
* Fix DLCWindow crash on GTK
* Fix some GUI issues
* Remove IsXci again
* Don't add duplicates to update/dlc windows
* Avoid double lookup
* Preserve DLC enabled state for bundled DLCs
* Fix DLCWindow not opening using GTK
* Fix missing information when loading applications from file
* Address review feedback
Rename ContentCollection to ContentMetaData
Fix casing issues in log messages
Use null as the default value for updatePath
* Fix re-adding bundled DLCs every time
* Fix bundled DLCs disappearing
* Abstract common code to open application pfs
* Remove unused imports
* Fix file exists check when loading DLCs
* Load bundled DLCs only using dlc.json
* Load AoC items correctly
* Add all DLCs from a PFS
* Add argument to launch a specific application id
* Use application-id argument for shortcuts if necessary
* Return the application id from the control NCA if possible
* GetApplicationInformation: Don't overwrite application ids
Move SaveDataOwnerId check to the top, since it seems to be more reliable.
* Get application ids from CNMT again
This commit reverts some parts of 61615b8f0d6f90ae86778958ddc38eaf6dc280ab.
Since the issue wasn't actually related to the application id in CMNTs, we can remove the wrong assumptions.
* Revert erroneous axaml change from adca8900
* Rename title to application
* Wrap nsp/pfs0 case with curly braces
* Check if _applicationData.ControlHolder.ByteSpan is zeros only once
* Catch exceptions while loading applications from nsps
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Rebased
Transformation all at once
Use SkiaSharp instead of ImageSharp
* Apply suggestions from code review
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Change back unintentionally changed comment
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
Co-authored-by: Emmanuel Hansen <emmausssss@gmail.com>
* Do not use template updates for buffer textures and buffer images
* No need to do it for images
* Simply buffer texture existence check
* Pipeline is now unused on DescriptorSetUpdater
* Fix some validation errors
* Whitespace correction
* Resolve some runtime validation errors.
* Whitespace
* Properly fix usage realted validation error by setting Extended Usage image creation flag.
* Only if supported
* Remove checking extension for features that are core functionality of Vulkan 1.2
* Ensure descriptor sets are only re-used when all command buffers using it have completed
* Fix some SPIR-V capabilities
* Set update after bind flag if we exceed limits
* Simpler fix for Intel
* Format whitespace
* Make struct readonly
* Add barriers for extra set arrays too
* Dynamic state for Depth Bounds should not be passed to PipelineDynamicStateCreateInfo as the command to set them is never called.
Do not pass pointer to viewport and scissor as those dynamic states should be supported on all devices.
Same as above for DepthBias values.
* Code Review Suggestion
* Pipeline derivation is not implemented and is not suggested.
* Depth Bounds are not used.
* feat: add new types MemoryOwner and SpanOwner
* use SpanOwner instead of new array allocation
* change for loop condition to `fences.Length` instead of `count` to elide Span boundary checks on `fences`
* Key textures using set and binding (rather than just binding)
* Extend full bindless to cover cases with phi nodes
* Log error on bindless access failure
* Shader cache version bump
* Remove constant buffer match to reduce the chances of full bindless triggering
* Re-enable it for constant buffers, paper mario does actually need it
* Format whitespace
* Report base and extra sets from the backend
* Pass texture set index everywhere
* Key textures using set and binding (rather than just binding)
* Start using extra sets for array textures
* Shader cache version bump
* Separate new commands, some PR feedback
* Introduce new manual descriptor set reservation method that prevents it from being used by something else while owned by an array
* Move bind extra sets logic to new method
* Should only use separate array is MaximumExtraSets is not zero
* Format whitespace
* intel workaround
built on top of the amd workaround
* forgot to update the note
* Logic Change
Enabled workaround for all vendors that aren't nvidia
* Applied Suggestions
* Kernel: Wake cores from idle directly rather than through a host thread
Right now when a core enters an idle state, leaving that idle state requires us to first signal the core's idle thread, which then signals the correct thread that we want to run on the core. This means that in a lot of cases, we're paying double for a thread to be woken from an idle state.
This PR moves this process to happen on the thread that is waking others out of idle, instead of an idle thread that needs to be woken first.
For compatibility the process has been kept as similar as possible - the process for IdleThreadLoop has been migrated to TryLeaveIdle, and is gated by a condition variable that lets it run only once at a time for each core. A core is only considered for wake from idle if idle is both active and has been signalled - the signal is consumed and the active state is cleared when the core leaves idle.
Dummy threads (just the idle thread at the moment) have been changed to have no host thread, as the work is now done by threads entering idle and signalling out of it.
This could put a bit of extra work on threads that would have triggered `_idleInterruptEvent` before, but I'd expect less work than signalling all those reset events and the OS overhead that follows. Worst case is that other threads performing these signals at the same time will have to wait for each other, but it's still going to be a very short amount of time.
Improvements are best seen in games with heavy (or very misguided) multithreading, such as Pokemon: Legends Arceus. Improvements are expected in Scarlet/Violet and TOTK, but are harder to measure.
Testing on Linux/MacOS still to be done, definitely need to test more games as this affects all of them (obviously) and any issues might be rare to encounter.
* Remove _idleThread entirely
* Use spinwait so we don't completely blast the CPU with cmpxchg
* Didn't I already do this
* Cleanup
* Implementing new features in the latest Concentus library - span-in, span-out Opus decoding (so we don't have to make temporary buffer copies), returning a more precise error code from the decoder, and automatically linking the native opus library with P/invoke if supported on the current system
* Remove stub log messages and commit package upgrade to 2.1.0
* use more correct disposal pattern
* Bump to Concentus 2.1.1
* Bump to Concentus 2.1.2
* Don't bother pulling in native opus binaries from Concentus package (using ExcludeAssets).
* Fix opus MS channel count. Explicitly disable native lib probe in OpusCodecFactory.
* Bump to package 2.2.0 which has split out the native libs, as suggested.
---------
Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
* GPU: Migrate buffers on GPU project, pre-emptively flush device local mappings
Essentially retreading #4540, but it's on the GPU project now instead of the backend. This allows us to have a lot more control + knowledge of where the buffer backing has been changed and allows us to pre-emptively flush pages to host memory for quicker readback. It will allow us to do other stuff in the future, but we'll get there when we get there.
Performance greatly improved in Hyrule Warriors: Age of Calamity. Performance notably improved in TOTK (average). Performance for BOTW restored to how it was before #4911, perhaps a bit better.
- Rewrites a bunch of buffer migration stuff. Might want to tighten up how dispose stuff works.
- Fixed an issue where the copy for texture pre-flush would happen _after_ the syncpoint.
TODO: remove a page from pre-flush if it isn't flushed after a certain number of copies.
* Add copy deactivation
* Fix dependent virtual buffers
* Remove logging
* Fix format issues (maybe)
* Vulkan: Remove backing swap
* Add explicit memory access types for most buffers
* Fix typo
* Add device local force expiry, change buffer inheritance behaviour
* General cleanup, OGL fix
* BufferPreFlush comments
* BufferBackingState comments
* Add an extra precaution to BufferMigration
This is very unlikely, but it's important to cover loose ends like this.
* Address some feedback
* Docs
* Block input updates while swkbd is open in foreground mode
* Flush internal driver state before unblocking input updates
* Rename Flush to Clear and remove unnecessary attribute
* Clear the driver state only if the GamepadDriver isn't null
* Update audio renderer to REV12: Add support for splitter biquad filter
* Formatting
* Official names
* Update BiquadFilterState size + other fixes
* Update tests
* Update comment for version 2
* Size test for SplitterDestinationVersion2
* Should use Volume1 if no ramp