Commit graph

75 commits

Author SHA1 Message Date
merry
1a2ae16395 T16: Implement Add to SP (immediate) 2022-02-11 23:08:02 +00:00
merry
59e9c3d6b0 T16: Implement ADR 2022-02-11 23:08:02 +00:00
merry
baaf5e126e T16: Implement LDR/STR (SP) 2022-02-11 23:08:02 +00:00
merry
f3e068b94a T16: Implement {LDR,STR}{,B,H} (immediate) 2022-02-11 23:08:02 +00:00
merry
d3272c1498 T16: Implement {LDR,STR}{,H,B,SB,SH} (register) 2022-02-11 23:08:02 +00:00
merry
a9f952ad40 T16: Implement LDR (literal) 2022-02-11 23:08:02 +00:00
merry
d84c2417aa T16: Implement BLX (reg) 2022-02-11 23:08:02 +00:00
merry
2876344cca T16: Implement ADD, CMP, MOV (high reg) 2022-02-11 23:08:02 +00:00
merry
43feb68b11 T16: Implement ANDS, EORS, LSLS, LSRS, ASRS, ADCS, SBCS, RORS, TST, NEGS, CMP, CMN, ORRS, MULS, BICS, MVNS (low registers) 2022-02-11 23:08:02 +00:00
merry
284272854b T16: Implement MOVS, CMP, ADDS, SUBS (8-bit immediate) 2022-02-11 23:08:02 +00:00
merry
7a09aea0dc T16: Implement ADDS, SUBS (3-bit immediate) 2022-02-11 23:08:01 +00:00
merry
3d663a1c8c T16: Implement ADDS, SUBS (reg) 2022-02-11 23:08:01 +00:00
merry
15ccdff751 T16: Implement LSL/LSR/ASR (imm) 2022-02-11 23:08:01 +00:00
merry
cb4ccec421 T16: Implement BX 2022-02-11 23:08:01 +00:00
merry
e1bbf8d7b9 OpCodeTables: Improve thumb fast lookup 2022-02-11 23:08:01 +00:00
merry
19c6c1c11c OpCodeTable: Prepare for thumb instructions 2022-02-11 23:08:01 +00:00
merry
08e1e0c985 OpCodeTable: Remove existing thumb instruction implementations 2022-02-11 23:08:01 +00:00
merry
1379f41d5d OpCodeTable: Minor cleanup 2022-02-11 23:08:01 +00:00
merry
5c2e780d40 Decoders: Add InITBlock argument 2022-02-11 23:08:01 +00:00
merry
86b37d0ff7
ARMeilleure: A32: Implement SHSUB8 and UHSUB8 (#3089)
* ARMeilleure: A32: Implement UHSUB8

* ARMeilleure: A32: Implement SHSUB8
2022-02-08 10:46:42 +01:00
merry
88d3ffb97c
ARMeilleure: A32: Implement SHADD8 (#3086) 2022-02-06 12:25:45 -03:00
merry
222b1ad7da
ARMeilleure: OpCodeTable: Add CMN (RsReg) (#3087) 2022-02-06 02:01:05 +01:00
sharmander
60f7cba30a
Implement FCVTNS (Scalar GP) (#2953)
* Implement FCVTNS (Scalar GP)

* Update Ptc Version
2022-01-19 22:21:44 -03:00
sharmander
e5f7ff1eee
CPU - Implement FCVTMS (Vector) (#2937)
* Add FCVTMS_V Implementation to Armeilleure

* Fix opcode designation

* Add tests

* Amend Ptc version

* Fix OpCode / Tests

* Create Math.Floor helper method + Update implementation

* Address gdk comments

* Re-address gdk comments

* Update ARMeilleure/Decoders/OpCodeTable.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Tests to use 2S (4S) and 2D

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-01-04 16:45:28 -03:00
gdkchan
e24949ca2c
Implement CSDB instruction (#2927) 2021-12-19 11:19:05 -03:00
Piyachet Kanda
3e2f89b4fd
Implement UHADD8 instruction (#2908)
* Implement UHADD8 instruction along with a test unit

* Update PTC revision number
2021-12-08 17:05:59 -03:00
Mary
501c3d5cea
Implement MSR instruction for A32 (#2585)
* Implement MSR instruction

Fix #1342.

Now Pocket Rumble is playable.

* Address gdkchan's comments

* Address gdkchan's comments

* Address gdkchan's comment
2021-08-27 00:07:44 +02:00
gdkchan
ab9d4b862d
Implement VORN (register) Arm32 instruction (#2396) 2021-06-23 23:21:23 +02:00
FICTURE7
89791ba68d
Add inlined on translation call counting (#2190)
* Add EntryTable<TEntry>

* Add on translation call counting

* Add Counter

* Add PPTC support

* Make Counter a generic & use a 32-bit counter instead

* Return false on overflow

* Set PPTC version

* Print more information about the rejit queue

* Make Counter<T> disposable

* Remove Block.TailCall since it is not used anymore

* Apply suggestions from code review

Address gdkchan's feedback

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Fix more stale docs

* Remove rejit requests queue logging

* Make Counter<T> finalizable

Most certainly quite an odd use case.

* Make EntryTable<T>.TryAllocate set entry to default

* Re-trigger CI

* Dispose Counters before they hit the finalizer queue

* Re-trigger CI

Just for good measure...

* Make EntryTable<T> expandable

* EntryTable is now expandable instead of being a fixed slab.
* Remove EntryTable<T>.TryAllocate
* Remove Counter<T>.TryCreate

Address LDj3SNuD's feedback

* Apply suggestions from code review

Address LDj3SNuD's feedback

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Remove useless return

* POH approach, but the sequel

* Revert "POH approach, but the sequel"

This reverts commit 5f5abaa247.

The sequel got shelved

* Add extra documentation

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2021-04-18 23:43:53 +02:00
LDj3SNuD
4bd1ad16f9
Add Sqdmulh_Ve & Sqrdmulh_Ve Inst.s with Tests. (#2139) 2021-03-25 23:33:32 +01:00
mageven
9bda7b4699
Implement VCNT instruction (#1963)
* Implement VCNT based on AArch64 CNT

Add tests

* Update PTC version

* Address LDj's comments

* Explicit size in encoding
* Tighter tests
* Replace SoftFallback with IR helper

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Reduce one BitwiseAnd from IR fallback

Based on popcount64b from https://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation

* Rename parameter and add assert

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2021-02-22 16:26:13 +01:00
mageven
c19cfca183
Implement PRFM (register variant) as NOP (#1956)
* Implement PRFM (register variant) as NOP

Fix typo pfrm -> prfm
Add comments to distinguish variants

* Increment PTC version
2021-01-26 16:09:27 +11:00
LDj3SNuD
c3e0c41da3
CPU (A64): Add Fmaxnmp & Fminnmp Scalar Inst.s, Fast & Slow Paths; with Tests. (#1894) 2021-01-20 09:12:33 +11:00
LDj3SNuD
430ba6da65
CPU (A64): Add Pmull_V Inst. with Clmul fast path for the "1/2D -> 1Q" variant & Sse fast path and slow path for both the "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test. (#1817)
* Add Pmull_V Sse fast path only, both "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test.

* Add Clmul fast path for the 128 bits variant.

* Small optimisation (save 60 instructions) for the Sse fast path about the 128 bits variant.

* Add slow path, both variants. Fix V128 Shl/Shr when shift = 0.

* A32: Add Vmull_I P64 variant (slow path); not tested.

* A32: Add Vmull_I_P8_P64 Test and fix P64 variant.
2021-01-04 23:45:54 +01:00
LDj3SNuD
8a33e884f8
Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s). Fix Vfma_V slow path not using StandardFPSCRValue(). (#1775)
* Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s).

Add Vfma_S & Vfms_S Fma fast paths.
Add Vfnma_S inst. with Fma/Sse fast paths and slow path.
Add Vfnms_S Sse fast path.

Add Tests for affected inst.s.

Nits.

* InternalVersion = 1775

* Nits.

* Fix Vfma_V slow path not using StandardFPSCRValue().

* Nit: Fix Vfma_V order.

* Add Vfms_V Sse fast path and slow path.

* Add Vfma_V and Vfms_V Test.
2020-12-17 20:43:41 +01:00
LDj3SNuD
b5c215111d
PPTC Follow-up. (#1712)
* Added support for offline invalidation, via PPTC, of low cq translations replaced by high cq translations; both on a single run and between runs.

Added invalidation of .cache files in the event of reuse on a different user operating system.

Added .info and .cache files invalidation in case of a failed stream decompression.

Nits.

* InternalVersion = 1712;

* Nits.

* Address comment.

* Get rid of BinaryFormatter.

Nits.

* Move Ptc.LoadTranslations().

Nits.

* Nits.

* Fixed corner cases (in case backup copies have to be used). Added save logs.

* Not core fixes.

* Complement to the previous commit. Added load logs. Removed BinaryFormatter leftovers.

* Add LoadTranslations log.

* Nits.

* Removed the search and management of LowCq overlapping functions.

* Final increment of .info and .cache flags.

* Nit.

* GetIndirectFunctionAddress(): Validate that writing actually takes place in dynamic table memory range (and not elsewhere).

* Fix Ptc.UpdateInfo() due to rebase.

* Nit for retrigger Checks.

* Nit for retrigger Checks.
2020-12-17 20:32:09 +01:00
sharmander
e901b7850c
CPU: Implement VRINTX.F32 | VRINTX.F64 (#1776)
* Start implementation

* Draft

* Updated opcode.

Needs verification.

* Clean up code.

* Update implementation and tests.

* Update implemenation + tests

* Get RM from FPSCR + Do not use emit/addintrinsic

* Remove "fast" path, as recommended by gdk.

* Variable DELETED.

* Update ARMeilleure/Decoders/OpCodeTable.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Move method

* stringing things together :)

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-12-16 20:27:15 -03:00
sharmander
3332b29f01
CPU: Implement VFMA (Vector) (#1762)
* Implement VFMA.F64

* Simplify switch

* Simplify FMA Instructions into their own IntrinsicType.

* Remove whitespace

* Fix indentation

* Change tests for Vfnms -- disable inf / nan

* Move args up, not description ;)

* Implementation Complete.

All Tests Pass (Slow / Fast Path)

* Move location of function in assembler + test updates.

* Shift params upwards

* Remove unused function

* Update PTC version.

* Add comments / re-oreder opcode table.

* Remove whitespace

* Fix nit

* Fix nit.

* Fix whitespace

* Wrong opcode was used by a bad merge.

* Addressed rip's comments.
2020-12-15 00:01:52 -03:00
sharmander
36f6bbf5b9
CPU: Implement VFNMA.F32 | F.64 (#1783)
* Implement VFNMA.F<32/64>

* Update PTC Version

* Update Implementation & Renames & Correct Order

* Fix alignment

* Update implementation to not trigger assert

* Actually use the intrinsic that makes sense :)
2020-12-07 21:04:01 -03:00
sharmander
b479a43939
CPU: Implement VFNMS.F32/64 (#1758)
* Add necessary methods / op-code

* Enable Support for FMA Instruction Set

* Add Intrinsics / Assembly Opcodes for VFMSUB231XX.

* Add X86 Instructions for VFMSUB231XX

* Implement VFNMS

* Implement VFNMS Tests

* Add special cases for FMA instructions.

* Update PPTC Version

* Remove unused Op

* Move Check into Assert / Cleanup

* Rename and cleanup

* Whitespace

* Whitespace / Rename

* Re-sort

* Address final requests

* Implement VFMA.F64

* Simplify switch

* Simplify FMA Instructions into their own IntrinsicType.

* Remove whitespace

* Fix indentation

* Change tests for Vfnms -- disable inf / nan

* Move args up, not description ;)

* Undo vfma

* Completely remove vfms code.,

* Fix order of instruction in assembler
2020-12-03 20:20:02 +01:00
gdkchan
2f16491712
Get rid of Reflection.Emit dependency on CPU and Shader projects (#1626)
* Get rid of Reflection.Emit dependency on CPU and Shader projects

* Remove useless private sets

* Missed those due to the alignment
2020-10-21 09:13:44 -03:00
LDj3SNuD
04e330cc00
Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths). (#1577)
* Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths).

No test provided (i.e. draft).

* Ptc InternalVersion = 1577
2020-10-13 22:41:33 +02:00
gdkchan
6cc187da59
SIMD&FP load/store with scale > 4 should be undefined (#1522)
* SIMD&FP load/store with scale > 4 should be undefined

* Catch more invalid encodings for FP&SIMD LDR/STR (reg variant)

* Set PTC version to PR number
2020-09-01 17:02:23 -03:00
LDj3SNuD
2cb8bd7006
CPU (A64): Add Scvtf_S_Fixed & Ucvtf_S_Fixed with Tests. (#1492) 2020-08-31 20:48:21 -03:00
LDj3SNuD
6938988427
Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests. (#1471)
* Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests.

* Address PR feedback & Nit.
2020-08-13 02:34:02 -03:00
Valentin PONS
3af2ce74ec
Implements some 32-bit instructions (VBIC, VTST, VSRA) (#1192)
* Added some 32 bits instructions:

* VBIC
* VTST
* VSRA

* Incremented the PTC

* Add tests and fix implementation

* Fixed VBIC immediate opcode mapping

* Hey hey!

* Nit.

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: LDj3SNuD <dvitiello@gmail.com>
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-07-19 15:11:58 -03:00
LDj3SNuD
56a61a5758
CPU: A32: Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests. (#1394)
* Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests.

* Update Ptc.cs
2020-07-17 10:57:49 -03:00
LDj3SNuD
88619d71b8
CPU: A32: Add Vadd & Vsub Wide (S/U_8/16/32) Inst.s with Test. (#1390) 2020-07-17 14:21:40 +10:00
Ficture Seven
863b0c8dcb
Fix Decode exception condition (#1377) 2020-07-15 17:48:16 +10:00
LDj3SNuD
a804db6eed
Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d d… (#1335)
* Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d double zero sign handling. Allows better handling of NaNs.

* Optimized EmitSse2VectorIsNaNOpF() for multiple uses per opF.
2020-07-13 21:08:47 +10:00