Commit graph

55 commits

Author SHA1 Message Date
merry
43feb68b11 T16: Implement ANDS, EORS, LSLS, LSRS, ASRS, ADCS, SBCS, RORS, TST, NEGS, CMP, CMN, ORRS, MULS, BICS, MVNS (low registers) 2022-02-11 23:08:02 +00:00
merry
284272854b T16: Implement MOVS, CMP, ADDS, SUBS (8-bit immediate) 2022-02-11 23:08:02 +00:00
merry
7a09aea0dc T16: Implement ADDS, SUBS (3-bit immediate) 2022-02-11 23:08:01 +00:00
merry
3d663a1c8c T16: Implement ADDS, SUBS (reg) 2022-02-11 23:08:01 +00:00
merry
15ccdff751 T16: Implement LSL/LSR/ASR (imm) 2022-02-11 23:08:01 +00:00
merry
cb4ccec421 T16: Implement BX 2022-02-11 23:08:01 +00:00
merry
e1bbf8d7b9 OpCodeTables: Improve thumb fast lookup 2022-02-11 23:08:01 +00:00
merry
19c6c1c11c OpCodeTable: Prepare for thumb instructions 2022-02-11 23:08:01 +00:00
merry
08e1e0c985 OpCodeTable: Remove existing thumb instruction implementations 2022-02-11 23:08:01 +00:00
merry
1379f41d5d OpCodeTable: Minor cleanup 2022-02-11 23:08:01 +00:00
merry
5c2e780d40 Decoders: Add InITBlock argument 2022-02-11 23:08:01 +00:00
merry
86b37d0ff7
ARMeilleure: A32: Implement SHSUB8 and UHSUB8 (#3089)
* ARMeilleure: A32: Implement UHSUB8

* ARMeilleure: A32: Implement SHSUB8
2022-02-08 10:46:42 +01:00
merry
88d3ffb97c
ARMeilleure: A32: Implement SHADD8 (#3086) 2022-02-06 12:25:45 -03:00
merry
222b1ad7da
ARMeilleure: OpCodeTable: Add CMN (RsReg) (#3087) 2022-02-06 02:01:05 +01:00
sharmander
60f7cba30a
Implement FCVTNS (Scalar GP) (#2953)
* Implement FCVTNS (Scalar GP)

* Update Ptc Version
2022-01-19 22:21:44 -03:00
sharmander
e5f7ff1eee
CPU - Implement FCVTMS (Vector) (#2937)
* Add FCVTMS_V Implementation to Armeilleure

* Fix opcode designation

* Add tests

* Amend Ptc version

* Fix OpCode / Tests

* Create Math.Floor helper method + Update implementation

* Address gdk comments

* Re-address gdk comments

* Update ARMeilleure/Decoders/OpCodeTable.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Tests to use 2S (4S) and 2D

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-01-04 16:45:28 -03:00
gdkchan
e24949ca2c
Implement CSDB instruction (#2927) 2021-12-19 11:19:05 -03:00
Piyachet Kanda
3e2f89b4fd
Implement UHADD8 instruction (#2908)
* Implement UHADD8 instruction along with a test unit

* Update PTC revision number
2021-12-08 17:05:59 -03:00
Mary
501c3d5cea
Implement MSR instruction for A32 (#2585)
* Implement MSR instruction

Fix #1342.

Now Pocket Rumble is playable.

* Address gdkchan's comments

* Address gdkchan's comments

* Address gdkchan's comment
2021-08-27 00:07:44 +02:00
gdkchan
ab9d4b862d
Implement VORN (register) Arm32 instruction (#2396) 2021-06-23 23:21:23 +02:00
LDj3SNuD
4bd1ad16f9
Add Sqdmulh_Ve & Sqrdmulh_Ve Inst.s with Tests. (#2139) 2021-03-25 23:33:32 +01:00
mageven
9bda7b4699
Implement VCNT instruction (#1963)
* Implement VCNT based on AArch64 CNT

Add tests

* Update PTC version

* Address LDj's comments

* Explicit size in encoding
* Tighter tests
* Replace SoftFallback with IR helper

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Reduce one BitwiseAnd from IR fallback

Based on popcount64b from https://en.wikipedia.org/wiki/Hamming_weight#Efficient_implementation

* Rename parameter and add assert

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2021-02-22 16:26:13 +01:00
mageven
c19cfca183
Implement PRFM (register variant) as NOP (#1956)
* Implement PRFM (register variant) as NOP

Fix typo pfrm -> prfm
Add comments to distinguish variants

* Increment PTC version
2021-01-26 16:09:27 +11:00
LDj3SNuD
c3e0c41da3
CPU (A64): Add Fmaxnmp & Fminnmp Scalar Inst.s, Fast & Slow Paths; with Tests. (#1894) 2021-01-20 09:12:33 +11:00
LDj3SNuD
430ba6da65
CPU (A64): Add Pmull_V Inst. with Clmul fast path for the "1/2D -> 1Q" variant & Sse fast path and slow path for both the "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test. (#1817)
* Add Pmull_V Sse fast path only, both "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test.

* Add Clmul fast path for the 128 bits variant.

* Small optimisation (save 60 instructions) for the Sse fast path about the 128 bits variant.

* Add slow path, both variants. Fix V128 Shl/Shr when shift = 0.

* A32: Add Vmull_I P64 variant (slow path); not tested.

* A32: Add Vmull_I_P8_P64 Test and fix P64 variant.
2021-01-04 23:45:54 +01:00
LDj3SNuD
8a33e884f8
Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s). Fix Vfma_V slow path not using StandardFPSCRValue(). (#1775)
* Fix Vnmls_S fast path (F64: losing input d value). Fix Vnmla_S & Vnmls_S slow paths (using fused inst.s).

Add Vfma_S & Vfms_S Fma fast paths.
Add Vfnma_S inst. with Fma/Sse fast paths and slow path.
Add Vfnms_S Sse fast path.

Add Tests for affected inst.s.

Nits.

* InternalVersion = 1775

* Nits.

* Fix Vfma_V slow path not using StandardFPSCRValue().

* Nit: Fix Vfma_V order.

* Add Vfms_V Sse fast path and slow path.

* Add Vfma_V and Vfms_V Test.
2020-12-17 20:43:41 +01:00
sharmander
e901b7850c
CPU: Implement VRINTX.F32 | VRINTX.F64 (#1776)
* Start implementation

* Draft

* Updated opcode.

Needs verification.

* Clean up code.

* Update implementation and tests.

* Update implemenation + tests

* Get RM from FPSCR + Do not use emit/addintrinsic

* Remove "fast" path, as recommended by gdk.

* Variable DELETED.

* Update ARMeilleure/Decoders/OpCodeTable.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Update ARMeilleure/Instructions/InstEmitSimdCvt32.cs

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>

* Move method

* stringing things together :)

Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-12-16 20:27:15 -03:00
sharmander
3332b29f01
CPU: Implement VFMA (Vector) (#1762)
* Implement VFMA.F64

* Simplify switch

* Simplify FMA Instructions into their own IntrinsicType.

* Remove whitespace

* Fix indentation

* Change tests for Vfnms -- disable inf / nan

* Move args up, not description ;)

* Implementation Complete.

All Tests Pass (Slow / Fast Path)

* Move location of function in assembler + test updates.

* Shift params upwards

* Remove unused function

* Update PTC version.

* Add comments / re-oreder opcode table.

* Remove whitespace

* Fix nit

* Fix nit.

* Fix whitespace

* Wrong opcode was used by a bad merge.

* Addressed rip's comments.
2020-12-15 00:01:52 -03:00
sharmander
36f6bbf5b9
CPU: Implement VFNMA.F32 | F.64 (#1783)
* Implement VFNMA.F<32/64>

* Update PTC Version

* Update Implementation & Renames & Correct Order

* Fix alignment

* Update implementation to not trigger assert

* Actually use the intrinsic that makes sense :)
2020-12-07 21:04:01 -03:00
sharmander
b479a43939
CPU: Implement VFNMS.F32/64 (#1758)
* Add necessary methods / op-code

* Enable Support for FMA Instruction Set

* Add Intrinsics / Assembly Opcodes for VFMSUB231XX.

* Add X86 Instructions for VFMSUB231XX

* Implement VFNMS

* Implement VFNMS Tests

* Add special cases for FMA instructions.

* Update PPTC Version

* Remove unused Op

* Move Check into Assert / Cleanup

* Rename and cleanup

* Whitespace

* Whitespace / Rename

* Re-sort

* Address final requests

* Implement VFMA.F64

* Simplify switch

* Simplify FMA Instructions into their own IntrinsicType.

* Remove whitespace

* Fix indentation

* Change tests for Vfnms -- disable inf / nan

* Move args up, not description ;)

* Undo vfma

* Completely remove vfms code.,

* Fix order of instruction in assembler
2020-12-03 20:20:02 +01:00
gdkchan
2f16491712
Get rid of Reflection.Emit dependency on CPU and Shader projects (#1626)
* Get rid of Reflection.Emit dependency on CPU and Shader projects

* Remove useless private sets

* Missed those due to the alignment
2020-10-21 09:13:44 -03:00
LDj3SNuD
04e330cc00
Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths). (#1577)
* Add Umaal & Vabd_I, Vabdl_I, Vaddl_I, Vhadd, Vqshrn, Vshll inst.s (slow paths).

No test provided (i.e. draft).

* Ptc InternalVersion = 1577
2020-10-13 22:41:33 +02:00
gdkchan
6cc187da59
SIMD&FP load/store with scale > 4 should be undefined (#1522)
* SIMD&FP load/store with scale > 4 should be undefined

* Catch more invalid encodings for FP&SIMD LDR/STR (reg variant)

* Set PTC version to PR number
2020-09-01 17:02:23 -03:00
LDj3SNuD
2cb8bd7006
CPU (A64): Add Scvtf_S_Fixed & Ucvtf_S_Fixed with Tests. (#1492) 2020-08-31 20:48:21 -03:00
LDj3SNuD
6938988427
Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests. (#1471)
* Fix Vcvt_FI & Vcvt_RM; Add Vfma_S & Vfms_S. Add Tests.

* Address PR feedback & Nit.
2020-08-13 02:34:02 -03:00
Valentin PONS
3af2ce74ec
Implements some 32-bit instructions (VBIC, VTST, VSRA) (#1192)
* Added some 32 bits instructions:

* VBIC
* VTST
* VSRA

* Incremented the PTC

* Add tests and fix implementation

* Fixed VBIC immediate opcode mapping

* Hey hey!

* Nit.

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: LDj3SNuD <dvitiello@gmail.com>
Co-authored-by: LDj3SNuD <35856442+LDj3SNuD@users.noreply.github.com>
2020-07-19 15:11:58 -03:00
LDj3SNuD
56a61a5758
CPU: A32: Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests. (#1394)
* Fix Vabs_V & Vneg_V (S8, S16, S32 & F32); add Tests.

* Update Ptc.cs
2020-07-17 10:57:49 -03:00
LDj3SNuD
88619d71b8
CPU: A32: Add Vadd & Vsub Wide (S/U_8/16/32) Inst.s with Test. (#1390) 2020-07-17 14:21:40 +10:00
LDj3SNuD
a804db6eed
Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d d… (#1335)
* Add Fmax/minv_V & S/Ushl_S Inst.s with Tests. Fix Maxps/d & Minps/d double zero sign handling. Allows better handling of NaNs.

* Optimized EmitSse2VectorIsNaNOpF() for multiple uses per opF.
2020-07-13 21:08:47 +10:00
riperiperi
d7044b10a2
Add SSE4.2 Path for CRC32, add A32 variant, add tests for non-castagnoli variants. (#1328)
* Add CRC32 A32 instructions.

* Fix CRC32 instructions.

* Add CRC intrinsic and fast path.

Loop is currently unrolled, will look into adding temp vars after tests are added.

* Begin work on Crc tests

* Fix SSE4.2 path for CRC32C, finialize tests.

* Remove unused IR path.

* Fix spacing between prefix checks.

* This should be Src.

* PTC Version

* OpCodeTable Order

* Integer check improvement. Value and Crc can be either 32 or 64 size.

* This wasn't necessary...

* If size is 3, value type must be I64.

* Fix same src+dest handling for non crc intrinsics.

* Pre-fix (ha) issue with vex encodings
2020-07-13 20:48:14 +10:00
riperiperi
9a49f8aec9
Fix VMVN (immediate), Add VPMIN, VPMAX, VMVN (register) (#1303)
* Add Vmvn (register), tests for both Vmvn variants.

* Add Vpmin, Vpmax, improve Non-FastFp accuracy for Vpadd

* Rebase on top of PTC.

* Add Nopcode

* Increment PTC version.

* Fix nits.
2020-06-24 10:43:44 +10:00
riperiperi
fa286d3535
VABS takes one input register, not two. (#1300) 2020-06-14 10:32:21 +10:00
LDj3SNuD
83d94b21d0
Add FMaxNmV & FMinNmV Inst.s with Test. (#1279)
Successful unit testing on Windows (debug and release mode).
2020-05-27 18:51:59 +02:00
LDj3SNuD
1de16f7653
Add Fcvtas_S/V & Fcvtau_S/V. (#1018) 2020-03-24 22:53:49 +01:00
Chenj168
31b94f4641
Move the MakeOp to OpCodeTable class, for reduce the use of ConcurrentDictionary (#996) 2020-03-20 15:15:37 +11:00
riperiperi
dd433c1296
Implement AESMC, AESIMC, AESE, AESD and VEOR AArch32 instructions (#982)
* Add VEOR and AES instructions.

* Add tests for crypto instructions.

* Update ValueSource name.
2020-03-14 10:29:58 +11:00
gdkchan
c26f3774bd
Implement VMULL, VMLSL, VRSHR, VQRSHRN, VQRSHRUN AArch32 instructions + other fixes (#977)
* Implement VMULL, VMLSL, VQRSHRN, VQRSHRUN AArch32 instructions plus other fixes

* Re-align opcode table

* Re-enable undefined, use subclasses to fix checks

* Add test and fix VRSHR instruction

* PR feedback
2020-03-11 11:49:27 +11:00
gdkchan
89ccec197e
Implement VMOVL and VORR.I32 AArch32 SIMD instructions (#960)
* Implement VMOVL and VORR.I32 AArch32 SIMD instructions

* Rename <dt> to <size> on test description

* Rename Widen to Long and improve VMOVL implementation a bit
2020-03-10 16:17:30 +11:00
gdkchan
ab3b6ea6d4
A64 SIMD LDP and STP with size = 0b11 is undefined (#971) 2020-03-07 13:39:52 +11:00
gdkchan
fb0939f9b6
Add SSAT, SSAT16, USAT and USAT16 ARM32 instructions (#954)
* Implement SMULWB, SMULWT, SMLAWB, SMLAWT, and add tests for some multiply instructions

* Improve test descriptions

* Rename SMULH to SMUL__

* Add SSAT, SSAT16, USAT and USAT16 ARM32 instructions

* Fix new tests

* Replace AND 0xFFFF with 16-bits zero extension (more efficient)
2020-03-01 07:51:55 +11:00