-
Notifications
You must be signed in to change notification settings - Fork 1.6k
S390x: emit new instructions added in z17 #12319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ae1c56a to
fac9929
Compare
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:x64", "cranelift:meta", "isle"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
|
I'm going to shift review of this over to @uweigand |
|
or, well, I can't officially do that, but @uweigand I'm happy to rubber-stamp once you've approved |
uweigand
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to me, but see inline comments.
In addition to what is implemented here, we now could implement vector integer division for 32-bit and 64-bit integer vectors - but there is currently no ISLE to even express this.
| (cmov_imm $I64 (intcc_as_cond (IntCC.Equal)) 0 x))) | ||
|
|
||
| ;; Implement `sdiv` for 128-bit integers on z17 (only). | ||
| ;; FIXME: integer-overflow check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to fix this before committing. (Also, there probably should be run-time test validating the correct behaviour like for other divison operations.)
| (rule 16 (lower (has_type (and (vxrs_ext3_enabled) (vr128_ty ty)) (band (band y z) (bnot x)))) | ||
| (vec_eval ty 0b00000010 x y z)) | ||
| (rule 17 (lower (has_type (and (vxrs_ext3_enabled) (vr128_ty ty)) (band (band x (bnot y)) z))) | ||
| (vec_eval ty 0b00000010 y x z)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what if any canonicalization is done at the ISLE level here, but these four don't cover all possible combinations. E.g. (band x (band (bnot y) z)) is not covered.
I guess a more fundamental question is which combinations we should be covering. For example, why cover and-not with three inputs but not or-not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stopped because I realized this would add hundreds of rules to get correct, and ran out of steam pretty quickly. I think it's an open question: what should we encode? what 3-input binary operations are actually used?
| ; block0: ; offset 0x0 | ||
| ; .byte 0xe7, 0x8a | ||
| ; .byte 0x80, 0x02 | ||
| ; .byte 0x9f, 0x88 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add z17 insns to the disassembler, but that is of course a different patch.
This emits & tests a bunch of instructions:
* from Miscellaneous-Instruction-Extensions Facility 4:
* CLZ, 64bit
* CTZ, 64bit
* from Vector-Enhancements Facility 3:
* 32x4, 64x2 & 128x1 variants of the following:
* Divide
* Remainder
* 64x2 & 128x1 multiply variants
* 128x1 vaiants of:
* Compare
* CLZ
* CTZ
* Max
* Min
* Average
* Negation
* Evaluate (3-input and, 3-input or, atm)
Co-authored-by: Jimmy Brisson <[email protected]>
Now that s390x implements blendv as well, we should refer to the instruction without the x86 prefix.
fac9929 to
73c5595
Compare
Z17 (arch15) includes some instructions that allow us to encode some more complicated operations in fewer instructions. This PR adds support to cranelift-codegen to emit these newer instructions when appropriate.
Further, Z17 includes a VBLEND instruction that mimics the same instruction on x64. Since this is no longer an x64-exclusive instruction type, I've renamed the appropriate stuff within cranelift codegen to reflect that this is not ISA-specific anymore.