RISC-V ASM unaligned read/writes: alternative assembly #10530
Jenkins: retest this please
I would like to push back on this implementation a bit. Currently, every instruction that could perform an unaligned access gets emulated. I think it makes sense to check whether the pointers are actually unaligned and only then use the more costly emulation. When the data is aligned, the performance impact should be negligible. The size overhead is also small: the check should be a few instructions at most.
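As a rough C sketch of this suggestion (not code from the PR): the helper names `load32_bytes` and `load32_checked` are hypothetical, and a little-endian target is assumed, as on RISC-V. The 4-byte `memcpy` stands in for the single `lw` the assembly would use on the aligned path.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical emulation path: assemble the word from four byte loads. */
static uint32_t load32_bytes(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical checked load: pay for the emulation only when needed. */
static uint32_t load32_checked(const unsigned char *p)
{
    if (((uintptr_t)p & 3u) == 0) {
        /* Aligned: stands in for a single word load (lw). */
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }
    /* Unaligned: fall back to the byte-wise sequence. */
    return load32_bytes(p);
}
```

The aligned case costs one mask and one branch on top of the plain load.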
On a separate issue: In my opinion readability would be increased if you just used the EDIT: This is probably obsolete if the suggestion above gets adopted.
I've modified the macros to check the alignment and choose the sequence to use. Thanks,
I see the following pattern several times:
addi t, p, o
andi t, t, <alignment mask>
bnez t, <label>
----
t: scratch register
p: register holding the base address
o: offset
This checks the alignment of p+o, but since o is always a valid offset, i.e. a multiple of the alignment (is it not?), p+o is aligned exactly when p is, so it is sufficient to check p only.
One instruction can then be saved by doing:
andi t, p, <alignment mask>
bnez t, <label>
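The equivalence behind the shorter sequence can be sketched in C. Both function names are hypothetical, and the sketch assumes the mask is the alignment minus one (for example 3 for 4-byte accesses). The two checks agree only when the offset is a multiple of the alignment.

```c
#include <stdint.h>

/* Mirrors addi/andi/bnez: tests the full effective address p + o. */
static int misaligned_full(uintptr_t p, uintptr_t o, uintptr_t mask)
{
    return ((p + o) & mask) != 0;
}

/* Mirrors andi/bnez: tests the base address only. */
static int misaligned_base(uintptr_t p, uintptr_t mask)
{
    return (p & mask) != 0;
}
```

With mask 3, p = 4 and o = 8, both checks report an aligned access. With o = 2 the full check reports a misaligned access while the base-only check does not, so the shortcut depends on every offset being a multiple of the alignment.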
One issue I see is that there is a lot of double checking for alignments:
The bulk variants call the underlying macro N times, but since the alignment check is done only in the underlying macro, the alignment also gets checked N times.
In case the data is actually aligned there should only be one check.
I'd argue that most of the time the data is (or at the very least should be) aligned, so I'd try to keep the overhead for this case as small as possible.
If the data is unaligned, performance will take a hit even on hardware that supports misaligned loads and stores, e.g. when accesses cross cache-line boundaries.
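One way to picture the single check is the C sketch below. It is an illustration under assumptions rather than the PR's macros: `bulk_load32` and `word_from_bytes` are hypothetical names, the target is assumed little-endian, and the 4-byte `memcpy` stands in for a plain `lw`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte-wise emulation of one 32-bit load. */
static uint32_t word_from_bytes(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical bulk load: the alignment is tested once, not n times. */
static void bulk_load32(uint32_t *out, const unsigned char *p, size_t n)
{
    size_t i;

    if (((uintptr_t)p & 3u) == 0) {
        /* All n words are aligned because each offset is a multiple of 4. */
        for (i = 0; i < n; i++) {
            memcpy(&out[i], p + 4 * i, 4);
        }
    } else {
        /* All n words are unaligned, so emulate every load. */
        for (i = 0; i < n; i++) {
            out[i] = word_from_bytes(p + 4 * i);
        }
    }
}
```

Hoisting the test out of the loop keeps the aligned case at one mask and one branch regardless of N.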
Changed it so that the alignment is checked once and then many accesses are done at once, which reduces the overhead.
It tests for the largest alignment first so that the smallest number of instructions is used.
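The "largest alignment first" ordering might look roughly like the following C sketch for one 64-bit value. The function name `load64_by_alignment` is hypothetical, the target is assumed little-endian, and each `memcpy` stands in for the `ld`, `lw` or `lh` instructions the assembly would issue. The actual macros in the PR may be structured differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit load that picks the widest access the pointer allows. */
static uint64_t load64_by_alignment(const unsigned char *p)
{
    uintptr_t a = (uintptr_t)p;
    uint64_t v = 0;
    int i;

    if ((a & 7u) == 0) {
        /* 8-byte aligned: one doubleword load. */
        memcpy(&v, p, 8);
    } else if ((a & 3u) == 0) {
        /* 4-byte aligned: two word loads. */
        uint32_t w[2];
        memcpy(w, p, 8);
        v = (uint64_t)w[0] | ((uint64_t)w[1] << 32);
    } else if ((a & 1u) == 0) {
        /* 2-byte aligned: four halfword loads. */
        uint16_t h[4];
        memcpy(h, p, 8);
        for (i = 0; i < 4; i++) {
            v |= (uint64_t)h[i] << (16 * i);
        }
    } else {
        /* Odd address: eight byte loads. */
        for (i = 0; i < 8; i++) {
            v |= (uint64_t)p[i] << (8 * i);
        }
    }
    return v;
}
```

Testing the widest alignment first means the common aligned case exits after a single check and uses the fewest loads.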
Description
Not all RISC-V chips allow unaligned reads and writes with basic assembly instructions like lw/sw.
Add alternative assembly that is turned on with:
WOLFSSL_RISCV_ASM_NO_UNALIGNED.
Fixes #10525
Testing
./configure --disable-shared LDFLAGS=--static --host=riscv64 CC=riscv64-linux-gnu-gcc --enable-riscv-asm CFLAGS=-DWOLFSSL_RISCV_ASM_NO_UNALIGNED