Hi, I am just testing STM32H7 code.
In a simple LED blink program, I used for loops as a simple delay:
for( int i= 0; i< 10000000; i++) iDelay++;
This code compiles to 6 assembly instructions occupying 10 to 20 bytes of code.
If the loop lies entirely inside one 32-byte page of STM32H7 flash, all is fine: it runs as expected, at 2.5 ns per instruction with the STM32H7 running at 400 MHz.
But if the loop code extends across a 32-byte flash page boundary, the complete for loop takes 5 to 20 times longer.
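To make the boundary condition concrete, here is a minimal sketch that checks whether a code span touches more than one 32-byte line, and how much padding would push it to the next boundary. The 32-byte granularity is taken from the observation above; the function names and the example addresses are hypothetical, not taken from any map file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: code is fetched from flash in 32-byte units, as observed above. */
#define FETCH_LINE_BYTES 32u

/* Returns true if the code span [start, start + size) touches more than
   one 32-byte line. Hypothetical helper for illustration only. */
static bool spans_fetch_lines(uint32_t start, uint32_t size)
{
    assert(size > 0u);
    uint32_t first_line = start / FETCH_LINE_BYTES;
    uint32_t last_line  = (start + size - 1u) / FETCH_LINE_BYTES;
    return first_line != last_line;
}

/* Number of padding bytes needed to move `start` up to the next
   32-byte boundary (0 if it is already aligned). */
static uint32_t padding_to_next_line(uint32_t start)
{
    return (FETCH_LINE_BYTES - (start % FETCH_LINE_BYTES)) % FETCH_LINE_BYTES;
}
```

With a loop address read from the listing file, a 16-byte loop at an offset of 0x18 into a line would straddle two lines and would need 8 bytes of padding (for example four 16-bit NOPs) to start on a boundary.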
This is quite annoying. I would like to use such delay loops at least in initialisation code; of course not for exact timing, but at least approximately correct.
I found an ALIGN directive in the Keil Arm assembler, but if I try to use it in an __asm{...} inline assembly block, the C compiler gives the error "#3061: unrecognized instruction opcode". I also tried "#pragma align 32", but this does not work either.
Is there an easy way to convince the C compiler to pad with NOPs automatically for such an application?
Why not just use a subroutine? If you want tightly defined assembly, put it in startup_stm32xxxx.s.
...but I use C precisely to avoid assembler...
E.g. in the clock init code I have about 10 loops of the following sort:
#define IFLOOPEND_1ms 1000000
iDelay= 0;
while( !(PWR->D3CR & PWR_D3CR_VOSRDY))
{
    if( iDelay++ > IFLOOPEND_1ms) return 0;
}
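Since about ten loops share this shape, the pattern could be collected in one small C function, so that only one piece of code needs attention regarding placement. This is only a sketch: wait_for_flag is a hypothetical helper, not part of any vendor library, and its iteration limit is still a loop count, not a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of CMSIS or the ST HAL.
   Polls *reg until (*reg & mask) is non-zero or until max_polls extra
   polls have failed. Returns true if the flag was seen, false on timeout. */
static bool wait_for_flag(const volatile uint32_t *reg, uint32_t mask,
                          uint32_t max_polls)
{
    assert(reg != NULL);
    uint32_t polls = 0u;
    while ((*reg & mask) == 0u) {
        if (polls++ >= max_polls) {
            return false;   /* flag never appeared within the budget */
        }
    }
    return true;
}
```

The loop above would then read something like `if (!wait_for_flag(&PWR->D3CR, PWR_D3CR_VOSRDY, IFLOOPEND_1ms)) return 0;`. How long the budget lasts in real time still depends on how fast the loop is fetched and executed.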
In the worst case I would really have to do this in assembler, but writing an assembly function just for this sounds a bit silly. Still, it is an idea, and I will think about it.
But I think such small for loops are quite common in C code. Older processors did not use such large flash read buffers, and they did not execute as fast, so on the STM32H7 this effect really gets quite annoying. If the C compiler supported something like #pragma align 32, that would be great.
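For comparison, GCC-compatible compilers accept an alignment attribute on functions, which comes close to the wished-for pragma. The sketch below assumes such a compiler (GCC or a clang-based toolchain); I am not certain that the older Keil compiler that reports error #3061 accepts this attribute on functions. The names crude_delay, code_address and check_delay_alignment are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC-compatible compiler. The attribute asks the toolchain
   to place the first instruction of the function on a 32-byte boundary. */
__attribute__((aligned(32), noinline))
void crude_delay(uint32_t iterations)
{
    volatile uint32_t counter = 0u;   /* volatile keeps the loop from being removed */
    for (uint32_t i = 0u; i < iterations; i++) {
        counter++;
    }
}

/* Returns the address of the function with bit 0 cleared. On Cortex-M,
   function pointers carry the Thumb bit in bit 0, so it is masked off. */
static uintptr_t code_address(void (*fn)(uint32_t))
{
    return (uintptr_t)fn & ~(uintptr_t)1u;
}

/* Debug-time check that the requested alignment was actually applied. */
static void check_delay_alignment(void)
{
    assert(code_address(crude_delay) % 32u == 0u);
}
```

Note that this aligns only the start of the function. The loop itself begins a few bytes later, so the listing file still has to confirm that the loop body fits inside one 32-byte line.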
Hi, I have tried it now.
I think I am fine with the assembly function. I did it the way you instructed somebody else in the forum.
I am just not really sure about the AREA directives in the startup.s file,
so I just copied my function to the end of startup.s, before the END statement.
If I insert my code there:
        ALIGN 32
WaitForReadyWithoutTimer PROC
        EXPORT WaitForReadyWithoutTimer
        MOV R3, #0
        PUSH {r4,lr}
        B WFRWT_LoopEnd
        ...
... then it assembles, but gives a warning: "A1479W: Requested alignment 32 is greater than area alignment 4, which has been increased"
Otherwise everything seems to work nicely and I am happy. I can also see in the listing file and in the generated code that the function is placed on a 32-byte boundary.
Is it OK to leave this warning? Is there some way to remove it? (I do not like having warnings in the build.) Should I perhaps define a special code section for this function?
AREA |.text|, CODE, READONLY, ALIGN=5
Exactly, thank you VERY much :).
The assembler then gave one further warning about 1 padding word at the end (because my code was 14 bytes long and not 16). So I appended one NOP after the return, and the assembler now seems happy too: no more warnings.
...
Meanwhile I found the "miracle command" to avoid these 32-byte flash page problems: if I call SCB_EnableICache() at the start of my main, everything runs at full speed, regardless of whether the for loops extend across 32-byte boundaries or not.
The only drawback is that the Keil disassembly view no longer works as nicely as before. It seems to have problems showing the disassembly when I click in the C code at run time. If the program is stopped at a breakpoint, it shows the code around the breakpoint correctly, but even then I think there are some problems when I scroll up or down in the disassembly.