| Details | Message |
|---|---|
Read-Only Author Dragi Pandeliev Posted 1-Sep-2010 15:35 GMT Toolset ARM |  Optimizing FIQ Dragi Pandeliev I have SPI interrupt defined as FIQ and the execution time for it should be less than 12us, but adding a few checks and the execution time is going out of the limit. I've seen that it takes 2-3us for the ARM7TDMI to enter the FIQ. Can you give me some ideas on how to make it enter or execute more quickly? What should I use/avoid? I realized I should avoid using % because it takes about 3us to compute a%100! Any advice is welcome. |
|
Read-Only Author Per Westermark Posted 1-Sep-2010 15:41 GMT Toolset ARM |  RE: Optimizing FIQ Per Westermark ARM has documentation mentioning the clock cycles required for all assembler instructions. Either write your code in assembler, or write the code in C and take a peek at the generated assembler output to make sure that you are familiar with the cost of different C constructs. Avoid thinking of instructions as taking 3us. Think of them as taking a specific number of clock cycles. Then decide what clock frequency you will have your processor core run. Knowing the required response time and the selected clock frequency will tell you the maximum number of clock cycles your FIQ may consume. |
|
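The cycle-budget reasoning in the post above can be sketched in a few lines of C. This is only an illustration, not code from the thread: the function names `cycle_budget` and `cycles_to_ns` are hypothetical, and the clock and deadline values are whatever your design uses.

```c
#include <stdint.h>

/* Number of core clock cycles that fit into a deadline given in microseconds.
 * Hypothetical helper for illustration; not part of any vendor library. */
uint32_t cycle_budget(uint32_t clock_hz, uint32_t deadline_us)
{
    /* 64-bit intermediate avoids overflow for fast clocks or long deadlines. */
    return (uint32_t)(((uint64_t)clock_hz * deadline_us) / 1000000u);
}

/* Time in nanoseconds consumed by a given number of clock cycles. */
uint32_t cycles_to_ns(uint32_t clock_hz, uint32_t cycles)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / clock_hz);
}
```

For example, `cycle_budget(20000000u, 12u)` gives 240 cycles for a 12us deadline at 20 MHz, and `cycles_to_ns(20000000u, 60u)` shows that 60 cycles cost 3000 ns at that clock. Summing the documented cycle counts of the instructions in the ISR and comparing the total against this budget is the check being suggested.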
Read-Only Author Tamir Michael Posted 1-Sep-2010 15:43 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael 12us is an extremely tight time budget - about the execution duration of a single instruction at 72[MHz]! |
|
Read-Only Author Christoph Franck Posted 1-Sep-2010 15:52 GMT Toolset ARM |  RE: Optimizing FIQ Christoph Franck "12us is an extremely tight time budget - about the execution duration of a single instruction at 72[MHz]!" I think you're off by a few orders of magnitude, Tamir. ;) (Or did you read ns instead of us?) I have a Cortex-M3 part sitting here, happily handling quite a bit of processing (8 multiply-adds) in about 2us at 64 MHz. |
|
Read-Only Author Christoph Franck Posted 1-Sep-2010 15:50 GMT Toolset ARM |  RE: Optimizing FIQ Christoph Franck "I have SPI interrupt defined as FIQ and the execution time for it should be less than 12us, but adding a few checks and the execution time is going out of the limit. I've seen that it takes 2-3us for the ARM7TDMI to enter the FIQ." It would be interesting to know what frequency your uC is running at. "What should I use/avoid?" Avoid: * Anything that calls library functions. * Any function calls. * Multiplication, division and modulo operators (unless they are by powers of two, in which case the compiler should replace them with shifts or ANDs respectively). * Byte and halfword memory accesses. Use signed/unsigned ints (32 bit integers) instead. * Doing too much stuff inside the ISR. Consider very carefully what must be done inside the ISR, and what can be done outside the ISR. * Having the operating system interfere in any way with the ISR, e.g. by adding an OS-specific preamble. If, after all optimizations, you still cannot reach the target of 12 us, consider switching to a Cortex-M3 part. The interrupt system and controller of the ARM7TDMI are mediocre at best (the architecture wasn't designed to be used in microcontroller applications in the first place and hence isn't designed for ultra-low latency interrupts). |
|
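Two ways of avoiding the `%` operator mentioned in the list above can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `mod_pow2` and `next_wrapped` are hypothetical, the mask form only works for power-of-two divisors, and the wrapping counter only helps when the value advances one step at a time (it is not a general replacement for `a % 100`).

```c
#include <stdint.h>

/* Modulo by a power of two reduces to a bit mask for unsigned values.
 * Only valid when pow2 is a power of two (1, 2, 4, 8, ...). */
uint32_t mod_pow2(uint32_t x, uint32_t pow2)
{
    return x & (pow2 - 1u);
}

/* Advance a counter that wraps to zero at `limit` without using '%'.
 * Assumes counter < limit on entry. A compare and a conditional store
 * replace the division that '%' would otherwise require. */
uint32_t next_wrapped(uint32_t counter, uint32_t limit)
{
    counter++;
    if (counter >= limit) {
        counter = 0u;
    }
    return counter;
}
```

So `mod_pow2(x, 64u)` matches `x % 64u`, and a value that would have been written as `(n + 1) % 100` can be kept with `n = next_wrapped(n, 100u)`. The ARM7TDMI has no divide instruction, so a general `%` typically ends up as a call into a runtime division routine, which is why it is so expensive inside an ISR.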
Read-Only Author Tamir Michael Posted 1-Sep-2010 16:06 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael Holy smoke! It has been a VERY long day indeed! Sorry OP! |
|
Read-Only Author Dragi Pandeliev Posted 2-Sep-2010 07:00 GMT Toolset ARM |  RE: Optimizing FIQ Dragi Pandeliev My uC is running at full speed - 20MHz and I must be able to work with SPI baudrates up to 1MHz at full duplex, i.e. during the reception of the bytes from the Master (I'm the Slave) I must make some measurements and computations (they have to be made during the SPI transfer) and respond on time. Therefore all this has to be done inside the SPI ISR. I avoid function calls and anything that calls library functions, and the OS doesn't affect the execution of my ISR in any way. I'll try using only 32bit integers, but I cannot avoid multiplication. Unfortunately I cannot switch to another uC, because the one I use has some unique features which are needed for my application. |
|
Read-Only Author Tamir Michael Posted 2-Sep-2010 07:21 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael If you post the ISR code, maybe somebody will be able to suggest improvements. |
|
Read-Only Author Christoph Franck Posted 2-Sep-2010 08:56 GMT Toolset ARM |  RE: Optimizing FIQ Christoph Franck "My uC is running at full speed - 20MHz" Hm ... what's the exact type of uC you're using? 20 MHz sounds a bit low for an ARM7TDMI. Or are you using a custom piece of hardware with the processor core on it, e.g. an FPGA, ASIC or similar? Twelve microseconds translates into 240 processor clock cycles at 20 MHz, some of which are overhead for entering the ISR. You'll probably have to analyze the compiler's assembly output (set the compiler to generate an assembly file in addition to an object file in order to see it) and then either modify the assembly file to suit your needs, or write the whole ISR in assembly yourself (might be easier, as you'll know exactly what it is doing and don't have to mess with what the compiler generated). Are you using any loops inside the ISR? It might be worth trying to manually "unroll" them, since actual loops carry a fairly hefty overhead on the ARM architecture. Example:
for(i = 8; i > 0; --i)
{ *dest++ = *coef++ * *src++; }
unrolled would be
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
*dest++ = *coef++ * *src++;
This uses more code memory, but can be significantly faster. |
|
Read-Only Author Steffen Rose Posted 2-Sep-2010 09:07 GMT Toolset ARM |  RE: Optimizing FIQ Steffen Rose For 20MHz this is irrelevant, but if you run at a speed that requires wait states on Flash access, it can be a good idea to move this time critical interrupt in the RAM. |
|
Read-Only Author Dragi Pandeliev Posted 2-Sep-2010 11:20 GMT Toolset ARM |  RE: Optimizing FIQ Dragi Pandeliev "...it can be a good idea to move this time critical interrupt in the RAM." Can you explain me how to do it? I was searching in the help about it, but with no success. |
|
Read-Only Author Christoph Franck Posted 2-Sep-2010 11:53 GMT Toolset ARM |  RE: Optimizing FIQ Christoph Franck "Can you explain me how to do it? I was searching in the help about it, but with no success." Does your flash/rom/... actually require wait states if the processor clock is 20 MHz? Because if it does not, there is no point in moving the function to RAM. For the exact procedure, you will need to refer to the linker manual. You basically need to place the function in a section that is located in RAM, and which is initialized at system startup. |
|
Read-Only Author Dragi Pandeliev Posted 7-Sep-2010 07:07 GMT Toolset ARM |  RE: Optimizing FIQ Dragi Pandeliev Thanks to all of you. I managed to lower the execution time to less than 12us. I moved the ISR into RAM and it helped a lot. I've also made some changes to the state machine I'm using (broke it into smaller states) and finally it is good. |
|
Read-Only Author Mike Hummel Posted 1-Oct-2010 15:12 GMT Toolset ARM |  RE: Optimizing FIQ Mike Hummel It seems the Realview Compiler doesn't distinguish between IRQ and FIQ when deciding what registers to push on the stack at the beginning of the interrupt service routine. You would think it would use the FIQ shadow registers instead of pushing them on the stack, but it doesn't. It's a real pain, but if you can write your FIQ service in assembler you can do this, and/or only save the registers you use. It astonishes me that, at least to my knowledge, ARM hasn't addressed this. I would think they would do everything possible to make FIQ truly a FAST interrupt. |
|
Read-Only Author Tamir Michael Posted 2-Oct-2010 08:14 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael I guess you can write some assembly in the startup code inside the FIQ vector itself if you really need a fast response. |
|
Read-Only Author Neville Alexander Posted 2-Oct-2010 10:29 GMT Toolset ARM |  RE: Optimizing FIQ Neville Alexander "I guess you can write some assembly in the startup code inside the FIQ vector itself if you really need a fast response." No need to guess. The processor supports it and the development tools support it. |
|
Read-Only Author Dragi Pandeliev Posted 4-Oct-2010 07:19 GMT Toolset ARM |  RE: Optimizing FIQ Dragi Pandeliev Having code in RAM was not the most safety solution, so my big switch was replaced by small functions, one for each state. I now have a table of pointers to these functions, and one variable that selects the next state (as an index into the table). This way, the compiler no longer saves the registers needed by all the states on every interrupt; each function saves only the registers it needs. The execution time is now between 5us and 10us, which is much better. Writing the ISR in assembler is hard for me, and I suppose will be hard to maintain, so it is not the preferred solution. And the compiler won't help me in the near future, because Keil has not planned to fully support FIQ yet... |
|
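The table-of-handlers approach described in the post above might look roughly like the following. This is a sketch under loud assumptions: the states, the start marker `0xA5`, the echo/XOR replies and every identifier (`spi_isr_body`, `next_state`, `tx_byte`, the `on_*` handlers) are hypothetical, since the actual ISR was never posted.

```c
#include <stdint.h>

/* Hypothetical protocol states; the real state set is not given in the thread. */
typedef enum { ST_WAIT_START = 0, ST_COMMAND, ST_REPLY, ST_COUNT } state_t;

typedef void (*state_handler_t)(uint8_t rx);

volatile state_t next_state = ST_WAIT_START; /* index of the handler to run next */
volatile uint8_t tx_byte = 0u;               /* byte to send on the next transfer */
static uint8_t command;                      /* last command byte received */

static void on_wait_start(uint8_t rx)
{
    if (rx == 0xA5u) {                       /* assumed start-of-frame marker */
        next_state = ST_COMMAND;
    }
    tx_byte = 0u;
}

static void on_command(uint8_t rx)
{
    command = rx;
    tx_byte = rx;                            /* echo the command back */
    next_state = ST_REPLY;
}

static void on_reply(uint8_t rx)
{
    tx_byte = (uint8_t)(command ^ rx);       /* placeholder computation */
    next_state = ST_WAIT_START;
}

/* One small handler per state, selected by index instead of a big switch. */
static const state_handler_t handlers[ST_COUNT] = {
    on_wait_start, on_command, on_reply
};

/* Body of the interrupt handler: dispatch the received byte to the current state. */
void spi_isr_body(uint8_t rx)
{
    handlers[next_state](rx);
}
```

Each handler is compiled as its own function, so its prologue saves only the registers that particular state uses, which is the effect reported in the post. The price is one indirect call per interrupt, so it is worth confirming the gain in the generated assembly.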
Read-Only Author Andy Neil Posted 4-Oct-2010 07:29 GMT Toolset None |  RE: Writing the ISR in assembler ... will be hard to maintain Andy Neil Only if you make it so! It's entirely up to you whether you implement it in such a way, and provide adequate documentation, to make it maintainable! But, of course, getting absolute maximum speed is almost always going to require "clever tricks" in the code - so the burden is on you to document them very clearly and completely. |
|
Read-Only Author Steffen Rose Posted 8-Oct-2010 13:16 GMT Toolset ARM |  RE: Optimizing FIQ Steffen Rose "Having code in RAM was not the most safety solution" You mean that buggy software can overwrite this code? I think the same argument can also be made about your function pointers. |
|
Read-Only Author Tamir Michael Posted 8-Oct-2010 13:44 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael "Having code in RAM was not the most safety solution" Sometimes there is no other way. Internal flash is full, code stored in NOR flash, but device access times are unacceptable. Scatter load to RAM, all is well. |
|
Read-Only Author Dragi Pandeliev Posted 9-Oct-2010 08:28 GMT Toolset ARM |  RE: Optimizing FIQ Dragi Pandeliev It is not safe because of EMI. RAM is more vulnerable to EMI than Flash. |
|
Read-Only Author Tamir Michael Posted 9-Oct-2010 13:48 GMT Toolset ARM |  RE: Optimizing FIQ Tamir Michael If you really insist, code in RAM can be guarded by checksums just like data in RAM. Don't you think corrupt data in RAM can cause as much damage as corrupt code in RAM? |
|
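Guarding a RAM-resident code region with a checksum, as suggested in the post above, could be sketched like this. It is an illustration only: the rotate-and-XOR checksum and the names `region_checksum` and `region_is_intact` are assumptions chosen for brevity, not a method named in the thread, and a real design might prefer a CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Word-wise rotate-and-XOR checksum over a memory region.
 * Hypothetical scheme for illustration; the rotation makes word order matter. */
uint32_t region_checksum(const uint32_t *start, size_t words)
{
    uint32_t sum = 0u;
    while (words--) {
        sum = (sum << 1) | (sum >> 31);   /* rotate left by one bit */
        sum ^= *start++;
    }
    return sum;
}

/* Returns nonzero when the region still matches the reference checksum
 * that was computed right after the code was copied into RAM. */
int region_is_intact(const uint32_t *start, size_t words, uint32_t expected)
{
    return region_checksum(start, words) == expected;
}
```

The reference value would be computed once at startup, after the function has been copied into RAM, and `region_is_intact` would then be called periodically from background code rather than from the FIQ itself. Any single flipped bit changes the result, but such a check only detects corruption between two checks; it cannot prevent corrupted code from running in the meantime.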
Read-Only Author Per Westermark Posted 9-Oct-2010 14:45 GMT Toolset ARM |  RE: Optimizing FIQ Per Westermark Code in RAM can definitely result in larger problems than data in RAM. With data in RAM, undamaged code can keep multiple copies and perform multiple evaluations to verify that the multiple computations give similar results. With code in RAM, the program may do just about anything. Code corruption will hopefully result in a trap after the program tries to access protected memory areas or gets an invalid opcode. But the only real protection from code corruption is to have multiple processors connected to some majority-vote hardware. But an important issue here is that the stack stores not just parameter data but also return addresses, which are basically function pointers. So stack corruption can result in random code execution just as if processor instructions were stored in RAM. And it's impossible to use any checksumming to protect from stack corruption. We need microcontrollers with ECC-protected RAM if we really want to reduce the dangers from RAM corruption caused by EMI, radioactivity etc. |
|