Hey everyone, just wondering if you could perhaps point me in the right direction.
I'm doing a project on an ARM7 LPC2388 using RTX and the TcpNet library for TCP Ethernet communication. I found my system would seemingly randomly stop and become unresponsive. This was previously masked by the fact that I have a WDT in place, which would force the system to restart. I thought it was a timing issue, but when I stop the ARM7 in the debugging environment I see that it is throwing a data abort exception (it stops at the Dabt_handler). When I look at the task list, it shows the task that handles all the TcpNet calls in an underflow state.
I'm not sure how I can debug this. What sort of bugs could cause this? I'm not touching any assembly stuff in my code. Could it be because I'm calling the TcpNet timer tick from an interrupt? I've noticed that it only seems to happen when my system is being stressed quite heavily. Could this be caused by the interrupt FIFO overflowing? I've never really encountered this type of random problem before, so I'm kind of at a loss on how to proceed.
Any thoughts would help!
Mark
So I'm starting to feel like it's coming from an interrupt and the RTX task list says the TcpNet task is underflowing just because it's the task most often running. I need to investigate further.
I think we need to clarify some terminology. Is what you have really stack underflow, or overflow of a stack that's growing downward in address space?
Stack overflow is when the stack grows beyond its upper limit, i.e. you tried to store more into it than it can hold. This is both an easy mistake to make and a hard one to avoid with complete certainty, and therefore almost painfully common.
Stack underflow OTOH is when you go below the bottom of the stack, i.e. it becomes less than empty. As a primary fault this is pretty rare unless your code does some monkey business with the stack pointer itself. Avoiding such mistakes is something makers of embedded OSs like RTX have to deal with, because their systems have to manipulate stack pointers quite a lot. Code using an RTOS should not have to go near this danger zone, so I would discount that possibility for the time being.
But most of the time what appears to be stack underflow is actually stack pointer corruption, typically by a local buffer over-/underflow corrupting stack pointer or frame pointer contents currently saved on the stack. On return from a function, the CPU may load those into the stack pointer. More likely than not, the stack pointer will then point nowhere near the actual stack region any more, and have about equal probability of looking like overflow or underflow.
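One way to tell genuine overflow apart from pointer corruption is to watermark the task stack and see how much of it was ever used. The following is only a minimal sketch under assumptions: the names stack_paint, stack_unused_words and STACK_FILL are made up for illustration and are not RTX functions, and it assumes you have access to the task's stack array and that the stack grows downward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fill pattern; any value unlikely to occur in real data will do. */
#define STACK_FILL 0xA5A5A5A5u

/* Paint the whole stack region with the pattern before the task starts. */
void stack_paint(uint32_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* Count words that still hold the pattern, starting at the lowest address.
 * With a downward-growing stack the lowest address is the overflow limit,
 * so this is the headroom the task has never touched. */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    size_t n = 0;
    while (n < words && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

If the headroom is still comfortably large after a crash, the stack most likely never really overflowed and a corrupted stack pointer is the better suspect. If it reads zero, the task simply needs a bigger stack.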
It always amuses me to see people post responses in forums when they don't know the key point of the subject.
You were right in your first wonderings about whether it is safe to call the TcpNet ticker from an ISR.
The answer is that you should not. Check the relevant documents and look at the Keil examples.
You said: "I put it into a timer interrupt since the function reference seems to say you can."
That is very interesting. I'm reasonably sure that the documentation has changed on that point, because it used to give an example of using a separate dedicated task to periodically call the timer_tick function. I know from some previous communication with Keil technical support that all the function does (or did at the time) was set a flag that was picked up during the call to main_TcpNet, so even then it could have been called from an ISR. But they said that you shouldn't, because they might change the implementation details.
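If you want to stay independent of those implementation details, you can apply the same flag idea on your own side: the ISR only records that a tick happened, and the task that owns the network stack makes the actual library call. This is just a sketch of the pattern, not Keil's code. net_tick_stub is a made-up stand-in for the library tick call, and timer_isr and net_task_poll are hypothetical names for your own timer interrupt and task loop body.

```c
#include <assert.h>
#include <stdbool.h>

/* Set by the ISR, cleared by the task. volatile so the task re-reads it. */
volatile bool tick_pending = false;

/* Counts how often the stand-in tick function ran (for demonstration only). */
unsigned ticks_handled = 0;

/* Hypothetical stand-in for the library's timer tick call. */
void net_tick_stub(void)
{
    ticks_handled++;
}

/* Timer interrupt: do nothing except note that a tick is due. */
void timer_isr(void)
{
    tick_pending = true;
}

/* Called repeatedly from the task that owns the network stack. */
void net_task_poll(void)
{
    if (tick_pending) {
        tick_pending = false;   /* clear first so a new tick is not lost */
        net_tick_stub();        /* library call made in task context */
    }
    /* the real task would run the network stack's main function here */
}
```

Note that a single flag merges ticks that arrive while the task is starved, so under heavy load the stack's timers would run slow rather than crash. Whether that is acceptable depends on your timeouts.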
But you're obviously right about calling os_ functions in an ISR. Even now that is a documented no-no.