All,
I am trying to track down a problem in some code that was written by an 'overseas' 3rd party (I will be nice and not state the country of origin).
This code uses a single timer set at a 1 ms interrupt rate in order to determine if the SPI is still communicating externally with its master. If no communications are detected, the timer ISR resets the SPI port, clears the interrupt, and jumps to the reset vector. The problem is: the SPI port remains dead until the board is power cycled.
The obvious fix is to use the watchdog (which is what I will eventually do), but I would like to understand the why of why this does not work (yes, bad coding practice is the real reason)...
Since the code jumps to the reset vector this is what I have been able to analyze:
(1) Since this is not a true reset (ie: via watchdog) all hardware registers are not reset - problem potential here.
(2) The jump to the reset vector was accomplished while in supervisor mode, so the privileged registers (ie: SP,etc) can be written.
(3) The timer interrupt was cleared prior to making the jump to the reset vector, so all interrupts are still enabled.
(4) Since this is not a true reset, all resident code can still execute (ie: interrupt handlers).
(5) The startup code will reset all initialized data, registers, etc prior to jumping to program main(), effectively returning data to a power up state.
One reason I can currently come up with as to why the SPI is never functional after this occurs is that maybe an interrupt occurs while in the startup code (clearing a tracking variable or resetting the processor registers). But the interrupt would also inhibit the startup code until it was serviced. This potential cause is (probably) not the only reason for this issue, and why I am asking for your input(s).
Unfortunately, this board has no JTAG connector, so stepping through the code is not an option. I could write to the serial port if it were connected, but it isn't. Right now I am trying to analyze my way through this code before using a 'hammer' approach to solving this problem.
What else am I missing in this analysis? Thanks.
Not being an ARM guy I don't know, but I could speculate:
If the jump to reset occurs from within the timer interrupt service routine then the return-from-interrupt instruction at the end of it will not have been executed. This might leave the whole interrupt subsystem out of kilter in some way.
Hey Jack,
An assembly code snapshot shows the following on an ISR entry:
00008e b530 PUSH {r4,r5,lr}
and on exit:
0001f2 bd30 POP {r4,r5,pc}
which returns to the interrupted code.
I don't know that there is a 'RETI' type instruction that does anything more than what you see posted here.
That said, it seems to me that while these registers will remain on the stack, I don't see it as a problem in that the SP is reset in the startup code anyway. And wouldn't another interrupt just push/pop its registers on 'top' of these anyway?
Thanks for the input though...
are you really jumping to the reset vector using a branch instruction? that is asking for trouble. the status of the VIC and other peripherals may not be defined. you must use a watchdog (I must say that this system sounds awkward to me, but that is not the subject...)
do note that you are forgetting something very important: stacks cannot be configured while not in supervisor mode (at least, on an LPC24xx). thus, the MRS, MSR instructions will have no effect. jumping to the reset vector will leave you in user mode. you must reset the controller via a watchdog (or use a SWI function...)!
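The mode question can be worked through on paper. Below is a minimal host-side sketch in C, assuming an ARM7TDMI-class core (the thread mentions LPC24xx but never confirms the part). The function names are made up for illustration; only the CPSR mode and mask bit values come from the ARM architecture. It models three things: a real reset puts the core in Supervisor mode with IRQ and FIQ masked, IRQ entry switches the core to IRQ mode with IRQs masked, and a plain jump to address 0 leaves the CPSR exactly as it was.

```c
#include <assert.h>
#include <stdint.h>

/* ARM CPSR control bits (architectural values for ARM7TDMI-class cores). */
#define MODE_MASK 0x1Fu
#define MODE_USR  0x10u  /* User mode, unprivileged */
#define MODE_IRQ  0x12u  /* IRQ mode, privileged    */
#define MODE_SVC  0x13u  /* Supervisor, privileged  */
#define I_BIT     0x80u  /* set = IRQ masked        */
#define F_BIT     0x40u  /* set = FIQ masked        */

/* CPSR control byte after a genuine reset: Supervisor mode, IRQ and FIQ masked. */
static uint32_t cpsr_after_reset(void)
{
    return MODE_SVC | I_BIT | F_BIT;
}

/* CPSR control byte on IRQ entry: IRQ mode, IRQ masked, FIQ mask unchanged. */
static uint32_t cpsr_on_irq_entry(uint32_t cpsr)
{
    return (cpsr & F_BIT) | MODE_IRQ | I_BIT;
}

/* A branch (or LDR PC) to address 0 only changes the PC; the CPSR is untouched. */
static uint32_t cpsr_after_jump_to_zero(uint32_t cpsr)
{
    return cpsr;
}

/* Model of MSR CPSR_c: the control byte is writable only in a privileged mode. */
static uint32_t msr_cpsr_c(uint32_t cpsr, uint32_t value)
{
    assert((value & ~0xFFu) == 0u);      /* only the control byte is modeled */
    if ((cpsr & MODE_MASK) == MODE_USR) {
        return cpsr;                     /* write is ignored in User mode */
    }
    return (cpsr & ~0xFFu) | value;
}

static int is_privileged(uint32_t cpsr)
{
    return (cpsr & MODE_MASK) != MODE_USR;
}
```

Under this model, a jump to 0 taken inside the timer ISR reaches the startup code in IRQ mode (0x92 in the control byte) rather than Supervisor mode (0xD3 after a real reset). IRQ mode is still privileged, so a startup MSR that switches modes to set up the stacks would still take effect, and IRQs stay masked at the core unless the handler re-enabled them for nesting. The same jump made from User mode would leave the MSR writes ignored. None of this says anything about the VIC or the SPI peripheral, which the core reset state does not touch.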
Can you reproduce it on a board that does have JTAG?
Tamir,
The jump to the reset vector happens within the timer ISR.
Note point:
(2) The jump to the reset vector was accomplished while in supervisor mode, so the privileged registers (ie: SP,etc) can be written.
Thanks.
Andy,
If it were only that easy... Development was done way before I got here and boards are all equivalent. Might need to get an eval board though - good thought.
Note initial post:
"The obvious fix is to use the watchdog (which is what I will eventually do), but I would like to understand the why of why this does not work (yes, bad coding practice is the real reason)..."
I am trying to understand the 'why' - I know what the solution/remedy should be.
Actually, the code that invokes reset does not use the BX instruction. This is how it is done:
LDR PC, =0x00000000
Effectively the same as a BX? or not?
Need to probably research this...
Most likely some initialization code assumes that the MCU is fresh out of a hardware reset, so some deep analysis of that code would be necessary to see what a simple 'jump to 0' would do. Some ARM-based MCUs have other means of performing a software reset in addition to the watchdog timer. Perhaps that would be the easiest workaround?
Another thought occurred to me that I'd like to put up for review....
Let's say that the reset sequence is OK, by some sort of bad design luck.
If the slave's SPI was reset and then comes up while the master is actually sending data on the port, say on bit 3 of the data byte, the slave SPI would be out of sync with the data transmitted and always receive bad data.
Possible?
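The bit-slip scenario is easy to model on a host. This is a rough sketch in C, not the board's SPI driver: `slave_receive` is a hypothetical helper that reassembles bytes the way a slave would if it started shifting some number of clocks into the master's MSB-first stream. It assumes the slave keeps counting clocks with nothing to re-frame it, for example because slave select stays asserted or is ignored; a slave that resets its bit counter when slave select is released would normally resynchronize at the next frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Reassemble bytes as seen by a slave that begins sampling `skip` bit clocks
 * into the master's MSB-first stream. Returns the number of whole bytes
 * written to rx; trailing bits that do not fill a byte are dropped.
 */
static size_t slave_receive(const uint8_t *tx, size_t tx_len, unsigned skip,
                            uint8_t *rx, size_t rx_cap)
{
    size_t total_bits = tx_len * 8u;
    size_t out = 0;
    uint8_t shift = 0;
    unsigned count = 0;

    assert(tx != NULL && rx != NULL);

    for (size_t bit = skip; bit < total_bits && out < rx_cap; ++bit) {
        /* Pick the bit the master drives on this clock, MSB first. */
        uint8_t level = (uint8_t)((tx[bit / 8u] >> (7u - (bit % 8u))) & 1u);

        shift = (uint8_t)((shift << 1) | level);
        if (++count == 8u) {     /* slave believes a full byte has arrived */
            rx[out++] = shift;
            shift = 0;
            count = 0;
        }
    }
    return out;
}
```

With the master sending 0xA5 0x3C 0x0F and the slave coming up on bit 3, the slave sees 0x29 0xE0: every byte straddles two of the master's bytes, and it stays that way until something re-frames the transfer.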
(1) Since this is not a true reset (ie: via watchdog) all hardware registers are not reset - problem potential here.
it is a common code monkey belief that jumping to the reset vector actually does a reset.
Try timing out the watchdog instead.
Erik
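A forced watchdog reset could look roughly like the sketch below. It is only an illustration: the register block `wdt_regs_t` and the function `wdt_arm_shortest_timeout` are hypothetical, and the layout (mode, timeout constant, feed register with a 0xAA/0x55 feed sequence) is modeled on LPC2xxx-style watchdogs. The thread never confirms the exact part, so the bit positions, minimum timeout and base address must be checked against the user manual. The registers are passed in by pointer so the logic can be exercised on a host with an ordinary struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block; layout assumed from LPC2xxx-style watchdogs. */
typedef struct {
    volatile uint32_t mod;   /* mode: bit 0 = enable, bit 1 = reset on timeout */
    volatile uint32_t tc;    /* timeout reload value                           */
    volatile uint32_t feed;  /* feed sequence register                         */
} wdt_regs_t;

#define WDT_EN     (1u << 0)
#define WDT_RESET  (1u << 1)
#define WDT_MIN_TC 0xFFu     /* assumed smallest accepted reload value */

/* Arm the watchdog with the shortest timeout so that it resets the chip. */
static void wdt_arm_shortest_timeout(wdt_regs_t *wdt)
{
    assert(wdt != NULL);

    wdt->tc  = WDT_MIN_TC;            /* shortest timeout                   */
    wdt->mod = WDT_EN | WDT_RESET;    /* enable, and reset chip on timeout  */

    /* The feed sequence starts the counter; it must not be interrupted. */
    wdt->feed = 0xAAu;
    wdt->feed = 0x55u;
}
```

On the target, the timer ISR would call this with a pointer to the real watchdog registers, with interrupts masked, and then spin in an empty loop until the reset happens. Unlike a jump to 0, this puts the core and the peripherals back into their hardware reset state.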
"Development was done way before I got here and boards are all equivalent"
I trust you've logged somewhere for future reference that it's a really bad idea to make any board with no access at all to the JTAG (or equivalent) - even if it's just a set of test points...
Mike,
Thanks for the response. I was writing while you were responding.
"Most likely some initialization code assumes that the MCU is fresh out of a hardware reset".
My thoughts too, as my concern is not having a true reset which would reset all chip registers. Looking at the startup.s code does not give me any insight as to why this would be an issue.
If I have anything to do with new development I insist on having a JTAG port - no issue here.