I'm using SiLabs parts. The 040 has a vmon protection that resets the processor when the supply voltage drops to a point where flash could be corrupted. That is enabled in my code. We are seeing devices come back with apparently corrupted flash (these are locked, so we cannot actually see the flash contents, but they appear to be stuck in reset loops, or partially run and then go crazy).
In order to ensure that rogue code doesn't execute the flash write routines, I'd like to have the code overwrite the flash-enable instructions with NOPs, so that it is physically impossible for the code to set the flags required to initiate a flash write.
Generating code that can overwrite a function is proving to be difficult. How can I force Keil 7.5 C51 compiler to do this?
To enable flash writes on the SiLabs part, you have to set a bit that enables flash write/erase, then set a bit that changes the target of the MOVX instruction so that it targets flash. CBYTE can also be used, but it is throwing some strange error messages. You need code that looks like this:

    unsigned char code *ptr;
    ptr = &location;    // point to a byte in flash
    foo();
    ....
    foo()
    {
        FLSCL = 1;
        PSCTL = 1;
        // *ptr = 0xAA; // generates a MOVX which, due to the bits above, will target flash
        PSCTL = 0;
        FLSCL = 0;      // now MOVX instructions access XRAM, and writes are disabled
    }
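The gating described above can be pictured with a small host-side model. This is only a sketch for reasoning about the sequence, not target code: `sim_flash`, `sim_xram`, `sim_FLSCL`, `sim_PSCTL` and both functions are hypothetical stand-ins, and the model assumes that a MOVX write with the flash target selected but the write enable cleared is simply dropped.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_MEM_SIZE 0x10000u

static uint8_t sim_flash[SIM_MEM_SIZE];  /* stand-in for code (flash) space */
static uint8_t sim_xram[SIM_MEM_SIZE];   /* stand-in for external RAM */
static uint8_t sim_FLSCL;                /* bit 0: flash write/erase enable */
static uint8_t sim_PSCTL;                /* bit 0: MOVX writes target flash */

/* Model of a MOVX write: routed to flash only while both bits are set. */
static void sim_movx_write(uint16_t addr, uint8_t value)
{
    if (sim_PSCTL & 0x01u) {
        if (sim_FLSCL & 0x01u) {
            sim_flash[addr] = value;     /* simplification: plain store */
        }
        /* else: write is dropped in this model (assumption) */
    } else {
        sim_xram[addr] = value;
    }
}

/* Same order as the post: open both gates, write once, close both gates. */
static void sim_guarded_flash_write(uint16_t addr, uint8_t value)
{
    assert(sim_PSCTL == 0 && sim_FLSCL == 0);  /* gates must start closed */
    sim_FLSCL = 0x01;
    sim_PSCTL = 0x01;
    sim_movx_write(addr, value);
    sim_PSCTL = 0x00;
    sim_FLSCL = 0x00;
}
```

Calling `sim_guarded_flash_write(0xFA00, 0xAA)` changes only the flash model; any later `sim_movx_write()` lands in the XRAM model because both bits are back at zero.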
What I want to do is add a line of code, after the bits are set, that references foo plus some offset. For example, foo+3 happens to be the FLSCL instruction. The CBYTE macro throws an error. When I do a manual replacement like this:
*((unsigned char volatile code *) &(foo))=0x00;
The compiler swallows it nicely, and completely eliminates the above line. I thought about using a label and generating a pointer to it, but C doesn't allow that. Then the thought occurred: "Well, you can take the address of a function..."
Currently I have code running that does something similar to this as part of a personalization/configuration procedure, but I can't make the compiler generate a pointer to foo (see above). CBYTE[foo] throws an error. CBYTE[&foo()] throws an error...
This is from working code:
    void configure_me()
    {
        unsigned char volatile xdata * write_ptr;
        unsigned int t;

        saved_ie = IE;
        EA = 0;                      // disable interrupts (precautionary)
        write_ptr = &(security[0]);  // it is critical that this comes before flash
                                     // enable. The damn compiler uses movx
                                     // otherwise and screws things up
        FLSCL = 0x01;                // enable flash write/erase
        PSCTL = 0x01;                // NOT 2, which would erase flash, but 1 write to flash
        *write_ptr = CBYTE[0xFA00];
        PSCTL = 0x00;                // MOVX writes target XRAM
        FLSCL = 0;                   // disable flash write
        write_flash(&(security[1]), 0xFA00, 20);  // copy serial number
        FLSCL = 0x01;                // enable flash write/erase
        PSCTL = 0x03;                // MOVX writes target FLASH memory, and erase is ENabled
        XBYTE[0xFA00] = 0;           // erase the flash page
        PSCTL = 0x00;
        FLSCL = 0x00;                // disable flash write
        IE = saved_ie;
    }

    idata unsigned char byt2wrt;

    void write_flash(unsigned char * dest, unsigned char code* srce, int len)
    {
        char EA_save;                // saves the current state of the interrupts
        register volatile unsigned char code * source;
        register volatile unsigned char xdata * destination;

        EA_save = EA;
        EA = 0;                      // disable interrupts (precautionary)
        source = srce;
        destination = dest;
        FLSCL = 0x01;                // enable flash write/erase
        do {                         // copy until len is 0
            byt2wrt = *srce;         // ensure that no MOVX is used. Variable is in idata.
            PSCTL = 0x01;            // MOVX writes target FLASH memory, and erase is disabled
            *destination = byt2wrt;
            PSCTL = 0;               // MOVX writes to xdata
            srce++;                  // advance pointers
            destination++;
            len--;
        } while (len != 0);
        PSCTL = 0x00;                // MOVX writes target XRAM
        FLSCL = 0x00;                // disable flash write
        EA = EA_save;                // re-enable interrupts
    }
[Do you have a capacitor] on the reset pin? If so, REMOVE IT!
That, according to dozens of posts at the SILabs forum (www.cygnal.org/.../Ultimate.cgi) and my experience, cures the problem.
Erik
I've looked at the schematic, and RST just goes straight to the debug header, with no caps.
The gist of my question is: "How can I make the Keil compiler create a pointer to the function that I am currently in, and then generate a MOVX to that address, even though it is in code (flash) space?"
If I can do that, then I can modify the flash code on the fly after doing the flashing, so that the instructions that set the flash up for write cannot happen. The compiler is optimizing the line away. I built a test function:
    foo()
    {
        PSCTL = 00;         // enable flash write
        FLSCL = 00;         // these would be the appropriate values for enabling writing, but are dummies now
        CBYTE[0xFA00] = 0;  // I use this in other places
        FLSCL = 00;
        PSCTL = 00;
    }
and the compiler completely removes the CBYTE line. Just generates the code for the other 4 lines. CBYTE is
*((unsigned char volatile code *)0)
If I manually expand CBYTE to the above line, I should be able to force C to replace the 0 with the address of the function foo... but I got a really unusual error with one version of this, and I can't seem to re-create the line. The error had something to do with size_of, which does not make sense in the context.
I *should* be able to generate a pointer to foo and de-reference it in foo. As you can see in the prior code in the configure_me function, I am indeed writing to flash page 0xFA00, but getting the compiler to replace 0xFA00 with a pointer to an arbitrary location is not proving any too easy...
I've looked at the schematic, and RST just goes straight to the debug header, with no caps.
But you NEED a pullup resistor; the reset is open drain.
The fact that you have flash write routines makes it IMPERATIVE that you have a pullup.
*((unsigned char volatile code *)0)
What happens if you do not use the macro?
If I can do that, then I can modify the flash code on the fly after doing the flashing,
That looks like you're doing the whole thing bass-ackwards. The general idea for this kind of activity is to run the relevant piece of code directly from RAM, and put it there from the outside, for the duration of the flash operation only.
As-is, you're trying to do two things at the same time, to essentially the same tiny section of memory: overwrite it, and run code from it. That basically never works. The flash page (or even the entire flash memory bank) being written to is usually completely unavailable for any other use during the write or erase cycle. Sawing off the branch you're sitting on can't be done in that situation.
Hans-Bernhard,
The general idea for this kind of activity is to run the relevant piece of code directly from RAM
I think you missed that this is about a '51.
the entire flash memory bank) being written to is usually completely unavailable for any other use during the write or erase cycle.
Not the case for the SILabs chips; HOWEVER, running any code other than waiting for completion will lead to 'unexpected' results.
I think you missed that this is about a '51
I didn't. But his is clearly an atypical one already (he's planning to write to CODE space by a simple MOVX after all), so it's none too big a stretch of the imagination that it might be von-Neumann'ed enough to allow running code from RAM, too.
But his is clearly an atypical one already (he's planning to write to CODE space by a simple MOVX after all), so it's none too big a stretch of the imagination that it might be von-Neumann'ed enough to allow running code from RAM, too.
No, the SILabs chips cannot run from RAM. The SILabs chips have some SFRs that, when properly massaged, make MOVX write to flash. This is a "regular flash write", so a previous (page) erase will be required.
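To make the "regular flash write" point concrete: a flash write can only clear bits, and only an erase brings a whole page back to 0xFF. The host-side sketch below models just that rule. The names are hypothetical, and the 512-byte page size is an assumption that should be checked against the datasheet for the part in question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_FLASH_SIZE 0x10000u
#define MODEL_PAGE_SIZE  512u      /* assumed page size, check the datasheet */

static uint8_t model_flash[MODEL_FLASH_SIZE];

/* Erase: every byte of the page containing addr goes back to 0xFF. */
static void model_erase_page(uint32_t addr)
{
    uint32_t base;
    assert(addr < MODEL_FLASH_SIZE);
    base = addr - (addr % MODEL_PAGE_SIZE);
    memset(&model_flash[base], 0xFF, MODEL_PAGE_SIZE);
}

/* Write: bits can go from 1 to 0 but never back, so the result is an AND. */
static uint8_t model_program_byte(uint32_t addr, uint8_t value)
{
    assert(addr < MODEL_FLASH_SIZE);
    model_flash[addr] &= value;
    return model_flash[addr];
}
```

One consequence for the idea discussed in this thread: the 8051 NOP opcode is 0x00, so overwriting an existing byte with 0x00 never needs a prior erase in this model, whereas writing any other value over already-programmed data generally gives the AND of old and new.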
If you have concluded that a self-modifying living structure is the cure to the problem of returned boards, then why not place these functions at absolute addresses using the linker?
How do I locate a C function at an absolute address? http://www.keil.com/support/docs/359.htm
In order to ensure that rogue code doesn't execute the flash write routines
.... do you have them in the first place?
This may seem a dumb question, but I AM puzzled
O.k. Here is more information:
This device requires personalization. That is: Once the hex file has been downloaded into a blank silabs part (the 040 in this case), the person loading the firmware needs to enter a serial number and a device type. He does that by entering the information into page 0xFA00. When the unit is power cycled, or reset, it checks to see if 0xFA00 is either a 0xFF (unprogrammed) or 0x20 (programmed). If it is anything else, it means that the production person has put in the serialization and production code. It then needs to put the serial number etc into a secure place in flash elsewhere. So it copies the serial number from 0xFA00 into an array living somewhere else in the flash image. It then erases the FA00 page, and writes the model, the serial number and the copyright into page 0xFA00, and a 0x20 into 0xFA00. This way, the power up code can compare the security array (where the serial number is stored) with the FA00 serial number to detect tampering...
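The power-up decision described above can be sketched as a small classifier on the first byte of page 0xFA00, plus the tamper check that compares the two serial-number copies. This is a host-side illustration only; the enum, the macro names and both functions are hypothetical, and only the values 0xFF and 0x20 come from the description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MARKER_ERASED       0xFFu  /* page never written */
#define MARKER_PERSONALIZED 0x20u  /* configuration step already completed */

typedef enum {
    UNIT_BLANK,          /* nothing entered yet */
    UNIT_PERSONALIZED,   /* serial number already copied to the secure array */
    UNIT_AWAITING_COPY   /* production data entered, copy step still to run */
} unit_state_t;

/* Classify the unit from the first byte of the personalization page. */
static unit_state_t classify_marker(uint8_t marker)
{
    if (marker == MARKER_ERASED) {
        return UNIT_BLANK;
    }
    if (marker == MARKER_PERSONALIZED) {
        return UNIT_PERSONALIZED;
    }
    return UNIT_AWAITING_COPY;
}

/* Tamper check: the secure copy must equal the copy kept in the page. */
static int serial_matches(const uint8_t *secure_copy,
                          const uint8_t *page_copy, size_t len)
{
    assert(secure_copy != 0 && page_copy != 0);
    return memcmp(secure_copy, page_copy, len) == 0;
}
```

Only the `UNIT_AWAITING_COPY` state would need the flash write routines at all, which matches the later remark that they are needed only at production or service time.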
That's way too much stuff for a person to enter manually.
This is a hold over from the F310 code. In the F310 code, this was at 7A00. This was done so that you could lock all the pages but the last one, and when a unit came back in for service, you could use the debugger and open the single unprotected page, and see the unit type, serial number and copyright.
Unfortunately, the 040 does not allow you to do that, but we still have to have the personalization step. So, the code is still there.
Upon power up, the serial number is checked against the FA00 serial number, and if they don't match it locks up. This, on the 310, was done just in case some yoyo figured out how to hook a SiLabs debugger up to the programming port. We also, in the 310, cleared consecutive bits in flash, starting at 0xFB00 and running up to the flash protect byte, on each power on. This way, we could at least make a sanity check if someone returned a unit claiming they had used it only a few times...
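The bit-clearing power-on counter works because clearing one more bit never needs an erase. A rough host-side sketch of the idea follows; the function names, the 4-byte demo area and the choice to clear the lowest set bit first are all assumptions, and on the real part each update would be a flash write rather than a RAM store.

```c
#include <assert.h>
#include <stdint.h>

#define USAGE_BYTES 4u

/* Stand-in for the erased flash area (all 0xFF after an erase). */
static uint8_t demo_usage_area[USAGE_BYTES] = { 0xFF, 0xFF, 0xFF, 0xFF };

/* Clear the next still-set bit; returns 1 on success, 0 if the area is full. */
static int record_power_up(uint8_t *area, unsigned len)
{
    unsigned i;
    assert(area != 0);
    for (i = 0; i < len; i++) {
        if (area[i] != 0x00) {
            area[i] &= (uint8_t)(area[i] - 1u);  /* clears the lowest set bit */
            return 1;
        }
    }
    return 0;
}

/* Each cleared bit stands for one recorded power-on. */
static unsigned count_power_ups(const uint8_t *area, unsigned len)
{
    unsigned i, count = 0;
    assert(area != 0);
    for (i = 0; i < len; i++) {
        uint8_t used = (uint8_t)~area[i];
        while (used) {
            count += used & 1u;
            used >>= 1;
        }
    }
    return count;
}
```

The counter saturates once every bit is cleared, so the size of the area bounds how many power-ons can be told apart.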
In the 040, this is done by writing the time, power ups and some other information to a serial eeprom, as well as the model, checksum etc.
For some reason, we rarely got any 310 based ones back that were bad, but occasionally we did. In the 310 version there was no checksum.
So the write to flash routines are needed only at production time, or service (if a unit comes back in for service).
And, no, the silabs parts will not allow you to run from xram. I wish they did, it would make life a lot easier.
I recently (today), while making some tweaks to this code, discovered that the Keil 7.5 compiler seems to have a bug in the CWORD macro... What I was doing: I checksum 0-FAFF. I write the checksum to the serial eeprom, but also want to write the checksum to flash as well, just outside of the checksummed area.
This works:

    FLSCL = 0x01;   // enable flash write/erase
    PSCTL = 0x01;   // BUGG NOT 2, which would erase flash, but 1 write to flash
    // *write_ptr=(unsigned char) ((chksum >> 8)&0xFF);
    // *write_ptr+1=(unsigned char) (chksum &0xFF);
    XBYTE[0xFB00] = (unsigned char) ((chksum >> 8) & 0xFF);
    XBYTE[0xFB01] = (unsigned char) (chksum & 0xFF);
    PSCTL = 0x00;   // MOVX writes target XRAM
    FLSCL = 0;      // disable flash write
BUT trying to access this as a word for comparison purposes using CWORD[0xFB00] fails by generating a 0xF600 load into dptr rather than the correct FB00....
Once the hex file has been downloaded into a blank silabs part (the 040 in this case), the person loading the firmware needs to enter a serial number and a device type.
I don't see that as sufficient justification for deciding to put a flash write routine into the software. That could just as well be handled by modifying the hex file to be flashed, before flashing the chip. And save you a whole lot of hassle while at it.
I recently (today) while making some tweaks to this code, discovered that the Keil 7.5 compiler seems to have a bug in the CWORD macro...
As explained elsewhere, your reasoning towards this discovery was incorrect. You failed to read and/or understand the documentation about CWORD.
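The 0xF600 value is what you would expect if CWORD is indexed in words rather than bytes, which, as far as I understand the Keil absolute-access macros, is how it is documented: element n sits at byte address 2*n. The host-side arithmetic below is a sketch under that assumption; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte address touched by a word-indexed access, wrapped to 16 bits. */
static uint16_t cword_byte_address(uint16_t index)
{
    return (uint16_t)((uint32_t)index * 2u);
}

/* Index to use when the word lives at a known (even) byte address. */
static uint16_t cword_index_for(uint16_t byte_addr)
{
    assert((byte_addr & 1u) == 0);  /* word index only reaches even addresses */
    return (uint16_t)(byte_addr / 2u);
}

/* Alternative: rebuild the word from two bytes, high byte first. */
static uint16_t read_be16(const uint8_t *mem, uint16_t byte_addr)
{
    assert(mem != 0);
    return (uint16_t)(((uint16_t)mem[byte_addr] << 8) | mem[byte_addr + 1u]);
}
```

With these helpers, `cword_byte_address(0xFB00)` gives 0xF600, the address seen in DPTR, and `cword_index_for(0xFB00)` gives 0x7D80, so under this assumption the comparison would use `CWORD[0x7D80]` (or `CWORD[0xFB00 / 2]`), or read the two bytes separately in the same high-byte-first order in which they were written.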
Your problem will be gone with the pullup resistor (if not, you have a REAL problem with runaway code even without power cycling). However, if you cannot modify the hardware, what about making a SEPARATE program for setting these values?
Again, I strongly recommend the hardware option, even without the flash write routines, you may get dropouts till you install the resistor (I use 2k2, anything between 1k and 22k will work)
PS: the "pullup, no cap" is the consensus of the 'gurus' at the SILabs forum.
um... might be, but you have to understand that the guys programming the parts would be totally incapable of doing this.
Additionally, since this is an FDA-cleared device, you cannot alter the hex file (binary). If you do, you have to do a complete new Verification and Validation, and that is a good 10 days of work on this device. If you had to alter the binary for every device programmed, i.e. sold, you would hit the verification trap. So that is completely out of the question. If discovered on an audit, it would get the company completely shut down...
Oh, and before I forget, this:
We are seeing devices come back with apparently corrupted flash (these are locked so we actually cannot see the flash contents, but they appear to be locked in reset loops or partially run and then get crazy)
is way insufficient evidence to justify the kind of voodoo you're trying to implement to fix this effect. At the very least, you would have to have taken one of those parts, erased the whole chip and flashed it again with a hex file containing exactly those data it was supposed to have contained, and see if that fixes it.
And even if that did fix it, meaning that some flash contents were not what they were supposed to be: how the heck did you make the leap from those symptoms to a diagnosis like "this flash-write routine must have been called when it's not allowed to be used any longer", not to mention how did you arrive at "let's make it crash deliberately" (which is what overwriting it with zeroes will most likely do) as the supposed cure?
And you seriously didn't expect an access via a (code *) (like in CBYTE[n] = 0, or your original code * ptr) to generate the MOVX operation that will end up writing to code memory, did you?
If you had to alter the binary for every device programmed, i.e. sold, you would hit the verification trap.
But that's exactly what you are doing: you're changing the actual binary contents of every single device. You're just not doing it in one go, but doing it in two separate steps.
And even if all that writing of data wasn't seen as a modification of the binary, what you're about to try here certainly is one. You're planning to modify the verified code, post-verification. You're essentially trying to cure diarrhea by infection with the flu.