Hello everyone,
I am writing some code using an 8051 device and I'm trying to make the code MISRA C compliant. One of the MISRA C rules is that unions are not allowed (and truthfully, I don't fully understand why the rule exists. I'm sure there's some way to abuse it that they're trying to avoid). Right now, I have a fairly basic union:
union { short s16; unsigned char u8[2]; } adcValue;
I know I can create some code updates to get around this (ie assign both values separately in code, only use one of the two datatypes and rely on shifts, etc to do this) but I was wondering if there is another MISRA C acceptable structure that might do the same thing (to avoid having to make some pretty large scale updates to my code).
Alternatively, I guess I could just make an exemption to this rule and document it but I wanted to understand my options before I decide one way or another and thought you guys might be able to provide me with some insight on the subject.
Any help you guys would be willing to give me would be greatly appreciated. Thanks!
Sorry, forgot to put in the code brackets:
how about a 'pseudo union'
s16 = (charh <<8 + charl)
I think that's rather unlikely to be an appropriate goal.
MISRA compliance isn't something to be just "bolted-on" after the event - it has to be designed-in and followed-through from the outset. It's not just a matter of applying a few coding rules - see section 5.2...
Heh, truthfully, I already know we're going to have to rewrite the code, if only for the reason that our original code didn't follow a formal process and testing development cycle and I know it's going to probably create more problems trying to fix and apply rules after a bulk of the code has been written rather than writing it from scratch with the new objectives in mind (the requirements of the code has changed since it has been developed).
I'm trying to determine the best way to approach this for the future. I can MAKE it pass by creating a workaround, but as you said, it's against the spirit of the standard and kind of defeats it. I was hoping there was a universal alternative to something as simple as I have it.
Realistically, I only update the adcvalue in my ISR and then use either the signed or unsigned part in different ways throughout the code (which is what I'm using the union for). If it's going to create problems, I might just update them both. I would just prefer to have them linked to cut down on the possibility of mismatch errors (shouldn't be an issue but, in my eyes, it appears using a union would make more sense).
I guess the alternative would be to to use a single variable and use converts and shifts anytime I use the function. Unfortunately, this will increase code size and increasing the timing of the application.
Unfortunately, this will increase code size and increasing the timing of the application.
I wouldn't be quite so sure of that assumption. There's no particularly good reason why a decent compiler couldn't generate the exact code for the following lines:
unsigned char foo = union_var.u8_array_variant[0]; unsigned char bar = ((unsigned short)union_var.s16_variant) & 0xff;
After all, you're extracting the exact same bits from the same place in memory, so why would one method be slower than the other?
Hans,
I was referring to if I was receiving the ADC values in a s16 and everytime I wanted to use a u8, I had to shift the s16 (which is entirely impractical, since the adc values are only being stored in one place, rather than being updated in several). So yes, you are right, using it in the way you described shouldn't take significantly more space or time.
I don't fully understand why the rule exists. I'm sure there's some way to abuse it that they're trying to avoid)
Actually it's the other way round. Bascially all of the usual uses of unions are, actually, abuses. Hardly anyone ever uses unions within their specified limitations.
The problem with your particular plan (like the vast majority of ways people try to apply unions) is that the language explicitly forbids it, by casting its second-strongest verdict: reading from any element of a union other than the one most recently written "yields an unspecified result". I.e. for all you can actually rely on, that code could give you the time-of-day on Mondays, or 13 on the 12th of any given month in the Chinese calender, and complet and utter random garbage the rest of the week; and you would have nobody but yourself to complain to.
Hmm. Interesting. Well, at least that makes sense. Thanks for the explanation!
I was referring to if I was receiving the ADC values in a s16 and everytime I wanted to use a u8, I had to shift the s16
Same difference. I.e.: none. As long as you extract the same bits out of your s16, it makes no difference to properly optimized code whether you do it via a value-cast-shift-and-mask, a pointer cast, or a union hack. The end result is always the same thing: the high byte of that s16 object, so there's no reason why the compiled should be any different.
But if you don't believe me, you don't have to. Feel free to just try it out yourself.
"... the language explicitly forbids it, by casting its second-strongest verdict: reading from any element of a union other than the one most recently written "yields an unspecified result"."
That's certainly the case for C51 being pre-C99 (TC3).
If I'm not misinterpreting a footnote added in 2007's C99 TC3, there may be, shall I say, some "wiggle room". TC3 still has the "unspecified values" in 6.2.6.1 par.7 and "unspecified behavior" in Annex J.1, but 6.5.2.3 par.3 now has footnote 82:
"If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation."
Of course, note the last sentence is a caveat.
The footnote sends a bit of a mixed message, I think, what with "unspecified" still hanging around elsewhere.
Thanks for the help, everyone! I ended up just making it a struct, rather than a union and just set the value for both of the u8's and the s16 at the same time, allowing me to not have to change the rest of my code that uses both those data types. I think when I restructure the code, I'll probably just use one of them (I inherited some of the code, which had this structure already setup and made the mistake of moving forward with it rather than reevaluating it).