C and "casting"

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
I'm not sure how to address this. This struct type is (apparently) regarded by the compiler as 4 bytes long:

Code:
struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};
Yet I want to pass a variable of that type where a uint8_t is expected (because it is - or should be - just a single byte).

Trying a C "cast" fails:

Code:
_WriteSingleByteRegister(device_ptr, NrfRegister.RF_CH, (uint8_t)(Value), NrfStatus);
with:

"aggregate value used where an integer was expected".

intellisense reveals this:

1667317979926.png

most puzzling...
 

WBahn

Joined Mar 31, 2012
32,745
It's hard to tell what is going on because you are hiding a lot of key pieces. You don't show how NrfRegister is declared. Is there a typedef somewhere? Is it an instance of that structure, or a pointer to that structure?

The structure you define is going to take up the memory required to store an unsigned int. Since C requires an int type to be AT LEAST two bytes, your structure will occupy at least two bytes (most likely four since most compilers today use 32 bits for an int).

Assuming that 'Value' is a variable (not a pointer) of that structure, then you are trying to cast the structure variable, not one of its members, to a uint8_t.

Even if you dereference the structure variable to access the first member (which in this case would be the unsigned int that contains both bit fields), the alignment of the bit fields within that storage unit is implementation defined, so it might be sitting at the high end or the low end.
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
I'm calling a method that takes an arg of type uint8_t. But the caller is passing a variable declared as type NrfReg_RF_CH

Here are the full details:

Code:
typedef struct nrf_reg_RF_CH NrfReg_RF_CH, * NrfReg_RF_CH_ptr;
and

Code:
struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};
Code:
static void _WriteRFChannelRegister(NrfSpiDevice_ptr device_ptr, NrfReg_RF_CH Value, NrfReg_STATUS_ptr NrfStatus);
and that method calls into a more generic method:

Code:
void _WriteSingleByteRegister(NrfSpiDevice * SPI, uint8_t Register, uint8_t Value, NrfReg_STATUS_ptr NrfStatus);
i.e.
Code:
void _WriteRFChannelRegister(NrfSpiDevice_ptr device_ptr, NrfReg_RF_CH Value, NrfReg_STATUS_ptr NrfStatus)
{
    _WriteSingleByteRegister(device_ptr, NrfRegister.RF_CH, (uint8_t)(Value), NrfStatus);
}
I just played around with the union idea and it seems to work, but not sure if there are other ways.
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
This whole "the alignment of the bit fields within that storage unit is implementation defined" is another reason I'm critical of C. Any language that's being embraced for low level hardware manipulation should never leave such behavior to the compiler implementors.

Frankly it should be a language feature, the layout, the "endianness", padding etc. should provide reasonable defaults but also options for the developer to specify precisely.

How can one write reusable C source code if its runtime behavior might vary depending on the chosen compiler!
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
Well this union does work as desired, I can see memory change as expected as I set/unset various bits under debug:

Code:
union nrf_reg_EN_RXADDR_union
{
    uint8_t value;

    struct nrf_reg_EN_RXADDR
    {
        unsigned int ERX_P0 : 1;
        unsigned int ERX_P1 : 1;
        unsigned int ERX_P2 : 1;
        unsigned int ERX_P3 : 1;
        unsigned int ERX_P4 : 1;
        unsigned int ERX_P5 : 1;
        unsigned int RESERVED : 2;

    } fields;
    
};
i.e. setting P3 to 1 and P5 to 1 shows a raw byte value in memory of 0x28.
 

WBahn

Joined Mar 31, 2012
32,745
So this would be akin to something like the following, using a situation that is perhaps a bit easier to visualize the issue.

Code:
#include <stdio.h>

int main(void)
{
    int myArray[1];
    char myChar;

    myArray[0] = 42;
    myChar = (char)(myArray);
    printf("myChar: %c\n", myChar);

    return 0;
}
This throws the warning: "cast from pointer to integer of different size".

The array name myArray decays to a pointer when it is used in this expression, and it is that pointer value that is being cast to a char, NOT the value pointed at by myArray.

Consider the following:

Code:
#include <stdio.h>

struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};

typedef struct nrf_reg_RF_CH NrfReg_RF_CH, * NrfReg_RF_CH_ptr;


int main(void)
{
    NrfReg_RF_CH value;
    unsigned char data;
 
    value.RF_CH = 0x2C;
    value.RESERVED = 1;
 
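    /* Read the first byte of the structure's storage. */
    data = *((unsigned char *) &value);

    printf("data: %c\n", data);

    /* Dump every byte of the structure; an unsigned index matches both
       sizeof and the %X conversion. */
    unsigned char *byte_ptr = ((unsigned char *) &value);
    for (unsigned int i = 0; i < sizeof(value); i++)
    {
        printf("[%04X]: %02X\n", i, *(byte_ptr + i));
    }

    return 0;
}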
This produced:

data: ¼
[0000]: AC
[0001]: 00
[0002]: 00
[0003]: 00

So we can see that MY compiler put RF_CH at the lsb of the low-addressed byte of the unsigned int within the structure. But this is not required by the standard.

Here's some relevant language from the C99 (draft) standard (Section 6.7.2.1):

"An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified."
 

WBahn

Joined Mar 31, 2012
32,745
This whole "the alignment of the bit fields within that storage unit is implementation defined" is another reason I'm critical of C. Any language that's being embraced for low level hardware manipulation should never leave such behavior to the compiler implementors.
Why not? A strong argument can be made that that is precisely who it SHOULD be left to -- the people writing the compiler that is targeting a specific hardware so that they can best leverage the capabilities and limitations of that hardware.

It again comes down to what the intent of the C language was -- performance.

Frankly it should be a language feature, the layout, the "endianness", padding etc. should provide reasonable defaults but also options for the developer to specify precisely.

How can one write reusable C source code if its runtime behavior might vary depending on the chosen compiler!
Simple -- don't write code that invokes undefined or implementation-defined behavior.

The kicker is that few people are truly conversant with what is and is not language-defined behavior, in large part because most people (today) want to treat C like a general purpose language that you can pick up on the fly.
 

WBahn

Joined Mar 31, 2012
32,745
Well this union does work as desired, I can see memory change as expected as I set/unset various bits under debug:

Code:
union nrf_reg_EN_RXADDR_union
{
    uint8_t value;

    struct nrf_reg_EN_RXADDR
    {
        unsigned int ERX_P0 : 1;
        unsigned int ERX_P1 : 1;
        unsigned int ERX_P2 : 1;
        unsigned int ERX_P3 : 1;
        unsigned int ERX_P4 : 1;
        unsigned int ERX_P5 : 1;
        unsigned int RESERVED : 2;

    } fields;
   
};
i.e. setting P3 to 1 and P5 to 1 shows a raw byte value in memory of 0x28.
That's a structure, not a union. Fine point.

You are still relying on implementation-defined behavior since whether the ERX_P0 bit or one of the two RESERVED bits ends up at the lsb or the msb is up to the implementation. Beyond that, whether it ends up at the base address or not depends on whether the underlying unsigned int is stored little-endian or big-endian.

The key to undefined behavior (and to a lesser extent implementation-defined behavior) is that anything goes. It can crash the program, produce a wrong result, start a global thermonuclear war, or do exactly what you would like it to -- all of these are equally valid. The worst one, by far, is for it to do exactly what you would like it to because this leads you into a false sense of security thinking that you know something when, in fact, you don't.

If you want to avoid undefined/unspecified behaviors with bit fields, then you need to access them in accordance with the language standard, namely as members of a structure. If you want to go bit-banging, then go bit-banging. Bit fields are a way to get some of the benefits of bit-banging while abstracting the implementation details away -- but, just like anytime you do that, you need to live within the bounds of the abstraction.
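A minimal sketch of how the layout assumption behind the union could be checked at run time rather than trusted. The function name `en_rxaddr_layout_ok` is hypothetical (it does not appear in the thread), and the expected value 0x28 is simply the byte reported above for P3 and P5 set. The check only reports what one particular compiler does; it does not make the union portable.

```c
#include <stdint.h>
#include <string.h>

/* The union as posted in the thread. */
union nrf_reg_EN_RXADDR_union
{
    uint8_t value;

    struct nrf_reg_EN_RXADDR
    {
        unsigned int ERX_P0 : 1;
        unsigned int ERX_P1 : 1;
        unsigned int ERX_P2 : 1;
        unsigned int ERX_P3 : 1;
        unsigned int ERX_P4 : 1;
        unsigned int ERX_P5 : 1;
        unsigned int RESERVED : 2;
    } fields;
};

/* Hypothetical self-check: returns 1 if this compiler places ERX_P0 at
   bit 0 of the byte that overlays `value`, 0 otherwise. */
static int en_rxaddr_layout_ok(void)
{
    union nrf_reg_EN_RXADDR_union u;

    memset(&u, 0, sizeof u);   /* clear every byte of the storage unit */
    u.fields.ERX_P3 = 1;
    u.fields.ERX_P5 = 1;

    /* Bits 3 and 5 set is 0x08 | 0x20 = 0x28 when P0 is the lsb. */
    return u.value == 0x28;
}
```

Calling something like this once at start-up, and refusing to continue if it returns 0, would at least turn a silent layout mismatch into a visible failure.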
 

eetech00

Joined Jun 8, 2013
4,704
Doesn’t the array element 0 need to be specified in cast statement?
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
Why not? A strong argument can be made that that is precisely who it SHOULD be left to -- the people writing the compiler that is targeting a specific hardware so that they can best leverage the capabilities and limitations of that hardware.

It again comes down to what the intent of the C language was -- performance.
Well, C was not intended for performance. The impetus originally was ease of compiler implementation; it was language simplicity that drove the designers to produce C.

Performance is often in the hands of optimizers and most of the work done by optimizers is independent of source language, to a large degree anyway, optimizers are concerned with things like common subexpression elimination, redundant register reads and writes, and so on.

Yes the precise way something is laid out in memory is also relevant to performance but not if that layout is visible outside of the code as it is here. Typically a developer does not care about how a double is laid out or how it is aligned, padded and so on, but when that becomes part of a contract as it is when mapping declared bits to external hardware then they do care else the code might not execute as required.

I'm all for the defaults to be implementation defined, when I don't care but it would be helpful to also give an ability to override defaults in a way that yields behavior that is not implementation dependent.

There are only a small finite number of considerations for these things too: alignment, endianness, padding and order - that's probably it. If the language exposed these as keywords in some way then we could write code that leads to a storage layout that is always the same irrespective of compiler implementation.

After all we don't care about efficiently laying out a structure at the expense of correctness.

Simple -- don't write code that invokes undefined or implementation-defined behavior.

The kicker is that few people are truly conversant with what is and is not language-defined behavior, in large part because most people (today) want to treat C like a general purpose language that you can pick up on the fly.
Sure, we should never rely on implementation defined behavior but that leads to A) One must be fully aware that something is implementation defined and B) Take steps to code it in a way that avoids the implementation defined behavior manifesting itself i.e. give the programmer control over layout.
 

WBahn

Joined Mar 31, 2012
32,745
Doesn’t the array element 0 need to be specified in cast statement?
Exactly! That's the point -- and putting it in the context of using an array instead of a structure containing bit fields made it apparent.

So just as

(char)(myArray)

needed to reference a specific member of the array, such as

(char)(myArray[0])

Your code

(uint8_t)(Value)

needs to reference a specific member of the structure, such as

(uint8_t)(Value.RF_CH)

The problem is that, because the members are bit fields packed into an unsigned int, there is no way to reference the unsigned int as a single member. You have to reference one of the bit fields, and the code generated by the compiler will do the bit-banging necessary to separate out and shift them to make them appear as an n-bit value independent and separate from any other bit fields that might be stored within that same unsigned int.
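As a rough illustration of referencing the members rather than the whole structure, the register byte could be assembled explicitly. This is only a sketch: the helper name `rf_ch_to_byte` is hypothetical, and putting RF_CH in bits 6..0 with RESERVED in bit 7 is an assumption taken from the field order in the struct, which should be checked against the device's datasheet.

```c
#include <stdint.h>

/* Same layout as the struct posted in the thread. */
struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};

/* Hypothetical helper: builds the byte from the individual members, so
   the result does not depend on where the compiler stores the bit
   fields inside the unsigned int. */
static uint8_t rf_ch_to_byte(struct nrf_reg_RF_CH reg)
{
    /* ASSUMPTION: RESERVED is bit 7, RF_CH is bits 6..0. */
    return (uint8_t)((reg.RESERVED << 7) | reg.RF_CH);
}
```

With this, the call from the first post could pass `rf_ch_to_byte(Value)` in place of `(uint8_t)(Value)`. The compiler still does the shifting and masking to read each member, but the byte that goes out over SPI is defined by the source code rather than by the implementation.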
 

WBahn

Joined Mar 31, 2012
32,745
Well, C was not intended for performance. The impetus originally was ease of compiler implementation; it was language simplicity that drove the designers to produce C.
I disagree. It was developed in order to do something that the rest of the world pretty firmly believed could only be done in assembly language -- namely implement an operating system that could run on the hardware of the day.
 

BobaMosfet

Joined Jul 1, 2009
2,211
I'm not sure how to address this. This struct type is (apparently) regarded by the compiler as 4 bytes long:

Code:
struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};
Yet I want to pass a variable of that type where a uint8_t is expected (because it is - or should be - just a single byte).

Trying a C "cast" fails:

Code:
_WriteSingleByteRegister(device_ptr, NrfRegister.RF_CH, (uint8_t)(Value), NrfStatus);
with:

"aggregate value used where an integer was expected".

intellisense reveals this:

View attachment 279700

most puzzling...
Why is this puzzling? The struct has 2, 2-byte ints adjacent in memory, which is exactly 4-bytes long.

How do you think memory looks?

And saying C is not intended for performance shows a huge lack of knowledge and understanding. C is the highest-performance language above assembly language that exists, because it is terse and efficient. This is why C is considered the 'High Level Assembler'.

IF you want to use just 8-bits, then use an UNSIGNED CHAR, and masking. Using a structure to do this with bitfield specification is only telling the compiler to use n bits from each field in the structure.

I understand learning is difficult, and noobs have a habit of blaming the compiler and/or the language when in fact it is a lack of knowledge on the person's part, not a compiler or language fault that is the problem.

If you want to fiddle with bits; learn HOW to fiddle with bits. Your & , | , and ~ operators are what you want to work with and learn how to reference things in HEX. It's so much easier.
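A minimal sketch of the mask-based approach described above, applied to the RF_CH register from the first post. The macro and function names are illustrative only, and the bit positions (channel in bits 6..0, reserved bit 7) are assumed from the field order in the struct.

```c
#include <stdint.h>

/* ASSUMPTION: RF_CH occupies bits 6..0 and RESERVED is bit 7. */
#define RF_CH_MASK     0x7Fu
#define RF_CH_RESERVED 0x80u

/* Replace the channel bits, leaving the reserved bit untouched. */
static uint8_t rf_ch_set_channel(uint8_t reg, uint8_t channel)
{
    return (uint8_t)((reg & ~RF_CH_MASK) | (channel & RF_CH_MASK));
}

/* Extract the 7-bit channel number. */
static uint8_t rf_ch_get_channel(uint8_t reg)
{
    return (uint8_t)(reg & RF_CH_MASK);
}
```

For example, `rf_ch_set_channel(RF_CH_RESERVED, 0x2C)` gives 0xAC, the same byte that appears in the memory dump earlier in the thread, and the value is already a uint8_t so no cast is needed when it is passed on.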


1667326250508.png
 

WBahn

Joined Mar 31, 2012
32,745
Code:
struct nrf_reg_RF_CH
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};
Why is this puzzling? The struct has 2, 2-byte ints adjacent in memory, which is exactly 4-bytes long.
No, the struct has a single 32-bit unsigned int that is partitioned into a 7-bit bit field named RF_CH and a 1-bit bit field named RESERVED. Just where these 8 bits are located within those 32 bits is, with some constraints, up to the compiler implementer.

The C language standard does require that if an adjacent bit field member can be allocated within the storage unit of the prior one, that it must. So if unsigned int was just 16-bits on this compiler, the size of the struct would just be two bytes.
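One way to see the storage-unit effect is to compare sizes directly. This is a sketch under assumptions: the struct and function names are made up for illustration, and using `unsigned char` as a bit-field type is an implementation-defined extension (GCC and Clang accept it, but the standard only guarantees `_Bool`, `signed int` and `unsigned int`).

```c
#include <stddef.h>

/* Bit fields declared on unsigned int, as in the thread. */
struct rf_ch_int_backed
{
    unsigned int RF_CH : 7;
    unsigned int RESERVED : 1;
};

/* ASSUMPTION: the compiler accepts unsigned char as a bit-field type. */
struct rf_ch_char_backed
{
    unsigned char RF_CH : 7;
    unsigned char RESERVED : 1;
};

/* Hypothetical helpers that just report the two sizes. */
static size_t size_int_backed(void)
{
    return sizeof(struct rf_ch_int_backed);
}

static size_t size_char_backed(void)
{
    return sizeof(struct rf_ch_char_backed);
}
```

On a typical compiler with a 32-bit int, the first is expected to report 4 and the second 1, but neither figure is promised by the language, so a build-time size check is a sensible companion to any struct that is meant to match a hardware register.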
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
I disagree. It was developed in order to do something that the rest of the world pretty firmly believed could only be done in assembly language -- namely implement an operating system that could run on the hardware of the day.
Well, historically the first OS to be written in a high level language was of course Multics; commercially it was not a success, but technically it was a huge leap forward. Multics was coded 95% in PL/I subset G. Multics introduced demand paged virtual memory, multiple user support, security, device driver abstraction and many more things.

Ritchie and Thompson worked on Multics at Bell Labs; they had roles on that project and later chose to start a new project which became Unix and also C. Indeed, "Unix" is itself a play on the name "Multics".

Unfortunately the advances and capabilities achieved with the Multics project are far less known but not for technical reasons, but mainly commercial reasons. It was a big project and had a sizeable team and the attraction of a smaller system like Unix and a simpler to implement language like C are not hard to understand.

As I've said before the success achieved by Multics was due in no small part to the PL/I language, a far better system programming language than C. PL/I subset G was designed for implementing an operating system and the quality of the compiler used for the Multics project was outstanding.

This was perhaps the first time a high level language was bootstrapped too; the final compiler used by the Multics team was written in PL/I itself, a huge leap.

PL/I's language support for bits, aligned and unaligned storage, strings, fixed point decimal/binary, exceptions, nested functions, and pointers made it superb for OS development. It's seemingly not common knowledge but PL/I was the first high level language to support a pointer data type, even the arrow notation " -> " began in PL/I.

If anyone is interested in the Multics PL/I compiler and how it was designed, here's another important historic record.

Finally, here's an interview with the late Bob Freiburghouse, the lead developer on the Multics PL/I compiler project. I have not seen this interview before; it looks very interesting. He was a smart guy whom I met briefly once in London.
 

WBahn

Joined Mar 31, 2012
32,745
Sure, we should never rely on implementation defined behavior but that leads to A) One must be fully aware that something is implementation defined and B) Take steps to code it in a way that avoids the implementation defined behavior manifesting itself i.e. give the programmer control over layout.
Back when I was first learning C (as an engineering grad student, not as a comp sci guy of any shape or form) and then having to turn around and teach it to engineering undergrads in an intensive one-week course segment (they literally learned C on Monday and then used it to solve three separate real-time control and data acquisition/analysis problems on the three subsequent days followed by a final exam on Friday), a lot of the common traps became really apparent. This was before I even knew that there were such things as language standards and when I assumed that all languages were set in stone (boy, did I have a lot to learn!). So I put together a short guide containing a set of style guidelines and addressing all of the pitfalls that I knew of at the time (and that list grew substantially over the next several years) and how to avoid them. Of course, I only knew about them because I had tripped on them, sometimes spectacularly. This was in the early 1990's, so there was no web to search -- this is why I quickly amassed a lot of C textbooks, because I was constantly running into something that none of my existing texts addressed and so I had to go to the bookstore and find one that dealt with that particular new (to me) issue.

I still recall, pretty much verbatim, the first sentence I wrote in that guide. It went something like: C is a language that is not for the faint of heart as it is more than willing to give you plenty of rope with which to hang yourself. It gives the programmer extreme authority over the hardware, but along with great power comes great responsibility -- authority and responsibility that many programmers are not emotionally mature enough to handle well.

I know that my own emotional maturity was challenged on more than one occasion! :D

But I can also safely say that I've learned much more about how computers and programming languages work from C than from all of the many other languages I have used, combined.
 

nsaspook

Joined Aug 27, 2009
16,266
This whole "the alignment of the bit fields within that storage unit is implementation defined" is another reason I'm critical of C. Any language that's being embraced for low level hardware manipulation should never leave such behavior to the compiler implementors.

Frankly it should be a language feature, the layout, the "endianness", padding etc. should provide reasonable defaults but also options for the developer to specify precisely.

How can one write reusable C source code if its runtime behavior might vary depending on the chosen compiler!
This statement shows you are early in your understanding of embedded C programming. The alignment of the bit fields is a hardware feature that should be implementation defined because hardware level bit manipulations are very hardware dependent. :rolleyes:
 

Thread Starter

ApacheKid

Joined Jan 12, 2015
1,762
No, the struct has a single 32-bit unsigned int that is partitioned into a 7-bit bit field named RF_CH and a 1-bit bit field named RESERVED. Just where these 8 bits are located within those 32 bits is, with some constraints, up to the compiler implementer.

The C language standard does require that if an adjacent bit field member can be allocated within the storage unit of the prior one, that it must. So if unsigned int was just 16-bits on this compiler, the size of the struct would just be two bytes.
Well, in the specific case here that struct is 32 bits long, but is actually a single byte padded by three bytes. The struct is aligned on a 4 byte boundary because I specified "int". I could specify "char" and the struct would shrink and end up aligning on a 1 byte boundary; I guess I might even do that, but I'd need to consider the implications.

But note the two fields are literally individual bits, setting any bit alters the overall byte value as one would expect and yes I suppose that could be or is, implementation dependent which is my gripe here - it should be unambiguous.

The alignment of the overall structure could be an attribute of the structure or declaration, not something specified at the bit field level.

Look, here is how PL/I would declare this, its a different syntax but you get the idea:

Code:
dcl 1 IoRegister aligned,
      2 CH bit(7),
      2 RESERVED bit(1);
Consider also:

Code:
dcl 1 MainRegister unaligned,
      2 flag       bit(1),
      2 state      bit(2),
      2 carry      bit(1) aligned,
      2 rest       bit(7);
The language is basically totally flexible with layout, the second structure is "unaligned" meaning we don't care about how the structure itself is aligned. The flag bit is the MSB of a byte, the state bits are bits 6 and 5 of the same byte, the carry bit is aligned and so begins on a new separate byte, rest is the remaining 7 bits of that second byte.

Accessing bits by subscript is easy too, and easy to read in code.

C could have done something similar, that's all I'm saying here.
 