Using a pointer to void

Thread Starter

Raymond Genovese

Joined Mar 5, 2016
1,653
In another thread this subject came up and I would like to get some educated opinions on the issue. It was tangential to the original topic, so I will do a little cutting and pasting to set up the current topic. These cuts are of course out of context, but I'm not trying to be misleading, I just want to get to the meat...

While it is not practical to post all of the code from my reference, this morning I generated some code that illustrates the issue and does, in fact, represent my reference. That code, with some specific questions is in the next post - give me a few minutes to get it written and sent.


The only reason I know this is because of a long head-banging session. I was writing a function that I wanted to send both unsigned char * and signed char *.

The compiler would not let me do this by giving me a warning or error along the lines of:
pointer targets in passing argument 1 of 'blahblah' differ in signedness [-Werror=pointer-sign]
because, of course, an unsigned character is a different type than a signed char, even though they were, in my case, always the same size.

I could have turned the error off with some compiler switch, I suppose, but that is always a bad idea for me.

I was not going to write two separate functions and I was also not going to always typecast one or the other (the less typecasting I do, the better).

So, I wrote the function prototype as blah_write(void * message); and it worked just fine. Maybe this was sloppy, but I don't see how, after reading about what the heck a pointer to void was. The function always knew what to do and it was the same thing regardless of whether it was a signed or unsigned char. I believed and still do, that it was the way to go.
@raymond: Sorry I cannot post code via my iPad but two identical functions, one passing a void pointer, one passing a char pointer, yield different results when assigning a character.

The difference being that assigning a character yields an error "illegal conversion between types."

C has very strong type checking, and type char* is not the same as a void* pointer.
How did you dereference the 'message' pointer within your blah_write() function? If your compiler is conforming at all, then you should have had to cast it to a pointer to a complete data type. Unless it was ONLY your prototype that had it declared as a void pointer and it was declared as a char (of one type or the other) within the function. Now THAT would have been sloppy!
 

Thread Starter

Raymond Genovese

The example code is below:
Code:
#include "qm_common.h"

/* void F_write(uint8_t * message); */
/* void F_write(int8_t * message);  */
void F_write(void * message);

uint8_t Bat0[8] = { 0x4, 0x1f, 0x11, 0x11, 0x11, 0x11, 0x20, 0x00 };
int8_t  Bat1[8] = {"Test1111"};

int main(void)
{
QM_PRINTF("running... ");
F_write(Bat0);
F_write(Bat1);
F_write(" 22222tseT ");
QM_PRINTF("done... ");

return 0;
}

void F_write(void * m) {

    /* do something with m here */
}
I could have sent a number of bytes from m somewhere to actually do something within F_write(), but, the example will hopefully illustrate the issue clearly. Also, uint8_t is basically an unsigned char and int8_t is the signed equivalent, width is 8 bits. This is an Intel D2000 Quark chip using a GNU compiler.

As written, this compiles and runs. I went to this (i.e., using a pointer to void) because I wanted to be able to call the function F_write() with both pointers to signed and unsigned chars. It works.

If you use either
void F_write(uint8_t * message);
void F_write(int8_t * message);
as the prototype (with the analogous call), you will get a compiler error because the pointer types do not match.

So, the main question is basically - Is this "sloppy" in the sense that it is an incorrect usage of a pointer to void? If so, why - clearly and concisely without over-reliance on some kind of projective test of what is right or wrong. That is, if it is sloppy, one should be able to logically and concisely explain why, including the liabilities and the "neat" alternatives.

This is the main question that I am asking for opinions on.

A secondary question is whether or not this experience is atypical of the platform. @ErnieM if I read your original post right, the XC80 compiler would not take the code. Is that right? Can you post the example you referred to?

Finally, I did not say why I wanted to be able to use pointers to both uint8_t and int8_t, but I don't think that is the point or that I should have to do that. An easy-to-understand example is one of the I2C send routine calls in the system code. That call requires that it be sent a uint8_t pointer. If you were to try Sendit("test") it would fail, as would int8_t test[5]={"test"}; Sendit(test). Those are going as signed (int8_t). But again, please, don't focus on why.
 

WBahn

Joined Mar 31, 2012
32,707
You've left out two pieces of information that are absolutely critical to evaluating this use of a void pointer.

(1) How is the void pointer 'm' used within the function F_write().

Anything you do with it, other than pass it as an argument to another function or set the value of the pointer itself to a specific value, should require (if the compiler is anything close to standards-compliant) that you cast it to a specific type of pointer.

(2) Is what you are doing with the void pointer truly the same independent of whether the data is signed or unsigned?

If that is the case (and I suspect it is) then you are fine. But you need to ask, for instance, if the value stored in Bat1[2] is -10, should that behave exactly the same as when the value stored in Bat0[2] is 246? If so (and also for all other values that they do not have in common) then life is good. I suspect (don't know for sure) the values stored in either array are supposed to be within the common range of both types; if that's the case then what you are basically saying is that the behavior for values beyond that range is undefined and that you are okay with anything happening. There is nothing intrinsically wrong with this position (after all, the C language standard itself is littered with undefined behaviors).
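
To make the question concrete, here is a minimal sketch (an illustration only, not code from the project) showing that an int8_t holding -10 and a uint8_t holding 246 are stored as the same byte. The helper name same_byte_pattern is hypothetical, and the sketch leans on the fact that the exact-width intN_t types use two's complement.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the signed and the unsigned byte
   are stored as the same bit pattern, 0 otherwise. */
int same_byte_pattern(int8_t s, uint8_t u)
{
    /* Both types are exactly 8 bits wide, so a single byte is compared. */
    assert(sizeof s == 1 && sizeof u == 1);
    return memcmp(&s, &u, 1) == 0;
}
```

Calling same_byte_pattern(-10, 246) reports a match, so a function that only forwards the bytes cannot tell the two apart. Whether that matters depends entirely on what the receiver does with the byte.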

An unrelated issue is that, in some cases, you are passing pointers to string literals. Since this is clearly allowed for your intended use of F_write(), and since F_write() has no way to know whether the pointer it receives is pointing to data or a string literal embedded in the code itself, you should restrict the use of the pointer within the function so that it can't be used to modify the data it is pointing to.

void F_write(const void * m);

That way, if you happen to try to modify the data (as the result of a typo, for instance) the compiler can smack you.
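
As a rough sketch of how a const void pointer parameter might be used, the hypothetical function below (sum_bytes is not from the original code) accepts unsigned arrays, signed arrays, and string literals alike, and converts the pointer to a complete type before touching the data. The explicit length parameter is also an assumption added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds up the first n bytes behind a const void pointer. */
unsigned sum_bytes(const void *m, size_t n)
{
    /* A void pointer cannot be indexed, so view the data as unsigned bytes. */
    const unsigned char *p = (const unsigned char *) m;
    unsigned total = 0;

    assert(m != NULL);
    for (size_t i = 0; i < n; i++) {
        total += p[i];
        /* p[i] = 0; would be rejected by the compiler because of const. */
    }
    return total;
}
```

The same call works for uint8_t arrays, int8_t arrays, and string literals without any cast at the call site, while the const qualifier keeps the function from writing into a literal by accident.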
 

Thread Starter

Raymond Genovese

Thank you so much for taking the time to read and respond. I think I am understanding what you are saying. For your questions, let me skip the contrived code above and use an actual functional example. The code comes from a project article I did here (please don't pick apart everything in there - it's a fait accompli, but is a real functioning example).

With regard to (1) How is the void pointer 'm' used within the function F_write().

In the project, there is a library for using an LCD display.

The instant function prototype is:

int LCD_write(const void * message);

The function code is:
C:
int LCD_write(const void * message) {
    uint8_t dta[1] = { lcddta };

    /* note false option in next line */
    if (qm_i2c_master_write(QM_I2C_0, LCDADDR, dta, 1, false, &I2Cstatus)) {
        return (errno);
    }
    if (qm_i2c_master_write(QM_I2C_0, LCDADDR, message, strlen(message), true,
            &I2Cstatus)) {
        return (errno);
    }
    return (0);
}
qm_i2c_master_write() is in the qm_i2c.h system files and its prototype is
C:
int qm_i2c_master_write(const qm_i2c_t i2c, const uint16_t slave_addr,
            const uint8_t *const data, uint32_t len,
            const bool stop, qm_i2c_status_t *const status);
Note that the first byte sent (dta) is a controller command that the next sequence of bytes should go to the lcd memory.

A statement in main like LCD_write("This is a test"); would not work if I defined int LCD_write(int8_t * message);. I assume that is because a pointer to an int8_t is not the const uint8_t *const data that qm_i2c_master_write() demands. With the prototype int LCD_write(uint8_t * message);, statements like
uint8_t M[5]={0x01,0x02,0x03,0x5,0x00}; followed by LCD_write(M); would work.

When I changed to int LCD_write(const void * message);
both would work and that was what I stayed with.

Interestingly, when I call qm_i2c_master_write() from LCD_write(), I pass message without a typecast to a uint8_t pointer (I'm not sure why, I think I recall trying, but it may not have been working the way I wanted it to and I gave up on that part). Admittedly, I might have reached an understanding barrier.

With regard to (2) Is what you are doing with the void pointer truly the same independent of whether the data is signed or unsigned?

Yes, absolutely as far as I can tell. I am writing to LCD memory through the i2c port. I don't see anything in qm_i2c_master_write() that does anything with *message but pass it through.
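
For illustration only, here is a sketch of the same pattern with the conversions written out explicitly. fake_bus_write() is a made-up stand-in that merely records what it was given; it is NOT the real qm_i2c_master_write(), and lcd_write_sketch() is likewise a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint8_t last_sent[32];   /* what the stand-in "bus" last received */
static size_t last_len;

/* Stand-in for a bus routine that insists on a pointer to uint8_t. */
static int fake_bus_write(const uint8_t *data, size_t len)
{
    if (len > sizeof last_sent)
        return -1;
    memcpy(last_sent, data, len);
    last_len = len;
    return 0;
}

/* Accepts any byte-like, NUL-terminated buffer and forwards it unchanged. */
int lcd_write_sketch(const void *message)
{
    const char *text = (const char *) message;        /* for strlen() */
    const uint8_t *bytes = (const uint8_t *) message; /* for the bus call */

    assert(message != NULL);
    return fake_bus_write(bytes, strlen(text));
}
```

The casts do not change the generated behavior here. They only state which type each callee is expected to see, which is the documentation value being discussed.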
 

WBahn

A good case could be made that your LCD_write() function should not use a void pointer but, rather, a pointer to uint8_t, and that you should explicitly cast a pointer to int8_t whenever you pass one in a call. This lets the compiler do stronger type checking, for instance when someplace else in the code you accidentally pass this function the name of an integer array. With the void pointer as it stands now, that code will compile and run and who knows what kind of havoc might ensue. Explicitly casting it also helps document that this was done intentionally so that someone (possibly even you) maintaining the code years from now is less likely to find themselves going down a rabbit hole.

Having said that, a counter case can probably be made that the loss in type checking is offset by the increase in (higher level) readability. I don't think it is a very persuasive case, but it is not without merit.
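
A minimal sketch of the typed alternative described above, assuming two hypothetical functions (count_until_zero and count_signed are names invented for this example): the worker takes a pointer to uint8_t, and the one caller that owns signed data casts explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typed worker: only accepts pointers to uint8_t, so passing
   an int array by mistake draws a compiler diagnostic. */
size_t count_until_zero(const uint8_t *data)
{
    size_t n = 0;

    assert(data != NULL);
    while (data[n] != 0)
        n++;
    return n;
}

/* Hypothetical caller holding signed bytes: the cast is explicit and visible. */
size_t count_signed(const int8_t *text)
{
    return count_until_zero((const uint8_t *) text);
}
```

The trade-off is the one weighed above: every signed caller carries a cast, but an accidental call with, say, an int array no longer compiles quietly.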

The reason that you are not having to cast the void pointer to a uint8_t pointer is because the language standard specifies the following:

"A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer."

Basically (in conjunction with some other fine print) you can assign the value of a pointer of type void to a pointer of any other type and vice-versa without having to use a cast and no warning (let alone error) will be thrown.

But it is generally considered very good practice to do explicit casting whenever you assign a void pointer to another type (including when passing it as an argument). The reason is simple -- you are establishing a contract with the compiler wherein you say that you intend to use this pointer as a pointer to a particular type. If you then proceed to not hold up your end of the contract, the compiler is in a position to smack you. This is a good thing because it can turn very hard to find logic errors that have to be tracked down at runtime into trivially easy to find syntax errors that the compiler can point to at compile time.

For example, it is very common to see malloc() used without a cast. For instance:

double *pete;
int *peter;

...
peter = malloc(elementsInPete * sizeof(*pete));
...

This is almost certainly unintended. But the compiler has no way to detect this because malloc() returns a void pointer and it is perfectly legal to assign a void pointer to a pointer to an int.

However, if this had instead been written:

...
peter = (double *) malloc(elementsInPete * sizeof(*pete));
...

Then the compiler sees a pointer to a double being assigned to a variable that is a pointer to an int; thus it is in a position to complain about it.

So while you don't HAVE to cast message to a uint8_t pointer, it would be good practice to do so. First, it encourages you to think a bit about whether it really is safe to pass that pointer to a signed array of bytes into the function you are calling. Second, it allows the compiler to do better error checking. For instance, if one of the neighboring arguments was supposed to be a pointer to a structure and you accidentally got them in the wrong order, then the compiler might catch it because of the cast. Depending on the specifics, it might not have been able to do so otherwise (for instance, if the other pointer in the call was hardcoded as an uncast NULL pointer).

As with all things, there are certainly exceptions to every rule of style. Most style rules have (or should have) an implied, "unless there's a damn good reason to do otherwise." The attitude of the programmer (and anyone doing code reviews) should then be, "Yes, damn good reasons can and do exist, but they are few and far between. Is this really one of them?" Sometimes the answer will be, "Yes."
 

ErnieM

Joined Apr 24, 2011
8,415
This is my void pointer test code. The compile error is in line 13:


Code:
void f1(void* thing);
void f2(char* thing);

main()
{
char buf[]="hello";
f1(&buf);
f2(&buf);
}

void f1(void* thing)
{
    thing[0]=' ';
}

void f2(char* thing)
{
    thing[0]=' ';
}
 

MrAl

Joined Jun 17, 2014
13,667
Hi,

More toward the main point of the thread which is how to pass both char and uchar with the same function possibly using a void pointer.

First, would it be possible to use a union? I cant remember the last time i used a union but it had to be many many years ago.

My main question though is what it appears to look like is you are trying to fool the function into thinking that the two types are really one type and thus it can operate on both types in the SAME way. This would depend on what the function is actually doing and maybe more important, what the range of the actual data is.
For example, if the range of uint8_t actually being passed is from 0x00 to 0x7F then they fit the range of positive values of type char so they don't need to be uchar. I mention this because your example included only values that fit this range.

The second question is what kind of function would need to handle both types in the SAME way.
This i believe becomes a "type only" question where the types were determined previously and can not be changed, such as in the string "hello\0" where that last null might be being passed as uint8_t.
In this case i think everything could be cast to type char before passing, but in the function cast it to type uchar to check the range. If it is above 0x7F then it must be uchar, but if not then it could be handled as type char. I'd have to review how type char gets cast to type uchar first though.

it would be good to see a simpler function though for your problem so perhaps you can create one. I would be more sure of what i am saying might be done if i could see that.
 

ErnieM

WBahn, would not peter = malloc... issue at least a warning for the type mismatch?

I surely would hope so, so I get reminded to cast the return of malloc before I use it for consistency and error checking sake.
 

Thread Starter

Raymond Genovese

A good case could be made that your LCD_write() function... /--/

I read all of your response carefully. I think I understand your evaluation and explanation and I think it is clearly communicated – thanks. I agree with, essentially, everything you wrote.

My defensive side might be inclined to say, “yeah, well I was not about to do a typecast every time I wanted to do an LCD_write(“whatever”);”.

This speaks to the risk assessment of the trade-off that you talked about. The thing is, you have to constantly try to be as objective as possible when doing that assessment. This is especially so because it can be so compelling to take the side of the trade-off that is the one that you “want” to be right. In other words, we are all vulnerable to bias. One can certainly rationalize their way into doing it one way and then have it bite you in the butt later, and sometimes bite you deep. One of the best defenses against making those mistakes, of course, is greater understanding.

There is one other point that is somewhat philosophical and only tangentially relevant. We all know of friends or colleagues, whose productivity is, well, pretty low. Not because of a lack of skill at all, but because whatever it is that they work on, seems to be in a perpetual state of improvement. They are simply unwilling or unable to make any trade-off at all…..if you can’t get it right…then you can’t do it at all. On the one hand, it is understandable and honorable. On the other hand, “better” is sometimes the enemy of “very good” or even “great” and employers are oftentimes interested more in what you have done (or what you have done lately), not what you are doing.

In any event, I am going to continue to think about this particular pointer issue. It may very well be that the next time it comes up; I will do it very differently.
 

Thread Starter

Raymond Genovese

This is my void pointer test code. The compile error is in line 13:


Code:
void f1(void* thing);
void f2(char* thing);

main()
{
char buf[]="hello";
f1(&buf);
f2(&buf);
}

void f1(void* thing)
{
    thing[0]=' ';
}

void f2(char* thing)
{
    thing[0]=' ';
}
Yup, the compiler here also says.."uhh No"......oh wait, it just changed its response to "Hell No!" :) (sorry I was reading the news this morning, bad idea)
 

Thread Starter

Raymond Genovese

Hi,

More toward the main point of the thread which is how to pass both char and uchar with the same function possibly using a void pointer.

First, would it be possible to use a union? I cant remember the last time i used a union but it had to be many many years ago.

My main question though is what it appears to look like is you are trying to fool the function into thinking that the two types are really one type and thus it can operate on both types in the SAME way. This would depend on what the function is actually doing and maybe more important, what the range of the actual data is.
For example, if the range of uint8_t actually being passed is from 0x00 to 0x7F then they fit the range of positive values of type char so they don't need to be uchar. I mention this because your example included only values that fit this range.

The second question is what kind of function would need to handle both types in the SAME way.
This i believe becomes a "type only" question where the types were determined previously and can not be changed, such as in the string "hello\0" where that last null might be being passed as uint8_t.
In this case i think everything could be cast to type char before passing, but in the function cast it to type uchar to check the range. If it is above 0x7F then it must be uchar, but if not then it could be handled as type char. I'd have to review how type char gets cast to type uchar first though.

it would be good to see a simpler function though for your problem so perhaps you can create one. I would be more sure of what i am saying might be done if i could see that.
I should have thrown in a 0xff – such is the way it goes with a cut and paste, contrived example – mostly to make sure I could clearly get the error and the situation right from memory to a post.

I confess that I don’t think I have used a Union in years. Probably in the days of Borland’s Turbo C and then only to see what the construct was all about (not to say that there were not times that maybe I should have used one) and I had to go refresh my memory.

I don’t know, you could have a Union with arrays of uint8_t and int8_t, but you would still need to pick the right one – U.unsigned or U.signed. - right? Off hand, I’m not sure what it would gain.
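
A hedged sketch of what such a union might look like if a tag is stored alongside it to record which member is live. All names here (byte_kind, tagged_byte, tagged_value) are invented for the example; C itself does not check that the tag and the member agree.

```c
#include <assert.h>
#include <stdint.h>

/* Records which union member currently holds the value. */
typedef enum { BYTE_SIGNED, BYTE_UNSIGNED } byte_kind;

typedef struct {
    byte_kind kind;
    union {
        int8_t  s;   /* valid when kind == BYTE_SIGNED   */
        uint8_t u;   /* valid when kind == BYTE_UNSIGNED */
    } as;
} tagged_byte;

/* Reads the member selected by the tag and widens it to int. */
int tagged_value(tagged_byte b)
{
    assert(b.kind == BYTE_SIGNED || b.kind == BYTE_UNSIGNED);
    return (b.kind == BYTE_SIGNED) ? (int) b.as.s : (int) b.as.u;
}
```

This confirms the doubt raised above: the caller still has to pick the right member when storing the value, and the tag costs extra storage per item, so the gain over a plain pointer is small for a simple write routine.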

I have not thought this through at all, but what about using assembly? I know that this compiler has the usual C inline assembly. I don’t know much of anything about the D2000 Quark assembly language except that it is likely some kind of Intel x86 variant.

Again, I have not thought this through, so I’m not sure how it might work, but the code itself could be more relaxed about whether it was being passed a pointer to a int8_t or an uint8_t. Of course you would still have an argument of one type or another being sent and would still get the compiler mismatch complaints – I don’t know, I have to think about it more. If I recall, you have some experience in x86, do you think using inline somehow might also work? The more I am thinking about it, the less I like it.
 

WBahn

Hi,

More toward the main point of the thread which is how to pass both char and uchar with the same function possibly using a void pointer.

First, would it be possible to use a union? I cant remember the last time i used a union but it had to be many many years ago.
This is one of the main reasons that unions get used -- the other (which is possibly the main reason why they were originally supported) is that you can share the allocated memory between different data types as long as only one of them is used at a time. Unfortunately, C lacks the facilities to do any type checking on a union (some languages do, but at the cost of considerable run-time overhead, which we know C eschews as much as possible). This one thing -- the inability to type check unions -- is the one thing that results in C not being considered a strongly-typed language.

My main question though is what it appears to look like is you are trying to fool the function into thinking that the two types are really one type and thus it can operate on both types in the SAME way. This would depend on what the function is actually doing and maybe more important, what the range of the actual data is.
For example, if the range of uint8_t actually being passed is from 0x00 to 0x7F then they fit the range of positive values of type char so they don't need to be uchar. I mention this because your example included only values that fit this range.

The second question is what kind of function would need to handle both types in the SAME way.
And that's one of the key considerations when deciding to do something like this -- are the values that are expected to be used in any of the types involved compatible with each other. If not, then you have a real problem because you have no way to distinguish which are which.

This i believe becomes a "type only" question where the types were determined previously and can not be changed, such as in the string "hello\0" where that last null might be being passed as uint8_t.
In this case i think everything could be cast to type char before passing, but in the function cast it to type uchar to check the range. If it is above 0x7F then it must be uchar, but if not then it could be handled as type char. I'd have to review how type char gets cast to type uchar first though.
But if it is above 0x7F so what? You do not know if it started out as a large positive number of type uchar or a negative number of type schar. If both are possible values, you simply cannot distinguish them. The allowed values that are outside the common range must have distinct representations. For instance, if you might have large positive values but they are all even or you might have negative values but they are all odd, then you could handle this -- but in general you can't.
 

WBahn

WBahn, would not peter = malloc... issue at least a warning for the type mismatch?

I surely would hope so, so I get reminded to cast the return of malloc before I use it for consistency and error checking sake.
What type mismatch? There IS no type mismatch. You can assign a void pointer to a pointer of any type and you can assign a pointer of any type to a void pointer. This behavior is required by the standard. It is up to the programmer to properly typecast all void pointers before using them.

There may be some compilers that have a configuration option to check this and I would imagine most lint checkers will flag these things.
 

WBahn

This is my void pointer test code. The compile error is in line 13:


Code:
void f1(void* thing);
void f2(char* thing);

main()
{
char buf[]="hello";
f1(&buf);
f2(&buf);
}

void f1(void* thing)
{
    thing[0]=' ';
}

void f2(char* thing)
{
    thing[0]=' ';
}
Yep. The compiler has no idea how to compute the address offset from the pointer in order to access the specific item mentioned. The code the compiler generates multiplies the index by the size of one of the things pointed to and adds that to the value of the pointer. It can't do that since the pointer points to an object of unknown type and, hence, unknown size. Even if there is no need to perform an offset calculation, for instance by doing:

*thing = ' ';

The compiler can't generate the code to coerce the value ' ' into the representation used to store something at the address 'thing' because it doesn't know the data type. If 'thing' is pointing to a value of type float then the bit representation is very different than if it is pointing to an int which is very different than if it is a char.

This is the key problem with trying to think of a void pointer as pointing to a block of bytes. It is not. It is pointing to a memory address and the type of data stored at that address is unknown. Period. You can ONLY use a void pointer to do things for which you do not need to know the type of data in any way, shape, or form. For anything else you must explicitly cast it to (or assign it to a variable of) a complete type.
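
A minimal sketch of a variant of f1() that compiles, under the assumption that the caller really does pass a character buffer. f1_fixed is a hypothetical name used here to avoid clashing with the test code above.

```c
#include <assert.h>
#include <stddef.h>

/* Variant of f1(): converts the void pointer to a complete type before
   indexing, so the compiler knows the element size and representation. */
void f1_fixed(void *thing)
{
    char *p = (char *) thing;   /* the promise: "this points to chars" */

    assert(thing != NULL);
    p[0] = ' ';
}
```

Passing the array name itself (f1_fixed(buf)) is enough; the array decays to a pointer to its first element, which then converts to a void pointer without a cast.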
 

MrAl

This is one of the main reasons that unions get used -- the other (which is possibly the main reason why they were originally supported) is that you can share the allocated memory between different data types as long as only one of them is used at a time. Unfortunately, C lacks the facilities to do any type checking on a union (some languages do, but at the cost of considerable run-time overhead, which we know C eschews as much as possible). This one thing -- the inability to type check unions -- is the one thing that results in C not being considered a strongly-typed language.



And that's one of the key considerations when deciding to do something like this -- are the values that are expected to be used in any of the types involved compatible with each other. If not, then you have a real problem because you have no way to distinguish which are which.



But if it is above 0x7F so what? You do not know if it started out as a large positive number of type uchar or a negative number of type schar. If both are possible values, you simply cannot distinguish them. The allowed values that are outside the common range must have distinct representations. For instance, if you might have large positive values but they are all even or you might have negative values but they are all odd, then you could handle this -- but in general you can't.

Hi again,

Yes that last part is not right. What i was thinking was if ALL the data was limited to 0x00 to 0x7F then that high order bit could act as selector. To pass a char pass 0x00 to 0x7F, to pass a uchar pass the value (also 0x00 to 0x7F) with 0x80 added, or just 'or' it in. In the function, check for the high bit set or not set and decide what type it is, then if set use something like val-0x80 or val&0x7F.
That would work, but of course the range of data is then always limited to 7 bits only.

Since we like passing int's anyway, we could do the same with a full 32 bit int. Set the sign bit for type char, dont set it for type uchar, or vice versa. In the function AND it out after testing for type val=val&0x000000FF or maybe just cast it. Again we would be using a type int but really passing a char or uchar but hey if it works it works :)
In other words in the function a negative value is uchar (after removing the bit that represents the sign like with val&0x000000FF) and a positive value is char.

long MyFunc(int val)
{
    if (val < 0) {
        /* handle int val as uchar after removing sign.. */
    } else {
        /* handle int val as char... */
    }

    return 0; /* success */
}

Of course testing adds to the function execution time, but if the function has significant other activity it wont add up to much extra run time overall.
This is another time when statistical time profiling would come into play in deciding what value should be tested (and thus reduced). If statistically type char appears most often in the main code, test for type uchar instead. This of course depends on the application itself and someone that knows the typical usage of that function so they can judge the two frequencies. If there is no systematic deviation, then we can do it either way. Since we have to cast it somehow anyway though it may not make much difference either way. Might even be better to pass type unsigned int instead and just test the bit.
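
Here is a rough sketch of the "pass an unsigned int and just test the bit" variant mentioned at the end. Everything in it is an assumption made for illustration: the flag position (bit 8, just above the data byte) and the names TAG_UNSIGNED, pack_u8, pack_s8, and unpack are all invented.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_UNSIGNED 0x100u   /* assumed flag bit: set means "was a uchar" */

/* Packs an unsigned byte and marks it with the flag. */
unsigned pack_u8(uint8_t v)
{
    return TAG_UNSIGNED | (unsigned) v;
}

/* Packs a signed byte as its 8-bit pattern, flag left clear. */
unsigned pack_s8(int8_t v)
{
    return (unsigned) (uint8_t) v;
}

/* Recovers the original numeric value by testing the flag. */
int unpack(unsigned packed)
{
    unsigned low = packed & 0xFFu;

    assert((packed & ~0x1FFu) == 0);   /* only flag and data bits allowed */
    if (packed & TAG_UNSIGNED)
        return (int) low;                             /* 0 .. 255    */
    return (low >= 128u) ? (int) low - 256 : (int) low; /* -128 .. 127 */
}
```

With an explicit flag, 246 and -10 survive the trip as different values even though their data bytes are identical. The price is that every call site has to pack the value, which is more ceremony than a cast.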
 