signed char

Thread Starter

gogo00

Joined Oct 28, 2023
43
The char data type in C stores characters. I'm a bit confused about the signed char variable in C. In C, an unsigned char holds values ranging from 0 to 255, while a signed char holds both positive and negative values in the range -128 to 127.
 

Papabravo

Joined Feb 24, 2006
22,058
You are correct. In a signed character representation, the 256 possible values in a byte represent 128 non-negative values from 0 to 127 (0x00 to 0x7F) and 128 negative values from -128 to -1 (0x80 to 0xFF). Where does this make a difference? If you cast a signed character to a signed integer, it gets "sign extended". That is, 0xC0 when cast to a 16-bit signed int becomes 0xFFC0 (0xFFFFFFC0 for a 32-bit int). It also matters for comparisons. That is, -15 < 3. In the unsigned case it would be the other way around.
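
A quick demo of both effects -- a sketch assuming an 8-bit two's complement signed char, which is what virtually every modern compiler gives you:
Code:
#include <stdio.h>

int main(void)
{
    signed char   sc = (signed char)0xC0;  /* bit pattern 1100 0000 = -64 */
    unsigned char uc = 0xC0;               /* same bits, value 192 */

    /* sign extension: the signed char widens to a negative int */
    printf("%d\n", (int)sc);               /* prints -64 */
    printf("%d\n", (int)uc);               /* prints 192 */

    /* comparisons depend on the signedness */
    signed char   a = -15;
    unsigned char b = (unsigned char)-15;  /* wraps around to 241 */
    printf("%d\n", a < 3);                 /* 1: -15 is less than 3 */
    printf("%d\n", b < 3);                 /* 0: 241 is not */
    return 0;
}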
 

dl324

Joined Mar 30, 2015
18,220
I'm a bit confused about the signed char variable in C.
Other than the minimum signed char value being implementation dependent, your statements are correct.

This is for gcc on Debian on Win10:
Code:
grep SCHAR_MIN limits.h
limits.h:#  define SCHAR_MIN    (-128)
limits.h:#   define CHAR_MIN    SCHAR_MIN
What are you confused about?
 

BobTPH

Joined Jun 5, 2013
11,463
In C, char is an integer data type that is guaranteed to be able to store one ASCII character. ASCII characters are in the range 0 to 127. Either a signed 8-bit integer or an unsigned one can store values in that range, so both are capable of storing one ASCII character.
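
For instance, the letter 'A' (ASCII 65) fits in either one. A minimal illustration:
Code:
#include <stdio.h>

int main(void)
{
    signed char   s = 'A';    /* 65 fits in -128..127 */
    unsigned char u = 'A';    /* 65 fits in 0..255    */
    printf("%c %c\n", s, u);  /* prints: A A */
    return 0;
}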
 

ApacheKid

Joined Jan 12, 2015
1,762
The char data type in C stores characters. I'm a bit confused about the signed char variable in C. In C, an unsigned char holds values ranging from 0 to 255, while a signed char holds both positive and negative values in the range -128 to 127.
So, what are you confused about exactly?
 

Thread Starter

gogo00

Joined Oct 28, 2023
43
I understand that the char data type can store ASCII values from 0 to 127, but I'm a bit confused about the range from -128 to -1.

Could you help clarify what happens in this range? I'm curious about how negative ASCII values, such as -128 to -1, are handled or if they have any representation.
 

BobTPH

Joined Jun 5, 2013
11,463
There are no negative ASCII values. C actually has no datatype that is an ASCII character. signed char is an integer type. What happens in that range is simply what happens with integer values.
 

Papabravo

Joined Feb 24, 2006
22,058
I understand that the char data type can store ASCII values from 0 to 127, but I'm a bit confused about the range from -128 to -1.

Could you help clarify what happens in this range? I'm curious about how negative ASCII values, such as -128 to -1, are handled or if they have any representation.
The representation of negative numbers is known as Two's Complement notation. Negative values are handled in the same way as positive numbers. If I tell you that a signed character of all 1's represents -1 and ask you to add +10 and -1 you would immediately say "9"! Now do it with a binary adder.

 0000 1010 = +10 decimal
+1111 1111 =  -1 decimal
-----------
 0000 1001 =  +9 decimal

It is true that there is a carry out of the high order bit, but we drop that when using Two's Complement arithmetic.
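
Here is the same sum in C. One caveat: C promotes the operands to int before adding and then narrows the result back, but that lands on the same answer the 8-bit adder gives (assuming the usual 8-bit two's complement signed char):
Code:
#include <stdio.h>

int main(void)
{
    signed char a = 10;  /* 0000 1010 */
    signed char b = -1;  /* 1111 1111 */

    /* the narrowing conversion back to 8 bits matches the
       adder output with the carry out of the high bit dropped */
    signed char sum = (signed char)(a + b);
    printf("%d\n", sum); /* prints 9 */
    return 0;
}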
 

xox

Joined Sep 8, 2017
936
I understand that the char data type can store ASCII values from 0 to 127, but I'm a bit confused about the range from -128 to -1.


Could you help clarify what happens in this range? I'm curious about how negative ASCII values, such as -128 to -1, are handled or if they have any representation.
ASCII is a standard which defines a mapping from a 7-bit unsigned integer to a "glyph". These characters are sometimes treated like ordinals during comparisons. For example 'a' < 'z' and 'a' + 2 = 'c'. So it is a quasi-number system of sorts.

An 8-bit signed byte, on the other hand, uses the highest bit to indicate the sign. Since ASCII needs only seven bits, you could theoretically "store" an extra bit of information within a given char variable (which of course would have to be masked off before treating it as an ordinary char value). The point being that ASCII, signed integers, and their unsigned counterparts are each an example of an encoding scheme. Among others...
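
The ordinal behaviour is easy to demonstrate, since character constants in C are really just small integers (this assumes an ASCII execution character set):
Code:
#include <stdio.h>

int main(void)
{
    printf("%d\n", 'a' < 'z');  /* 1: 97 < 122 */
    printf("%c\n", 'a' + 2);    /* c: 97 + 2 = 99 */
    return 0;
}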
 

MrChips

Joined Oct 2, 2009
34,628
8-bit binary is, well, binary.

You can use it to represent 256 different things, even 256 random numbers, negative or positive. What it represents is entirely up to you.

The era of the IBM PC introduced the extended 256-character set.

https://www.ascii-codes.com/
 

WBahn

Joined Mar 31, 2012
32,704
I understand that the char data type can store ASCII values from 0 to 127, but I'm a bit confused about the range from -128 to -1.

Could you help clarify what happens in this range? I'm curious about how negative ASCII values, such as -128 to -1, are handled or if they have any representation.
There are no negative ASCII values.

The 'char' data type in C is fundamentally just an integer data type whose name happens to be 'char'. It is no different from the other integer data types such as 'int' or 'long'.

The C standard sets lower limits on the range of values each data type must be able to represent. For an integer of type 'char', that is from 0 through +255 if the type is unsigned and at least -127 through +127 if the type is signed (two's complement implementations extend the negative end to -128).

The C standard also requires that all naked integer data types EXCEPT 'char' must be signed. For 'char', it leaves that up to the implementation. So if you want to be able to represent values greater than +127, you should be careful to include the 'unsigned' type modifier, while if you want to be able to represent negative values, you want to include the 'signed' modifier.
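
For instance (the variable names here are made up purely for illustration):
Code:
#include <stdio.h>

int main(void)
{
    unsigned char pixel  = 200;   /* needs 0..255, so force unsigned  */
    signed char   offset = -40;   /* needs negatives, so force signed */
    char          letter = 'A';   /* plain char is fine for character codes */
    printf("%d %d %c\n", pixel, offset, letter);  /* 200 -40 A */
    return 0;
}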

The 'char' data type is typically used to store single-byte values (although a given implementation is free to make that byte wider than eight bits).

Notice that NOTHING above says ANYTHING about characters or ASCII.

The C standard also requires that the execution character set include encodings for a specific set of characters, at a minimum. ASCII happens to cover all of those required characters and is, by far, the most commonly used execution character set. It also requires that the 'char' data type be wide enough to be able to represent all encodings of the execution character set.

Probably the most common use of a variable of type 'char' is to store the encoding of one member of the execution character set, hence the name that was chosen for this data type. This does NOT mean that EVERY variable of type char can be interpreted as a character. How the value is interpreted is up to the programmer.

Consider the following:

int monkeys;

This variable might be used to store the number of monkeys in a room.

Just because one instance of an 'int'-type variable is used to store a number of monkeys, does not imply that all 'int'-type variables have something to do with monkeys.

The same with instances of 'char'-type variables. Just because they are commonly used to store character codes, that does NOT imply that they always do.

IF you are using it to store character codes, then it is your responsibility to ensure that the values stored in that variable do not exceed the bounds of the character encoding that you are using. If you do allow that, then the behavior is usually undefined if you try to interpret that value as a character code at some point.
 

ApacheKid

Joined Jan 12, 2015
1,762
I understand that the char data type can store ASCII values from 0 to 127, but I'm a bit confused about the range from -128 to -1.

Could you help clarify what happens in this range? I'm curious about how negative ASCII values, such as -128 to -1, are handled or if they have any representation.
Modern computer memory is organized into pieces called "bytes" and these comprise eight bits. A "char" is a type that represents a byte of memory, it refers to a value that is always eight bits, an "unsigned char" too is also always eight bits.

We can store data in an eight bit byte, we can store 00000000 up to 11111111 and all the combinations in-between.

The difference between these is important when doing arithmetic. Outside of the eight bits there is no way to store a sign for a number, so to indicate positive or negative we need a bit. One of the eight bits is therefore used for the sign, and the remaining seven bits are used to store the value.

char means a value that is stored as seven bits and a sign, + or -.
unsigned char means a value that is stored as eight bits and no sign; there is no + or -, it is just always regarded as being positive.

As others here have explained, ASCII (an old way to represent typewriter symbols and the like) is a seven-bit code, created many years ago to represent a limited number of symbols. There's no such thing as a negative ASCII character either.

An ASCII character can - in the C language - be stored in either a char or an unsigned char; it doesn't matter. The difference only matters when doing arithmetic, that is, when using the value as a number rather than a symbol like 'a' or 'T' or '@'.

Does this help?
 
Last edited:

Papabravo

Joined Feb 24, 2006
22,058
8-bit binary is, well, binary.

You can use it to represent 256 different things, even 256 random numbers, negative or positive. What it represents is entirely up to you.

The era of the IBM PC introduced the extended 256-character set.

https://www.ascii-codes.com/
I would argue that 8-bit character sets preceded the introduction of the IBM PC by at least a decade and a half. System 360 (ca. 1964) had the 8-bit EBCDIC character set.
 

WBahn

Joined Mar 31, 2012
32,704
Modern computer memory is organized into pieces called "bytes" and these comprise eight bits. A "char" is a type that represents a byte of memory, it refers to a value that is always eight bits, an "unsigned char" too is also always eight bits.
This is NOT true -- The C standard does NOT require that a 'char' data type always be eight bits -- that is merely the minimum required width, even if the execution character set can be represented with fewer bits. But the standard also requires that it be wide enough to represent every code in the execution character set, so if the execution character set has 500 characters, the minimum width of a 'char' would be 9, but would most likely be 16.

char means a value that is stored as seven bits and a sign, + or -.
unsigned char means a value that is stored as eight bits and no sign; there is no + or -, it is just always regarded as being positive.
Again, not true. A naked 'char' declaration is either signed or unsigned, at the discretion of the implementation. The limits.h header file has macros that let the program detect which case was chosen, should the distinction be important.
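
For example, testing CHAR_MIN tells you which choice your implementation made:
Code:
#include <stdio.h>
#include <limits.h>

int main(void)
{
#if CHAR_MIN < 0
    printf("plain char is signed here (CHAR_MIN = %d)\n", CHAR_MIN);
#else
    printf("plain char is unsigned here (CHAR_MIN = %d)\n", CHAR_MIN);
#endif
    return 0;
}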

I've worked with compilers that have made different choices on this. If you assume that every compiler is going to make the same choices that the compilers you have used have made, that is asking to get bit.
 

WBahn

Joined Mar 31, 2012
32,704
I would argue that 8-bit character sets preceded the introduction of the IBM PC by at least a decade and a half. System 360 (ca. 1964) had the 8-bit EBCDIC character set.
But that is not an extended ASCII character set, which is what he was referring to.

Having said that, I don't know whether the PC era is the first in which the 7-bit ASCII code was extended to 8-bits or not. It's almost certainly the one that has caused the most heartache, since there are so many of them and no standard was ever devised or adopted. In fact, I'm pretty sure it wasn't, since systems like Atari and other video game systems often used extensions for defining sprites in the game.

The character set that we often refer to as IBM Extended ASCII is not even a true ASCII extension, as it assigned glyphs (such as smiley faces) to many of the ASCII control codes.
 

MrChips

Joined Oct 2, 2009
34,628
I would argue that 8-bit character sets preceded the introduction of the IBM PC by at least a decade and a half. System 360 (ca. 1964) had the 8-bit EBCDIC character set.
Yeah but...
EBCDIC was not ASCII.
ASCII started off as a 128-character set. The 8th bit was reserved for parity.
 

Papabravo

Joined Feb 24, 2006
22,058
But that is not an extended ASCII character set, which is what he was referring to.

Having said that, I don't know whether the PC era is the first in which the 7-bit ASCII code was extended to 8-bits or not. It's almost certainly the one that has caused the most heartache, since there are so many of them and no standard was ever devised or adopted. In fact, I'm pretty sure it wasn't, since systems like Atari and other video game systems often used extensions for defining sprites in the game.

The character set that we often refer to as IBM Extended ASCII is not even a true ASCII extension, as it assigned glyphs (such as smiley faces) to many of the ASCII control codes.
The original claim did not mention ASCII, only 8-bit character sets. There were 8-bit extensions of 7-bit ASCII for other languages and alphabets. It's unlikely they would be called American standards, but they existed at least a decade before 1981. The IBM PC's claim to fame is notable, but it was one among many.
 

ApacheKid

Joined Jan 12, 2015
1,762
This is NOT true -- The C standard does NOT require that a 'char' data type always be eight bits -- that is merely the minimum required width, even if the execution character set can be represented with fewer bits. But the standard also requires that it be wide enough to represent every code in the execution character set, so if the execution character set has 500 characters, the minimum width of a 'char' would be 9, but would most likely be 16.



Again, not true. A naked 'char' declaration is either signed or unsigned, at the discretion of the implementation. The limits.h header file has macros that let the program detect which case was chosen, should the distinction be important.

I've worked with compilers that have made different choices on this. If you assume that every compiler is going to make the same choices that the compilers you have used have made, that is asking to get bit.
Yes, I accept your points; you are correct. But for the purposes of explaining the idea to the OP I wanted to avoid this idiosyncratic stuff. You also know my opinions on these traits of the C language: I think it's poorly designed, outdated, and hugely confusing when you factor in stuff like this.

If a char were, say, 16 bits, then the entire question of the legal range of numeric values changes: we're no longer able to say the range of char is -128 to 127, nor can we say the range of unsigned char is 0 to 255. So for the OP's question to even make sense, we are inherently assuming a char is 8 bits, given the way it is stated.
 

WBahn

Joined Mar 31, 2012
32,704
Yes, I accept your points, you are correct. But for the purposes of explaining the idea to the OP I wanted to avoid this idiosyncratic stuff. You also know my opinions on these traits of the C language, I think it's poorly designed, outdated and a hugely confusing language when you factor in stuff like this.

If a char was say 16 bits then the entire question of the legal range of numeric values changes, we're no longer able to say the range of char is -128 to 127 nor can we say the range of unsigned char is 0 to 255, so for the OP to even make sense we are inherently assuming a char is 8 bits given the way his question is stated.
The TS gave the range for the char data type on his system, so it is quite reasonable to discuss the limits on his system. That is very different from making the blanket assertion that a char is ALWAYS eight bits and that a char is always signed.
 

ApacheKid

Joined Jan 12, 2015
1,762
The TS gave the range for the char data type on his system, so it is quite reasonable to discuss the limits on his system. That is very different than making the blanket assertion that a char is ALWAYS eight bits and that a char is always signed.
Yes, I accept your point, but if a newcomer to C were confronted with the true nature of the language when first learning it, they'd likely scream at how much basic behavior is "implementation defined".
 