Rules of the C language are very confusing

Thread Starter

Kittu20

Joined Oct 12, 2022
462
I find the rules of the C language very confusing.

This is a valid statement in C:
Code:
long long count;
I don't understand why the following statement is invalid in C:
Code:
short short count;
Can someone please explain the reason behind this?
 

ZCochran98

Joined Jul 24, 2018
303
A "short" is exactly 16 bits, while a "long" is either a 32- or 64-bit integer (depending on if your OS is a 32-bit or 64-bit OS). A "short short" would be redundant to the byte or char data types, so it is unnecessary, and thus not valid syntax. The "long long" asserts that you're getting a 64-bit integer, as there's no other native guaranteed 64-bit number.
 

BobTPH

Joined Jun 5, 2013
8,804
The C language allows as many short or long modifiers as you want.

long long long long long int x;

is also valid. Compilers can choose what to do with each number of longs or shorts.
 

nsaspook

Joined Aug 27, 2009
13,079
I find the rules of the C language very confusing.

This is a valid statement in C:
Code:
long long count;
I don't understand why the following statement is invalid in C:
Code:
short short count;
Can someone please explain the reason behind this?
Great answers above but:
Some people might get the wrong idea about short shorts.
[GIF: we-wear-short-shorts-nair.gif]
 

WBahn

Joined Mar 31, 2012
29,976
A "short" is exactly 16 bits, while a "long" is either a 32- or 64-bit integer (depending on if your OS is a 32-bit or 64-bit OS). A "short short" would be redundant to the byte or char data types, so it is unnecessary, and thus not valid syntax. The "long long" asserts that you're getting a 64-bit integer, as there's no other native guaranteed 64-bit number.
A "short int" is not exactly 16 bits; it is merely required to be at least 16 bits. The same is true of an "int". A "long int" must be at least 32 bits. The "long long int" type must be at least 64 bits.

Since an int is usually 32 bits even on modern 64-bit platforms, a long int is commonly either 32 or 64 bits depending on the data model, and a long long int is typically 64 bits; all of these satisfy the required minimums, and a few platforms make some of them wider still.

Similarly, a short is 16 bits on virtually every mainstream platform, even though only the 16-bit minimum is guaranteed.

You might then think that "short short int" would be the way to ask for a still-narrower integer, but that is not a defined type specifier (not in C99, and not in C11 either).
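
If you want to see what a given platform actually uses, a quick sketch like the one below prints the storage size of each type in bits (just an illustration; the numbers it prints depend entirely on your compiler and target):
Code:
/* Prints how many bits of storage each integer type occupies on this
 * platform. The standard only guarantees minimums: char >= 8,
 * short/int >= 16, long >= 32, long long >= 64 bits. */
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("char      : %zu bits\n", sizeof(char) * CHAR_BIT);
    printf("short     : %zu bits\n", sizeof(short) * CHAR_BIT);
    printf("int       : %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("long      : %zu bits\n", sizeof(long) * CHAR_BIT);
    printf("long long : %zu bits\n", sizeof(long long) * CHAR_BIT);
    return 0;
}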
 

Thread Starter

Kittu20

Joined Oct 12, 2022
462
Thank you all for clearing up the confusion. So it depends on how it is implemented by the compiler.

data type : int, char, float, double
type qualifier : short, long

Is the type qualifier used to change the size of a data type?

Do the rules of the C language say that a short must always be equal to or less than an int, that an int must be equal to or less than a long, and that a long must be equal to or less than a long long?
 

WBahn

Joined Mar 31, 2012
29,976
Thank you all for clearing up the confusion. So it depends on how it is implemented by the compiler.
Within limits. The C language intentionally left a lot to the compiler implementation because back in those days the capabilities of different processors were all over the map. Both the code, once compiled, and the compiler that was compiling it were running on platforms that were extremely resource-starved by today's standards. Plus, most of the optimization that was done was hand-written code exploiting specific code patterns matched to the specific capabilities of the hardware.

data type : int, char, float, double
type qualifier : short, long

Is the type qualifier used to change the size of a data type?
It MAY change the size of the data type.

Do the rules of the C language say that a short must always be equal to or less than an int, that an int must be equal to or less than a long, and that a long must be equal to or less than a long long?
Yes (if by "equal" and "less than" you are referring to its precision/width). This is required by the rank requirements for the integer data types covered in Section 6.3 of the C11 draft standard. Among other things, these require that

1) No two signed integer types shall have the same rank, even if they have the same representation.
2) The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
3) The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
4) The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
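
A rough way to see the practical consequence on your own machine (a sketch only; the exact values are implementation-defined, but the ordering is required):
Code:
/* The range of each signed type must contain the range of every
 * lower-ranked signed type, so these maxima never decrease on a
 * conforming implementation. */
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("SHRT_MAX  = %d\n", SHRT_MAX);
    printf("INT_MAX   = %d\n", INT_MAX);
    printf("LONG_MAX  = %ld\n", LONG_MAX);
    printf("LLONG_MAX = %lld\n", LLONG_MAX);
    return 0;
}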
 

WBahn

Joined Mar 31, 2012
29,976
The C language allows as many short or long modifiers as you want.

long long long long long int x;

is also valid. Compilers can choose what to do with each number of longs or shorts.
This is not true. The C language standard provides an explicitly-enumerated list of allowed type specifier combinations (see "Constraints" under Section 6.7.2 Type Specifiers, in either the C99 or the C11 standard).

So if a compiler accepts

long long long int x;

it is a non-conforming implementation.

Similarly, there is no "short short" allowed.
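
For reference, a sketch of how that enumerated list plays out for the integer types (the commented-out lines are the ones a conforming compiler must reject):
Code:
short int a;          /* valid: "short int", or just "short"          */
long int b;           /* valid: "long int", or just "long"            */
long long int c;      /* valid: "long long int", or just "long long"  */
/* short short int d;    invalid: not in the enumerated list          */
/* long long long int e; invalid: at most two "long"s are allowed     */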
 

WBahn

Joined Mar 31, 2012
29,976
What do you mean by short short?

I find using uint64_t to be more meaningful than using unsigned long long int. If you wanted short short int to be 8 bits, you could use uint8_t (for unsigned).
I agree.

If you want/need a specific width for your integer representation, then use the size-specific typedefs found in <stdint.h>.

There CAN be processing advantages to using the older types, such as int, as these are usually chosen to be matched to the native datapath of the processor/platform. But care needs to be taken to ensure that they are adequate for your needs across all possible platforms.

Also, there's a caveat that few people are aware of. An implementation is not required to provide all of the size-specific data types. For instance, if none of the classic types is a 16-bit integer, then int16_t and uint16_t are likely undefined. This is because these size-specific types are simply typedefs that map onto the appropriate classic type for that compiler.

This can be a problem for compilers that map an int to 64 bits, since a char is pretty universally 8 bits and the only thing in between is a short. A short is then either 16 bits or 32 bits, and whichever one it isn't has no classic type for it, so the corresponding size-specific types would be undefined. This is probably the reason that most 64-bit implementations have chosen to leave an int at 32 bits, even though that is not the native datapath width.
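
As a small, hedged illustration of that advice (the variable names are made up): the exact-width typedefs are the convenient ones, and the least-width ones are the fallback that every implementation must provide.
Code:
/* Fixed-width and least-width types from <stdint.h>; the exact-width
 * typedefs are optional if no classic type matches exactly, but the
 * least-width ones are always present. */
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint8_t        flags   = 0x5A;              /* exactly 8 bits, if provided   */
    uint64_t       big     = UINT64_C(1) << 40; /* exactly 64 bits, if provided  */
    uint_least16_t counter = 40000;             /* at least 16 bits, always there */

    printf("flags=%" PRIu8 " big=%" PRIu64 " counter=%" PRIuLEAST16 "\n",
           flags, big, counter);
    return 0;
}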
 

ApacheKid

Joined Jan 12, 2015
1,533
I find the rules of the C language very confusing.

This is a valid statement in C:
Code:
long long count;
I don't understand why the following statement is invalid in C:
Code:
short short count;
Can someone please explain the reason behind this?
It is in fact the C grammar that is the source of these inelegant constructs; it is technically a poor grammar and has unfortunately contaminated many more recent languages, which just perpetuate the problems.

C is what it is, though, and unless someone takes a clean sheet of paper and really designs a grammar from the ground up, we're stuck with it. Even C#, Go, Swift, and Rust carry the contamination, and this is unlikely to change as long as people continue to treat the C grammar as a sacrosanct necessity for all imperative languages.
 
Last edited:

nsaspook

Joined Aug 27, 2009
13,079
It is in fact the C grammar that is the source of these inelegant constructs; it is technically a poor grammar and has unfortunately contaminated many more recent languages, which just perpetuate the problems.

C is what it is, though, and unless someone takes a clean sheet of paper and really designs a grammar from the ground up, we're stuck with it. Even C#, Go, Swift, and Rust carry the contamination, and this is unlikely to change as long as people continue to treat the C grammar as a sacrosanct necessity for all imperative languages.
There have been many possible 'better' languages over the years that make C's inelegance (inelegant because it interacts with the real world of inelegant, unsafe, non-deterministic hardware) look wonderful, with their endless restrictions on program structure that tend to ignore the low-level requirements C serves: what the programming priesthood would call unsafe behaviours and constructs, needed to handle bit- and byte-level I/O patterns efficiently and in a way that mimics actual hardware. The only sacrosanct feature of C is that it works at a level that hardware people who program understand, and it becomes useful quickly and logically as an extension of low-level hardware abstractions.

I blame ALGOL. ;)
https://en.wikipedia.org/wiki/TPK_algorithm

http://www.club.cc.cmu.edu/~ajo/disseminate/STAN-CS-76-562_EarlyDevelPgmgLang_Aug76.pdf
 
Last edited:

ApacheKid

Joined Jan 12, 2015
1,533
There have been many possible 'better' languages over the years that make C's inelegance (inelegant because it interacts with the real world of inelegant, unsafe, non-deterministic hardware) look wonderful, with their endless restrictions on program structure that tend to ignore the low-level requirements C serves: what the programming priesthood would call unsafe behaviours and constructs, needed to handle bit- and byte-level I/O patterns efficiently and in a way that mimics actual hardware. The only sacrosanct feature of C is that it works at a level that hardware people who program understand, and it becomes useful quickly and logically as an extension of low-level hardware abstractions.

I blame ALGOL. ;)
https://en.wikipedia.org/wiki/TPK_algorithm

http://www.club.cc.cmu.edu/~ajo/disseminate/STAN-CS-76-562_EarlyDevelPgmgLang_Aug76.pdf
The fact remains, C has a poorly designed grammar that can lead to serious bugs.

1. Being able to use a function as both a call target and within an expression.
2. Using an assignment statement as an expression.
3. Reliance on reserved words.
4. No string datatype.
5. No bit data type.

and more.

Point 1 basically means that we can "call" a function and blissfully ignore its returned value. Point 2 arises with things like counter++ or --depth; with these we can write expressions whose operands are assignments, a totally bizarre thing to do. Point 3 impacts the ability to grow the language easily, and point 4 means we have to jump through hoops because we cannot treat a string like just another data type. Point 5 just kills me: what stopped the designers from having a type named "bit" or something? That is huge for a language that manipulates hardware.
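
A tiny hypothetical sketch of points 1 and 2, just to show that the compiler accepts both (many compilers will at best warn about the second one):
Code:
#include <stdio.h>

int main(void)
{
    int x = 5;

    /* Point 2: assignment is an expression, so this compiles; the
     * condition becomes 0, which is almost certainly not what was
     * meant if the intent was "x == 0". */
    if (x = 0)
        printf("never reached\n");

    /* Point 1: a function's return value can be silently discarded;
     * printf returns an int that nothing here ever looks at. */
    printf("x is now %d\n", x);

    return 0;
}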

I've written truckloads of C, and with strict discipline these problems can be contained, but the language grammar really is a terrible and amateurish design. These flaws carry over into C++, Java, C#, and so on; Rust, interestingly, does not support the pre/post increment operators.

The entire way that language tokens are defined and used is terrible. Why put <typename> <identifier> rather than, say, <declare-keyword> <identifier> <typename>?
 
Last edited:

nsaspook

Joined Aug 27, 2009
13,079
I've never needed a bit type to write kernel code that directly manipulates bits. IMO a bit type is a useless abstraction: when you need to manipulate physical bits through a memory-mapped register interface, you can already declare bit-fields in structures, where a single bit is accessed directly.
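
For example, a minimal bit-field sketch (the register name and layout are made up, and the bit ordering a compiler picks is implementation-defined, so real register code has to be checked against the compiler/ABI documentation):
Code:
#include <stdio.h>

/* Hypothetical 8-bit control register modelled with bit-fields. */
struct ctrl_reg {
    unsigned int enable    : 1;   /* one bit, accessed directly by name */
    unsigned int interrupt : 1;
    unsigned int mode      : 2;
    unsigned int reserved  : 4;
};

int main(void)
{
    struct ctrl_reg reg = {0};

    reg.enable = 1;               /* set a single bit      */
    reg.mode   = 3;               /* set a two-bit field   */

    printf("enable=%u mode=%u\n", (unsigned)reg.enable, (unsigned)reg.mode);
    return 0;
}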

IMO the rest are minor C programming impurities that most embedded hardware programmers see as lesser evils compared to firmware that doesn't work.
 
Last edited:

ApacheKid

Joined Jan 12, 2015
1,533
I've never needed a bit type to write kernel code that directly manipulates bits. IMO a bit type is a useless abstraction: when you need to manipulate physical bits through a memory-mapped register interface, you can already declare bit-fields in structures, where a single bit is accessed directly.

IMO the rest are minor C programming impurities that most embedded hardware programmers see as lesser evils compared to firmware that doesn't work.
Well, I find your position interesting. As an engineer (and many of your posts indicate clear expertise) you must value formality, tolerances, unambiguous specifications, good design, and so on. Yet C as a language is the antithesis of this; it seems to fly in the face of these principles.

Many of the things I called out are, in terms of software engineering, "bad design": present in the language for some kind of convenience (at the time C was initially designed) rather than for sound engineering qualities.

Having a native string data type would likely eliminate a huge percentage of memory corruption problems, for example. There is no engineering advantage to treating strings as null-terminated arrays; none. No software engineer I know would purposefully design something that way.
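
A small hypothetical sketch of why null-terminated strings bite people (the buffer and input are made up):
Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char name[8];
    const char *input = "a string much longer than eight bytes";

    /* strcpy(name, input); would trust the terminator and write past
     * the end of name[]: undefined behaviour and classic memory
     * corruption. A bounded copy avoids the overflow, but the
     * terminator still has to be added by hand. */
    strncpy(name, input, sizeof name - 1);
    name[sizeof name - 1] = '\0';

    printf("%s\n", name);
    return 0;
}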

The grammar is also not extensible: we can't add new keywords to C to help it evolve, so even if it were decided to give it a string data type, you couldn't. Extensibility is not a feature, and this restriction permeates all languages that derive from C's grammar too; adding a new keyword is a huge problem because it almost always breaks backward compatibility.

Now what you say about bits is fine, all well and good, but it is the disregard for symmetry and uniformity that I'm referring to. A "bit" is no less a legitimate abstraction than a byte or an integer or a double; the syntax for using it should be obvious, not cryptic.

Some languages make it easy to declare and work with bits; they are simply used as an array, which can be either aligned or unaligned in some way. In C we have to use a cryptic special syntax with a colon, a syntax that applies only to bit-fields!

I don't really know if there has ever been a truly well-designed imperative language specifically for writing embedded code. Yes, C and some other languages can do a decent job, but were they ever designed specifically for working on embedded devices with peripherals and so on?

Perhaps there's scope for a new imperative language, well designed from the ground up, specifically for such applications.
 

WBahn

Joined Mar 31, 2012
29,976
Complaints about C (and whatever other language you pick) are all well and good, but programming languages live in the real world in which inertia and installed codebases play a huge role. There have literally been hundreds of programming languages developed, but very few of them have ever gained a following because real people are not going to switch from what they know and are comfortable with except for a damn good reason. For better or worse, just being able to tout the beauty of some new language's syntax isn't going to be seen as a good enough reason to get enough people to adopt it to give it any hope of becoming widely accepted.

Look at the slow adoption of object-oriented programming prior to the introduction of C++. People just didn't see the benefit being enough to justify learning a completely different language. Whether they were correct in this assessment or not is beside the point, it's the assessment that was widely made. What C++ did was to recognize this and leverage it by building an OOP language as a superset of C. Now people could ease into OOP for small pieces of their projects while using the language they were already proficient in for the bulk of the work, so they saw this as a boost with a limited learning-curve penalty. The fact that OOP principles and C are almost diametrically opposed and encouraged people to write programs that were cross-bred monstrosities was a pretty foreseeable result to anyone that was willing to take off their rose-colored glasses long enough. But it DID get people exploring OOP in sufficient numbers so that later languages such as Java and C# could move the ball further into the OOP world because now you had enough people using C++ for OOP programming that moving to a new language that was more OOP-centric wasn't seen as a huge hurdle.

On a different note, a number of years ago some researchers in the field of provably-correct programs examined the C language specification and determined that it was actually impossible to write a compiler that was truly strictly compliant with the spec because of ambiguities or contradictions in the spec (I didn't read the paper, so I don't know of any specific examples). Of course the anti-C crowd jumped on this, but further work seems to have led to the general conclusion that it is likely that no programming language spec meets that bar and at least one group of researchers was exploring the question of whether it is even theoretically possible to construct a language spec that is. I haven't heard anything about that since then (haven't gone looking for it, either).
 

ApacheKid

Joined Jan 12, 2015
1,533
Complaints about C (and whatever other language you pick) are all well and good, but programming languages live in the real world in which inertia and installed codebases play a huge role. There have literally been hundreds of programming languages developed, but very few of them have ever gained a following because real people are not going to switch from what they know and are comfortable with except for a damn good reason. For better or worse, just being able to tout the beauty of some new language's syntax isn't going to be seen as a good enough reason to get enough people to adopt it to give it any hope of becoming widely accepted.

Look at the slow adoption of object-oriented programming prior to the introduction of C++. People just didn't see the benefit being enough to justify learning a completely different language. Whether they were correct in this assessment or not is beside the point, it's the assessment that was widely made. What C++ did was to recognize this and leverage it by building an OOP language as a superset of C. Now people could ease into OOP for small pieces of their projects while using the language they were already proficient in for the bulk of the work, so they saw this as a boost with a limited learning-curve penalty. The fact that OOP principles and C are almost diametrically opposed and encouraged people to write programs that were cross-bred monstrosities was a pretty foreseeable result to anyone that was willing to take off their rose-colored glasses long enough. But it DID get people exploring OOP in sufficient numbers so that later languages such as Java and C# could move the ball further into the OOP world because now you had enough people using C++ for OOP programming that moving to a new language that was more OOP-centric wasn't seen as a huge hurdle.

On a different note, a number of years ago some researchers in the field of provably-correct programs examined the C language specification and determined that it was actually impossible to write a compiler that was truly strictly compliant with the spec because of ambiguities or contradictions in the spec (I didn't read the paper, so I don't know of any specific examples). Of course the anti-C crowd jumped on this, but further work seems to have led to the general conclusion that it is likely that no programming language spec meets that bar and at least one group of researchers was exploring the question of whether it is even theoretically possible to construct a language spec that is. I haven't heard anything about that since then (haven't gone looking for it, either).
I don't disagree with much of what you say, but let me paraphrase you a little:

"Complaints about COBOL (and whatever other language you pick) are all well and good, but programming languages live in the real world in which inertia and installed codebases play a huge role."

That was as true for COBOL in 1971 as it is today for C. In other words C wouldn't even exist if K&R had placed much credence on inertia and installed code base...
 

nsaspook

Joined Aug 27, 2009
13,079
Well, I find your position interesting. As an engineer (and many of your posts indicate clear expertise) you must value formality, tolerances, unambiguous specifications, good design, and so on. Yet C as a language is the antithesis of this; it seems to fly in the face of these principles.

Many of the things I called out are, in terms of software engineering, "bad design": present in the language for some kind of convenience (at the time C was initially designed) rather than for sound engineering qualities.
...
I've read and digested a lot of the early textbooks on software engineering and good design. Nothing has really changed since the '70s; it mainly cycles on what's good, bad, or ugly depending on the language flavor of the day. I consider C to be an ugly language, not a bad language.

"I admire its purity. A survivor, unclouded by conscience, remorse, or delusions of morality"

Personally, I find the faults of C trivial to avoid with just a bit of formal instruction, experience, and restraint. All of that nice 'good design' gets eaten by the compiler when it generates unstructured, highly optimized spaghetti machine code for actual execution. I do understand the need for restraints in the learning phase, but eventually with embedded programming you will and must be able to walk the tight-rope of machine-code-level debugging (on hardware usually designed to be efficient with the C language) while analyzing source, where the ability to use, at least temporarily, the bad design features of C can crack a problem. Even Rust has 'unsafe' because the developers recognized that sometimes you need a hand-grenade.
 
Last edited:

WBahn

Joined Mar 31, 2012
29,976
I don't disagree with much of what you say, but let me paraphrase you a little:

"Complaints about COBOL (and whatever other language you pick) are all well and good, but programming languages live in the real world in which inertia and installed codebases play a huge role."

That was as true for COBOL in 1971 as it is today for C. In other words C wouldn't even exist if K&R had placed much credence on inertia and installed code base...
First off, I said it plays a huge role, so please don't craft counterarguments that rely on me having said that it was the only role.

Well, there's STILL a LOT of COBOL out there -- not only in legacy code (something like 90% of Fortune 500 companies have significant portions of their codebase in COBOL), but I've repeatedly heard claims (at least in the not-too-distant past) that more new COBOL code is written than any other language. I don't know the extent to which that is true. It's hard to gauge from the rankings that are put out by a number of sites because those are usually based on languages used in projects visible in various types of repositories, and large companies don't generally put their big programs there.

But comparing C to COBOL, particularly in the context of this discussion, is more than a bit of a red herring. Had K&R developed C and then just tried to convince people to use it as a general purpose language, it almost certainly would have suffered the same fate that the overwhelming fraction of other languages did (and do). But that's not what they attempted to do.

They set out to develop a new operating system that supported time-sharing and that ran on small, lower cost platforms. That was something that really didn't exist at the time. The handful of time-sharing operating systems were generally proprietary systems running on a particular vendor's mainframe computers. After backing out of the Multics effort because it was getting too unwieldy and seemingly not going anywhere, they developed Unix (written in assembly, like every other OS at the time). It wasn't until version four that they had an OS written in C, a language that they developed expressly for that purpose. The conventional wisdom of the time was that operating systems had to be written in assembly to get the compactness and performance that was needed to be useful on the machines of that time. Writing their OS in COBOL or any other high-level language simply wasn't a viable option -- and if it had been, they probably would have succumbed to the inertia and used it (or whatever other high level language they happened to know). To make their new language a viable option, they had to design it with that compactness and performance in mind, which is a big part of why the C language spec was written the way it was. But now you had an OS in which system programming could be done in a high-level language, namely the C language, that even shipped as part of the operating system. No amount of inertia was going to keep a LOT of people from learning to program in C in that environment, including a lot of people that really only knew how to program in assembly before that, as well as people that never intended to write programs at all.

So what language would you think those people chose when it came time to write more general-purpose programs? Simple, the only high-level language they knew and that also conveniently happened to be packaged free along with their operating system.

Then, what language is going to be the go-to language when it comes time for a bunch of computer scientists that now use C as their daily programming language to sit down and decide what language should be at the core of their undergraduate computer science courses? And what language is going to be considered very seriously when the engineering departments just down the hall have to pick which language to teach their students?

The answer, somewhat regrettably, is a language that was never really intended to be a general purpose programming language, but rather a stripped down language for doing systems programming where performance and compactness mattered more than just about anything else. Though, to give the devil his due, C also spread like wildfire because it WAS a language that let you write small, fast programs on resource-starved PC-scale computers.

A similar thing happened with BASIC on a slew of machines that hit the consumer market back in the late-70s and early-80s simply because BASIC was included with the computer (on many, the BASIC interpreter essentially was the OS). For years and years, GW-BASIC and then QBASIC shipped as part of the DOS and Windows operating systems. That resulted in a LOT of inertia in a lot of different fields, as people who only knew BASIC (because it was the first language they learned, since it was already sitting there on their machines) insisted on not switching unless and until they had to. I saw this first hand when I was a co-op student working at NBS/NIST. The undergrad students wanted to write the programs used for analyzing scientific data in a small variety of languages, including FORTRAN and C, but the scientists absolutely insisted that it be written in BASIC. Their motivation wasn't too unreasonable, either. They only knew BASIC (of course, that's a too-sweeping statement, but it was true often enough to drive the debate) and student programmers come and go. At the end of the day, the scientists needed to be able to understand the programs (and guide the modifications of them by future students) that were often major components in the work that defined their entire career. Sure, it would have been best for them to learn a more suitable language, but programming was neither their strength nor their focus and doing so would detract from their primary work.
 

WBahn

Joined Mar 31, 2012
29,976
Personally, I find the faults of C trivial to avoid with just a bit of formal instruction, experience, and restraint. All of that nice 'good design' gets eaten by the compiler when it generates unstructured, highly optimized spaghetti machine code for actual execution. I do understand the need for restraints in the learning phase, but eventually with embedded programming you will and must be able to walk the tight-rope of machine-code-level debugging (on hardware usually designed to be efficient with the C language) while analyzing source, where the ability to use, at least temporarily, the bad design features of C can crack a problem. Even Rust has 'unsafe' because the developers recognized that sometimes you need a hand-grenade.
This was my attitude for a long time -- and in some respects still is. But unfortunately it's a pretty utopian view. It's almost akin to saying that (pick a random form of government) would be perfect if only the leaders were wise, honest, fair, and incorruptible. It's something that is only achievable in our dreams and in works of fiction.

The depth of knowledge and level of experience that a C programmer needs in order to be able to avoid all of the pitfalls is not something that is going to be coverable in a reasonable undergraduate curriculum (which often only involves four or five courses in computer programming to begin with). Add to that the fact that the overwhelming majority of textbook authors are either blissfully unaware of these pitfalls or consciously ignore them in the name of just "covering the essentials". Add to that the fact that a huge fraction of people teaching the courses are completely unaware of many of the pitfalls, or how to avoid them, in part because they have never written code that needed to work in a hostile environment (such as the commercial marketplace).

Then, even if you could wave a magic wand and get the educational component so that it completely addresses this stuff, the moment they leave campus they are going to write poor code because they will either take shortcuts out of laziness or they will be under pressure to get it done quickly.

Then, even if we were somehow able to fully indoctrinate students into a mindset that resulted in them habitually following good practices (and able to remember what all of those good practices were even after several years of not needing to use some of them), you still have all of the code being written by all of the self-taught coders out there.

All-in-all (and as much as I see and lament the unintended consequences), it's hard to discount the need for programming languages that do everything they can to protect programmers from themselves and to enforce as much safety under the hood as possible. Sadly, the unintended consequence is that we end up with a bunch of programmers who have little more than a superficial notion of what they are doing and rely on the tool to do their thinking for them -- and this is something we see in LOTS of other places in which the tools available today permit people with little knowledge, skill, or competency to successfully perform tasks that required a significant amount of all three not too long ago.
 