New C features in GCC 13 | Red Hat Developer

steloflute 2023. 5. 6. 23:42

The GNU Compiler Collection 13 release implemented a number of interesting features in its C front-end. This article summarizes the most interesting ones.

developers.redhat.com

The latest major version of the GNU Compiler Collection (GCC), 13.1, was released in April 2023. Like every major GCC release, this version brings many additions, improvements, bug fixes, and new features. GCC 13 is already the system compiler in Fedora 38. Red Hat Enterprise Linux (RHEL) users will get GCC 13 in the Red Hat GCC Toolset (RHEL 8 and RHEL 9). It's also possible to try GCC 13 on godbolt.org and similar pages.

This article describes new features implemented in the C front end; it does not discuss developments in the C language itself. It also doesn’t cover recent changes in the C library itself. If you’re interested in the C++ language and what's supported in recent GCC releases, check out New C++ features in GCC 10 and New C++ features in GCC 12.

The default C dialect in GCC 13 is -std=gnu17. You can use the -std=c2x or -std=gnu2x command-line options to enable C2X features. We use C2X to refer to the next major C standard version; it is expected to become C23.

C2X features

GCC 13 has implemented a host of C2X proposals. This section describes the most interesting ones.

nullptr

The nullptr constant first appeared in C++11, described in proposal N2431 from 2007. Its purpose was to alleviate the problems with the definition of NULL, which can be defined in a variety of ways: (void *)0 (a pointer constant), 0 (an integer), and so on. This posed problems for overload resolution, generic programming, etc. While C doesn’t have function overloading, the protean definition of NULL still causes headaches. Consider the interaction of _Generic with NULL: it’s not clear which function will be called because it depends on the definition of NULL:

  _Generic (NULL,
            void *: handle_ptr (),
            int: crash (),
            default: nop ());

Unfortunately, there are less contrived problems in practice. For instance, issues occur with conditional operators or when passing NULL to a variadic function (taking ...): in such a case, applying va_arg to the null argument may crash the program if an unexpected definition of NULL is encountered. GCC 13 implements N3042, which brings nullptr to C. Its type is nullptr_t, which is defined in <stddef.h>. In C2X, the following assert therefore passes:

static_assert (_Generic (nullptr,
   nullptr_t: 1,
   void *: 2,
   default: 0) == 1,
   "nullptr_t was selected");

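The variadic hazard mentioned above can be sketched as follows. This is an illustration rather than code from the article: count_strings and END_OF_LIST are hypothetical names, and the branch for older language modes is an assumed fallback so that the sketch builds with or without -std=c2x.

```c
#include <stdarg.h>
#include <stddef.h>

/* In C2X mode the sentinel is nullptr, which va_arg may read back as a
   character pointer.  The fallback for older modes is an assumption of
   this sketch: an explicit cast, because a bare NULL may expand to the
   integer 0 and then has the wrong type for va_arg.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define END_OF_LIST nullptr
#else
# define END_OF_LIST ((char *) 0)
#endif

/* Hypothetical helper: counts the strings passed before the sentinel.  */
size_t
count_strings (const char *first, ...)
{
  size_t n = 0;
  va_list ap;

  va_start (ap, first);
  for (const char *s = first; s != END_OF_LIST; s = va_arg (ap, char *))
    n++;
  va_end (ap);
  return n;
}
```

A call such as count_strings ("a", "b", "c", END_OF_LIST) then always passes a pointer-sized sentinel, regardless of how NULL happens to be defined.
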
Enhanced enumerations

Enhanced enumerations is another feature that first appeared in C++11 via N2347. In C, the underlying type of an enum was not specified in the standard. In practice, the type would be determined based on the values of the enumerators. Typically, the type would be unsigned int, or, if any of the values is negative, int. In any case, the selected type must be capable of holding all of the values of the enum. Given this lacuna in the specification, enums have portability issues. To close this gap, C2X adopted N2963, which borrows the C++ syntax:

enum E : long long { R, G, B } e;
static_assert (_Generic (e, long long: 1, default: 0) == 1, "E type");

It seems worth mentioning, however, that specifying the wrong underlying type may lead to subtle problems. Consider the following:

enum F : int { A = 0x8000 } f;

On most platforms, this code will work as expected. The precision of int isn’t guaranteed to be at least 32 bits, however; it can validly be 16 bits, in which case the previous example will not compile. Thus a better variant would be to use one of the types defined in <stdint.h>, for example:

enum F : int_least32_t { A = 0x8000 } f;
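
A fixed underlying type is also handy when an enumeration ends up inside a structure whose layout matters. The sketch below is only an illustration: opcode_t, struct packet, and packet_size are hypothetical names, and the second branch is an assumed workaround for language modes that lack the new syntax.

```c
#include <stddef.h>
#include <stdint.h>

#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
/* C2X: the enumeration itself is stored as a uint8_t.  */
enum opcode : uint8_t { OP_NOP, OP_LOAD, OP_STORE };
typedef enum opcode opcode_t;
#else
/* Assumed pre-C2X workaround: an anonymous enum supplies the constants
   and a separate integer typedef supplies the storage type.  */
enum { OP_NOP, OP_LOAD, OP_STORE };
typedef uint8_t opcode_t;
#endif

/* Hypothetical record; both members are one byte wide.  */
struct packet
{
  opcode_t op;
  uint8_t length;
};

size_t
packet_size (void)
{
  return sizeof (struct packet);
}
```

On typical ABIs packet_size () returns 2 in both branches, whereas a plain enum opcode member would usually occupy as much space as an int.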

(...) function prototypes

C, prior to C2X, required that a variable-argument function have a named parameter before the ellipsis (...). This requirement was historical baggage and is no longer necessary, so N2975 did away with it. (C++ has always allowed foo(...).)

void f(int, ...); // OK
void g(...); // OK in C2X

Note, however, that a function declared as g(...) is not an unprototyped function, so it is possible to use the va_start and va_arg mechanism to access its arguments. An unprototyped function has the form void u();.

Such functions were removed in C2X (see below).
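
To make the va_start remark concrete, here is a minimal sketch of a function with no named parameter. The name sum_until_zero and the convention of ending the list with 0 are assumptions made for the example, and the second definition is an assumed fallback for language modes that still require a named parameter.

```c
#include <stdarg.h>

#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
/* C2X: no named parameter, and va_start takes only the va_list.  */
int
sum_until_zero (...)
{
  int total = 0;
  va_list ap;

  va_start (ap);
  for (int v = va_arg (ap, int); v != 0; v = va_arg (ap, int))
    total += v;
  va_end (ap);
  return total;
}
#else
/* Assumed fallback for older modes: a named first parameter is needed.  */
int
sum_until_zero (int first, ...)
{
  int total = 0;
  va_list ap;

  va_start (ap, first);
  for (int v = first; v != 0; v = va_arg (ap, int))
    total += v;
  va_end (ap);
  return total;
}
#endif
```

Both variants are called the same way, for example sum_until_zero (1, 2, 3, 0); the caller must always supply the terminating 0.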

Type inference with auto

Type deduction is another feature that first appeared in C++11 via N1984. It is a convenient feature that allows the programmer to use the placeholder auto as the type in a declaration. The compiler will then deduce the variable’s type from the initializer:

auto i = 42;

It is, however, more than just a convenience feature to save typing a few more characters. Consider:

auto x = foo (y);

Here, the type of foo (y) may depend on y (foo could be a macro using _Generic), so changing y implies changing the type of x. Using auto in the example above means that the programmer doesn’t have to change the rest of the codebase when the type of y is updated.

GCC has offered __auto_type since GCC 4.9; its semantics are fairly close to C2X auto, though not exactly the same, and it appears to have been used mostly in standard headers. Unlike in C++, auto must be used plain: it cannot be combined with * or [] and similar. Moreover, auto doesn’t support braces around the initializer. The auto feature is only enabled in C2X mode. In older modes, auto is a redundant storage class specifier which can only be used at block scope.
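
The foo (y) scenario can be sketched with a small _Generic macro. All names here (AUTO, twice, deduced_kind, and the helper functions) are hypothetical, and falling back to __auto_type outside C2X mode is an assumption of the sketch, not something the article prescribes.

```c
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define AUTO auto
#else
# define AUTO __auto_type   /* GNU extension, assumed to be available */
#endif

static int twice_int (int v) { return 2 * v; }
static double twice_double (double v) { return 2.0 * v; }

/* Hypothetical type-generic macro: the result type follows the argument.  */
#define twice(v) _Generic ((v), int: twice_int, double: twice_double) (v)

/* Reports the deduced type of x: 'i' for int, 'd' for double.  */
#define deduced_kind(x) _Generic ((x), int: 'i', double: 'd', default: '?')

char
kind_for_int (void)
{
  int y = 21;
  AUTO x = twice (y);   /* x is deduced as int */
  return deduced_kind (x);
}

char
kind_for_double (void)
{
  double y = 1.5;
  AUTO x = twice (y);   /* same declaration, but x is now double */
  return deduced_kind (x);
}
```

The declaration of x is identical in both functions; only the type of y differs, and the deduced type of x follows it.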

The constexpr specifier

Yet another feature that first appeared in C++ is the constexpr specifier (see for instance N2235, though C++ constexpr has been greatly expanded since). C constexpr was introduced in N3018, with much more limited functionality. Declaring a variable as constexpr guarantees that the variable can be used in various constant-expression contexts. C requires that objects with static storage duration are initialized with constant expressions. It follows that constexpr variables can be used to initialize objects with static storage duration. Another great advantage of constexpr is that various semantic constraints are checked at compile time. Let’s demonstrate both points with an example (note that you must specify -std=c2x or -std=gnu2x to be able to use constexpr):

constexpr int i = 12;
static_assert (i == 12);

struct X {
  int bf : i;
};

struct S {
  long l;
};

constexpr struct S s = { 1L };
static_assert (s.l == 1L);
constexpr unsigned char q = 0xff + i; // error: initializer not representable in type of object
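
The point about static storage duration can be sketched like this. The names max_users, slots, and slot_bytes are made up for the example, and the enum branch is an assumed stand-in so the file also builds in older language modes.

```c
#include <stddef.h>

#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
constexpr int max_users = 8;
#else
enum { max_users = 8 };   /* assumed pre-C2X stand-in for a named constant */
#endif

/* Array bound at file scope: must be an integer constant expression.  */
int slots[max_users];

/* Initializer of an object with static storage duration: must be a
   constant expression as well.  */
const int slot_bytes = max_users * (int) sizeof (int);

size_t
slot_count (void)
{
  return sizeof slots / sizeof slots[0];
}

int
total_slot_bytes (void)
{
  return slot_bytes;
}
```

A plain const int max_users = 8; would not be accepted as the bound of slots in standard C, which is the gap that constexpr fills.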

Storage-class specifiers in compound literals

A compound literal is a way to create unnamed objects that typically have automatic storage duration. Because they are lvalues, it is permitted to take their address:

int *p = (int []){2, 4}; // p points to the first element of an array of two ints
const int *q = &(const int){1};

Paper N3038 allows using certain storage-class specifiers (things like constexpr, static, thread_local) in compound literals in C2X mode. This is useful to change the lifetime of the compound literal, or to make it a compound literal constant with the constexpr keyword:

struct S { int i; };
void
f (void)
{
  static struct S s = (constexpr struct S){ 42 };
}

int *
g (void)
{
  return &(static int){ 42 };
}

Note that even though typedef, extern, and auto are storage-class specifiers, they are not allowed in compound literals.
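
As a rough sketch of what the changed lifetime buys, the following function hands out a pointer that stays valid after it returns. The names call_counter and bump are hypothetical, and the named static local in the second branch is an assumed equivalent for older language modes.

```c
/* Hypothetical helper: every call returns a pointer to the same object.  */
int *
call_counter (void)
{
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
  /* C2X: the literal has static storage duration, so the pointer
     remains valid after the function returns.  */
  return &(static int){ 0 };
#else
  /* Assumed equivalent for older modes: a named static local.  */
  static int count = 0;
  return &count;
#endif
}

/* Increments the shared counter and returns its new value.  */
int
bump (void)
{
  return ++*call_counter ();
}
```

Without static, the literal in call_counter would have automatic storage duration and the returned pointer would dangle.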

C2X typeof

C2X standardized typeof, a feature that has been supported as a GNU extension for many years and that allows the programmer to get the type of an expression. Along with typeof, C2X also adds typeof_unqual, which additionally removes all qualifiers and _Atomic from the resulting type:

int i;
volatile int vi;
extern typeof (vi) vi; // OK, no conflict
extern typeof_unqual (vi) i; // OK, no conflict

A minor difference between the GNU version and the standard version is the treatment of the noreturn property of a function: the GNU variant of typeof takes noreturn as part of the type of a pointer to function, but the standard version does not.

Note that C++11 standardized a similar feature under the name decltype, so sadly we wound up with two names for a nearly identical feature.
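
A common use of typeof is in type-generic macros. The sketch below is an illustration under assumptions: TYPEOF, SWAP, and swapped_difference are hypothetical names, and the __typeof__ spelling is assumed to be available as a fallback outside C2X mode.

```c
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define TYPEOF(x) typeof (x)
#else
# define TYPEOF(x) __typeof__ (x)   /* GNU spelling, assumed available */
#endif

/* Hypothetical type-generic swap; a and b must be lvalues of the same
   type.  */
#define SWAP(a, b)            \
  do                          \
    {                         \
      TYPEOF (a) tmp_ = (a);  \
      (a) = (b);              \
      (b) = tmp_;             \
    }                         \
  while (0)

double
swapped_difference (double x, double y)
{
  SWAP (x, y);
  return x - y;   /* after the swap this is the original y - x */
}
```

The temporary inside SWAP picks up whatever type its first argument has, so the same macro works for integers, pointers, and structures.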

New keywords

This change harmonizes C and C++ further by making alignas, alignof, bool, false, static_assert, thread_local, and true ordinary keywords in C2X mode. Therefore this translation unit compiles in C2X mode without including any headers:

static_assert (true, "");

This change can break existing code, for example

int alignof = 42;

will not compile in C2X mode.
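
Code that has to build in both older modes and C2X mode can keep including the headers that used to provide these names as macros. A small sketch under that assumption, with struct sample and is_power_of_two as hypothetical names:

```c
#include <assert.h>    /* static_assert is a macro here before C2X */
#include <stdalign.h>  /* likewise alignas and alignof */
#include <stdbool.h>   /* likewise bool, true, and false */

/* Hypothetical over-aligned buffer type.  */
struct sample
{
  alignas (16) unsigned char bytes[16];
};

static_assert (alignof (struct sample) == 16, "member alignment propagates");

bool
is_power_of_two (unsigned int v)
{
  return v != 0 && (v & (v - 1)) == 0;
}
```

In C2X mode the three includes can simply be dropped, because the names are then keywords.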

The noreturn attribute

This is a further compatibility tweak to bring C and C++ closer together. C11 added the _Noreturn function specifier to signal to the compiler that a function never returns to its caller, but _Noreturn works in C only, so N2764 added a standard [[noreturn]] attribute to C2X while simultaneously marking _Noreturn as obsolescent.

[[noreturn]] void exit (int);
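
For a user-defined function the attribute might be used as sketched below. NORETURN, die, and checked_div are hypothetical names, and selecting _Noreturn outside C2X mode is an assumption made so the sketch builds in older modes too.

```c
#include <stdio.h>
#include <stdlib.h>

#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define NORETURN [[noreturn]]
#else
# define NORETURN _Noreturn   /* C11 spelling, obsolescent in C2X */
#endif

/* Hypothetical error helper: prints a message and terminates.  */
NORETURN void
die (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  exit (EXIT_FAILURE);
}

int
checked_div (int num, int den)
{
  if (den == 0)
    die ("division by zero");
  /* No return statement is needed on the path above, because the
     compiler knows that die does not come back.  */
  return num / den;
}
```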

Empty initializer braces

C2X standardized empty initializer braces ({}) and GCC 13 implements this proposal. Some cases were already supported as a GNU extension (e.g., initializing an array or a structure), but it is now also possible to use {} to initialize a scalar variable or a variable-length array:

int i = {};
int arr[10] = {};
struct S { int i; };
struct S s = {};

void
g (void)
{
  int n = 10;
  int vla[n] = {};
}

unreachable macro

C2X brings the unreachable() macro, defined in <stddef.h>, which is a convenient shorthand for the GCC built-in function __builtin_unreachable():

#include <stddef.h>

int foo (int x)
{
  if (x < 0)
    unreachable ();
  return x & 1;
}
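
A typical place for the macro is after a switch that already handles every enumerator. The sketch below is an illustration: enum shape and corner_count are hypothetical, and the #ifndef fallback to the GCC built-in is an assumption for language modes in which <stddef.h> does not provide the macro.

```c
#include <stddef.h>

#ifndef unreachable
/* Assumed fallback for older modes: map the macro onto the GCC built-in.  */
# define unreachable() __builtin_unreachable ()
#endif

enum shape { CIRCLE, SQUARE, TRIANGLE };

int
corner_count (enum shape s)
{
  switch (s)
    {
    case CIRCLE:
      return 0;
    case SQUARE:
      return 4;
    case TRIANGLE:
      return 3;
    }
  /* Every enumerator is handled above, so control cannot get here for
     a valid enum shape value.  */
  unreachable ();
}
```

Reaching unreachable () at run time is undefined behavior, so it should only mark paths that really cannot execute.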

Unprototyped functions removed

Unprototyped functions in C were of the form int foo(), which declares a function foo that returns an integer and takes an unspecified number of arguments of unspecified types. This is very dangerous because the compiler can’t perform any checking when such a function is used.

In C2X, int foo() is equivalent to int foo(void), which is a function foo that returns an integer and takes no arguments.

New warnings

The C front end has gained some new warnings in GCC 13. One of them is -Wxor-used-as-pow, which was described in the C++ part of the GCC 13 blog post. There is also a new warning specific to the C front end.

-Wenum-int-mismatch

In C, an enumerated type is compatible with char, a signed integer type, or an unsigned integer type, so the following code compiles if the underlying type of enum E is int:

enum E { l = -1, z = 0, g = 1 };
int foo(void);
enum E foo(void) { return z; }

However, as I previously noted, the choice of the underlying type of the enum is implementation-defined. Since the code above is likely a mistake and constitutes a portability problem (the code will not compile if a different type than int is chosen to be the underlying type), GCC 13 implements a new warning which warns about enum/integer type mismatches. For the code above the warning looks like the following:

q.c:5:10: warning: conflicting types for ‘foo’ due to enum/integer mismatch; have ‘enum E(void)’ [-Wenum-int-mismatch]
5 |   enum E foo(void) { return z; }
  |         ^~~

q.c:4:7: note: previous declaration of ‘foo’ with type ‘int(void)’
4 |   int foo(void);
  |       ^~~
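
One way to address the warning is to make the declaration and the definition agree. A minimal sketch that reuses the names from the example above and assumes that enum E is the intended return type:

```c
enum E { l = -1, z = 0, g = 1 };

/* The declaration and the definition now use the same return type.  */
enum E foo (void);

enum E
foo (void)
{
  return z;
}
```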

Conclusion

GCC 13 implements many C2X proposals. These proposals align the C and C++ languages a little bit closer to each other by entwining certain features, and make programming in C easier and more secure.