The Subtle Differences Between C and C++

Even though we usually think of C as (almost) a subset of C++, there are some subtle differences we really have to pay attention to. And yep, that’s right, I learned C before C++~

C++ supports function overloading, which means function names at the assembly level get mangled into something almost unreadable for us humans! This is also why you need to wrap declarations in extern "C" when calling C functions from C++, or the linker won’t be able to find them. Let’s take GCC, which I use a lot, as an example. For a function like int compute(int a, int b):

int compute(int a, int b)
{
    return a + b;
}

The assembly in C++ (x86-64 GCC targeting Windows here, hence the .seh_* directives and the arguments arriving in %ecx and %edx) will be:

_Z7computeii:
.LFB0:
    pushq    %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
    .seh_setframe    %rbp, 0
    .seh_endprologue
    movl    %ecx, 16(%rbp)
    movl    %edx, 24(%rbp)
    movl    16(%rbp), %edx
    movl    24(%rbp), %eax
    addl    %edx, %eax
    popq    %rbp
    ret

And in C:

compute:
    pushq    %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
    .seh_setframe    %rbp, 0
    .seh_endprologue
    movl    %ecx, 16(%rbp)
    movl    %edx, 24(%rbp)
    movl    16(%rbp), %edx
    movl    24(%rbp), %eax
    addl    %edx, %eax
    popq    %rbp
    ret

The same int compute(int a, int b) becomes _Z7computeii in C++, but in C it’s just compute. This isn’t mandated by the language standard; it’s how compilers implement namespaces and overloading by implicitly renaming symbols (a process called name mangling!). It pushes C++ a bit further away from being a “high-level assembly,” and mixing in inline asm isn’t as free and easy as it is in C.
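As a quick illustration of the extern "C" point, here’s a minimal sketch of the usual shared-header convention (the header name is made up; the #ifdef __cplusplus guard is the standard trick so the same header works from both languages):

/* compute.h -- hypothetical header shared between C and C++ translation units */
#ifdef __cplusplus
extern "C" {            /* tell the C++ compiler: do not mangle these names */
#endif

int compute(int a, int b);   /* implemented in a .c file, so the symbol is plain "compute" */

#ifdef __cplusplus
}
#endif

Without the extern "C" block, the C++ side would look for _Z7computeii at link time and fail with an “undefined reference” error, even though compute is sitting right there in the object file.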

Another issue that comes with a fancy feature, virtual functions, is the virtual table (vtable). For example, with this code:

struct test1
{
    virtual int compute(void)
    {
        return 1;
    }
};
 
struct expanded_test1 : public test1
{
    int compute(void) override
    {
        return 2;
    }
};
 
int main(void)
{
    expanded_test1 object;
    test1 &reference = object;
    reference.compute();
}

The assembly will contain this:

main:
.LFB2:
    subq    $56, %rsp
    .seh_stackalloc    56
    .seh_endprologue
    call    __main
    leaq    16+_ZTV14expanded_test1(%rip), %rax
    movq    %rax, 40(%rsp)
    movl    $0, %eax
    addq    $56, %rsp
    ret
    .seh_endproc
    .globl    _ZTS5test1
    .section    .rdata$_ZTS5test1,"dr"
    .linkonce same_size
_ZTS5test1:
    .ascii "5test1\0"
    .globl    _ZTI5test1
    .section    .rdata$_ZTI5test1,"dr"
    .linkonce same_size
    .align 8
_ZTI5test1:
    .quad    _ZTVN10__cxxabiv117__class_type_infoE+16
    .quad    _ZTS5test1
    .globl    _ZTS14expanded_test1
    .section    .rdata$_ZTS14expanded_test1,"dr"
    .linkonce same_size
_ZTS14expanded_test1:
    .ascii "14expanded_test1\0"
    .globl    _ZTI14expanded_test1
    .section    .rdata$_ZTI14expanded_test1,"dr"
    .linkonce same_size
    .align 8
_ZTI14expanded_test1:
    .quad    _ZTVN10__cxxabiv120__si_class_type_infoE+16
    .quad    _ZTS14expanded_test1
    .quad    _ZTI5test1
    .globl    _ZTV14expanded_test1
    .section    .rdata$_ZTV14expanded_test1,"dr"
    .linkonce same_size
    .align 8
_ZTV14expanded_test1:
    .quad    0
    .quad    _ZTI14expanded_test1
    .quad    _ZN14expanded_test17computeEv

This _ZTV14expanded_test1 is the vtable, and every object of the class carries a hidden pointer to it (the vptr), which the constructor sets up; that’s exactly what the leaq/movq pair in main is doing. It means that a C-style memset(this, 0, sizeof(T)) will wipe out the vtable pointer, causing some very sneaky crashes at the next virtual call. That’s why the safe way to initialize in C++ is to use member initializer lists or assign values inside the constructor. Another fun little tidbit: when RTTI (Run-Time Type Information) is enabled (which it is by default), the compiler also emits the _ZTS.../_ZTI... records above, and that _ZTVN10__cxxabiv120__si_class_type_infoE+16 points into the C++ runtime’s vtable for __si_class_type_info, the type_info class used for a class with a single base.
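Here’s a minimal sketch of the trap (the type and member names are made up for the example):

#include <cstring>   // for std::memset (shown commented out below)

struct widget
{
    int x;
    int y;

    virtual int compute(void) { return x + y; }

    widget()
    {
        // Looks like harmless C-style zeroing, but it would also overwrite the
        // hidden vtable pointer the compiler installed just before this body ran:
        //     std::memset(this, 0, sizeof(*this));   // the next virtual call may crash

        // Safe: touch only the members you declared.
        x = 0;
        y = 0;
    }
};

// Or, even better, skip the constructor body entirely with default member initializers:
struct widget2
{
    int x = 0;
    int y = 0;
    virtual int compute(void) { return x + y; }
};

int main(void)
{
    widget a;
    widget2 b;
    return a.compute() + b.compute();
}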

Then there are some tiny details, like NULL. In C (the definition lives in stddef.h and is also pulled in by stdlib.h and friends), it’s typically:

#define NULL ((void *)0)

But in C++ a void * doesn’t implicitly convert to other pointer types, so that definition is illegal there; it has to be an integral zero instead (GCC actually uses a built-in __null), conceptually:

#define NULL 0

Soooo, if you pass NULL through template argument deduction, it will be deduced as an integer type (int or long, depending on the implementation), not a pointer type! That’s why in C++11 and later you should always, always use the built-in nullptr keyword, whose type std::nullptr_t still converts to any pointer type.
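A small sketch that makes the deduction visible (probe is just a helper I made up; this assumes a typical implementation where NULL is an integer constant, as discussed above):

#include <cstddef>       // NULL, std::nullptr_t
#include <iostream>
#include <type_traits>

template <typename T>
void probe(T)
{
    std::cout << std::boolalpha
              << "integral: "    << std::is_integral<T>::value
              << ", nullptr_t: " << std::is_same<T, std::nullptr_t>::value
              << '\n';
}

int main(void)
{
    probe(NULL);     // T deduces to an integer type -> "integral: true, nullptr_t: false"
    probe(nullptr);  // T deduces to std::nullptr_t  -> "integral: false, nullptr_t: true"
    return 0;
}

So a template that expects “a pointer or something pointer-like” quietly receives an integer instead, which is exactly the mismatch nullptr was added to eliminate.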

Another big topic worth diving into is the difference in memory allocation between C and C++. You can’t just think of it as a simple switch from malloc/free to new/delete. The biggest deal with C++'s new and delete is that they automatically call constructors and destructors, and that can hide some surprisingly high costs. Imagine you have a relatively complex type T. If you write new T[100];, the default constructor runs 100 times right at the moment of allocation, not when you actually start using the objects. It’s a “pay upfront” model. In a structure like a hash table, which may run at a low load factor with many never-used slots, that work is pure waste. A more modern approach is to decouple the allocation strategy with std::pmr (Polymorphic Memory Resources), or to grab raw storage from an std::allocator and call std::construct_at only when you’re actually ready to use an object, then destroy it manually when you’re done. Placement new achieves a similar effect, but the former options are the more modern ones. And since both malloc/free and new/delete draw from the global heap by default, they need synchronization in a multithreaded program, which can serialize memory operations under contention. That’s why decoupling the memory allocation strategy has always been a critical need in high-performance computing.
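Here’s a minimal sketch of the “construct only when you need it” idea (Slot is a made-up type; std::construct_at needs C++20 and std::destroy_at needs C++17):

#include <memory>   // std::allocator, std::construct_at, std::destroy_at

struct Slot
{
    int value;
    Slot() : value(-1) {}                 // imagine something much more expensive here
    explicit Slot(int v) : value(v) {}
};

int main(void)
{
    std::allocator<Slot> alloc;

    // "Pay upfront": new Slot[100] would run the default constructor 100 times right now.
    // Instead, grab raw, uninitialized storage for 100 Slots...
    Slot *storage = alloc.allocate(100);

    // ...and only construct the ones we actually use, at the moment we use them.
    Slot *s = std::construct_at(&storage[42], 1234);

    // Manually run the destructor for every slot we constructed...
    std::destroy_at(s);

    // ...then hand the raw storage back.
    alloc.deallocate(storage, 100);
    return 0;
}

A std::pmr memory resource takes the same idea one step further by also letting you choose where the raw bytes come from, per container rather than per program.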

In conclusion, when we look at low-level development involving C today, it’s really a combination of “C and assembly”: C, as a high-level language, saves a lot of effort while keeping efficiency, and its affinity with assembly is much better, which suits fine-tuned scenarios like the Linux kernel. By the same token, if a project already uses assembly, introducing C++ effectively creates heterogeneous code across multiple “languages”. Plus, implementing a good C++ compiler is infinitely more torturous than implementing a C one; after all, C++'s template system has been Turing-complete since C++98. C++'s features, such as object orientation, lambdas, and templates, instead make it more suitable for high-performance software development, and the goal of “zero-cost abstractions” is exactly the result of wanting both high-level architecture and low-level performance.