Even though we usually think of C as a subset of C++, there are a few small differences that we really have to pay attention to. (And yep, that’s right, I learned C before C++.)
C++ supports function overloading, which means function names at the assembly level get mangled into something almost unreadable for us humans! This is also why you need to add extern "C" when calling C functions from C++, or the linker won’t be able to find them. Let’s take GCC, which I use a lot, as an example. For a function like int compute(int a, int b):
int compute(int a, int b)
{
    return a + b;
}
The assembly GCC generates in C++ (here for x86-64 Windows, without optimization) will be:
_Z7computeii:
.LFB0:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
.seh_endprologue
movl %ecx, 16(%rbp)
movl %edx, 24(%rbp)
movl 16(%rbp), %edx
movl 24(%rbp), %eax
addl %edx, %eax
popq %rbp
ret
And in C:
compute:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
.seh_endprologue
movl %ecx, 16(%rbp)
movl %edx, 24(%rbp)
movl 16(%rbp), %edx
movl 24(%rbp), %eax
addl %edx, %eax
popq %rbp
ret
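So int compute(int a, int b) becomes _Z7computeii in C++, but in C, it’s just compute. The language standard doesn’t specify this; it’s how compilers implement namespaces and overloading, by encoding the scope and parameter types into the symbol name (this process is called name mangling!). GCC follows the Itanium C++ ABI here: _Z marks a mangled name, 7compute is the length-prefixed identifier, and ii stands for the two int parameters. It kinda pushes C++ away from being a “high-level assembly,” and hand-written or inline asm that refers to symbols by name isn’t as free and easy as it is in C.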
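To make the mangling concrete, here is a small sketch, assuming GCC or Clang (both provide abi::__cxa_demangle in <cxxabi.h>). The names compute_c and demangle are made up for illustration; only compute comes from the example above.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// C linkage: the symbol is emitted unmangled, so a C object file can link to it.
// (compute_c is a hypothetical name used only in this sketch.)
extern "C" int compute_c(int a, int b) { return a + b; }

// C++ linkage: under the Itanium ABI this is emitted as _Z7computeii.
int compute(int a, int b) { return a + b; }

// Hypothetical helper: turn a mangled symbol back into a readable signature.
// Falls back to the input string if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : mangled;
    std::free(text);  // the demangler allocates the buffer with malloc
    return result;
}
```

Calling demangle("_Z7computeii") gives back the readable signature, which is roughly what tools like c++filt do for you on the command line.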
Another issue that comes with a fancy feature is the virtual table (vtable). For example, with this code:
struct test1
{
    virtual int compute(void)
    {
        return 1;
    }
};
struct expanded_test1 : public test1
{
    int compute(void) override
    {
        return 2;
    }
};
int main(void)
{
    expanded_test1 object;
    test1 &reference = object;
    reference.compute();
}
The assembly will contain this:
main:
.LFB2:
subq $56, %rsp
.seh_stackalloc 56
.seh_endprologue
call __main
leaq 16+_ZTV14expanded_test1(%rip), %rax
movq %rax, 40(%rsp)
movl $0, %eax
addq $56, %rsp
ret
.seh_endproc
.globl _ZTS5test1
.section .rdata$_ZTS5test1,"dr"
.linkonce same_size
_ZTS5test1:
.ascii "5test1\0"
.globl _ZTI5test1
.section .rdata$_ZTI5test1,"dr"
.linkonce same_size
.align 8
_ZTS14expanded_test1:
.ascii "14expanded_test1\0"
.globl _ZTI14expanded_test1
.section .rdata$_ZTI14expanded_test1,"dr"
.linkonce same_size
.align 8
_ZTI14expanded_test1:
.quad _ZTVN10__cxxabiv120__si_class_type_infoE+16
.quad _ZTS14expanded_test1
.quad _ZTI5test1
.globl _ZTV14expanded_test1
.section .rdata$_ZTV14expanded_test1,"dr"
.linkonce same_size
.align 8
_ZTV14expanded_test1:
.quad 0
.quad _ZTI14expanded_test1
.quad _ZN14expanded_test17computeEv
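This _ZTV14expanded_test1 is the vtable for expanded_test1, and the leaq/movq pair in main stores a pointer into it (the vptr) inside the object! It means that a C-style memset(this, 0, sizeof(T)) will wipe out that hidden vptr, causing some very sneaky bugs, because the next virtual call then goes through a null pointer. That’s why the safe way to initialize in C++ is to use member initializer lists or default member initializers, or to assign values inside the constructor. Another fun little tidbit is that when RTTI (Run-Time Type Information) is enabled (it is by default), the compiler also emits the _ZTI and _ZTS type-info objects, and _ZTVN10__cxxabiv120__si_class_type_infoE+16 is the vptr of that type-info object, marking it as a class with a single base.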
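Here is a minimal sketch of how to keep memset away from such types. Everything in it (plain_point, shape, zero_fill) is a hypothetical name for illustration. The idea is to let the compiler reject zero-filling of anything that is not trivially copyable, which includes every class with virtual functions.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical C-style struct: no hidden vptr, so zero-filling is fine.
struct plain_point {
    int x;
    int y;
};

// Hypothetical polymorphic type: the object carries a hidden vptr,
// so it is initialized with default member initializers instead.
struct shape {
    int width = 0;
    int height = 0;
    virtual ~shape() = default;
    virtual int area() const { return width * height; }
};

static_assert(std::is_trivially_copyable<plain_point>::value,
              "plain_point can be treated as raw bytes");
static_assert(!std::is_trivially_copyable<shape>::value,
              "a type with virtual functions must not be memset");

// Zero-fill helper that refuses to compile for types like shape.
template <typename T>
void zero_fill(T& object) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memset would clobber non-trivial state such as the vptr");
    std::memset(&object, 0, sizeof(T));
}
```

With this, zero_fill on a plain_point works as in C, while zero_fill on a shape becomes a compile error instead of a runtime crash.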
Then there are some tiny details, like NULL. In C’s stddef.h (stdlib.h defines it too), we can see it’s:
#define NULL ((void *)0)
But in C++, it became:
#define NULL 0
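Soooo, if you use NULL during template argument deduction, it will be deduced as an integer type, not a pointer! (GCC actually defines NULL as the built-in __null in C++, but that still deduces as an integer.) That’s why in C++, you should always, always use the built-in nullptr.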
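A quick sketch of the difference, using the literal 0 as a stand-in for a NULL defined as 0. The function names describe and deduced_as_integer are hypothetical.

```cpp
#include <cstddef>      // std::nullptr_t
#include <string>
#include <type_traits>

// Three overloads to show which one a given argument selects.
std::string describe(int) { return "int"; }
std::string describe(const char*) { return "pointer"; }
std::string describe(std::nullptr_t) { return "nullptr_t"; }

// Reports whether template deduction produced an integer type.
template <typename T>
bool deduced_as_integer(T) {
    return std::is_integral<T>::value;
}

// describe(0) picks the int overload, describe(nullptr) picks nullptr_t.
// Passing NULL itself is deliberately avoided here: depending on how the
// implementation defines it, the call may pick the int overload or be
// rejected as ambiguous.
```

So deduced_as_integer(0) is true while deduced_as_integer(nullptr) is false, which is exactly the trap with a NULL that is really just 0.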
Another big topic worth diving into is the difference in memory allocation between C and C++. You can’t just think of it as a simple switch from malloc/free to new/delete. The biggest deal with C++’s new and delete is that they automatically call constructors and destructors. This can hide some surprisingly high costs! Imagine you have a relatively complex type T. If you write new T[100], it will call the constructor 100 times right at the moment of allocation, not when you actually start using the objects. This is a “pay upfront” model. In cases like a hash table, which might have a low load factor and thus many unused slots, this is just a pure waste of resources. A more modern approach is to use std::pmr (Polymorphic Memory Resources) to decouple the memory allocation strategy, or to get raw storage from a std::allocator, call std::construct_at (C++20) only when you’re ready to use the object, and then call std::destroy_at when you’re done. The placement new operator has the same effect, but the former methods are clearly more modern. Also, both malloc/free and new/delete draw from the process-wide heap by default, so in a concurrent environment the allocator has to synchronize; depending on the implementation, that can mean lock contention that partly serializes memory operations. This is why decoupling memory allocation strategies has always been a critical need in high-performance computing.
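Here is a rough sketch of the “construct only when used” idea. The types tracked and lazy_slots are made up for this example. It uses std::allocator_traits::construct and destroy, the allocator-aware counterparts of std::construct_at and std::destroy_at, so that it also builds before C++20.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical element type that counts how many objects are alive.
struct tracked {
    static int live;
    int value;
    explicit tracked(int v) : value(v) { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// Hypothetical fixed-capacity slot array: memory for all slots is allocated
// up front, but an element is constructed only when emplace() is called.
template <typename T>
class lazy_slots {
    using traits = std::allocator_traits<std::allocator<T>>;
    std::allocator<T> alloc_;
    std::size_t capacity_;
    std::vector<bool> used_;  // which slots currently hold a live object
    T* storage_;              // raw, uninitialized memory

public:
    explicit lazy_slots(std::size_t capacity)
        : capacity_(capacity),
          used_(capacity, false),
          storage_(traits::allocate(alloc_, capacity)) {}  // no constructors run

    lazy_slots(const lazy_slots&) = delete;
    lazy_slots& operator=(const lazy_slots&) = delete;

    ~lazy_slots() {
        for (std::size_t i = 0; i < capacity_; ++i) erase(i);
        traits::deallocate(alloc_, storage_, capacity_);
    }

    // Construct an object in the given slot, replacing any previous one.
    template <typename... Args>
    T& emplace(std::size_t index, Args&&... args) {
        erase(index);
        traits::construct(alloc_, storage_ + index, std::forward<Args>(args)...);
        used_[index] = true;
        return storage_[index];
    }

    // Destroy the object in the given slot, if there is one.
    void erase(std::size_t index) {
        if (used_[index]) {
            traits::destroy(alloc_, storage_ + index);
            used_[index] = false;
        }
    }
};
```

A lazy_slots<tracked> with 100 slots runs zero constructors until the first emplace, whereas new T[100] would have needed 100 default constructions right away (and tracked does not even have a default constructor).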
In conclusion, when we look at low-level development involving C today, it’s often a combination of “C and assembly”. C, as a high-level language, saves a lot of effort while staying efficient, and its affinity with assembly is much better, which makes it suitable for many fine-tuned scenarios, like the Linux kernel. By the same token, if a project uses assembly, introducing C++ creates what is effectively heterogeneous code across multiple “languages”. Plus, implementing a good C++ compiler is far more torturous than implementing one for C; after all, C++’s template system has been Turing-complete since C++98. On the other hand, C++’s object orientation, lambdas, and templates make it better suited to large, high-performance software. The goal of “zero-cost abstractions” is the result of wanting both high-level architecture and low-level performance.