When reprinting, please keep the following statement:
  Author: Zhao Zongsheng
  Source: https://www.cnblogs.com/zhao-zongsheng/p/9099603.html

Many people who write C/C++ know the concept and rules of “memory alignment”, but do not necessarily understand it deeply. This article tries to discuss memory alignment in C++ in more depth, from the hardware level up to the C++ language.

What is memory alignment?

First, what is memory alignment? It is a concept that originates at the hardware level. As we all know, an executable program consists of a series of CPU instructions, and some of those instructions need to access memory. The most common are instructions that read from memory into a register and write from a register into memory. Older architectures (including x86) also have instructions that operate directly on memory, and these imply memory accesses as well. On many CPU architectures, these instructions require the address of the memory they operate on (more precisely, the starting address of the accessed memory) to be divisible by the size of the accessed memory. A memory access that meets this requirement is called an aligned memory access; otherwise it is an unaligned memory access. For example, ARM's LDRH instruction reads 2 bytes from memory into a register. If the specified address is 0x2587c20, the access is aligned, because 0x2587c20 is divisible by 2. If the specified address is 0x2587c33, the access is unaligned, because 0x2587c33 is not divisible by 2.
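The divisibility rule can be written down as a one-line check. The sketch below is only an illustration: `is_aligned` is a hypothetical helper name, not a standard library function, and it treats an address as a plain integer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the standard library): returns true when
// an access of `size` bytes starting at `address` is aligned, i.e. when the
// address is divisible by the access size.
constexpr bool is_aligned(std::uintptr_t address, std::size_t size)
{
    return address % size == 0;
}

// The two addresses from the LDRH example, each with a 2-byte access.
static_assert(is_aligned(0x2587c20, 2), "0x2587c20 is divisible by 2");
static_assert(!is_aligned(0x2587c33, 2), "0x2587c33 is not divisible by 2");
```

Because the check is `constexpr`, the two `static_assert` lines are verified at compile time.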

What happens if you access unaligned memory? It depends on the CPU.

  • Some CPU architectures can access unaligned memory, but with a performance penalty. The x86 architecture is the typical example.
  • Some CPUs raise an exception.
  • Some CPUs raise no exception and silently access the wrong address.
  • In recent years, some CPUs have gained instructions that access unaligned memory normally, without any performance impact.

Because each CPU handles unaligned memory access differently, unaligned access should be avoided as much as possible. This is why C/C++ has a memory alignment mechanism.

C++ memory alignment mechanism

In C++, every type has two attributes: its size and its alignment requirement (or simply its alignment). The C++ standard does not specify the alignment of each type, but in general the following rules hold:

  1. The alignment of every fundamental type is equal to the size of that type.
  2. The alignment of a struct, class, or union type is equal to the largest alignment among its non-static member variables.

In addition, the standard specifies that every alignment must be a power of 2.
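These rules can be probed with `alignof`. The sketch below is a minimal illustration: `Sample`, `is_power_of_two`, and `alignment_equals_size` are hypothetical names introduced here. Only the power-of-two property and the "at least as strict as each member" property are asserted unconditionally. Rule 1 is a common convention rather than a guarantee, so it is exposed as a query instead.

```cpp
#include <cstddef>

// Hypothetical sample type used only for this illustration.
struct Sample
{
    char c;
    double d;
    short s;
};

// True when n is a non-zero power of two.
constexpr bool is_power_of_two(std::size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

// Rule 1 as a query: does T's alignment equal its size on this platform?
template <typename T>
constexpr bool alignment_equals_size()
{
    return alignof(T) == sizeof(T);
}

// Every alignment is a power of two.
static_assert(is_power_of_two(alignof(char)), "");
static_assert(is_power_of_two(alignof(int)), "");
static_assert(is_power_of_two(alignof(double)), "");
static_assert(is_power_of_two(alignof(Sample)), "");

// A class is at least as strictly aligned as its strictest member; under
// rule 2 it is usually exactly as strict.
static_assert(alignof(Sample) >= alignof(double), "");
```

On a given platform, `alignment_equals_size<int>()` shows whether rule 1 holds for `int` there.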

When the compiler allocates memory for a variable, it must work out and satisfy the alignment requirement of the variable's type. The byte offsets of the non-static member variables of struct and class types must also satisfy the alignment requirements of their respective types.

For example,

class MyObject
{
    char c;
    int i;
    short s;
};

c is of type char, so its alignment requirement is 1; i is of type int, so its alignment requirement is 4; s is of type short, so its alignment requirement is 2. MyObject therefore takes the largest of these, 4, as its own alignment requirement. If a variable of type MyObject is declared in a function, the starting address of the memory allocated to it will be divisible by 4.

Now look at the member variables of MyObject. c is the first member variable, so its byte offset is 0; that is, c occupies the first byte of MyObject. The alignment requirement of i is 4, so its byte offset must be a multiple of 4. Because i must come after c, the byte offset of i is 4; that is, i occupies the fifth to eighth bytes of MyObject, and the second to fourth bytes are padding. The alignment requirement of s is 2, and because s must come after i, the byte offset of s is 8; that is, s occupies the ninth and tenth bytes of MyObject. In addition, because every element of an array of a struct, class, or union type must be aligned in memory, the size of a struct, class, or union is generally an integer multiple of its alignment. The size of MyObject is therefore 12, which means there are 2 bytes of padding after s.
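The layout walked through above can be checked with `offsetof` and `sizeof`. This is a sketch under stated assumptions: `MyObjectLayout` is a hypothetical copy of MyObject declared as a struct so that its members are public, and `matches_layout_in_text` assumes a platform with a 4-byte int and a 2-byte short.

```cpp
#include <cstddef>

// Same members as MyObject, declared as a struct so the members are public
// and offsetof can be applied to them.
struct MyObjectLayout
{
    char c;
    int i;
    short s;
};

// Each member sits at an offset that is a multiple of its own alignment.
static_assert(offsetof(MyObjectLayout, c) == 0, "c is the first member");
static_assert(offsetof(MyObjectLayout, i) % alignof(int) == 0, "i is aligned");
static_assert(offsetof(MyObjectLayout, s) % alignof(short) == 0, "s is aligned");

// The size is a multiple of the alignment, so array elements stay aligned.
static_assert(sizeof(MyObjectLayout) % alignof(MyObjectLayout) == 0, "tail padding");

// Hypothetical helper: true when the layout matches the numbers in the text
// (offsets 0, 4, 8 and size 12). Assumes a 4-byte int and a 2-byte short.
constexpr bool matches_layout_in_text()
{
    return offsetof(MyObjectLayout, i) == 4
        && offsetof(MyObjectLayout, s) == 8
        && sizeof(MyObjectLayout) == 12;
}
```

The `static_assert` lines hold on any conforming platform, while `matches_layout_in_text()` is only expected to return true where the size assumptions hold.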

Because memory in C++ is normally accessed by reading and writing variables, and this mechanism ensures that every variable is aligned, the memory accesses in a program are normally aligned.

Of course, C++ will not prevent us from accessing unaligned memory. For example, the following code is likely to access unaligned memory:

char buf[10];
int* ptr = (int*)(buf + 1);
++*ptr;

Code like this does turn up in real work. It is dangerous, because it will probably access unaligned memory (formally, dereferencing a misaligned pointer is undefined behavior). This is one reason why C-style casts are discouraged in C++ in favor of static_cast, dynamic_cast, const_cast, and reinterpret_cast. Here, the code above would have to use reinterpret_cast. As we all know, reinterpret_cast is dangerous, so having to write it may prompt us to look for a way to avoid such logic.
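One common way to avoid this kind of cast is to copy the bytes with `std::memcpy` instead of dereferencing a misaligned `int*`. The article does not prescribe this technique; it is shown here as a minimal sketch, and the helper names `load_int_unaligned`, `store_int_unaligned`, and `increment_int_at` are hypothetical.

```cpp
#include <cstring>

// Hypothetical helper: read an int stored at an arbitrary byte address by
// copying its bytes, so no misaligned int object is ever dereferenced.
inline int load_int_unaligned(const char* src)
{
    int value;
    std::memcpy(&value, src, sizeof value);
    return value;
}

// Hypothetical helper: write an int to an arbitrary byte address.
inline void store_int_unaligned(char* dst, int value)
{
    std::memcpy(dst, &value, sizeof value);
}

// Same effect as `++*(int*)p`, without the unaligned dereference.
inline void increment_int_at(char* p)
{
    store_int_unaligned(p, load_int_unaligned(p) + 1);
}
```

With `char buf[10] = {};`, calling `increment_int_at(buf + 1)` updates the four bytes starting at `buf + 1` and leaves the choice of load and store instructions to the compiler.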

Unaligned memory access on common CPUs

According to Intel's latest Intel 64 and IA-32 architecture manuals, both the Intel 64 and IA-32 architectures support unaligned memory access, but with additional performance overhead (see http://www.intel.com/products/processor/manuals). In practice, however, recent Core series CPUs can access unaligned memory with essentially no extra cost.

On ARMv8, the architecture most common in mobile phones, an ordinary unaligned memory access (one not used for multi-core synchronization) either generates an alignment fault or performs the unaligned memory operation. In other words, whether it fails or works normally depends on the specific CPU implementation. Even when it works, there are limitations: for example, the atomicity of reads and writes is not guaranteed (except for single-byte accesses), and there is likely to be additional overhead (see https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile for details). The Cortex-A series, a CPU family common in mobile phones that implements ARMv8, handles unaligned memory access normally, but generally with additional overhead (see http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html).

We can also write a simple program to test how well a CPU supports unaligned memory access. Here is the code:

#include <iostream>
#include <chrono>

using namespace std;
using namespace std::chrono;

milliseconds test_duration(volatile int * ptr)  // Using the volatile pointer to prevent compiler optimization
{
    auto start = steady_clock::now();
    for (unsigned i = 0; i < 100'000'000; ++i)
    {
        ++(*ptr);
    }
    auto end = steady_clock::now();
    return duration_cast<milliseconds>(end - start);
}

int main()
{
    int raw[2] = {0, 0};
    {
        int* ptr = raw;
        cout << "address of aligned pointer: " << (void*)ptr << endl;
        cout << "aligned access: " << test_duration(ptr).count() << "ms" << endl;
        *ptr = 0;
    }
    {
        // Deliberately misaligned: one byte past the start of raw (formally undefined behavior).
        int* ptr = (int*)(((char*)raw) + 1);
        cout << "address of unaligned pointer: " << (void*)ptr << endl;
        cout << "unaligned access: " << test_duration(ptr).count() << "ms" << endl;
        *ptr = 0;
    }
    cin.get();
    return 0;
}

The CPU I tested was an Intel Core i7-2630QM, a second-generation Intel Core CPU, and the results were:

address of aligned pointer: 000000668DEFFA78
aligned access: 282ms
address of unaligned pointer: 000000668DEFFA79
unaligned access: 285ms

We can see that, on this CPU, there is essentially no performance difference between aligned and unaligned memory access.

Modifying alignment requirements in C++

Generally speaking, we do not need to customize alignment requirements, but in some special cases they need to be adjusted. In C++, we can use the alignas keyword to change the alignment requirement of a type or a variable. For example:

class MyObject
{
    char c;
    alignas(8) int i;
    short s;
};

In this way, the alignment requirement of the variable i changes from 4 to 8. As a result, the byte offset of i changes from 4 to 8, the byte offset of s changes from 8 to 12, the alignment requirement of MyObject becomes 8, and its size becomes 16.
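These numbers can be confirmed at compile time. In the sketch below, `MyObjectAligned` is a hypothetical copy of the class above, declared as a struct so that `offsetof` can reach its members. `matches_numbers_in_text` assumes a 4-byte int and a 2-byte short.

```cpp
#include <cstddef>

// Same members as the class above, with public access for offsetof.
struct MyObjectAligned
{
    char c;
    alignas(8) int i;
    short s;
};

// The stricter member alignment propagates to the enclosing type.
static_assert(alignof(MyObjectAligned) == 8, "type alignment follows i");
static_assert(offsetof(MyObjectAligned, i) == 8, "i moves to the next multiple of 8");
static_assert(sizeof(MyObjectAligned) % 8 == 0, "size is a multiple of the alignment");

// Hypothetical helper: true when s is at offset 12 and the size is 16, as
// stated in the text. Assumes a 4-byte int and a 2-byte short.
constexpr bool matches_numbers_in_text()
{
    return offsetof(MyObjectAligned, s) == 12 && sizeof(MyObjectAligned) == 16;
}
```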

We can also apply alignas to the definition of MyObject:

class alignas(16) MyObject
{
    char c;
    int i;
    short s;
};

You can also put a type inside alignas, which means the alignment of that type. You can also use several alignas specifiers at once, in which case the largest alignment requirement is used. For example, the alignment requirement of the following MyObject is 16:

class alignas(int) alignas(16) MyObject
{
    char c;
    int i;
    short s;
};
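A short compile-time check of the two forms, as a sketch: `MyObject16` and `AlignedLikeDouble` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>

// Several alignas specifiers on one type: the strictest one wins.
struct alignas(int) alignas(16) MyObject16
{
    char c;
    int i;
    short s;
};

static_assert(alignof(MyObject16) == 16, "the largest requested alignment is used");
static_assert(sizeof(MyObject16) % 16 == 0, "size is rounded up to the alignment");

// alignas(T) requests the same alignment as alignas(alignof(T)).
struct alignas(double) AlignedLikeDouble
{
    char c;
};

static_assert(alignof(AlignedLikeDouble) == alignof(double), "");
```

Note that raising the alignment to 16 also raises the size: the 12 bytes of members are padded up to the next multiple of 16.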

One limitation of alignas is that it cannot make an alignment requirement smaller than the type's natural alignment. For example, the following code is reported as an error:

alignas(1) int i;

In addition, C++ has a special type, max_align_t. An alignment that is not greater than the alignment of max_align_t is called a fundamental alignment, and one that is greater is called an extended alignment. The C++ standard requires every platform to support fundamental alignments, whereas support for extended alignments depends on the platform. Generally speaking, the alignment of max_align_t is equal to the alignment of long double.
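The distinction can be expressed directly in terms of `alignof(std::max_align_t)`. The sketch below is illustrative only: `is_fundamental_alignment` and `malloc_is_max_aligned` are hypothetical helper names. The second helper checks the practical consequence that memory returned by `std::malloc` is suitably aligned for any type with a fundamental alignment.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: an alignment is "fundamental" when it does not exceed
// the alignment of std::max_align_t; anything larger is "extended".
constexpr bool is_fundamental_alignment(std::size_t alignment)
{
    return alignment <= alignof(std::max_align_t);
}

// long double is a scalar type, so its alignment is always fundamental.
static_assert(is_fundamental_alignment(alignof(long double)), "");

// Hypothetical helper: checks that a block from std::malloc that is large
// enough to hold a std::max_align_t is aligned for one.
inline bool malloc_is_max_aligned()
{
    void* p = std::malloc(sizeof(std::max_align_t));
    if (p == nullptr)
    {
        return false;  // allocation failed, nothing to check
    }
    const bool aligned =
        reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
    std::free(p);
    return aligned;
}
```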

C++ has many other facilities that support memory alignment, such as the alignof keyword for querying an alignment, the aligned_storage template for creating a type of arbitrary size and arbitrary alignment, alignment_of for convenient template programming, and so on. That's about it.
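A brief sketch of how these facilities fit together. `InPlace` is a hypothetical class template written for this illustration. Because `std::aligned_storage` is deprecated as of C++23, the sketch swaps in an `alignas(T)` byte array, which yields storage with the same size and alignment properties.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>

// std::alignment_of<T>::value is the same number as alignof(T), packaged as
// a type trait for use in template code.
static_assert(std::alignment_of<int>::value == alignof(int), "");
static_assert(std::alignment_of<double>::value == alignof(double), "");

// Hypothetical wrapper: raw storage with the size and alignment of T, in
// which a T is constructed with placement new.
template <typename T>
class InPlace
{
public:
    explicit InPlace(const T& value) : ptr_(new (storage_) T(value)) {}
    ~InPlace() { ptr_->~T(); }

    InPlace(const InPlace&) = delete;
    InPlace& operator=(const InPlace&) = delete;

    T& get() { return *ptr_; }

private:
    alignas(T) unsigned char storage_[sizeof(T)];  // aligned raw bytes
    T* ptr_;                                       // pointer to the constructed object
};
```

For example, `InPlace<double> d(1.5);` constructs the double inside the aligned byte array, and `d.get()` returns a reference to it.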
