Introduction to smart pointers and move semantics

Consider a function in which we dynamically allocate a value:

void someFunction()
{
    Resource* ptr = new Resource(); // Resource is a struct or class

    // do stuff with ptr here

    delete ptr;
}

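Although the above code seems fairly straightforward, it's easy to forget to deallocate ptr. Even if you do remember to delete ptr at the end of the function, ptr may never be deleted if the function exits early. For example, the function may return early: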

#include <iostream>

void someFunction()
{
    Resource* ptr = new Resource();

    int x;
    std::cout << "Enter an integer: ";
    std::cin >> x;

    if (x == 0)
        return; // the function returns early, and ptr won't be deleted!

    // do stuff with ptr here

    delete ptr;
}

Or it may exit early because an exception is thrown:

#include <iostream>

void someFunction()
{
    Resource* ptr = new Resource();

    int x;
    std::cout << "Enter an integer: ";
    std::cin >> x;

    if (x == 0)
        throw 0; // the function exits early, and ptr won't be deleted!

    // do stuff with ptr here

    delete ptr;
}

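In the above two programs, the return or throw statement causes the function to terminate early, without variable ptr being deleted. Consequently, the memory allocated for ptr is leaked (and will be leaked again every time this function is called and exits early).

At heart, these kinds of issues occur because pointer variables have no inherent mechanism to clean up after themselves.

Smart pointer classes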

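One of the best things about classes is that they have destructors, which are executed automatically when an object of the class goes out of scope. So if you allocate (or acquire) memory in your constructor, you can deallocate it in your destructor, and be guaranteed that the memory will be deallocated when the class object is destroyed (regardless of whether it goes out of scope, gets explicitly deleted, etc.). This is at the heart of the RAII (Resource Acquisition Is Initialization) programming paradigm.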
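To make that guarantee concrete, here is a minimal sketch (not part of the original lesson). The names Guard, g_liveGuards, leaveScope, and cleanedUpAfter are hypothetical, chosen only for illustration. The sketch counts how many Guard objects are alive and leaves a scope in three different ways.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter: how many Guard objects are currently alive.
int g_liveGuards = 0;

class Guard
{
public:
    Guard() { ++g_liveGuards; }   // "acquire" in the constructor
    ~Guard() { --g_liveGuards; }  // "release" in the destructor

    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
};

// Leaves the scope in one of three ways, depending on mode.
void leaveScope(int mode)
{
    Guard guard;
    assert(g_liveGuards == 1);

    if (mode == 0)
        return;                           // early return
    if (mode == 1)
        throw std::runtime_error("oops"); // exception

    // mode == 2: fall off the end normally
}

// Returns true if no Guard is left alive after leaving the scope.
bool cleanedUpAfter(int mode)
{
    try
    {
        leaveScope(mode);
    }
    catch (const std::runtime_error&)
    {
        // The Guard destructor already ran during stack unwinding.
    }
    return g_liveGuards == 0;
}
```

In all three modes the destructor runs, so cleanedUpAfter() should return true without any explicit cleanup call in leaveScope().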

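So can we use a class to help us manage and clean up our pointers? We can!

Consider a class whose sole job is to hold and "own" a pointer passed to it, and then deallocate that pointer when the class object goes out of scope. As long as objects of that class are only created as local variables, we can guarantee that when the object goes out of scope (regardless of when or how the function terminates), the pointer it owns is deallocated.

Here's a first draft of the idea: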

#include <iostream>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    // Pass in a pointer to "own" via the constructor
    Auto_ptr1(T* ptr=nullptr)
        :m_ptr(ptr)
    {
    }

    // The destructor will make sure the owned pointer gets deallocated
    ~Auto_ptr1()
    {
        delete m_ptr;
    }

    // Overload dereference and operator-> so we can use Auto_ptr1 like m_ptr.
    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

// A sample class to prove the above works
class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
};

int main()
{
    Auto_ptr1<Resource> res(new Resource()); // Note the allocation of memory here

    // ... but no explicit delete is needed

    // Also note that we use <Resource>, not <Resource*>
    // This is because m_ptr has type T* (not T)

    return 0;
} // res goes out of scope here, and destroys the allocated Resource for us

This program prints:

Resource acquired
Resource destroyed

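Consider how this program and class work. First, we dynamically create a Resource, and pass it as an argument to our templated Auto_ptr1 class. From that point forward, our Auto_ptr1 variable res owns that Resource object. Because res is declared as a local variable and has block scope, it will go out of scope when the block ends, and be destroyed (no worries about forgetting to deallocate it). And because it is a class object, the Auto_ptr1 destructor will be called when it is destroyed. That destructor will ensure that the Resource pointer it is holding gets deleted!

As long as Auto_ptr1 is defined as a local variable (with automatic storage duration, hence the "Auto" part of the class name), the Resource is guaranteed to be destroyed at the end of the block it is declared in, regardless of how the function terminates (even if it terminates early).

Such a class is called a smart pointer. A smart pointer is a composition class that is designed to manage dynamically allocated memory and ensure that the memory gets deleted when the smart pointer object goes out of scope. (Relatedly, built-in pointers are sometimes called "dumb pointers" because they can't clean up after themselves.)

Now let's go back to our someFunction() example above, and see how a smart pointer class can solve the problem: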

#include <iostream>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    // Pass in a pointer to "own" via the constructor
    Auto_ptr1(T* ptr=nullptr)
        :m_ptr(ptr)
    {
    }

    // The destructor will make sure the owned pointer gets deallocated
    ~Auto_ptr1()
    {
        delete m_ptr;
    }

    // Overload dereference and operator-> so we can use Auto_ptr1 like m_ptr.
    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

// A sample class to prove the above works
class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
    void sayHi() { std::cout << "Hi!\n"; }
};

void someFunction()
{
    Auto_ptr1<Resource> ptr(new Resource()); // ptr now owns the Resource

    int x;
    std::cout << "Enter an integer: ";
    std::cin >> x;

    if (x == 0)
        return; // the function returns early

    // do stuff with ptr here
    ptr->sayHi();
}

int main()
{
    someFunction();

    return 0;
}

If the user enters a non-zero integer, the above program will print:

Resource acquired
Hi!
Resource destroyed

If the user enters zero, someFunction() will return early, and the program will print:

Resource acquired
Resource destroyed

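Note that even in the case where the user enters zero and the function terminates early, the Resource is still properly deallocated.

Because the ptr variable is a local variable, ptr will be destroyed when the function terminates (regardless of how it terminates). And because the Auto_ptr1 destructor cleans up the Resource, we are assured that the Resource will be properly released.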
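The same reasoning covers the earlier throw version of someFunction(). The following is a minimal sketch under a few assumptions: someFunction takes x as a parameter instead of reading it from the keyboard, and the g_aliveResources counter and the runAndCatch helper are hypothetical additions that make the cleanup observable.

```cpp
#include <cassert>
#include <iostream>

template <typename T>
class Auto_ptr1
{
    T* m_ptr {};
public:
    Auto_ptr1(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr1() { delete m_ptr; }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

// Hypothetical counter: how many Resource objects are currently alive.
int g_aliveResources = 0;

class Resource
{
public:
    Resource() { ++g_aliveResources; std::cout << "Resource acquired\n"; }
    ~Resource() { --g_aliveResources; std::cout << "Resource destroyed\n"; }
    void sayHi() { std::cout << "Hi!\n"; }
};

void someFunction(int x)
{
    Auto_ptr1<Resource> ptr(new Resource()); // ptr now owns the Resource

    if (x == 0)
        throw 0; // ptr is destroyed while the stack unwinds

    ptr->sayHi();
}

// Calls someFunction and reports whether it threw.
bool runAndCatch(int x)
{
    try
    {
        someFunction(x);
    }
    catch (int)
    {
        // By the time the handler runs, ptr has already been destroyed.
        assert(g_aliveResources == 0);
        std::cout << "Caught exception\n";
        return true;
    }
    return false;
}
```

Calling runAndCatch(0) should print "Resource acquired", then "Resource destroyed", then "Caught exception". Note that this relies on the exception being caught somewhere: if an exception is never caught, whether the stack is unwound before the program terminates is implementation-defined.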

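A critical flaw

The Auto_ptr1 class has a critical flaw lurking behind some auto-generated code. Before reading further, see if you can identify what it is.

(Hint: consider which parts of a class get auto-generated if you don't supply them)

(Jeopardy music)

Okay, time's up.

Consider the following program: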

#include <iostream>

// Same as above
template <typename T>
class Auto_ptr1
{
	T* m_ptr {};
public:
	Auto_ptr1(T* ptr=nullptr)
		:m_ptr(ptr)
	{
	}
	
	~Auto_ptr1()
	{
		delete m_ptr;
	}

	T& operator*() const { return *m_ptr; }
	T* operator->() const { return m_ptr; }
};

class Resource
{
public:
	Resource() { std::cout << "Resource acquired\n"; }
	~Resource() { std::cout << "Resource destroyed\n"; }
};

int main()
{
	Auto_ptr1<Resource> res1(new Resource());
	Auto_ptr1<Resource> res2(res1);

	return 0;
}

This program prints:

Resource acquired
Resource destroyed
Resource destroyed

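Very likely (but not necessarily) your program will crash at this point. See the problem now? Because we haven't supplied a copy constructor or an assignment operator, C++ provides them for us. And the functions it provides do shallow copies. So when we initialize res2 with res1, both Auto_ptr1 variables point at the same Resource. When res2 goes out of scope, it deletes the Resource, leaving res1 with a dangling pointer. When res1 then goes to delete its (already deleted) Resource, undefined behavior results (probably a crash)!

You'd run into a similar problem with a function like this: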

void passByValue(Auto_ptr1<Resource> res)
{
}

int main()
{
	Auto_ptr1<Resource> res1(new Resource());
	passByValue(res1);

	return 0;
}

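In this program, res1 is copied by value into passByValue's parameter res, so both res1.m_ptr and res.m_ptr hold the same address.

When res is destroyed at the end of the function, res1.m_ptr is left dangling. When res1.m_ptr is later deleted, undefined behavior results.

So clearly this isn't good. How can we address this?

Well, one thing we could do is explicitly delete the copy constructor and assignment operator, thereby preventing any copies from being made in the first place. That would prevent the pass by value case (which is good, since we usually shouldn't be passing these by value anyway).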
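A minimal sketch of that option might look like the following. The class name NoCopy_ptr, the value member of Resource, and the helper functions are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
class NoCopy_ptr // hypothetical name: Auto_ptr1 with copying disabled
{
    T* m_ptr {};
public:
    NoCopy_ptr(T* ptr = nullptr) : m_ptr(ptr) {}
    ~NoCopy_ptr() { delete m_ptr; }

    // Copying is disabled, so two objects can never own the same pointer.
    NoCopy_ptr(const NoCopy_ptr&) = delete;
    NoCopy_ptr& operator=(const NoCopy_ptr&) = delete;

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

struct Resource
{
    int value { 42 }; // hypothetical member, so there is something to read
};

// Compile-time checks that copying is rejected.
static_assert(!std::is_copy_constructible<NoCopy_ptr<Resource>>::value,
              "copy construction should be disabled");
static_assert(!std::is_copy_assignable<NoCopy_ptr<Resource>>::value,
              "copy assignment should be disabled");

// Passing by reference still works, because no copy is made.
int readValue(const NoCopy_ptr<Resource>& res)
{
    return res->value;
}

int lendByReference()
{
    NoCopy_ptr<Resource> res(new Resource());
    // NoCopy_ptr<Resource> copy(res); // would not compile: copy constructor is deleted
    int value = readValue(res);        // res keeps ownership the whole time
    assert(value == 42);
    return value;
}
```

With copying disabled, the double-delete programs above no longer compile, which turns a runtime crash into a compile-time error.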

But then how would we return an Auto_ptr1 from a function back to the caller?

??? generateResource()
{
     Resource* r{ new Resource() };
     return Auto_ptr1(r);
}

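We can't return our Auto_ptr1 by reference, because the local Auto_ptr1 will be destroyed at the end of the function, and the caller will be left with a dangling reference. We could return r directly as a Resource*, but then we might forget to delete it later, which defeats the whole point of using a smart pointer. So that's out. Returning the Auto_ptr1 by value is the ability we need, but with the default copy behavior it ends in shallow copies, duplicated pointers, and crashes.

Another option would be to overload the copy constructor and assignment operator to make deep copies. That way, we'd at least be guaranteed to avoid multiple pointers to the same object. But copying can be expensive (and may not be desirable or even possible), and we don't want to make needless copies of objects just to return an Auto_ptr1 from a function. Plus, assigning or initializing a dumb pointer doesn't copy the object being pointed to, so we'd naturally expect smart pointers to behave similarly.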
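For comparison, a deep-copying variant could be sketched as follows. The name DeepCopy_ptr, the value member of Resource, and the helper function are hypothetical, and the sketch assumes T is copy constructible and not used polymorphically (new T(*ptr) would slice a derived object).

```cpp
#include <cassert>

template <typename T>
class DeepCopy_ptr // hypothetical name
{
    T* m_ptr {};
public:
    DeepCopy_ptr(T* ptr = nullptr) : m_ptr(ptr) {}
    ~DeepCopy_ptr() { delete m_ptr; }

    // Copy constructor: allocate a new T copied from the source's T.
    DeepCopy_ptr(const DeepCopy_ptr& a)
        : m_ptr(a.m_ptr ? new T(*a.m_ptr) : nullptr)
    {
    }

    // Copy assignment: build the copy first, then release the old object.
    DeepCopy_ptr& operator=(const DeepCopy_ptr& a)
    {
        if (&a == this)
            return *this;

        T* copy = a.m_ptr ? new T(*a.m_ptr) : nullptr;
        delete m_ptr;
        m_ptr = copy;
        return *this;
    }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
};

struct Resource
{
    int value;
};

// Copies a smart pointer, changes the copy, and returns the original's value.
int originalAfterCopyIsModified()
{
    DeepCopy_ptr<Resource> res1(new Resource{ 1 });
    DeepCopy_ptr<Resource> res2(res1); // res2 owns its own, separate Resource

    assert(&*res1 != &*res2);          // two distinct objects
    res2->value = 2;                   // does not affect res1
    return res1->value;
}
```

Each copy performs an extra allocation and an extra T copy, which is exactly the cost the paragraph above is concerned about.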

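What do we do?

Move semantics

What if, instead of having our copy constructor and assignment operator copy the pointer ("copy semantics"), we transfer/move ownership of the pointer from the source object to the destination object? This is the core idea behind move semantics. Move semantics means the class transfers ownership of the object rather than making a copy.

Let's update our Auto_ptr1 class to show how this can be done: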

#include <iostream>

template <typename T>
class Auto_ptr2
{
    T* m_ptr {};
public:
    Auto_ptr2(T* ptr=nullptr)
        :m_ptr(ptr)
    {
    }

    ~Auto_ptr2()
    {
        delete m_ptr;
    }

    // A copy constructor that implements move semantics
    Auto_ptr2(Auto_ptr2& a) // note: not const
    {
        // We don't need to delete m_ptr here. This constructor is only called
        // when a new object is being created, so m_ptr can't have been set before.
        m_ptr = a.m_ptr; // transfer ownership of the pointer from the source to this object
        a.m_ptr = nullptr; // make sure the source no longer owns the pointer
    }

    // An assignment operator that implements move semantics
    Auto_ptr2& operator=(Auto_ptr2& a) // note: not const
    {
        if (&a == this)
            return *this;

        delete m_ptr; // deallocate any pointer this object is already holding
        m_ptr = a.m_ptr; // transfer ownership of the pointer from the source to this object
        a.m_ptr = nullptr; // make sure the source no longer owns the pointer
        return *this;
    }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    bool isNull() const { return m_ptr == nullptr; }
};

class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
};

int main()
{
    Auto_ptr2<Resource> res1(new Resource());
    Auto_ptr2<Resource> res2; // Start as nullptr

    std::cout << "res1 is " << (res1.isNull() ? "null\n" : "not null\n");
    std::cout << "res2 is " << (res2.isNull() ? "null\n" : "not null\n");

    res2 = res1; // res2 assumes ownership, res1 is set to null

    std::cout << "Ownership transferred\n";

    std::cout << "res1 is " << (res1.isNull() ? "null\n" : "not null\n");
    std::cout << "res2 is " << (res2.isNull() ? "null\n" : "not null\n");

    return 0;
}

This program prints:

Resource acquired
res1 is not null
res2 is null
Ownership transferred
res1 is null
res2 is not null
Resource destroyed

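Note that our overloaded operator= transferred ownership of m_ptr from res1 to res2! Consequently, we don't end up with duplicate copies of the pointer, and everything gets tidily cleaned up.

std::auto_ptr, and why it was a bad idea

Now is an appropriate time to talk about std::auto_ptr. std::auto_ptr, introduced in C++98 and removed in C++17, was C++'s first attempt at a standardized smart pointer. std::auto_ptr opted to implement move semantics just like the Auto_ptr2 class does.

However, std::auto_ptr (and our Auto_ptr2 class) has a number of problems that make using it dangerous.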

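First, because std::auto_ptr implements move semantics through the copy constructor and assignment operator, passing a std::auto_ptr by value to a function causes your resource to be moved into the function parameter (and destroyed at the end of the function, when the parameter goes out of scope). Then, when you go to access your auto_ptr variable in the caller (not realizing it was moved from and its resource deleted), you're suddenly dereferencing a null pointer!

Second, std::auto_ptr always deletes its contents using non-array delete. This means auto_ptr won't work correctly with dynamically allocated arrays, because it uses the wrong kind of deallocation. Worse, it won't prevent you from passing it a dynamic array, which it will then mismanage. Using non-array delete on an array is undefined behavior, and in practice it often shows up as memory leaks.

Finally, auto_ptr doesn't play nicely with many of the other classes in the standard library, including most of the containers and algorithms. This happens because those standard library classes assume that copying an item actually makes a copy, rather than moving it.

Because of the above shortcomings, std::auto_ptr was deprecated in C++11 and removed in C++17.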
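The first problem can be reproduced with our own Auto_ptr2 class (std::auto_ptr itself no longer exists in C++17 and later). This is a minimal sketch: the sayHi member, the passByValue function, and the callerLosesResource helper are written here for illustration.

```cpp
#include <cassert>
#include <iostream>

template <typename T>
class Auto_ptr2
{
    T* m_ptr {};
public:
    Auto_ptr2(T* ptr = nullptr) : m_ptr(ptr) {}
    ~Auto_ptr2() { delete m_ptr; }

    // "Copy" constructor that actually moves ownership
    Auto_ptr2(Auto_ptr2& a) : m_ptr(a.m_ptr)
    {
        a.m_ptr = nullptr;
    }

    // "Copy" assignment that actually moves ownership
    Auto_ptr2& operator=(Auto_ptr2& a)
    {
        if (&a == this)
            return *this;

        delete m_ptr;
        m_ptr = a.m_ptr;
        a.m_ptr = nullptr;
        return *this;
    }

    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }
    bool isNull() const { return m_ptr == nullptr; }
};

class Resource
{
public:
    Resource() { std::cout << "Resource acquired\n"; }
    ~Resource() { std::cout << "Resource destroyed\n"; }
    void sayHi() { std::cout << "Hi!\n"; }
};

void passByValue(Auto_ptr2<Resource> res)
{
    res->sayHi();
} // res is destroyed here, and it deletes the Resource

// Returns true if the caller's smart pointer is null after the call.
bool callerLosesResource()
{
    Auto_ptr2<Resource> res1(new Resource());
    assert(!res1.isNull());

    passByValue(res1); // looks like a copy, but moves ownership into the parameter

    // res1 now holds a null pointer; calling res1->sayHi() here
    // would be undefined behavior.
    return res1.isNull();
}
```

Nothing at the call site passByValue(res1) hints that res1 is being emptied, which is what makes this design so easy to misuse.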

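Moving forward

The core problem with the design of std::auto_ptr is that prior to C++11, the C++ language simply had no mechanism to differentiate "copy semantics" from "move semantics". Overriding the copy semantics to implement move semantics leads to weird edge cases and inadvertent bugs. For example, you can write res1 = res2 and have no idea whether res2 will be changed or not!

Because of this, in C++11, the concept of "move" was formally defined, and "move semantics" were added to the language to properly differentiate copying from moving. Now that we've set the stage for why move semantics can be useful, we'll explore the topic throughout the rest of this chapter. We'll also fix our Auto_ptr2 class using move semantics.

In C++11, std::auto_ptr was replaced by a set of "move-aware" smart pointers: std::unique_ptr, std::weak_ptr, and std::shared_ptr. We'll also explore the two most popular of these: unique_ptr (which is a direct replacement for auto_ptr) and shared_ptr.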
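As a brief preview, here is a hedged sketch of how std::unique_ptr avoids the first two auto_ptr problems. It assumes C++14 or later (for std::make_unique). The Resource struct and the functions consume, moveIsExplicit, and sumArray are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Resource
{
    int value { 5 };
};

// std::unique_ptr cannot be copied; this checks it at compile time.
static_assert(!std::is_copy_constructible<std::unique_ptr<Resource>>::value,
              "std::unique_ptr cannot be copied");

// Takes ownership by value; the caller must opt in with std::move.
int consume(std::unique_ptr<Resource> res)
{
    return res->value;
} // res is destroyed here and deletes the Resource

bool moveIsExplicit()
{
    std::unique_ptr<Resource> res1 { std::make_unique<Resource>() };
    assert(res1 != nullptr);

    // consume(res1);                     // would not compile: no copy constructor
    int value = consume(std::move(res1)); // the ownership transfer is visible at the call site

    return value == 5 && res1 == nullptr; // a moved-from unique_ptr is null
}

// The array form of unique_ptr uses delete[] when it is destroyed.
int sumArray()
{
    std::unique_ptr<int[]> arr { std::make_unique<int[]>(3) }; // three zero-initialized ints
    arr[0] = 1;
    arr[1] = 2;
    arr[2] = 3;
    return arr[0] + arr[1] + arr[2];
}
```

Unlike the res1 = res2 example above, a move here has to be requested with std::move, so a reader can tell from the call site that res1 may be emptied.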

A reminder

Deleting a null pointer is okay, as it does nothing.
