＜C++＞ list模拟实现

目录

前言

一、list的使用

1. list的构造函数

2. list iterator的使用

3. list capacity

4. list modifiers

5. list的算法

1. unique

2. sort

3. merge

4. remove

5. splice

二、list模拟实现

1. 设置节点类 && list类

2. push_back

3. 迭代器

重载 *

重载前置 ++

重载后置++

重载前置--

重载后置--

重载 !=

重载 ==

begin() && end()

拷贝构造 && 析构函数

const迭代器

4. insert

5. erase

6. size

7. push_back && push_front && pop_back && pop_front

8. clear

9. list的析构函数

10. list的拷贝构造函数

11. empty_init

12. list的赋值运算符重载

总结

前言

前面学习了string、vector的模拟实现，其两者相似，但今天要学习的list就又有新的不同

列表是序列容器，允许在序列中的任意位置进行恒定时间插入和擦除操作，以及双向迭代。

列表容器实现为双向链表;双向链表可以将它们包含的每个元素存储在不同且不相关的存储位置。排序通过与每个元素的关联在内部保持，该链接指向它前面的元素，并链接到它后面的元素。

它们与forward_list非常相似：主要区别在于forward_list对象是单链表，因此它们只能向前迭代，以换取更小、更高效。

与其他基本标准序列容器（数组、vector和 deque）相比，列表在已经获得迭代器的容器内任何位置插入、提取和移动元素方面通常表现更好，因此在大量使用这些元素的算法（如排序算法）中也是如此。

与其他序列容器相比，lists 和 forward_lists 的主要缺点是它们无法通过其位置直接访问元素;例如，要访问列表中的第六个元素，必须从已知位置（如开始或结束）迭代到该位置，这需要它们之间的距离的线性时间。它们还会消耗一些额外的内存，以保持与每个元素关联的链接信息（这可能是大型小元素列表的重要因素）。

一、list的使用

1. list的构造函数

构造函数（constructor）	接口说明
list（）	构造空的list
list (size_type n, const value_type& val = value_type() )	构造的list中包含n个值为val的元素
list (const list& x)	拷贝构造函数
list (lnputlterator first, inputlterator last)	用 [first, last)区间中的元素构造list

2. list iterator的使用

函数声明	接口声明
begin + end	返回第一个元素的迭代器+返回最后一个元素下一个位置的迭代器
rbegin + rend	返回第一个元素的reserve_iterator，即end位置，返回最后一个元素下一个位置的reserve_iterator，即begin位置

3. list capacity

函数声明	接口说明
empty	检查list是否为空，是返回true，否则返回false
size	返回list中有效结点的个数

4. list modifiers

函数声明	接口说明
push_front	在list首元素前插入值为val的元素
pop_front	删除list中第一个元素
push_back	在list尾部插入值为val的元素
pop_back	删除list中最后一个元素
insert	在list position 位置中插入值为val的元素
erase	删除list position位置的元素
swap	交换两个list中的元素
clear	清空list中的有效元素

insert、erase 与string的 insert 不同，string的insert形参是下标，而vector、list的形参都是迭代器，但是list和vector还是有不同的，因为vector在第i个位置插入是begin() + i，而list就不能使用迭代器加数字，因为链表物理空间不连续，要实现加的代价有点大。

要实现迭代器的改变，只能循环自加（重载的++）
	std::list<int> lt;lt.push_back(1);lt.push_back(2);lt.push_back(3);lt.push_back(4);auto it = lt.begin();for (size_t i = 0; i < 4; i++){++it;}lt.insert(it, 100);for (auto e : lt){cout << e << " ";}cout << endl;
如果我们想在元素3之前插入一个数据，我们应该使用find函数来查找，但是我们可以发现vector、list都没有find函数，这是因为STL将容器和算法分开了，提出了容器公共的算法。

因为容器数据一般都是私有的，外部不能访问，但是各容器都是通过迭代器访问的，C++通过迭代器提取出了容器公共的算法，迭代器没有暴露原容器的细节，不管是树还是数组...，都是通过同样的方式访问
	it = c.begin();while (it != c.end()){cout << *it << " ";}
循环条件也是关键，并不是 it < c.end( ) 而是 it != c.end( )，这是因为容器的存储方式不一定连续存储

【first，last)

所以公共算法是通过迭代器实现的，例如find函数
template<class InputIterator, class T>InputIterator find (InputIterator first, InputIterator last, const T& val)
{while (first!=last) {if (*first==val) return first;++first;}return last;
}
例如：
auto it = find(lt.begin(), lt.end(), 3);
if (it != lt.end())
{lt.insert(it, 100);
}for (auto e : lt)
{cout << e << " ";
}cout << endl;
因为list不需要扩容，节点的地址不改变，所以list的insert就不存在失效的问题

同理分析erase，迭代器指向的节点被释放，那么迭代器一定是失效的

5. list的算法

迭代器从功能上来说，分为：单向、双向、随机三类迭代器，它是由容器的底层结构决定的

单向：只能++ forward_list / 哈希

双向：可以++ / -- list / map / set

随机：++ / -- / + / - vector / string / deque

单向迭代器访问自由度小于双向小于随机。所以，单向迭代器可以使用的情况，双向、随机迭代器都可以访问。

对于不同的容器，迭代器种类不同，是否使用alrorithm的算法情况也不同，例如sort函数，实现原理是快排，sort的形参只能使用随机迭代器

	std::list<int> lt;lt.push_back(1);lt.push_back(2);lt.push_back(3);lt.push_back(4);lt.push_back(5);//错误，随机sort需要随即迭代器，而list是双向迭代器//sort(lt.begin(), lt.end());for (auto e : lt)cout << e << " ";cout << endl;lt.reverse();lt.sort();for (auto e : lt)cout << e << " ";cout << endl;

list的sort实现原理是归并排序，但是一旦数据量增大，list的sort就不合适了，效率还不如转到vector，vector排序后再转回list。

所以，容器所提供的接口都是有讲究的，vector不提供专门的头插头删（只是复用insert），是因为如果它频繁的头插头删那么就不应该使用vector，而是list。

再比如，list的sort，因为迭代器的原因，list需要自己实现sort，但是list它不是一个适合多次排序的容器，vector等随机迭代器才适合sort的快排算法，所以list的sort只适合少量数据的排序，数据量如果很大，那么就考虑更改容器来提高效率。

1. unique

去重函数，注意要先排序，不排序去重效率很低
使用双指针进行去重操作

2. sort

双向迭代器

3. merge

两个有序链表的合并
第二个参数是仿函数

4. remove

等效于find + erase
如果没找到，什么也不做，不会报错

5. splice

转接链表，把 list1 的节点取下来，接到 list2 上 ’
可以转接到自身，但是不能重叠，重叠会陷入死循环

三个重载函数功能：

        1. 将链表x全部转移到position之前位置

        2. 将链表x的 i 迭代器处节点转移到position之前位置，即只转移一个节点

        3. 将链表x从first到last范围的节点全部转移到position之前位置

例：
int main()
{std::list<int> mylist1, mylist2;std::list<int>::iterator it;// set some initial values:for (int i = 1; i <= 4; ++i)mylist1.push_back(i);      // mylist1: 1 2 3 4for (int i = 1; i <= 3; ++i)mylist2.push_back(i * 10);   // mylist2: 10 20 30it = mylist1.begin();++it;                         // points to 2//将list2全部接在list1的2之前mylist1.splice(it, mylist2);//部分转移mylist1.splice(it, mylist2, ++mylist2.begin());mylist1.splice(it, mylist2, ++mylist2.begin(), mylist2.end());//转移到自身mylist1.splice(mylist1.begin(), mylist1, ++mylist1.begin(), mylist2.end());return 0;
}

二、list模拟实现

1. 设置节点类 && list类

list_node采用struct类，因为struct默认是公有，用class也行，但是要加上public，如果不加公有条件，后续使用十分麻烦，还要搞友元
template<class T>
struct list_node
{list_node<T>* _next;list_node<T>* _prev;T _val;
};
C++的struct已经成为类，并且还有模板，所以list_node不是类型，list_node<T>才是类型，为了避免书写时忘记<T>，我们重命名
list成员变量即为头节点
	private:Node* _head;

namespace my_list
{template<class T>struct list_node{list_node<T>* _next;list_node<T>* _prev;T _val;};template<class T>class list{typedef list_node<T> Node;private:Node* _head;};

2. push_back

开辟新节点，新节点类型是一个类，new的时候会调用默认构造，我们再struct类中定义构造函数来构造新节点

namespace my_list
{template<class T>struct list_node{list_node<T>* _next;list_node<T>* _prev;T _val;//默认构造list_node(const T& val = T()):_next(nullptr),_prev(nullptr),_val(val){}};template<class T>class list{typedef list_node<T> Node;public:list(){_head = new Node;_head->_prev = _head;_head->_next = _head;}void push_back(const T& x){Node* tail = _head->_prev;Node* newnode = new Node(x);tail->_next = newnode;newnode->_prev = tail;newnode->_next = _head;_head->_prev = newnode;}private:Node* _head;};
}

3. 迭代器

思考：直接将 Node* 作为 iterator 可以吗？

答案：不能！

原因：

1. 首先，Node*作为迭代器解引用得到的是结构体，不是数据

2. 其次，节点地址不连续，迭代器++不能移动到下一个节点

之前的vector、string的元素指针天生就有两点优势，所以它们的迭代器就用指针实现

解决：

使用运算符重载！将iterator封装为一个类，重载运算符 ++ 和 * 。iterator是自定义类型，它的成员是一个节点的指针
list<int>::iterator it = lt.begin();
属于类域的是在类内部定义的，一种为typedef的内嵌类型，另一种是内部类

	template<class T>struct __list_iterator{typedef list_node<T> Node;//成员Node* _node;//默认构造__list_iterator(Node* _node):_node(node){}};

在 list 内部，我们将__list_iterator<T> 重定义为 iterator，切记用public修饰，因为我们需要在类外部使用迭代器
之所以将迭代器命名为__list_iterator<T>是为了避免重名，因为后面的容器都是需要iterator的

重载 *

迭代器解引用应为元素值

		//引用返回，因为节点生命周期不在该函数内//可读可写版本T& operator*(){return _node->_val;}

重载前置 ++

返回this解引用

		__list_iterator<T>& operator++(){_node = _node->_next;return *this;}

重载后置++

		//后置++__list_iterator<T> operator++(int){__list_iterator<T> tmp(*this);_node = _node->_next;return tmp;}

重载前置--

		//前置--self& operator--(){_node = _node->_prev;return *this;}

重载后置--

		//后置--self operator--(int){self tmp(*this);_node = _node->_prev;return tmp;}

重载 !=

这里注意const，因为比较!=时，右边时end(), 该函数返回的是临时对象，临时对象具有常性，所以需要const修饰

		//这里注意const，因为比较!=时，右边时end(),该函数返回的是临时对象，临时对象具有常性bool operator!=(const __list_iterator<T>& it){return _node != it._node;}

重载 ==

复用即可

		bool operator==(const __list_iterator<T>& it){return _node == it._node;}

begin() && end()

返回指针类型会隐式类型转换为iterator类，我们也可以直接返回匿名对象

	template<class T>struct __list_iterator{typedef list_node<T> Node;//成员Node* _node;//默认构造__list_iterator(Node* node):_node(node){}//引用返回，因为节点生命周期不在该函数内//可读可写版本T& operator*(){return _node->_val;}__list_iterator<T>& operator++(){_node = _node->_next;return *this;}//这里注意const，因为比较!=时，右边时end(),该函数返回的是临时对象，临时对象具有常性bool operator!=(const __list_iterator<T>& it){return _node != it._node;}};template<class T>class list{typedef list_node<T> Node;public://迭代器要公有，让外面可以使用typedef __list_iterator<T> iterator;iterator begin(){//由指针类型隐式转换为iterator类//return _head->_next;//也可以用匿名对象return iterator(_head->_next);}iterator end(){return iterator(_head);}list(){_head = new Node;_head->_prev = _head;_head->_next = _head;}private:Node* _head;};}

拷贝构造 && 析构函数

list<int>::iterator it = lt.begin();
1. 这里就是拷贝构造 it ，但是成员是内置类型Node*，浅拷贝正确吗？

我们要的就是那个节点的指针，就是需要浅拷贝来访问该节点，所以对于iterator的拷贝构造我们仅仅就是需要浅拷贝即可，不需要深拷贝，所以编译器默认生成的就足够使用。

2. 但是，我们要思考，多个迭代器对象指向同一个节点，会导致程序崩溃吗？

浅拷贝导致程序崩溃的原因就是多次析构，但是每次使用完迭代器，就需要析构该节点吗？

不可以！节点不属于迭代器，迭代器无权释放节点，迭代器仅仅是访问修改功能，有权限释放节点的是 list 的析构函数

.

所以，在实际的工作中，对于内置类型指针，需要浅拷贝or深拷贝是根据目标来决定的

const迭代器

如何设计const迭代器？
typedef const __list_iterator<T> const_iterator;
这样设计可以吗？

        不可以！如果这样设计，迭代器本身就不能修改了。

        我们想实现的是迭代器指向的内容不能被修改，而迭代器本身可以++修改指向，如果使用上面的迭代器，那么我们就不能修改迭代器了，因为const迭代器对象就不能调用++函数，因为++函数没有const修饰，也不能加上const修饰，因为函数内要进行修改

        正确设计应将解引用重载函数的返回值设为const T&，

但是又来了一个问题，我们需要再写一个除了解引用重载函数外一模一样的const_iterator类吗？

        代码冗余！

我们来看stl的解决方案：

因为只有解引用的返回值不同，所以在这里我们可以使用模板参数控制返回值类型
对 __list_iterator<T, Ref> 重命名为self，方便日后再次修改模板时，不用再去修改各成员函数的返回值
	//typedef __list_iterator<T, T&> iterator;//typedef __list_iterator<T, const T&> const_iterator;//对于const迭代器和普通迭代器，这属于两个类，根据模板生成相应的类template<class T, class Ref>struct __list_iterator{typedef list_node<T> Node;typedef __list_iterator<T, Ref> self;//成员变量Node* _node;__list_iterator(Node* node):_node(node){}//引用返回，因为节点生命周期不在该函数内//可读可写版本//Ref根据不同的参数，返回相应权限的引用Ref operator*(){return _node->_val;}//前置++self& operator++(){_node = _node->_next;return *this;}//后置++self operator++(int){self tmp(*this);_node = _node->_next;return tmp;}//这里注意const，因为比较!=时，右边时end(),该函数返回的是临时对象，临时对象具有常性bool operator!=(const self& it){return _node != it._node;}bool operator==(const self& it){return _node == it._node;}};
我们再来升级：

可以看到STL的list模板有三个类型，并且list的重载函数重载了->。

问：为什么要重载 ->呢？*解引用得不到数据吗？

答：是的，对于其他的自定义类型，operator*确实得不到其中数据，例如
struct A
{A(int a1 = 0, int a2 = 0):_a1(a1),_a2(a2){}int _a1;int _a2;
};void test()
{list<A> lt;lt.push_back(A(1, 1));lt.push_back(A(2, 2));lt.push_back(A(3, 3));lt.push_back(A(4, 4));list<A>::iterator it = lt.begin();while (it != lt.end()){cout << *it << " ";++it;}cout << endl;
}
编译错误，对于*it，解引用的结果是结构体，如果想要输出，要自己重载流插入和流提取，对于私有的变量，是需要自己重载，但是对于公有的变量，我们可以用 ->或(*it)._a1来访问数据。用 -> 是因为迭代器模拟的就是指针

但是，it -> _a1本质其实是it -> -> _a1，因为第一个->是运算符重载，返回的是一个指针，后面没有运算符了，所以是访问不了数据的，能访问是因为迭代器特殊处理，省略了一个->

由于operator->返回值有两种类型，所以我们又加了一个模板参数Ptr
template<class T, class Ref, class Ptr>
所以对于复杂的代码，很难一步看懂，先省略细节，往后看

4. insert

在pos之前插入，由于双向链表特性，pos在哪里都可以，即使是头节点，那就是尾插
返回插入后的迭代器

		//在pos之前插入，由于双向链表特性，pos哪里都可以，即使是头节点iterator insert(iterator pos){//这里就体现了struct比class优点，class是需要封装为私有的，后面还要搞友元Node* cur = pos._node;Node* prev = cur->_prev;Node* newnode = new Node(x);prev->_next = newnode;newnode->_next = cur;cur->_prev = newnode;newnode->_prev = prev;++_size;return newnode;}

5. erase

erase就需要判断pos是否合法
返回删除后的下一个迭代器

		iterator erase(iterator pos){assert(pos != end());Node* cur = pos._node;Node* prev = cur->_prev;Node* next = cur->_next;prev->_next = next;next->_prev = prev;delete cur;--_size;return next;}

6. size

可以遍历一遍，时间复杂度是O(N)
也可以加一个_size成员变量，记录size

		size_t size(){/*size_t sz = 0;iterator it = begin();while (it != end()){++sz;++it;}return sz;*///也可增加一个_size的成员变量return _size;}

7. push_back && push_front && pop_back && pop_front

全部复用insert、erase即可

		void push_back(const T& x){/*Node* tail = _head->_prev;Node* newnode = new Node(x);tail->_next = newnode;newnode->_prev = tail;newnode->_next = _head;_head->_prev = newnode;*/insert(end(), x);}void push_front(const T& x){insert(beign(), x);}void pop_back(){erase(--end());}void pop_front(){erase(begin());}

8. clear

清除数据，除了头节点

		//清数据，不需要清除头节点void clear(){iterator it = beign();while (it != end()){//不要++it，迭代器失效it = erase(it);}_size = 0;}

9. list的析构函数

复用clear函数

		~list(){//复用clearclear();delete _head;_head = nullptr;}

10. list的拷贝构造函数

逐个push_back即可

		//拷贝构造list(const list<T>& lt){_head = new Node;_head->_prev = _head;_head->_next = _head;_size = 0;for (auto& e : lt)	//引用是为了防止数组类型很大{push_back(e);}}

11. empty_init

因为构造和拷贝构造都需要四行初始化，所以我们封装为empty_init函数

		void empty_init(){_head = new Node;_head->_prev = _head;_head->_next = _head;_size = 0;}

12. list的赋值运算符重载

现代写法

		void swap(list<T>& lt){std::swap(_head, lt._head);std::swap(_size, lt._size);}list<T>& operator=(list<T> lt)	//赋值运算符重载的现代写法{swap(lt);return *this;}

注意：对于拷贝构造和赋值运算符重载，STL的list中这两个函数声明时，返回值是list&，而我们写的是list<T>&，list<T>是类型，list是类名，写类型是必然正确的，但是这里写类名也对，编译器也可以识别。

类模板在类里既可以写类名又可以写类型，但是不推荐缩写类型名

整体代码

#pragma once
#include<assert.h>
#include<iostream>
#include<algorithm>namespace my_list
{template<class T>struct list_node{//成员list_node<T>* _next;list_node<T>* _prev;T _val;//默认构造list_node(const T& val = T()):_next(nullptr),_prev(nullptr),_val(val){}};//typedef __list_iterator<T, T&> iterator;//typedef __list_iterator<T, const T&> const_iterator;//对于const迭代器和普通迭代器，这属于两个类，根据模板生成相应的类template<class T, class Ref, class Ptr>struct __list_iterator{typedef list_node<T> Node;typedef __list_iterator<T, Ref, Ptr> self;//成员变量Node* _node;__list_iterator(Node* node):_node(node){}//引用返回，因为节点生命周期不在该函数内//可读可写版本//Ref根据不同的参数，返回相应权限的引用Ref operator*(){return _node->_val;}//第三个模板参数，T* 和 const T*Ptr operator->(){return &(_node->_val);}//前置++self& operator++(){_node = _node->_next;return *this;}//后置++self operator++(int){self tmp(*this);_node = _node->_next;return tmp;}//前置--self& operator--(){_node = _node->_prev;return *this;} //后置--self operator--(int){self tmp(*this);_node = _node->_prev;return tmp;}//这里注意const，因为比较!=时，右边时end(),该函数返回的是临时对象，临时对象具有常性bool operator!=(const self& it){return _node != it._node;}bool operator==(const self& it){return _node == it._node;}};template<class T>class list{typedef list_node<T> Node;public:typedef __list_iterator<T, T&, T*> iterator;	//迭代器要公有，让外面可以使用typedef __list_iterator<T, const T&, const T*> const_iterator;iterator begin(){//由指针类型隐式转换为iterator类//return _head->_next;//也可以用匿名对象return iterator(_head->_next);}iterator end(){return iterator(_head);}const_iterator beign() const{return const_iterator(_head->next);}const_iterator end() const{return const_iterator(_head);}void empty_init(){_head = new Node;_head->_prev = _head;_head->_next = _head;_size = 0;}list(){/*_head = new Node;_head->_prev = _head;_head->_next = _head;_size = 0;*///因为构造和拷贝构造都需要这四行初始化，所以我们封装为empty_initempty_init();}//拷贝构造list(const list<T>& lt){/*_head = new Node;_head->_prev = _head;_head->_next = _head;_size = 0;*/empty_init();for (auto& e : lt)	//引用是为了防止数组类型很大{push_back(e);}}void swap(list<T>& lt){std::swap(_head, lt._head);std::swap(_size, lt._size);}//list<T>是类型list<T>& operator=(list<T> lt)	//赋值运算符重载的现代写法{swap(lt);return *this;}~list(){//复用clearclear();delete _head;_head = nullptr;}//清数据，不需要清除头节点void clear(){iterator it = beign();while (it != end()){//不要++it，迭代器失效it = erase(it);}_size = 0;}void push_back(const T& x){/*Node* tail = _head->_prev;Node* newnode = new Node(x);tail->_next = newnode;newnode->_prev = tail;newnode->_next = _head;_head->_prev = newnode;*/insert(end(), x);}void push_front(const T& x){insert(beign(), x);}void pop_back(){erase(--end());}void pop_front(){erase(begin());}//在pos之前插入，由于双向链表特性，pos哪里都可以，即使是头节点iterator insert(iterator pos){//这里就体现了struct比class优点，class是需要封装为私有的，后面还要搞友元Node* cur = pos._node;Node* prev = cur->_prev;Node* newnode = new Node(x);prev->_next = newnode;newnode->_next = cur;cur->_prev = newnode;newnode->_prev = prev;++_size;return newnode;}iterator erase(iterator pos){assert(pos != end());Node* cur = pos._node;Node* prev = cur->_prev;Node* next = cur->_next;prev->_next = next;next->_prev = prev;delete cur;--_size;return next;}size_t size(){/*size_t sz = 0;iterator it = begin();while (it != end()){++sz;++it;}return sz;*///也可增加一个_size的成员变量return _size;}private:Node* _head;size_t _size;};}