Python Pandas 使用示例

文章目录

使用Boolean 选择rows
读取Excel表格里指定的sheet, 并跳过起始n行
删除只有一个元素的行
删除重复的
合并多个csv文件到excel表格中
获取csv文件的数据

使用Boolean 选择rows

import pandas as pd# Sample DataFrame
data = {'Name': ['John', 'Alice', 'Bob', 'Emily'],'Age': [25, 30, 35, 40],'City': ['New York', 'Paris', 'London', 'Tokyo']}
df = pd.DataFrame(data)
print(df)
print('================')# List of boolean values
boolean_list = [True, False, True, False]# Select rows based on the boolean list
selected_rows = df[boolean_list]
print(selected_rows)

输出：

    Name  Age      City
0   John   25  New York
1  Alice   30     Paris
2    Bob   35    London
3  Emily   40     Tokyo
================Name  Age      City
0  John   25  New York
2   Bob   35    London

读取Excel表格里指定的sheet, 并跳过起始n行

安装openpyxl, openpyxl是Python操作Excel文件的另一个库，可以非常方便地读写Excel文件的各种内容

pip install openpyxl

df = pd.read_excel(path, sheet_name= sheetName, skiprows=1)

sheet_name：指定读取某个sheet
skiprows：参数跳过起始的n行，因为可能有表头等合并单元格之类的数据

删除只有一个元素的行

假设excel表格正常的每行有4个数据(即4列)，但是某一行只有一个单元格有数据，需要删除该行（改行可能是数据或者其他内容）

df = df.dropna(thresh=2)

thresh： thresh = 2表示，如果一行至少有2个以上非空值时才会被删除

删除重复的

原始data frame

brand style  rating
0  Yum Yum   cup     4.0
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

默认删除，基于所有列删除重复行

df = df.drop_duplicates()

 brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

删除特定的重复列, 使用subset

>>> df.drop_duplicates(subset=['brand'])brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5

keep参数的使用

keep{‘first’, ‘last’, False}, default ‘first’
Determines which duplicates (if any) to keep.
‘first’ : Drop duplicates except for the first occurrence.
‘last’ : Drop duplicates except for the last occurrence.

df.drop_duplicates(subset=['brand', 'style'], keep='last')brand style  rating
1  Yum Yum   cup     4.0
2  Indomie   cup     3.5
4  Indomie  pack     5.0

合并多个csv文件到excel表格中

使用前确保安装了xlsxwriter

pip install xlsxwriter

import os
import pandas as pd# Get the directory name
directory_name = "your file path"# List all CSV files in the directory
csv_files = [f for f in os.listdir(directory_name) if f.endswith('.csv')]# Create a Pandas Excel writer using the directory name as the Excel file name
excel_file_name = os.path.basename(directory_name) + '.xlsx'
excel_writer = pd.ExcelWriter(excel_file_name, engine='xlsxwriter')# Loop through each CSV file and add it as a new sheet in the Excel file
for csv_file in csv_files:sheet_name = os.path.splitext(csv_file)[0]  # Use CSV file name as sheet name# remove the special characters of the name, here is an example# sheet_name = sheet_name.replace("[", "").replace("]", "")  # Remove "[" and "]"df = pd.read_csv(os.path.join(directory_name, csv_file))df.to_excel(excel_writer, sheet_name=sheet_name, index=False)# Save the Excel file
excel_writer.close()print(f"CSV files merged into {excel_file_name} with different sheets.")

获取csv文件的数据

data = pd.read_csv('test.csv', header=None, encoding = "utf-8").values

读取后，data如下所示
在这里插入图片描述
可以通过索引去获取某一行某一列的数据

for i in range(len(data)):print(data[i][0])print(data[i][1])print(data[i][2])

本文来自互联网用户投稿，该文观点仅代表作者本人，不代表本站立场。本站仅提供信息存储空间服务，不拥有所有权，不承担相关法律责任。如若转载，请注明出处：http://www.rhkb.cn/news/83192.html

如若内容造成侵权/违法违规/事实不符，请联系长河编程网进行投诉反馈email:809451989@qq.com，一经查实，立即删除！

Python Pandas 使用示例

文章目录

使用Boolean 选择rows

读取Excel表格里指定的sheet, 并跳过起始n行

删除只有一个元素的行

删除重复的

合并多个csv文件到excel表格中

获取csv文件的数据

相关文章

扫雷（超详解+全部码源）

centos linux 安装RDMA Soft-RoCE|虚拟机安装Soft-RoCE

Java-数据类型

Leetcode-每日一题【剑指 Offer 15. 二进制中1的个数】

postgresql|数据库|角色（用户）管理工作---授权和去权以及usage和select两种权限的区别

详解推送Git分支时发生的 cannot lock ref 错误

Python数据分析实战-格式化字符串的两种方法（附源码和实现效果）

软件安全测试包含哪些内容和方法?安全测试报告的必要性

echarts 图表设置滚动条

Mysql的instr()函数用法详解

《Linux运维实战：Docker基础总结》

Qt 6. 其他类调用Ui中的控件

Redis过期键删除策略

Spring 创建和使用

数据结构----结构--线性结构--链式存储--链表

zookeeper+kafka分布式消息队列集群的部署

C语言有关文件的操作

SAP度量单位转换功能

限流式保护器在高校宿舍电气防火的应用

Kafka与Zookeeper版本对应关系