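Cheat Sheet: 10 Python String Processing Tips (with Code)
This article is about 1,600 words; suggested reading time is 7 minutes.
This article introduces some basic operations for processing strings in Python.
1. Whitespace Stripping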
s = ' This is a sentence with whitespace. \n'
print('Strip leading whitespace: {}'.format(s.lstrip()))
print('Strip trailing whitespace: {}'.format(s.rstrip()))
print('Strip all whitespace: {}'.format(s.strip()))
Strip leading whitespace: This is a sentence with whitespace.
Strip trailing whitespace: This is a sentence with whitespace.
Strip all whitespace: This is a sentence with whitespace.
s = 'This is a sentence with unwanted characters.AAAAAAAA'
print('Strip unwanted characters: {}'.format(s.rstrip('A')))
Strip unwanted characters: This is a sentence with unwanted characters.
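One detail worth making explicit: the argument passed to strip(), lstrip(), and rstrip() is treated as a set of characters to remove, not as a literal prefix or suffix. A minimal sketch, using a hypothetical sample string that is not from the original article:

```python
# Hypothetical sample string, chosen only for illustration.
s = 'xyxHello, world!yyx'

# Every leading and trailing 'x' or 'y' is removed, in any order,
# until a character outside the set is reached.
stripped = s.strip('xy')
print('Strip a set of characters: {}'.format(stripped))
```

Because the argument is a character set, strip('xy') and strip('yx') behave identically.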
2. String Splitting
s = 'KDnuggets is a fantastic resource'
print(s.split())
['KDnuggets', 'is', 'a', 'fantastic', 'resource']
s = 'these,words,are,separated,by,comma'
print("',' separated split -> {}".format(s.split(',')))
s = 'abacbdebfgbhhgbabddba'
print("'b' separated split -> {}".format(s.split('b')))
',' separated split -> ['these', 'words', 'are', 'separated', 'by', 'comma']
'b' separated split -> ['a', 'ac', 'de', 'fg', 'hhg', 'a', 'dd', 'a']
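Two behaviors of split() tend to surprise in practice: the optional maxsplit argument, and the difference between calling split() with no argument and calling it with an explicit space. A small sketch, with hypothetical sample strings chosen only for illustration:

```python
# Hypothetical sample strings, not from the original article.
record = 'name=Ada=Lovelace'
spaced = 'a  b'  # two spaces between the letters

# maxsplit=1 stops after the first separator and keeps the rest intact.
key_value = record.split('=', maxsplit=1)
print(key_value)

# No argument: a run of whitespace counts as a single separator.
no_arg = spaced.split()
print(no_arg)

# Explicit ' ': every single space is a separator, so an empty string appears.
explicit = spaced.split(' ')
print(explicit)
```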
3. Joining List Elements into a String
s = ['KDnuggets', 'is', 'a', 'fantastic', 'resource']
print(' '.join(s))
KDnuggets is a fantastic resource
s = ['Eleven', 'Mike', 'Dustin', 'Lucas', 'Will']
print(' and '.join(s))
Eleven and Mike and Dustin and Lucas and Will
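join() only accepts strings, so a list of numbers has to be converted first. A minimal sketch, assuming a hypothetical list of integers:

```python
# Hypothetical list of integers, chosen only for illustration.
numbers = [1, 2, 3]

# ', '.join(numbers) would raise TypeError, so convert each item to str first.
joined = ', '.join(str(n) for n in numbers)
print(joined)
```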
4. String Reversal
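A minimal sketch of reversing a string, assuming the same slice idiom (s[::-1]) that the palindrome check in tip 10 relies on. The sample list is hypothetical and only there to show that the slice works on other sequences too:

```python
s = 'KDnuggets'

# A step of -1 walks the string from the last character to the first.
reversed_s = s[::-1]
print('The reverse of {} is {}'.format(s, reversed_s))

# The same slice works on lists (hypothetical sample list).
items = ['a', 'b', 'c']
reversed_items = items[::-1]
print(reversed_items)
```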
5. Case Conversion
s = 'KDnuggets'
print("'KDnuggets' as uppercase: {}".format(s.upper()))
print("'KDnuggets' as lowercase: {}".format(s.lower()))
print("'KDnuggets' as swapped case: {}".format(s.swapcase()))
'KDnuggets' as uppercase: KDNUGGETS
'KDnuggets' as lowercase: kdnuggets
'KDnuggets' as swapped case: kdNUGGETS
6. Checking for Substring Membership
s1 = 'perpendicular'
s2 = 'pen'
s3 = 'pep'
print("'pen' in 'perpendicular' -> {}".format(s2 in s1))
print("'pep' in 'perpendicular' -> {}".format(s3 in s1))
'pen' in 'perpendicular' -> True
'pep' in 'perpendicular' -> False
s = 'Does this string contain a substring?'
print("'string' location -> {}".format(s.find('string')))
print("'spring' location -> {}".format(s.find('spring')))
'string' location -> 10
'spring' location -> -1
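find() signals a miss with -1, which is easy to mistake for a valid index. As a hedged illustration of the alternatives, the sketch below uses index(), which raises ValueError on a miss, and rfind(), which searches from the right:

```python
s = 'Does this string contain a substring?'

# index() behaves like find() on a hit, but raises ValueError on a miss.
try:
    position = s.index('spring')
except ValueError:
    position = None
print("'spring' location via index() -> {}".format(position))

# rfind() searches from the right, so it reports the last occurrence.
last = s.rfind('string')
print("last 'string' location -> {}".format(last))
```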
7. Substring Replacement
s1 = 'The theory of data science is of the utmost importance.'
s2 = 'practice'
print('The new sentence: {}'.format(s1.replace('theory', s2)))
The new sentence: The practice of data science is of the utmost importance.
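replace() substitutes every occurrence by default; an optional third argument limits how many are replaced. A short sketch with a hypothetical sample sentence:

```python
# Hypothetical sample sentence, chosen only for illustration.
s = 'one fish, two fish, red fish'

# The optional third argument caps how many occurrences are replaced.
first_only = s.replace('fish', 'cat', 1)
print('Replace first occurrence only: {}'.format(first_only))

# Strings are immutable, so the original is left untouched.
print('Original: {}'.format(s))
```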
8. Combining the Output of Multiple Lists
countries = ['USA', 'Canada', 'UK', 'Australia']
cities = ['Washington', 'Ottawa', 'London', 'Canberra']
for x, y in zip(countries, cities):
    print('The capital of {} is {}.'.format(x, y))
The capital of USA is Washington.
The capital of Canada is Ottawa.
The capital of UK is London.
The capital of Australia is Canberra.
9. Anagram Check
from collections import Counter
def is_anagram(s1, s2):
    return Counter(s1) == Counter(s2)
s1 = 'listen'
s2 = 'silent'
s3 = 'runner'
s4 = 'neuron'
print("'listen' is an anagram of 'silent' -> {}".format(is_anagram(s1, s2)))
print("'runner' is an anagram of 'neuron' -> {}".format(is_anagram(s3, s4)))
'listen' is an anagram of 'silent' -> True
'runner' is an anagram of 'neuron' -> False
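The Counter comparison is case- and space-sensitive, so phrases with different capitalization or spacing would not match. One possible variant is sketched below; the function name is_anagram_normalized and the sample phrases are hypothetical and not part of the original article:

```python
from collections import Counter

def is_anagram_normalized(s1, s2):
    """Hypothetical variant: compare letter counts, ignoring case and spaces."""
    def letters(s):
        return Counter(s.replace(' ', '').lower())
    return letters(s1) == letters(s2)

print(is_anagram_normalized('Dormitory', 'dirty room'))
print(is_anagram_normalized('runner', 'neuron'))
```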
10. Palindrome Check
def is_palindrome(s):
    reverse = s[::-1]
    return s == reverse
s1 = 'racecar'
s2 = 'hippopotamus'
print("'racecar' is a palindrome -> {}".format(is_palindrome(s1)))
print("'hippopotamus' is a palindrome -> {}".format(is_palindrome(s2)))
'racecar' is a palindrome -> True
'hippopotamus' is a palindrome -> False
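The check above compares characters exactly, so a phrase with capitals or punctuation would fail even if it reads the same in both directions. A possible variant, sketched under the assumption that only letters and digits should count; the name is_palindrome_normalized and the sample phrase are hypothetical:

```python
def is_palindrome_normalized(s):
    """Hypothetical variant: ignore case and non-alphanumeric characters."""
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]

print(is_palindrome_normalized('A man, a plan, a canal: Panama'))
print(is_palindrome_normalized('hippopotamus'))
```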
Text Data Preprocessing: A Walkthrough in Python
https://www.kdnuggets.com/2018/03/text-data-preprocessing-walkthrough-python.html
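Text Preprocessing in Python: Steps, Tools, and Examples
https://www.kdnuggets.com/2018/11/text-preprocessing-python.html
A Complete Exploratory Data Analysis and Visualization for Text Data: Combining Visualization and NLP
https://www.kdnuggets.com/2019/05/complete-exploratory-data-analysis-visualization-text-data.html
Original title: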
10 Python String Processing Tips & Tricks
Original link:
https://www.kdnuggets.com/2020/01/python-string-processing-primer.html
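About the translator: Chen Zhiyan (陈之炎) graduated from Beijing Jiaotong University with a master's degree in engineering, majoring in communication and control engineering. Chen has worked as an engineer at Great Wall Computer Software and Systems and at Datang Microelectronics, and currently provides technical support at 北京吾译超群科技有限公司. Chen now operates and maintains an intelligent translation teaching system and has some experience in deep learning and natural language processing (NLP).
Reposted from the 数据派THU WeChat official account.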