Monday, February 12, 2018

Python: Removing Half-Width and Full-Width Punctuation From List Items

Version information:

System version : Windows 10 64-bit
Python version : Python 3.6.0 :: Anaconda 4.3.1 (64-bit)

Code:

tokens = ['abc',',.aaa','=','』','abc']
full_punctuation = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~' + '→↓△▿⋄•!??〞#$%&』()*+,-╱︰;<=>@〔╲〕 _ˋ{∣}∼、〃》「」『』【】﹝﹞【】〝〞–—『』「」…﹏'

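# Approach 1: drop a token only when the whole token is found in full_punctuation.
# `in` on a str is a substring test, so ',.aaa' is kept with its punctuation intact.
punctuation_tokens_1 = [ele for ele in tokens if ele not in full_punctuation]
print('punctuation_tokens_1 :',punctuation_tokens_1)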

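# Approach 2: delete every punctuation character inside each token.
# Tokens made up only of punctuation become empty strings.
translator = str.maketrans('', '', full_punctuation)
punctuation_tokens_2 = [s.translate(translator) for s in tokens]
print('punctuation_tokens_2 :',punctuation_tokens_2)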

# Approach 3: drop the empty strings left behind by approach 2.
punctuation_tokens_3 = [ele for ele in punctuation_tokens_2 if ele != '']
print('punctuation_tokens_3 :',punctuation_tokens_3)

Output:

punctuation_tokens_1 : ['abc', ',.aaa', 'abc']
punctuation_tokens_2 : ['abc', 'aaa', '', '', 'abc']
punctuation_tokens_3 : ['abc', 'aaa', 'abc']
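
A possible alternative, not part of the original post: instead of maintaining the hand-written full_punctuation string, the standard-library unicodedata module can classify each character. The sketch below assumes that any character whose Unicode general category starts with 'P' (punctuation) or 'S' (symbol) should be removed. That set is broader than the list above, so it may strip characters you want to keep (currency signs, for example). The function name strip_punctuation is hypothetical.

```python
import unicodedata


def strip_punctuation(tokens):
    """Remove punctuation and symbol characters from each token, dropping empty results.

    Assumption: a character counts as punctuation when its Unicode general
    category starts with 'P' (punctuation) or 'S' (symbol).
    """
    cleaned = []
    for token in tokens:
        kept = ''.join(
            ch for ch in token
            if not unicodedata.category(ch).startswith(('P', 'S'))
        )
        if kept:  # skip tokens that consisted only of punctuation
            cleaned.append(kept)
    return cleaned


tokens = ['abc', ',.aaa', '=', '』', 'abc']
print('strip_punctuation :', strip_punctuation(tokens))
```

For the sample tokens this should give the same list as punctuation_tokens_3, because ',' and '.' are in category Po, '=' is Sm, and '』' is Pe.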
