Python string objects provide many methods for manipulating strings, and they are quite rich in functionality.


[..........'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

The usage of these methods is described in the official documentation (string methods). This article explains them in detail, so you can use it as a reference manual later.

None of the methods here does pattern matching (regular expressions). To manipulate strings with pattern matching in Python, import the re module with import re. For regular expression matching, see: re Module Contents.

Note that in Python a string is an immutable object, so every method that appears to modify a string actually creates a new string object in a separate piece of memory. For example, 'abc'.upper() allocates new memory and stores the returned 'ABC' there.
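A minimal sketch of what immutability means in practice (the variable names s and t are only illustrative): upper() leaves the original untouched and hands back a different object, and assigning to a single position fails.

```python
s = 'abc'
t = s.upper()          # builds and returns a new string

print(s)               # the original is unchanged
print(t)
print(s is t)          # False: two distinct objects

# Strings do not support item assignment
try:
    s[0] = 'A'
    assigned = True
except TypeError as err:
    assigned = False
    print('TypeError:', err)
```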

The "S" that appears below stands for the string being operated on. This article does not cover casefold, encode, format, or format_map: the first two are related to Unicode, and the latter two involve too much material.


1. case conversion

1.1 lower, upper

S.lower()
S.upper()

Return the lowercase and uppercase forms of the string S. (Note that each result is a newly created string stored in separate memory; this will not be repeated below.)

For example:

>>> print('ab XY'.lower())
ab xy
>>> print('ab XY'.upper())
AB XY

1.2 title, capitalize

S.title()
S.capitalize()

The former returns a copy of S in which the first letter of every word is uppercase and all other letters are lowercase. The latter returns a new string in which only the first letter is uppercase and all other letters are lowercase.

For example:

>>> print('ab XY'.title())
Ab Xy
>>> print('abc DE'.capitalize())
Abc de

1.3 swapcase

S.swapcase()

swapcase() swaps the case of every letter in S (uppercase --> lowercase, lowercase --> uppercase).

>>> print('abc XYZ'.swapcase())
ABC xyz

2. isXXX judgment

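2.1 isalpha, isdecimal, isdigit, isnumeric, isalnum

S.isdecimal()
S.isdigit()
S.isnumeric()
S.isalpha()
S.isalnum()

Test whether the string S consists of digits, of letters, or of letters and digits. For non-Unicode (plain ASCII) strings, the three digit tests isdecimal, isdigit and isnumeric are equivalent.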

For example:

>>> print('34'.isdigit())
True
>>> print('abc'.isalpha())
True
>>> print('a34'.isalnum())
True
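The three digit tests only diverge on non-ASCII characters. The following sketch, with sample characters chosen purely for illustration, prints the three results side by side for an ordinary digit, a superscript two, the fraction one half, and the CJK numeral two.

```python
# Tuple order: (isdecimal, isdigit, isnumeric)
samples = ['5', '\u00b2', '\u00bd', '\u4e8c']   # 5, superscript two, one half, CJK two

results = {ch: (ch.isdecimal(), ch.isdigit(), ch.isnumeric()) for ch in samples}
for ch, flags in results.items():
    print(repr(ch), flags)
```

Only '5' passes all three tests; the superscript fails isdecimal, and the last two pass isnumeric alone.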

2.2 islower, isupper, istitle

S.islower()
S.isupper()
S.istitle()

Test whether S is all lowercase, all uppercase, or titlecased. S must contain at least one cased character (a letter); otherwise False is returned directly. For example, S cannot consist of digits only.

Note that istitle() decides where each word starts by looking at word boundaries. For example, word1 Word2, word1_Word2 and word1()Word2 each contain two words, whose initial letters are "w" and "W". Testing any of them with istitle() therefore returns False, because w is lowercase.

For example:

>>> print('a34'.islower())
True
>>> print('AB'.isupper())
True
>>> print('Aa'.isupper())
False
>>> print('Aa Bc'.istitle())
True
>>> print('Aa_Bc'.istitle())
True
>>> print('Aa bc'.istitle())
False
>>> print('Aa_bc'.istitle())
False

# The following returns False, because the non-initial letter C is not lowercase.
>>> print('Aa BC'.istitle())
False

2.3 isspace, isprintable, isidentifier

S.isspace()
S.isprintable()
S.isidentifier()

Test whether a string is whitespace (spaces, tabs, newlines, etc.), whether it is printable (tabs and newlines are not printable characters, but a space is), and whether it satisfies the rules for an identifier.

For example:

  1. Test whether it is whitespace. An empty string is not whitespace.

    >>> print(' '.isspace())
    True
    >>> print(' \t'.isspace())
    True
    >>> print('\n'.isspace())
    True
    >>> print(''.isspace())
    False
    >>> print('Aa BC'.isspace())
    False
  2. Test whether it is printable.

    >>> print('\n'.isprintable())
    False
    >>> print('\t'.isprintable())
    False
    >>> print('acd'.isprintable())
    True
    >>> print(' '.isprintable())
    True
    >>> print(''.isprintable())
    True
  3. Test whether the rules for an identifier are satisfied.
    The rules are as follows: an identifier must begin with a letter or an underscore, and may not contain any character other than digits, letters and underscores.

    >>> print('abc'.isidentifier())
    True
    >>> print('2abc'.isidentifier())
    False
    >>> print('abc2'.isidentifier())
    True
    >>> print('_abc2'.isidentifier())
    True
    >>> print('_abc_2'.isidentifier())
    True
    >>> print('_Abc_2'.isidentifier())
    True
    >>> print('Abc_2'.isidentifier())
    True

3. filling

3.1 center

S.center(width[, fillchar])

Center the string and pad both sides with fillchar so that the total length is width. fillchar defaults to a space. If width is less than or equal to the length of the string, no padding is possible and the string itself is returned (no new string object is created).

For example:

  1. Pad with underscores and center the string

    >>> print('ab'.center(4,'_'))
    _ab_
    >>> print('ab'.center(5,'_'))
    __ab_
  2. Pad with the default space and center the string

    >>> print('ab'.center(4))
     ab 
    >>> print(len('ab'.center(4)))
    4
  3. width less than the string length

    >>> print('abcde'.center(3))
    abcde

3.2 ljust and rjust

S.ljust(width[, fillchar])
S.rjust(width[, fillchar])

ljust() pads the right side of the string S with fillchar so that the total length is width; rjust() pads the left side. If fillchar is not specified, a space is used by default.

If width is less than or equal to the length of S, no padding is possible and S itself is returned (no new string object is created).

For example:

>>> print('xyz'.ljust(5,'_'))
xyz__
>>> print('xyz'.rjust(5,'_'))
__xyz

3.3 zfill

S.zfill(width)

Pad the left side of the string S with 0 so that its length is width. If S begins with a sign, + or -, the zeros are inserted after the sign, and the sign counts toward the length.

If width is less than or equal to the length of S, no padding is possible and S itself is returned (no new string object is created).

>>> print('abc'.zfill(5))
00abc
>>> print('-abc'.zfill(5))
-0abc
>>> print('+abc'.zfill(5))
+0abc
>>> print('42'.zfill(5))
00042
>>> print('-42'.zfill(5))
-0042
>>> print('+42'.zfill(5))
+0042
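The padding methods are typically combined to line up columns of text. A small sketch, assuming some made-up rows of (name, quantity) data:

```python
rows = [('apple', 3), ('kiwi', 12), ('fig', 125)]   # made-up sample data

lines = []
for name, qty in rows:
    # name left-aligned in 8 columns, quantity zero-padded to 4 digits
    lines.append(name.ljust(8, '.') + str(qty).zfill(4))

for line in lines:
    print(line)
```

Every line comes out 12 characters wide, so the numbers form a straight column.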

4. substring search

4.1 count

S.count(sub[, start[, end]])

Return the number of times the substring sub occurs in the string S. You can specify where to start counting (start) and where to stop (end); indexes start at 0 and the end boundary is not included.

For example:

>>> print('xyabxyxy'.count('xy'))
3

# The count is 2, because starting from index=1, that is, from 'y', the search range is 'yabxyxy'.
>>> print('xyabxyxy'.count('xy',1))
2

# The count is 1, because end is not included, so the search range is 'yabxyx'
>>> print('xyabxyxy'.count('xy',1,7))
1

# The count is 2, because the search range is 'yabxyxy'
>>> print('xyabxyxy'.count('xy',1,8))
2
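One detail the examples above do not show is that count() only counts non-overlapping occurrences. The sketch below contrasts it with a hypothetical helper, count_overlapping (not part of str), that advances the start position one character at a time.

```python
text = 'aaaa'
plain = text.count('aa')     # matches at index 0 and 2 only
print(plain)

def count_overlapping(s, sub):
    """Hypothetical helper: count matches that are allowed to overlap."""
    total = 0
    pos = s.find(sub)
    while pos != -1:
        total += 1
        pos = s.find(sub, pos + 1)   # restart just after the previous match began
    return total

overlapping = count_overlapping(text, 'aa')   # matches at index 0, 1 and 2
print(overlapping)
```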

4.2 endswith and startswith

S.endswith(suffix[, start[, end]])
S.startswith(prefix[, start[, end]])

endswith() checks whether the string S ends with suffix and returns the Boolean value True or False. suffix may be a tuple. You can specify the search boundaries start and end.

Similarly, startswith() tests whether the string S begins with prefix.

For example:

  1. When suffix is an ordinary string.

    >>> print('abcxyz'.endswith('xyz'))
    True
    # False, because the search range is 'yz'
    >>> print('abcxyz'.endswith('xyz',4))
    False
    # False, because the search range is 'abcxy'
    >>> print('abcxyz'.endswith('xyz',0,5))
    False
    >>> print('abcxyz'.endswith('xyz',0,6))
    True
  2. When suffix is a tuple, True is returned if any element of the tuple satisfies endswith.

    # 'xyz' in the tuple satisfies the condition
    >>> print('abcxyz'.endswith(('ab','xyz')))
    True
    # Neither 'ab' nor 'xy' in the tuple satisfies the condition
    >>> print('abcxyz'.endswith(('ab','xy')))
    False
    # 'z' in the tuple satisfies the condition
    >>> print('abcxyz'.endswith(('ab','xy','z')))
    True

4.3 find, rfind and index, rindex

S.find(sub[, start[, end]])
S.rfind(sub[, start[, end]])
S.index(sub[, start[, end]])
S.rindex(sub[, start[, end]])

find() searches the string S for the substring sub. If sub is found, its index position is returned; otherwise -1 is returned. You can specify the search boundaries start and end.

index() is like find(); the only difference is that when the substring is not found, it raises a ValueError.

rfind() returns the position of the rightmost match. If the substring occurs only once or not at all, rfind() is equivalent to find().

rindex() relates to index() in the same way.

For example:

>>> print('abcxyzXY'.find('xy'))
3
>>> print('abcxyzXY'.find('Xy'))
-1
>>> print('abcxyzXY'.find('xy',4))
-1

>>> print('xyzabcabc'.find('bc'))
4
>>> print('xyzabcabc'.rfind('bc'))
7

>>> print('xyzabcabc'.rindex('bcd'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found

You can use the in operator to test whether the string S contains the substring sub; it returns a Boolean value rather than an index position.

>>> 'xy' in 'abxycd'
True
>>> 'xyz' in 'abxycd'
False
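The different "not found" conventions of find() and index() call for different handling. A short sketch (the names pos and pos2 are only illustrative), including one pitfall of using the -1 from find() without checking it:

```python
text = 'xyzabcabc'

# find() signals "not found" with -1, so test the result explicitly
pos = text.find('bcd')
if pos == -1:
    print('not found')

# index() raises instead, so wrap it in try/except
try:
    pos2 = text.index('bcd')
except ValueError:
    pos2 = None

# Pitfall: -1 is a valid index, so an unchecked result silently
# selects the last character of the string
last = text[text.find('bcd')]
print(last)
```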

5. replacement

5.1 replace

S.replace(old, new[, count])

Replace the substring old in the string S with new. If count is given, only the first count occurrences of old are replaced. If old is not found in S, nothing is replaced and S itself is returned (no new string object is created).

>>> print('abcxyzoxy'.replace('xy','XY'))
abcXYzoXY
>>> print('abcxyzoxy'.replace('xy','XY',1))
abcXYzoxy
>>> print('abcxyzoxy'.replace('mn','XY',1))
abcxyzoxy

5.2 expandtabs

S.expandtabs(tabsize=8)

Replace each \t in the string S with a number of spaces that depends on the tab size N (the tabsize argument). The default is N=8.

Note that expandtabs(8) does not simply replace \t with 8 spaces. For example, 'xyz\tab'.expandtabs() replaces \t with 5 spaces, because "xyz" already occupies 3 character positions.

In addition, it does not replace newline characters (\n or \r); the column count restarts after each of them.

For example:

>>> '01\t012\t0123\t01234'.expandtabs(4)
'01  012 0123    01234'

>>> '01\t012\t0123\t01234'.expandtabs(8)
'01      012     0123    01234'

>>> '01\t012\t0123\t01234'.expandtabs(7)
'01     012    0123   01234'

>>> print('012\t0123\n01234'.expandtabs(7))
012    0123
01234
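The column-based rule can be made concrete with a small re-implementation. This is only an illustrative sketch, not how Python implements it internally; expand_tabs is a hypothetical name, and the result is checked against the built-in method.

```python
def expand_tabs(s, tabsize=8):
    """Hypothetical re-implementation of str.expandtabs, for illustration only."""
    out = []
    col = 0                                   # current column within the line
    for ch in s:
        if ch == '\t':
            if tabsize > 0:
                pad = tabsize - col % tabsize  # spaces needed to reach the next tab stop
                out.append(' ' * pad)
                col += pad
        elif ch in '\n\r':
            out.append(ch)                    # copied unchanged; the column restarts
            col = 0
        else:
            out.append(ch)
            col += 1
    return ''.join(out)

sample = '01\t012\t0123\t01234'
print(expand_tabs(sample, 4) == sample.expandtabs(4))
print(repr(expand_tabs('xyz\tab')))
```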

5.3 translate and maketrans

static str.maketrans(x[, y[, z]])

str.maketrans() generates a character mapping table; translate(table) then uses that table to map each character in the string S.

If you are familiar with Linux, you will know the tr command; translate() works much like tr.

For example, suppose you want to apply a simple encryption to "I love fairy" by replacing some of its characters with digits, so that other people cannot tell what it means after the conversion.

>>> in_str='abcxyz'
>>> out_str='123456'

# maketrans() generates the mapping table
>>> map_table=str.maketrans(in_str,out_str)

# Map the characters with translate()
>>> my_love='I love fairy'
>>> result=my_love.translate(map_table)
>>> print(result)
I love f1ir5

Note that in maketrans(x[, y[, z]]), x and y are both strings and must be of equal length.

If maketrans(x[, y[, z]]) is given the third parameter z, every character in that string is mapped to None, which means it is deleted.

For example, delete "a" and "y" instead of replacing them.

>>> in_str='abcxyz'
>>> out_str='123456'
>>> map_table=str.maketrans(in_str,out_str,'ay')
>>> my_love='I love fairy'
>>> result=my_love.translate(map_table)
>>> print(result)
I love fir
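maketrans() also accepts a single dictionary argument, which allows one character to map to a longer string or to None. A brief sketch with an arbitrary mapping chosen for illustration:

```python
# One-argument form: keys are single characters, values are
# replacement strings (of any length) or None (delete the character)
table = str.maketrans({'a': '1', 'y': None, 'o': '00'})
result = 'I love fairy'.translate(table)
print(result)

# The table itself is a plain dict keyed by Unicode code points
two_arg_table = str.maketrans('ab', '12')
print(two_arg_table)
```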

6. segmentation

6.1 partition and rpartition

S.partition(sep)
S.rpartition(sep)

Search the string S for the substring sep and split S at sep, returning a tuple of 3 elements: the part to the left of sep is the first element, sep itself is the second element, and the part to the right of sep is the third element.

partition(sep) splits at the first sep from the left; rpartition(sep) splits at the first sep from the right.

If sep is not found, two of the 3 elements in the returned tuple are empty: for partition() the last two elements are empty, and for rpartition() the first two.

For example:

# When sep occurs only once, both give the same result
>>> print('abcxyzopq'.partition('xy'))
('abc', 'xy', 'zopq')
>>> print('abcxyzopq'.rpartition('xy'))
('abc', 'xy', 'zopq')

# When sep occurs several times, each splits at the first sep from its own side.
>>> print('abcxyzxyopq'.partition('xy'))
('abc', 'xy', 'zxyopq')
>>> print('abcxyzxyopq'.rpartition('xy'))
('abcxyz', 'xy', 'opq')

# sep is not found
>>> print('abcxyzxyopq'.partition('xyc'))
('abcxyzxyopq', '', '')
>>> print('abcxyzxyopq'.rpartition('xyc'))
('', '', 'abcxyzxyopq')
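Because partition() always returns exactly three elements, it is convenient for splitting a line at its first separator. A sketch with a hypothetical helper, parse_setting, that reads "key=value" lines:

```python
def parse_setting(line):
    """Hypothetical helper: split 'key=value' at the first '='."""
    key, sep, value = line.partition('=')
    if not sep:                    # sep is '' when '=' was not found
        return key.strip(), None
    return key.strip(), value.strip()

print(parse_setting('path = /usr/bin=x'))   # later '=' characters stay in the value
print(parse_setting('verbose'))             # no '=' at all
```

Unlike split('='), the tuple can be unpacked without first checking how many pieces came back.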

6.2 split, rsplit and splitlines

S.split(sep=None, maxsplit=-1)
S.rsplit(sep=None, maxsplit=-1)

Both are used to split strings and generate a list.

split() splits S at each occurrence of sep; maxsplit limits the number of splits. If maxsplit is omitted or is -1, S is searched from left to right and split at every sep until the end of the string. If sep is omitted or is None, the splitting algorithm changes: runs of whitespace act as a single separator, and leading and trailing whitespace is discarded.

rsplit() works the same way as split(), except that it searches from right to left.

splitlines() is used specifically for splitting into lines. Although it resembles split('\n') or split('\r\n'), there are some differences, which are explained below.

First, examples of split(). Examples for rsplit() are omitted.

# sep is a single character
>>> '1,2,3'.split(',')
['1', '2', '3']

>>> '1,2,3'.split(',',1)
['1', '2,3']    # split only once

>>> '1,2,,3'.split(',')
['1', '2', '', '3']  # consecutive separators are not compressed

>>> '<hello><><world>'.split('<')
['', 'hello>', '>', 'world>']

# sep is several characters
>>> '<hello><><world>'.split('<>')
['<hello>', '<world>']

# When no sep is specified
>>> '1 2 3'.split()
['1', '2', '3']

>>> '1 2 3'.split(maxsplit=1)
['1', '2 3']

>>> '   1    2   3   '.split()
['1', '2', '3']

>>> '   1    2   3  \n'.split()
['1', '2', '3']

# Explicitly specify sep as a space, a tab, or a newline.
>>> ' 1  2  3  \n'.split(' ')
['', '1', '', '2', '', '3', '', '\n']

>>> ' 1  2  3  \n'.split('\t')
[' 1  2  3  \n']

>>> ' 1 2\n3 \n'.split('\n')
[' 1 2', '3 ', '']  # Pay attention to the last item of the list.

>>> ''.split('\n')
['']

Examples of splitlines().

splitlines() recognizes several kinds of line breaks; the common ones are \n, \r and \r\n. If keepends is True, the line breaks are kept in the result.

>>> 'ab c\n\nde fg\rkl\r\n'.splitlines()
['ab c', '', 'de fg', 'kl']

>>> 'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
['ab c\n', '\n', 'de fg\r', 'kl\r\n']

Comparing split() with splitlines():

#### split()
>>> ''.split('\n')
['']            # because there is no line break to split on

>>> 'One line\n'.split('\n')
['One line', '']

#### splitlines()
>>> "".splitlines()
[]              # because there are no lines to split

>>> 'Two lines\n'.splitlines()
['Two lines']
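Since the rsplit() examples were left out above, here is a brief sketch of where it differs from split(): only when maxsplit cuts the work short. The record string is made up for illustration.

```python
record = 'alice,smith,42,london'

left = record.split(',', 1)     # one split, counted from the left
right = record.rsplit(',', 1)   # one split, counted from the right
print(left)
print(right)

# Without maxsplit both methods produce the same list
same = record.split(',') == record.rsplit(',')
print(same)
```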



7. join

S.join(iterable)

Join the strings in the iterable object (iterable) using S as the separator. Note that every element of iterable must be a string; otherwise an error is raised.

If you are new to Python and do not know what an iterable is, but want to see the concrete syntax of join, for now you can take it to mean: a string (str), a list, a tuple, a dictionary (dict), or a set.

For example:

  1. String

    >>> L='python'
    >>> '_'.join(L)
    'p_y_t_h_o_n'
  2. tuple

    >>> L1=('1','2','3')
    >>> '_'.join(L1)
    '1_2_3'
  3. Set. Note that a set is unordered, so the order of the joined result may vary.

    >>> L2={'p','y','t','h','o','n'}
    >>> '_'.join(L2)
  4. list

    >>> L2=['py','th','o','n']
    >>> '_'.join(L2)
    'py_th_o_n'
  5. Dictionary. The keys are joined.

    >>> L3={'name':"malongshuai",'gender':'male','from':'China','age':18}
    >>> '_'.join(L3)
    'name_gender_from_age'
  6. Every element of iterable must be a string; it cannot contain numbers or other types.

    >>> L1=(1,2,3)
    >>> '_'.join(L1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: sequence item 0: expected str instance, int found

    Neither of the following two can be joined.

    >>> L1=('ab',2)
    >>> L2=('AB',{'a','cd'})
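A common way around the string-only restriction is to convert each element with str() while joining. A minimal sketch, assuming the default str() form of each element is acceptable:

```python
mixed = ('ab', 2, 3.5)

# Convert every element to a string on the fly with a generator expression
joined = '_'.join(str(item) for item in mixed)
print(joined)

# map() does the same job
joined_map = '_'.join(map(str, mixed))
print(joined_map)
```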

8. pruning: strip, lstrip and rstrip

S.strip([chars])
S.lstrip([chars])
S.rstrip([chars])

Remove the characters in chars from both sides, from the left side, and from the right side, respectively. If chars is not specified or is None, whitespace (spaces, tabs, line breaks) is removed by default.

The only thing to note is that chars may be a sequence of several characters. It is treated as a set: every leading or trailing character that appears in the sequence is removed, in any order.

For example:

  1. Remove a single character or whitespace.

    >>> '   spacious   '.lstrip()
    'spacious   '
    >>> '   spacious   '.rstrip()
    '   spacious'
    >>> 'spacious   '.lstrip('s')
    'pacious   '
    >>> 'spacious'.rstrip('s')
    'spaciou'

  2. Remove the characters in a character sequence.

    >>> print('www.example.com'.lstrip('cmowz.'))
    example.com
    >>> print(''.lstrip('cmowz.'))
    >>> print('wwaw.example.com'.lstrip('cmowz.'))
    aw.example.com
    >>> print(''.strip('cmowz.'))

Because the first 4 characters of www.example.com are all in the character sequence cmowz., they are removed; the fifth character, e, is not in the sequence, so pruning ends there.

In wwaw.example.com, the third character, a, is not in the character sequence, so pruning ends there.
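Because chars is treated as a set of characters rather than as a prefix, lstrip() can remove more than intended. The sketch below contrasts it with a hypothetical helper, strip_prefix, that removes an exact prefix using startswith() and slicing.

```python
stripped = 'www.wow.com'.lstrip('w.')   # also eats the leading w of "wow"
print(stripped)

def strip_prefix(s, prefix):
    """Hypothetical helper: remove prefix only when s really starts with it."""
    return s[len(prefix):] if s.startswith(prefix) else s

exact = strip_prefix('www.wow.com', 'www.')
print(exact)
```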
