Python String: Working With Text

So we’ve seen numbers, but what about text? This page is about the Python string, the go-to data type for storing and working with text. In Python, a piece of text is called a string, and you can perform all kinds of operations on it. But let’s start with the basics first!

What is a Python string?

The following is a formal definition of what a string is:

String
A string in Python is a sequence of characters

In even simpler terms, a string is a piece of text. Strings are not just a Python thing: the term is well known in computer science and means the same thing in most other languages. Now that we know what a string is, we’ll look at how to create one.

How to create a Python string

A Python string needs quotes around it for it to be recognized as such, like this:

>>> 'Hello, World'
'Hello, World'

Because of the quotes, Python understands this is a sequence of characters and not a command, number, or variable.

And just like with numbers, some of the operators we learned before work on Python strings too. Try it with the following expressions:

>>> 'a' + 'b'
'ab'
>>> 'ab' * 4
'abababab'
>>> 'a' - 'b'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'str'

This is what happens in the code above:

  • The plus operator glues two Python strings together.
  • The multiplication operator repeats our Python string the given number of times.
  • The minus operator doesn’t work on a Python string and produces an error. If you want to remove parts of a string, there are other methods that you’ll learn about later on.
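The two operators that do work can be combined to build up text. Here is a small sketch; the variable names are made up for illustration:

```python
# Both + and * return a new string; the originals are left untouched.
greeting = 'Hello' + ', ' + 'World'
underline = '-' * len(greeting)   # repeat a dash once per character

print(greeting)
print(underline)

# The plus operator only glues strings to strings,
# so a number has to be converted with str() first.
label = 'Count: ' + str(3)
print(label)
```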

Single or double quotes?

We’ve used single quotes, but Python accepts double-quotes around a string as well:

>>> "a" + "b"
'ab'

Note that these are not two single quotes next to each other. It’s the character that’s often found next to the enter key on your keyboard. You need to press shift together with this key to get a double quote.

As you can see from its answer, Python itself prefers single quotes when it displays a string. They look a little cleaner, and Python tries to be as clear and readable as it can. So why does it support both? Because it allows you to write strings that contain a quote.

In the first example below, we use double quotes. Hence there’s no problem with the single quote in the word it’s. However, in the second example, we try to use single quotes. Python sees the quote in the word it’s and thinks this is the end of the string! The following letter, “s”, causes a syntax error. A syntax error is a character or string incorrectly placed in a command or instruction that causes a failure in execution.

In other words, Python doesn’t understand the s at that spot, because it expects the string to be ended already, and fails with an error:

>>> mystring = "It's a string, with a single quote!"
>>> mystring = 'It's a string, with a single quote!'
  File "<stdin>", line 1
    mystring = 'It's a string, with a single quote!'
                   ^
SyntaxError: invalid syntax

As you can see, Python points out the location where it encountered the error. The exact wording and caret position can differ between Python versions, but Python errors tend to be very helpful, so look closely at them and you’ll often be able to pinpoint what’s going wrong.

Escaping

There’s actually another way around this problem, called escaping. You can escape a special character, like a quote, with a backslash:

>>> mystring = 'It\'s an escaped quote!'
>>> print(mystring)
It's an escaped quote!

You can also escape double quotes inside a double-quoted string:

>>> mystring = "I'm a so-called \"script kiddie\""
>>> print(mystring)
I'm a so-called "script kiddie"

So which one should you use? It’s simple: choose the option that needs the fewest escapes, because escapes make your Python strings less readable.
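To convince yourself that the quoting style only matters in the source code, you can compare the resulting strings. A minimal sketch with made-up variable names:

```python
# Two ways to write the same text; both produce equal strings.
with_double_quotes = "It's an escaped quote!"
with_escape = 'It\'s an escaped quote!'

print(with_double_quotes == with_escape)

# The backslash only exists in the source code, not in the string itself.
print('\\' in with_escape)
print(len("It's"))
```

The first comparison prints True and the backslash check prints False: once Python has read the string, the escape character is gone.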

Multiline strings

Python also has syntax for creating multiline strings, using triple quotes: either three double quotes or three single quotes. Both work, but I’ll demonstrate with double quotes:

>>> my_big_string = """This is line 1,
... this is line 2,
... this is line 3."""
>>> print(my_big_string)
This is line 1,
this is line 2,
this is line 3.

The nice thing here is that you can use both single and double quotes within a multiline string. So you can use triple quotes to cleanly create strings that contain both single and double quotes:

>>> line = """He said: "Hello, I've got a question" from the audience"""
>>> print(line)
He said: "Hello, I've got a question" from the audience
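If you’re wondering what a multiline string looks like on the inside: each line break becomes a newline character. The sketch below reuses the three-line example from above; splitlines() is one of the built-in string operations:

```python
my_big_string = """This is line 1,
this is line 2,
this is line 3."""

# Each line break in the source is stored as a newline character (\n).
newline_count = my_big_string.count('\n')
print(newline_count)

# splitlines() turns the text back into a list of separate lines.
lines = my_big_string.splitlines()
print(lines)
```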

String operations

Strings come with a number of handy, built-in operations you can execute. I’ll show you only a couple here since I don’t want to divert your attention from the tutorial too much.

In the REPL, you can use auto-completion. In the next code fragment, we create a string, mystring, and on the next line we type its name and a dot, then hit the <TAB> key twice:

>>> mystring = "Hello world"
>>> mystring.
mystring.capitalize(    mystring.find(          mystring.isdecimal(     mystring.istitle(       mystring.partition(     mystring.rstrip(        mystring.translate(
mystring.casefold(      mystring.format(        mystring.isdigit(       mystring.isupper(       mystring.replace(       mystring.split(         mystring.upper(
mystring.center(        mystring.format_map(    mystring.isidentifier(  mystring.join(          mystring.rfind(         mystring.splitlines(    mystring.zfill(
mystring.count(         mystring.index(         mystring.islower(       mystring.ljust(         mystring.rindex(        mystring.startswith(
mystring.encode(        mystring.isalnum(       mystring.isnumeric(     mystring.lower(         mystring.rjust(         mystring.strip(
mystring.endswith(      mystring.isalpha(       mystring.isprintable(   mystring.lstrip(        mystring.rpartition(    mystring.swapcase(
mystring.expandtabs(    mystring.isascii(       mystring.isspace(       mystring.maketrans(     mystring.rsplit(        mystring.title(

If all went well, you should get a big list of operations that can be performed on a string. You can try some of these yourself:

>>> mystring.lower()
'hello world'
>>> mystring.upper()
'HELLO WORLD'

An explanation of each of these operations can be found in the official Python documentation, but we’ll cover a few here as well.
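If tab completion doesn’t work in your REPL, you can get a similar list with the built-in dir() function. This is just a sketch; filtering out the names that start with an underscore is my own choice, to hide Python’s internal methods:

```python
# dir() lists the names of everything an object or type offers.
# Names starting with an underscore are internal, so we skip them here.
methods = [name for name in dir(str) if not name.startswith('_')]

print('lower' in methods)
print('upper' in methods)

# A few more operations from that list, tried on a small example:
mystring = 'Hello world'
print(mystring.swapcase())
print(mystring.title())
```

Calling help(str.upper) in the REPL shows the built-in documentation for a single operation.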

Getting the string length

A common operation is to get the string length. Unlike the operations above, this is not done with an operation on the string itself but with Python’s built-in len() function, like this:

>>> len("I wonder how long this string will be...")
40
>>> len(mystring)
11

In fact, the len() function can be used on many objects in Python, as you’ll learn later on. If functions are new to you, you’re in luck, because our next page will explain exactly what a function in Python is, and how you can create one yourself.
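As a small preview of what “many objects” means, here’s len() applied to a string, an empty string, and a list. Lists are explained later in the tutorial, so don’t worry if that part looks unfamiliar:

```python
# len() counts every character, including spaces and punctuation.
string_length = len('Hello world')
empty_length = len('')

# The same function counts the items in a list.
list_length = len(['a', 'b', 'c'])

print(string_length, empty_length, list_length)
```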

Split a string

Another common operation is splitting a string. For this, we can use one of the built-in operations, conveniently called split. Let’s start simple, by splitting up two words on the space character between them:

>>> 'Hello world'.split(' ')
['Hello', 'world']

Here, split takes one argument: the sequence of characters to split on. The output is a Python list containing all the separate words.
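The separator doesn’t have to be a space, and it can be longer than one character. The following sketch uses made-up example data and also shows join, which glues a list back together:

```python
# Split on a comma instead of a space.
csv_line = 'apple,banana,cherry'
fruits = csv_line.split(',')
print(fruits)

# The separator can be a longer sequence of characters too.
steps = 'a -> b -> c'.split(' -> ')
print(steps)

# join() does the opposite of split(): it glues list items together.
rejoined = ', '.join(fruits)
print(rejoined)
```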

Split on whitespace

A common use-case is to split on whitespace. The problem is that whitespace can be a lot of things. Three common ones that you probably know already are:

  • space characters
  • tabs
  • newlines

But there are many more, and to make it even more complicated, whitespace doesn’t mean just one of these characters, but can also be a whole sequence of them. E.g., three consecutive spaces and a tab character form one piece of whitespace.

Exactly because this is such a common operation among programmers, and because it’s hard to do it perfectly, Python has a convenient shortcut for it. Calling the split operation without any arguments splits a string on whitespace, as can be seen below:

>>> 'Hello \t\n there,\t\t\t stranger.'.split()
['Hello', 'there,', 'stranger.']

As you can see, no matter what whitespace character and how many, Python is still able to split this string for us into separate words.
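To see why the shortcut matters, compare it with splitting on a single space. In this sketch the example text contains three consecutive spaces:

```python
text = 'Hello   there'   # three spaces between the words

# Splitting on one space produces an empty string for every extra space.
on_single_space = text.split(' ')
print(on_single_space)

# Splitting without arguments treats the whole run as one separator.
on_whitespace = text.split()
print(on_whitespace)
```

The first call leaves you with empty strings to clean up, while the second one returns just the two words.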

Replace parts of a string

Let’s look at one more built-in operation on strings: the replace function. It’s used to replace one or more characters or sequences of characters:

>>> 'Hello world'.replace('H', 'h')
'hello world'
>>> 'Hello world'.replace('l', '_')
'He__o wor_d'
>>> 'Hello world'.replace('world', 'readers')
'Hello readers'
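Two details are easy to miss: replace gives you a new string instead of changing the original, and it accepts an optional third argument that limits the number of replacements. A short sketch, with variable names chosen for illustration:

```python
original = 'Hello world'

# replace() returns a new string; the original stays the same.
changed = original.replace('l', '_')
print(original)
print(changed)

# An optional third argument limits how many replacements are made.
first_only = original.replace('l', '_', 1)
print(first_only)

# Replacing with an empty string removes the matching parts.
without_o = original.replace('o', '')
print(without_o)
```

That last trick is one way to remove parts of a string, which the minus operator couldn’t do.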

Reversing a string

A common assignment is to reverse a Python string. There’s no reverse operation, though, as you might have noticed when studying the list of operations like lower() and upper() that comes with a string. This is not exactly beginner stuff, so feel free to skip this for now if you’re going through the tutorial sequentially.

To reverse a string efficiently, we can treat it as a list. Lists are covered later on in this tutorial (see for-loop). In fact, you could see a string as a list of characters and, more importantly, you can read from it as such (unlike a list, a string can’t be changed in place). Index operations like mystring[2] work just like they do on lists:

>>> mystring = 'Hello world'
>>> mystring[2]
'l'
>>> mystring[0]
'H'

Note that in Python, like in most programming languages, we start counting from 0.
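Because counting starts at 0, the last character sits at the length minus one. Python also accepts negative indexes, which count from the end. A minimal sketch:

```python
mystring = 'Hello world'

first = mystring[0]
last = mystring[len(mystring) - 1]   # index 10, because counting starts at 0
also_last = mystring[-1]             # negative indexes count from the end
print(first, last, also_last)

# Asking for a position that doesn't exist raises an IndexError.
try:
    mystring[100]
except IndexError as error:
    print('IndexError:', error)
```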

The slicing operator also works exactly the same as it does on lists. Details of list slicing can be found on the Python list page and won’t be repeated here. If you’re coming from other languages, you might compare it to an operation like substring() in Java, which allows you to retrieve specific parts of a string.

Slicing in Python works with the slicing operator, which looks like this: mystring[start:stop:step_size]. The key feature we use from slicing is the step size. By giving the slicing operator a negative step size of -1, we traverse the string from end to beginning. By leaving the start and stop positions empty, Python assumes we want to slice the entire string.

So we can use slicing to reverse a Python string as follows:

>>> mystring = 'Hello world'
>>> mystring[::-1]
'dlrow olleH'
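To get a better feel for start, stop, and step size, here are a few more slices of the same string. The last lines show an alternative way to reverse a string with the built-in reversed() function; I’m including it for comparison only, since the slice is the shorter and more common form:

```python
mystring = 'Hello world'

first_word = mystring[0:5]      # from index 0 up to, but not including, 5
second_word = mystring[6:]      # from index 6 to the end
every_second = mystring[::2]    # the whole string, in steps of 2
print(first_word, second_word, every_second)

# Two ways to reverse: a slice with step -1, or reversed() plus join().
reversed_by_slice = mystring[::-1]
reversed_by_join = ''.join(reversed(mystring))
print(reversed_by_slice == reversed_by_join)
```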

Python string format with f-strings

A common pattern is the need to merge some text strings together or use a variable inside your string. There are several ways to do so, but the most modern way is to use f-strings, short for formatted strings.

Let’s first look at an example, before we dive into the details:

>>> my_age = 40
>>> f'My age is {my_age}'
'My age is 40'

The f-string looks like a regular string with the addition of an f prefix. This f tells Python to scan the string for curly braces. Inside these curly braces, we can put any Python expression we want. In the above case, we just included the variable my_age. F-strings provide an elegant way of including the results of expressions inside strings.

Here are a couple more examples you can try for yourself as well, inside the REPL:

>>> f'3 + 4 = {3+4}'
'3 + 4 = 7'
>>> my_age = 40
>>> f'My age is, unfortunately, not {my_age-8}'
'My age is, unfortunately, not 32'
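Since any expression is allowed between the curly braces, you can also call string operations there. F-strings need Python 3.6 or newer. The sketch below additionally uses a format specifier (the part after the colon) to round a number, and doubled braces to get literal curly braces; the names and values are made up:

```python
name = 'world'

# Any expression works inside the braces, including method calls.
shouted = f'Hello {name.upper()}!'
print(shouted)

# After a colon you can add a format specifier; .2f means two decimals.
rounded = f'Pi is roughly {3.14159:.2f}'
print(rounded)

# Double the braces if you need literal curly braces in the result.
literal = f'{{name}} is replaced by {name}'
print(literal)
```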

I’m just touching the basics here. If you’re following the tutorial front to back, you can continue with the next topic since you know more than enough for now. If you’d like to find out more about f-strings, the section on formatted string literals in the official Python documentation is a good place to start.

About Erik van Baaren

Erik is the owner of Python Land and the author of many of the articles and tutorials on this website. He's been working as a professional software developer for 25 years, and he holds a Master of Science degree in computer science. His favorite language of choice: Python!