Strings and user input

This chapter will discuss various ways to specify string literals. After that, you'll see how to get input data from the user and handle type conversions.

Single and double quoted strings

The most common way to declare string literals is by enclosing a sequence of characters within single or double quotes. Unlike in scripting languages such as Bash, Perl and Ruby, there is no feature difference between these two forms. Idiomatically, single quotes are preferred and other variations are used when needed.

The REPL will again be used predominantly in this chapter. One important detail to note is that the REPL displays the result of an expression using the syntax of that particular data type, which is why string results are shown with quotes. Use the print() function when you want to see how a string literal looks visually.

>>> 'hello'
'hello'

>>> print("world")
world

If the string literal itself contains single or double quote characters, the other form can be used.

>>> print('"Will you come?" he asked.')
"Will you come?" he asked.

>>> print("it's a fine sunny day")
it's a fine sunny day

What to do if a string literal has both single and double quotes? You can use the \ character to escape the quote characters. In the below examples, \' and \" will evaluate to ' and " characters respectively, instead of prematurely terminating the string definition. Use \\ if a literal backslash character is needed.

>>> print('"It\'s so pretty!" can I get one?')
"It's so pretty!" can I get one?

>>> print("\"It's so pretty!\" can I get one?")
"It's so pretty!" can I get one?

In general, the backslash character is used to construct escape sequences. For example, \n represents the newline character, \t represents the tab character and so on. You can use \ooo and \xhh to specify a character by its numeric value in octal and hexadecimal formats respectively. For Unicode characters, you can use the \N{name}, \uxxxx and \Uxxxxxxxx formats. See docs.python: String and Bytes literals for the full list of escape sequences and details about undefined ones.

>>> greeting = 'hi there.\nhow are you?'
>>> greeting
'hi there.\nhow are you?'
>>> print(greeting)
hi there.
how are you?

>>> print('item\tquantity')
item    quantity

>>> print('\u03b1\u03bb\u03b5\N{LATIN SMALL LETTER TURNED DELTA}')
αλεƍ
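The \ooo and \xhh forms mentioned above aren't shown in the REPL examples, so here's a small sketch of them written as a script. The variable names are only for illustration.

```python
# Different escape notations that all produce the letter 'a' (code point 97)
octal_form = '\141'      # \ooo: one to three octal digits
hex_form = '\x61'        # \xhh: exactly two hexadecimal digits
unicode_form = '\u0061'  # \uxxxx: exactly four hexadecimal digits

print(octal_form, hex_form, unicode_form)
print(octal_form == hex_form == unicode_form == 'a')

# an escape sequence stands for a single character in the resulting string
print(len('\n'), len('\\'), len('\u03b1'))
```

Whichever notation you choose, the resulting string is the same. The notation only affects how the literal is typed in the source code.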

Triple quoted strings

You can also declare multiline strings by enclosing the value with three single/double quote characters. If backslash is the last character of a line, then a newline won't be inserted at that position. Here's a Python program named triple_quotes.py to illustrate this concept.

# triple_quotes.py
print('''hi there.
how are you?''')

student = '''\
Name:\tlearnbyexample
Age:\t25
Dept:\tCSE'''

print(student)

Here's the output of the above script:

$ python3.9 triple_quotes.py
hi there.
how are you?
Name:   learnbyexample
Age:    25
Dept:   CSE

Note: See the Docstrings section for another use of triple quoted strings.
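To see the effect of a trailing backslash more directly, the following sketch compares two triple quoted strings using repr(), which shows a string the way the REPL would. The variable names are made up for this illustration.

```python
# without the backslash, the line break after the opening quotes is kept
with_newline = '''
Name:\tlearnbyexample'''

# with the backslash, that line break is not part of the string
without_newline = '''\
Name:\tlearnbyexample'''

print(repr(with_newline))
print(repr(without_newline))
print(with_newline == '\n' + without_newline)
```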

Raw strings

In certain cases, escape sequences are too much of a hindrance to work around. For example, file paths in Windows use \ as the delimiter. Another example is regular expressions, where the backslash character has yet another special meaning. Python provides a raw string syntax, where all the characters are treated literally. This form, also known as r-strings for short, requires an r or R character prefix before the quoted string. Forms like triple quoted strings and raw strings are for user convenience. Internally, there's just a single representation for string literals.

>>> print(r'item\tquantity')
item\tquantity

>>> r'item\tquantity'
'item\\tquantity'
>>> r'C:\Documents\blog\monsoon_trip.txt'
'C:\\Documents\\blog\\monsoon_trip.txt'
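As a quick check of the "single representation" point, this sketch compares a raw string with the equivalent regular string that escapes every backslash. The names raw_path and escaped_path are only illustrative.

```python
raw_path = r'C:\Documents\blog'
escaped_path = 'C:\\Documents\\blog'

# both literals produce exactly the same string value
print(raw_path == escaped_path)

# r'\t' is two characters (backslash and t), '\t' is a single tab character
print(len(r'\t'), len('\t'))
```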

Here's an example with the built-in re module. The import statement used below will be discussed in the Importing and creating modules chapter. See my book Python re(gex)? for details on regular expressions.

>>> import re

# numbers >= 100 with optional leading zeros
>>> re.findall(r'\b0*[1-9]\d{2,}\b', '0501 035 154 12 26 98234')
['0501', '154', '98234']

# without raw strings, every backslash has to be escaped
>>> re.findall('\\b0*[1-9]\\d{2,}\\b', '0501 035 154 12 26 98234')
['0501', '154', '98234']

String operators

Python provides a wide variety of features to work with strings. This chapter introduces some of them, like the + and * operators in this section. Here are some examples of concatenating strings using the + operator. The operands can be any expression that results in a string value and you can use any of the different ways to specify a string literal.

>>> str1 = 'hello'
>>> str2 = ' world'
>>> str3 = str1 + str2
>>> print(str3)
hello world

>>> str3 + r'. 1\n2'
'hello world. 1\\n2'

Another way to concatenate is to simply place string literals of any kind next to each other. You can use zero or more whitespace characters between the literals. However, this works only for literals: you cannot mix an expression and a string literal this way. If the strings are inside parentheses, you can also use newlines to separate the literals and optionally add comments.

>>> 'hello' r' 1\n2\\3'
'hello 1\\n2\\\\3'

# note that ... is REPL's indication for multiline statements, blocks, etc
>>> print('hi '
... 'there')
hi there
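Here's a sketch of the parenthesized form with comments, written as a script instead of a REPL session. It also shows the + operator as the way to combine a variable with a literal, since placing them side by side isn't allowed. The variable names are made up for this example.

```python
message = (
    'hi '        # first piece
    'there, '    # a comment is allowed after each piece
    r'C:\temp'   # different kinds of literals can be mixed
)
print(message)

# adjacent placement works only for literals, so use + with variables
name = 'learnbyexample'
greeting = 'hello ' + name
print(greeting)
```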

You can repeat a string by using the * operator between a string and an integer.

>>> style_char = '-'
>>> print(style_char * 50)
--------------------------------------------------
>>> word = 'buffalo '
>>> print(8 * word)
buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo 
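One possible use of repetition is underlining a title, sketched below with illustrative names. The last line shows that repeating zero or a negative number of times gives an empty string rather than an error.

```python
title = 'Report'
underline = '=' * len(title)  # as many '=' characters as the title has
print(title)
print(underline)

# zero or negative counts result in an empty string
print(repr('ab' * 0), repr('ab' * -2))
```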

String formatting

As per PEP 20: The Zen of Python,

There should be one-- and preferably only one --obvious way to do it.

However, there are several approaches for formatting strings. This section focuses mostly on formatted string literals (f-strings for short) and then shows alternate approaches.

f-strings allow you to embed an expression within {} characters as part of the string literal. Like raw strings, you need to use a prefix, which is f or F in this case. Python will substitute the embeds with the result of the expression, converting it to string if necessary (such as numeric results). See docs.python: Format String Syntax and docs.python: Formatted string literals for documentation and more examples.

>>> str1 = 'hello'
>>> str2 = ' world'
>>> f'{str1}{str2}'
'hello world'

>>> f'{str1}({str2 * 3})'
'hello( world world world)'
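The automatic conversion to string is easiest to appreciate next to the + operator, which doesn't convert anything for you. The sketch below uses made-up variable names and a try/except block (covered later in the book) only to keep the script running after the error.

```python
count = 3
label = 'apples'

# the integer is converted to a string as part of the f-string
sentence = f'{count} {label}'
print(sentence)

# the + operator raises TypeError when mixing an integer and a string
try:
    combined = count + ' ' + label
except TypeError:
    print('cannot add an integer and a string directly')

# an explicit conversion is needed instead
print(str(count) + ' ' + label)
```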

Since Python 3.8, you can add = after an expression to get both the expression text and its result in the output.

>>> num1 = 42
>>> num2 = 7

>>> f'{num1 + num2 = }'
'num1 + num2 = 49'
>>> f'{num1 + (num2 * 10) = }'
'num1 + (num2 * 10) = 112'

Optionally, you can provide a format specifier along with the expression after a : character. These specifiers are similar to the ones provided by the printf() function in C, the printf built-in command in Bash and so on. Here are some examples of numeric formatting.

>>> appx_pi = 22 / 7

# restricting number of digits after the decimal point
>>> f'Approx pi: {appx_pi:.5f}'
'Approx pi: 3.14286'

# rounding is applied
>>> f'{appx_pi:.3f}'
'3.143'

# exponential notation 
>>> f'{32 ** appx_pi:.2e}'
'5.38e+04'

Here are some alignment examples:

>>> fruit = 'apple'

>>> f'{fruit:=>10}'
'=====apple'
>>> f'{fruit:=<10}'
'apple====='
>>> f'{fruit:=^10}'
'==apple==='

# default is space character
>>> f'{fruit:^10}'
'  apple   '
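Alignment specifiers are handy for lining up columns. The sketch below, with invented sample values, left-aligns names in a width of 8 and right-aligns quantities in a width of 5, so each line comes out at 13 characters.

```python
item1, qty1 = 'apple', 5
item2, qty2 = 'fig', 120

# < pads on the right (left-aligned), > pads on the left (right-aligned)
row1 = f'{item1:<8}{qty1:>5}'
row2 = f'{item2:<8}{qty2:>5}'

print(row1)
print(row2)
print(len(row1), len(row2))
```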

You can use b, o and x to display integer values in binary, octal and hexadecimal formats respectively. Using # before these characters adds the appropriate prefix for these formats.

>>> num = 42

>>> f'{num:b}'
'101010'
>>> f'{num:o}'
'52'
>>> f'{num:x}'
'2a'

>>> f'{num:#x}'
'0x2a'
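The # option works the same way for all three formats. As a small sketch, the prefixed results also match what the built-in bin(), oct() and hex() functions return for this value.

```python
num = 42

prefixed_bin = f'{num:#b}'
prefixed_oct = f'{num:#o}'
prefixed_hex = f'{num:#x}'
print(prefixed_bin, prefixed_oct, prefixed_hex)

# same strings as the built-in conversion functions
print(prefixed_bin == bin(num), prefixed_oct == oct(num), prefixed_hex == hex(num))
```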

The str.format() method, the format() function and the % operator are alternate approaches for string formatting.

>>> num1 = 22
>>> num2 = 7

>>> 'Output: {} / {} = {:.2f}'.format(num1, num2, num1 / num2)
'Output: 22 / 7 = 3.14'

>>> format(num1 / num2, '.2f')
'3.14'

>>> 'Approx pi: %.2f' % (num1 / num2)
'Approx pi: 3.14'
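To make the relationship between these approaches concrete, the sketch below applies the same .2f specifier in four different ways and checks that the results agree. The variable names are only for illustration.

```python
num1, num2 = 22, 7
ratio = num1 / num2

via_fstring = f'{ratio:.2f}'
via_method = '{:.2f}'.format(ratio)
via_function = format(ratio, '.2f')
via_percent = '%.2f' % ratio

print(via_fstring, via_method, via_function, via_percent)
print(via_fstring == via_method == via_function == via_percent)
```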

Note: See docs.python: The String format() Method and the sections that follow for more details about the above features. See docs.python: Format examples for more examples, including datetime formatting. The Text processing chapter will discuss more about the string processing methods.

Note: In case you don't know what a method is, see stackoverflow: What's the difference between a method and a function?

User input

The input() built-in function can be used to get data from the user. It accepts an optional string argument, which is displayed as a prompt to make the process interactive. It always returns a string, which you can convert to another type (explained in the next section).

# Python will wait until you type your data and press the Enter key
# the blinking cursor is represented by a rectangular block shown below
>>> name = input('what is your name? ')
what is your name? █

Here's the rest of the above example.

>>> name = input('what is your name? ')
what is your name? learnbyexample

# note that newline isn't part of the value saved in the 'name' variable
>>> print(f'pleased to meet you {name}.')
pleased to meet you learnbyexample.
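Outside the REPL, the same idea can be placed in a script. The sketch below is one possible arrangement, not something from the original example: greet() is a hypothetical helper whose read parameter defaults to input(), so a canned answer can be supplied when you want to try the code without typing anything. Functions are covered later in the book.

```python
def greet(read=input):
    # 'read' is an assumed parameter for this sketch; by default it is input()
    name = read('what is your name? ')
    return f'pleased to meet you {name}.'

# interactive use would simply be: print(greet())

# non-interactive trial run with a canned answer instead of keyboard input
reply = greet(read=lambda prompt: 'learnbyexample')
print(reply)
```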

Type conversion

The type() built-in function can be used to know what data type you are dealing with. You can pass any expression as an argument.

>>> num = 42
>>> type(num)
<class 'int'>

>>> type(22 / 7)
<class 'float'>

>>> type('Hi there')
<class 'str'>

The built-in functions int(), float() and str() can be used to convert from one data type to another. These function names are the same as their data type class names seen above.

>>> num = 3.14
>>> int(num)
3
# you can also use f'{num}'
>>> str(num)
'3.14'

>>> usr_ip = input('enter a float value ')
enter a float value 45.24e22
>>> type(usr_ip)
<class 'str'>
>>> float(usr_ip)
4.524e+23
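A few more conversion cases are sketched below, using a fixed string in place of real user input. Note that int() drops the fractional part of a float instead of rounding, and that it rejects a string like '3.14', which has to go through float() first. The try/except block (covered later in the book) is used here only so that the script keeps running.

```python
text = '42'              # stands in for a value returned by input()
as_int = int(text)
as_float = float(text)
print(as_int + 1, as_float)

# the fractional part is discarded, not rounded
print(int(3.99), int(-3.99))

# a string with a decimal point isn't a valid integer literal
try:
    int('3.14')
except ValueError:
    print("int('3.14') raises ValueError")

# convert to float first, then to int
print(int(float('3.14')))
```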

Note: See docs.python: Built-in Functions for documentation on all of the built-in functions. You can also use the help() function from the REPL as discussed in the Documentation and getting help section.

Exercises

  • Read about Bytes literals from docs.python: String and Bytes literals. See also stackoverflow: What is the difference between a string and a byte string?
  • If you check out docs.python: int() function, you'll see that the int() function accepts an optional argument. As an example, write a program that asks the user for a hexadecimal number as input. Then, use the int() function to convert the input string to an integer (you'll need the second argument for this). Add 5 and display the result in hexadecimal format.
  • Write a program to accept two input values. First can be either a number or a string value. Second is an integer value, which should be used to display the first value in centered alignment. You can use any character you prefer to surround the value, other than the default space character.
  • What happens if you use a combination of r, f and other such valid prefix characters while declaring a string literal? What happens if you use raw strings syntax and provide only a single \ character? Does the documentation describe these cases?
  • Try out at least two format specifiers not discussed in this chapter.
  • Given a = 5, get '{5}' as the output using f-strings.