Python comes with a collection of built-in data types that make common data-wrangling operations easy. Among them is the list, a simple but versatile collection type. With a Python list, you can group Python objects together in a one-dimensional row that allows objects to be accessed by position, added, removed, sorted, and subdivided.
Python list basics
Defining a list in Python is easy—just use the bracket syntax to indicate items in a list, like this:
list_of_ints = [1, 2, 3]
Items in a list do not have to all be the same type; they can be any Python object. In the list below, assume TWO is a variable defined elsewhere and Three is a function:
list_of_objects = ["One", TWO, Three, {"Four":4}, None]
Note that having mixed objects in a list can have implications for sorting the list. We’ll go into that later.
The biggest reason to use a list is to be able to find objects by their position in the list. To do this, you use Python’s index notation: a number in brackets, starting at 0, that indicates the position of the item in the list.
In the list_of_ints example, list_of_ints[0] yields 1, list_of_ints[1] yields 2, and list_of_objects[4] would be the None object.
Python list indexing
If you use a positive integer for the index, the integer indicates the position of the item to look for. But if you use a negative integer, then the integer indicates the position starting from the end of the list. For example, using an index of -1 is a handy way to grab the last item from a list no matter the size of the list.
In that case, list_of_ints[-1] yields 3, and list_of_objects[-1] yields None.
You can also use an integer variable as your index. If x=0, list_of_ints[x] yields 1, and so on.
If you try to index beyond a list's boundaries you'll trigger an IndexError exception.
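As a quick illustration of the indexing rules above, here is a minimal sketch that reuses list_of_ints. The names first, last, via_variable, and message are made up for this example.

```python
list_of_ints = [1, 2, 3]

first = list_of_ints[0]    # positive index: counts from the start
last = list_of_ints[-1]    # negative index: counts from the end

x = 0
via_variable = list_of_ints[x]  # an integer variable works as an index too

# Indexing past either end of the list raises IndexError.
message = ""
try:
    list_of_ints[3]
except IndexError as err:
    message = str(err)

print(first, last, via_variable)  # 1 3 1
print(message)                    # list index out of range
```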
Adding and removing Python list items
Python has several ways you can add or remove items from a list:
- .append() inserts an item at the end of the list. For example, list_of_ints.append(4) would turn list_of_ints into the list [1,2,3,4]. Appends are fast and efficient; it takes about the same amount of time to append one item to a list no matter how long the list is.
- .extend() takes the contents of some iterable, such as another list, and adds each item from the iterable to the list as a separate item. This is useful if you want to quickly insert the contents of a list item-by-item into another list. (If you try to .append() one list to another, the entire list gets appended as a single object, rather than item-by-item.)
- .pop() removes and returns the last item from the list. If we ran x = list_of_ints.pop() on the original list_of_ints, x would contain the value 3. (You don’t have to assign the result of .pop() to a variable if you don’t need it.) .pop() operations are also fast and efficient.
- .insert() inserts an item at some arbitrary position in the list. For example, list_of_ints.insert(0,10) would turn list_of_ints into [10,1,2,3]. Note that the closer you insert to the front of the list, the slower this operation will be, though you won’t see much of a slowdown unless your list has many thousands of elements or you’re doing the inserts in a tight loop.
- .pop(x) removes and returns the item at the index x. So list_of_ints.pop(0) would remove the item at index 0. Again, the closer you are to the front of the list, the slower this operation can be.
- .remove(item) removes an item from a list, but not based on its index. Rather, .remove() removes the first occurrence of the object you specify, searching from the start of the list. For [3,7,7,9,8].remove(7), the first 7 would be removed, resulting in the list [3,7,9,8]. This operation, too, can slow down for a large list, since in the worst case it has to traverse the entire list.
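The methods described above can be tried together in one short session. This is only a sketch; the lists sevens and nested, and the variables last and first, are illustrative names that don't appear elsewhere in the article.

```python
list_of_ints = [1, 2, 3]

list_of_ints.append(4)         # [1, 2, 3, 4]
list_of_ints.extend([5, 6])    # [1, 2, 3, 4, 5, 6]

last = list_of_ints.pop()      # removes and returns 6
list_of_ints.insert(0, 10)     # [10, 1, 2, 3, 4, 5]
first = list_of_ints.pop(0)    # removes and returns 10

sevens = [3, 7, 7, 9, 8]
sevens.remove(7)               # only the first 7 is removed

nested = [1, 2, 3]
nested.append([4, 5])          # the whole list is added as a single item

print(list_of_ints)  # [1, 2, 3, 4, 5]
print(sevens)        # [3, 7, 9, 8]
print(nested)        # [1, 2, 3, [4, 5]]
```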
Slicing a Python list
Lists can be divided up into new lists, a process called slicing. Python’s slice syntax lets you specify which part of a list to carve off and how to manipulate the carved-off portion.
You saw above how to use the bracket notation to get a single item from a list: my_list[2], for example. Slices use a variant of the same index notation (and follow the same indexing rules): list_object[start:stop:step].
Note the following:
- start is the position in the list to start the slice.
- stop is the position in the list where we stop slicing. In other words, that position and everything after it is omitted.
- step is an optional “every nth element” indicator for the slice. By default this is 1, so the slice retains every element from the list it’s slicing from. Set step to 2, and you’ll select every second element, and so on.
Here are some examples. Consider this list:
slice_list = [1,2,3,4,5,6,7,8,9,0]
slice_list[0:5]  # yields [1, 2, 3, 4, 5]
Note that the slice stops at index 4; the item at index 5 is not included.
slice_list[0:5:2]  # yields [1, 3, 5]
If you omit a particular slice index, Python assumes a default. Leave off the start index, and Python assumes the start of the list:
slice_list[:5]  # yields [1, 2, 3, 4, 5]
Leave off the stop index, and Python assumes the end of the list:
slice_list[4:]  # yields [5, 6, 7, 8, 9, 0]
The step element can also be negative. This lets us take slices that are reversed copies of the original:
slice_list[::-1]  # yields [0, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Note that when the step is negative, the start and stop indexes must go backwards, not forwards:
slice_list[5:2:-1]  # yields [6, 5, 4]
Slicing and shallow copies
Also keep in mind that slices of lists are shallow copies of the original list. The original list remains unchanged. The new list holds references to the same objects as the old list, not copies of those objects.
For instance, if you have a class instance in a list and you make a slice containing that object, a distinct new class instance isn't created—the slice just now contains a different reference to the same class instance.
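The shallow-copy behavior can be checked directly. In this sketch, Widget is a hypothetical class invented for the illustration, and the variable names are likewise made up.

```python
class Widget:
    """Hypothetical class used only for this illustration."""

    def __init__(self, label):
        self.label = label


original = [Widget("a"), Widget("b"), Widget("c")]
part = original[0:2]          # a new list that refers to the same objects

part[0].label = "changed"     # mutates the object that both lists share
part[1] = None                # rebinds a slot in the slice only

shares_object = part[0] is original[0]
print(shares_object)          # True
print(original[0].label)      # changed
print(original[1].label)      # b
```

Changing an object through the slice is visible through the original list, while replacing an entry in the slice leaves the original list alone.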
Slicing and out-of-bounds indexes
If you try to make a slice that's bigger than the item you're slicing (an "out of bounds" slice), you will not get an IndexError, but you will only get as much as the sliced item actually has. For instance, [1,2,3][:10] would yield [1,2,3]. This allows you to make slices without worrying too much about constraining the boundaries of the slice to the thing you're slicing.
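To contrast out-of-bounds slices with out-of-bounds indexes, here is a small sketch. The name short_list and the flag index_failed are made up for the example.

```python
short_list = [1, 2, 3]

print(short_list[:10])    # [1, 2, 3]
print(short_list[5:])     # []
print(short_list[-10:2])  # [1, 2]

# A single out-of-range index, unlike a slice, still raises IndexError.
index_failed = False
try:
    short_list[10]
except IndexError:
    index_failed = True

print(index_failed)       # True
```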
Sorting a Python list
Python provides two ways to sort lists. You can generate a new, sorted list from the old one, or you can sort an existing list in-place. These options have different behaviors and different usage scenarios.
To create a new, sorted list, use the sorted() function on the old list:
new_list = sorted(old_list)
This will sort the contents of the list using Python’s default sorting methods. For strings, the default is lexical order; for numbers, it’s ascending values.
If you want to sort a list in reverse, pass the reverse parameter:
new_list = sorted(old_list, reverse=True)
The other way to sort, in-place sorting, performs the sort operation directly on the original list. To do this, use the list’s .sort() method:
old_list.sort()
.sort() also takes reverse as a parameter, allowing you to sort in reverse.
Note that the contents of the list need to be consistent for sorting to work. For instance, you can’t sort a mix of integers and strings, but you can sort a list that is all integers or all strings. Otherwise you’ll get a TypeError in the sort operation.
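The difference between the two sorting approaches, and the TypeError for mixed contents, can be seen in a short sketch. The variable names other than old_list and new_list are illustrative.

```python
old_list = [3, 1, 2]

new_list = sorted(old_list)                  # new list; old_list is untouched
descending = sorted(old_list, reverse=True)

in_place = [3, 1, 2]
result = in_place.sort()                     # sorts in place and returns None

# Integers and strings can't be compared with each other by default.
error_name = ""
try:
    sorted([2, "1", 3])
except TypeError as err:
    error_name = type(err).__name__

print(old_list, new_list, descending)  # [3, 1, 2] [1, 2, 3] [3, 2, 1]
print(in_place, result)                # [1, 2, 3] None
print(error_name)                      # TypeError
```

Because .sort() returns None, writing something like my_list = my_list.sort() throws the list away; call .sort() on its own line instead.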
Both sorted() and .sort() also take a key parameter. The key parameter lets you provide a function that can be used to perform a custom sorting operation. When the list is sorted, each element is passed to the key function, and the resulting value is used for sorting. For instance, if we had a mix of integers and strings, and we wanted to sort them, we could use key, like this:
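mixed_list = [1, "2", 3, "4", None]
def sort_mixed(item):
    # Return the value to sort by; the item itself is left unchanged.
    try:
        return int(item)
    except (ValueError, TypeError):
        # int("abc") raises ValueError; int(None) raises TypeError.
        return 0
sorted_list = sorted(mixed_list, key=sort_mixed)
print(sorted_list)  # [None, 1, '2', 3, '4']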
Note that this code wouldn’t convert each element of the list into an integer! Rather, it would use the integer value of each item as its sort value. Also note how we use a try/except block to trap any values that don’t translate cleanly into an integer, and return 0 for them by default.
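Built-in functions and methods can serve as key functions too. The sketch below, which uses a made-up words list, sorts the same strings by length and then case-insensitively; in both cases the strings themselves come back unmodified.

```python
words = ["pear", "Fig", "banana", "apple"]

default_order = sorted(words)               # uppercase sorts before lowercase
by_length = sorted(words, key=len)          # sort value is each string's length
by_lowercase = sorted(words, key=str.lower) # sort value is the lowercased string

print(default_order)  # ['Fig', 'apple', 'banana', 'pear']
print(by_length)      # ['Fig', 'pear', 'apple', 'banana']
print(by_lowercase)   # ['apple', 'banana', 'Fig', 'pear']
```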
Multidimensional list objects
Lists are by nature one-dimensional; they store everything in a single, flat row. But since lists can contain any type of object, including other lists, this makes it possible to create multidimensional lists.
Here's an example of a two-dimensional list:
two_dimensional_list = [
    [0, 1, 2],
    [3, 4, 5],
]
The outermost list, the first dimension, has two elements; the inner lists, the second dimension, have three elements each.
If you wanted to access an item in one of the inner lists, you'd use a stacked indexing syntax like this:
two_dimensional_list[0][2]
This would give you the first element in the outer list (the list [0,1,2]) and then the third element from that (the 2).
Note that Python doesn't enforce any kind of dimensionality on objects like this. You could have a list of lists where each sublist is a totally different length, but you'd need to ensure you didn't generate an IndexError by using indexes that didn't match the object in question.
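As a sketch of working with sublists of differing lengths, the example below uses a made-up list named ragged. Looping over each sublist, rather than hard-coding index ranges, is one way to avoid an IndexError.

```python
ragged = [
    [0, 1, 2],
    [3],
    [4, 5],
]

print(ragged[0][2])   # 2

# ragged[1] has only one item, so ragged[1][2] would raise IndexError.
# Iterating over each row sidesteps the differing lengths.
row_lengths = [len(row) for row in ragged]
flattened = [item for row in ragged for item in row]

print(row_lengths)    # [3, 1, 2]
print(flattened)      # [0, 1, 2, 3, 4, 5]
```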
Python lists are not arrays
One important thing to know about lists in Python is that they aren’t “arrays.” Other languages, like C, have one-dimensional or multidimensional constructions called arrays that accept values of a single type. Lists are heterogeneous; they can accept objects of any type.
What’s more, there is a separate array type in Python, provided by the standard library's array module. The Python array is designed to emulate the behavior of an array in C, and it’s meant chiefly to allow Python to work with C arrays. The array type is useful in those cases, but in almost every pure-Python case you’ll want to use lists. For everyday work that would normally use a list, there's no performance advantage to using arrays instead.
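The single-type restriction is easy to see in practice. This sketch uses the standard library's array module; the variable names are illustrative.

```python
from array import array

int_array = array("i", [1, 2, 3])   # "i" is the type code for a signed int
int_array.append(4)                 # fine: 4 is an integer

rejected = False
try:
    int_array.append("five")        # a string is not allowed in an "i" array
except TypeError:
    rejected = True

mixed_list = [1, 2, 3]
mixed_list.append("five")           # a list accepts any object

print(int_array.tolist())  # [1, 2, 3, 4]
print(rejected)            # True
print(mixed_list)          # [1, 2, 3, 'five']
```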
When to use Python lists (and when not to)
So, when are Python lists most useful? A list is best when:
- You need to find things quickly by their position in a collection. Accessing any position in a list takes the same amount of time, so there is no performance penalty for looking up even the millionth item in a list.
- You’re adding and removing to the collection mainly by appending to the end or removing from the end, in the manner of a stack. Again, these operations take the same amount of time regardless of the length of the list.
A Python list is less suitable when:
- You want to find an item in a list but you don’t know its position. You can do this with the .index() method. For instance, you could use list_of_ints.index(1) to find the index of the first occurrence of the number 1 in list_of_ints. Speed should not be an issue if your list is only a few items long, but for lists thousands of items long, it means Python may have to search the entire list. For a scenario like this, use a dictionary, where each item can be found using a key, and where the lookup time will be the same for each value.
- You want to add or remove items from any position but the end. Each time you do this, Python must move every other item after the added or removed item. The longer the list, the greater the performance issue this becomes. Python’s deque object (from the collections module) is a better fit if you want to add or remove objects freely from either the start or the end of the collection.
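The two alternatives mentioned above can be sketched briefly. The names list and the positions dictionary are invented for this example; deque comes from the standard library's collections module.

```python
from collections import deque

names = ["ada", "grace", "linus"]

# list.index() scans the list from the start until it finds a match.
position = names.index("grace")

# A dictionary keyed by name finds the same position without a scan.
positions = {name: i for i, name in enumerate(names)}
same_position = positions["grace"]

# A deque supports adding and removing at both ends.
queue = deque(names)
queue.appendleft("guido")
first = queue.popleft()
queue.append("barbara")
last = queue.pop()

print(position, same_position)  # 1 1
print(first, last)              # guido barbara
print(list(queue))              # ['ada', 'grace', 'linus']
```

The dictionary here assumes each name appears only once; with duplicates, later entries would overwrite earlier ones.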