Data structures

List, dictionary, tuple, set, and more.

List

l = [3, 2, -1, 6, 12, 45, 5, 23]
l.sort()
print(l)

[-1, 2, 3, 5, 6, 12, 23, 45]
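Note that sort() reorders the list in place and returns None. When the original order still matters, the built-in sorted() returns a new list instead. A minimal sketch of the difference, reusing the values from above (the variable names are only illustrative):

```python
values = [3, 2, -1, 6, 12, 45, 5, 23]

# sorted() builds a new list and leaves the original untouched
ascending = sorted(values)
descending = sorted(values, reverse=True)

print(values)      # [3, 2, -1, 6, 12, 45, 5, 23]
print(ascending)   # [-1, 2, 3, 5, 6, 12, 23, 45]
print(descending)  # [45, 23, 12, 6, 5, 3, 2, -1]

# list.sort() mutates the list and returns None
result = values.sort()
print(result)      # None
print(values)      # [-1, 2, 3, 5, 6, 12, 23, 45]
```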

We can get all supported methods and attributes for a list by using the dir function:

print(dir(l))

['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
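Most of the names in that output are special ("dunder") methods. As a small sketch, the list can be narrowed to the public methods by filtering out names that start with an underscore (public_methods is an illustrative name):

```python
# Keep only the names that do not start with an underscore
public_methods = [name for name in dir(list) if not name.startswith("_")]
print(public_methods)
# ['append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

# Every method carries a docstring that describes what it does
print(list.append.__doc__)
```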

Creating and manipulating a list

def main() -> None:
    # Creating and manipulating a list of integers
    numbers = [1, 2, 3, 4, 5]
    print("Original List:", numbers)

    # Lists are mutable: elements can be changed
    numbers[0] = 10  # Changing the first element
    print(f"List after changing first element: {numbers}")

    # Accessing elements using their position
    second_element = numbers[1]
    print(f"second_element: {second_element}")

    # Slicing
    first_three = numbers[:3]
    print(f"First three elements: {first_three}")

    # Slicing with a step
    every_other = numbers[::2]
    print(f"Every other element: {every_other}")

    # Reversing a list
    reversed_list = numbers[::-1]
    print(f"Reversed List: {reversed_list}")

    # Slicing from a start position up to (but not including) an end position
    middle_three = numbers[1:4]
    print(f"Middle three elements: {middle_three}")

    # Accessing the last element
    last_element = numbers[-1]
    print(f"Last element: {last_element}")

    # Memory management:
    # Python lists automatically resize as items are added or removed
    numbers.append(6)
    print(f"List after appending an element: {numbers}")

    # Lists can contain objects of different types (not recommended)
    mixed_list = [1, "Hello", 3.14, [2, 4, 6]]
    print("Mixed Type List:", mixed_list)


if __name__ == "__main__":
    main()

Original List: [1, 2, 3, 4, 5]
List after changing first element: [10, 2, 3, 4, 5]
second_element: 2
First three elements: [10, 2, 3]
Every other element: [10, 3, 5]
Reversed List: [5, 4, 3, 2, 10]
Middle three elements: [2, 3, 4]
Last element: 5
List after appending an element: [10, 2, 3, 4, 5, 6]
Mixed Type List: [1, 'Hello', 3.14, [2, 4, 6]]
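The automatic resizing mentioned in the code comments can be observed indirectly. The sketch below assumes CPython, where sys.getsizeof reports the size of the list object including its spare slots; the exact numbers are an implementation detail and vary between versions.

```python
import sys

items: list[int] = []
sizes: list[int] = []

for i in range(32):
    items.append(i)
    # getsizeof reports the list object itself, including spare slots
    sizes.append(sys.getsizeof(items))

# The size changes only on some appends: the list grows in larger steps
resize_count = len(set(sizes))
print(f"{len(items)} appends, {resize_count} distinct sizes")
print(sizes)
```

Because the list reserves extra room each time it grows, most appends do not need to allocate new memory.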

Array vs List

An array is a more basic and less flexible alternative to a list. A NumPy array has a fixed size and a single element type, and it stores its elements in one contiguous block of memory. This layout is what makes bulk (vectorized) operations over the elements much faster than the equivalent loop over a list.

import numpy as np


def main() -> None:
    a = np.array([1, 2, 3])
    b = np.array([4, 5, 6])

    c = np.append(a, 4)

    print(f"a: {a}, c: {c}")
    print(f"id: a -> {id(a)}, id: c -> {id(c)}")  # (1)

    print(f"a + b = {a + b}")
    print(f"a * b = {a * b}")

    data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

    print(f"Mean: {np.mean(data)}")
    print(f"median: {np.median(data)}")
    print(f"std: {np.std(data)}")


if __name__ == "__main__":
    main()

1: The id function returns the identity of an object, which in CPython is its memory address. Here a and c have different ids, so c is a new object in memory: np.append does not modify a but returns a new array containing a copy of the data. NumPy arrays are not designed for element-by-element appending; when a collection has to grow this way, a list is the better choice.

a: [1 2 3], c: [1 2 3 4]
id: a -> 4466368720, id: c -> 4466368336
a + b = [5 7 9]
a * b = [ 4 10 18]
Mean: 5.5
median: 5.5
std: 2.8722813232690143
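For contrast with np.append, a plain list grows in place. The following sketch uses only built-in lists (names are illustrative): append keeps the same object, whereas concatenation with + builds a new one, much like np.append does for arrays.

```python
a = [1, 2, 3]
id_before = id(a)

# append mutates the existing list, so its identity is unchanged
a.append(4)
same_object = id(a) == id_before

# + creates a new list and leaves the operands untouched
c = a + [5]

print(f"a: {a}, c: {c}")  # a: [1, 2, 3, 4], c: [1, 2, 3, 4, 5]
print(f"append kept the same object: {same_object}")  # True
print(f"c is a: {c is a}")  # False
```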

Dictionary

Lists are not optimized for search operations. To find an element in a list, Python typically scans the elements sequentially, resulting in a time complexity of \(O(n)\) in the worst case.

If frequent lookups are required, a dictionary is usually a better choice. A dictionary stores key–value pairs and uses a hash table internally, allowing keys to be mapped to positions in the table using a hash function. This enables average lookup time of \(O(1)\).

Each key in a dictionary must be unique and hashable. The hash value determines the bucket where the key–value pair is stored, which allows Python to locate elements much faster than scanning a list.
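A short sketch of what "hashable" means in practice (the coordinates dictionary is a made-up example): immutable built-ins such as tuples work as keys, while a mutable list is rejected.

```python
# Immutable built-ins such as str, int, and tuple are hashable
coordinates: dict[tuple[int, int], str] = {(0, 0): "origin", (1, 0): "east"}
print(coordinates[(1, 0)])  # east

# Equal keys have equal hashes, which is what makes the lookup work
print(hash((1, 0)) == hash((1, 0)))  # True

# A list is mutable and therefore unhashable, so it is rejected as a key
message = ""
try:
    bad = {[0, 0]: "origin"}
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")
```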

The trade-off is that dictionaries generally require more memory than lists because they store additional information for hashing and collision handling.

import time

def main() -> None:
    n = 100_000_000  # large on purpose; building both containers needs several GB of RAM

    # create data
    data_list = list(range(n))
    data_dict = {i: True for i in range(n)}

    target = n - 1

    # list search
    start = time.perf_counter()
    found = target in data_list
    end = time.perf_counter()
    print(f"List search: {end - start:.6f} seconds")

    # dictionary search
    start = time.perf_counter()
    found = target in data_dict
    end = time.perf_counter()
    print(f"Dictionary lookup: {end - start:.6f} seconds")


if __name__ == "__main__":
    main()

List search: 2.324399 seconds
Dictionary lookup: 0.000012 seconds

However, if you frequently need to perform operations on the elements themselves, opt for a list or np.array over a dictionary.

Dictionary (dict) → optimized for fast key lookup, \(O(1)\) on average.
List / np.array → better when you process many elements sequentially or perform operations on each element.

Dictionaries have higher memory overhead and are not designed for bulk numerical operations.
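The memory overhead can be checked with a rough measurement. This sketch assumes CPython and an arbitrarily chosen size n; sys.getsizeof counts only the container itself, not the objects stored in it, so the numbers are an indication rather than a full accounting.

```python
import sys

n = 1_000  # illustrative size, chosen arbitrarily

data_list = list(range(n))
data_dict = {i: True for i in range(n)}

# getsizeof measures only the container, not the objects it refers to
list_bytes = sys.getsizeof(data_list)
dict_bytes = sys.getsizeof(data_dict)

print(f"list: {list_bytes} bytes")
print(f"dict: {dict_bytes} bytes")
print(f"dict / list: {dict_bytes / list_bytes:.1f}x")
```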

Basic operations

def main() -> None:
    # Creating and manipulating a dictionary
    person: dict[str | int, str] = {
        "name": "Arjan",
        "profession": "Developer",
        "city": "Utrecht",
    }
    print("Original Dictionary:", person)

    # Dictionaries are mutable: values can be changed based on their keys
    person["name"] = "Jane"  # Changing the value associated with the key 'name'
    print(f"Dictionary after changing 'name': {person}")

    # Accessing elements using their keys
    profession = person["profession"]
    print(f"profession: {profession}")

    # Assigning to an existing key overwrites its value
    # (assigning to a key that does not exist yet would add a new pair)
    person["profession"] = "YouTuber"
    print(f"Dictionary after updating 'profession': {person}")

    # Removing a key-value pair
    del person["city"]
    print(f"Dictionary after removing 'city': {person}")

    # Keys and values can be of different types
    person[1] = "One"
    print("Mixed Type Dictionary:", person)

    # Memory management:
    # Python dictionaries resize and rehash automatically as items are added
    person["hobby"] = "Photography"
    print(f"Dictionary after adding a new hobby: {person}")

    # Getting views of all keys and values (wrap them in list() to get lists)
    print(f"Keys: {person.keys()}")
    print(f"Values: {person.values()}")


if __name__ == "__main__":
    main()
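The keys() and values() calls in the example above return view objects rather than lists. A small sketch of the difference, using a shortened version of the same dictionary: a view keeps tracking the dictionary, while list(...) takes a snapshot.

```python
person = {"name": "Jane", "profession": "YouTuber"}

keys_view = person.keys()
keys_snapshot = list(person)  # a real list, copied at this moment

person["hobby"] = "Photography"

# The view reflects the new key; the snapshot does not
print(keys_view)      # dict_keys(['name', 'profession', 'hobby'])
print(keys_snapshot)  # ['name', 'profession']

# items() yields (key, value) pairs, convenient for iteration
for key, value in person.items():
    print(f"{key}: {value}")
```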

Enum

Another commonly used type is the enum, which represents a fixed, limited set of options (months in a year, status codes for a request, etc.).

from enum import IntEnum, StrEnum, auto


class HTTPStatus(IntEnum):  # (1)
    OK = 200
    CREATED = 201
    ACCEPTED = 202
    NO_CONTENT = 204
    BAD_REQUEST = 400
    UNAUTHORIZED = 401
    FORBIDDEN = 403
    NOT_FOUND = 404
    INTERNAL_SERVER_ERROR = 500
    NOT_IMPLEMENTED = 501

class RESTMethod(StrEnum):
    GET = "GET"
    POST = "POST"
    PUT = auto()  # with StrEnum, auto() yields the lowercased member name: "put"
    DELETE = auto()  # "delete"

for status in HTTPStatus:
    print(f"{status.name}: {status.value}")


print(f"Name of the enum member: {HTTPStatus.OK.name}")
print(f"Enum member from value: {HTTPStatus(404)}")
print(f"value of the enum member: {HTTPStatus.NOT_FOUND.value}")

def response_description(status: HTTPStatus) -> str:
    if status == HTTPStatus.OK:
        return "Request succeeded"
    elif status == HTTPStatus.NOT_FOUND:
        return "Resource not found"
    else:
        return "Error occurred"


def main() -> None:
    description = response_description(HTTPStatus.OK)
    print(description)

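    # IntEnum members compare like integers, so range checks work
    client_errors = (HTTPStatus.BAD_REQUEST, HTTPStatus.NOT_FOUND)
    if all(400 <= code < 500 for code in client_errors):
        print("Both are client error responses")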

    if HTTPStatus.INTERNAL_SERVER_ERROR in HTTPStatus:
        print("500 Internal Server Error is a valid HTTP status code.")


if __name__ == "__main__":
    main()

1: In an enum, values do not need to be unique, but names must be. A member that repeats an earlier value becomes an alias of the first member with that value.

OK: 200
CREATED: 201
ACCEPTED: 202
NO_CONTENT: 204
BAD_REQUEST: 400
UNAUTHORIZED: 401
FORBIDDEN: 403
NOT_FOUND: 404
INTERNAL_SERVER_ERROR: 500
NOT_IMPLEMENTED: 501
Name of the enum member: OK
Enum member from value: 404
value of the enum member: 404
Request succeeded
Both are client error responses
500 Internal Server Error is a valid HTTP status code.
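The note about non-unique values can be illustrated with a small sketch. Color and Strict are made-up enums for this example and are not part of the code above; the unique decorator from the enum module is one way to forbid repeated values.

```python
from enum import Enum, unique


class Color(Enum):
    RED = 1
    CRIMSON = 1  # same value as RED, so this is an alias
    GREEN = 2


print(Color.CRIMSON is Color.RED)         # True
print([member.name for member in Color])  # ['RED', 'GREEN']
print(Color(1).name)                      # RED

# @unique turns a repeated value into an error at class creation
duplicate_rejected = False
try:
    @unique
    class Strict(Enum):
        A = 1
        B = 1
except ValueError as error:
    duplicate_rejected = True
    print(f"ValueError: {error}")
```

Iterating over an enum skips aliases, which is why only RED and GREEN appear in the list.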