Getting Started with Python Data Structures in 5 Steps

Introduction to Python Data Structures

When it comes to learning how to program, regardless of the particular programming language you use for this task, you find that there are a few major topics of your newly-chosen discipline into which most of what you are being exposed to could be categorized. A few of these, in general order of grokking, are: syntax (the vocabulary of the language); commands (putting the vocabulary together in useful ways); flow control (how we guide the order of command execution); algorithms (the steps we take to solve specific problems… how did this become such a confounding word?); and, finally, data structures (the virtual storage depots that we use for data manipulation during the execution of algorithms, which are, again, a series of steps).

Essentially, if you want to implement the solution to a problem, by cobbling together a series of commands into the steps of an algorithm, at some point data will need to be processed, and data structures will become essential. Such data structures provide a way to organize and store data efficiently, and are critical for creating fast, modular code that can perform useful functions and scale well. Python, a particular programming language, has a series of built-in data structures of its own.

This tutorial will focus on these four foundational Python data structures:

  • Lists – Ordered, mutable, allows duplicate elements. Useful for storing sequences of data.
  • Tuples – Ordered, immutable, allows duplicate elements. Think of them as immutable lists.
  • Dictionaries – Mutable, mapped by key-value pairs with unique keys; insertion-ordered since Python 3.7. Useful for storing data in a key-value format.
  • Sets – Unordered, mutable, contains unique elements. Useful for membership testing and eliminating duplicates.
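As a quick orientation, here is one small literal for each of the four structures. The variable names and values are purely illustrative and are not used elsewhere in this tutorial.

```python
# One small literal per structure (names and values are illustrative)
primes_list = [2, 3, 5, 7, 7]          # list: ordered, mutable, duplicates allowed
point_tuple = (3, 4)                   # tuple: ordered, immutable
ages_dict = {"Ann": 31, "Raj": 27}     # dict: maps unique keys to values
tags_set = {"python", "data", "data"}  # set: the duplicate "data" collapses

print(type(primes_list).__name__, len(primes_list))  # list 5
print(type(point_tuple).__name__, len(point_tuple))  # tuple 2
print(type(ages_dict).__name__, len(ages_dict))      # dict 2
print(type(tags_set).__name__, len(tags_set))        # set 2
```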

Beyond the fundamental data structures, Python's standard library also provides more advanced structures, such as heaps (the heapq module) and queues (the queue module and collections.deque), which can further enhance your coding prowess. These advanced structures, built upon the foundational ones, enable more complex data handling and are often used in specialized scenarios. But you aren't constrained here; you can use all of the existing structures as a base to implement your own structures, such as linked lists, as well. However, the understanding of lists, tuples, dictionaries, and sets remains paramount, as these are the building blocks for more advanced data structures.
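To make "built upon the foundational ones" concrete, here is a minimal sketch of a priority queue using the standard library's heapq module, which keeps a heap inside an ordinary list. The task names and priorities are made up for illustration.

```python
import heapq

# The heap lives in a plain list; tuples compare by their first item,
# so (priority, task) pairs come out in priority order
todo = []
heapq.heappush(todo, (2, "write tests"))
heapq.heappush(todo, (1, "fix bug"))
heapq.heappush(todo, (3, "update docs"))

priority, task = heapq.heappop(todo)  # the smallest priority is popped first
print(priority, task)  # 1 fix bug
```

Here a list holds the heap and tuples carry each entry, so two of the four core structures are already doing the work.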

This guide aims to provide a clear and concise understanding of these core structures. As you start your Python journey, the following sections will guide you through the essential concepts and practical applications. From creating and manipulating lists to leveraging the unique capabilities of sets, this tutorial will equip you with the skills needed to excel in your coding.

Step 1: Using Lists in Python

What is a List in Python?

A list in Python is an ordered, mutable data type that can store various objects, allowing for duplicate elements. Lists are defined by the use of square brackets [ ], with elements being separated by commas.

For example:

fibs = [0, 1, 1, 2, 3, 5, 8, 13, 21]

Lists are incredibly useful for organizing and storing data sequences.

Creating a List

Lists can contain different data types, like strings, integers, booleans, etc. For example:

mixed_list = [42, "Hello World!", False, 3.14159]

Manipulating a List

Elements in a list can be accessed, added, changed, and removed. For example:

# Access 2nd element (indexing begins at 0)
print(mixed_list[1])

# Append element
mixed_list.append("This is new")

# Change element
mixed_list[0] = 5

# Remove last element
mixed_list.pop()

Useful List Methods

Some handy built-in methods for lists include:

  • sort() – Sorts list in-place
  • append() – Adds element to end of list
  • insert() – Inserts element at index
  • pop() – Removes and returns element at index (the last element by default)
  • remove() – Removes first occurrence of value
  • reverse() – Reverses list in-place
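The methods listed above can be tried on a small list. This is a minimal sketch with a hypothetical letters list; the comments show the list after each call.

```python
letters = ["d", "a", "c"]

letters.insert(1, "b")   # ['d', 'b', 'a', 'c']
letters.sort()           # ['a', 'b', 'c', 'd']
letters.reverse()        # ['d', 'c', 'b', 'a']
letters.remove("c")      # ['d', 'b', 'a']
last = letters.pop()     # removes and returns 'a'

print(letters, last)     # ['d', 'b'] a
```

Note that sort() and reverse() change the list in place and return None, so writing letters = letters.sort() would discard the list.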

Hands-on Example with Lists

# Create shopping cart as a list
cart = ["apples", "oranges", "grapes"]

# Sort the list
cart.sort()

# Add new item
cart.append("blueberries")

# Remove first item
cart.pop(0)

print(cart)

Output:

['grapes', 'oranges', 'blueberries']

Step 2: Understanding Tuples in Python

What Are Tuples?

Tuples are another type of sequence data type in Python, similar to lists. However, unlike lists, tuples are immutable, meaning their elements cannot be altered once created. They are defined by enclosing elements in parentheses ( ).

# Defining a tuple
my_tuple = (1, 2, 3, 4)

When to Use Tuples

Tuples are generally used for collections of items that should not be modified. They are also typically a little lighter than lists (they use less memory and can be quicker to create), which makes them a good fit for read-only data. Some common use-cases include:

  • Storing constants or configuration data
  • Function return values with multiple components
  • Dictionary keys, since tuples are hashable when all of their elements are
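Two of these use-cases can be sketched briefly. The function min_max and the grid dictionary below are hypothetical examples, not part of any library.

```python
# A function that returns two values hands them back as one tuple
def min_max(values):
    return min(values), max(values)

low, high = min_max([4, 9, 1])

# Tuples of hashable items can be used as dictionary keys
grid = {(0, 0): "start", (2, 3): "goal"}

print(low, high)     # 1 9
print(grid[(2, 3)])  # goal
```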

Accessing Tuple Elements

Accessing elements in a tuple is done in a similar manner as accessing list elements. Indexing and slicing work the same way.

# Accessing elements
first_element = my_tuple[0]
sliced_tuple = my_tuple[1:3]

Operations on Tuples

Because tuples are immutable, many list operations like append() or remove() are not applicable. However, you can still perform some operations:

  • Concatenation: Combine tuples using the + operator.
concatenated_tuple = my_tuple + (5, 6)
  • Repetition: Repeat a tuple using the * operator.
repeated_tuple = my_tuple * 2
  • Membership: Check if an element exists in a tuple with the in keyword.
exists = 1 in my_tuple

Tuple Methods

Tuples have fewer built-in methods compared to lists, given their immutable nature. Some useful methods include:

  • count(): Count the occurrences of a particular element.
count_of_ones = my_tuple.count(1)
  • index(): Find the index of the first occurrence of a value.
index_of_first_one = my_tuple.index(1)

Tuple Packing and Unpacking

Tuple packing and unpacking are convenient features in Python:

  • Packing: Assigning multiple values to a single tuple.
packed_tuple = 1, 2, 3
  • Unpacking: Assigning tuple elements to multiple variables.
a, b, c = packed_tuple
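Unpacking has two further forms that come up often: star-unpacking, which gathers leftover items into a list, and the one-line swap. A minimal sketch follows; the names first, rest, x, and y are illustrative.

```python
packed_tuple = 1, 2, 3

# Star-unpacking collects the remaining items into a list
first, *rest = packed_tuple

# Packing and unpacking on one line swaps two variables
x, y = "left", "right"
x, y = y, x

print(first, rest)  # 1 [2, 3]
print(x, y)         # right left
```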

Immutable but Not Strictly

While tuples themselves are immutable, they can contain mutable elements like lists.

# Tuple with mutable list
complex_tuple = (1, 2, [3, 4])

Note that while you can't change the tuple itself, you can modify the mutable elements within it.
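The distinction can be seen directly: the list stored in the tuple accepts changes, while assigning to a tuple position raises TypeError. In this small sketch, the reassigned flag is only there to record which branch ran.

```python
complex_tuple = (1, 2, [3, 4])

# The list inside the tuple can still be changed in place
complex_tuple[2].append(5)

# Rebinding a tuple position is not allowed
try:
    complex_tuple[0] = 99
    reassigned = True
except TypeError:
    reassigned = False

print(complex_tuple, reassigned)  # (1, 2, [3, 4, 5]) False
```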

Step 3: Mastering Dictionaries in Python

What is a Dictionary in Python?

A dictionary in Python is a mutable data type that stores mappings of unique keys to values. Since Python 3.7, dictionaries preserve the order in which keys were inserted. Dictionaries are written with curly braces { } and consist of key-value pairs separated by commas.

For example:

student = {"name": "Michael", "age": 22, "city": "Chicago"}

Dictionaries are useful for storing data in a structured manner and accessing values by keys.

Creating a Dictionary

Dictionary keys must be hashable, which in practice means immutable objects like strings, numbers, or tuples of immutable items. Dictionary values can be any object.
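A short sketch of what the key restriction means in practice: a tuple is accepted as a key, while a list is rejected with TypeError. The locations dictionary and the list_key_ok flag are hypothetical names used only for this demonstration.

```python
# A tuple of immutable items works as a key
locations = {(0, 0): "origin"}

# A list is mutable and unhashable, so using it as a key fails
try:
    locations[[1, 1]] = "diagonal"
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(locations, list_key_ok)  # {(0, 0): 'origin'} False
```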

student = {"name": "Susan", "age": 23}

prices = {"milk": 4.99, "bread": 2.89}

Manipulating a Dictionary

Elements can be accessed, added, changed, and removed via keys.

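# Start from a record that includes a city
student = {"name": "Michael", "age": 22, "city": "Chicago"}

# Access value by key
print(student["name"])

# Add new key-value
student["major"] = "computer science"

# Change value
student["age"] = 25

# Remove key-value
del student["city"]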

Useful Dictionary Methods

Some useful built-in methods include:

  • keys() – Returns a view of the keys
  • values() – Returns a view of the values
  • items() – Returns a view of (key, value) tuples
  • get() – Returns value for key, avoids KeyError
  • pop() – Removes key and returns value
  • update() – Adds multiple key-values
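A brief sketch of these methods on a prices dictionary follows. The egg price and the new bread price are made-up values for illustration.

```python
prices = {"milk": 4.99, "bread": 2.89}

# get() returns a fallback instead of raising KeyError for a missing key
egg_price = prices.get("eggs", 0.0)

# update() merges another mapping in; existing keys are overwritten
prices.update({"bread": 3.09, "eggs": 3.49})

# keys() returns a view; list() takes a snapshot of it
names = list(prices.keys())

# pop() removes the key and hands back its value
removed = prices.pop("milk")

# items() yields (key, value) pairs, handy for loops
for item, price in prices.items():
    print(item, price)
```

Because keys(), values(), and items() return views rather than copies, they reflect later changes to the dictionary; the list() call above freezes the keys as they were at that moment.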

Hands-on Example with Dictionaries

scores = {"Francis": 95, "John": 88, "Daniel": 82}

# Add new score
scores["Zoey"] = 97

# Remove John's score
scores.pop("John")

# Get Daniel's score
print(scores.get("Daniel"))

# Print all student names
print(scores.keys())

Step 4: Exploring Sets in Python

What is a Set in Python?

A set in Python is an unordered, mutable collection of unique, immutable objects. Sets are written with curly braces { } but unlike dictionaries, do not have key-value pairs.

For example:

numbers = {1, 2, 3, 4}

Sets are useful for membership testing, eliminating duplicates, and mathematical operations.

Creating a Set

Sets can be created from a list by passing the list to the set() constructor:

my_list = [1, 2, 3, 3, 4]
my_set = set(my_list)  # {1, 2, 3, 4}

Sets can contain mixed data types like strings, booleans, etc.
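Two details are worth seeing in code when mixing types. Python treats True and 1 as equal, so they count as a single set element, and empty braces create a dictionary rather than a set. The names below are illustrative.

```python
# Strings, numbers, and tuples can share one set
mixed = {"apple", 42, 3.5, (1, 2)}

# True == 1 in Python, so a set treats them as the same element
collapsed = {1, True}

# Empty braces create a dictionary; use set() for an empty set
empty_set = set()
empty_braces = {}

print(len(mixed), len(collapsed))  # 4 1
print(type(empty_set).__name__, type(empty_braces).__name__)  # set dict
```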

Manipulating a Set

Elements can be added and removed from sets.

numbers.add(5)
numbers.remove(1)

Useful Set Operations

Some useful set operations include:

  • union() – Returns union of two sets
  • intersection() – Returns intersection of sets
  • difference() – Returns difference between sets
  • symmetric_difference() – Returns symmetric difference
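The four methods above can be sketched on two small sets. The sets evens and small are hypothetical, and sorted() is used only so the printed order is predictable, since sets themselves have no guaranteed order.

```python
evens = {2, 4, 6}
small = {1, 2, 3, 4}

print(sorted(evens.union(small)))                 # [1, 2, 3, 4, 6]
print(sorted(evens.intersection(small)))          # [2, 4]
print(sorted(evens.difference(small)))            # [6]
print(sorted(evens.symmetric_difference(small)))  # [1, 3, 6]
```

Each method returns a new set and leaves both originals unchanged.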

Hands-on Example with Sets

A = {1, 2, 3, 4}
B = {2, 3, 5, 6}

# Union - combines sets
print(A | B)

# Intersection
print(A & B)

# Difference
print(A - B)

# Symmetric difference
print(A ^ B)

Step 5: Comparing Lists, Tuples, Dictionaries, and Sets

Comparison of Characteristics

The following is a concise comparison of the four Python data structures we referred to in this tutorial.

Structure    Ordered                       Mutable   Duplicate Elements      Use Cases
List         Yes                           Yes       Yes                     Storing sequences
Tuple        Yes                           No        Yes                     Storing immutable sequences
Dictionary   Yes (insertion order, 3.7+)   Yes       Keys: No; Values: Yes   Storing key-value pairs
Set          No                            Yes       No                      Eliminating duplicates, membership testing

When to Use Each Data Structure

Treat this as a soft guideline for which structure to turn to first in a particular situation.

  • Use lists for ordered, sequence-based data. Useful for stacks/queues.
  • Use tuples for ordered, immutable sequences. Useful when you need a fixed collection of elements that should not be changed.
  • Use dictionaries for key-value data. Useful for storing related properties.
  • Use sets for storing unique elements and mathematical operations.
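The stack and queue suggestion for lists can be sketched as follows. One assumption beyond the text: for the queue, this sketch uses collections.deque from the standard library rather than a plain list, because removing from the front of a list with pop(0) has to shift every remaining element.

```python
from collections import deque

# Stack (last in, first out) with a plain list
stack = []
stack.append("first")
stack.append("second")
top = stack.pop()        # "second"

# Queue (first in, first out) with a deque
queue = deque()
queue.append("first")
queue.append("second")
front = queue.popleft()  # "first"

print(top, front)  # second first
```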

Hands-on Example Using All Four Data Structures

Let's have a look at how these structures can all work together in an example that is a little more complex than a one liner.

import random

# Make a list of person names
names = ["John", "Mary", "Bob", "Mary", "Sarah"]

# Make a tuple of additional information (e.g., email)
additional_info = ("john@example.com", "mary@example.com", "bob@example.com", "mary@example.com", "sarah@example.com")

# Make set to remove duplicates
unique_names = set(names)

# Make dictionary of name-age pairs
persons = {}
for name in unique_names:
    persons[name] = random.randint(20, 40)

print(persons)

Output (the ages are random and set order is arbitrary, so your result will differ):

{'John': 34, 'Bob': 29, 'Sarah': 25, 'Mary': 21}

This example utilizes a list for an ordered sequence, a tuple for storing additional immutable information, a set to remove duplicates, and a dictionary to store key-value pairs.
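The example defines additional_info but never reads it. One possible extension, offered as a sketch and not as part of the original example, is to pair each name with its email by position using zip() and build a lookup dictionary (the name emails is hypothetical).

```python
names = ["John", "Mary", "Bob", "Mary", "Sarah"]
additional_info = ("john@example.com", "mary@example.com", "bob@example.com",
                   "mary@example.com", "sarah@example.com")

# zip() pairs items by position; the repeated "Mary" collapses onto one key
emails = dict(zip(names, additional_info))

print(len(emails))     # 4
print(emails["Mary"])  # mary@example.com
```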

Moving Forward

In this comprehensive tutorial, we've taken a deep look at the foundational data structures in Python, including lists, tuples, dictionaries, and sets. These structures form the building blocks of Python programming, providing a framework for data storage, processing, and manipulation. Understanding these structures is essential for writing efficient and scalable code. From manipulating sequences with lists, to organizing data with key-value pairs in dictionaries, and ensuring uniqueness with sets, these essential tools offer immense flexibility in data handling.

As we've seen through code examples, these data structures can be combined in various ways to solve complex problems. By leveraging these data structures, you can open the doors to a wide range of possibilities in data analysis, machine learning, and beyond. Don't hesitate to explore the official Python data structures documentation for more insights.

Happy coding!

Matthew Mayo (@mattmayo13) holds a Master's degree in computer science and a graduate diploma in data mining. As Editor-in-Chief of KDnuggets, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.
