Python Data Types

Introduction

Python provides a rich set of built-in data types for numbers, text, and collections of values. This guide outlines the fundamental types, then shows common operations on sequences, the collections module, text, and time intervals.

Basic Types

Numbers

Integers (int)
Floating-point numbers (float)
Complex numbers (complex)
Boolean values (bool)
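
The four numeric types above can be illustrated with a minimal sketch. The variable names are made up for this example and are not part of the guide.

```python
# Integers have arbitrary precision, so large powers do not overflow.
big = 2 ** 100

# True division always returns a float; floor division keeps ints as ints.
ratio = 7 / 2
quotient = 7 // 2

# Complex numbers carry a real and an imaginary part.
z = complex(3, 4)
magnitude = abs(z)  # distance from the origin

# bool is a subclass of int, so True and False behave like 1 and 0.
flags = [True, False, True]
true_count = sum(flags)

print(type(big).__name__, ratio, quotient, magnitude, true_count)
```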

Strings

String literals
String operations
String formatting
String methods
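
As a rough sketch of the four string topics listed above (literals, operations, formatting, methods), with illustrative variable names only:

```python
# Literals: single and double quotes produce the same kind of str object.
greeting = 'Hello'
target = "World"

# Operations: concatenation, slicing, repetition, and membership tests.
phrase = greeting + ', ' + target
shout = greeting[:2].upper() * 2
has_comma = ',' in phrase

# Formatting: f-strings (Python 3.6+) accept a format spec after the colon.
price = 91.1
label = f'{greeting}: {price:.2f}'

# Methods return new strings because str objects are immutable.
words = phrase.lower().replace(',', '').split()

print(phrase, shout, has_comma, label, words)
```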

Collections

Lists
Tuples
Dictionaries
Sets
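
A short sketch of how the four built-in collections differ. The sample values are illustrative and assume Python 3.7 or later for ordered dictionaries.

```python
# List: ordered and mutable.
shares = [50, 75, 100]
shares.append(25)

# Tuple: ordered and immutable, handy for fixed-size records.
point = (4, 5)
x, y = point

# Dictionary: maps keys to values and preserves insertion order.
prices = {'ACME': 45.23, 'AAPL': 612.78}
prices['IBM'] = 205.55

# Set: unordered, keeps only unique elements.
symbols = {'ACME', 'AAPL', 'ACME'}
common = symbols & set(prices)  # intersection with the dictionary's keys

print(shares, point, list(prices), sorted(symbols), sorted(common))
```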

Advanced Data Structures

Collections Module

deque (double-ended queue)
Counter
OrderedDict
defaultdict
ChainMap
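
The sketch below touches Counter, defaultdict, OrderedDict, and ChainMap from the list above (deque has its own section later). The word list and settings dictionaries are hypothetical sample data.

```python
from collections import ChainMap, Counter, OrderedDict, defaultdict

words = ['look', 'into', 'my', 'eyes', 'look', 'my', 'look']

# Counter tallies hashable items.
word_counts = Counter(words)
top_two = word_counts.most_common(2)

# defaultdict creates a missing value on first access (here, an empty list).
by_length = defaultdict(list)
for w in words:
    by_length[len(w)].append(w)

# OrderedDict adds order-manipulation methods such as move_to_end().
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end('a')

# ChainMap looks keys up in each mapping in turn; the first hit wins.
defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}
settings = ChainMap(overrides, defaults)

print(top_two, dict(by_length), list(od), settings['user'], settings['color'])
```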

Working with Sequences

Unpacking

# Basic unpacking
p = (4, 5)
x, y = p

# Using _ as a throwaway name for values you do not need
data = ['ACME', 50, 91.1, (2012, 12, 21)]
name, shares, price, date = data
_, shares, _, date = data

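# Unpacking an arbitrary number of elements with a starred name
record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
name, email, *phone_numbers = record
# phone_numbers is always a list: ['773-555-1212', '847-555-1212']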

# String splitting and unpacking
line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
uname, *fields, homedir, sh = line.split(':')
# uname == 'nobody', homedir == '/var/empty', sh == '/usr/bin/false'

Sorting and Grouping

# Sorting a list of dictionaries
rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]

from operator import itemgetter

# Sort by single key
rows_by_fname = sorted(rows, key=itemgetter('fname'))
rows_by_uid = sorted(rows, key=itemgetter('uid'))

# Sort by multiple keys
rows_by_lfname = sorted(rows, key=itemgetter('lname', 'fname'))

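# Grouping records by last name
# groupby() only groups consecutive items, so sort by the same key first
from itertools import groupby
rows_by_lname = sorted(rows, key=itemgetter('lname'))
for lname, items in groupby(rows_by_lname, key=itemgetter('lname')):
    print(lname)
    for i in items:
        print('    ', i)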

Working with Collections

Deque

from collections import deque

# Create a deque with maximum length
q = deque(maxlen=3)
q.append(1)
q.append(2)
q.append(3)
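q.append(4)  # Deque is full, so the oldest item (1) is dropped from the left
print(q)  # deque([2, 3, 4], maxlen=3)

# Append to left
q.appendleft(4)  # Deque is full, so the rightmost item is dropped
print(q)  # deque([4, 2, 3], maxlen=3)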

Finding Largest/Smallest Items

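import heapq
# Illustrative sample records; any list of dicts with a 'price' key works
items = [
    {'name': 'IBM', 'price': 91.1},
    {'name': 'AAPL', 'price': 543.22},
    {'name': 'FB', 'price': 21.09},
    {'name': 'ACME', 'price': 115.65},
]

# Find N smallest items
smallest = heapq.nsmallest(3, items, key=lambda s: s['price'])

# Find N largest items
largest = heapq.nlargest(3, items, key=lambda s: s['price'])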
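
nsmallest() and nlargest() suit one-off queries for a few items. When the smallest item is needed repeatedly, one option is to heapify a list and pop from it. This is a minimal sketch with made-up numbers:

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

# Copy first so the original list keeps its order, then heapify in place.
heap = list(nums)
heapq.heapify(heap)

# heappop() always removes and returns the smallest remaining item.
first_three = [heapq.heappop(heap) for _ in range(3)]

print(first_three)
```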

Data Manipulation

Filtering and Subsetting

# Filtering one list using values from a parallel list
from itertools import compress
addresses = ['a', 'b', 'c', 'd']
counts = [0, 3, 10, 4]
more5 = [n > 5 for n in counts]  # Boolean selector: [False, False, True, False]
list(compress(addresses, more5))  # ['c']

# Dictionary subsetting: keep only entries with a value above 200
prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55}
p1 = {key: value for key, value in prices.items() if value > 200}
# p1 == {'AAPL': 612.78, 'IBM': 205.55}

Text Processing

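# String alignment (each call returns a new 20-character string)
text = 'Hello World'
text.ljust(20)  # Left justify: pads with spaces on the right
text.rjust(20)  # Right justify: pads with spaces on the left
text.center(20)  # Center: pads with spaces on both sides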
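
The same alignment can also be written with the built-in format() function and the <, >, and ^ alignment codes, which accept an optional fill character. A small sketch:

```python
text = 'Hello World'

# '<' left-aligns, '>' right-aligns, '^' centers within the given width.
left = format(text, '<20')
right = format(text, '>20')

# A fill character may precede the alignment code.
starred = format(text, '*^20')

print(repr(left), repr(right), repr(starred))
```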

Date and Time

Working with Time Objects

from datetime import timedelta

# Create time deltas
a = timedelta(days=2, hours=6)
b = timedelta(hours=4.5)
c = a + b

# Access components
print(c.days)  # 2
print(c.seconds)  # 37800 (the 10.5 leftover hours; whole days are excluded)
print(c.total_seconds())  # 210600.0 (the whole interval, days included)
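
Time deltas are most often combined with datetime objects. A minimal sketch, with dates chosen only for illustration:

```python
from datetime import datetime, timedelta

start = datetime(2012, 9, 23)

# Adding a timedelta to a datetime gives a new datetime.
later = start + timedelta(days=10)

# Subtracting two datetimes gives a timedelta.
gap = datetime(2012, 12, 21) - start

print(later, gap.days)
```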

Best Practices

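Choose the data type that matches how the data is accessed: lists for ordered sequences, tuples for fixed records, dictionaries for keyed lookup, sets for membership tests
Use list comprehensions for simple transformations and filters
Prefer built-in functions and methods (sorted, sum, min, max, string methods) over hand-written loops
Consider memory usage for large datasets; generator expressions avoid building intermediate lists
Use the data structure whose performance fits the task, such as deque for queue operations or heapq for repeated smallest-item access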
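
Two of the practices above, comprehensions and memory usage, can be compared in a rough sketch. The names and the range size are arbitrary, and sys.getsizeof() reports only the container's own size, not the objects it refers to.

```python
import sys

nums = range(1000)

# A list comprehension builds the whole result in memory.
squares_list = [n * n for n in nums]

# A generator expression produces values one at a time on demand.
squares_gen = (n * n for n in nums)

# Built-ins such as sum() accept a generator expression directly.
total = sum(n * n for n in nums)

print(sys.getsizeof(squares_list), sys.getsizeof(squares_gen), total)
```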