
Generators in Python

Creating iterators with yield.

What You'll Learn

  • What generators are and how they differ from regular functions
  • Using yield to create generator functions
  • Generator expressions for concise syntax
  • Memory efficiency benefits
  • Advanced generator methods: send, throw, close

Understanding Generators

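A generator function produces a sequence of values lazily: one at a time, on demand. Calling it returns a generator object rather than a finished result. Where a regular function computes its whole result before returning, a generator pauses at each yield and resumes on the next request, so it never has to hold every value in memory at once.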

Key differences from regular functions:

  • Use yield to hand back values (a return, if present, ends the generator)
  • Keep their local state between calls to next()
  • Return an iterator (works with for loops and next())
  • Produce values lazily (on demand)
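
The differences listed above can be seen in a small sketch. The function name countdown is hypothetical, chosen only for illustration.

```python
def countdown(n):
    """Yield n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

# Calling the function runs none of its body; it only builds a generator object.
gen = countdown(3)

# A generator is its own iterator, which is why for loops and next() accept it.
is_own_iterator = iter(gen) is gen

first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # resumes after that yield, with n remembered
```

Here first holds the value produced before the pause, and rest picks up from the saved state instead of starting over.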

Generator Functions

code.pyPython
# Regular function - creates entire list in memory
def get_squares_list(n):
    result = []
    for i in range(n):
        result.append(i ** 2)
    return result

# Generator function - yields one value at a time
def get_squares_gen(n):
    for i in range(n):
        yield i ** 2

# Usage
for sq in get_squares_gen(5):
    print(sq)  # 0, 1, 4, 9, 16

# Generator object
gen = get_squares_gen(3)
print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 4
# print(next(gen))  # StopIteration!
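
An exhausted generator signals the end by raising StopIteration. As a minimal sketch, there are two common ways to deal with that when calling next() by hand: catch the exception, or pass a default as the second argument to next(). The variable names below are illustrative only.

```python
def get_squares_gen(n):
    for i in range(n):
        yield i ** 2

gen = get_squares_gen(2)
values = [next(gen), next(gen)]

# Option 1: catch StopIteration explicitly.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: give next() a default, so no exception is raised.
fallback = next(gen, None)
```

A for loop handles StopIteration automatically, so this only matters when driving the generator manually.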

How Generators Work

code.pyPython
def simple_gen():
    print("Start")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")
    yield 3
    print("End")

gen = simple_gen()
print(next(gen))  # Prints "Start", returns 1
print(next(gen))  # Prints "After first yield", returns 2
print(next(gen))  # Prints "After second yield", returns 3
# print(next(gen))  # Prints "End", raises StopIteration
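
The pause-and-resume life cycle can also be observed directly. The sketch below uses inspect.getgeneratorstate from the standard library to record the state before the first next(), while paused at a yield, and after the generator has finished.

```python
import inspect

def simple_gen():
    yield 1
    yield 2

gen = simple_gen()
states = [inspect.getgeneratorstate(gen)]      # body has not started yet

next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the first yield

list(gen)                                      # drain the remaining values
states.append(inspect.getgeneratorstate(gen))  # finished
```

After this runs, states contains GEN_CREATED, GEN_SUSPENDED, and GEN_CLOSED, in that order.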

Generator Expressions

Generator expressions use a concise syntax similar to list comprehensions, with parentheses instead of square brackets:

code.pyPython
# List comprehension - creates list in memory
list_sq = [x**2 for x in range(10)]

# Generator expression - lazy, memory efficient
gen_sq = (x**2 for x in range(10))

# Use generator expression when:
# 1. You only need to iterate once
# 2. Working with large datasets
# 3. Memory is a concern

# Memory comparison
import sys
list_comp = [x for x in range(10000)]
gen_expr = (x for x in range(10000))

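# Note: getsizeof measures only the container, not the integers it refers to.
# Exact figures vary by Python version and platform.
print(sys.getsizeof(list_comp))  # tens of kilobytes (~85 KB on 64-bit CPython), grows with length
print(sys.getsizeof(gen_expr))   # a couple of hundred bytes at most, regardless of length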
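
A generator expression is often passed straight to a function that consumes an iterable. As a small sketch with made-up data: when the expression is the only argument, the extra parentheses can be dropped, and the result can be consumed only once.

```python
numbers = range(1, 6)

# Sole argument: no extra parentheses needed, and no intermediate list is built.
total = sum(x * x for x in numbers)
has_even = any(x % 2 == 0 for x in numbers)

# A generator expression is single-use.
squares = (x * x for x in numbers)
first_pass = list(squares)
second_pass = list(squares)   # already exhausted
```

If the values are needed more than once, a list comprehension (or rebuilding the generator) is the simpler choice.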

Practical Examples

Reading Large Files

code.pyPython
def read_large_file(file_path):
    """Read file line by line without loading entire file."""
    with open(file_path, 'r') as f:
        for line in f:
            yield line.strip()

# Process millions of lines with constant memory
for line in read_large_file("huge_log.txt"):
    if "ERROR" in line:
        print(line)
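
Line-by-line iteration assumes the file has line breaks. For binary data or very long lines, a similar generator can yield fixed-size chunks instead. This is a sketch under that assumption; read_in_chunks and chunk_size are hypothetical names, and the temporary file exists only to make the example self-contained.

```python
import os
import tempfile

def read_in_chunks(file_path, chunk_size=1024):
    """Yield successive chunks of at most chunk_size bytes."""
    with open(file_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:      # empty bytes object means end of file
                break
            yield chunk

# Build a 2500-byte temporary file to read back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2500)
    tmp_path = tmp.name

try:
    chunk_sizes = [len(c) for c in read_in_chunks(tmp_path, chunk_size=1024)]
finally:
    os.remove(tmp_path)
```

Only one chunk is held in memory at a time, so peak memory depends on chunk_size rather than on the file size.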

Infinite Sequences

code.pyPython
def infinite_counter(start=0):
    """Generate infinite sequence of numbers."""
    num = start
    while True:
        yield num
        num += 1

# Use with caution: the consuming loop must supply its own exit condition
for num in infinite_counter():
    if num > 10:
        break
    print(num)
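
The standard library's itertools module already provides an infinite counter, count, plus helpers that bound an infinite stream without a manual break. A brief sketch:

```python
from itertools import count, islice, takewhile

# islice stops after a fixed number of items.
first_five = list(islice(count(10), 5))

# takewhile stops at the first item that fails the condition.
below_50 = list(takewhile(lambda n: n < 50, count(0, 10)))
```

Both helpers are lazy themselves, so the infinite source is only advanced as far as needed.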

Fibonacci Generator

code.pyPython
def fibonacci():
    """Generate infinite Fibonacci sequence."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Get first 10 Fibonacci numbers
from itertools import islice
fib_10 = list(islice(fibonacci(), 10))
print(fib_10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

Pipeline Processing

code.pyPython
def read_data(file):
    # Use a context manager so the file is closed when the generator finishes
    with open(file) as f:
        for line in f:
            yield line

def parse_lines(lines):
    for line in lines:
        yield line.strip().split(',')

def filter_valid(records):
    for record in records:
        if len(record) == 3:
            yield record

# Chain generators - memory efficient pipeline
data = read_data("data.csv")
parsed = parse_lines(data)
valid = filter_valid(parsed)

for record in valid:
    print(record)
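
To see why a chained pipeline stays memory efficient, it helps to trace the order in which the stages run. The sketch below uses two hypothetical stages, source and double, and an events list that exists purely to record that order.

```python
events = []

def source(items):
    for item in items:
        events.append(f"read {item}")
        yield item

def double(values):
    for v in values:
        events.append(f"double {v}")
        yield v * 2

result = list(double(source([1, 2])))
```

The recorded order is read 1, double 1, read 2, double 2: each item passes through every stage before the next item is read, so no stage ever holds the full dataset.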

Advanced: Generator Methods

code.pyPython
def coroutine():
    value = 0
    while True:
        received = yield value
        if received is not None:
            value = received
        else:
            value += 1

gen = coroutine()
print(next(gen))      # 0 (start generator)
print(next(gen))      # 1
print(gen.send(10))   # 10 (send value into generator)
print(next(gen))      # 11

# close() - terminate generator
gen.close()

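# throw() - raise an exception inside the generator at the paused yield
# (call it on a live generator, before close(); pass an exception instance)
# gen.throw(ValueError("Something went wrong"))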
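
How throw() and close() behave depends on what the generator does with the exception. The following sketch assumes a hypothetical worker generator that recovers from ValueError and runs cleanup code when closed; the log list is only there to record what happened.

```python
log = []

def worker():
    try:
        while True:
            try:
                item = yield
                log.append(f"got {item}")
            except ValueError as exc:
                # throw() raises the exception at the paused yield
                log.append(f"recovered from {exc}")
    finally:
        # close() raises GeneratorExit at the paused yield, so this block runs
        log.append("cleanup")

gen = worker()
next(gen)                            # prime: advance to the first yield
gen.send("a")
gen.throw(ValueError("bad input"))   # handled inside; generator keeps running
gen.close()
```

Because the generator catches ValueError and loops back to yield, throw() returns normally; an uncaught exception would instead propagate to the caller.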

yield from (Delegation)

code.pyPython
def sub_generator():
    yield 1
    yield 2
    yield 3

def main_generator():
    yield 'a'
    yield from sub_generator()  # Delegate to sub-generator
    yield 'b'

list(main_generator())  # ['a', 1, 2, 3, 'b']
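
Besides forwarding yielded values, yield from also evaluates to the sub-generator's return value. A minimal sketch, reusing the names above with a modified body:

```python
def sub_generator():
    yield 1
    yield 2
    return "sub done"   # carried out via StopIteration, not yielded

def main_generator():
    result = yield from sub_generator()   # receives the return value
    yield result

items = list(main_generator())
```

The caller sees 1 and 2 from the sub-generator, followed by the string that main_generator chose to yield after delegation finished.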

Interview Tip

When asked about generators:

  1. yield pauses the function and hands back a value; return ends it
  2. Memory efficient: values are produced on demand
  3. Generator expressions: (x for x in range(n))
  4. Use for large datasets, streaming, and infinite sequences
  5. A generator can be iterated only once; call the function again for a fresh one
  6. Know send(), close(), and yield from
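
A common follow-up to the single-use point is how to get something reusable. One option, sketched here with a hypothetical Squares class, is to write __iter__ as a generator method so that every loop gets a fresh generator.

```python
class Squares:
    """Reusable iterable: each call to __iter__ creates a new generator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for i in range(self.n):
            yield i ** 2

squares = Squares(3)
first = list(squares)
second = list(squares)   # works again, unlike a bare generator
```

The object itself stores only n, so it stays cheap while still supporting repeated iteration.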