CPython Internals
Understanding Python implementation details.
What You'll Learn
- How CPython represents objects internally
- Integer and string interning optimizations
- Bytecode compilation and the dis module
- The Global Interpreter Lock (GIL)
- Stack frames and code objects
Everything is an Object
In CPython, everything is a PyObject: integers, functions, even types.
# The type hierarchy
print(type(42)) # <class 'int'>
print(type(type(42))) # <class 'type'>
print(type(type)) # <class 'type'>
# type is its own metaclass!
print(type.__class__) # <class 'type'>
# Even None is an object
print(type(None)) # <class 'NoneType'>
# id() returns memory address in CPython
a = [1, 2, 3]
print(id(a)) # Memory address
print(hex(id(a))) # As hexadecimal
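Each object's header begins with a reference count, which `sys.getrefcount` exposes; a small sketch (note the reported count includes the temporary reference created by the call itself):

```python
import sys

a = [1, 2, 3]
before = sys.getrefcount(a)  # typically 2: 'a' plus the call's argument
b = a                        # binding another name adds a reference
after = sys.getrefcount(a)
print(after - before)        # 1
```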
# Objects have header: refcount + type pointer
import sys
print(sys.getsizeof(1)) # 28 bytes for int!
print(sys.getsizeof([])) # 56 bytes for empty list

Integer Caching (Small Integer Pool)
CPython pre-allocates integers from -5 to 256 for performance:
# Small integers are cached
a = 256
b = 256
print(a is b) # True - same object
a = 257
b = 257
print(a is b) # False in the REPL; may be True in a script (equal constants shared per code object)
# Within one compilation unit, equal constants may be shared
x = 1000
y = 1000
print(x is y) # May be True (compile-time optimization)
# But large ints constructed at runtime are distinct objects
def get_num(s):
    return int(s)  # built at runtime, not a shared constant
print(get_num("1000") is get_num("1000")) # False
# Check the cache range (build ints at runtime to avoid
# compile-time constant sharing and SyntaxWarnings)
print(int("-5") is int("-6") + 1) # True - both in the cache
print(int("256") is int("255") + 1) # True - both in the cache
print(int("257") is int("256") + 1) # False - outside the cache

String Interning
Short, identifier-like strings are automatically interned:
# Interned (looks like identifier)
a = "hello"
b = "hello"
print(a is b) # True
# Not interned (has spaces/special chars)
a = "hello world!"
b = "hello world!"
print(a is b) # False in the REPL; in a script, equal constants may still be shared
# Force interning
import sys
a = sys.intern("hello world!")
b = sys.intern("hello world!")
print(a is b) # True
# Useful for dictionary keys that repeat often
keys = [sys.intern("user_id") for _ in range(10000)]
# All reference the same string object
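A sketch of forcing interning on strings the compiler cannot share (the helper `make` is hypothetical, used only to build the string at runtime):

```python
import sys

def make():
    # Build the string at runtime so the compiler cannot share it
    return "".join(["user", " ", "id!"])  # not identifier-like

a = sys.intern(make())
b = sys.intern(make())
c = make()
print(a is b)  # True  - interning unified them into one object
print(a is c)  # False - c was never interned
print(a == c)  # True  - still equal by value
```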
# What gets auto-interned:
# - Single characters
# - Strings that look like identifiers (a-z, A-Z, 0-9, _)
# - String literals at compile time

Bytecode and the dis Module
Python source is compiled to bytecode, then interpreted:
import dis
def add(a, b):
    c = a + b
    return c
# Disassemble the function
dis.dis(add)
# Output (CPython 3.10; opcode names and layout differ in 3.11+):
#   2       0 LOAD_FAST      0 (a)
#           2 LOAD_FAST      1 (b)
#           4 BINARY_ADD
#           6 STORE_FAST     2 (c)
#   3       8 LOAD_FAST      2 (c)
#          10 RETURN_VALUE
# Access the code object
code = add.__code__
print(code.co_code) # Raw bytecode bytes
print(code.co_varnames) # ('a', 'b', 'c')
print(code.co_consts) # (None,) - constants
print(code.co_names) # () - global names
print(code.co_stacksize) # Stack depth needed
# Compile to code object
source = "x = 1 + 2"
compiled = compile(source, "<string>", "exec")
dis.dis(compiled)

The Global Interpreter Lock (GIL)
The GIL is a mutex that allows only one thread to execute Python bytecode at a time.
import threading
import time
counter = 0
def increment():
    global counter
    for _ in range(1000000):
        counter += 1 # Not atomic!
# Even with threads, only one runs Python bytecode at a time
threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter) # Usually less than 4000000 due to lost updates
# GIL doesn't prevent race conditions on shared data!
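A minimal fix is to serialize the read-modify-write with a lock; a sketch (the name `safe_increment` is illustrative):

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100_000):
        with lock:       # serialize the read-modify-write sequence
            counter += 1

threads = [threading.Thread(target=safe_increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 - exact, regardless of thread switches
```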
# GIL is released during:
# - I/O operations (file, network)
# - time.sleep()
# - C extension code (NumPy, etc.)
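The GIL release during blocking calls can be observed with a short timing sketch: four threads sleeping concurrently take about as long as one.

```python
import threading
import time

# time.sleep releases the GIL, so sleeping threads overlap
start = time.perf_counter()
threads = [threading.Thread(target=time.sleep, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(elapsed < 0.6)  # True: roughly 0.2s total, not 4 * 0.2s
```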
# Workarounds:
# 1. multiprocessing - separate processes, no shared GIL
from multiprocessing import Pool
# 2. C extensions that release GIL
import numpy as np # NumPy releases GIL during computation
# 3. asyncio for I/O-bound concurrency
import asyncio

Stack Frames
Each function call creates a frame object:
import sys
import traceback
def outer():
    x = 10
    y = 20
    inner()

def inner():
    z = 30
    # Get current frame
    frame = sys._getframe()
    print(f"Current function: {frame.f_code.co_name}")
    print(f"Local vars: {frame.f_locals}")
    # Get caller's frame
    caller = sys._getframe(1)
    print(f"Caller: {caller.f_code.co_name}")
    print(f"Caller locals: {caller.f_locals}")
    # Walk the entire call stack
    current = frame
    while current:
        print(f"  {current.f_code.co_name}:{current.f_lineno}")
        current = current.f_back
outer()
# Current function: inner
# Local vars: {'z': 30, ...}
# Caller: outer
# Caller locals: {'x': 10, 'y': 20}
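The same information is available through the higher-level inspect module; a sketch (the functions `who_called_me` and `caller` are illustrative names):

```python
import inspect

def who_called_me():
    info = inspect.stack()[1]  # FrameInfo for the caller's frame
    return info.function

def caller():
    return who_called_me()

print(caller())  # caller
```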
# Frame attributes:
# f_back - previous frame
# f_code - code object
# f_locals - local variables dict
# f_globals - global variables dict
# f_lineno - current line number

Optimization Peepholes
CPython applies various optimizations:
import dis
# Constant folding
def calc():
    return 2 * 3 * 4
dis.dis(calc)
# LOAD_CONST 24 - computed at compile time! (exact opcodes vary by version)
# Dead code elimination (limited)
def always_true():
    if True:     # the test itself is folded away at compile time
        return 1
    return 2     # unreachable; removed entirely in recent CPython versions
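These optimizations can be observed directly in a function's constants table; a sketch (CPython-specific, and exact `co_consts` contents vary by version):

```python
def folded():
    return 2 * 3 * 4

def member(x):
    return x in {'a', 'b', 'c'}

# Constant folding: the product 24 sits directly in the constants table
print(24 in folded.__code__.co_consts)  # True

# The set literal is stored as an immutable frozenset constant
print(any(isinstance(c, frozenset) for c in member.__code__.co_consts))  # True
```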
# Membership test optimization
'a' in ['a', 'b', 'c'] # List literal compiled to a constant tuple
'a' in {'a', 'b', 'c'} # Set literal compiled to a constant frozenset

Interview Tip
When asked about CPython internals:
- id() returns memory address; small ints (-5 to 256) are cached
- String interning optimizes identifier-like strings
- Python compiles to bytecode, then interprets
- GIL allows one thread to run bytecode at a time
- Frame objects track call stack and local variables
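The points above can be sanity-checked with a short script (CPython-specific behavior; other interpreters such as PyPy may differ):

```python
import sys

# Small-int cache: -5 through 256 are singletons, 257 is not
assert int("100") is int("100")
assert int("257") is not int("257")

# Interning unifies equal strings on demand
assert sys.intern("a b!") is sys.intern("a b!")

# Every function carries compiled bytecode in its code object
def f():
    return 1
assert isinstance(f.__code__.co_code, bytes)

print("all checks passed")
```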