JSON in Python

Learning Focus

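Python ships with built-in JSON support in the json module. This page covers parsing and serializing with the standard library, streaming large files with ijson, and validating documents with jsonschema, so you can handle JSON safely in production code.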

Parse and Serialize

import json

# Parse string → dict
data = json.loads('{"name": "Alice", "age": 30, "active": true}')
print(data["name"]) # Alice
print(type(data["age"])) # <class 'int'>

# Parse file → dict
with open("data.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Serialize dict → string
payload = {"name": "Bob", "scores": [95, 87], "active": True, "notes": None}
print(json.dumps(payload, indent=2, sort_keys=True))

# Serialize dict → file
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(payload, f, indent=2, ensure_ascii=False)
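
Input you do not control can be malformed, and json.loads raises json.JSONDecodeError when it is. The snippet below is a minimal sketch, assuming you prefer a fallback value over an exception; parse_or_default is a hypothetical helper name, not part of the json module.

```python
import json


def parse_or_default(text, default=None):
    """Hypothetical helper: return parsed JSON, or `default` if text is malformed."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.lineno and exc.colno give the position where parsing failed
        print(f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}")
        return default


good = parse_or_default('{"name": "Alice"}')
bad = parse_or_default('{"name": "Alice",}', default={})  # trailing comma is not valid JSON
```

Whether to swallow the error or let it propagate depends on the caller; for required configuration files, failing loudly is often the safer choice.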

Type Mapping

| Python | JSON |
|---|---|
| dict | object {} |
| list, tuple | array [] |
| str | string |
| int, float | number |
| True | true |
| False | false |
| None | null |
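
The mapping is not fully reversible. As a small illustration (the sample values are made up for this sketch), a round trip through json.dumps and json.loads turns tuples into lists and non-string keys into strings:

```python
import json

original = {"point": (1, 2), 1: "one", "ratio": 0.5, "missing": None}

# Serialize, then parse the result back into Python objects
restored = json.loads(json.dumps(original))

print(restored["point"])  # [1, 2]  (the tuple came back as a list)
print(list(restored))     # ['point', '1', 'ratio', 'missing']  (int key 1 became "1")
```

If your code relies on tuples or integer keys, convert them back explicitly after parsing.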

Custom Serialization (Dates, etc.)

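import json
from datetime import datetime, timezone

def default_encoder(obj):
    # json.dumps calls this only for objects it cannot serialize natively
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Not serializable: {type(obj)}")

# datetime.utcnow() is deprecated since Python 3.12; use a timezone-aware timestamp
data = {"event": "launch", "at": datetime.now(timezone.utc)}
print(json.dumps(data, default=default_encoder))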
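
The default= hook only covers the encoding direction; on the way back in, the timestamp is a plain string. One way to restore it is the object_hook parameter of json.loads. This is a sketch under a stated assumption: decode_dates is a hypothetical function, and it assumes that only the "at" key holds ISO 8601 timestamps.

```python
import json
from datetime import datetime


def decode_dates(obj):
    """Hypothetical object_hook: convert the "at" field back into a datetime."""
    # Assumption: only the "at" key holds ISO 8601 timestamps
    if "at" in obj:
        obj["at"] = datetime.fromisoformat(obj["at"])
    return obj


text = '{"event": "launch", "at": "2024-05-01T12:30:00+00:00"}'

# object_hook is called for every JSON object (dict) the parser produces
event = json.loads(text, object_hook=decode_dates)
print(type(event["at"]))  # <class 'datetime.datetime'>
```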

Streaming Large Files with ijson

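# pip install ijson
import ijson

def process(item):
    print(item)  # placeholder: replace with your own per-record handling

# Expects a top-level JSON array; the "item" prefix selects each of its elements
with open("large.json", "rb") as f:
    for item in ijson.items(f, "item"):
        process(item)  # one record at a time, so memory use stays roughly constant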

Validation with jsonschema

# pip install jsonschema
import jsonschema

schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},
        "age": {"type": "integer", "minimum": 18}
    },
    "required": ["email"]
}

try:
    # "format" keywords are only enforced when a format checker is supplied
    jsonschema.validate(
        {"email": "alice@example.com", "age": 25},
        schema,
        format_checker=jsonschema.FormatChecker(),
    )
    print("Valid")
except jsonschema.ValidationError as e:
    print(f"Error: {e.message}")

Concept Map

Concept Flow

Raw JSON
└── json.loads / json.load
    └── Python dict / list
        ├── Business Logic
        │   └── json.dumps / json.dump → JSON Output
        └── jsonschema.validate → Valid / Invalid

Common Pitfalls

| Pitfall | Consequence | Prevention |
|---|---|---|
| str() instead of json.dumps() | Python repr, not valid JSON | Always use json.dumps() |
| datetime not serializable | TypeError at runtime | Use default= encoder |
| json.load() on huge file | Memory overflow | Use ijson for streaming |
| Missing encoding="utf-8" | Garbled chars on Windows | Always specify encoding |
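
The first pitfall is easy to reproduce. In this short sketch (the payload is made up for illustration), str() produces a Python repr that a JSON parser rejects, while json.dumps() produces valid JSON:

```python
import json

payload = {"active": True, "notes": None}

as_repr = str(payload)         # "{'active': True, 'notes': None}"
as_json = json.dumps(payload)  # '{"active": true, "notes": null}'

# The repr uses single quotes, True, and None, none of which are valid JSON
try:
    json.loads(as_repr)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

print(repr_is_valid_json)  # False
```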
