
Binary JSON Formats

Learning Focus

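When JSON's text format becomes a bottleneck (payload size, parse time, or missing types), binary formats can deliver smaller payloads, faster parsing, and richer types. How much they help depends on the data and the library, so measure before switching.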

Why Go Binary?

JSON Limitation | Binary Solution
No native binary/blob type | BSON Binary, CBOR byte strings
No native Date type | BSON Date, CBOR date/time tags
Large text payloads | MessagePack: usually smaller than the JSON text (savings depend on the data)
No integer vs float distinction | Protocol Buffers: strongly typed
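
To make the first two rows concrete, here is a minimal sketch using only Python's standard `json` module. It shows that bytes and datetime values are rejected unless they are converted to text by hand. The record and its field names are illustrative assumptions, not part of any real API.

```python
import base64
import json
from datetime import datetime, timezone

# Illustrative record; the field names are made up for this sketch.
record = {
    "thumbnail": b"\x89PNG\r\n",
    "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
}

def try_dump(value):
    """Return the JSON text, or the name of the error json.dumps raises."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return type(exc).__name__

bytes_result = try_dump(record["thumbnail"])   # JSON has no bytes type
date_result = try_dump(record["created_at"])   # JSON has no date type

# The usual workaround: convert to strings by hand. The reader then has to
# know which strings are really base64 blobs or timestamps.
workaround = json.dumps({
    "thumbnail": base64.b64encode(record["thumbnail"]).decode("ascii"),
    "created_at": record["created_at"].isoformat(),
})
```

Both `bytes_result` and `date_result` come back as `"TypeError"`. The workaround succeeds, but the type information now lives in application code rather than in the format.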

Format Comparison

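Format | Self-Describing | Schema Required | Common Users
BSON | Yes | No | MongoDB
MessagePack | Yes | No | Redis, Fluentd
Protocol Buffers | No | Yes | Google, gRPC
CBOR | Yes | No | IoT, WebAuthn
Avro | No (schema travels separately or in the file header) | Yes | Apache Kafka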
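
The difference between the Self-Describing and Schema Required columns can be sketched with the standard library alone. This is an analogy, not how Protocol Buffers or Avro actually encode data: `struct` stands in for a schema-based format, and the `SCHEMA` layout string and the sensor fields are hypothetical.

```python
import json
import struct

reading = {"sensor_id": 7, "temperature": 21.5}

# Self-describing: field names travel inside every message,
# so a reader can decode it with no prior agreement.
self_describing = json.dumps(reading).encode("utf-8")

# Schema-based: both sides agree on the layout out of band.
# "<Hf" = little-endian unsigned 16-bit int, then a 32-bit float.
SCHEMA = "<Hf"  # hypothetical shared schema for this sketch
schema_based = struct.pack(SCHEMA, reading["sensor_id"], reading["temperature"])

# Decoding works only if the reader knows SCHEMA and the field order.
sensor_id, temperature = struct.unpack(SCHEMA, schema_based)
```

The schema-based payload is 6 bytes because it carries values only. The trade-off is that those bytes are meaningless to a reader without the schema.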

BSON (Python)

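# pip install pymongo  (the bson package ships with PyMongo)
from datetime import datetime, timezone

from bson import ObjectId, encode, decode

# datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp.
document = {"_id": ObjectId(), "name": "Alice", "createdAt": datetime.now(timezone.utc)}
raw = encode(document)  # BSON bytes
back = decode(raw)      # Python dict with native types (ObjectId, datetime)
# Note: BSON stores dates with millisecond precision.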

MessagePack (Python)

# pip install msgpack
import json  # needed for the size comparison below

import msgpack

data = {"name": "Alice", "scores": [98, 87, 92]}
packed = msgpack.packb(data, use_bin_type=True)  # MessagePack bytes
print(f"JSON: {len(json.dumps(data))} bytes | MsgPack: {len(packed)} bytes")

unpacked = msgpack.unpackb(packed, raw=False)  # raw=False decodes strings as str, not bytes
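
To see where the size saving comes from, the following hand-rolled sketch encodes a tiny subset of MessagePack (small non-negative integers, short strings, short lists and maps) using the single-byte type prefixes from the MessagePack specification. `pack_small` is a hypothetical helper written for illustration only; real code should use the `msgpack` library.

```python
import json

def pack_small(value):
    """Encode a tiny MessagePack subset: ints 0..127, short str, short list/dict."""
    if isinstance(value, bool):
        raise TypeError("bool is outside this sketch's subset")
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])                         # positive fixint: the value itself
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw     # fixstr: 1 prefix byte + text
    if isinstance(value, list) and len(value) <= 15:
        items = b"".join(pack_small(v) for v in value)
        return bytes([0x90 | len(value)]) + items     # fixarray: 1 prefix byte + items
    if isinstance(value, dict) and len(value) <= 15:
        pairs = b"".join(pack_small(k) + pack_small(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + pairs     # fixmap: 1 prefix byte + pairs
    raise TypeError(f"unsupported in this sketch: {value!r}")

data = {"name": "Alice", "scores": [98, 87, 92]}
packed = pack_small(data)
json_bytes = json.dumps(data, separators=(",", ":")).encode("utf-8")
```

For this input the sketch produces 23 bytes against 36 bytes of compact JSON. The saving comes from replacing quotes, colons, commas, and brackets with one-byte prefixes, and from storing each small integer in a single byte.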

Decision Flowchart

Format Selection Guide

  • Need human-readable / debug output? → JSON
  • Using MongoDB? → BSON
  • Building a gRPC service with strict types? → Protocol Buffers
  • IoT device or IETF protocol? → CBOR
  • General-purpose binary serialization? → MessagePack
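
The guide above can be read as a first-match rule list. A minimal sketch, assuming the questions are checked top to bottom and MessagePack is the fallback; the function and parameter names are hypothetical:

```python
def pick_format(*, human_readable=False, uses_mongodb=False,
                grpc_strict_types=False, iot_or_ietf=False):
    """Walk the guide's questions top to bottom and return the first match."""
    if human_readable:
        return "JSON"
    if uses_mongodb:
        return "BSON"
    if grpc_strict_types:
        return "Protocol Buffers"
    if iot_or_ietf:
        return "CBOR"
    return "MessagePack"  # general-purpose binary serialization

# Example: a MongoDB-backed service that still needs debuggable output.
choice = pick_format(human_readable=True, uses_mongodb=True)
```

Because the checks run in order, readability wins when several answers are "yes", so `choice` is `"JSON"` here. Real projects often combine formats, for example BSON in storage and JSON at the API.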

Common Pitfalls

Pitfall | Consequence | Prevention
Binary format for config | Human-unreadable | Keep config as JSON/YAML
Unversioned Protobuf schemas | Field conflicts break serialization | Follow Protobuf schema evolution rules
BSON over a REST API | Clients expect JSON | Serialize to JSON at the API boundary
Premature optimization | Complexity for minimal gain | Profile first; JSON is fast enough for most APIs
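
As one way to serialize to JSON at the API boundary, the sketch below passes a fallback function to `json.dumps` through its `default` parameter. The `stored` dict is a stand-in for a document decoded from BSON, and `to_json_safe` is a hypothetical helper; PyMongo also ships `bson.json_util` for this purpose.

```python
import base64
import json
from datetime import datetime, timezone

def to_json_safe(value):
    """Fallback for json.dumps: turn types JSON lacks into strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    return str(value)  # e.g. an ObjectId-like identifier

# Stand-in for a document decoded from BSON; the values are illustrative.
stored = {
    "name": "Alice",
    "createdAt": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "avatar": b"\x00\x01",
}

# json.dumps calls to_json_safe only for values it cannot encode itself.
response_body = json.dumps(stored, default=to_json_safe)
```

Keeping the conversion in one function at the edge of the service lets the storage layer keep its native types while clients receive plain JSON.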

What's Next