Implementing RLE in Python: Step-by-Step Tutorial with Examples
Run-Length Encoding (RLE) is a simple lossless compression technique that replaces consecutive repeated values with a single value and a count. It works best on data with long runs of the same value (e.g., simple images, repeated characters). This tutorial shows how to implement RLE in Python, explains common variations, and provides examples for strings and binary image data.
1. Basic idea
- Represent a run as (value, count).
- Example: “AAAABBBCC” → [(“A”,4),(“B”,3),(“C”,2)] or as a compact string “A4B3C2”.
2. Simple RLE for strings
Implementation (encode/decode):
python
def rle_encode(s: str) -> str: if not s: return ”” result = [] prev = s[0] count = 1 for ch in s[1:]: if ch == prev: count += 1 else: result.append(f”{prev}{count}“) prev = ch count = 1 result.append(f”{prev}{count}“) return ””.join(result) def rle_decode(encoded: str) -> str: if not encoded: return ”” import re parts = re.findall(r’(.)\d+’, encoded) # safer parse decoded = [] i = 0 while i < len(encoded): char = encoded[i] i += 1 num_start = i while i < len(encoded) and encoded[i].isdigit(): i += 1 count = int(encoded[numstart:i]) decoded.append(char * count) return ””.join(decoded)
Example:
python
s = “AAAABBBCCDAA” enc = rle_encode(s) # “A4B3C2D1A2” dec = rledecode(enc) # “AAAABBBCCDAA”
Notes:
- This simple format assumes characters are not digits. For general text, use separators or escape rules.
3. RLE for bytes (binary data)
For arbitrary bytes, store runs as (byte, count) pairs. Use bytes objects for compactness and define a fixed-size count (e.g., 1 byte allows counts up to 255; for longer runs, split across multiple pairs).
python
def rle_encode_bytes(data: bytes) -> bytes: if not data: return b”” out = bytearray() prev = data[0] count = 1 for b in data[1:]: if b == prev and count < 255: count += 1 else: out.append(prev) out.append(count) prev = b count = 1 out.append(prev) out.append(count) return bytes(out) def rle_decodebytes(encoded: bytes) -> bytes: if not encoded: return b”” out = bytearray() it = iter(encoded) for b in it: count = next(it, None) if count is None: raise ValueError(“Malformed RLE data”) out.extend(bytes([b]) * count) return bytes(out)
Example:
python
data = b”\x00\x00\x00\xff\xff” enc = rle_encode_bytes(data) # b”\x00\x03\xff\x02” dec = rle_decode_bytes(enc) # b”\x00\x00\x00\xff\xff”
4. RLE for simple grayscale images
For 1D flattened pixel arrays (values 0–255), use the byte-based functions above. For many image formats, RLE is applied per row or with markers to indicate line ends. Example using Pillow to get raw bytes:
”`python from PIL