7.2 Working with Binary Files in Python

In addition to reading and writing text files, Python allows you to work with binary files. Every file is ultimately a sequence of bytes; the difference lies in how Python treats them. In text mode, Python decodes the bytes into str using a character encoding and translates line endings. In binary mode, it hands you the raw bytes unchanged. Binary files can contain any type of data, such as images, audio files, videos, or executable programs, so binary mode is the right choice whenever the content is not text.

In this section, we’ll explore how to open, read, and write binary files in Python, along with some key techniques for handling them.


7.2.1 Opening Binary Files with open()

To work with binary files, you use the open() function just like with text files, but you add "b" to the mode string to select binary mode. In binary mode, open() does not accept the encoding, errors, or newline arguments.

Basic Syntax:

file = open(filename, mode)

Common modes for binary files:

  • "rb": Read binary. Opens a file for reading in binary mode.
  • "wb": Write binary. Opens a file for writing in binary mode. Overwrites the file if it exists or creates a new one.
  • "ab": Append binary. Opens a file in binary mode for appending. Adds new content at the end without overwriting existing data.
  • "rb+": Read and write in binary mode. The file must already exist, and its contents are not truncated.
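
As a rough illustration of how these modes differ, the sketch below writes, appends to, and then patches a scratch file. The file name modes_demo.bin and the temporary directory are assumptions made only for this demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.bin")

with open(path, "wb") as f:   # "wb": create the file (or truncate it if it exists)
    f.write(b"ABCDEF")

with open(path, "ab") as f:   # "ab": add to the end, keeping existing bytes
    f.write(b"GH")

with open(path, "rb+") as f:  # "rb+": read and write without truncating
    f.seek(2)                 # move to byte offset 2
    f.write(b"xy")            # overwrite two bytes in place

with open(path, "rb") as f:   # "rb": read the final contents
    result = f.read()

print(result)  # b'ABxyEFGH'
```

Note that "rb+" overwrites bytes at the current position rather than inserting them, so the file length stays the same here.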

7.2.2 Reading Binary Files

When reading a binary file, data is returned as bytes objects rather than str. You can use read(), readline(), and readlines() in binary mode, but readline() and readlines() split on the newline byte (b'\n'), which is rarely meaningful for non-text data, so read() is the usual choice.

Example: Reading a Binary File

Let’s read a binary file using the read() method:

# Opening a binary file in read mode
with open("example_image.png", "rb") as file:
    binary_data = file.read()  # Reading the entire file as bytes
    print(binary_data[:20])  # Display the first 20 bytes

In this example:

  • The file is opened in binary read mode ("rb").
  • The read() method reads the entire content of the file as a sequence of bytes.
  • The first 20 bytes of the binary data are printed.
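
It may help to see how a bytes object behaves once it has been read. The sketch below uses a literal containing the 8-byte PNG signature in place of real file contents, so it runs without an image file on disk.

```python
# Stand-in for data read from a file: the 8-byte PNG signature.
data = b"\x89PNG\r\n\x1a\n"

first = data[0]        # indexing a bytes object gives an int
part = data[1:4]       # slicing gives another bytes object
as_hex = data.hex()    # hexadecimal string, handy for inspection
is_png = data.startswith(b"\x89PNG\r\n\x1a\n")  # compare against a known signature

print(first)   # 137
print(part)    # b'PNG'
print(as_hex)  # 89504e470d0a1a0a
print(is_png)  # True
```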

Reading in Chunks

Calling read() with no argument loads the entire file into memory, which can be slow or impossible for very large files. Instead, you can pass a size to read() and process the file in smaller chunks.

# Reading a binary file in chunks (the := operator requires Python 3.8+)
with open("example_image.png", "rb") as file:
    while chunk := file.read(1024):  # Read 1024 bytes at a time
        print(chunk[:10])  # Display the first 10 bytes of each chunk

In this example:

  • The file is read in chunks of 1024 bytes using the read() method in a loop.
  • This approach is more memory-efficient for large binary files, as it reads smaller pieces of data at a time.
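
The loop above relies on the := operator, which needs Python 3.8 or later. A possible alternative that also works on older versions is to wrap the chunked read in a generator. In the sketch below, iter_chunks and the scratch file sample.bin are hypothetical names chosen for illustration.

```python
import os
import tempfile
from functools import partial


def iter_chunks(path, chunk_size=1024):
    """Yield successive chunks of at most chunk_size bytes from a file."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns b"", which signals end of file.
        for chunk in iter(partial(f.read, chunk_size), b""):
            yield chunk


# Create a 2500-byte scratch file to read back (demonstration only).
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(bytes(2500))  # 2500 zero bytes

sizes = [len(chunk) for chunk in iter_chunks(path)]
print(sizes)  # [1024, 1024, 452]
```

The last chunk is shorter than the others because the file size is not a multiple of the chunk size, so code that processes chunks should not assume a fixed length.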

7.2.3 Writing Binary Files

You can write to binary files using the write() method, just like with text files. The difference is that you must pass bytes (or another bytes-like object such as bytearray) rather than a string; passing a str raises a TypeError. You can create or overwrite binary files in write mode ("wb") or append to existing binary files in append mode ("ab").

Example: Writing to a Binary File

# Binary data to write (as bytes)
binary_data = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00'

# Writing binary data to a file
with open("output_binary_file.bin", "wb") as file:
    file.write(binary_data)

In this example:

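  • binary_data is a bytes object (denoted by the b prefix on the literal).
  • The write() method is used to write the binary data to a file in binary write mode ("wb").
  • This will create (or overwrite) a file named output_binary_file.bin with the given binary data. The bytes shown are only the beginning of a PNG header, so the result is not a valid image.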

Appending to a Binary File

If you want to append binary data to an existing file, you can open the file in append mode ("ab").

# Appending binary data to a file
with open("output_binary_file.bin", "ab") as file:
    file.write(b'\x00\x00\x00\x00')  # Appending four null bytes

In this example:

  • The file is opened in binary append mode ("ab").
  • The write() method appends additional binary data (four null bytes) to the file without overwriting the existing content.

7.2.4 Copying Binary Files

One common use case when working with binary files is copying the contents of one file to another. This is especially useful for working with media files (images, videos, audio).

Example: Copying a Binary File

# Copying a binary file
with open("source_image.png", "rb") as source_file:
    with open("destination_image.png", "wb") as dest_file:
        while chunk := source_file.read(1024):  # Reading in chunks of 1024 bytes
            dest_file.write(chunk)

In this example:

  • The source file (source_image.png) is opened in binary read mode ("rb"), and the destination file is opened in binary write mode ("wb").
  • The source file is read in chunks of 1024 bytes and written to the destination file. This approach ensures that even large files are copied efficiently without consuming too much memory.
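
The standard library also offers shutil.copyfileobj(), which runs a similar chunked loop for you. The sketch below is one way to use it; the file names source.bin and destination.bin and the temporary directory are assumptions made so the example can run on its own.

```python
import os
import shutil
import tempfile

# Scratch files used only for this demonstration.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")
dst = os.path.join(workdir, "destination.bin")

# Create a source file with 10,000 random bytes.
with open(src, "wb") as f:
    f.write(os.urandom(10_000))

# Copy the contents from one open file object to another in chunks.
with open(src, "rb") as source_file, open(dst, "wb") as dest_file:
    shutil.copyfileobj(source_file, dest_file)

# Verify that the copy matches the original.
with open(src, "rb") as a, open(dst, "rb") as b:
    identical = a.read() == b.read()

print(identical)  # True
```

Writing the loop by hand remains useful when each chunk needs extra processing, such as updating a progress counter.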

7.2.5 Handling Binary File Exceptions

When working with binary files, it’s essential to handle potential errors, such as file not found or permission issues. You can use try-except blocks to catch and handle these exceptions.

Example: Handling File Exceptions

try:
    with open("nonexistent_file.bin", "rb") as file:
        binary_data = file.read()
except FileNotFoundError:
    print("The file was not found.")
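except OSError:  # IOError is an alias of OSError since Python 3.3
    print("An I/O error occurred.")

In this example:

  • A FileNotFoundError is raised if the file does not exist.
  • Other I/O-related errors, such as permission errors, are caught by the OSError handler. FileNotFoundError is a subclass of OSError, so it must be listed first to be handled separately.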
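
One way to package this pattern is a small helper that handles the most specific exceptions first and falls back to the general one. The function name read_binary and the path no_such_file.bin below are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def read_binary(path):
    """Return the file's contents as bytes, or None if it cannot be read."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except FileNotFoundError:
        print(f"Not found: {path}")
    except PermissionError:
        print(f"Permission denied: {path}")
    except OSError as exc:
        # Catches any remaining I/O failure, for example a disk error.
        print(f"I/O error on {path}: {exc}")
    return None


# A path that does not exist inside a fresh temporary directory.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.bin")
result = read_binary(missing)
print(result)  # None
```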

7.2.6 Working with Binary Data Structures

In some cases, you may need to work with binary data structures, such as integers, floating-point numbers, or other packed data formats. Python provides the struct module to work with packed binary data.

Example: Using the struct Module

The struct module allows you to convert between Python values and C-style binary data structures.

import struct

# Packing data into binary format ('<' = little-endian, 'i' = 4-byte signed integer)
binary_data = struct.pack('<i', 12345)
print(binary_data)  # Output: b'90\x00\x00' (12345 is 0x3039; bytes 0x39 and 0x30 print as ASCII '9' and '0')

# Unpacking binary data into a Python integer
unpacked_data = struct.unpack('<i', binary_data)
print(unpacked_data)  # Output: (12345,)

In this example:

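  • The pack() function converts the integer 12345 into a 4-byte little-endian representation and returns it as a bytes object.
  • The unpack() function converts the binary data back into Python values. It always returns a tuple, even when the format describes a single value, which is why the output is (12345,).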

The struct module is particularly useful when working with binary protocols, file formats, or memory-mapped data.
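
To connect struct with file I/O, the sketch below packs a record with several fields, writes it to a file, and reads it back. The record layout (an unsigned 16-bit id, a 32-bit float, and a 4-byte tag) and the file name record.bin are invented for illustration and do not correspond to any real file format.

```python
import os
import struct
import tempfile

# Hypothetical layout: '<' little-endian, 'H' unsigned 16-bit int,
# 'f' 32-bit float, '4s' four raw bytes.
RECORD = struct.Struct("<Hf4s")

path = os.path.join(tempfile.mkdtemp(), "record.bin")

# Pack the three values into bytes and write them to the file.
with open(path, "wb") as f:
    f.write(RECORD.pack(7, 1.5, b"DATA"))

# Read exactly one record's worth of bytes and unpack it.
with open(path, "rb") as f:
    record_id, value, tag = RECORD.unpack(f.read(RECORD.size))

print(RECORD.size)             # 10
print(record_id, value, tag)   # 7 1.5 b'DATA'
```

Precompiling the format with struct.Struct avoids repeating the format string and exposes the record size, which is convenient when reading many fixed-size records in a loop.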


7.2.7 Working with Binary Files Using pathlib

You can also handle binary files using the pathlib module, which provides an object-oriented way of working with file paths and files.

Example: Reading and Writing Binary Files with pathlib

from pathlib import Path

# Defining the path to the binary file
binary_file_path = Path("example_binary_file.bin")

# Writing binary data using pathlib
binary_data = b'\x00\x01\x02\x03\x04\x05'
binary_file_path.write_bytes(binary_data)

# Reading binary data using pathlib
read_data = binary_file_path.read_bytes()
print(read_data)  # Output: b'\x00\x01\x02\x03\x04\x05'

In this example:

  • The write_bytes() method writes binary data to a file (creating it or overwriting any existing content), while the read_bytes() method returns the whole file as a bytes object. Both methods open and close the file for you.
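
Because read_bytes() loads the whole file at once, larger files or partial reads are usually better served by Path.open(), which accepts the same binary modes as the built-in open(). The following sketch assumes a scratch file in a temporary directory.

```python
import tempfile
from pathlib import Path

# Scratch file used only for this demonstration.
path = Path(tempfile.mkdtemp()) / "example_binary_file.bin"
path.write_bytes(b"\x00\x01\x02\x03")  # create (or overwrite) the file

# Path.open() takes the same mode strings as open().
with path.open("ab") as f:
    f.write(b"\x04\x05")               # append two more bytes

size = path.stat().st_size             # file size in bytes

with path.open("rb") as f:
    f.seek(4)                          # skip the first four bytes
    tail = f.read()                    # read the rest

print(size)  # 6
print(tail)  # b'\x04\x05'
```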

7.2.8 Summary

  • Opening binary files: Use the open() function with modes like "rb" for reading, "wb" for writing, and "ab" for appending binary data.
  • Reading binary files: Use read() to read the entire file as bytes or read in chunks for large files.
  • Writing to binary files: Use write() to write byte data to binary files, and be careful when using modes like "wb" (which overwrites files) or "ab" (which appends data).
  • Copying binary files: Read binary files in chunks and write them to another file to perform efficient file copying.
  • Handling exceptions: Use try-except blocks to handle errors such as FileNotFoundError or the more general OSError (of which IOError is an alias).
  • Working with binary data structures: The struct module allows you to pack and unpack binary data, making it easy to handle structured binary formats.

  • Using pathlib: You can work with binary files in an object-oriented way using pathlib's read_bytes() and write_bytes() methods.

Working with binary files is essential for handling non-text data, such as images, audio, and video, or communicating with low-level hardware or protocols. Mastering binary file operations in Python allows you to build applications that handle complex data efficiently.