spdl.io.iter_tarfile

iter_tarfile(src: SupportsRead) → Iterator[tuple[str, bytes]]

iter_tarfile(src: bytes) → Iterator[tuple[str, memoryview]]

[Experimental] Parses a TAR file and yields the path and contents of each file it contains.

See also

Benchmark tarfile: A benchmark script that compares the performance of iter_tarfile with that of Python’s built-in tarfile module.

Parameters: src – Source data. Either a bytes object containing the entire archive or a file-like object. The file-like object only needs a read method, i.e. read(n: int) -> bytes.

Yields

If the source is bytes, it yields a series of tuples, each consisting of the file name and a memoryview of the contents.

If the source is a file-like object, it yields a series of tuples, each consisting of the file name and the contents as bytes.
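The two overloads differ only in the type of the contents: a memoryview for a bytes source, and bytes for a file-like source. The sketch below uses only built-in Python types to show what that difference means for calling code. It does not call iter_tarfile, and all variable names are illustrative. It assumes the yielded memoryview refers to the source buffer rather than to a copy, which the return type suggests but this page does not state.

```python
# Stand-in for a bytes source with one entry's contents embedded in it.
source = b"HEADER" + b"file contents" + b"PADDING"
view = memoryview(source)[6:19]  # slicing a memoryview does not copy data

# A memoryview supports len() and slicing, much like bytes.
length = len(view)

# Formatting a memoryview directly shows the object, not the data,
# e.g. "<memory at 0x...>".
as_text = f"{view[:4]}"

# Convert explicitly when real bytes are needed (this makes a copy).
contents = bytes(view)
preview = bytes(view[:4])

# The view references the whole source object, so the source stays
# alive for as long as the view does.
refers_to_source = view.obj is source
```

If the contents need to outlive a large source buffer, copying with bytes(view) and dropping the view lets the buffer be released.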

Example - Parsing an in-memory TAR file.

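from spdl.io import iter_tarfile

# `path` points to a TAR file on disk.
with open(path, "rb") as f:
    data = f.read()

for filepath, contents_view in iter_tarfile(data):
    print(f"File: {filepath}, ({len(contents_view)} bytes)")
    # Convert to bytes first; formatting a memoryview directly prints "<memory at 0x...>".
    print(f"Preview: {bytes(contents_view[:30])}")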

Example - Parsing a TAR file from a file-like object.

from spdl.io import iter_tarfile

# `path` points to a TAR file on disk.
with open(path, "rb") as f:
    for filepath, contents in iter_tarfile(f):
        print(f"File: {filepath}, ({len(contents)} bytes)")
        print(f"Preview: {contents[:30]}")
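
Example - A source that exposes only a read method.

The parameter description says a file-like source needs nothing beyond read(n: int) -> bytes. The following is a minimal sketch of such an object, together with a small archive built in memory so it can be tried without a file on disk. ReadOnlySource, make_tar_bytes and iter_tarfile_stdlib are hypothetical names introduced here, not part of SPDL. Because SPDL may not be installed, the sketch consumes the source with the standard library's tarfile module in streaming mode, which also needs only a read method, and yields tuples of the same shape. Going by the parameter description, the same object should be accepted by iter_tarfile, but that is not verified here.

```python
import io
import tarfile
from collections.abc import Iterator


def make_tar_bytes(files: dict[str, bytes]) -> bytes:
    """Build a small uncompressed TAR archive in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


class ReadOnlySource:
    """Hypothetical source that exposes nothing but read(n) -> bytes."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def read(self, n: int) -> bytes:
        # Return up to n bytes; an empty result signals end of data.
        chunk = self._data[self._pos : self._pos + n]
        self._pos += len(chunk)
        return chunk


def iter_tarfile_stdlib(src) -> Iterator[tuple[str, bytes]]:
    """Standard-library stand-in yielding (path, contents) for regular files."""
    # "r|" is tarfile's streaming mode: it never seeks and only calls src.read().
    with tarfile.open(fileobj=src, mode="r|") as tar:
        for member in tar:
            if member.isfile():
                # In streaming mode each member must be read before moving on.
                yield member.name, tar.extractfile(member).read()


tar_data = make_tar_bytes({"a.txt": b"hello", "dir/b.bin": b"\x00\x01\x02"})
entries = list(iter_tarfile_stdlib(ReadOnlySource(tar_data)))

for filepath, contents in entries:
    print(f"File: {filepath}, ({len(contents)} bytes)")
```

With SPDL installed, replacing iter_tarfile_stdlib with iter_tarfile in the last lines would be the natural way to check the same source against the function documented here.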