Sparse files: How to find contents


Just writing an answer based on the previous comments:

    #!/usr/bin/env python3
    from errno import ENXIO
    from os import lseek
    from sys import argv, stderr

    SEEK_DATA = 3
    SEEK_HOLE = 4

    def get_ranges(fobj):
        ranges = []
        end = 0
        while True:
            try:
                start = lseek(fobj.fileno(), end, SEEK_DATA)
                end = lseek(fobj.fileno(), start, SEEK_HOLE)
                ranges.append((start, end))
            except OSError as e:
                if e.errno == ENXIO:
                    return ranges
                raise

    def main():
        if len(argv) < 2:
            print('Usage: %s <sparse_file>' % argv[0], file=stderr)
            raise SystemExit(1)
        try:
            with open(argv[1], 'rb') as f:
                ranges = get_ranges(f)
                for start, end in ranges:
                    print('[%d:%d]' % (start, end))
                    size = end-start
                    length = min(20, size)
                    f.seek(start)
                    data = f.read(length)
                    print(data)
        except OSError as e:
            print('Error:', e)
            raise SystemExit(1)

    if __name__ == '__main__': main()

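It probably doesn't do exactly what you want, though, which is to get back precisely the bytes you wrote. The reported ranges follow the filesystem's allocation granularity (typically whole blocks), so the returned data may be padded with zero bytes on either side, and you have to trim those yourself.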
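A minimal sketch of that trimming step, in case it helps. The helper name `trim_zeros` is made up for illustration. It strips NUL bytes from both ends of a chunk and reports the file offset where the remaining bytes begin. Note that it cannot tell padding apart from zeros you actually wrote at the edges of your data.

```python
def trim_zeros(data, start=0):
    """Strip leading and trailing NUL bytes from data.

    start is the file offset at which data was read. Returns a tuple
    (offset, trimmed): the file offset of the first non-zero byte and
    the bytes that remain after trimming.
    """
    without_leading = data.lstrip(b'\0')
    # Number of leading zeros removed shifts the offset forward.
    offset = start + (len(data) - len(without_leading))
    trimmed = without_leading.rstrip(b'\0')
    return offset, trimmed
```

For example, with a range starting at offset 4096, `trim_zeros(f.read(end - start), start)` would give you the position and content of the non-zero part of that range.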

The current status of SEEK_DATA and SEEK_HOLE is described in https://man7.org/linux/man-pages/man2/lseek.2.html:

SEEK_DATA and SEEK_HOLE are nonstandard extensions also present in Solaris, FreeBSD, and DragonFly BSD; they are proposed for inclusion in the next POSIX revision (Issue 8).
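Because the constants are nonstandard, it may be safer not to hardcode their values. Python 3.3 and later expose them as `os.SEEK_DATA` and `os.SEEK_HOLE` on platforms that provide them. Here is a rough sketch of the range scan using those names. The function name `data_ranges` and the fallback of treating the whole file as one data range are my own assumptions, not something the man page prescribes.

```python
import errno
import os

def data_ranges(fd):
    """Return [(start, end), ...] for the data ranges of the open descriptor fd."""
    size = os.fstat(fd).st_size
    if not hasattr(os, 'SEEK_DATA'):
        # Assumption: without SEEK_DATA support, report the whole file as data.
        return [(0, size)] if size else []
    ranges = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as e:
            if e.errno == errno.ENXIO:
                break  # no more data at or after offset
            raise
        # Every file has an implicit hole at its end, so this always succeeds
        # when start points at data.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        ranges.append((start, end))
        offset = end
    return ranges
```

On a filesystem that does not track holes, the kernel typically reports the entire file as a single data range, so the result is still valid, just not informative.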