Sparse files: How to find contents
Just writing an answer based on the previous comments:
    #!/usr/bin/env python3
    from errno import ENXIO
    # SEEK_DATA and SEEK_HOLE are only defined on platforms that support them.
    from os import SEEK_DATA, SEEK_HOLE, lseek
    from sys import argv, stderr


    def get_ranges(fobj):
        """Return a list of (start, end) offsets of the data regions in fobj."""
        ranges = []
        end = 0
        while True:
            try:
                # Jump to the next data region, then to the hole that ends it.
                start = lseek(fobj.fileno(), end, SEEK_DATA)
                end = lseek(fobj.fileno(), start, SEEK_HOLE)
                ranges.append((start, end))
            except OSError as e:
                # ENXIO: there is no more data past the current offset.
                if e.errno == ENXIO:
                    return ranges
                raise


    def main():
        if len(argv) < 2:
            print('Usage: %s <sparse_file>' % argv[0], file=stderr)
            raise SystemExit(1)
        try:
            with open(argv[1], 'rb') as f:
                ranges = get_ranges(f)
                for start, end in ranges:
                    print('[%d:%d]' % (start, end))
                    # Show at most the first 20 bytes of each data region.
                    size = end - start
                    length = min(20, size)
                    f.seek(start)
                    data = f.read(length)
                    print(data)
        except OSError as e:
            print('Error:', e, file=stderr)
            raise SystemExit(1)


    if __name__ == '__main__':
        main()
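It probably doesn't do exactly what you want, however, which is returning precisely the bytes you wrote. Filesystems track allocation in whole blocks, so the reported ranges are typically block-aligned: zeroes may surround the returned data and must be trimmed by hand.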
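For the trimming step, something along these lines might do. It is only a sketch: `trim_zero_padding` is a name made up for this answer, and it cannot tell block padding apart from zero bytes you deliberately wrote at the edges of your data, so those get stripped too.

```python
def trim_zero_padding(data: bytes, start: int = 0):
    """Strip leading/trailing NUL bytes from one data range.

    `start` is the file offset the range was read from. Returns
    (offset_of_first_nonzero_byte, payload). Hypothetical helper:
    it also removes zeroes you wrote yourself at either edge.
    """
    stripped_left = data.lstrip(b"\x00")
    # Shift the offset by the number of leading zero bytes removed.
    offset = start + (len(data) - len(stripped_left))
    payload = stripped_left.rstrip(b"\x00")
    return offset, payload
```

You would call it on the full contents of each `(start, end)` range, e.g. `trim_zero_padding(f.read(end - start), start)` after `f.seek(start)`.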
The current status of SEEK_DATA and SEEK_HOLE is described in the lseek(2) man page, https://man7.org/linux/man-pages/man2/lseek.2.html:
SEEK_DATA and SEEK_HOLE are nonstandard extensions also present in Solaris, FreeBSD, and DragonFly BSD; they are proposed for inclusion in the next POSIX revision (Issue 8).
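Since these are extensions, Python only defines `os.SEEK_DATA` and `os.SEEK_HOLE` where the platform provides them. If the script has to run elsewhere too, a guarded variant could look like the sketch below. The function name `data_ranges` and the fallback (report the whole file as one data range) are my own assumptions, not anything the man page prescribes.

```python
import errno
import os


def data_ranges(path):
    """Return [(start, end), ...] for the data regions of the file at `path`.

    Assumed fallback: if the platform lacks SEEK_DATA/SEEK_HOLE, the whole
    file is reported as a single data range.
    """
    size = os.path.getsize(path)
    if not (hasattr(os, "SEEK_DATA") and hasattr(os, "SEEK_HOLE")):
        return [(0, size)] if size else []

    ranges = []
    fd = os.open(path, os.O_RDONLY)
    try:
        end = 0
        while end < size:
            try:
                start = os.lseek(fd, end, os.SEEK_DATA)
            except OSError as e:
                # ENXIO: no data region at or after the current offset.
                if e.errno == errno.ENXIO:
                    break
                raise
            # The end of the file counts as an implicit hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            ranges.append((start, end))
    finally:
        os.close(fd)
    return ranges
```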