Reading a large file using C (greater than 4GB) using read function, causing problems


In the first place, why do you need lseek() in your loop? read() advances the file offset by the number of bytes it reads.

Now, to the topic: where long is 32 bits wide (as it is on 32-bit systems), long, and therefore chunk, has a maximum value of 2147483647; any larger value overflows and typically ends up negative.
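To make the limit concrete, here is a minimal sketch that checks whether an offset fits into a 32-bit signed integer. The helper name fits_in_32bit_signed is made up for illustration and is not part of the original code; the check is done in 64-bit arithmetic so nothing overflows.

```c
#include <assert.h>
#include <stdint.h>

/* The maximum quoted above is exactly INT32_MAX. */
static_assert(INT32_MAX == 2147483647, "32-bit signed maximum");

/* Hypothetical helper: returns 1 if `offset` can be stored in a 32-bit
 * signed integer (such as a 32-bit long), 0 otherwise. */
int fits_in_32bit_signed(int64_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}
```

An offset of 4 GB (4 * 1024 * 1024 * 1024 bytes) is well past that maximum, so it cannot be held in a 32-bit long.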

You want to declare chunk as off_t (off_t chunk) and size as size_t. That is the main reason why lseek() fails.

And, then again, as other people have noticed, you do not want to free() your buffer inside the loop.

Note also that you will overwrite the data you have already read. Additionally, read() will not necessarily read as much as you asked for, so it is better to advance chunk by the number of bytes actually read rather than by the number of bytes you requested.
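The short-read point on its own can be sketched as a small helper. This is only an illustration: the name read_full and the retry on EINTR are my own additions, not something taken from the question.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until `count` bytes have been
 * stored in `buf`, end of file is reached, or an error occurs.
 * Returns the number of bytes actually read, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    assert(buf != NULL || count == 0);

    while (total < count) {
        /* Write after the data already stored, ask only for what is left. */
        ssize_t n = read(fd, p + total, count - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error; errno was set by read() */
        }
        if (n == 0)
            break;              /* end of file */
        total += (size_t) n;    /* advance by what was actually read */
    }
    return (ssize_t) total;
}
```

Because the destination pointer moves forward by the amount actually read, earlier data is never overwritten, and a return value smaller than count tells the caller that end of file came first.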

Taking all of this into account, the correct code should probably look something like this:

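    // Edited: note comments after the code
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <unistd.h>

    #ifndef O_LARGEFILE
    #define O_LARGEFILE 0
    #endif

    int read_from_file_open(char *filename, size_t size)
    {
        int fd;
        /* Allocation kept from the question; only the first size bytes are filled. */
        long *buffer = (long *) malloc(size * sizeof(long));

        if (buffer == NULL)
        {
            printf("\nMemory Allocation Unsuccessful\n");
            exit(1);
        }
        fd = open(filename, O_RDONLY | O_LARGEFILE);
        if (fd == -1)
        {
            printf("\nFile Open Unsuccessful\n");
            free(buffer);
            exit(1);
        }
        off_t chunk = 0;
        /* Query the position without moving it; a freshly opened file is at offset 0. */
        printf("\nCurrent Position %lld\n", (long long) lseek(fd, 0, SEEK_CUR));
        while ((size_t) chunk < size)
        {
            printf("the size of chunk read is  %lld\n", (long long) chunk);
            /* Never ask for more than is left of the size bytes. */
            size_t want = size - (size_t) chunk;
            if (want > 1048576)
                want = 1048576;
            /* ssize_t is signed, so the error return -1 can be detected. */
            ssize_t readnow = read(fd, ((char *) buffer) + chunk, want);
            if (readnow <= 0)   /* error, or end of file before size bytes */
            {
                printf("\nRead Unsuccessful\n");
                free(buffer);
                close(fd);
                return 0;
            }
            chunk = chunk + readnow;
        }
        printf("\nRead Successful\n");
        free(buffer);
        close(fd);
        return 1;
    }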

I also took the liberty of removing the result variable and all related logic since, I believe, the code can be simplified that way.

Edit: I have noticed that some systems (most notably, BSD) do not have O_LARGEFILE, since it is not needed there. So I have added an #ifndef guard at the beginning, which makes the code more portable.


The lseek function may have difficulty supporting large file sizes. Try using lseek64.

Please check the link to see the associated macros that need to be defined when you use the lseek64 function.


If it is a 32-bit machine, reading a file larger than 4 GB will cause problems. So if you are using the gcc compiler, try defining the macros on the command line with -D_LARGEFILE_SOURCE=1 and -D_FILE_OFFSET_BITS=64.
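As a rough sketch of the same idea done in the source file instead of on the command line: define _FILE_OFFSET_BITS before the first system header, then let the build fail if off_t is still too narrow. The helper off_t_bits is hypothetical, and the static_assert is my own suggestion rather than something the linked material is known to recommend.

```c
/* Must come before any system header is included. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* Refuse to compile if off_t cannot hold offsets beyond 4 GB. */
static_assert(sizeof(off_t) * CHAR_BIT >= 64, "off_t is narrower than 64 bits");

/* Hypothetical helper: reports the width of off_t in bits. */
int off_t_bits(void)
{
    return (int) (sizeof(off_t) * CHAR_BIT);
}
```

On systems where off_t is already 64 bits wide, the define should simply have no effect.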

Please check this link also

If you are using any other compiler, check for similar compiler options.