Python Socket Receive Large Amount of Data


TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving the exact data sent. The receiver might get the data piecemeal, split across multiple recv() calls, or get several sends merged into one, because TCP segments and buffers the byte stream without regard to how it was written.
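To make the stream behaviour concrete, here is a minimal sketch using socket.socketpair() as a stand-in for a real connection. The helper name recv_in_pieces and the deliberately tiny 5-byte buffer are assumptions made for illustration; on a real network the split points depend on timing and buffering rather than on the buffer size alone.

```python
import socket

def recv_in_pieces(sock, total, bufsize):
    """Call recv() repeatedly until `total` bytes have arrived; return the pieces.

    Hypothetical helper for this demonstration only.
    """
    pieces = []
    received = 0
    while received < total:
        piece = sock.recv(bufsize)
        if not piece:  # peer closed before everything arrived
            break
        pieces.append(piece)
        received += len(piece)
    return pieces

# socketpair() returns two connected stream sockets, so no server is needed.
a, b = socket.socketpair()
message = b"hello world"
a.sendall(message)  # a single send on one side ...
pieces = recv_in_pieces(b, len(message), bufsize=5)
print(pieces)       # ... shows up as several recv() results on the other
a.close()
b.close()
```

The bytes always arrive in order, but where one recv() result ends and the next begins carries no meaning.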

You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.

One simple way of sending a message is to prefix each message with its length. Then to read a message, you first read the length, then you read that many bytes. Here's how you might do that:

import struct

def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)

def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)

def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced on the network level.
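As a rough usage sketch, the following sends two messages of very different sizes over a socket.socketpair() and reads them back as whole messages. The three protocol functions are repeated so the block runs on its own. The round_trip helper and the sender thread are assumptions added for the demonstration; in a real program the two ends would normally be separate processes or hosts.

```python
import socket
import struct
import threading

def send_msg(sock, msg):
    # 4-byte big-endian length prefix, then the payload
    sock.sendall(struct.pack('>I', len(msg)) + msg)

def recvall(sock, n):
    # Receive exactly n bytes, or return None if EOF is hit first
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

def recv_msg(sock):
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    return recvall(sock, msglen)

def round_trip(messages):
    """Send `messages` from a helper thread and read them back (demo helper)."""
    left, right = socket.socketpair()

    def sender():
        for m in messages:
            send_msg(left, m)
        left.close()  # closing makes the reader see EOF after the last message

    t = threading.Thread(target=sender)
    t.start()
    received = []
    while True:
        msg = recv_msg(right)
        if msg is None:  # EOF: no more messages
            break
        received.append(bytes(msg))
    t.join()
    right.close()
    return received

received = round_trip([b"hello", b"x" * 200_000])
print([len(m) for m in received])  # [5, 200000]
```

Comparing against None rather than testing truthiness lets the reader tell an empty message apart from a closed connection.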


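You can use it as: data = recvall(sock). Note that this version treats a short read (fewer than BUFF_SIZE bytes) as the end of the data. TCP does not guarantee that: recv() can return a partial buffer while more data is still in transit, so this is only a heuristic.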

def recvall(sock):
    BUFF_SIZE = 4096 # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # either 0 or end of data
            break
    return data
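One thing worth checking before relying on the short-read test: it stops as soon as the receive buffer runs dry, which can happen before the sender has finished writing. The sketch below reproduces that with a socket.socketpair(). The two-step send is an assumption chosen to make the timing deterministic.

```python
import socket

def recvall(sock):
    BUFF_SIZE = 4096  # 4 KiB
    data = b''
    while True:
        part = sock.recv(BUFF_SIZE)
        data += part
        if len(part) < BUFF_SIZE:
            # short read: assumed (not guaranteed) to be the end of the data
            break
    return data

sender, receiver = socket.socketpair()
sender.sendall(b"first half, ")  # the sender has written only part of its data
got = recvall(receiver)          # short read, so the loop stops here
sender.sendall(b"second half")   # the rest arrives after recvall() has returned
rest = recvall(receiver)
print(got)   # b'first half, '
print(rest)  # b'second half'
sender.close()
receiver.close()
```

The opposite failure also exists: if the data happens to be an exact multiple of BUFF_SIZE, the loop calls recv() once more and blocks until more data arrives or the peer closes.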


The accepted answer is fine, but building the result with + gets really slow with big files: strings (and bytes) are immutable, so every concatenation creates a new object and copies everything received so far. Collecting the chunks in a list and joining them once at the end is more efficient.

This should work better:

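fragments = []
while True:
    chunk = s.recv(10000)  # s is a connected socket
    if not chunk:
        # an empty result means the peer closed the connection
        break
    fragments.append(chunk)
# join once at the end instead of concatenating inside the loop
print(b"".join(fragments))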
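Note that this loop only ends when recv() returns an empty result, which happens when the peer closes the connection or shuts down its sending side. The sketch below wraps the loop in a function and exercises it over a socket.socketpair(). The names recv_until_closed and send_and_close and the writer thread are assumptions for the demonstration.

```python
import socket
import threading

def recv_until_closed(sock, bufsize=10000):
    """Collect chunks in a list until the peer stops sending, then join once."""
    fragments = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes: peer closed or shut down its write side
            break
        fragments.append(chunk)
    return b"".join(fragments)

writer, reader = socket.socketpair()
payload = b"abc" * 100_000

def send_and_close():
    writer.sendall(payload)
    writer.shutdown(socket.SHUT_WR)  # tell the reader no more data is coming
    writer.close()

t = threading.Thread(target=send_and_close)
t.start()
data = recv_until_closed(reader)
t.join()
reader.close()
print(len(data))  # 300000
```

This suits one-shot transfers such as a single file per connection. For several messages on one connection, a length prefix or another framing scheme is still needed.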