
Python sockets buffering


If you are concerned with performance and control the socket completely (you are not passing it into a library, for example), then try implementing your own buffering in Python. Python string.find and string.split and such can be amazingly fast.

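    def linesplit(sock):
        # Yield newline-terminated lines from a connected socket.
        # recv() returns bytes on Python 3, so the delimiter is b"\n".
        buffer = sock.recv(4096)
        buffering = True
        while buffering:
            if b"\n" in buffer:
                (line, buffer) = buffer.split(b"\n", 1)
                yield line + b"\n"
            else:
                more = sock.recv(4096)
                if not more:
                    # Empty result: the peer closed the connection.
                    buffering = False
                else:
                    buffer += more
        if buffer:
            # Trailing data with no final newline.
            yield buffer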

If you expect the payload to consist of lines that are not too huge, that should run pretty fast and avoid jumping through too many layers of function calls unnecessarily. I'd be interested in knowing how this compares to file.readline() or using socket.recv(1).
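For anyone who wants to try that comparison, here is a rough sketch, assuming Python 3 (where recv() returns bytes). It puts a bytes variant of linesplit next to a recv(1) loop and a makefile()-based reader. The helper names lines_via_recv1, lines_via_makefile and collect are made up for this example, and the payload has to stay small enough to fit in the socket buffer because it is sent before anything reads it.

```python
import socket


def linesplit(sock, bufsize=4096):
    """Yield newline-terminated lines using a hand-rolled bytes buffer."""
    buffer = b""
    while True:
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line + b"\n"
        more = sock.recv(bufsize)
        if not more:          # peer closed the connection
            break
        buffer += more
    if buffer:                # trailing data without a final newline
        yield buffer


def lines_via_recv1(sock):
    """Yield lines by calling recv(1), i.e. one recv() call per byte."""
    line = bytearray()
    while True:
        byte = sock.recv(1)
        if not byte:
            break
        line += byte
        if byte == b"\n":
            yield bytes(line)
            line.clear()
    if line:
        yield bytes(line)


def lines_via_makefile(sock):
    """Yield lines through the buffered file object from sock.makefile()."""
    with sock.makefile("rb") as f:
        for line in f:
            yield line


def collect(reader, payload):
    """Send payload through a local socket pair and collect the lines read.

    Hypothetical test harness: payload must fit in the socket buffer,
    since it is sent in full before the reader starts.
    """
    left, right = socket.socketpair()
    try:
        left.sendall(payload)
        left.close()          # signals end of stream to the reader
        return list(reader(right))
    finally:
        right.close()
```

All three readers should produce the same lines for the same payload. To compare speed, wrap calls such as collect(linesplit, payload) in timeit; the numbers will depend on the platform and on line length, so measure rather than assume.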


The recv() call is handled directly by calling the C library function.

It will block waiting for the socket to have data. In reality it will just let the recv() system call block.

file.readline() is an efficient buffered implementation. It is not thread-safe, because it presumes it is the only reader of the file (for example, it buffers upcoming input).

If you are using the file object, every time read() is called with a positive argument, the underlying code will recv() only the amount of data requested, unless it's already buffered.

It would be buffered if:

  • you had called readline(), which reads a full buffer

  • the end of the line was before the end of the buffer

That leaves data in the buffer. Otherwise the buffer is generally not overfilled.
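To see the read-ahead in action, here is a small sketch, assuming Python 3, where makefile() returns an io-based buffered reader (a different implementation from the older file object described above, so the details may not match exactly). The function name readline_overreads is made up for the example. It sends two lines, reads one with readline(), and then checks whether the socket itself still has anything waiting.

```python
import select
import socket


def readline_overreads():
    """Return (first_line, socket_still_readable, second_line)."""
    a, b = socket.socketpair()
    try:
        b.sendall(b"one\ntwo\n")
        f = a.makefile("rb")      # buffered reader on top of the socket
        first = f.readline()      # reads ahead: pulls the available data into f's buffer
        # Zero-timeout select: is anything still waiting in the kernel for `a`?
        readable, _, _ = select.select([a], [], [], 0)
        second = f.readline()     # served from f's buffer
        f.close()
        return first, bool(readable), second
    finally:
        a.close()
        b.close()
```

With both lines arriving together, the first readline() takes everything off the socket, so select() reports nothing left to read even though "two" has not been returned yet. That is the reason mixing the file object with direct recv() calls, or sharing it between threads, tends to lose or reorder data.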

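The goal of the question is not clear. If you need to see whether data is available before reading, you can use select() or set the socket to non-blocking mode with s.setblocking(False). In non-blocking mode, recv() raises an error (BlockingIOError in Python 3, socket.error with EAGAIN/EWOULDBLOCK in Python 2) rather than blocking if there is no waiting data. An empty result still means the peer has closed the connection.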
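A minimal sketch of both options, assuming Python 3. The helper names poll_recv and try_recv are made up for this example. Note that in Python 3 a non-blocking recv() with nothing to read raises BlockingIOError, while an empty bytes result means the peer closed the connection.

```python
import select
import socket


def poll_recv(sock, timeout=0.0, bufsize=4096):
    """Return received bytes, or None if nothing arrives within `timeout` seconds.

    An empty bytes object means the peer closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(bufsize)


def try_recv(sock, bufsize=4096):
    """Non-blocking read: return bytes, or None when no data is waiting.

    Side effect: leaves the socket in non-blocking mode.
    """
    sock.setblocking(False)
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        return None
```

The select() version keeps the socket in blocking mode and lets you pick a timeout, which is usually easier to combine with other code that reads the same socket.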

Are you reading one file or socket with multiple threads? I would put a single worker on reading the socket and feeding received items into a queue for handling by other threads.
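Something along these lines, as a rough sketch assuming Python 3. The names reader_worker and read_all_lines are made up, and None is used as an arbitrary end-of-stream sentinel. In real code the queue would be drained by your handler threads rather than by the caller.

```python
import queue
import socket
import threading


def reader_worker(sock, out_queue, bufsize=4096):
    """Sole reader of `sock`: split the stream into lines and enqueue them."""
    buffer = b""
    while True:
        data = sock.recv(bufsize)
        if not data:              # peer closed the connection
            break
        buffer += data
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            out_queue.put(line)
    if buffer:                    # trailing data without a final newline
        out_queue.put(buffer)
    out_queue.put(None)           # sentinel: no more items


def read_all_lines(sock):
    """Start the reader thread and drain its queue (stand-in for handler threads)."""
    items = queue.Queue()
    worker = threading.Thread(target=reader_worker, args=(sock, items), daemon=True)
    worker.start()
    lines = []
    while True:
        item = items.get()        # blocks until the reader produces something
        if item is None:
            break
        lines.append(item)
    worker.join()
    return lines
```

Because only one thread ever calls recv(), the buffering state lives in one place and queue.Queue takes care of the locking between threads.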

I suggest consulting the source of the Python socket module and the C source that makes the system calls.


    def buffered_readlines(pull_next_chunk, buf_size=4096):
        """
        pull_next_chunk is a callable that accepts one positional argument,
        max_len (e.g. socket.recv or file().read), and returns a string of up
        to max_len characters, or an empty one when nothing is left to read.

        Note: the literals below assume str chunks. On Python 3, socket.recv
        returns bytes, so use bytes literals for the newline and the join
        string in that case.

        >>> for line in buffered_readlines(socket.recv, 16384):
        ...     print(line)
        ...
        >>> # the following code won't read the whole file into memory
        ... # before splitting it into lines like the .readlines method
        ... # of file does. Also it won't block until a FIFO file is closed
        ...
        >>> for line in buffered_readlines(open('huge_file').read):
        ...     # process it on a per-line basis
        ...
        >>>
        """
        chunks = []  # pieces of a line that has not seen its newline yet
        while True:
            chunk = pull_next_chunk(buf_size)
            if not chunk:
                # End of input: flush whatever partial line is left.
                if chunks:
                    yield ''.join(chunks)
                break
            if '\n' not in chunk:
                chunks.append(chunk)
                continue
            chunk = chunk.split('\n')
            # The first piece completes the pending partial line, if any.
            if chunks:
                yield ''.join(chunks + [chunk[0]])
            else:
                yield chunk[0]
            for line in chunk[1:-1]:
                yield line
            # The last piece is the start of the next, still incomplete line.
            if chunk[-1]:
                chunks = [chunk[-1]]
            else:
                chunks = []