Populating FosElasticaBundle running out of php memory, possible memory leak?


I'm not able to solve the memory leak entirely, but by running the command

app/console fos:elastica:populate --no-debug --no-reset --env=prod --offset=n

I've been able to populate in batches. I also drastically reduced the amount of memory leaked by turning off the logger, using the solution described on this page:

https://github.com/FriendsOfSymfony/FOSElasticaBundle/issues/273

With my PHP memory_limit set to 4G (!), I can populate more than 5 million records without error, so after a couple of batches the process should be done.
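For what it's worth, the memory limit can be raised for a single run instead of globally. This is only a sketch: it assumes the console is started through the php binary (whose -d flag overrides an ini setting for that one process), and the MEMORY_LIMIT and CONSOLE variables and the populate_cmd helper are names I made up for illustration. The populate flags are the ones from the command above.

```shell
#!/bin/sh
# Sketch: print the populate command for one batch, starting at a given offset.
# MEMORY_LIMIT and CONSOLE are illustrative variables, not part of the bundle.
MEMORY_LIMIT="${MEMORY_LIMIT:-4G}"
CONSOLE="${CONSOLE:-app/console}"

populate_cmd() {
    offset="$1"
    # Reject anything that is not a non-negative integer.
    case "$offset" in
        ''|*[!0-9]*)
            echo "offset must be a non-negative integer" >&2
            return 1
            ;;
    esac
    # php -d overrides an ini setting for this single process only.
    printf 'php -d memory_limit=%s %s fos:elastica:populate --no-debug --no-reset --env=prod --offset=%s\n' \
        "$MEMORY_LIMIT" "$CONSOLE" "$offset"
}

# Example: show the command that resumes after the first 5,000,000 records.
populate_cmd 5000000
```

The function only prints the command, so you can check it before running it. The offset for the next batch still has to come from wherever the previous run stopped.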

Most solutions seem to involve writing a custom provider (see https://github.com/FriendsOfSymfony/FOSElasticaBundle/issues/457), but with a ridiculously high memory_limit and the leak reduced as much as possible, I didn't need to.


The main problem here is that everything is done in one process, so all entities have to be loaded into memory. It is done in chunks, but it still ends up loading all the data. There isn't much you can do about it, because the problem is in the design.

The solution: split the data into chunks that are processed in parallel by separate processes. The worker processes may quit from time to time (they have to be restarted by Supervisord or a similar tool), which frees their memory and resources. As a result, you get much better performance, better fault tolerance, and a smaller memory footprint.
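To make the chunking idea concrete, here is a rough sketch of the splitting step only. The chunk_offsets helper and the ./index-chunk.sh worker are hypothetical names; the worker stands in for whatever script indexes one chunk, and it is not something the bundle ships.

```shell
#!/bin/sh
# Sketch: split TOTAL records into fixed-size chunks and print the starting
# offset of each chunk, one per line.
chunk_offsets() {
    total="$1"
    size="$2"
    # A non-positive chunk size would loop forever, so refuse it.
    [ "$size" -gt 0 ] || { echo "chunk size must be positive" >&2; return 1; }
    offset=0
    while [ "$offset" -lt "$total" ]; do
        echo "$offset"
        offset=$((offset + size))
    done
}

# Example: 10 records in chunks of 4 start at offsets 0, 4 and 8.
chunk_offsets 10 4

# To process the chunks in parallel, each in its own short-lived process:
#   chunk_offsets "$TOTAL" "$SIZE" | xargs -P 4 -I{} ./index-chunk.sh {} "$SIZE"
# ./index-chunk.sh is a hypothetical worker that indexes SIZE records
# starting at the given offset and then exits, releasing its memory.
```

Because every worker exits after its chunk, whatever leaks inside one process is bounded by the chunk size rather than by the size of the whole data set.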

There are many ways to implement this (using forks, pthreads or message queues), but I personally suggest looking at enqueue/elastica-bundle. It improves the populate command by splitting the job and sending the pieces as messages.


If the --no-debug option is not sufficient, check whether you have a fingers_crossed handler configured and set its buffer_size, so that the buffered log records cannot grow without limit:

monolog:
    handlers:
        main:
            type: fingers_crossed
            action_level: critical
            handler: grouped
            excluded_404s:
                - ^
            buffer_size: 30