Processing a large result set with a Doctrine iterator

Published on 2019-11-16 • Modified on 2019-11-27

When having to process a large number of rows with Doctrine, it's not a good idea to use the standard getResult function as you run into memory problems very quickly. That's where the iterate function can be useful. It returns an IterableResult object you can loop on without being afraid of memory problems. Here is an example. Of course on my blog, I don't have millions of rows... maybe soon! 😉
I have tested this recently for the migration of a table to a new one, with functional rules. There were half a million rows to process. The batch took about 250mb of memory and five minutes to finish.
PS: Don't forget to turn off the debug mode!


<?php declare(strict_types=1);

// src/Controller/Snippet/Snippet56Trait.php

namespace App\Controller\Snippet;

use App\Entity\Article;
use App\Entity\ArticleRepository;
use Doctrine\Bundle\DoctrineBundle\Registry;

/**
 * I am using a PHP trait in order to isolate each snippet in a file.
 * This code should be called from a Symfony controller extending AbstractController (as of Symfony 4.2)
 * or Symfony\Bundle\FrameworkBundle\Controller\Controller (Symfony <= 4.1).
 * Services are injected in the main controller constructor.
 *
 * @property ArticleRepository $articleRepo
 */
trait Snippet56Trait
{
    public function snippet56(): void
    {
        $doctrine = $this->getDoctrine();
        if (!$doctrine instanceof Registry) {
            throw new \RuntimeException("Houston, We've Got a Problem. 💥");
        }

        $manager = $doctrine->getManager();
        $qb = $this->articleRepo->createQueryBuilder('get_all_articles');
        $processed = [];
        $batchSize = 3;
        $flushCount = 0;

        echo 'Memory before: '.round((memory_get_usage()/1024/1024), 2)." mb\n";
        foreach ($qb->getQuery()->iterate() as $row) {
            $article = $row[0] ?? null;
            if ($article instanceof Article) {
                if ((count($processed) % $batchSize) === 0) {
                    // persist here
                    $manager->flush();
                    $manager->clear(); // call clear, so memory can be freed.
                    ++$flushCount;
                }
                $processed[] = $article->getId();
            }
        }
        $manager->flush(); // for last rows
        $manager->clear();

        echo 'Memory after: '.round((memory_get_usage()/1024/1024), 2).' mb'.PHP_EOL;
        echo 'Number of flush/clear: '.($flushCount+1).PHP_EOL;
        echo 'Processed articles with IDs: '.implode(',', $processed);

        // That's it! 😁
    }
}

 Run this snippet  More on Stackoverflow   Read the doc