High memory use and possible leak with sequential access mode #268


Open
mzur opened this issue Apr 14, 2025 · 6 comments


mzur commented Apr 14, 2025

Hi, first of all thank you @jcupitt for this library and your incredible support!

I ran across an issue where sequential access mode seems to consume a lot more memory and also appears to leak: memory is not freed after processing, compared to random mode. Here is a minimal example script:

<?php

include 'vendor/autoload.php';

use Jcupitt\Vips\Image;

$access = 'sequential';
// $access = 'random';

for ($i=0; $i < 10; $i++) {
    $image = Image::newFromFile('my_image.jpg', ['access' => $access]);
    $width = $image->width;
    $height = $image->height;

    $buf = $image->crop(round($width / 2) - 150, round($height / 2) - 150, 300, 300)
        ->writeToBuffer('.jpg');
}

When I use sequential mode and run /usr/bin/time -v php my_script.php, I get:

	Command being timed: "php my_script.php"
	User time (seconds): 13.18
	System time (seconds): 5.63
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:18.82
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 15905256
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 4083940
	Voluntary context switches: 852
	Involuntary context switches: 672
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

So it uses about 15 GB to process the images. When I increase the number of iterations in the loop, the consumed memory increases further.

With random mode I get:

	Command being timed: "php my_script.php"
	User time (seconds): 5.20
	System time (seconds): 4.96
	Percent of CPU this job got: 181%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.61
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 472804
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 107770
	Voluntary context switches: 7281
	Involuntary context switches: 1091
	Swaps: 0
	File system inputs: 0
	File system outputs: 12657432
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

It is faster and only uses 500 MB.

The image I use is a heavily compressed JPEG (~60 MB):

$ vipsheader my_image.jpg
my_image.jpg: 46789x46169 uchar, 3 bands, srgb, jpegload

I can share the actual file via email but not publicly here.

I use vips 8.15.1 and jcupitt/vips 2.4.1.


jcupitt commented Apr 14, 2025

Hi @mzur,

This is the libvips operation cache -- it tracks recent operations and tries to reuse them. If you disable the cache with Vips\Config::cacheSetMax(0); (i.e. set the max cache size to 0), the sequential version runs in ~500 MB too. You will see memory use creep up over time due to heap fragmentation, but there's no leak.

random mode will decompress the entire image to a temporary file for random access, then reuse that file 10 times. sequential mode will decompress on every iteration, so it'll be a lot slower.


mzur commented Apr 14, 2025

Thanks! So now I have this:

<?php

include 'vendor/autoload.php';

use Jcupitt\Vips\Image;
use Jcupitt\Vips\Config;

$access = 'sequential';
// $access = 'random';

Config::cacheSetMax(0);

for ($i=0; $i < 10; $i++) {
    $image = Image::newFromFile('my_image.jpg', ['access' => $access]);
    $width = $image->width;
    $height = $image->height;

    $buf = $image->crop(round($width / 2) - 150, round($height / 2) - 150, 300, 300)
        ->writeToBuffer('.jpg');
}

But it still uses lots of memory, and the amount still increases with the number of loop iterations:

	Maximum resident set size (kbytes): 18410168


mzur commented Apr 14, 2025

cacheSetMax(0) only seems to have an effect with access 'random', where libvips will no longer reuse the temporary file.
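For completeness, php-vips exposes all three libvips cache limits, not just the operation count, so they can all be zeroed before the loop. This is a sketch under the assumption that the Config wrappers behave like their libvips counterparts; it did not necessarily change the outcome reported above:

```php
<?php

include 'vendor/autoload.php';

use Jcupitt\Vips\Config;

// Zero out every libvips cache limit, not just the operation count.
Config::cacheSetMax(0);      // max number of cached operations
Config::cacheSetMaxMem(0);   // max bytes of cached pixel data
Config::cacheSetMaxFiles(0); // max number of cached open files
```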


mzur commented Apr 16, 2025

I did some more experiments: I re-encoded the image and converted it to PNG, but nothing helped. Even if I found a way to reduce the amount of memory used, that would only delay the issue, since the memory would still stack up (only more slowly). I found no way to free it, and it also seems to fall outside PHP's memory_limit checks. So the only solution left to me is to configure my worker processes to restart after each job. That's an OK solution for me, and I don't think anything else can be done here, so I'll close this. Thanks again for everything you do here @jcupitt!
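The restart-after-each-job workaround can also be done without touching the worker configuration, by running each job in a short-lived child process so that all native memory libvips allocated is returned to the OS when the child exits. A minimal sketch, where crop_job.php is a hypothetical per-image script (not part of this report):

```php
<?php

// Run a command in a child process and return its exit code.
// passthru() forwards the child's output and captures the exit status.
function runJobIsolated(string $command): int
{
    passthru($command, $status);
    return $status;
}

// One fresh PHP process per image: any memory that cannot be freed
// in-process disappears when the child exits.
for ($i = 0; $i < 10; $i++) {
    if (runJobIsolated('php crop_job.php my_image.jpg') !== 0) {
        fwrite(STDERR, "job $i failed\n");
        break;
    }
}
```

The trade-off is process start-up cost per job, which is usually negligible next to decoding a 46k x 46k JPEG.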

mzur closed this as completed Apr 16, 2025

jcupitt commented Apr 16, 2025

Sorry, I was stuck on another issue. I'll have a look into this today.

jcupitt reopened this Apr 16, 2025

mzur commented Apr 16, 2025

No need to apologize at all! I don't think anything can be done about the continuously increasing memory usage; PHP just isn't designed for long-running scripts. My initial sequential-vs-random observation is irrelevant, since random mode just uses disk space instead of memory, as you pointed out. I'm fine with closing this issue.
