stream.read(8192) on image heavy repository returns 0 despite having more data #43

Closed
@dividuum

Description

I'm using gitdb in combination with GitPython to handle files directly from git repositories. So far this worked perfectly. Thanks!

Something odd is happening in a repository I'm handling at the moment. I've traced it down to the
read function within the DecompressMemMapReader. More precisely this line of code:

# if window is too small, make it larger so zip can decompress something

If I understand things correctly, this check tries to enlarge the input buffer (containing compressed data) to at least 8 bytes, so that the following self._zip.decompress call returns at least some data.

In my repository this doesn't help: len(dcompdat) is 0, and back in my own code a read(8192) returns '' even though more data is available. Changing 8192 to 8191 (or other arbitrary values) "fixes" this most of the time. I suspect this is the result of different internal buffering.

Not sure if it helps with finding a reason why decompress doesn't return anything from 8 input bytes, but the file responsible is an already compressed JPEG file.

How to fix this? Changing the minimum window size to 48 seems to help:

if self._cwe - self._cws < 48:
    self._cwe = self._cws + 48

but I'm not sure if this has consequences I don't fully understand. Any help would be appreciated.
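
To illustrate the suspected behaviour outside of gitdb, here is a minimal sketch using only the standard zlib module. It is not gitdb's actual code: the helper decompress_some, the 8-byte window default and the sample payload are assumptions made for the example. It shows that decompress() can legitimately return an empty result when the input slice is too small (a single byte cannot even cover the 2-byte zlib header), and that a reader could keep feeding input until some output appears instead of treating an empty result as end of data.

```python
import zlib

# Deterministic sample data standing in for the real blob (assumption).
payload = bytes(range(256)) * 64
compressed = zlib.compress(payload)

dobj = zlib.decompressobj()

# One input byte is less than the 2-byte zlib header, so nothing can
# be produced yet, although the stream is far from finished.
first = dobj.decompress(compressed[:1])


def decompress_some(dobj, data, pos, window=8):
    """Feed `window`-byte slices until output appears or input ends.

    Hypothetical helper: returns (output, new_position). An empty
    output means the input is exhausted or the stream has ended.
    """
    out = b""
    while not out and pos < len(data) and not dobj.eof:
        out = dobj.decompress(data[pos:pos + window])
        pos += window
    return out, pos


# Resume after the single byte consumed above and drain the stream.
out, pos = decompress_some(dobj, compressed, 1)
chunks = [out]
while out:
    out, pos = decompress_some(dobj, compressed, pos)
    chunks.append(out)
result = b"".join(chunks)
```

If this matches what happens inside read, a fixed larger minimum window (such as 48) may only make the empty result less likely, whereas looping until output appears or the input is exhausted would not depend on a particular size.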
