stream.read(8192) on image heavy repository returns 0 despite having more data

I'm using gitdb in combination with GitPython to handle files directly from git repositories. So far this worked perfectly. Thanks!

Something odd is happening in a repository I'm handling at the moment. I've traced it down to the 
`read` function within the `DecompressMemMapReader`. More precisely this line of code: https://github.com/gitpython-developers/gitdb/blob/90c4f25493b918ff9dc4ee52ae8216a554bb3446/gitdb/stream.py#L275

If I understand things correctly, this check tries to enlarge the input buffer (containing compressed data) to at least 8 bytes, so the following `self._zip.decompress` call returns at last some data.

In my repository this doesn't help: `len(dcompdat)` is `0` and all the way back in my code, a `read(8192)` returns `''` despite more data being available. Changing `8192` to `8191` (or other random values) most of the time "fixes" this. I suspect this is the result of different internal buffering.

Not sure if it helps with finding a reason why `decompress` doesn't return anything from 8 input bytes, but the file responsible is an already compressed JPEG file.

How to fix this? Changing the minimum window size to 48 seems to help
```
if self._cwe - self._cws < 48:
    self._cwe = self._cws + 48
```
but I'm not sure if this has consequences I don't fully understand. Any help would be appreciated.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

stream.read(8192) on image heavy repository returns 0 despite having more data #43

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

stream.read(8192) on image heavy repository returns 0 despite having more data #43

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions