I'm trying to extract data from an HTTrack cache zip, documented at https://www.httrack.com/html/cache.html. It stores per-file data (which I need) in the extra field of each file's local header. Currently, the extra field of the central directory's file header is accessible via ZipInfo.extra, but as far as I can tell from the spec, the two are not required to be identical. For example, 7zip writes NTFS timestamps to the central directory but not to the local headers, according to https://sourceforge.net/p/sevenzip/bugs/2313/
Proposal
Add some way to access this field, and any other interesting local header fields, to ZipFile. All the other fields are ones where it only makes sense for a file to have a single value, like the CRC or filename, so it's probably just this one field that matters if only sane zips need to be supported.
Therefore, there could be a method such as ZipFile.getlocalheaderextra(name), functioning roughly like getinfo or read, and returning a bytes object.
Implementing this could be mostly a copy-and-paste job: ZipFile.open already finds the field and seeks past it.
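Until something like this exists in zipfile, the local extra field can be read by hand. Below is a minimal sketch, not the proposed API: the helper name get_local_header_extra is hypothetical, and it assumes a seekable binary file object on the archive plus the documented ZipInfo.header_offset attribute. The 30-byte fixed header layout is taken from the ZIP APPNOTE.

```python
import struct
import zipfile

# Fixed-size part of a local file header (30 bytes, little-endian):
# signature, version needed, flags, compression method, mod time, mod date,
# CRC-32, compressed size, uncompressed size, file name length, extra length.
_LOCAL_HEADER = struct.Struct("<4s5H3L2H")
_LOCAL_SIGNATURE = b"PK\x03\x04"


def get_local_header_extra(fileobj, info):
    """Return the raw extra field bytes from a member's local file header.

    Hypothetical helper, not part of zipfile. `fileobj` is a seekable binary
    file object on the archive and `info` is a ZipInfo from the same archive.
    """
    fileobj.seek(info.header_offset)
    raw = fileobj.read(_LOCAL_HEADER.size)
    if len(raw) != _LOCAL_HEADER.size or raw[:4] != _LOCAL_SIGNATURE:
        raise zipfile.BadZipFile("bad local file header")
    fields = _LOCAL_HEADER.unpack(raw)
    name_len, extra_len = fields[-2], fields[-1]
    # The file name comes first; the extra field follows it directly.
    fileobj.seek(name_len, 1)
    return fileobj.read(extra_len)
```

To use it, open the archive with ZipFile to get the ZipInfo, then pass a separate binary handle on the same file to the helper. The result is the undecoded byte string, in the same form as ZipInfo.extra, so it still has to be split into (header ID, size, data) records by the caller.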
Add new attribute local_header to zipinfo to access
information about the local file header as bytes.
Fixes: python#113994
Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
Code referenced in the proposal (ZipFile.open): cpython/Lib/zipfile/__init__.py, line 1634 in ac92527
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response