Help request to compute the amount of disk space and bandwidth wasted by useless published items? #178676
Replies: 2 comments 3 replies
-
There's no public way to get that data directly. npm doesn't expose file-by-file sizes for every package.

What you can do is grab dist.unpackedSize from the registry API and then use something like jsDelivr to list what's inside each package. From there, you can check how much of it is stuff like .map, test, or src files and figure out roughly what percentage is junk.

If you want a global estimate, just sample a few thousand popular packages, calculate the waste ratio, and scale it up. For anything registry-wide, you'd need npm's internal data, and they don't publish that.
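To make the "waste ratio" idea concrete, here is a rough Python sketch. Treat all of it as assumption rather than an established method: the junk rules (`JUNK_SUFFIXES`, `JUNK_DIRS`) are made-up heuristics, the function names are illustrative, and `list_files` assumes the jsDelivr data API returns a flat `files` list with `name` and `size` fields for the URL shown. Note that flagging `src` as junk will be wrong for packages whose entry point lives there.

```python
import json
import urllib.request
from pathlib import PurePosixPath

# Heuristics only: what counts as "junk" is an assumption, not an npm rule.
JUNK_SUFFIXES = (".map",)
JUNK_DIRS = {"src", "test", "tests", "__tests__"}


def is_junk(path: str) -> bool:
    """Return True if the path looks like a source map, test, or raw source file."""
    p = PurePosixPath(path.lstrip("/"))
    if p.name.endswith(JUNK_SUFFIXES):
        return True
    # Flag files that sit under a directory named like a test or source folder.
    return any(part in JUNK_DIRS for part in p.parts[:-1])


def waste_ratio(files: list[tuple[str, int]]) -> float:
    """Share of total bytes taken by junk files, given (path, size) pairs."""
    total = sum(size for _, size in files)
    junk = sum(size for path, size in files if is_junk(path))
    return junk / total if total else 0.0


def list_files(name: str, version: str) -> list[tuple[str, int]]:
    """Fetch (path, size) pairs for one package version.

    Assumed endpoint and response shape (jsDelivr data API, flat listing).
    """
    url = f"https://data.jsdelivr.com/v1/packages/npm/{name}@{version}?structure=flat"
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = json.load(resp)
    return [(f["name"], f["size"]) for f in data["files"]]
```

For the global estimate, you would run `waste_ratio(list_files(name, version))` over your sample and weight each package's ratio by its size (or downloads) before scaling up, since a plain average lets tiny packages skew the result.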
-
You can get a reasonable estimate using the npm registry metadata and a small crawler. The registry.npmjs.org/<package> endpoint returns every version of a package along with its tarball URL, and each version's manifest carries a dist.unpackedSize field, which is the size after unpacking (some older versions may lack it). Summing that across all versions gives the total disk footprint, which is the upper bound for the waste.

For a quick proof of concept, try scripting it with Node.js + got or axios to pull package metadata and log unpacked sizes per version. If you only want production code, you'll also need a per-file listing, since the metadata doesn't include one: download and unpack the tarball, then filter out .md, .map, and test files before summing.

If you're analyzing bandwidth too, note that the metadata has no compressed-size field, so take each tarball's size from the Content-Length of its URL and multiply it by the download counts npm publishes per package through api.npmjs.org/downloads.
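Here is a minimal sketch of that crawler step, in Python rather than Node.js purely for brevity. The function names are illustrative, and `wasted_bandwidth` rests on a loud assumption: that the junk share of the compressed tarball equals the junk share of the unpacked files, which will not hold exactly because file types compress differently. The download count is an input you would have to fetch separately.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def fetch_packument(name: str) -> dict:
    """Fetch the full registry document (all versions) for one package."""
    # Scoped names such as "@scope/pkg" get their slash percent-encoded.
    url = "https://registry.npmjs.org/" + urllib.parse.quote(name, safe="@")
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def unpacked_sizes(packument: dict) -> dict[str, int]:
    """Map each version to dist.unpackedSize, skipping versions that lack it."""
    sizes = {}
    for version, manifest in packument.get("versions", {}).items():
        size = manifest.get("dist", {}).get("unpackedSize")
        if size is not None:
            sizes[version] = size
    return sizes


def tarball_size(tarball_url: str) -> Optional[int]:
    """Compressed size from a HEAD request's Content-Length, if the server sends one."""
    req = urllib.request.Request(tarball_url, method="HEAD")
    with urllib.request.urlopen(req, timeout=30) as resp:
        length = resp.headers.get("Content-Length")
    return int(length) if length is not None else None


def wasted_bandwidth(tarball_bytes: int, downloads: int, ratio: float) -> float:
    """Rough wasted bytes transferred: size x downloads x junk ratio (assumption)."""
    return tarball_bytes * downloads * ratio
```

A typical run would call `unpacked_sizes(fetch_packument("some-package"))` and sum the values for the disk-space side, then feed `tarball_size(...)` and a download count into `wasted_bandwidth` for the transfer side. Be gentle with request rates if you crawl more than a handful of packages.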
-
Select Topic Area
Question
Body
Dear maintainers,
I'd like to conduct a survey about the amount of disk space and bandwidth wasted on the npm registry by items that are never used by package consumers - i.e. items that are published in the registry and are never imported by projects and / or products, because they technically can't be imported or because they are not supposed to be.
Typically, the following items are a waste of resources, hanging around, only taking space in the registry and in the client systems:
If you need an example of what I mean, svgicons2svgfont is a good one: https://www.npmjs.com/package/svgicons2svgfont?activeTab=code
The package comes with sources, source maps, tests, and a package.json that holds 190 lines when only around 80-90 are actually needed to consume the package. It takes up 425 KB on my computer, when only 280 KB are reachable and required to make it work. Around 30% of its size is just wasted.
My objective is to compute the total amount of space occupied by this unneeded published information, across all packages (public and, ideally, private), in order to write a paper about it and, hopefully, raise awareness and initiate a change on both the npm and client sides.
Is there a way to achieve such a thing, using a public API, or maybe by having access to some data that npm's maintainers are likely to possess? I don't care about the content, except for package.json files. I'm only interested in file types (path and extension) and their size.
Thanks in advance,
Eric.