README.md: 1 addition & 1 deletion
@@ -6,6 +6,6 @@ Simply run the program `scraper.py` with all the dependencies from [requirements
The rest of the python files are:
-`to-new-skins.py` for cleaning the data by converting all pre-1.8 skins to their newer version using `mc_skin_updater.py` from https://github.com/RandomGamingDev/mc_skin_updater_py. This script expects the default structure from scraping (which it keeps the same).
-`to-imagefolder.py` for converting to a format that's easier for use in things like HuggingFace (although I still recommend you do stuff like zip the file) and for general processing. This script expects the default structure from scraping (which it converts to the imagefolder structure).
- -`to-1dir-dataset.py` for converting to a format that's easier to use for multiple projects. This script expects the imagefolder structure from converting the data via `to-imagefolder.py` (which it converts to a 1 directory based basic structure).
+ -`to-1dir-dataset.py` for converting to a format that's easier to use for multiple projects. This script expects the imagefolder structure from converting the data via `to-imagefolder.py` (which it converts to a 1 directory based basic structure where there's the images with a txt file of the same front part of the name with a description created from the name, category, and description).
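
To make the single-directory layout concrete, here is a rough sketch of that kind of conversion. It is not the actual `to-1dir-dataset.py`. It assumes the imagefolder directory holds a `metadata.csv` with the columns `file_name`, `name`, `category`, and `description`, and it joins those three fields with commas to form the caption. The column names, the caption format, and the function names are all assumptions made for illustration.

```python
import csv
import shutil
from pathlib import Path


def build_caption(name: str, category: str, description: str) -> str:
    """Join the non-empty fields into one caption line (the format is an assumption)."""
    parts = [p.strip() for p in (name, category, description) if p and p.strip()]
    return ", ".join(parts)


def to_one_dir(src: Path, dst: Path) -> int:
    """Copy every image listed in src/metadata.csv into dst, each with a sibling .txt caption.

    Returns the number of images written.
    """
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    # Assumed layout: metadata.csv sits at the root of the imagefolder directory.
    with open(src / "metadata.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            image = src / row["file_name"]
            if not image.is_file():
                continue  # skip rows whose image is missing on disk
            target = dst / image.name  # flatten: keep only the file name
            shutil.copy2(image, target)
            caption = build_caption(
                row.get("name", ""),
                row.get("category", ""),
                row.get("description", ""),
            )
            # Same front part of the name, .txt extension.
            target.with_suffix(".txt").write_text(caption + "\n", encoding="utf-8")
            count += 1
    return count
```

Calling `to_one_dir(Path("imagefolder"), Path("dataset"))` would then leave pairs such as `steve.png` and `steve.txt` side by side in `dataset`. Because the output is flattened, two images with the same file name in different subfolders would overwrite each other, so a real script would need some renaming scheme.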
Note: This doesn't use any async or multithreaded code; it's written entirely with synchronous code. This makes it easier for more people to understand, but far slower, and tbh part of it is just that I don't feel like optimizing it any further since this is sufficient for my needs. However, if you feel like optimizing it and creating a fork or pull request, go right ahead :D