The primary mechanism for loading datasets is the dataset function, coupled with open() to open the resulting DataSet as some Julia type.
dataset
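As a minimal sketch of this pattern, the snippet below looks up a dataset by name and opens it first as a String and then as an IO stream. It assumes that a dataset with the hypothetical name "a_text_file" is visible in the global data project and that its storage holds plain text; substitute a name from your own project.

```julia
using DataSets

# Assumption: a dataset named "a_text_file" (hypothetical) is available in
# the global data project for this session.
ds = dataset("a_text_file")      # look up the DataSet (metadata only)

# Open the data as a String, reading the whole content into memory.
text = open(String, ds)

# Or open it as an IO stream; the do-block form scopes the resource so it
# is released when the block exits.
open(IO, ds) do io
    for line in eachline(io)
        println(line)
    end
end
```

The do-block form is usually the better fit for large data, since nothing forces the full content into memory at once.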
In addition, DataSets.jl provides two macros, @datafunc and @datarun, to help in creating program entry points and running them. Note that these APIs aren't fully formed and might be deprecated before DataSets-1.0.
@datafunc
@datarun
The global data environment for the session is defined by DataSets.PROJECT, which is initialized from the JULIA_DATASETS_PATH environment variable. To load a data project from a particular TOML file, use DataSets.load_project.
DataSets.PROJECT
DataSets.load_project
DataSets.load_project!
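For illustration, a project can also be loaded explicitly from a TOML file and queried without touching the global environment. This is a sketch only: the file path and the dataset name are hypothetical placeholders, and the TOML file is assumed to declare a dataset with that name.

```julia
using DataSets

# Assumption: "path/to/Data.toml" (hypothetical) is a data project file
# that declares a dataset named "a_text_file" (also hypothetical).
project = DataSets.load_project("path/to/Data.toml")

# Look the dataset up in this specific project rather than in the global
# DataSets.PROJECT.
ds = dataset(project, "a_text_file")

# Open it as a String, as with a dataset from the global project.
text = open(String, ds)
```

Passing the project explicitly keeps scripts independent of how JULIA_DATASETS_PATH happens to be set in the calling shell.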
A DataSet is a holder for dataset metadata, including the type of the data and the method for access (the storage driver; see Storage Drivers). DataSets are managed in projects, which may be stacked together. The library provides several subtypes of DataSets.AbstractDataProject for this purpose, which are listed below. (Most users will simply need to configure the global data project via DataSets.PROJECT.)
DataSet
DataSets.AbstractDataProject
DataSets.DataProject
DataSets.StackedDataProject
DataSets.ActiveDataProject
DataSets.TomlFileDataProject
The metadata for a dataset may be updated using the DataSets.config! function.
DataSets.config!
DataSets provides some built-in data models, File and FileTree, for accessing file-like and directory-like data respectively. To modify these, use the functions newfile and newdir.
File
FileTree
newfile
newdir
To add a new kind of data storage backend, call DataSets.add_storage_driver.
DataSets.add_storage_driver