Address polymorphic return value for unique? #212
Labels
API change
`unique` was added very early on in gh-25. The review discussions have focused on other aspects. There is another issue that was not discussed: `unique` is the only function whose return type is polymorphic. It can return an array, or a tuple of 2, 3 or 4 arrays.

API doc: https://data-apis.org/array-api/latest/API_specification/set_functions.html#unique-x-return-counts-false-return-index-false-return-inverse-false
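To make the polymorphism concrete, here is a minimal sketch using NumPy's `np.unique`, which takes the same three boolean keywords. Using NumPy as a stand-in for the function in the spec is an assumption made for illustration; the sample array is arbitrary.

```python
import numpy as np

x = np.array([3, 1, 1, 2, 3])

# No keywords: a single array comes back.
only_values = np.unique(x)

# One keyword: the result silently becomes a 2-tuple of arrays.
with_counts = np.unique(x, return_counts=True)

# All three keywords: a 4-tuple of arrays.
everything = np.unique(
    x, return_index=True, return_inverse=True, return_counts=True
)

print(type(only_values).__name__)                    # ndarray
print(type(with_counts).__name__, len(with_counts))  # tuple 2
print(type(everything).__name__, len(everything))    # tuple 4
```

The return type depends on the runtime values of the keyword arguments, so it cannot be determined from the argument types alone.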
This is painful for, e.g., libraries that have to support it in their JIT compiler. For `svd`, which previously was polymorphic too, we decided that it was better to introduce `svdvals` to get rid of the polymorphism. There's even a design principle that says as much in https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html: "In general, interfaces should avoid polymorphic return values (e.g., returning an array or a namedtuple, dependent on, e.g., an optional keyword argument)."

Given that there are 3 boolean keywords, it's not feasible to split `unique` into separate functions (there'd be 8 of them). An alternative may be:

- always return an object with four fields (`unique`, `indices`, `inverse`, `counts`).
- if a keyword is `False`, accessing the corresponding field is undefined behavior

Another question that may be relevant is: how often are the boolean keywords each used? The data at https://raw.githubusercontent.com/data-apis/python-record-api/master/data/typing/numpy.py may help there:
Conclusion: a large majority of usage is `unique(x)`, without any keywords. `return_inverse` is fairly often used, `return_index` almost never (checked by searching Pandas, sklearn et al.).

So another alternative could be to include two separate functions:

- `unique(x, /)` - returns array of unique values
- `unique_all(x, /)` - returns tuple of 4 arrays (may need a better name)
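A rough sketch of how the two-function proposal could look, built on top of NumPy purely for illustration. The `UniqueAllResult` class name and the choice of a `NamedTuple` are assumptions, not part of the proposal; the field names are borrowed from the four fields listed earlier in this issue.

```python
from typing import NamedTuple

import numpy as np


class UniqueAllResult(NamedTuple):
    """Hypothetical container; the name and the field names are assumptions."""
    unique: np.ndarray
    indices: np.ndarray
    inverse: np.ndarray
    counts: np.ndarray


def unique(x, /):
    """Always returns a single array of unique values."""
    return np.unique(x)


def unique_all(x, /):
    """Always returns the same 4-field tuple, so the return type is fixed."""
    values, indices, inverse, counts = np.unique(
        x, return_index=True, return_inverse=True, return_counts=True
    )
    return UniqueAllResult(values, indices, inverse, counts)


x = np.array([3, 1, 1, 2, 3])
result = unique_all(x)
print(result.unique)   # unique values, sorted
print(result.counts)   # how often each unique value occurs
# The inverse indices reconstruct the input from the unique values.
print(np.array_equal(result.unique[result.inverse], x))
```

With this split, each function has exactly one return type, at the cost of computing all four arrays whenever anything beyond the unique values is needed.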