These are the changes in pandas 1.4.0. See :ref:`release` for a full changelog including other versions of pandas.
{{ header }}
Until now, it has only been possible to create numeric indexes with int64/float64/uint64 dtypes. It is now possible to create an index of any numpy int/uint/float dtype using the new :class:`NumericIndex` index type (:issue:`41153`):
.. ipython:: python

    pd.NumericIndex([1, 2, 3], dtype="int8")
    pd.NumericIndex([1, 2, 3], dtype="uint32")
    pd.NumericIndex([1, 2, 3], dtype="float32")
In order to maintain backwards compatibility, calls to the base :class:`Index` will currently
return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index`, where relevant.
For example, the code below returns an ``Int64Index`` with dtype ``int64``:

.. code-block:: ipython

    In [1]: pd.Index([1, 2, 3], dtype="int8")
    Int64Index([1, 2, 3], dtype='int64')

but will in a future version return a :class:`NumericIndex` with dtype ``int8``.
More generally, all operations that until now have
returned :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` will
continue to do so. This means that, in order to use ``NumericIndex`` in the current version, you
will have to call ``NumericIndex`` explicitly. For example, the series below will have an ``Int64Index``:

.. code-block:: ipython

    In [2]: ser = pd.Series([1, 2, 3], index=[1, 2, 3])
    In [3]: ser.index
    Int64Index([1, 2, 3], dtype='int64')

Instead, if you want to use a ``NumericIndex``, you should do:

.. ipython:: python

    idx = pd.NumericIndex([1, 2, 3], dtype="int8")
    ser = pd.Series([1, 2, 3], index=idx)
    ser.index
In a future version of pandas, :class:`NumericIndex` will become the default numeric index type and
``Int64Index``, ``UInt64Index`` and ``Float64Index`` are therefore deprecated and will
be removed in the future, see :ref:`here <whatsnew_140.deprecations.int64_uint64_float64index>` for more.
See :ref:`here <advanced.numericindex>` for more about :class:`NumericIndex`.
:class:`.Styler` has been further developed in 1.4.0. The following enhancements have been made:
- Styling and formatting of indexes has been added, with :meth:`.Styler.apply_index`, :meth:`.Styler.applymap_index` and :meth:`.Styler.format_index`. These mirror the signature of the methods already used to style and format data values, and work with HTML, LaTeX and Excel formats (:issue:`41893`, :issue:`43101`, :issue:`41993`, :issue:`41995`).
- :meth:`.Styler.bar` introduces additional arguments to control alignment and display (:issue:`26070`, :issue:`36419`), and it also validates the input arguments ``width`` and ``height`` (:issue:`42511`).
- :meth:`.Styler.to_latex` introduces keyword argument ``environment``, which also allows a specific "longtable" entry through a separate jinja2 template (:issue:`41866`).
- :meth:`.Styler.to_html` introduces keyword arguments ``sparse_index``, ``sparse_columns``, ``bold_headers``, ``caption``, ``max_rows`` and ``max_columns`` (:issue:`41946`, :issue:`43149`, :issue:`42972`).
- Keyword arguments ``level`` and ``names`` added to :meth:`.Styler.hide_index` and :meth:`.Styler.hide_columns` for additional control of visibility of MultiIndexes and index names (:issue:`25475`, :issue:`43404`, :issue:`43346`)
- Global options have been extended to configure default ``Styler`` properties, including formatting, encoding, mathjax and LaTeX options (:issue:`41395`)
- Naive sparsification is now possible for LaTeX without the multirow package (:issue:`43369`)
- :meth:`.Styler.to_html` omits CSSStyle rules for hidden table elements (:issue:`43619`)
- Custom CSS classes can now be directly specified without string replacement (:issue:`43686`)
- Bug where row trimming failed to reflect hidden rows (:issue:`43703`, :issue:`44247`)
- Update and expand the export and use mechanics (:issue:`40675`)
- New method :meth:`.Styler.hide` added and deprecates :meth:`.Styler.hide_index` and :meth:`.Styler.hide_columns` (:issue:`43758`)
Formerly Styler relied on ``display.html.use_mathjax``, which has now been replaced by ``styler.html.mathjax``.
There are also bug fixes and deprecations listed below.
The ``caption`` argument is now validated (:issue:`43368`)
:func:`pandas.read_csv` now accepts ``engine="pyarrow"`` (requires at least ``pyarrow`` 1.0.1) as an argument, allowing for faster csv parsing on multicore machines
with pyarrow installed. See the :doc:`I/O docs </user_guide/io>` for more info. (:issue:`23697`, :issue:`43706`)
Added ``rank`` function to :class:`Rolling` and :class:`Expanding`. The new function supports the ``method``, ``ascending``, and ``pct`` flags of :meth:`DataFrame.rank`. The ``method`` argument supports ``min``, ``max``, and ``average`` ranking methods.
Example:

.. ipython:: python

    s = pd.Series([1, 4, 2, 3, 5, 3])
    s.rolling(3).rank()

    s.rolling(3).rank(method="max")
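As a further sketch (the variable names are illustrative and not part of the release notes), the same method is available on expanding windows, and the ``method`` flag controls how ties are ranked:

```python
import pandas as pd

s = pd.Series([1, 4, 2, 3, 5, 3])

# Rank of each value among all values seen so far; ties share the average rank.
expanding_rank = s.expanding().rank()

# Rank of the last value within each 3-element window; ties take the lowest rank.
rolling_min_rank = s.rolling(3).rank(method="min")

print(expanding_rank)
print(rolling_min_rank)
```

The first two entries of the rolling result are ``NaN`` because a full window of three values is not yet available.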
It is now possible to specify positional ranges relative to the ends of each group.
Negative arguments for :meth:`.GroupBy.head` and :meth:`.GroupBy.tail` now work correctly and result in ranges relative to the end and start of each group, respectively. Previously, negative arguments returned empty frames.
.. ipython:: python

    df = pd.DataFrame([["g", "g0"], ["g", "g1"], ["g", "g2"], ["g", "g3"],
                       ["h", "h0"], ["h", "h1"]], columns=["A", "B"])
    df.groupby("A").head(-1)
:meth:`.GroupBy.nth` now accepts a slice or list of integers and slices.
.. ipython:: python

    df.groupby("A").nth(slice(1, -1))
    df.groupby("A").nth([slice(None, 1), slice(-1, None)])
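A small sketch of the negative-argument behavior for both ``head`` and ``tail``, rebuilding the same frame so the snippet is self-contained (the result variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    [["g", "g0"], ["g", "g1"], ["g", "g2"], ["g", "g3"], ["h", "h0"], ["h", "h1"]],
    columns=["A", "B"],
)

# head(-1): every row of each group except the last one.
all_but_last = df.groupby("A").head(-1)

# tail(-1): every row of each group except the first one.
all_but_first = df.groupby("A").tail(-1)

print(all_but_last)
print(all_but_first)
```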
A new ``'tight'`` dictionary format that preserves :class:`MultiIndex` entries and names
is now available with the :meth:`DataFrame.from_dict` and :meth:`DataFrame.to_dict` methods
and can be used with the standard ``json`` library to produce a tight
representation of :class:`DataFrame` objects (:issue:`4889`).

.. ipython:: python

    df = pd.DataFrame.from_records(
        [[1, 3], [2, 4]],
        index=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")],
                                        names=["n1", "n2"]),
        columns=pd.MultiIndex.from_tuples([("x", 1), ("y", 2)],
                                          names=["z1", "z2"]),
    )
    df
    df.to_dict(orient='tight')
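The following sketch illustrates the round trip in memory: the dictionary produced by ``to_dict(orient="tight")`` is passed back to ``from_dict`` with the same ``orient``. The names ``tight`` and ``restored`` are illustrative.

```python
import pandas as pd

df = pd.DataFrame.from_records(
    [[1, 3], [2, 4]],
    index=pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")], names=["n1", "n2"]),
    columns=pd.MultiIndex.from_tuples([("x", 1), ("y", 2)], names=["z1", "z2"]),
)

# The tight format stores index, columns, data and both sets of names.
tight = df.to_dict(orient="tight")

# Rebuilding from the tight dictionary restores the MultiIndex entries and names.
restored = pd.DataFrame.from_dict(tight, orient="tight")

print(sorted(tight))
print(restored)
```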
- :class:`DataFrameGroupBy` operations with ``as_index=False`` now correctly retain ``ExtensionDtype`` dtypes for columns being grouped on (:issue:`41373`)
- Add support for assigning values to ``by`` argument in :meth:`DataFrame.plot.hist` and :meth:`DataFrame.plot.box` (:issue:`15079`)
- :meth:`Series.sample`, :meth:`DataFrame.sample`, and :meth:`.GroupBy.sample` now accept a ``np.random.Generator`` as input to ``random_state``. A generator will be more performant, especially with ``replace=False`` (:issue:`38100`)
- :meth:`Series.ewm`, :meth:`DataFrame.ewm`, now support a ``method`` argument with a ``'table'`` option that performs the windowing operation over an entire :class:`DataFrame`. See :ref:`Window Overview <window.overview>` for performance and functional benefits (:issue:`42273`)
- :meth:`.GroupBy.cummin` and :meth:`.GroupBy.cummax` now support the argument ``skipna`` (:issue:`34047`)
- :meth:`read_table` now supports the argument ``storage_options`` (:issue:`39167`)
- :meth:`DataFrame.to_stata` and :meth:`StataWriter` now accept the keyword only argument ``value_labels`` to save labels for non-categorical columns
- Methods that relied on hashmap based algos such as :meth:`DataFrameGroupBy.value_counts`, :meth:`DataFrameGroupBy.count` and :func:`factorize` ignored imaginary component for complex numbers (:issue:`17927`)
- Add :meth:`Series.str.removeprefix` and :meth:`Series.str.removesuffix` introduced in Python 3.9 to remove pre-/suffixes from string-type :class:`Series` (:issue:`36944`)
- Attempting to write into a file in missing parent directory with :meth:`DataFrame.to_csv`, :meth:`DataFrame.to_html`, :meth:`DataFrame.to_excel`, :meth:`DataFrame.to_feather`, :meth:`DataFrame.to_parquet`, :meth:`DataFrame.to_stata`, :meth:`DataFrame.to_json`, :meth:`DataFrame.to_pickle`, and :meth:`DataFrame.to_xml` now explicitly mentions missing parent directory, the same is true for :class:`Series` counterparts (:issue:`24306`)
- Indexing with ``.loc`` and ``.iloc`` now supports ``Ellipsis`` (:issue:`37750`)
- :meth:`IntegerArray.all`, :meth:`IntegerArray.any`, :meth:`FloatingArray.any`, and :meth:`FloatingArray.all` use Kleene logic (:issue:`41967`)
- Added support for nullable boolean and integer types in :meth:`DataFrame.to_stata`, :class:`~pandas.io.stata.StataWriter`, :class:`~pandas.io.stata.StataWriter117`, and :class:`~pandas.io.stata.StataWriterUTF8` (:issue:`40855`)
- :meth:`DataFrame.__pos__`, :meth:`DataFrame.__neg__` now retain ``ExtensionDtype`` dtypes (:issue:`43883`)
- The error raised when an optional dependency can't be imported now includes the original exception, for easier investigation (:issue:`43882`)
- Added :meth:`.ExponentialMovingWindow.sum` (:issue:`13297`)
- :meth:`Series.str.split` now supports a ``regex`` argument that explicitly specifies whether the pattern is a regular expression. Default is ``None`` (:issue:`43563`, :issue:`32835`, :issue:`25549`)
- :meth:`DataFrame.dropna` now accepts a single label as ``subset`` along with array-like (:issue:`41021`)
- :meth:`read_excel` now accepts a ``decimal`` argument that allows the user to specify the decimal point when parsing string columns to numeric (:issue:`14403`)
- :meth:`.GroupBy.mean` now supports Numba execution with the ``engine`` keyword (:issue:`43731`)
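A few of the enhancements listed above can be sketched together. The data and variable names below are made up for illustration; only the pandas and NumPy calls themselves come from the items above.

```python
import numpy as np
import pandas as pd

# Series.str.removeprefix strips the prefix only where it is present.
s = pd.Series(["str_foo", "str_bar", "no_prefix"])
stripped = s.str.removeprefix("str_")

# DataFrame.dropna accepts a single column label for ``subset``.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})
dropped = df.dropna(subset="a")

# sample() accepts a NumPy Generator as ``random_state``.
rng = np.random.default_rng(0)
sampled = df.sample(n=2, random_state=rng)

print(stripped.tolist())
print(dropped)
print(sampled)
```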
These are bug fixes that might have notable behavior changes.
The ``dayfirst`` option of :func:`to_datetime` isn't strict, and this can lead to surprising behaviour:

.. ipython:: python
    :okwarning:

    pd.to_datetime(["31-12-2021"], dayfirst=False)

Now, a warning will be raised if a date string cannot be parsed in accordance with the given ``dayfirst`` value when
the value is a delimited date string (e.g. ``31-12-2012``).
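One way to avoid the ambiguity is to state the day-first intent explicitly, either with ``dayfirst=True`` or with an explicit ``format``. This is a minimal sketch with made-up dates, not a statement of how the warning is implemented:

```python
import pandas as pd

dates = ["31-12-2021", "01-02-2022"]

# Declare that the first field is the day.
by_flag = pd.to_datetime(dates, dayfirst=True)

# Or spell out the exact layout of the strings.
by_format = pd.to_datetime(dates, format="%d-%m-%Y")

print(by_flag)
print(by_format)
```

Both calls parse the second string as 1 February 2022 rather than 2 January 2022.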
When using :func:`concat` to concatenate two or more :class:`DataFrame` objects, if one of the DataFrames was empty or had all-NA values, its dtype was *sometimes* ignored when finding the concatenated dtype. These are now consistently *not* ignored (:issue:`43507`).

.. ipython:: python

    df1 = pd.DataFrame({"bar": [pd.Timestamp("2013-01-01")]}, index=range(1))
    df2 = pd.DataFrame({"bar": np.nan}, index=range(1, 2))
    res = df1.append(df2)
Previously, the float-dtype in ``df2`` would be ignored so the result dtype would be ``datetime64[ns]``. As a result, the ``np.nan`` would be cast to ``NaT``.

Previous behavior:

.. code-block:: ipython

    In [4]: res
    Out[4]:
             bar
    0 2013-01-01
    1        NaT

Now the float-dtype is respected. Since the common dtype for these DataFrames is object, the ``np.nan`` is retained.

New behavior:

.. ipython:: python

    res
:meth:`Series.value_counts` and :meth:`Series.mode` no longer coerce ``None``, ``NaT`` and other null-values to a NaN-value for ``np.object``-dtype. This behavior is now consistent with ``unique``, ``isin`` and others (:issue:`42688`).

.. ipython:: python

    s = pd.Series([True, None, pd.NaT, None, pd.NaT, None])
    res = s.value_counts(dropna=False)

Previously, all null-values were replaced by a NaN-value.

Previous behavior:

.. code-block:: ipython

    In [3]: res
    Out[3]:
    NaN     5
    True    1
    dtype: int64

Now null-values are no longer mangled.

New behavior:

.. ipython:: python

    res
Some minimum supported versions of dependencies were updated. If installed, we now require:

=============== =============== ======== =======
Package         Minimum Version Required Changed
=============== =============== ======== =======
numpy           1.18.5          X        X
pytz            2020.1          X        X
python-dateutil 2.8.1           X        X
bottleneck      1.3.1           X
numexpr         2.7.1           X
pytest (dev)    6.0
mypy (dev)      0.910           X
=============== =============== ======== =======

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

============== =============== =======
Package        Minimum Version Changed
============== =============== =======
beautifulsoup4 4.8.2           X
fastparquet    0.4.0
fsspec         0.7.4
gcsfs          0.6.0
lxml           4.5.0           X
matplotlib     3.3.2           X
numba          0.50.1          X
openpyxl       3.0.2           X
pyarrow        1.0.1           X
pymysql        0.10.1          X
pytables       3.6.1           X
s3fs           0.4.0
scipy          1.4.1           X
sqlalchemy     1.3.11          X
tabulate       0.8.7
xarray         0.15.1          X
xlrd           2.0.1           X
xlsxwriter     1.2.2           X
xlwt           1.3.0
pandas-gbq     0.14.0          X
============== =============== =======

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.
- :meth:`Index.get_indexer_for` no longer accepts keyword arguments (other than 'target'); in the past these would be silently ignored if the index was not unique (:issue:`42310`)
:class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` have been deprecated in favor of the new :class:`NumericIndex` and will be removed in Pandas 2.0 (:issue:`43028`).
Currently, in order to maintain backward compatibility, calls to :class:`Index` will continue to return :class:`Int64Index`, :class:`UInt64Index` and :class:`Float64Index` when given numeric data, but in the future, a :class:`NumericIndex` will be returned.
Current behavior:

.. code-block:: ipython

    In [1]: pd.Index([1, 2, 3], dtype="int32")
    Out [1]: Int64Index([1, 2, 3], dtype='int64')
    In [2]: pd.Index([1, 2, 3], dtype="uint64")
    Out [2]: UInt64Index([1, 2, 3], dtype='uint64')

Future behavior:

.. code-block:: ipython

    In [3]: pd.Index([1, 2, 3], dtype="int32")
    Out [3]: NumericIndex([1, 2, 3], dtype='int32')
    In [4]: pd.Index([1, 2, 3], dtype="uint64")
    Out [4]: NumericIndex([1, 2, 3], dtype='uint64')

- Deprecated :meth:`Index.is_type_compatible` (:issue:`42113`)
- Deprecated ``method`` argument in :meth:`Index.get_loc`, use ``index.get_indexer([label], method=...)`` instead (:issue:`42269`)
- Deprecated treating integer keys in :meth:`Series.__setitem__` as positional when the index is a :class:`Float64Index` not containing the key, a :class:`IntervalIndex` with no entries containing the key, or a :class:`MultiIndex` with leading :class:`Float64Index` level not containing the key (:issue:`33469`)
- Deprecated treating ``numpy.datetime64`` objects as UTC times when passed to the :class:`Timestamp` constructor along with a timezone. In a future version, these will be treated as wall-times. To retain the old behavior, use ``Timestamp(dt64).tz_localize("UTC").tz_convert(tz)`` (:issue:`24559`)
- Deprecated ignoring missing labels when indexing with a sequence of labels on a level of a MultiIndex (:issue:`42351`)
- Creating an empty Series without a dtype will now raise a more visible ``FutureWarning`` instead of a ``DeprecationWarning`` (:issue:`30017`)
- Deprecated the 'kind' argument in :meth:`Index.get_slice_bound`, :meth:`Index.slice_indexer`, :meth:`Index.slice_locs`; in a future version passing 'kind' will raise (:issue:`42857`)
- Deprecated dropping of nuisance columns in :class:`Rolling`, :class:`Expanding`, and :class:`EWM` aggregations (:issue:`42738`)
- Deprecated :meth:`Index.reindex` with a non-unique index (:issue:`42568`)
- Deprecated :meth:`.Styler.render` in favour of :meth:`.Styler.to_html` (:issue:`42140`)
- Deprecated passing in a string column label into ``times`` in :meth:`DataFrame.ewm` (:issue:`43265`)
- Deprecated the 'include_start' and 'include_end' arguments in :meth:`DataFrame.between_time`; in a future version passing 'include_start' or 'include_end' will raise (:issue:`40245`)
- Deprecated the ``squeeze`` argument to :meth:`read_csv`, :meth:`read_table`, and :meth:`read_excel`. Users should squeeze the DataFrame afterwards with ``.squeeze("columns")`` instead. (:issue:`43242`)
- Deprecated the ``index`` argument to :class:`SparseArray` construction (:issue:`23089`)
- Deprecated the ``closed`` argument in :meth:`date_range` and :meth:`bdate_range` in favor of ``inclusive`` argument; in a future version passing ``closed`` will raise (:issue:`40245`)
- Deprecated :meth:`.Rolling.validate`, :meth:`.Expanding.validate`, and :meth:`.ExponentialMovingWindow.validate` (:issue:`43665`)
- Deprecated silent dropping of columns that raised a ``TypeError`` in :meth:`Series.transform` and :meth:`DataFrame.transform` when used with a dictionary (:issue:`43740`)
- Deprecated silent dropping of columns that raised a ``TypeError``, ``DataError``, and some cases of ``ValueError`` in :meth:`Series.aggregate`, :meth:`DataFrame.aggregate`, :meth:`Series.groupby.aggregate`, and :meth:`DataFrame.groupby.aggregate` when used with a list (:issue:`43740`)
- Deprecated casting behavior when setting timezone-aware value(s) into a timezone-aware :class:`Series` or :class:`DataFrame` column when the timezones do not match. Previously this cast to object dtype. In a future version, the values being inserted will be converted to the series or column's existing timezone (:issue:`37605`)
- Deprecated casting behavior when passing an item with mismatched-timezone to :meth:`DatetimeIndex.insert`, :meth:`DatetimeIndex.putmask`, :meth:`DatetimeIndex.where`, :meth:`DatetimeIndex.fillna`, :meth:`Series.mask`, :meth:`Series.where`, :meth:`Series.fillna`, :meth:`Series.shift`, :meth:`Series.replace`, :meth:`Series.reindex` (and :class:`DataFrame` column analogues). In the past this has cast to object dtype. In a future version, these will cast the passed item to the index or series's timezone (:issue:`37605`)
- Deprecated the 'errors' keyword argument in :meth:`Series.where`, :meth:`DataFrame.where`, :meth:`Series.mask`, and :meth:`DataFrame.mask`; in a future version the argument will be removed (:issue:`44294`)
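Several of the deprecations above name a replacement spelling. The sketch below shows those replacements side by side; the sample data, the ``America/New_York`` timezone and the variable names are illustrative assumptions.

```python
import io

import numpy as np
import pandas as pd

# Instead of Index.get_loc(label, method=...), use get_indexer with a list.
idx = pd.Index([10, 20, 30])
pos = idx.get_indexer([24], method="nearest")[0]

# Instead of read_csv(..., squeeze=True), squeeze the result afterwards.
col = pd.read_csv(io.StringIO("a\n1\n2\n")).squeeze("columns")

# Instead of date_range(..., closed="left"), use inclusive="left".
days = pd.date_range("2021-01-01", "2021-01-04", inclusive="left")

# Keep the old "datetime64 is UTC" interpretation explicitly.
dt64 = np.datetime64("2021-01-01T12:00")
ts = pd.Timestamp(dt64).tz_localize("UTC").tz_convert("America/New_York")

print(pos, col.tolist(), len(days), ts)
```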
- Performance improvement in :meth:`.GroupBy.sample`, especially when ``weights`` argument provided (:issue:`34483`)
- Performance improvement when converting non-string arrays to string arrays (:issue:`34483`)
- Performance improvement in :meth:`.GroupBy.transform` for user-defined functions (:issue:`41598`)
- Performance improvement in constructing :class:`DataFrame` objects (:issue:`42631`, :issue:`43142`, :issue:`43147`, :issue:`43307`, :issue:`43144`)
- Performance improvement in :meth:`GroupBy.shift` when ``fill_value`` argument is provided (:issue:`26615`)
- Performance improvement in :meth:`DataFrame.corr` for ``method=pearson`` on data without missing values (:issue:`40956`)
- Performance improvement in some :meth:`GroupBy.apply` operations (:issue:`42992`, :issue:`43578`)
- Performance improvement in :func:`read_stata` (:issue:`43059`, :issue:`43227`)
- Performance improvement in :func:`read_sas` (:issue:`43333`)
- Performance improvement in :meth:`to_datetime` with ``uint`` dtypes (:issue:`42606`)
- Performance improvement in :meth:`to_datetime` with ``infer_datetime_format`` set to ``True`` (:issue:`43901`)
- Performance improvement in :meth:`Series.sparse.to_coo` (:issue:`42880`)
- Performance improvement in indexing with a :class:`UInt64Index` (:issue:`43862`)
- Performance improvement in indexing with a :class:`Float64Index` (:issue:`43705`)
- Performance improvement in indexing with a non-unique Index (:issue:`43792`)
- Performance improvement in indexing with a listlike indexer on a :class:`MultiIndex` (:issue:`43370`)
- Performance improvement in indexing with a :class:`MultiIndex` indexer on another :class:`MultiIndex` (:issue:`43370`)
- Performance improvement in :meth:`GroupBy.quantile` (:issue:`43469`, :issue:`43725`)
- Performance improvement in :meth:`GroupBy.count` (:issue:`43730`, :issue:`43694`)
- Performance improvement in :meth:`GroupBy.any` and :meth:`GroupBy.all` (:issue:`43675`, :issue:`42841`)
- Performance improvement in :meth:`GroupBy.std` (:issue:`43115`, :issue:`43576`)
- Performance improvement in :meth:`GroupBy.cumsum` (:issue:`43309`)
- :meth:`SparseArray.min` and :meth:`SparseArray.max` no longer require converting to a dense array (:issue:`43526`)
- Indexing into a :class:`SparseArray` with a ``slice`` with ``step=1`` no longer requires converting to a dense array (:issue:`43777`)
- Performance improvement in :meth:`SparseArray.take` with ``allow_fill=False`` (:issue:`43654`)
- Performance improvement in :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.Rolling.sum`, :meth:`.Expanding.sum` with ``engine="numba"`` (:issue:`43612`, :issue:`44176`)
- Improved performance of :meth:`pandas.read_csv` with ``memory_map=True`` when file encoding is UTF-8 (:issue:`43787`)
- Performance improvement in :meth:`RangeIndex.sort_values` overriding :meth:`Index.sort_values` (:issue:`43666`)
- Performance improvement in :meth:`RangeIndex.insert` (:issue:`43988`)
- Performance improvement in :meth:`Index.insert` (:issue:`43953`)
- Performance improvement in :meth:`DatetimeIndex.tolist` (:issue:`43823`)
- Performance improvement in :meth:`DatetimeIndex.union` (:issue:`42353`)
- Performance improvement in :meth:`Series.nsmallest` (:issue:`43696`)
- Performance improvement in :meth:`DataFrame.insert` (:issue:`42998`)
- Performance improvement in :meth:`DataFrame.dropna` (:issue:`43683`)
- Performance improvement in :meth:`DataFrame.fillna` (:issue:`43316`)
- Performance improvement in :meth:`DataFrame.values` (:issue:`43160`)
- Performance improvement in :meth:`DataFrame.select_dtypes` (:issue:`42611`)
- Performance improvement in :class:`DataFrame` reductions (:issue:`43185`, :issue:`43243`, :issue:`43311`, :issue:`43609`)
- Performance improvement in :meth:`Series.unstack` and :meth:`DataFrame.unstack` (:issue:`43335`, :issue:`43352`, :issue:`42704`, :issue:`43025`)
- Performance improvement in :meth:`Series.to_frame` (:issue:`43558`)
- Performance improvement in :meth:`Series.mad` (:issue:`43010`)
- Performance improvement in :func:`merge` (:issue:`43332`)
- Performance improvement in :func:`concat` (:issue:`43354`)
- Bug in setting dtype-incompatible values into a :class:`Categorical` (or ``Series`` or ``DataFrame`` backed by ``Categorical``) raising ``ValueError`` instead of ``TypeError`` (:issue:`41919`)
- Bug in :meth:`Categorical.searchsorted` when passing a dtype-incompatible value raising ``KeyError`` instead of ``TypeError`` (:issue:`41919`)
- Bug in :meth:`Series.where` with ``CategoricalDtype`` when passing a dtype-incompatible value raising ``ValueError`` instead of ``TypeError`` (:issue:`41919`)
- Bug in :meth:`Categorical.fillna` when passing a dtype-incompatible value raising ``ValueError`` instead of ``TypeError`` (:issue:`41919`)
- Bug in :meth:`Categorical.fillna` with a tuple-like category raising ``ValueError`` instead of ``TypeError`` when filling with a non-category tuple (:issue:`41919`)
- Bug in :class:`DataFrame` constructor unnecessarily copying non-datetimelike 2D object arrays (:issue:`39272`)
- Bug in :func:`to_datetime` with ``format`` and ``pandas.NA`` was raising ``ValueError`` (:issue:`42957`)
- :func:`to_datetime` would silently swap ``MM/DD/YYYY`` and ``DD/MM/YYYY`` formats if the given ``dayfirst`` option could not be respected - now, a warning is raised in the case of delimited date strings (e.g. ``31-12-2012``) (:issue:`12585`)
- Bug in :meth:`date_range` and :meth:`bdate_range` do not return right bound when ``start`` = ``end`` and set is closed on one side (:issue:`43394`)
- Bug in inplace addition and subtraction of :class:`DatetimeIndex` or :class:`TimedeltaIndex` with :class:`DatetimeArray` or :class:`TimedeltaArray` (:issue:`43904`)
- Bug in calling ``np.isnan``, ``np.isfinite``, or ``np.isinf`` on a timezone-aware :class:`DatetimeIndex` incorrectly raising ``TypeError`` (:issue:`43917`)
- Bug in constructing a :class:`Series` from datetime-like strings with mixed timezones incorrectly partially-inferring datetime values (:issue:`40111`)
- Bug in division of all-``NaT`` :class:`TimedeltaIndex`, :class:`Series` or :class:`DataFrame` column with object-dtype arraylike of numbers failing to infer the result as timedelta64-dtype (:issue:`39750`)
- Bug in :func:`to_datetime` with ``infer_datetime_format=True`` failing to parse zero UTC offset (``Z``) correctly (:issue:`41047`)
- Bug in :meth:`Series.dt.tz_convert` resetting index in a :class:`Series` with :class:`CategoricalIndex` (:issue:`43080`)
- Bug in :meth:`DataFrame.rank` raising ``ValueError`` with ``object`` columns and ``method="first"`` (:issue:`41931`)
- Bug in :meth:`DataFrame.rank` treating missing values and extreme values as equal (for example ``np.nan`` and ``np.inf``), causing incorrect results when ``na_option="bottom"`` or ``na_option="top"`` is used (:issue:`41931`)
- Bug in ``numexpr`` engine still being used when the option ``compute.use_numexpr`` is set to ``False`` (:issue:`32556`)
- Bug in :class:`DataFrame` arithmetic ops with a subclass whose :meth:`_constructor` attribute is a callable other than the subclass itself (:issue:`43201`)
- Bug in arithmetic operations involving :class:`RangeIndex` where the result would have the incorrect ``name`` (:issue:`43962`)
- Bug in :class:`UInt64Index` constructor when passing a list containing both positive integers small enough to cast to int64 and integers too large to hold in int64 (:issue:`42201`)
- Bug in :class:`Series` constructor returning 0 for missing values with dtype ``int64`` and ``False`` for dtype ``bool`` (:issue:`43017`, :issue:`43018`)
- Bug in :class:`IntegerDtype` not allowing coercion from string dtype (:issue:`25472`)
- Bug in :func:`to_datetime` with ``arg:xr.DataArray`` and ``unit="ns"`` specified raises TypeError (:issue:`44053`)
- Bug in :meth:`Series.rename` when index in Series is MultiIndex and level in rename is provided. (:issue:`43659`)
- Bug in :meth:`DataFrame.truncate` and :meth:`Series.truncate` when the object's Index has a length greater than one but only one unique value (:issue:`42365`)
- Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` with a :class:`MultiIndex` when indexing with a tuple in which one of the levels is also a tuple (:issue:`27591`)
- Bug in :meth:`Series.loc` with a :class:`MultiIndex` whose first level contains only ``np.nan`` values (:issue:`42055`)
- Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`DatetimeIndex` when passing a string, the return type depended on whether the index was monotonic (:issue:`24892`)
- Bug in indexing on a :class:`MultiIndex` failing to drop scalar levels when the indexer is a tuple containing a datetime-like string (:issue:`42476`)
- Bug in :meth:`DataFrame.sort_values` and :meth:`Series.sort_values` when passing an ascending value, failed to raise or incorrectly raising ``ValueError`` (:issue:`41634`)
- Bug in updating values of :class:`pandas.Series` using boolean index, created by using :meth:`pandas.DataFrame.pop` (:issue:`42530`)
- Bug in :meth:`Index.get_indexer_non_unique` when index contains multiple ``np.nan`` (:issue:`35392`)
- Bug in :meth:`DataFrame.query` did not handle the degree sign in a backticked column name, such as `Temp(°C)`, used in an expression to query a dataframe (:issue:`42826`)
- Bug in :meth:`DataFrame.drop` where the error message did not show missing labels with commas when raising ``KeyError`` (:issue:`42881`)
- Bug in :meth:`DataFrame.query` where method calls in query strings led to errors when the ``numexpr`` package was installed. (:issue:`22435`)
- Bug in :meth:`DataFrame.nlargest` and :meth:`Series.nlargest` where sorted result did not count indexes containing ``np.nan`` (:issue:`28984`)
- Bug in indexing on a non-unique object-dtype :class:`Index` with an NA scalar (e.g. ``np.nan``) (:issue:`43711`)
- Bug in :meth:`DataFrame.__setitem__` incorrectly writing into an existing column's array rather than setting a new array when the new dtype and the old dtype match (:issue:`43406`)
- Bug in setting floating-dtype values into a :class:`Series` with integer dtype failing to set inplace when those values can be losslessly converted to integers (:issue:`44316`)
- Bug in :meth:`Series.__setitem__` with object dtype when setting an array with matching size and dtype='datetime64[ns]' or dtype='timedelta64[ns]' incorrectly converting the datetime/timedeltas to integers (:issue:`43868`)
- Bug in :meth:`DataFrame.sort_index` where ``ignore_index=True`` was not being respected when the index was already sorted (:issue:`43591`)
- Bug in :meth:`Index.get_indexer_non_unique` when index contains multiple ``np.datetime64("NaT")`` and ``np.timedelta64("NaT")`` (:issue:`43869`)
- Bug in setting a scalar :class:`Interval` value into a :class:`Series` with ``IntervalDtype`` when the scalar's sides are floats and the values' sides are integers (:issue:`44201`)
- Bug when setting string-backed :class:`Categorical` values that can be parsed to datetimes into a :class:`DatetimeArray` or :class:`Series` or :class:`DataFrame` column backed by :class:`DatetimeArray` failing to parse these strings (:issue:`44236`)
- Bug in :meth:`Series.__setitem__` with an integer dtype other than ``int64`` setting with a ``range`` object unnecessarily upcasting to ``int64`` (:issue:`44261`)
- Bug in :meth:`Series.__setitem__` with a boolean mask indexer setting a listlike value of length 1 incorrectly broadcasting that value (:issue:`44265`)
- Bug in :meth:`DataFrame.fillna` with limit and no method ignores axis='columns' or ``axis = 1`` (:issue:`40989`)
- Bug in :meth:`DataFrame.fillna` not replacing missing values when using a dict-like ``value`` and duplicate column names (:issue:`43476`)
- Bug in :meth:`MultiIndex.get_loc` where the first level is a :class:`DatetimeIndex` and a string key is passed (:issue:`42465`)
- Bug in :meth:`MultiIndex.reindex` when passing a ``level`` that corresponds to an ``ExtensionDtype`` level (:issue:`42043`)
- Bug in :meth:`MultiIndex.get_loc` raising ``TypeError`` instead of ``KeyError`` on nested tuple (:issue:`42440`)
- Bug in :meth:`MultiIndex.putmask` where the other value was also a :class:`MultiIndex` (:issue:`43212`)
- Bug in :func:`read_excel` attempting to read chart sheets from .xlsx files (:issue:`41448`)
- Bug in :func:`json_normalize` where ``errors=ignore`` could fail to ignore missing values of ``meta`` when ``record_path`` has a length greater than one (:issue:`41876`)
- Bug in :func:`read_csv` with multi-header input and arguments referencing column names as tuples (:issue:`42446`)
- Bug in :func:`read_fwf`, where difference in lengths of ``colspecs`` and ``names`` was not raising ``ValueError`` (:issue:`40830`)
- Bug in :func:`Series.to_json` and :func:`DataFrame.to_json` where some attributes were skipped when serialising plain Python objects to JSON (:issue:`42768`, :issue:`33043`)
- Column headers are dropped when constructing a :class:`DataFrame` from a sqlalchemy's ``Row`` object (:issue:`40682`)
- Bug in unpickling a :class:`Index` with object dtype incorrectly inferring numeric dtypes (:issue:`43188`)
- Bug in :func:`read_csv` where reading multi-header input with unequal lengths incorrectly raising uncontrolled ``IndexError`` (:issue:`43102`)
- Bug in :func:`read_csv`, changed exception class when expecting a file path name or file-like object from ``OSError`` to ``TypeError`` (:issue:`43366`)
- Bug in :func:`read_json` not handling non-numpy dtypes correctly (especially ``category``) (:issue:`21892`, :issue:`33205`)
- Bug in :func:`json_normalize` where multi-character ``sep`` parameter is incorrectly prefixed to every key (:issue:`43831`)
- Bug in :func:`read_csv` with ``float_precision="round_trip"`` which did not skip initial/trailing whitespace (:issue:`43713`)
- Bug in dumping/loading a :class:`DataFrame` with ``yaml.dump(frame)`` (:issue:`42748`)
- Bug in adding a :class:`Period` object to a ``np.timedelta64`` object incorrectly raising ``TypeError`` (:issue:`44182`)
- Fixed bug in :meth:`SeriesGroupBy.apply` where passing an unrecognized string argument failed to raise ``TypeError`` when the underlying ``Series`` is empty (:issue:`42021`)
- Bug in :meth:`Series.rolling.apply`, :meth:`DataFrame.rolling.apply`, :meth:`Series.expanding.apply` and :meth:`DataFrame.expanding.apply` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`42287`)
- Bug in :meth:`GroupBy.max` and :meth:`GroupBy.min` with nullable integer dtypes losing precision (:issue:`41743`)
- Bug in :meth:`DataFrame.groupby.rolling.var` would calculate the rolling variance only on the first group (:issue:`42442`)
- Bug in :meth:`GroupBy.shift` that would return the grouping columns if ``fill_value`` was not None (:issue:`41556`)
- Bug in :meth:`SeriesGroupBy.nlargest` and :meth:`SeriesGroupBy.nsmallest` would have an inconsistent index when the input Series was sorted and ``n`` was greater than or equal to all group sizes (:issue:`15272`, :issue:`16345`, :issue:`29129`)
- Bug in :meth:`pandas.DataFrame.ewm`, where non-float64 dtypes were silently failing (:issue:`42452`)
- Bug in :meth:`pandas.DataFrame.rolling` operation along rows (``axis=1``) incorrectly omits columns containing ``float16`` and ``float32`` (:issue:`41779`)
- Bug in :meth:`Resampler.aggregate` did not allow the use of Named Aggregation (:issue:`32803`)
- Bug in :meth:`Series.rolling` when the :class:`Series` ``dtype`` was ``Int64`` (:issue:`43016`)
- Bug in :meth:`DataFrame.rolling.corr` when the :class:`DataFrame` columns was a :class:`MultiIndex` (:issue:`21157`)
- Bug in :meth:`DataFrame.groupby.rolling` when specifying ``on`` and calling ``__getitem__`` would subsequently return incorrect results (:issue:`43355`)
- Bug in :meth:`GroupBy.apply` with time-based :class:`Grouper` objects incorrectly raising ``ValueError`` in corner cases where the grouping vector contains a ``NaT`` (:issue:`43500`, :issue:`43515`)
- Bug in :meth:`GroupBy.mean` failing with ``complex`` dtype (:issue:`43701`)
- Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not calculating window bounds correctly for the first row when ``center=True`` and index is decreasing (:issue:`43927`)
- Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` for centered datetimelike windows with uneven nanosecond (:issue:`43997`)
- Bug in :meth:`GroupBy.nth` failing on ``axis=1`` (:issue:`43926`)
- Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not respecting right bound on centered datetime-like windows, if the index contains duplicates (:issue:`3944`)
- Improved error message when creating a :class:`DataFrame` column from a multi-dimensional :class:`numpy.ndarray` (:issue:`42463`)
- Bug in :func:`concat` creating :class:`MultiIndex` with duplicate level entries when concatenating a :class:`DataFrame` with duplicates in :class:`Index` and multiple keys (:issue:`42651`)
- Bug in :meth:`pandas.cut` on :class:`Series` with duplicate indices (:issue:`42185`) and non-exact :meth:`pandas.CategoricalIndex` (:issue:`42425`)
- Bug in :meth:`DataFrame.append` failing to retain dtypes when appended columns do not match (:issue:`43392`)
- Bug in :func:`concat` of ``bool`` and ``boolean`` dtypes resulting in ``object`` dtype instead of ``boolean`` dtype (:issue:`42800`)
- Bug in :func:`crosstab` when inputs are categorical Series, there are categories that are not present in one or both of the Series, and ``margins=True``. Previously the margin value for missing categories was ``NaN``. It is now correctly reported as 0 (:issue:`43505`)
- Bug in :func:`concat` would fail when the ``objs`` argument all had the same index and the ``keys`` argument contained duplicates (:issue:`43595`)
- Bug in :func:`concat` which ignored the ``sort`` parameter (:issue:`43375`)
- Fixed bug in :func:`merge` with :class:`MultiIndex` as column index for the ``on`` argument returning an error when assigning a column internally (:issue:`43734`)
- Bug in :func:`crosstab` would fail when inputs are lists or tuples (:issue:`44076`)
- Bug in :meth:`DataFrame.append` failing to retain ``index.name`` when appending a list of :class:`Series` objects (:issue:`44109`)
- Fixed metadata propagation in :meth:`DataFrame.apply` method, consequently fixing the same issue for :meth:`DataFrame.transform`, :meth:`DataFrame.nunique` and :meth:`DataFrame.mode` (:issue:`28283`)
- Bug in :meth:`DataFrame.sparse.to_coo` raising ``AttributeError`` when column names are not unique (:issue:`29564`)
- Bug in :meth:`SparseArray.max` and :meth:`SparseArray.min` raising ``ValueError`` for arrays with 0 non-null elements (:issue:`43527`)
- Bug in :meth:`DataFrame.sparse.to_coo` silently converting non-zero fill values to zero (:issue:`24817`)
- Bug in :class:`SparseArray` comparison methods with an array-like operand of mismatched length raising ``AssertionError`` or unclear ``ValueError`` depending on the input (:issue:`43863`)
- Bug in :func:`array` failing to preserve :class:`PandasArray` (:issue:`43887`)
- NumPy ufuncs ``np.abs``, ``np.positive``, ``np.negative`` now correctly preserve dtype when called on ExtensionArrays that implement ``__abs__, __pos__, __neg__``, respectively. In particular this is fixed for :class:`TimedeltaArray` (:issue:`43899`)
- Avoid raising ``PerformanceWarning`` about fragmented DataFrame when using many columns with an extension dtype (:issue:`44098`)
- Minor bug in :class:`.Styler` where the ``uuid`` at initialization maintained a floating underscore (:issue:`43037`)
- Bug in :meth:`.Styler.to_html` where the ``Styler`` object was updated if the ``to_html`` method was called with some args (:issue:`43034`)
- Bug in :meth:`.Styler.copy` where ``uuid`` was not previously copied (:issue:`40675`)
- Bug in :meth:`.Styler.apply` where functions which returned Series objects were not correctly handled in terms of aligning their index labels (:issue:`13657`, :issue:`42014`)
- Bug when rendering an empty DataFrame with a named index (:issue:`43305`).
- Bug when rendering a single level MultiIndex (:issue:`43383`).
- Bug when combining non-sparse rendering and :meth:`.Styler.hide_columns` or :meth:`.Styler.hide_index` (:issue:`43464`)
- Bug setting a table style when using multiple selectors in :class:`.Styler` (:issue:`44011`)
- Bug in :meth:`CustomBusinessMonthBegin.__add__` (:meth:`CustomBusinessMonthEnd.__add__`) not applying the extra ``offset`` parameter when beginning (end) of the target month is already a business day (:issue:`41356`)
- Bug in :meth:`RangeIndex.union` with another ``RangeIndex`` with matching (even) ``step`` and starts differing by strictly less than ``step / 2`` (:issue:`44019`)
- Bug in :meth:`RangeIndex.difference` with ``sort=None`` and ``step<0`` failing to sort (:issue:`44085`)
- Bug in :meth:`Series.to_frame` and :meth:`Index.to_frame` ignoring the ``name`` argument when ``name=None`` is explicitly passed (:issue:`44212`)
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` with ``value=None`` and ExtensionDtypes (:issue:`44270`)
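
A few of the fixes listed above can be checked directly. The snippet below is a minimal sketch, not taken from the pandas test suite, and assumes a pandas version that includes these fixes. The sample data and the variable names (``flat``, ``mixed``, ``big``, ``ser``) are illustrative only. It covers the multi-character ``sep`` fix in :func:`json_normalize`, the ``bool``/``boolean`` fix in :func:`concat`, and the nullable-integer precision fix in :meth:`GroupBy.max`.

```python
import pandas as pd

# json_normalize: a multi-character separator should only join nested keys,
# not be prefixed to every key (illustrative input).
flat = pd.json_normalize({"a": {"b": 1}}, sep="__")
flat_columns = list(flat.columns)

# concat: combining a NumPy bool Series with a nullable boolean Series
# should give the nullable "boolean" dtype rather than object.
mixed = pd.concat(
    [pd.Series([True, False]), pd.Series([True, None], dtype="boolean")],
    ignore_index=True,
)
mixed_dtype = str(mixed.dtype)

# GroupBy.max: 2**53 + 1 cannot be represented exactly as float64, so it
# only survives if the nullable integers are not cast to float internally.
big = 2**53 + 1
ser = pd.Series([big, 1], dtype="Int64")
group_max = int(ser.groupby([0, 0]).max().iloc[0])

print(flat_columns, mixed_dtype, group_max == big)
```

On a version affected by the corresponding bugs, one or more of these values would be expected to differ, for example an ``object`` dtype for ``mixed``.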