Commit 2470690

PERF: Improve perf initializing DataFrame with a range (#30171)
1 parent 419dd39 commit 2470690

3 files changed: +17 -1 lines changed

asv_bench/benchmarks/frame_ctor.py

+12
@@ -105,4 +105,16 @@ def time_frame_from_lists(self):
         self.df = DataFrame(self.data)


+class FromRange:
+
+    goal_time = 0.2
+
+    def setup(self):
+        N = 1_000_000
+        self.data = range(N)
+
+    def time_frame_from_range(self):
+        self.df = DataFrame(self.data)
+
+
 from .pandas_vb_common import setup  # noqa: F401 isort:skip
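
The new benchmark can be approximated outside asv with a plain `timeit` run. This is only a rough sketch, not part of the commit: it assumes pandas is installed, uses a smaller `N` than the benchmark to keep the run short, and the `number=5` repeat count is an arbitrary choice.

```python
import timeit

import pandas as pd

N = 100_000  # smaller than the benchmark's 1_000_000 to keep the run short
data = range(N)

# Time the same call that time_frame_from_range makes in the benchmark.
elapsed = timeit.timeit(lambda: pd.DataFrame(data), number=5)
print(f"5 constructions of a {N}-row frame: {elapsed:.4f} s")

# A range yields a single-column frame with one row per element.
df = pd.DataFrame(data)
print(df.shape)  # (100000, 1)
```

Running this on builds before and after the commit should give a rough idea of the difference, although asv remains the better tool for stable measurements.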

doc/source/whatsnew/v1.0.0.rst

+1
@@ -634,6 +634,7 @@ Performance improvements
 - Performance improvement in indexing with a non-unique :class:`IntervalIndex` (:issue:`27489`)
 - Performance improvement in `MultiIndex.is_monotonic` (:issue:`27495`)
 - Performance improvement in :func:`cut` when ``bins`` is an :class:`IntervalIndex` (:issue:`27668`)
+- Performance improvement when initializing a :class:`DataFrame` using a ``range`` (:issue:`30171`)
 - Performance improvement in :meth:`DataFrame.corr` when ``method`` is ``"spearman"`` (:issue:`28139`)
 - Performance improvement in :meth:`DataFrame.replace` when provided a list of values to replace (:issue:`28099`)
 - Performance improvement in :meth:`DataFrame.select_dtypes` by using vectorization instead of iterating over a loop (:issue:`28317`)

pandas/core/internals/construction.py

+4 -1
@@ -246,10 +246,13 @@ def init_dict(data, index, columns, dtype=None):
 # ---------------------------------------------------------------------


-def prep_ndarray(values, copy=True):
+def prep_ndarray(values, copy=True) -> np.ndarray:
     if not isinstance(values, (np.ndarray, ABCSeries, Index)):
         if len(values) == 0:
             return np.empty((0, 0), dtype=object)
+        elif isinstance(values, range):
+            arr = np.arange(values.start, values.stop, values.step, dtype="int64")
+            return arr[..., np.newaxis]

         def convert(v):
             return maybe_convert_platform(v)
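
The added branch can be illustrated in isolation. The sketch below is not pandas code: `range_to_column` is a hypothetical helper name used only here, and it reproduces just the two added lines to show the shape and values they produce, assuming NumPy is available.

```python
import numpy as np


def range_to_column(values: range) -> np.ndarray:
    """Hypothetical helper mirroring the range branch added to prep_ndarray."""
    # np.arange builds the int64 array directly from start/stop/step,
    # so the range's elements are never materialized as Python ints.
    arr = np.arange(values.start, values.stop, values.step, dtype="int64")
    # Add a trailing axis so the result is 2-D with shape (n, 1): one column.
    return arr[..., np.newaxis]


col = range_to_column(range(0, 10, 3))
print(col.shape)             # (4, 1)
print(col.ravel().tolist())  # [0, 3, 6, 9]
```

In the commit, empty input is handled by the earlier `len(values) == 0` check, so the new branch only sees non-empty ranges.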
