BUG: Inconsistent behavior for step with slice for label-based indexing #63311

@ianhi

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "pandas==2.3.3",
# ]
# ///

import numpy as np
import pandas as pd

# Create a Series with non-contiguous integer index (step of 5)
# Index: 0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, ...
# Values: 0, 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, ...
T = np.arange(0, 100, 5)
series = pd.Series(np.arange(len(T)), index=T)

# Without step: returns all labels from 10 to 50 inclusive (label-based)
print("series.loc[10:50] (no step):")
print(series.loc[10:50])
print()

# With step=1: same as no step
print("series.loc[10:50:1] (step=1):")
print(series.loc[10:50:1])
print()

# With step=2: start/stop are matched against labels, but step is applied positionally
print("series.loc[10:50:2] (step=2):")
print(series.loc[10:50:2])
print()

# With step=5: same behavior as step=2 (every 5th position within the label range)
print("series.loc[10:50:5] (step=5):")
print(series.loc[10:50:5])
print()

# Using arange with the same arguments as the slice (note: arange excludes the stop value 50)
print("series.loc[np.arange(10,50,5)] (step=5):")
print(series.loc[np.arange(10, 50, 5)])
print()

Issue Description

When using .loc with a slice, start and stop are applied in a different space than step, which I found very counterintuitive. I was also unable to find any documentation for this (admittedly niche) use case.

  • start/stop are matched against the label values
  • step is applied positionally over the index

The output of the script above is:

series.loc[10:50] (no step):
10     2
15     3
20     4
25     5
30     6
35     7
40     8
45     9
50    10
dtype: int64

series.loc[10:50:1] (step=1):
10     2
15     3
20     4
25     5
30     6
35     7
40     8
45     9
50    10
dtype: int64

series.loc[10:50:2] (step=2):
10     2
20     4
30     6
40     8
50    10
dtype: int64

series.loc[10:50:5] (step=5):
10    2
35    7
dtype: int64

series.loc[np.arange(10,50,5)] (step=5):
10    2
15    3
20    4
25    5
30    6
35    7
40    8
45    9
dtype: int64
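
The split described in this issue (start/stop over labels, step over positions) can be checked directly. The following is a minimal sketch, assuming the behavior reported above for pandas 2.3.3; it compares the stepped label slice with a label slice followed by a positional step, and with the positional slice returned by Index.slice_indexer. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same Series as in the reproducible example: labels 0, 5, 10, ..., 95
T = np.arange(0, 100, 5)
series = pd.Series(np.arange(len(T)), index=T)

# Label-based slice with a step, as reported in the issue
stepped = series.loc[10:50:5]

# Label slice first, then a purely positional step over the result
two_stage = series.loc[10:50].iloc[::5]

# Positional slice that the index computes for the same label bounds and step
positions = series.index.slice_indexer(10, 50, 5)
via_positions = series.iloc[positions]

print(positions)
print(stepped.equals(two_stage), stepped.equals(via_positions))
```

If the three selections agree, the step is being consumed as a stride over positions rather than over label values.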

Expected Behavior

I would have expected either of the following:

Raise an error

Raise an error stating that step is ambiguous and cannot be used here. This seems to be the approach taken by IntervalIndex:

def _convert_slice_indexer(self, key: slice, kind: Literal["loc", "getitem"]):
    if not (key.step is None or key.step == 1):
        # GH#31658 if label-based, we require step == 1,
        # if positional, we disallow float start/stop
        msg = "label-based slicing with step!=1 is not supported for IntervalIndex"

(As a side note, I wasn't able to hit that code path.)

Step applies to Label Space

In my example, I would expect the slice with step=5 to behave the same as step=1, since stepping by 5 in label space hits every label in that range. My mental model is that, for integer labels as in my example,

series.loc[slice(start, stop, step)]

should be equivalent to

series.loc[np.arange(start, stop, step)]

and in more ambiguous cases, e.g. slice("a", "f", 3), an error should be raised.
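
As a rough illustration of this mental model, here is a minimal sketch of a workaround that applies the step in label space for integer labels. The helper name loc_label_step is hypothetical and not part of pandas. Two assumptions are made that the issue does not settle: the stop value is excluded (following np.arange), and labels missing from the index are silently skipped instead of raising KeyError.

```python
import numpy as np
import pandas as pd


def loc_label_step(series, start, stop, step):
    """Hypothetical helper: select the integer labels start, start+step, ...

    Assumption: stop is excluded, as in np.arange(start, stop, step).
    Assumption: labels that are absent from the index are skipped.
    """
    wanted = np.arange(start, stop, step)
    # Keep only the labels that actually exist in the index
    present = series.index[series.index.isin(wanted)]
    return series.loc[present]


T = np.arange(0, 100, 5)
series = pd.Series(np.arange(len(T)), index=T)

print(loc_label_step(series, 10, 50, 5))   # labels 10, 15, ..., 45
print(loc_label_step(series, 10, 50, 10))  # labels 10, 20, 30, 40
```

Skipping missing labels is a design choice made for this sketch only; raising on missing labels, as series.loc does with a list or array of labels, would be an equally reasonable alternative.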

Installed Versions

INSTALLED VERSIONS
------------------
commit                : 9c8bc3e55188c8aff37207a74f1dd144980b8874
python                : 3.13.0
python-bits           : 64
OS                    : Darwin
OS-release            : 24.6.0
Version               : Darwin Kernel Version 24.6.0: Mon Jul 14 11:30:51 PDT 2025; root:xnu-11417.140.69~1/RELEASE_ARM64_T8112
machine               : arm64
processor             : arm
byteorder             : little
LC_ALL                : None
LANG                  : en_US.UTF-8
LOCALE                : en_US.UTF-8

pandas                : 2.3.3
numpy                 : 2.3.5
pytz                  : 2025.2
dateutil              : 2.9.0.post0
pip                   : None
Cython                : None
sphinx                : None
IPython               : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
blosc                 : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
html5lib              : None
hypothesis            : None
gcsfs                 : None
jinja2                : None
lxml.etree            : None
matplotlib            : None
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : None
pandas_gbq            : None
psycopg2              : None
pymysql               : None
pyarrow               : None
pyreadstat            : None
pytest                : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : None
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
xlsxwriter            : None
zstandard             : None
tzdata                : 2025.2
qtpy                  : None
pyqt5                 : None

Metadata

Assignees

No one assigned

Labels

Bug, Needs Triage (issue that has not been reviewed by a pandas team member)
