Skip to content

Commit 32bd658

Browse files
dmadisettiCopilotpre-commit-ci[bot]
authored
check + fix: Protections around user packges being used as site packages (#6638)
## 📝 Summary fixes #6553 We overload import spec to hook in for various format types. If there's a user file that matches an expected format type, we erroneously attempted to register the formatter for that import. This PR fixes this issue by introducing `marimo/_utils/site_packages.py` for utils to detect collisions, and a `marimo check` rule for users to more easily troubleshoot this issue in the future <img width="1204" height="261" alt="image" src="https://github.com/user-attachments/assets/0b4de0ae-7948-4f5b-97c7-47ccd5a1423b" /> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent cea1fef commit 32bd658

File tree

19 files changed

+662
-33
lines changed

19 files changed

+662
-33
lines changed

docs/guides/lint_rules/index.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,14 @@ These errors prevent notebook execution.
3333
| [MB004](rules/setup_cell_dependencies.md) | setup-cell-dependencies | Setup cell cannot have dependencies ||
3434
| [MB005](rules/invalid_syntax.md) | invalid-syntax | Cell contains code that throws a SyntaxError on compilation ||
3535

36+
### ⚠️ Runtime Rules
37+
38+
These issues may cause runtime problems.
39+
40+
| Code | Name | Description | Fixable |
41+
|------|------|-------------|----------|
42+
| [MR001](rules/self_import.md) | self-import | Importing a module with the same name as the file ||
43+
3644
### ✨ Formatting Rules
3745

3846
These are style and formatting issues.
Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
# MR001: self-import
2+
3+
⚠️ **Runtime** ❌ Not Fixable
4+
5+
MR001: Importing a module with the same name as the file.
6+
7+
## What it does
8+
9+
Analyzes import statements in each cell to detect cases where the imported
10+
module name matches the current file's name (without the .py extension).
11+
12+
## Why is this bad?
13+
14+
Importing a module with the same name as the file causes several issues:
15+
- Python may attempt to import the current file instead of the intended module
16+
- This can lead to circular import errors or unexpected behavior
17+
- It makes the code confusing and hard to debug
18+
- It can prevent the notebook from running correctly
19+
20+
This is a runtime issue because it can cause import confusion and unexpected behavior.
21+
22+
## Examples
23+
24+
**Problematic (in a file named `requests.py`):**
25+
```python
26+
import requests # Error: conflicts with file name
27+
```
28+
29+
**Problematic (in a file named `math.py`):**
30+
```python
31+
from math import sqrt # Error: conflicts with file name
32+
```
33+
34+
**Solution:**
35+
```python
36+
# Rename the file to something else, like my_requests.py
37+
import requests # Now this works correctly
38+
```
39+
40+
**Alternative Solution:**
41+
```python
42+
# Use a different approach that doesn't conflict
43+
import urllib.request # Use alternative library
44+
```
45+
46+
## References
47+
48+
- [Understanding Errors](https://docs.marimo.io/guides/understanding_errors/)
49+
- [Python Import System](https://docs.python.org/3/reference/import.html)
50+

marimo/_lint/rules/__init__.py

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,13 +2,15 @@
22
from marimo._lint.rules.base import LintRule
33
from marimo._lint.rules.breaking import BREAKING_RULE_CODES
44
from marimo._lint.rules.formatting import FORMATTING_RULE_CODES
5+
from marimo._lint.rules.runtime import RUNTIME_RULE_CODES
56

67
RULE_CODES: dict[str, type[LintRule]] = (
7-
BREAKING_RULE_CODES | FORMATTING_RULE_CODES
8+
BREAKING_RULE_CODES | RUNTIME_RULE_CODES | FORMATTING_RULE_CODES
89
)
910

1011
__all__ = [
1112
"RULE_CODES",
1213
"BREAKING_RULE_CODES",
14+
"RUNTIME_RULE_CODES",
1315
"FORMATTING_RULE_CODES",
1416
]
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
# Copyright 2025 Marimo. All rights reserved.
2+
from marimo._lint.rules.base import LintRule
3+
from marimo._lint.rules.runtime.self_import import SelfImportRule
4+
5+
RUNTIME_RULE_CODES: dict[str, type[LintRule]] = {
6+
"MR001": SelfImportRule,
7+
}
8+
9+
__all__ = [
10+
"SelfImportRule",
11+
"RUNTIME_RULE_CODES",
12+
]
Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
# Copyright 2025 Marimo. All rights reserved.
2+
from __future__ import annotations
3+
4+
import os
5+
from typing import TYPE_CHECKING
6+
7+
from marimo._lint.diagnostic import Diagnostic, Severity
8+
from marimo._lint.rules.breaking.graph import GraphRule
9+
from marimo._utils.site_packages import (
10+
has_local_conflict,
11+
)
12+
13+
if TYPE_CHECKING:
14+
from marimo._lint.context import RuleContext
15+
from marimo._runtime.dataflow import DirectedGraph
16+
17+
18+
class SelfImportRule(GraphRule):
19+
"""MR001: Importing a module with the same name as the file.
20+
21+
This rule detects attempts to import a module that has the same name as the
22+
current file. This can cause import conflicts, circular import issues, and
23+
unexpected behavior where the file might try to import itself instead of
24+
the intended external module.
25+
26+
## What it does
27+
28+
Analyzes import statements in each cell to detect cases where the imported
29+
module name matches the current file's name (without the .py extension).
30+
31+
## Why is this bad?
32+
33+
Importing a module with the same name as the file causes several issues:
34+
- Python may attempt to import the current file instead of the intended module
35+
- This can lead to circular import errors or unexpected behavior
36+
- It makes the code confusing and hard to debug
37+
- It can prevent the notebook from running correctly
38+
39+
This is a runtime issue because it can cause import confusion and unexpected behavior.
40+
41+
## Examples
42+
43+
**Problematic (in a file named `requests.py`):**
44+
```python
45+
import requests # Error: conflicts with file name
46+
```
47+
48+
**Problematic (in a file named `math.py`):**
49+
```python
50+
from math import sqrt # Error: conflicts with file name
51+
```
52+
53+
**Solution:**
54+
```python
55+
# Rename the file to something else, like my_requests.py
56+
import requests # Now this works correctly
57+
```
58+
59+
**Alternative Solution:**
60+
```python
61+
# Use a different approach that doesn't conflict
62+
import urllib.request # Use alternative library
63+
```
64+
65+
## References
66+
67+
- [Understanding Errors](https://docs.marimo.io/guides/understanding_errors/)
68+
- [Python Import System](https://docs.python.org/3/reference/import.html)
69+
"""
70+
71+
code = "MR001"
72+
name = "self-import"
73+
description = "Importing a module with the same name as the file"
74+
severity = Severity.RUNTIME
75+
fixable = False
76+
77+
async def _validate_graph(
78+
self, graph: DirectedGraph, ctx: RuleContext
79+
) -> None:
80+
"""Check for imports that conflict with the file name."""
81+
# Get the file name without extension
82+
if not ctx.notebook.filename:
83+
return
84+
85+
file_name = os.path.basename(ctx.notebook.filename)
86+
if file_name.endswith(".py"):
87+
module_name = file_name[:-3]
88+
else:
89+
# For .md or other extensions, we can't determine conflicts
90+
return
91+
92+
# Get directory containing the notebook file for local package checking
93+
notebook_dir = os.path.dirname(ctx.notebook.filename)
94+
95+
await self._check_cells_for_conflicts(
96+
graph, ctx, module_name, file_name, notebook_dir
97+
)
98+
99+
async def _check_cells_for_conflicts(
100+
self,
101+
graph: DirectedGraph,
102+
ctx: RuleContext,
103+
module_name: str,
104+
file_name: str,
105+
notebook_dir: str,
106+
) -> None:
107+
"""Check all cells for import conflicts using compiled cell data."""
108+
for cell_id, cell_impl in graph.cells.items():
109+
# Check imports from compiled cell data
110+
for variable, var_data_list in cell_impl.variable_data.items():
111+
for var_data in var_data_list:
112+
if var_data.import_data is None:
113+
continue
114+
115+
import_data = var_data.import_data
116+
top_level_module = import_data.module.split(".")[0]
117+
fix_msg = f"Rename the file to avoid conflicts with the '{top_level_module}' module. "
118+
if top_level_module == module_name:
119+
# Standard self-import conflict
120+
message = f"Importing module '{top_level_module}' conflicts with file name '{file_name}'"
121+
# Check if there's a local file/package with the same name
122+
elif has_local_conflict(top_level_module, notebook_dir):
123+
# Module exists in site-packages - enhanced diagnostic
124+
message = (
125+
f"Importing module '{top_level_module}' conflicts "
126+
"with a module in site-packages, and may cause import ambiguity."
127+
)
128+
else:
129+
continue
130+
131+
line, column = self._get_variable_line_info(
132+
cell_id, variable, ctx
133+
)
134+
diagnostic = Diagnostic(
135+
message=message,
136+
line=line,
137+
column=column,
138+
code=self.code,
139+
name=self.name,
140+
severity=self.severity,
141+
fixable=self.fixable,
142+
fix=fix_msg,
143+
)
144+
await ctx.add_diagnostic(diagnostic)

marimo/_lint/visitors.py

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,3 +38,27 @@ def visit_ClassDef(self, node: ast.ClassDef) -> None:
3838
self.column_number = node.col_offset + 1
3939
return
4040
self.generic_visit(node)
41+
42+
def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
43+
"""Visit ImportFrom nodes to find imported variable definitions."""
44+
for alias in node.names:
45+
if (
46+
alias.asname == self.target_variable
47+
or alias.name == self.target_variable
48+
):
49+
self.line_number = node.lineno
50+
self.column_number = node.col_offset + 1
51+
return
52+
self.generic_visit(node)
53+
54+
def visit_Import(self, node: ast.Import) -> None:
55+
"""Visit Import nodes to find imported variable definitions."""
56+
for alias in node.names:
57+
if (
58+
alias.asname == self.target_variable
59+
or alias.name == self.target_variable
60+
):
61+
self.line_number = node.lineno
62+
self.column_number = node.col_offset + 1
63+
return
64+
self.generic_visit(node)

marimo/_output/formatters/formatters.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,7 @@
3838
from marimo._output.formatters.structures import StructuresFormatter
3939
from marimo._output.formatters.sympy_formatters import SympyFormatter
4040
from marimo._output.formatters.tqdm_formatters import TqdmFormatter
41+
from marimo._utils.site_packages import is_local_module
4142

4243
LOGGER = _loggers.marimo_logger()
4344

@@ -123,6 +124,10 @@ def find_spec( # type:ignore[no-untyped-def]
123124
if spec is None:
124125
return spec
125126

127+
# Skip patching for local modules (not under site-packages)
128+
if is_local_module(spec):
129+
return spec
130+
126131
if spec.loader is not None and fullname in third_party_factories:
127132
# We're now in the process of importing a module with
128133
# an associated formatter factory. We'll hook into its

0 commit comments

Comments
 (0)