Commit 6f992fe

feat: add sql solution to lc problem: No.3451 (#4057)
No.3451.Find Invalid IP Addresses
1 parent 0d6c8dd commit 6f992fe

9 files changed, +372 -0 lines changed

.prettierignore

+1
@@ -26,3 +26,4 @@ node_modules/
 /solution/3100-3199/3150.Invalid Tweets II/Solution.sql
 /solution/3100-3199/3198.Find Cities in Each State/Solution.sql
 /solution/3300-3399/3328.Find Cities in Each State II/Solution.sql
+/solution/3400-3499/3451.Find Invalid IP Addresses/Solution.sql
solution/3400-3499/3451.Find Invalid IP Addresses/README.md
@@ -0,0 +1,162 @@
1+
---
2+
comments: true
3+
difficulty: Hard
4+
edit_url: https://github.com/doocs/leetcode/edit/main/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README.md
5+
tags:
6+
- Database
7+
---
8+
9+
<!-- problem:start -->
10+
11+
# [3451. Find Invalid IP Addresses](https://leetcode.cn/problems/find-invalid-ip-addresses)
12+
13+
[English Version](/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README_EN.md)
14+
15+
## Description
16+
17+
<!-- description:start -->
18+
19+
<p>Table: <code> logs</code></p>
20+
21+
<pre>
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| log_id      | int     |
| ip          | varchar |
| status_code | int     |
+-------------+---------+
log_id is the unique key for this table.
Each row contains server access log information including IP address and HTTP status code.
</pre>
32+
33+
<p>Write a solution to find <strong>invalid IP addresses</strong>. An IPv4 address is invalid if it meets any of these conditions:</p>
34+
35+
<ul>
36+
<li>Contains numbers <strong>greater than</strong> <code>255</code> in any octet</li>
37+
<li>Has <strong>leading zeros</strong> in any octet (like <code>01.02.03.04</code>)</li>
38+
<li>Has <strong>less or more</strong> than <code>4</code> octets</li>
39+
</ul>
40+
41+
<p>Return <em>the result table </em><em>ordered by</em> <code>invalid_count</code>,&nbsp;<code>ip</code>&nbsp;<em>in <strong>descending</strong> order respectively</em>.&nbsp;</p>
42+
43+
<p>The result format is in the following example.</p>
44+
45+
<p>&nbsp;</p>
46+
<p><strong class="example">Example:</strong></p>
47+
48+
<div class="example-block">
49+
<p><strong>Input:</strong></p>
50+
51+
<p>logs table:</p>
52+
53+
<pre class="example-io">
+--------+---------------+-------------+
| log_id | ip            | status_code |
+--------+---------------+-------------+
| 1      | 192.168.1.1   | 200         |
| 2      | 256.1.2.3     | 404         |
| 3      | 192.168.001.1 | 200         |
| 4      | 192.168.1.1   | 200         |
| 5      | 192.168.1     | 500         |
| 6      | 256.1.2.3     | 404         |
| 7      | 192.168.001.1 | 200         |
+--------+---------------+-------------+
</pre>
66+
67+
<p><strong>Output:</strong></p>
68+
69+
<pre class="example-io">
+---------------+--------------+
| ip            | invalid_count|
+---------------+--------------+
| 256.1.2.3     | 2            |
| 192.168.001.1 | 2            |
| 192.168.1     | 1            |
+---------------+--------------+
</pre>
78+
79+
<p><strong>Explanation:</strong></p>
80+
81+
<ul>
82+
<li>256.1.2.3&nbsp;is invalid because 256 &gt; 255</li>
83+
<li>192.168.001.1&nbsp;is invalid because of leading zeros</li>
84+
<li>192.168.1&nbsp;is invalid because it has only 3 octets</li>
85+
</ul>
86+
87+
<p>The output table is ordered by invalid_count, ip in descending order respectively.</p>
88+
</div>
89+
90+
<!-- description:end -->
91+
92+
## Solutions
93+
94+
<!-- solution:start -->
95+
96+
### Solution 1: Simulation
97+
98+
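Following the problem statement, an IP address is invalid if any of these conditions holds:

1. The number of `.` characters in the IP address is not equal to $3$;
2. Some octet has a leading zero, that is, it starts with `0` followed by at least one more digit;
3. Some octet is greater than $255$.

We then group the invalid IP addresses, count the occurrences of each one as `invalid_count`, and finally sort by `invalid_count` and then `ip`, both in descending order.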
105+
106+
<!-- tabs:start -->
107+
108+
#### MySQL
109+
110+
```sql
SELECT
    ip,
    COUNT(*) AS invalid_count
FROM logs
WHERE
    LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3
    OR SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(ip, '.', 1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) > 255
    OR SUBSTRING_INDEX(ip, '.', -1) > 255
GROUP BY 1
ORDER BY 2 DESC, 1 DESC;
```
128+
129+
#### Pandas
130+
131+
```python
import pandas as pd


def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
    def is_valid_ip(ip: str) -> bool:
        octets = ip.split(".")
        if len(octets) != 4:
            return False
        for octet in octets:
            if not octet.isdigit():
                return False
            value = int(octet)
            if not 0 <= value <= 255 or octet != str(value):
                return False
        return True

    logs["is_valid"] = logs["ip"].apply(is_valid_ip)
    invalid_ips = logs[~logs["is_valid"]]
    invalid_count = invalid_ips["ip"].value_counts().reset_index()
    invalid_count.columns = ["ip", "invalid_count"]
    result = invalid_count.sort_values(
        by=["invalid_count", "ip"], ascending=[False, False]
    )
    return result
```
157+
158+
<!-- tabs:end -->
159+
160+
<!-- solution:end -->
161+
162+
<!-- problem:end -->
solution/3400-3499/3451.Find Invalid IP Addresses/README_EN.md
@@ -0,0 +1,162 @@
1+
---
2+
comments: true
3+
difficulty: Hard
4+
edit_url: https://github.com/doocs/leetcode/edit/main/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README_EN.md
5+
tags:
6+
- Database
7+
---
8+
9+
<!-- problem:start -->
10+
11+
# [3451. Find Invalid IP Addresses](https://leetcode.com/problems/find-invalid-ip-addresses)
12+
13+
[中文文档](/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README.md)
14+
15+
## Description
16+
17+
<!-- description:start -->
18+
19+
<p>Table: <code> logs</code></p>
20+
21+
<pre>
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| log_id      | int     |
| ip          | varchar |
| status_code | int     |
+-------------+---------+
log_id is the unique key for this table.
Each row contains server access log information including IP address and HTTP status code.
</pre>
32+
33+
<p>Write a solution to find <strong>invalid IP addresses</strong>. An IPv4 address is invalid if it meets any of these conditions:</p>
34+
35+
<ul>
36+
<li>Contains numbers <strong>greater than</strong> <code>255</code> in any octet</li>
37+
<li>Has <strong>leading zeros</strong> in any octet (like <code>01.02.03.04</code>)</li>
38+
<li>Has <strong>less or more</strong> than <code>4</code> octets</li>
39+
</ul>
40+
41+
<p>Return <em>the result table </em><em>ordered by</em> <code>invalid_count</code>,&nbsp;<code>ip</code>&nbsp;<em>in <strong>descending</strong> order respectively</em>.&nbsp;</p>
42+
43+
<p>The result format is in the following example.</p>
44+
45+
<p>&nbsp;</p>
46+
<p><strong class="example">Example:</strong></p>
47+
48+
<div class="example-block">
49+
<p><strong>Input:</strong></p>
50+
51+
<p>logs table:</p>
52+
53+
<pre class="example-io">
+--------+---------------+-------------+
| log_id | ip            | status_code |
+--------+---------------+-------------+
| 1      | 192.168.1.1   | 200         |
| 2      | 256.1.2.3     | 404         |
| 3      | 192.168.001.1 | 200         |
| 4      | 192.168.1.1   | 200         |
| 5      | 192.168.1     | 500         |
| 6      | 256.1.2.3     | 404         |
| 7      | 192.168.001.1 | 200         |
+--------+---------------+-------------+
</pre>
66+
67+
<p><strong>Output:</strong></p>
68+
69+
<pre class="example-io">
+---------------+--------------+
| ip            | invalid_count|
+---------------+--------------+
| 256.1.2.3     | 2            |
| 192.168.001.1 | 2            |
| 192.168.1     | 1            |
+---------------+--------------+
</pre>
78+
79+
<p><strong>Explanation:</strong></p>
80+
81+
<ul>
82+
<li>256.1.2.3&nbsp;is invalid because 256 &gt; 255</li>
83+
<li>192.168.001.1&nbsp;is invalid because of leading zeros</li>
84+
<li>192.168.1&nbsp;is invalid because it has only 3 octets</li>
85+
</ul>
86+
87+
<p>The output table is ordered by invalid_count, ip in descending order respectively.</p>
88+
</div>
89+
90+
<!-- description:end -->
91+
92+
## Solutions
93+
94+
<!-- solution:start -->
95+
96+
### Solution 1: Simulation
97+
98+
Following the problem statement, an IP address is invalid if any of these conditions holds:

1. The number of `.` characters in the IP address is not equal to $3$;
2. Some octet has a leading zero, that is, it starts with `0` followed by at least one more digit;
3. Some octet is greater than $255$.

We then group the invalid IP addresses, count the occurrences of each one as `invalid_count`, and finally sort by `invalid_count` and then `ip`, both in descending order.
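
As a rough illustration of the three conditions, the sketch below reports which rule an address breaks. The helper name `invalid_reason` and the reason strings are hypothetical and not part of the committed solution. Octets that are not numeric at all fall outside the three stated rules and are not flagged here.

```python
import re
from typing import Optional

# Same leading-zero pattern the MySQL query uses: "0" followed by another digit.
LEADING_ZERO = re.compile(r"^0[0-9]")


def invalid_reason(ip: str) -> Optional[str]:
    """Hypothetical helper: return the first rule an address breaks, or None."""
    octets = ip.split(".")
    if len(octets) != 4:  # condition 1: exactly three dots, hence four octets
        return "wrong octet count"
    for octet in octets:
        if LEADING_ZERO.match(octet):  # condition 2: "01" fails, a lone "0" passes
            return "leading zero"
        if octet.isdigit() and int(octet) > 255:  # condition 3
            return "octet > 255"
    return None


for ip in ["192.168.1.1", "256.1.2.3", "192.168.001.1", "192.168.1", "0.0.0.0"]:
    print(ip, "->", invalid_reason(ip))
```

Note that `0.0.0.0` passes: a single `0` is not a leading zero, which is why the pattern requires a second digit.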
105+
106+
<!-- tabs:start -->
107+
108+
#### MySQL
109+
110+
```sql
SELECT
    ip,
    COUNT(*) AS invalid_count
FROM logs
WHERE
    LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3
    OR SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(ip, '.', 1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) > 255
    OR SUBSTRING_INDEX(ip, '.', -1) > 255
GROUP BY 1
ORDER BY 2 DESC, 1 DESC;
```
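
The nested `SUBSTRING_INDEX` calls in the query pick out one octet at a time. The Python sketch below is a rough emulation of that behaviour for illustration only. The functions `substring_index` and `nth_octet` are hypothetical names, and the emulation assumes a non-empty delimiter.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough emulation of MySQL's SUBSTRING_INDEX (assumes delim is non-empty)."""
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter from the left.
        return delim.join(parts[:count])
    # Everything to the right of the |count|-th delimiter from the right.
    return delim.join(parts[count:])


def nth_octet(ip: str, n: int) -> str:
    # Mirrors SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', n), '.', -1)
    return substring_index(substring_index(ip, ".", n), ".", -1)


print([nth_octet("192.168.001.1", n) for n in (1, 2, 3, 4)])
print(nth_octet("192.168.1", 3))
```

For an address with fewer than four octets the expressions still return a value, so the octet checks alone cannot detect it. That case is covered by the separate dot-count condition.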
128+
129+
#### Pandas
130+
131+
```python
import pandas as pd


def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
    def is_valid_ip(ip: str) -> bool:
        octets = ip.split(".")
        if len(octets) != 4:
            return False
        for octet in octets:
            if not octet.isdigit():
                return False
            value = int(octet)
            if not 0 <= value <= 255 or octet != str(value):
                return False
        return True

    logs["is_valid"] = logs["ip"].apply(is_valid_ip)
    invalid_ips = logs[~logs["is_valid"]]
    invalid_count = invalid_ips["ip"].value_counts().reset_index()
    invalid_count.columns = ["ip", "invalid_count"]
    result = invalid_count.sort_values(
        by=["invalid_count", "ip"], ascending=[False, False]
    )
    return result
```
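
As a quick check, the function can be run against the example input from the description. The sketch below repeats the function so that it is self-contained; the sample `DataFrame` is built by hand from the example table and is not part of the committed files.

```python
import pandas as pd


def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
    def is_valid_ip(ip: str) -> bool:
        octets = ip.split(".")
        if len(octets) != 4:
            return False
        for octet in octets:
            if not octet.isdigit():
                return False
            value = int(octet)
            if not 0 <= value <= 255 or octet != str(value):
                return False
        return True

    logs["is_valid"] = logs["ip"].apply(is_valid_ip)
    invalid_ips = logs[~logs["is_valid"]]
    invalid_count = invalid_ips["ip"].value_counts().reset_index()
    invalid_count.columns = ["ip", "invalid_count"]
    result = invalid_count.sort_values(
        by=["invalid_count", "ip"], ascending=[False, False]
    )
    return result


# Sample rows copied from the example input in the description.
logs = pd.DataFrame(
    {
        "log_id": [1, 2, 3, 4, 5, 6, 7],
        "ip": [
            "192.168.1.1",
            "256.1.2.3",
            "192.168.001.1",
            "192.168.1.1",
            "192.168.1",
            "256.1.2.3",
            "192.168.001.1",
        ],
        "status_code": [200, 404, 200, 200, 500, 404, 200],
    }
)

result = find_invalid_ips(logs)
print(result.to_string(index=False))
```

The rows should come out in the same order as the expected output table: `256.1.2.3`, then `192.168.001.1`, then `192.168.1`. The `ip` column is compared as a string, so `256.1.2.3` sorts ahead of `192.168.001.1` among the rows with equal counts. Note that the function adds an `is_valid` column to the `DataFrame` it receives.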
157+
158+
<!-- tabs:end -->
159+
160+
<!-- solution:end -->
161+
162+
<!-- problem:end -->
@@ -0,0 +1,24 @@
import pandas as pd


def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
    def is_valid_ip(ip: str) -> bool:
        octets = ip.split(".")
        if len(octets) != 4:
            return False
        for octet in octets:
            if not octet.isdigit():
                return False
            value = int(octet)
            if not 0 <= value <= 255 or octet != str(value):
                return False
        return True

    logs["is_valid"] = logs["ip"].apply(is_valid_ip)
    invalid_ips = logs[~logs["is_valid"]]
    invalid_count = invalid_ips["ip"].value_counts().reset_index()
    invalid_count.columns = ["ip", "invalid_count"]
    result = invalid_count.sort_values(
        by=["invalid_count", "ip"], ascending=[False, False]
    )
    return result
solution/3400-3499/3451.Find Invalid IP Addresses/Solution.sql
@@ -0,0 +1,19 @@
SELECT
    ip,
    COUNT(*) AS invalid_count
FROM logs
WHERE
    LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3

    OR SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]'
    OR SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]'

    OR SUBSTRING_INDEX(ip, '.', 1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) > 255
    OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) > 255
    OR SUBSTRING_INDEX(ip, '.', -1) > 255

GROUP BY 1
ORDER BY 2 DESC, 1 DESC;

solution/DATABASE_README.md

+1
@@ -309,6 +309,7 @@
 | 3415 | [Find Products with Three Consecutive Digits](/solution/3400-3499/3415.Find%20Products%20with%20Three%20Consecutive%20Digits/README.md) | `Database` | Easy | 🔒 |
 | 3421 | [Find Students Who Improved](/solution/3400-3499/3421.Find%20Students%20Who%20Improved/README.md) | `Database` | Medium | |
 | 3436 | [Find Valid Emails](/solution/3400-3499/3436.Find%20Valid%20Emails/README.md) | `Database` | Easy | |
+| 3451 | [Find Invalid IP Addresses](/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README.md) | | Hard | |

 ## Copyright

solution/DATABASE_README_EN.md

+1
@@ -307,6 +307,7 @@ Press <kbd>Control</kbd> + <kbd>F</kbd>(or <kbd>Command</kbd> + <kbd>F</kbd> on
 | 3415 | [Find Products with Three Consecutive Digits](/solution/3400-3499/3415.Find%20Products%20with%20Three%20Consecutive%20Digits/README_EN.md) | `Database` | Easy | 🔒 |
 | 3421 | [Find Students Who Improved](/solution/3400-3499/3421.Find%20Students%20Who%20Improved/README_EN.md) | `Database` | Medium | |
 | 3436 | [Find Valid Emails](/solution/3400-3499/3436.Find%20Valid%20Emails/README_EN.md) | `Database` | Easy | |
+| 3451 | [Find Invalid IP Addresses](/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README_EN.md) | | Hard | |

 ## Copyright

solution/README.md

+1
@@ -3461,6 +3461,7 @@
 | 3448 | [Count Substrings Divisible By Last Digit](/solution/3400-3499/3448.Count%20Substrings%20Divisible%20By%20Last%20Digit/README.md) | | Hard | Weekly Contest 436 |
 | 3449 | [Maximize the Minimum Game Score](/solution/3400-3499/3449.Maximize%20the%20Minimum%20Game%20Score/README.md) | | Hard | Weekly Contest 436 |
 | 3450 | [Maximum Students on a Single Bench](/solution/3400-3499/3450.Maximum%20Students%20on%20a%20Single%20Bench/README.md) | | Easy | 🔒 |
+| 3451 | [Find Invalid IP Addresses](/solution/3400-3499/3451.Find%20Invalid%20IP%20Addresses/README.md) | | Hard | |

 ## Copyright
