feat: add solutions to lc problem: No.3087 #2456

Merged
merged 1 commit into from
Mar 18, 2024
126 changes: 126 additions & 0 deletions solution/3000-3099/3087.Find Trending Hashtags/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
# [3087. Find Trending Hashtags](https://leetcode.cn/problems/find-trending-hashtags)

[English Version](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README_EN.md)

<!-- tags: -->

## Description

<!-- Problem description goes here -->

<p>Table: <code>Tweets</code></p>

<pre>
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| tweet_id    | int     |
| tweet_date  | date    |
| tweet       | varchar |
+-------------+---------+
tweet_id is the primary key (column with unique values) for this table.
Each row of this table contains user_id, tweet_id, tweet_date and tweet.
</pre>

<p>Write a solution to find the <strong>top</strong> <code>3</code> trending <strong>hashtags</strong>&nbsp;in&nbsp;<strong>February</strong> <code>2024</code>.</p>

<p>Return <em>the result table ordered by count of hashtag, hashtag in </em><strong>descending</strong><em> order.</em></p>

<p>The result format is in the following example.</p>

<p>&nbsp;</p>
<p><strong class="example">Example 1:</strong></p>

<div class="example-block">
<p><strong>Input:</strong></p>

<p>Tweets table:</p>

<pre class="example-io">
+---------+----------+----------------------------------------------+------------+
| user_id | tweet_id | tweet | tweet_date |
+---------+----------+----------------------------------------------+------------+
| 135 | 13 | Enjoying a great start to the day! #HappyDay | 2024-02-01 |
| 136 | 14 | Another #HappyDay with good vibes! | 2024-02-03 |
| 137 | 15 | Productivity peaks! #WorkLife | 2024-02-04 |
| 138 | 16 | Exploring new tech frontiers. #TechLife | 2024-02-04 |
| 139 | 17 | Gratitude for today&#39;s moments. #HappyDay | 2024-02-05 |
| 140 | 18 | Innovation drives us. #TechLife | 2024-02-07 |
| 141 | 19 | Connecting with nature&#39;s serenity. #Nature | 2024-02-09 |
+---------+----------+----------------------------------------------+------------+
</pre>

<p><strong>Output:</strong></p>

<pre class="example-io">
+-----------+---------------+
| hashtag   | hashtag_count |
+-----------+---------------+
| #HappyDay | 3             |
| #TechLife | 2             |
| #WorkLife | 1             |
+-----------+---------------+
</pre>

<p><strong>Explanation:</strong></p>

<ul>
<li><strong>#HappyDay:</strong> Appeared in tweet IDs 13, 14, and 17, with a total count of 3 mentions.</li>
<li><strong>#TechLife:</strong> Appeared in tweet IDs 16 and 18, with a total count of 2 mentions.</li>
<li><strong>#WorkLife:</strong> Appeared in tweet ID 15, with a total count of 1 mention.</li>
</ul>

<p><b>Note:</b> Output table is sorted in descending order by hashtag_count and hashtag respectively.</p>
</div>

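## Solutions

### Solution 1: Extract Substring + Grouping

We first select the tweets posted in February 2024 and use the `SUBSTRING_INDEX` function to extract the hashtag from each one. We then use `GROUP BY` and `COUNT` to count how often each hashtag appears, sort by count in descending order and then by hashtag in descending order, and keep the top three hashtags.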
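
To make the nested `SUBSTRING_INDEX` calls easier to follow, here is a minimal Python sketch of what they compute. The helper `substring_index` is hypothetical and only approximates the MySQL function for illustration; `extract_hashtag` is likewise an invented name. The sketch assumes each tweet contains a single hashtag, as in the example data.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Hypothetical helper that approximates MySQL's SUBSTRING_INDEX."""
    if count == 0 or delim == "":
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything before the count-th delimiter, counting from the left.
        return delim.join(parts[:count])
    # Everything after the |count|-th delimiter, counting from the right.
    return delim.join(parts[count:])


def extract_hashtag(tweet: str) -> str:
    # Inner call: the text after the last '#'.
    tail = substring_index(tweet, "#", -1)
    # Outer call: the text before the first space, with '#' added back.
    return "#" + substring_index(tail, " ", 1)


print(extract_hashtag("Another #HappyDay with good vibes!"))  # #HappyDay
```

Because the outer call splits on a space, punctuation directly attached to a hashtag would stay part of the extracted value.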

<!-- tabs:start -->

```sql
# Write your MySQL query statement below
SELECT
    CONCAT('#', SUBSTRING_INDEX(SUBSTRING_INDEX(tweet, '#', -1), ' ', 1)) AS hashtag,
    COUNT(1) AS hashtag_count
FROM Tweets
WHERE DATE_FORMAT(tweet_date, '%Y%m') = '202402'
GROUP BY 1
ORDER BY 2 DESC, 1 DESC
LIMIT 3;
```

```python
import pandas as pd


def find_trending_hashtags(tweets: pd.DataFrame) -> pd.DataFrame:
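    # Keep only the tweets posted in February 2024; copy so that adding a
    # column below does not assign into a slice of the caller's DataFrame
    tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"].copy()

    # Extract the first hashtag of each tweet (expand=False returns a Series)
    tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)", expand=False)

    # Count how many times each hashtag appears
    hashtag_counts = tweets["hashtag"].value_counts().reset_index()
    hashtag_counts.columns = ["hashtag", "hashtag_count"]

    # Sort by count, then by hashtag, both in descending order
    hashtag_counts = hashtag_counts.sort_values(
        by=["hashtag_count", "hashtag"], ascending=[False, False]
    )

    # Return the three most frequent hashtags
    top_3_hashtags = hashtag_counts.head(3)

    return top_3_hashtags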
```

<!-- tabs:end -->

<!-- end -->
114 changes: 114 additions & 0 deletions solution/3000-3099/3087.Find Trending Hashtags/README_EN.md
@@ -0,0 +1,114 @@
# [3087. Find Trending Hashtags](https://leetcode.com/problems/find-trending-hashtags)

[Chinese Version](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README.md)

<!-- tags: -->

## Description

<p>Table: <code>Tweets</code></p>

<pre>
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| tweet_id    | int     |
| tweet_date  | date    |
| tweet       | varchar |
+-------------+---------+
tweet_id is the primary key (column with unique values) for this table.
Each row of this table contains user_id, tweet_id, tweet_date and tweet.
</pre>

<p>Write a solution to find the <strong>top</strong> <code>3</code> trending <strong>hashtags</strong>&nbsp;in&nbsp;<strong>February</strong> <code>2024</code>.</p>

<p>Return <em>the result table ordered by count of hashtag, hashtag in </em><strong>descending</strong><em> order.</em></p>

<p>The result format is in the following example.</p>

<p>&nbsp;</p>
<p><strong class="example">Example 1:</strong></p>

<div class="example-block">
<p><strong>Input:</strong></p>

<p>Tweets table:</p>

<pre class="example-io">
+---------+----------+----------------------------------------------+------------+
| user_id | tweet_id | tweet | tweet_date |
+---------+----------+----------------------------------------------+------------+
| 135 | 13 | Enjoying a great start to the day! #HappyDay | 2024-02-01 |
| 136 | 14 | Another #HappyDay with good vibes! | 2024-02-03 |
| 137 | 15 | Productivity peaks! #WorkLife | 2024-02-04 |
| 138 | 16 | Exploring new tech frontiers. #TechLife | 2024-02-04 |
| 139 | 17 | Gratitude for today&#39;s moments. #HappyDay | 2024-02-05 |
| 140 | 18 | Innovation drives us. #TechLife | 2024-02-07 |
| 141 | 19 | Connecting with nature&#39;s serenity. #Nature | 2024-02-09 |
+---------+----------+----------------------------------------------+------------+
</pre>

<p><strong>Output:</strong></p>

<pre class="example-io">
+-----------+---------------+
| hashtag   | hashtag_count |
+-----------+---------------+
| #HappyDay | 3             |
| #TechLife | 2             |
| #WorkLife | 1             |
+-----------+---------------+
</pre>

<p><strong>Explanation:</strong></p>

<ul>
<li><strong>#HappyDay:</strong> Appeared in tweet IDs 13, 14, and 17, with a total count of 3 mentions.</li>
<li><strong>#TechLife:</strong> Appeared in tweet IDs 16 and 18, with a total count of 2 mentions.</li>
<li><strong>#WorkLife:</strong> Appeared in tweet ID 15, with a total count of 1 mention.</li>
</ul>

<p><b>Note:</b> Output table is sorted in descending order by hashtag_count and hashtag respectively.</p>
</div>

## Solutions

### Solution 1: Extract Substring + Grouping

We can query all tweets from February 2024, use the `SUBSTRING_INDEX` function to extract Hashtags, then use the `GROUP BY` and `COUNT` functions to count the occurrences of each Hashtag. Finally, we sort by the number of occurrences in descending order and by Hashtag in descending order, and take the top three popular Hashtags.
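
The steps described above (filter by month, extract, count, sort, take three) can be sketched in plain Python without a database. This is only an illustration under stated assumptions: `rows` and `top_hashtags` are invented names, the February rows are copied from the example, and the March row is an invented extra that shows the date filter at work. String comparison here is by code point, which may differ from a MySQL collation.

```python
import re
from collections import Counter
from datetime import date

# (tweet_date, tweet) pairs; the last row is an invented out-of-range tweet.
rows = [
    (date(2024, 2, 1), "Enjoying a great start to the day! #HappyDay"),
    (date(2024, 2, 3), "Another #HappyDay with good vibes!"),
    (date(2024, 2, 4), "Productivity peaks! #WorkLife"),
    (date(2024, 2, 4), "Exploring new tech frontiers. #TechLife"),
    (date(2024, 2, 5), "Gratitude for today's moments. #HappyDay"),
    (date(2024, 2, 7), "Innovation drives us. #TechLife"),
    (date(2024, 2, 9), "Connecting with nature's serenity. #Nature"),
    (date(2024, 3, 1), "Out of range. #HappyDay"),
]


def top_hashtags(rows, year=2024, month=2, k=3):
    counter = Counter()
    for tweet_date, tweet in rows:
        # Step 1: keep only tweets from the requested month.
        if (tweet_date.year, tweet_date.month) != (year, month):
            continue
        # Step 2: extract the hashtag (assumes at most one per tweet).
        match = re.search(r"#\w+", tweet)
        if match:
            counter[match.group()] += 1
    # Step 3: sort by count, then hashtag, both descending, and keep k rows.
    ranked = sorted(counter.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ranked[:k]


print(top_hashtags(rows))
```

`#WorkLife` and `#Nature` both appear once; the descending tie-break on the hashtag itself places `#WorkLife` third, which matches the expected output.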

<!-- tabs:start -->

```sql
# Write your MySQL query statement below
SELECT
    CONCAT('#', SUBSTRING_INDEX(SUBSTRING_INDEX(tweet, '#', -1), ' ', 1)) AS hashtag,
    COUNT(1) AS hashtag_count
FROM Tweets
WHERE DATE_FORMAT(tweet_date, '%Y%m') = '202402'
GROUP BY 1
ORDER BY 2 DESC, 1 DESC
LIMIT 3;
```

```python
import pandas as pd


def find_trending_hashtags(tweets: pd.DataFrame) -> pd.DataFrame:
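    # Keep only February 2024 tweets; copy to avoid assigning into a slice
    tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"].copy()
    # Extract the first hashtag of each tweet (expand=False returns a Series)
    tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)", expand=False)
    # Count occurrences of each hashtag
    hashtag_counts = tweets["hashtag"].value_counts().reset_index()
    hashtag_counts.columns = ["hashtag", "hashtag_count"]
    # Sort by count, then by hashtag, both in descending order
    hashtag_counts = hashtag_counts.sort_values(
        by=["hashtag_count", "hashtag"], ascending=[False, False]
    )
    # Return the top three hashtags
    top_3_hashtags = hashtag_counts.head(3)
    return top_3_hashtags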
```

<!-- tabs:end -->
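
Both solutions appear to rely on each tweet carrying a single hashtag, as in the example. The short sketch below, using an invented tweet with two hashtags, shows where they would differ: the nested `SUBSTRING_INDEX` expression keeps the last hashtag, while `str.extract` keeps the first match. It uses only the standard library to mirror each behavior.

```python
import re

# Invented tweet with two hashtags; not part of the problem's data.
tweet = "Morning run done #Fitness and now coffee #HappyDay"

# Mirrors SUBSTRING_INDEX(SUBSTRING_INDEX(tweet, '#', -1), ' ', 1): last hashtag.
last_tag = "#" + tweet.rsplit("#", 1)[-1].split(" ", 1)[0]

# Mirrors str.extract(r"#(\w+)"): first match only.
first_tag = "#" + re.search(r"#(\w+)", tweet).group(1)

# Counting every hashtag would need all matches instead.
all_tags = re.findall(r"#\w+", tweet)

print(last_tag, first_tag, all_tags)
```

If the input could contain several hashtags per tweet, collecting all matches and counting each one would be the safer design; that is an extension beyond what the problem statement shows.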

<!-- end -->
13 changes: 13 additions & 0 deletions solution/3000-3099/3087.Find Trending Hashtags/Solution.py
@@ -0,0 +1,13 @@
import pandas as pd


def find_trending_hashtags(tweets: pd.DataFrame) -> pd.DataFrame:
    tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"].copy()
    tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)", expand=False)
    hashtag_counts = tweets["hashtag"].value_counts().reset_index()
    hashtag_counts.columns = ["hashtag", "hashtag_count"]
    hashtag_counts = hashtag_counts.sort_values(
        by=["hashtag_count", "hashtag"], ascending=[False, False]
    )
    top_3_hashtags = hashtag_counts.head(3)
    return top_3_hashtags
9 changes: 9 additions & 0 deletions solution/3000-3099/3087.Find Trending Hashtags/Solution.sql
@@ -0,0 +1,9 @@
# Write your MySQL query statement below
SELECT
    CONCAT('#', SUBSTRING_INDEX(SUBSTRING_INDEX(tweet, '#', -1), ' ', 1)) AS hashtag,
    COUNT(1) AS hashtag_count
FROM Tweets
WHERE DATE_FORMAT(tweet_date, '%Y%m') = '202402'
GROUP BY 1
ORDER BY 2 DESC, 1 DESC
LIMIT 3;
1 change: 1 addition & 0 deletions solution/DATABASE_README.md
@@ -271,6 +271,7 @@
| 3059 | [Find All Unique Email Domains](/solution/3000-3099/3059.Find%20All%20Unique%20Email%20Domains/README.md) | `Database` | Easy | 🔒 |
| 3060 | [User Activities within Time Bounds](/solution/3000-3099/3060.User%20Activities%20within%20Time%20Bounds/README.md) | `Database` | Hard | 🔒 |
| 3061 | [Calculate Trapping Rain Water](/solution/3000-3099/3061.Calculate%20Trapping%20Rain%20Water/README.md) | `Database` | Hard | 🔒 |
| 3087 | [Find Trending Hashtags](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README.md) | | Medium | 🔒 |

## Copyright

1 change: 1 addition & 0 deletions solution/DATABASE_README_EN.md
@@ -269,6 +269,7 @@ Press <kbd>Control</kbd> + <kbd>F</kbd>(or <kbd>Command</kbd> + <kbd>F</kbd> on
| 3059 | [Find All Unique Email Domains](/solution/3000-3099/3059.Find%20All%20Unique%20Email%20Domains/README_EN.md) | `Database` | Easy | 🔒 |
| 3060 | [User Activities within Time Bounds](/solution/3000-3099/3060.User%20Activities%20within%20Time%20Bounds/README_EN.md) | `Database` | Hard | 🔒 |
| 3061 | [Calculate Trapping Rain Water](/solution/3000-3099/3061.Calculate%20Trapping%20Rain%20Water/README_EN.md) | `Database` | Hard | 🔒 |
| 3087 | [Find Trending Hashtags](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README_EN.md) | | Medium | 🔒 |

## Copyright

11 changes: 6 additions & 5 deletions solution/README.md
@@ -3083,11 +3083,11 @@
| 3070 | [Count Submatrices with Top-Left Element and Sum Less Than k](/solution/3000-3099/3070.Count%20Submatrices%20with%20Top-Left%20Element%20and%20Sum%20Less%20Than%20k/README.md) | `Array`,`Matrix`,`Prefix Sum` | Medium | Weekly Contest 387 |
| 3071 | [Minimum Operations to Write the Letter Y on a Grid](/solution/3000-3099/3071.Minimum%20Operations%20to%20Write%20the%20Letter%20Y%20on%20a%20Grid/README.md) | `Array`,`Hash Table`,`Counting`,`Matrix` | Medium | Weekly Contest 387 |
| 3072 | [Distribute Elements Into Two Arrays II](/solution/3000-3099/3072.Distribute%20Elements%20Into%20Two%20Arrays%20II/README.md) | `Binary Indexed Tree`,`Segment Tree`,`Array`,`Simulation` | Hard | Weekly Contest 387 |
| 3073 | [Maximum Increasing Triplet Value](/solution/3000-3099/3073.Maximum%20Increasing%20Triplet%20Value/README.md) | | Medium | 🔒 |
| 3074 | [Apple Redistribution into Boxes](/solution/3000-3099/3074.Apple%20Redistribution%20into%20Boxes/README.md) | | Easy | Weekly Contest 388 |
| 3075 | [Maximize Happiness of Selected Children](/solution/3000-3099/3075.Maximize%20Happiness%20of%20Selected%20Children/README.md) | | Medium | Weekly Contest 388 |
| 3076 | [Shortest Uncommon Substring in an Array](/solution/3000-3099/3076.Shortest%20Uncommon%20Substring%20in%20an%20Array/README.md) | | Medium | Weekly Contest 388 |
| 3077 | [Maximum Strength of K Disjoint Subarrays](/solution/3000-3099/3077.Maximum%20Strength%20of%20K%20Disjoint%20Subarrays/README.md) | | Hard | Weekly Contest 388 |
| 3073 | [Maximum Increasing Triplet Value](/solution/3000-3099/3073.Maximum%20Increasing%20Triplet%20Value/README.md) | `Array`,`Ordered Set` | Medium | 🔒 |
| 3074 | [Apple Redistribution into Boxes](/solution/3000-3099/3074.Apple%20Redistribution%20into%20Boxes/README.md) | `Greedy`,`Array`,`Sorting` | Easy | Weekly Contest 388 |
| 3075 | [Maximize Happiness of Selected Children](/solution/3000-3099/3075.Maximize%20Happiness%20of%20Selected%20Children/README.md) | `Greedy`,`Array`,`Sorting` | Medium | Weekly Contest 388 |
| 3076 | [Shortest Uncommon Substring in an Array](/solution/3000-3099/3076.Shortest%20Uncommon%20Substring%20in%20an%20Array/README.md) | `Trie`,`Array`,`Hash Table`,`String` | Medium | Weekly Contest 388 |
| 3077 | [Maximum Strength of K Disjoint Subarrays](/solution/3000-3099/3077.Maximum%20Strength%20of%20K%20Disjoint%20Subarrays/README.md) | `Array`,`Dynamic Programming`,`Prefix Sum` | Hard | Weekly Contest 388 |
| 3078 | [Match Alphanumerical Pattern in Matrix I](/solution/3000-3099/3078.Match%20Alphanumerical%20Pattern%20in%20Matrix%20I/README.md) | | Medium | 🔒 |
| 3079 | [Find the Sum of Encrypted Integers](/solution/3000-3099/3079.Find%20the%20Sum%20of%20Encrypted%20Integers/README.md) | | Easy | Biweekly Contest 126 |
| 3080 | [Mark Elements on Array by Performing Queries](/solution/3000-3099/3080.Mark%20Elements%20on%20Array%20by%20Performing%20Queries/README.md) | | Medium | Biweekly Contest 126 |
@@ -3097,6 +3097,7 @@
| 3084 | [Count Substrings Starting and Ending with Given Character](/solution/3000-3099/3084.Count%20Substrings%20Starting%20and%20Ending%20with%20Given%20Character/README.md) | | Medium | Weekly Contest 389 |
| 3085 | [Minimum Deletions to Make String K-Special](/solution/3000-3099/3085.Minimum%20Deletions%20to%20Make%20String%20K-Special/README.md) | | Medium | Weekly Contest 389 |
| 3086 | [Minimum Moves to Pick K Ones](/solution/3000-3099/3086.Minimum%20Moves%20to%20Pick%20K%20Ones/README.md) | | Hard | Weekly Contest 389 |
| 3087 | [Find Trending Hashtags](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README.md) | | Medium | 🔒 |

## Copyright

11 changes: 6 additions & 5 deletions solution/README_EN.md
@@ -3081,11 +3081,11 @@ Press <kbd>Control</kbd> + <kbd>F</kbd>(or <kbd>Command</kbd> + <kbd>F</kbd> on
| 3070 | [Count Submatrices with Top-Left Element and Sum Less Than k](/solution/3000-3099/3070.Count%20Submatrices%20with%20Top-Left%20Element%20and%20Sum%20Less%20Than%20k/README_EN.md) | `Array`,`Matrix`,`Prefix Sum` | Medium | Weekly Contest 387 |
| 3071 | [Minimum Operations to Write the Letter Y on a Grid](/solution/3000-3099/3071.Minimum%20Operations%20to%20Write%20the%20Letter%20Y%20on%20a%20Grid/README_EN.md) | `Array`,`Hash Table`,`Counting`,`Matrix` | Medium | Weekly Contest 387 |
| 3072 | [Distribute Elements Into Two Arrays II](/solution/3000-3099/3072.Distribute%20Elements%20Into%20Two%20Arrays%20II/README_EN.md) | `Binary Indexed Tree`,`Segment Tree`,`Array`,`Simulation` | Hard | Weekly Contest 387 |
| 3073 | [Maximum Increasing Triplet Value](/solution/3000-3099/3073.Maximum%20Increasing%20Triplet%20Value/README_EN.md) | | Medium | 🔒 |
| 3074 | [Apple Redistribution into Boxes](/solution/3000-3099/3074.Apple%20Redistribution%20into%20Boxes/README_EN.md) | | Easy | Weekly Contest 388 |
| 3075 | [Maximize Happiness of Selected Children](/solution/3000-3099/3075.Maximize%20Happiness%20of%20Selected%20Children/README_EN.md) | | Medium | Weekly Contest 388 |
| 3076 | [Shortest Uncommon Substring in an Array](/solution/3000-3099/3076.Shortest%20Uncommon%20Substring%20in%20an%20Array/README_EN.md) | | Medium | Weekly Contest 388 |
| 3077 | [Maximum Strength of K Disjoint Subarrays](/solution/3000-3099/3077.Maximum%20Strength%20of%20K%20Disjoint%20Subarrays/README_EN.md) | | Hard | Weekly Contest 388 |
| 3073 | [Maximum Increasing Triplet Value](/solution/3000-3099/3073.Maximum%20Increasing%20Triplet%20Value/README_EN.md) | `Array`,`Ordered Set` | Medium | 🔒 |
| 3074 | [Apple Redistribution into Boxes](/solution/3000-3099/3074.Apple%20Redistribution%20into%20Boxes/README_EN.md) | `Greedy`,`Array`,`Sorting` | Easy | Weekly Contest 388 |
| 3075 | [Maximize Happiness of Selected Children](/solution/3000-3099/3075.Maximize%20Happiness%20of%20Selected%20Children/README_EN.md) | `Greedy`,`Array`,`Sorting` | Medium | Weekly Contest 388 |
| 3076 | [Shortest Uncommon Substring in an Array](/solution/3000-3099/3076.Shortest%20Uncommon%20Substring%20in%20an%20Array/README_EN.md) | `Trie`,`Array`,`Hash Table`,`String` | Medium | Weekly Contest 388 |
| 3077 | [Maximum Strength of K Disjoint Subarrays](/solution/3000-3099/3077.Maximum%20Strength%20of%20K%20Disjoint%20Subarrays/README_EN.md) | `Array`,`Dynamic Programming`,`Prefix Sum` | Hard | Weekly Contest 388 |
| 3078 | [Match Alphanumerical Pattern in Matrix I](/solution/3000-3099/3078.Match%20Alphanumerical%20Pattern%20in%20Matrix%20I/README_EN.md) | | Medium | 🔒 |
| 3079 | [Find the Sum of Encrypted Integers](/solution/3000-3099/3079.Find%20the%20Sum%20of%20Encrypted%20Integers/README_EN.md) | | Easy | Biweekly Contest 126 |
| 3080 | [Mark Elements on Array by Performing Queries](/solution/3000-3099/3080.Mark%20Elements%20on%20Array%20by%20Performing%20Queries/README_EN.md) | | Medium | Biweekly Contest 126 |
@@ -3095,6 +3095,7 @@ Press <kbd>Control</kbd> + <kbd>F</kbd>(or <kbd>Command</kbd> + <kbd>F</kbd> on
| 3084 | [Count Substrings Starting and Ending with Given Character](/solution/3000-3099/3084.Count%20Substrings%20Starting%20and%20Ending%20with%20Given%20Character/README_EN.md) | | Medium | Weekly Contest 389 |
| 3085 | [Minimum Deletions to Make String K-Special](/solution/3000-3099/3085.Minimum%20Deletions%20to%20Make%20String%20K-Special/README_EN.md) | | Medium | Weekly Contest 389 |
| 3086 | [Minimum Moves to Pick K Ones](/solution/3000-3099/3086.Minimum%20Moves%20to%20Pick%20K%20Ones/README_EN.md) | | Hard | Weekly Contest 389 |
| 3087 | [Find Trending Hashtags](/solution/3000-3099/3087.Find%20Trending%20Hashtags/README_EN.md) | | Medium | 🔒 |

## Copyright

1 change: 1 addition & 0 deletions solution/database-summary.md
@@ -261,3 +261,4 @@
- [3059.Find All Unique Email Domains](/database-solution/3000-3099/3059.Find%20All%20Unique%20Email%20Domains/README.md)
- [3060.User Activities within Time Bounds](/database-solution/3000-3099/3060.User%20Activities%20within%20Time%20Bounds/README.md)
- [3061.Calculate Trapping Rain Water](/database-solution/3000-3099/3061.Calculate%20Trapping%20Rain%20Water/README.md)
- [3087.Find Trending Hashtags](/database-solution/3000-3099/3087.Find%20Trending%20Hashtags/README.md)