Commit c68626a

Merge pull request larymak#62 from KanakamSasikalyan/main
News Article Scraping
2 parents d8b7042 + 391af6a commit c68626a

3 files changed

+119
-0
lines changed

News_Article_Scraping/Article.py

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
###############################################################################
"""
@Newspaper3k usage
@Usage of functions in newspaper:
@Some useful functions
================================
To create an instance of an article

article_name = Article(url, language="language code according to newspaper")

To download an article

article_name.download()

To parse an article

article_name.parse()

To apply NLP (natural language processing) to an article

article_name.nlp()

To extract the article’s text

article_name.text

To extract the article’s title

article_name.title

To extract the article’s summary

article_name.summary

To extract the article’s keywords

article_name.keywords
"""
####################################################################################

#Sample Usage Program
from newspaper import Article

#A new article from TOI
url = "http://timesofindia.indiatimes.com/world/china/chinese-expert-warns-of-troops-entering-kashmir/articleshow/59516912.cms"

#For other languages, pass the matching language code (see the table in README.md)
toi_article = Article(url, language="en") # en for English

#To download the article
toi_article.download()

#To parse the article
toi_article.parse()

#To perform natural language processing (NLP); call this only after download() and parse()
toi_article.nlp()

#To extract title
print("Article's Title:")
print(toi_article.title)
print("\n")

#To extract text
print("Article's Text:")
print(toi_article.text)
print("\n")

#To extract summary
print("Article's Summary:")
print(toi_article.summary)
print("\n")

#To extract keywords
print("Article's Keywords:")
print(toi_article.keywords)


#####################################################################################################################################################################################
"""
Output:
=======
Article's Title:
India China News: Chinese expert warns of troops entering Kashmir


Article's Text:
BEIJING: A Chinese expert has argued that his country's troops would be entitled to enter the Indian side of Kashmir by extending the logic that has permitted Indian troops to enter an area which is disputed by China and Bhutan This is one of the several arguments made by the scholar in an attempt to blame India for. India has responded to efforts by China to build a road in the Doklam area, which falls next to the trijunction connecting Sikkim with Tibet and Bhutan and"Even if India were requested to defend Bhutan's territory, this could only be limited to its established territory, not the disputed area, " Long Xingchun, director of the Center for Indian Studies at China West Normal University said in an article. "Otherwise, under India's logic, if the Pakistani government requests, a third country's army can enter the area disputed by India and Pakistan, including India-controlled Kashmir".China is not just interfering, it is building roads and other infrastructure projects right inside Pakistan-Occupied Kashmir (PoK), which is claimed by both India and Pakistan. This is one of the facts that the article did not mention.The scholar, through his article in the Beijing-based Global Times, suggested that Beijing can internationalize the Doklam controversy without worrying about western countries supporting India because the West has a lot of business to do with China."China can show the region and the international community or even the UN Security Council its evidence to illustrate China's position, " Long said. 
At the same time, he complained that "Western governments and media kept silent, ignoring India's hegemony over the small countries of South Asia" when India imposed a blockade on the flow of goods to Nepal in 2015.Recent actions by US president Donald Trump, which include selling arms to Taiwan and pressuring China on the North Korean issue, shows that the West is not necessarily cowered down by China's business capabilities.He reiterated the government's stated line that Doklam belongs to China, and that Indian troops had entered the area under the guise of helping Bhutan protect its territory."For a long time, India has been talking about international equality and non-interference in the internal affairs of others, but it has pursued hegemonic diplomacy in South Asia, seriously violating the UN Charter and undermining the basic norms of international relations, " he said.Interestingly, Chinese scholars are worrying about India interfering in Bhutan's "sovereignty and national interests" even though it is Chinese troops who have entered the Doklam area claimed by it."Indians have migrated in large numbers to Nepal and Bhutan, interfering with Nepal's internal affairs. The first challenge for Nepal and Bhutan is to avoid becoming a state of India, like Sikkim, " he said.


Article's Summary:
sending its troops to the disputed Doklam area +puts Indian territory at risk +BEIJING: A Chinese expert has argued that his country's troops would be entitled to enter the Indian side of Kashmir by extending the logic that has permitted Indian troops to enter an area which is disputed by China and Bhutan This is one of the several arguments made by the scholar in an attempt to blame India for.
"Otherwise, under India's logic, if the Pakistani government requests, a third country's army can enter the area disputed by India and Pakistan, including India-controlled Kashmir".China is not just interfering, it is building roads and other infrastructure projects right inside Pakistan-Occupied Kashmir (PoK), which is claimed by both India and Pakistan.
"China can show the region and the international community or even the UN Security Council its evidence to illustrate China's position, " Long said.
"Indians have migrated in large numbers to Nepal and Bhutan, interfering with Nepal's internal affairs.
The first challenge for Nepal and Bhutan is to avoid becoming a state of India, like Sikkim, " he said.
"""
####################################################################################################################################################################################
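
The download, parse, and nlp steps in Article.py could also be wrapped in a single helper that returns the extracted fields. The following is a minimal sketch, not part of this commit: `extract_article` and its `article_factory` parameter are hypothetical names introduced here so the helper can be exercised without network access. When no factory is given it falls back to `newspaper.Article`, as used in the sample program.

```python
def extract_article(url, language="en", article_factory=None):
    """Run download -> parse -> nlp and return the extracted fields as a dict.

    `article_factory` is a hypothetical hook (not part of newspaper3k): any
    callable that accepts (url, language=...) and returns an object with the
    same methods and attributes as newspaper.Article.
    """
    if article_factory is None:
        # Imported lazily so this module loads even without newspaper3k.
        from newspaper import Article
        article_factory = Article

    article = article_factory(url, language=language)
    article.download()  # fetch the raw HTML
    article.parse()     # extract the title and body text
    article.nlp()       # compute summary and keywords; needs parse() first
    return {
        "title": article.title,
        "text": article.text,
        "summary": article.summary,
        "keywords": list(article.keywords),
    }
```

With newspaper3k installed, `extract_article(url)["title"]` would give the same title the sample program prints.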
23.4 KB
Binary file not shown.

News_Article_Scraping/README.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
# Article Scraping (Python)
![image](https://user-images.githubusercontent.com/67740644/129327650-24a85343-0371-4e8e-aa42-f290c4a6eb9c.png)
## Description :
Newspaper is a Python module used for extracting and parsing newspaper articles.<br>
Newspaper uses advanced algorithms together with web scraping to extract all the useful text from a website.<br>
It works very well on online newspaper websites. Because it relies on web scraping, sending too many requests
to a newspaper website may lead to blocking, so use it accordingly.
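
One way to reduce the risk of being blocked is to pause between requests. The sketch below is only an illustration under stated assumptions: `fetch_politely`, the `article_factory` and `sleep` parameters, and the 2-second default delay are all hypothetical choices made here, not values recommended by the library.

```python
import time


def fetch_politely(urls, delay_seconds=2.0, article_factory=None, sleep=time.sleep):
    """Download and parse each URL in turn, pausing between requests.

    `article_factory` and `sleep` are hypothetical hooks added for this
    sketch; by default they are newspaper.Article and time.sleep.
    """
    if article_factory is None:
        # Imported lazily so this module loads even without newspaper3k.
        from newspaper import Article
        article_factory = Article

    articles = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        article = article_factory(url)
        article.download()
        article.parse()
        articles.append(article)
    return articles
```

A suitable delay depends on the site; checking its terms of use and robots.txt is a reasonable first step.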

## Installation :
$ pip install newspaper3k (Right Command)
## Note :
The PyPI package is named `newspaper3k`, but it is imported as `newspaper`. Do not install it with:
$ pip install newspaper (Wrong Command)
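
Because the two package names are easy to confuse, it can help to check which distribution is actually installed. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("newspaper3k")
    if version is None:
        print("newspaper3k is not installed; run: pip install newspaper3k")
    else:
        print("newspaper3k version:", version)
```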

## Languages Supported :
Newspaper supports the following languages:

| input | full name |
|-------|-----------|
| ar | Arabic |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| it | Italian |
| zh | Chinese |

....etc.
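
The `input` code is the value passed as `language=` when creating an `Article`. As a small illustration, the codes listed above can be kept in a lookup table; `LANGUAGE_NAMES` and `language_name` are hypothetical names for this sketch, and the table is deliberately partial, so a missing code does not mean the library lacks support for it.

```python
# Codes copied from the README list above; this is not the full set.
LANGUAGE_NAMES = {
    "ar": "Arabic",
    "da": "Danish",
    "de": "German",
    "el": "Greek",
    "en": "English",
    "it": "Italian",
    "zh": "Chinese",
}


def language_name(code):
    """Return the full name for a listed code, or None if it is not listed here."""
    return LANGUAGE_NAMES.get(code.strip().lower())
```

For example, a German article would be created with `Article(url, language="de")`. The library's documentation also describes a `newspaper.languages()` call for printing the complete list.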
## Link :
You can read the original documentation here: [Newspaper3k Documentation](https://newspaper.readthedocs.io/en/latest/)