Python Learning Path (14): Working with APIs [Project 2: Data Visualization]

茴香豆

In this chapter, you will learn how to write a self-contained program that retrieves data and generates a visualization from it.

1. Using a Web API

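The visualization in this chapter is based on information from GitHub, a site that lets programmers collaborate on projects. We will use GitHub's API to request information about Python projects on the site, then use Pygal to generate an interactive visualization showing how popular those projects are.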

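https://api.github.com/search/repositories?q=language:python&sort=stars
Purpose: returns the number of Python projects currently hosted on GitHub, along with information about the most popular Python repositories.
Part 1 (https://api.github.com/) directs the request to the part of GitHub that responds to API calls.
Part 2 (search/repositories) tells the API to search through all repositories on GitHub.
The question mark after repositories signals that we are about to pass an argument.
q stands for query, and the equals sign lets us begin specifying the query (q=).
language:python indicates that we want information only on repositories whose primary language is Python.
The final part (&sort=stars) sorts the projects by the number of stars they have received.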
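
The breakdown above can be checked with Python's standard library. The sketch below is only an illustration and not part of the original program: it splits the URL into the pieces just described using urllib.parse, then rebuilds the query string from a dictionary. The variable names (base_url, endpoint, params, rebuilt_url) are assumptions chosen for this example.

```python
from urllib.parse import urlsplit, parse_qs, urlencode

url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'

# Split the URL into the parts discussed above
parts = urlsplit(url)
base_url = f"{parts.scheme}://{parts.netloc}/"   # the part that handles API calls
endpoint = parts.path.lstrip('/')                # search/repositories
params = parse_qs(parts.query)                   # everything after the question mark

print(base_url)
print(endpoint)
print(params)

# Rebuild the same query string from a dictionary.
# safe=':' keeps the colon in language:python from being percent-encoded.
query = urlencode({'q': 'language:python', 'sort': 'stars'}, safe=':')
rebuilt_url = f"{base_url}{endpoint}?{query}"
print(rebuilt_url == url)
```

Keeping the parameters in a dictionary makes it easier to change the language or the sort key later without editing the URL string by hand.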
import requests

# Make the API call and store the response
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a variable
response_dict = r.json()

# Process the results
print(response_dict.keys())
print("Total repositories:", response_dict['total_count'])
#output:
Status code: 200
dict_keys(['total_count', 'incomplete_results', 'items'])
Total repositories: 9020420

Information about the repositories

# Explore information about the repositories
repo_dicts = response_dict['items']
print("Repositories returned:", len(repo_dicts))

# Examine the first repository
repo_dict = repo_dicts[0]
print("\nKeys:", len(repo_dict))
for key in sorted(repo_dict.keys()):
    print(key)

print("\nSelected information about first repository:")
print('Name:', repo_dict['name'])
print('Owner:', repo_dict['owner']['login'])
print('Stars:', repo_dict['stargazers_count'])
print('Repository:', repo_dict['html_url'])
print('Created:', repo_dict['created_at'])
print('Updated:', repo_dict['updated_at'])
print('Description:', repo_dict['description'])
#output:
Repositories returned: 30

Keys: 79
allow_forking
archive_url
archived
assignees_url
blobs_url
branches_url
clone_url
collaborators_url
comments_url
commits_url
compare_url
...

Selected information about first repository:
Name: public-apis
Owner: public-apis
Stars: 211850
Repository: https://github.com/public-apis/public-apis
Created: 2016-03-20T23:49:42Z
Updated: 2022-10-11T02:30:33Z
Description: A collective list of free APIs

# Summarize information about every repository
print("\nSelected information about each repository:")
for repo_dict in repo_dicts:
    print('\nName:', repo_dict['name'])
    print('Owner:', repo_dict['owner']['login'])
    print('Stars:', repo_dict['stargazers_count'])
    print('Repository:', repo_dict['html_url'])
    print('Description:', repo_dict['description'])
#output:
Selected information about each repository:

Name: public-apis
Owner: public-apis
Stars: 211850
Repository: https://github.com/public-apis/public-apis
Description: A collective list of free APIs

Name: system-design-primer
Owner: donnemartin
Stars: 199433
Repository: https://github.com/donnemartin/system-design-primer
Description: Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Name: awesome-python
Owner: vinta
Stars: 144231
Repository: https://github.com/vinta/awesome-python
Description: A curated list of awesome Python frameworks, libraries, software and resources

Name: youtube-dl
Owner: ytdl-org
Stars: 114026
Repository: https://github.com/ytdl-org/youtube-dl
Description: Command-line program to download videos from YouTube.com and other video sites

Name: thefuck
Owner: nvbn
Stars: 73960
Repository: https://github.com/nvbn/thefuck
Description: Magnificent app which corrects your previous console command.

...
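
The loop above repeats the same dictionary lookups for every repository. As a rough sketch, those lookups could be gathered into a small helper. The function name summarize_repo and the sample dictionary are hypothetical and exist only for this illustration; the sample values are copied from the output shown above. The helper falls back to an empty string for description, on the assumption that some repositories have none.

```python
def summarize_repo(repo_dict):
    """Return the fields printed above as a flat dictionary (hypothetical helper)."""
    return {
        'name': repo_dict['name'],
        'owner': repo_dict['owner']['login'],       # owner is a nested dictionary
        'stars': repo_dict['stargazers_count'],
        'url': repo_dict['html_url'],
        # description may be missing or None; fall back to an empty string
        'description': repo_dict.get('description') or '',
    }


# Sample data shaped like one entry of response_dict['items']
sample_repo = {
    'name': 'public-apis',
    'owner': {'login': 'public-apis'},
    'stargazers_count': 211850,
    'html_url': 'https://github.com/public-apis/public-apis',
    'description': 'A collective list of free APIs',
}

summary = summarize_repo(sample_repo)
for field, value in summary.items():
    print(f"{field}: {value}")
```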

2. Visualizing Repositories Using Pygal

Create a bar chart showing how many stars each project has received.

import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

# Make the API call and store the response
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code:", r.status_code)

# Store the API response in a variable
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories
repo_dicts = response_dict['items']
names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    plot_dicts.append({
        'value': repo_dict['stargazers_count'],
        # Custom tooltip; description can be None, so fall back to an empty string
        'label': repo_dict['description'] or '',
        # Make each bar a link to the project's page on GitHub
        'xlink': repo_dict['html_url'],
    })

# Build the visualization
my_style = LS('#333366', base_style=LCS)
# chart = pygal.Bar(style=my_style, x_label_rotation=45, show_legend=False)
# Create a configuration object holding all the customizations passed to Bar()
my_config = pygal.Config()
my_config.x_label_rotation = 45  # rotate the x-axis labels 45 degrees
my_config.show_legend = False  # hide the legend
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15  # shorten long project names to 15 characters
my_config.show_y_guides = False  # hide the horizontal guide lines
my_config.width = 1000
chart = pygal.Bar(my_config, style=my_style)

chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', plot_dicts)
chart.render_to_file('python_repos.svg')

3. The Hacker News API

The following call returns information about the article that was the top story when the book was written:

https://hacker-news.firebaseio.com/v0/item/9884165.json

The response is a dictionary containing information about the article with ID 9884165:

{
"by":"nns",
"descendants":297,
"id":9884165,
"kids":[9885099,9884723,9885165,9884789,9885604,9884137,9886151,9885220,9885790,9884661,9885844,9885029,9884817,9887342,9884545,9884372,9884499,9884881,9884109,9886496,9884342,9887832,9885023,9884334,9884707,9887008,9885348,9885131,9887539,9885880,9884196,9884640,9886534,9885152],
"score":558,
"time":1436875181,
"title":"New Horizons: Nasa spacecraft speeds past Pluto",
"type":"story",
"url":"http://www.bbc.co.uk/news/science-environment-33524589"
}
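
A couple of fields in this response benefit from a little processing. The sketch below is an illustration only: it parses a trimmed copy of the response above (the kids list and url are left out for brevity), converts time to a readable UTC date on the understanding that it is a Unix timestamp, and reads descendants as the comment count. The variable names are assumptions made for this example.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the response shown above
item_json = '''
{
  "by": "nns",
  "descendants": 297,
  "id": 9884165,
  "score": 558,
  "time": 1436875181,
  "title": "New Horizons: Nasa spacecraft speeds past Pluto",
  "type": "story"
}
'''

item = json.loads(item_json)

# Interpret the time field as seconds since the Unix epoch, in UTC
posted_at = datetime.fromtimestamp(item['time'], tz=timezone.utc)

# Use .get() with a default in case an item has no descendants key
comment_count = item.get('descendants', 0)

print(item['title'])
print("Posted (UTC):", posted_at.isoformat())
print("Comments:", comment_count)
```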

Make an API call that returns the IDs of the current top stories on Hacker News, then look at each of the top-ranked articles:

import requests
from operator import itemgetter

# Make the API call and store the response
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print("Status code:", r.status_code)

# Process information about each article
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    # Make a separate API call for each article
    url = ('https://hacker-news.firebaseio.com/v0/item/' + str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()
    submission_dict = {
        'title': response_dict['title'],
        'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
        # Some items have no descendants key, so default to 0 comments
        'comments': response_dict.get('descendants', 0)
    }
    submission_dicts.append(submission_dict)

# Sort the articles by comment count, most discussed first
submission_dicts = sorted(submission_dicts, key=itemgetter('comments'), reverse=True)
for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict['title'])
    print("Discussion link:", submission_dict['link'])
    print("Comments:", submission_dict['comments'])
#output:
Status code: 200
200
200
200
...
Title: Improving Firefox Responsiveness on macOS
Discussion link: http://news.ycombinator.com/item?id=33152472
Comments: 319

Title: U.S. Army Chooses Google Workspace
Discussion link: http://news.ycombinator.com/item?id=33157558
Comments: 273

Title: Ask HN: How did you stop drinking?
Discussion link: http://news.ycombinator.com/item?id=33158947
Comments: 162

...
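
The sorting step in the program above can be tried without any network calls. The following is a minimal sketch under stated assumptions: the list named responses is made-up input in arbitrary order, using titles and comment counts from the output above, plus one hypothetical entry with no descendants key to show how the default of 0 behaves.

```python
from operator import itemgetter

# Stand-ins for the per-article API responses, deliberately unsorted
responses = [
    {'title': 'Ask HN: How did you stop drinking?', 'descendants': 162},
    {'title': 'Improving Firefox Responsiveness on macOS', 'descendants': 319},
    {'title': 'Hypothetical item with no descendants key'},
    {'title': 'U.S. Army Chooses Google Workspace', 'descendants': 273},
]

# Build the same kind of dictionaries as the program above;
# .get() supplies 0 when the descendants key is missing
submission_dicts = [
    {'title': response['title'], 'comments': response.get('descendants', 0)}
    for response in responses
]

# itemgetter('comments') pulls out the value to sort on;
# reverse=True puts the most-discussed article first
submission_dicts = sorted(submission_dicts, key=itemgetter('comments'), reverse=True)

for submission_dict in submission_dicts:
    print(submission_dict['comments'], submission_dict['title'])
```

Without the default, a plain response['descendants'] lookup would raise a KeyError on the entry that lacks the key, which is why the program uses .get() there.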
  • Title: Python Learning Path (14): Working with APIs [Project 2: Data Visualization]
  • Author: 茴香豆
  • Created at : 2022-10-11 10:28:00
  • Updated at : 2022-10-11 16:07:57
  • Link: https://hxiangdou.github.io/2022/10/11/Python_14/
  • License: This work is licensed under CC BY-NC-SA 4.0.