Python Learning Path (14): Working with APIs [Project 2: Data Visualization]

In this chapter, you'll learn how to write a self-contained program that retrieves data through an API and then visualizes that data.

1. Using a Web API

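The visualization in this chapter is based on information from GitHub, a site that lets programmers collaborate on projects. We'll use GitHub's API to request information about Python projects on the site, then use Pygal to generate an interactive visualization showing how popular those projects are.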

https://api.github.com/search/repositories?q=language:python&sort=stars
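What it does: returns the number of Python projects currently hosted on GitHub, along with information about the most popular Python repositories.
The first part (https://api.github.com/) directs the request to the part of GitHub that responds to API calls.
The second part (search/repositories) tells the API to search across all repositories on GitHub.
The question mark after repositories signals that we are about to pass an argument.
q stands for query, and the equals sign lets us begin specifying the query (q=).
language:python indicates that we only want information about repositories whose primary language is Python. The final part (&sort=stars) sorts the projects by the number of stars they have received.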
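The breakdown above can be checked with the standard library. This is a minimal sketch, not part of the original program: the helper name split_api_url is made up for illustration, and it only takes the URL apart without contacting GitHub.

```python
from urllib.parse import urlsplit, parse_qs

URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'


def split_api_url(url):
    """Return the pieces of an API URL as a plain dictionary.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    return {
        # Scheme and host: the part of GitHub that answers API calls
        'base': f'{parts.scheme}://{parts.netloc}/',
        # Path: which API endpoint is being asked
        'endpoint': parts.path.lstrip('/'),
        # Everything after the question mark, split on '&' and '='
        'params': parse_qs(parts.query),
    }


pieces = split_api_url(URL)
print(pieces['base'])
print(pieces['endpoint'])
print(pieces['params'])
```

Note that parse_qs returns each value as a list, because a query string may repeat the same key.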

import requests
# Make the API call and store the response
url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(url)
print("Status code: ", r.status_code)
# Store the API response in a variable
response_dict = r.json()
# Process the results
print(response_dict.keys())
print("Total repositories:", response_dict['total_count'])
#output:
Status code:  200
dict_keys(['total_count', 'incomplete_results', 'items'])
Total repositories: 9020420
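The three keys in the output are worth reading together: total_count is the number of matches, items holds only the repositories actually returned, and incomplete_results reports whether GitHub finished the search. The sketch below assumes that reading. The function name summarize_search and the sample dictionary are made up for illustration, with the sample standing in for r.json() so the code runs without a network call.

```python
def summarize_search(response_dict):
    """Return (total_count, returned_count, complete) for a search response."""
    total = response_dict['total_count']
    returned = len(response_dict['items'])
    # incomplete_results is True when the search did not finish,
    # so "complete" is its negation.
    complete = not response_dict['incomplete_results']
    return total, returned, complete


# Hypothetical stand-in for r.json(); only the three top-level keys matter here.
sample_response = {
    'total_count': 9020420,
    'incomplete_results': False,
    'items': [{'name': 'public-apis'}, {'name': 'system-design-primer'}],
}

total, returned, complete = summarize_search(sample_response)
print(f"{returned} of {total} repositories returned; complete: {complete}")
```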

Repository information

# Explore information about the repositories
repo_dicts = response_dict['items']
print("Repositories returned: ", len(repo_dicts))
# Examine the first repository
repo_dict = repo_dicts[0]
print("\nKeys: ", len(repo_dict))
for key in sorted(repo_dict.keys()):
    print(key)
print("\nSelected information about first repository:")
print('Name:', repo_dict['name'])
print('Owner:', repo_dict['owner']['login'])
print('Stars:', repo_dict['stargazers_count'])
print('Repository:', repo_dict['html_url'])
print('Created:', repo_dict['created_at'])
print('Updated:', repo_dict['updated_at'])
print('Description:', repo_dict['description'])
#output:
Repositories returned:  30

Keys:  79
allow_forking
archive_url
archived
assignees_url
blobs_url
branches_url
clone_url
collaborators_url
comments_url
commits_url
compare_url
...

Selected information about first repository:
Name: public-apis
Owner: public-apis
Stars: 211850
Repository: https://github.com/public-apis/public-apis
Created: 2016-03-20T23:49:42Z
Updated: 2022-10-11T02:30:33Z
Description: A collective list of free APIs
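The Created and Updated values above are plain strings. If you want to compare or subtract them, one option is to parse them into datetime objects. This is a small sketch assuming the timestamps always use the exact format shown in the output, with a trailing Z for UTC. The helper name parse_github_time is made up for illustration.

```python
from datetime import datetime, timezone


def parse_github_time(stamp):
    """Parse a timestamp such as '2016-03-20T23:49:42Z' into an aware datetime."""
    parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
    # The trailing 'Z' means UTC, so attach that time zone explicitly.
    return parsed.replace(tzinfo=timezone.utc)


# Values copied from the output above.
created = parse_github_time('2016-03-20T23:49:42Z')
updated = parse_github_time('2022-10-11T02:30:33Z')

print('Created on:', created.date())
print('Days between creation and last update:', (updated - created).days)
```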
    
# Summarize information about every repository
print("\nSelected information about each repository:")
for repo_dict in repo_dicts:
    print('\nName:', repo_dict['name'])
    print('Owner:', repo_dict['owner']['login'])
    print('Stars:', repo_dict['stargazers_count'])
    print('Repository:', repo_dict['html_url'])
    print('Description:', repo_dict['description'])
#output:
Selected information about each repository:

Name: public-apis
Owner: public-apis
Stars: 211850
Repository: https://github.com/public-apis/public-apis
Description: A collective list of free APIs

Name: system-design-primer
Owner: donnemartin
Stars: 199433
Repository: https://github.com/donnemartin/system-design-primer
Description: Learn how to design large-scale systems. Prep for the system design interview.  Includes Anki flashcards.

Name: awesome-python
Owner: vinta
Stars: 144231
Repository: https://github.com/vinta/awesome-python
Description: A curated list of awesome Python frameworks, libraries, software and resources

Name: youtube-dl
Owner: ytdl-org
Stars: 114026
Repository: https://github.com/ytdl-org/youtube-dl
Description: Command-line program to download videos from YouTube.com and other video sites

Name: thefuck
Owner: nvbn
Stars: 73960
Repository: https://github.com/nvbn/thefuck
Description: Magnificent app which corrects your previous console command.
    
...

2. Visualizing Repositories with Pygal

Create a bar chart showing how many stars each project has received.

import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS
# Make the API call and store the response
URL = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
r = requests.get(URL)
print("Status code: ", r.status_code)

# Store the API response in a variable
response_dict = r.json()
print("Total repositories: ", response_dict['total_count'])
# Explore information about the repositories
repo_dicts = response_dict['items']
names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    plot_dicts.append({
        'value': repo_dict['stargazers_count'],
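        # Add a custom tooltip; description can be None, so convert it to a string
        'label': str(repo_dict['description']),
        # Make each bar a link to the project's page on GitHub
        'xlink': repo_dict['html_url']
    })
# Build the visualization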
my_style = LS('#333366', base_style=LCS)
# chart = pygal.Bar(style=my_style, x_label_rotation=45, show_legend=False)
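# Create a configuration object holding all the customizations passed to Bar()
my_config = pygal.Config()
my_config.x_label_rotation = 45  # rotate the x-axis labels by 45 degrees
my_config.show_legend = False  # hide the legend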
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15  # shorten long project names to 15 characters
my_config.show_y_guides = False  # hide the horizontal guide lines
my_config.width = 1000
chart = pygal.Bar(my_config, style=my_style)

chart.title = 'Most-Starred Python Projects on GitHub'
chart.x_labels = names

chart.add('', plot_dicts)
chart.render_to_file('python_repos.svg')

3. Hacker News API

The following call returns information about the most popular article at the time the book was written:

https://hacker-news.firebaseio.com/v0/item/9884165.json

The response is a dictionary containing information about the article with ID 9884165:

{
    "by":"nns",
    "descendants":297,
    "id":9884165,
    "kids":[9885099,9884723,9885165,9884789,9885604,9884137,9886151,9885220,9885790,9884661,9885844,9885029,9884817,9887342,9884545,9884372,9884499,9884881,9884109,9886496,9884342,9887832,9885023,9884334,9884707,9887008,9885348,9885131,9887539,9885880,9884196,9884640,9886534,9885152],
    "score":558,
    "time":1436875181,
    "title":"New Horizons: Nasa spacecraft speeds past Pluto",
    "type":"story",
    "url":"http://www.bbc.co.uk/news/science-environment-33524589"
}
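The "time" field in the response above is easier to read once converted to a date. The sketch below assumes the value is a Unix timestamp in seconds and uses only a subset of the fields shown above.

```python
from datetime import datetime, timezone

# Subset of the response shown above.
item = {"id": 9884165, "score": 558, "time": 1436875181}

# Interpret the value as seconds since 1970-01-01 UTC (assumption stated above).
posted = datetime.fromtimestamp(item["time"], tz=timezone.utc)
print(posted.isoformat())  # 2015-07-14T11:59:41+00:00
```

Passing tz=timezone.utc keeps the result independent of the time zone of the machine running the code.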

Make an API call that returns the IDs of the current top articles on Hacker News, then look at each of the top-ranked articles.

import requests
from operator import itemgetter
# Make the API call and store the response
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print("Status code: ", r.status_code)
# Process information about each article
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    # Make a separate API call for each article
    url = ('https://hacker-news.firebaseio.com/v0/item/' + str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()
    submission_dict = {
        'title': response_dict['title'],
        'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
        'comments': response_dict.get('descendants', 0)
    }
    submission_dicts.append(submission_dict)
submission_dicts = sorted(submission_dicts, key=itemgetter('comments'), reverse=True)
for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict['title'])
    print("Discussion link:", submission_dict['link'])
    print("Comments:", submission_dict['comments'])
#output:
Status code:  200
200
200
200
...
Title: Improving Firefox Responsiveness on macOS
Discussion link: http://news.ycombinator.com/item?id=33152472
Comments: 319

Title: U.S. Army Chooses Google Workspace
Discussion link: http://news.ycombinator.com/item?id=33157558
Comments: 273

Title: Ask HN: How did you stop drinking?
Discussion link: http://news.ycombinator.com/item?id=33158947
Comments: 162

...
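The program above reads 'descendants' with get() and a default of 0, presumably because some items have no comment count, while 'title' is read directly. The sketch below pulls that per-item step into a function so the sorting can be tried without calling the API. The function name build_submission, the '(no title)' fallback, and the sample items are all made up for illustration.

```python
from operator import itemgetter


def build_submission(item):
    """Turn one item response into the dictionary used for sorting and printing."""
    return {
        # Fallback text is an assumption; the original indexes 'title' directly.
        'title': item.get('title', '(no title)'),
        'link': 'http://news.ycombinator.com/item?id=' + str(item['id']),
        # Same default as the original: treat a missing count as 0 comments.
        'comments': item.get('descendants', 0),
    }


# Hypothetical item responses standing in for submission_r.json().
sample_items = [
    {'id': 1, 'title': 'First', 'descendants': 12},
    {'id': 2, 'title': 'No comment count'},
    {'id': 3, 'title': 'Third', 'descendants': 40},
]

submissions = sorted(
    (build_submission(item) for item in sample_items),
    key=itemgetter('comments'),
    reverse=True,
)
for submission in submissions:
    print(submission['comments'], submission['title'])
```

Without the default, sorting by 'comments' would fail on the first item that lacks a count.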
