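I. Scraping COVID-19 data
The data comes from the API behind the Tencent News COVID-19 page: https://news.qq.com/zt2020/page/feiyan.htm#/global
Press F12 to open the browser's developer tools and inspect the requests the page sends.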
Italy:
https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?country=%E6%84%8F%E5%A4%A7%E5%88%A9&
Iran:
https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?country=%E4%BC%8A%E6%9C%97&
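In these URLs, the value after country= is the country name, percent-encoded as UTF-8 (%E6%84%8F%E5%A4%A7%E5%88%A9 is 意大利, Italy).
This gives the general form of the endpoint we need:
https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?country=<country name>
Data for Chinese provinces works the same way:
https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?province=<province name>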
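As an illustration of how a place name ends up in the query string, here is a minimal Python sketch that percent-encodes the name and appends it to the endpoint. The helper name build_url is hypothetical, and 湖北 (Hubei) is only an example value; the endpoint and the country/province parameter names are copied from the URLs above.

```python
from urllib.parse import quote

# Endpoint copied verbatim from the requests above, including its spelling.
BASE = "https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list"


def build_url(param: str, name: str) -> str:
    """Hypothetical helper: param is "country" or "province", name is the Chinese place name."""
    if param not in ("country", "province"):
        raise ValueError("param must be 'country' or 'province'")
    # quote() percent-encodes the UTF-8 bytes of the name; safe="" leaves no character unescaped by exception.
    return f"{BASE}?{param}={quote(name, safe='')}"


print(build_url("country", "意大利"))  # Italy
print(build_url("province", "湖北"))   # Hubei, an example value
```

The first printed URL matches the Italy request shown earlier, minus its trailing &.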
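Next, use Power Query in Excel to fetch the data.
1. Create sheet1 listing the countries whose data you need (the query below expects an Excel table named 国家表 with a column named 国家)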
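let
    // Fetch the daily series for one country; the name is percent-encoded before it is appended to the URL
    location = (x) => Json.Document(Web.Contents("https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?country=" & Uri.EscapeDataString(x)))[data],
    // Read the Excel table named 国家表 (country table) and call the API once per row of its 国家 (country) column
    Source = Excel.CurrentWorkbook(){[Name="国家表"]}[Content],
    GetDataColumn = Table.AddColumn(Source, "data", each location([国家])),
    // One row per daily record, then one column per field
    // Column names: 日期 = date, 累计确诊 = cumulative confirmed, 死亡 = deaths, 治愈 = recovered, 新增确诊 = new confirmed
    ExpandedData1 = Table.ExpandListColumn(GetDataColumn, "data"),
    ExpandedData2 = Table.ExpandRecordColumn(ExpandedData1, "data", {"date", "confirm", "dead", "heal", "confirm_add"}, {"日期", "累计确诊", "死亡", "治愈", "新增确诊"}),
    ChangedType = Table.TransformColumnTypes(ExpandedData2, {{"日期", type date}, {"国家", type text}, {"累计确诊", Int64.Type}, {"死亡", Int64.Type}, {"治愈", Int64.Type}, {"新增确诊", Int64.Type}}),
    // Drop the rows for the most recent date present in the table
    FilteredRows = Table.SelectRows(ChangedType, let latest = List.Max(ChangedType[日期]) in each [日期] < latest)
in
    FilteredRows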
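2. Create sheet2 listing the Chinese provinces whose data you need (the query below expects an Excel table named 省份表 with a column named 省份)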
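let
    // Fetch the daily series for one province; the name is percent-encoded before it is appended to the URL
    location = (x) => Json.Document(Web.Contents("https://api.inews.qq.com/newsqa/v1/query/pubished/daily/list?province=" & Uri.EscapeDataString(x)))[data],
    // Read the Excel table named 省份表 (province table) and call the API once per row of its 省份 (province) column
    Source = Excel.CurrentWorkbook(){[Name="省份表"]}[Content],
    GetDataColumn = Table.AddColumn(Source, "data", each location([省份])),
    // One row per daily record, then one column per field
    // Column names: 日期 = date, 累计确诊 = cumulative confirmed, 死亡 = deaths, 治愈 = recovered, 新增确诊 = new confirmed
    ExpandedData1 = Table.ExpandListColumn(GetDataColumn, "data"),
    ExpandedData2 = Table.ExpandRecordColumn(ExpandedData1, "data", {"date", "confirm", "dead", "heal", "confirm_add"}, {"日期", "累计确诊", "死亡", "治愈", "新增确诊"}),
    ChangedType = Table.TransformColumnTypes(ExpandedData2, {{"日期", type date}, {"省份", type text}, {"累计确诊", Int64.Type}, {"死亡", Int64.Type}, {"治愈", Int64.Type}, {"新增确诊", Int64.Type}}),
    // Drop the rows for the most recent date present in the table
    FilteredRows = Table.SelectRows(ChangedType, let latest = List.Max(ChangedType[日期]) in each [日期] < latest)
in
    FilteredRows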
Open the Power Query editor, create the queries, and load the results into the workbook.
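The final FilteredRows step in both queries keeps only rows dated strictly before the latest date found anywhere in the table. The Python sketch below mimics that logic on a few made-up rows, to show one consequence: the maximum is taken over the whole table, not per country. The function name drop_latest_date and all figures are hypothetical; only the column names come from the queries.

```python
from datetime import date


def drop_latest_date(rows):
    """Keep rows whose 日期 (date) is strictly earlier than the latest date in the whole table."""
    if not rows:
        return []
    latest = max(r["日期"] for r in rows)  # global maximum, as List.Max(ChangedType[日期]) computes
    return [r for r in rows if r["日期"] < latest]


# Made-up sample rows for illustration only; these are not real case counts.
rows = [
    {"国家": "意大利", "日期": date(2020, 4, 3), "累计确诊": 100},
    {"国家": "意大利", "日期": date(2020, 4, 4), "累计确诊": 110},
    {"国家": "意大利", "日期": date(2020, 4, 5), "累计确诊": 120},
    {"国家": "伊朗", "日期": date(2020, 4, 3), "累计确诊": 50},
    {"国家": "伊朗", "日期": date(2020, 4, 4), "累计确诊": 55},
]

kept = drop_latest_date(rows)
for r in kept:
    print(r["国家"], r["日期"], r["累计确诊"])
```

With these sample rows, the Italy row for the latest date is dropped, while both Iran rows are kept because neither falls on the table-wide latest date.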
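II. Analysis in Tableau
1. Import the data
Data hierarchy
2. Data hierarchy and visualization
Choose a map visualization
3. Top ten countries by confirmed cases
Confirmed and recovered cases
4. Compare monthly periods, forecast the trend, and build a dashboard
Dashboard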









