Scraping Historical Fund Net Asset Value Data from NetEase Finance with Python


Author: Vined | Published: 2018-03-04 20:03

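The NetEase Finance page that lists a fund's historical net asset value (NAV) data is at
http://quotes.money.163.com//fund/jzzs_110022.html?start=2018-02-22&end=2018-03-02
The fund code comes right after jzzs_ in the path.
The query parameters are:

  1. start: the start date, in yyyy-mm-dd format
  2. end: the end date, in yyyy-mm-dd format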
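
As a small illustration of the URL layout described above, the address can be assembled from a fund code and two dates. This is only a sketch for Python 3: `build_history_url` and `URL_TEMPLATE` are hypothetical names that do not appear in the original article, and passing `datetime.date` objects is just one way to guarantee the yyyy-mm-dd format.

```python
from datetime import date
from urllib.parse import urlencode

# URL pattern copied from the address above; the fund code follows "jzzs_".
URL_TEMPLATE = 'http://quotes.money.163.com//fund/jzzs_{code}.html'


def build_history_url(code, start, end):
    """Build the history-page URL (hypothetical helper, not from the article).

    `start` and `end` are datetime.date objects; isoformat() renders them
    as yyyy-mm-dd, the format the two query parameters use.
    """
    query = urlencode({'start': start.isoformat(), 'end': end.isoformat()})
    return URL_TEMPLATE.format(code=code) + '?' + query


print(build_history_url('110022', date(2018, 2, 22), date(2018, 3, 2)))
```

Called with fund 110022 and the dates 2018-02-22 and 2018-03-02, it prints the same address shown above.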

The HTML of the table body on that page looks like this:

<tbody>
    <tr>
        <td>2018-03-01</td>
        <td>2.3540</td>
        <td>2.3540</td>
        <td><span class="cRed">0.56%</span></td>
    </tr>
    <tr>
        <td>2018-02-28</td>
        <td>2.3410</td>
        <td>2.3410</td>
        <td><span class="cGreen">-1.35%</span></td>
    </tr>
    <tr>
        <td>2018-02-27</td>
        <td>2.3730</td>
        <td>2.3730</td>
        <td><span class="cGreen">-2.06%</span></td>
    </tr>
    <tr>
        <td>2018-02-26</td>
        <td>2.4230</td>
        <td>2.4230</td>
        <td><span class="cRed">0.29%</span></td>
    </tr>
    <tr>
        <td>2018-02-23</td>
        <td>2.4160</td>
        <td>2.4160</td>
        <td><span class="cGreen">-0.49%</span></td>
    </tr>
    <tr>
        <td>2018-02-22</td>
        <td>2.4280</td>
        <td>2.4280</td>
        <td><span class="cRed">2.58%</span></td>
    </tr>
</tbody>

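To extract the historical NAV data, use BeautifulSoup's find_all method (also available under its older name findAll) to locate the tbody (table body) tag, then find the tr (table row) tags inside it. The cells in each row are:

  1. td:nth-of-type(1) (the 1st cell): the NAV date
  2. td:nth-of-type(2) (the 2nd cell): the unit NAV
  3. td:nth-of-type(3) (the 3rd cell): the cumulative NAV
  4. td:nth-of-type(4) (the 4th cell): the daily growth rate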
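
Every cell arrives as text. If the values are needed for calculation rather than display, the four columns listed above could be converted to typed values along these lines. This is a sketch only: `parse_row` and the key `CumulativeNetAssetValue` are hypothetical names introduced here, and `Decimal` is one possible choice for keeping the four decimal places exact.

```python
from datetime import datetime
from decimal import Decimal


def parse_row(cells):
    """Convert the four cell strings of one table row into typed values.

    `cells` holds the text of the 1st to 4th <td>, in column order.
    Hypothetical helper, not part of the original article.
    """
    nav_date, unit_nav, cumulative_nav, change = cells
    return {
        'Date': datetime.strptime(nav_date, '%Y-%m-%d').date(),
        'NetAssetValue': Decimal(unit_nav),
        'CumulativeNetAssetValue': Decimal(cumulative_nav),
        # Drop the trailing percent sign; the value stays in percent units.
        'ChangePercent': Decimal(change.rstrip('%')),
    }


row = parse_row(['2018-02-28', '2.3410', '2.3410', '-1.35%'])
print(row['Date'], row['NetAssetValue'], row['ChangePercent'])
```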

The sample code is as follows:

# -*- coding:utf-8 -*-


import requests
from bs4 import BeautifulSoup
from prettytable import PrettyTable


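def get_url(url, params=None, proxies=None):
    """Fetch a page and return its text; raise on an HTTP error status."""
    # The timeout is an addition so a stalled connection cannot hang the script.
    rsp = requests.get(url, params=params, proxies=proxies, timeout=10)
    rsp.raise_for_status()
    return rsp.text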


def get_fund_data(code, start='', end=''):
    """Return the NAV history of fund `code` as a list of dicts."""
    record = {'Code': code}
    url = 'http://quotes.money.163.com//fund/jzzs_' + code + '.html'
    params = {'start': start, 'end': end}
    html = get_url(url, params)
    soup = BeautifulSoup(html, 'html.parser')
    records = []
    # Take the first <tbody> on the page (assumed to be the history table).
    tab = soup.find_all('tbody')[0]
    for tr in tab.find_all('tr'):
        # Only rows with exactly four cells are data rows.
        if len(tr.find_all('td')) == 4:
            record['Date'] = tr.select('td:nth-of-type(1)')[0].get_text().strip()
            record['NetAssetValue'] = tr.select('td:nth-of-type(2)')[0].get_text().strip()
            record['ChangePercent'] = tr.select('td:nth-of-type(4)')[0].get_text().strip()
            # Copy, because the same dict is reused for every row.
            records.append(record.copy())
    return records


def demo(code, start, end):
    table = PrettyTable()
    table.field_names = ['Code', 'Date', 'NAV', 'Change']
    table.align['Change'] = 'r'
    records = get_fund_data(code, start, end)
    for record in records:
        table.add_row([record['Code'], record['Date'], record['NetAssetValue'], record['ChangePercent']])
    return table


if __name__ == "__main__":
    print(demo('110022', '2018-02-22', '2018-03-02'))

The output is:

+--------+------------+--------+--------+
|  Code  |    Date    |  NAV   | Change |
+--------+------------+--------+--------+
| 110022 | 2018-03-02 | 2.3580 |  0.17% |
| 110022 | 2018-03-01 | 2.3540 |  0.56% |
| 110022 | 2018-02-28 | 2.3410 | -1.35% |
| 110022 | 2018-02-27 | 2.3730 | -2.06% |
| 110022 | 2018-02-26 | 2.4230 |  0.29% |
| 110022 | 2018-02-23 | 2.4160 | -0.49% |
| 110022 | 2018-02-22 | 2.4280 |  2.58% |
+--------+------------+--------+--------+
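
As a sanity check on the scraped values, the Change column can be recomputed from consecutive NAV values. The sketch below assumes the daily growth rate is simply the percentage change in unit NAV from the previous trading day, which matches this sample but may not hold on days with dividends or splits. The rows are copied from the output above, and `daily_change` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# (date, unit NAV, reported change) copied from the output table, newest first.
rows = [
    ('2018-03-02', '2.3580', '0.17%'),
    ('2018-03-01', '2.3540', '0.56%'),
    ('2018-02-28', '2.3410', '-1.35%'),
    ('2018-02-27', '2.3730', '-2.06%'),
    ('2018-02-26', '2.4230', '0.29%'),
    ('2018-02-23', '2.4160', '-0.49%'),
    ('2018-02-22', '2.4280', '2.58%'),
]


def daily_change(today_nav, previous_nav):
    """Percentage change between two NAV strings, rounded to two decimals."""
    today, previous = Decimal(today_nav), Decimal(previous_nav)
    pct = (today - previous) / previous * 100
    return str(pct.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)) + '%'


# Pair each row with the next (older) one; the oldest row has no predecessor
# in this sample, so its change cannot be recomputed.
recomputed = [
    (newer[0], daily_change(newer[1], older[1]), newer[2])
    for newer, older in zip(rows, rows[1:])
]
for day, computed, reported in recomputed:
    print(day, computed, reported)
```

For the six rows that have a predecessor in the sample, the recomputed figure equals the reported one.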
