Java Web Crawler

Author: 请叫我平爷 | Published 2022-02-21 10:54
  1. Add the jsoup dependency
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
  2. Fetch and parse the content
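import com.alibaba.fastjson.JSON; // assumption: JSON.toJSON comes from fastjson
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import org.junit.Test; // assumption: JUnit 4
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

@Test
public void textDemo() throws Exception {
    String url = "https://search.jd.com/Search?keyword=java&enc=utf-8";
    // Fetch and parse the page, with a 30-second timeout
    Document document = Jsoup.parse(new URL(url), 30000);
    // The container that holds the search results
    Element element = document.getElementById("J_goodsList");
    if (element == null) {
        // The response did not contain the result list, so there is nothing to parse
        System.out.println("J_goodsList not found in the response");
        return;
    }
    // Get all li elements; each li is one product
    Elements elements = element.getElementsByTag("li");
    List<ProductBean> productBeanList = new ArrayList<>();
    // Extract the fields of each product; el is a single li tag
    for (Element el : elements) {
        String img = el.getElementsByTag("img").eq(0).attr("src");
        String price = el.getElementsByClass("p-price").eq(0).text();
        String name = el.getElementsByClass("p-name").eq(0).text();
        ProductBean bean = new ProductBean();
        bean.setImg(img);
        bean.setName(name);
        bean.setPrice(price);
        productBeanList.add(bean);
    }
    // Print the collected products as JSON
    System.out.println(JSON.toJSON(productBeanList));
}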
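
The code above uses a ProductBean class that the post does not show. A minimal sketch of what it might look like follows, assuming three String fields that match the setImg, setName and setPrice calls. The getters and toString are additions for illustration, not part of the original.

```java
// Hypothetical ProductBean: the field names are inferred from the setters
// called in the crawler code; the original class is not shown in the post.
// Declare it public in its own ProductBean.java file if it is used across packages.
class ProductBean {
    private String img;   // value of the img tag's src attribute
    private String price; // text of the element with class p-price
    private String name;  // text of the element with class p-name

    public String getImg() {
        return img;
    }

    public void setImg(String img) {
        this.img = img;
    }

    public String getPrice() {
        return price;
    }

    public void setPrice(String price) {
        this.price = price;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "ProductBean{name=" + name + ", price=" + price + ", img=" + img + "}";
    }
}
```

The price stays a String here because the crawler stores the raw text of the p-price element, which may include a currency symbol; converting it to a number would be a separate step.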
