2-1 Scraping Douban Movie Data
Data source: https://movie.douban.com/chart

import json
import time

import requests


def fetch_douban():
    all_movies = []
    start = 0
    limit = 20
    print("Starting to scrape the Douban movie chart")
    headers = {
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://movie.douban.com/chart",
    }
    while True:
        url = "https://movie.douban.com/j/chart/top_list"
        params = {
            "type": "11",
            "interval_id": "100:90",
            "action": "",
            "start": start,
            "limit": limit,
        }
        try:
            # timeout keeps a stalled connection from hanging the loop forever
            response = requests.get(url, headers=headers, params=params, timeout=10)
            data = response.json()
            # An empty list means there are no more pages
            if not data:
                break
            print(f"Fetched {start} - {start + limit}")
            for index, item in enumerate(data):
                movie = {
                    "rank": start + index + 1,
                    "name": item.get("title"),
                    "score": item.get("score"),
                    "release_date": item.get("release_date"),
                    "regions": item.get("regions"),
                    "actors": item.get("actors"),
                    "cover": item.get("cover_url"),
                }
                all_movies.append(movie)
            start += limit
            time.sleep(1)  # pause between requests
        except Exception as e:
            print("An error occurred:", e)
            break
    print("Total records:", len(all_movies))
    # Save to file
    with open("douban_chart.json", "w", encoding="utf-8") as f:
        json.dump(all_movies, f, ensure_ascii=False, indent=2)
    print("Data saved to douban_chart.json")


if __name__ == "__main__":
    fetch_douban()
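Once the script above has written douban_chart.json, a quick way to check the result is to load the file back and list the highest-scored entries. The following is only a minimal sketch: the helper names load_movies, to_float, and top_by_score are hypothetical and not part of the original script, and it assumes the "score" field holds a number or a numeric string, which the source does not confirm.

```python
import json


def load_movies(path="douban_chart.json"):
    """Read the list of movie records written by fetch_douban()."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def to_float(value):
    """Convert a score to float; return None when it is missing or malformed.

    Assumption: scores are numbers or numeric strings.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def top_by_score(movies, n=5):
    """Return the n highest-scored records, skipping those without a usable score."""
    scored = [m for m in movies if to_float(m.get("score")) is not None]
    return sorted(scored, key=lambda m: to_float(m["score"]), reverse=True)[:n]
```

Calling top_by_score(load_movies()) after a successful run returns up to five records, each with the rank, name, and score keys that the scraper stored. Records with an unusable score are skipped rather than raising an error, so a partially filled file can still be inspected.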