Scraping and Analyzing Music Data with Python: A Hands-On Guide to Scraping QQ Music Data with Python!


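【1. Project Goals】

In "A Hands-On Guide to Scraping QQ Music Data with Python (Part 1)", we retrieved the song title, album name, and playback link for each song on a given number of pages of a given artist's singles chart on QQ Music.

In "A Hands-On Guide to Scraping QQ Music Data with Python (Part 2)", we retrieved the lyrics of a given song and the hot comments shown on its first page.

In "A Hands-On Guide to Scraping QQ Music Data with Python (Part 3)", we fetched more comments and generated a word cloud.

This time we wrap the three projects into a single program and use a menu to choose which data to scrape.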

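【2. Required Libraries】

The main libraries involved are requests, openpyxl, html, json, wordcloud, and jieba.

To change the background (mask) image of the word cloud, you also need numpy and PIL (pip install pillow).

To build an .exe, you need pyinstaller, run with the -F option.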

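【3. Implementation】

1. First decide on the menu and which features to implement:

① Get song information for a given artist (title, album, link)

② Get the lyrics of a given song

③ Get the comments on a given song

④ Generate a word cloud

⑤ Exit the program

The code is as follows: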

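import html

import jieba
import numpy
import openpyxl
import requests
from PIL import Image
from wordcloud import WordCloud


class QQ():
    def menu(self):
        print('Welcome to the QQ Music crawler. Choose a function from the menu below.\n')
        while True:
            try:
                print('Menu\n1. Get song info for a given artist\n2. Get lyrics for a given song\n3. Get comments for a given song\n4. Generate a word cloud\n5. Exit\n')
                choice = int(input('Enter a number to select a function: '))
                if choice == 1:
                    self.get_info()
                elif choice == 2:
                    self.get_id()
                    self.get_lyric()
                elif choice == 3:
                    self.get_id()
                    self.get_comment()
                elif choice == 4:
                    self.wordcloud()
                elif choice == 5:
                    print('Thanks for using the program!')
                    break
                else:
                    print('Invalid input, please try again.\n')
            except Exception:
                # Raised by non-numeric input or by a failure in the selected function
                print('Invalid input, please try again.\n')

The imports at the top are the ones the methods in the following steps rely on. The code then creates the class and defines the menu method inside it. I use a class instance here, so every method takes self as its first parameter; I find that an instance makes it easier to pass values between methods;

while True makes the menu loop indefinitely;

the rest of the code maps each number entered to the function it should call.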
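The if/elif chain in menu() can also be written as a lookup table that maps each menu number to the functions it should run. The following is only a minimal sketch of that idea, not the original implementation: `run_choice` and the stand-in actions are hypothetical names introduced for illustration, and the stand-ins merely record what would have been called.

```python
def run_choice(choice, actions):
    """Run every callable registered for `choice`; return False if it is unknown."""
    steps = actions.get(choice)
    if steps is None:
        return False
    for step in steps:
        step()
    return True


# Stand-in actions that only record their names (hypothetical; a real menu
# would register self.get_id, self.get_lyric, and so on instead).
calls = []
actions = {
    1: [lambda: calls.append('get_info')],
    2: [lambda: calls.append('get_id'), lambda: calls.append('get_lyric')],
    3: [lambda: calls.append('get_id'), lambda: calls.append('get_comment')],
}

run_choice(2, actions)
print(calls)
```

With a table like this, adding a menu entry means adding one dictionary item rather than another elif branch.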

2. Wrap project (1) as get_info()

The code is as follows:

    def get_info(self):
        wb = openpyxl.Workbook()        # create a workbook
        sheet = wb.active               # get the workbook's active sheet
        sheet.title = 'song'            # rename the sheet
        sheet['A1'] = 'Song title'      # header: assign cell A1
        sheet['B1'] = 'Album'           # header: assign cell B1
        sheet['C1'] = 'Playback link'   # header: assign cell C1
        url = 'https://c.y.qq.com/soso/fcgi-bin/client_search_cp'
        name = input('Enter the name of the artist to look up: ')
        page = int(input('Enter the number of pages of songs to fetch: '))
        for x in range(page):
            params = {
                'ct': '24', 'qqmusic_ver': '1298', 'new_json': '1',
                'remoteplace': 'sizer.yqq.song_next', 'searchid': '64405487069162918',
                't': '0', 'aggr': '1', 'cr': '1', 'catZhida': '1', 'lossless': '0',
                'flag_qc': '0', 'p': str(x + 1), 'n': '20', 'w': name, 'g_tk': '5381',
                'loginUin': '0', 'hostUin': '0', 'format': 'json', 'inCharset': 'utf8',
                'outCharset': 'utf-8', 'notice': '0', 'platform': 'yqq.json', 'needNewCode': '0'
            }
            res = requests.get(url, params=params)
            data = res.json()
            songs = data['data']['song']['list']
            for music in songs:
                song_name = music['name']          # song title
                album = music['album']['name']     # album name
                link = 'https://y.qq.com/n/yqq/song/' + str(music['mid']) + '.html\n\n'  # playback link
                sheet.append([song_name, album, link])  # write title, album and link as one row
        wb.save(name + '_top_' + str(page * 20) + '_singles.xlsx')  # finally save and name the Excel file
        print('Download complete!\n')
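The row-building part of get_info() can be tried without touching the network. The sketch below is an illustration under stated assumptions: `extract_rows` is a hypothetical helper, the sample payload is made up and only mirrors the keys that get_info() reads (data, song, list, name, album, mid), and the link prefix is passed in by the caller rather than taken from the service.

```python
def extract_rows(payload, link_prefix):
    """Turn a search response (already parsed from JSON) into spreadsheet rows."""
    rows = []
    for music in payload['data']['song']['list']:
        link = link_prefix + str(music['mid']) + '.html'
        rows.append([music['name'], music['album']['name'], link])
    return rows


# Made-up sample data with the same nesting that get_info() indexes into.
sample = {'data': {'song': {'list': [
    {'name': 'Song A', 'album': {'name': 'Album A'}, 'mid': 'abc123'},
    {'name': 'Song B', 'album': {'name': 'Album B'}, 'mid': 'def456'},
]}}}

rows = extract_rows(sample, 'https://y.qq.com/n/yqq/song/')
for row in rows:
    print(row)
```

Each row is a three-item list, which is the shape that sheet.append() expects for one spreadsheet line.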

3. Wrap project (2) as get_id() and get_lyric()

The code is as follows:

    def get_id(self):
        self.i = input('Enter the song title: ')
        url_1 = 'https://c.y.qq.com/soso/fcgi-bin/client_search_cp'  # search URL used to look up the song
        headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
        params = {
            'ct': '24', 'qqmusic_ver': '1298', 'new_json': '1', 'remoteplace': 'txt.yqq.song',
            'searchid': '71600317520820180', 't': '0', 'aggr': '1', 'cr': '1', 'catZhida': '1',
            'lossless': '0', 'flag_qc': '0', 'p': '1', 'n': '10', 'w': self.i, 'g_tk': '5381',
            'loginUin': '0', 'hostUin': '0', 'format': 'json', 'inCharset': 'utf8',
            'outCharset': 'utf-8', 'notice': '0', 'platform': 'yqq.json', 'needNewCode': '0'}
        res_music = requests.get(url_1, headers=headers, params=params)
        json_music = res_music.json()
        self.id = json_music['data']['song']['list'][0]['id']  # id of the first search result
        # print(self.id)

    def get_lyric(self):
        url_2 = 'https://c.y.qq.com/lyric/fcgi-bin/fcg_query_lyric_yqq.fcg'  # URL for requesting the lyrics
        headers = {
            'origin': 'https://y.qq.com',
            'referer': 'https://y.qq.com/n/yqq/song/001qvvgF38HVc4.html',
            'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
        params = {
            'nobase64': '1', 'musicid': self.id, '-': 'jsonp1', 'g_tk': '5381',
            'loginUin': '0', 'hostUin': '0', 'format': 'json', 'inCharset': 'utf8',
            'outCharset': 'utf-8', 'notice': '0', 'platform': 'yqq.json', 'needNewCode': '0',
        }
        res_music = requests.get(url_2, headers=headers, params=params)
        js_1 = res_music.json()
        lyric = js_1['lyric']
        lyric_html = html.unescape(lyric)  # decode the HTML character references in the lyrics
        # print(lyric_html)
        with open(self.i + '_lyrics.txt', 'a', encoding='utf-8') as f1:  # save to a txt file
            f1.write(lyric_html)
        print('Download complete!\n')

One point worth stressing: the headers for the lyrics request must include 'origin' and 'referer', otherwise no data comes back.
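The html.unescape step in get_lyric() can be checked offline. This is a small sketch with a made-up lyric string; the assumption is only that the returned lyrics contain HTML character references such as `&#10;` (newline) and `&#32;` (space), which is what html.unescape decodes.

```python
import html

# Made-up sample in the style of timestamped lyrics, using numeric
# character references in place of spaces and newlines.
raw = '[ti:Demo]&#10;[00:01.00]Hello&#32;world&#10;'

decoded = html.unescape(raw)   # turn the references back into characters
lines = decoded.splitlines()   # one list item per lyric line

for line in lines:
    print(line)
```

Without this step the saved text file would contain the raw references instead of readable line breaks.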

4. Wrap project (3) as get_comment() and wordcloud()

The code is as follows:

    def get_comment(self):
        page = input('Enter the number of comment pages to download: ')
        url_3 = 'https://c.y.qq.com/base/fcgi-bin/fcg_global_comment_h5.fcg'
        headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}
        with open(self.i + '_comments.txt', 'a', encoding='utf-8') as f2:  # save to a txt file
            for n in range(int(page)):
                params = {
                    'g_tk_new_20200303': '5381', 'g_tk': '5381', 'loginUin': '0', 'hostUin': '0',
                    'format': 'json', 'inCharset': 'utf8', 'outCharset': 'GB2312', 'notice': '0',
                    'platform': 'yqq.json', 'needNewCode': '0', 'cid': '205360772', 'reqtype': '2',
                    'biztype': '1', 'topid': self.id, 'cmd': '6', 'needmusiccrit': '0', 'pagenum': n,
                    'pagesize': '15', 'lasthotcommentid': '', 'domain': 'qq', 'ct': '24', 'cv': '10101010'}
                res_music = requests.get(url_3, headers=headers, params=params)
                js_2 = res_music.json()
                comments = js_2['comment']['commentlist']
                for i in comments:
                    comment = i['rootcommentcontent'] + '\n' + '-' * 34 + '\n'  # comment text plus a separator line
                    f2.write(comment)
                    # print(comment)
        print('Download complete!\n')

    def wordcloud(self):
        self.name = input('Enter the name of the file to build the word cloud from: ')

        def cut(text):
            wordlist_jieba = jieba.cut(text)           # segment the Chinese text into words
            space_wordlist = " ".join(wordlist_jieba)  # join the words with spaces
            return space_wordlist

        with open(self.name + ".txt", encoding="utf-8") as file:
            text = file.read()
        text = cut(text)
        mask_pic = numpy.array(Image.open("心.png"))  # mask image that shapes the cloud
        wordcloud = WordCloud(font_path="C:/Windows/f", collocations=False, max_words=100,
                              min_font_size=10, max_font_size=500, mask=mask_pic).generate(text)
        wordcloud.to_file(self.name + '_wordcloud.png')  # save the word cloud image
        print('Word cloud generated!\n')
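Before rendering an image, it can help to preview which words would dominate it. The sketch below uses only the standard library and counts the space-separated tokens that cut() produces. It is a rough approximation and an assumption on my part, not how the wordcloud library works internally (which applies its own tokenization and stop-word filtering); `top_words` is a hypothetical helper and the sample string is a stand-in for segmented comment text.

```python
from collections import Counter


def top_words(space_joined_text, limit=100):
    """Return the `limit` most frequent space-separated tokens with their counts."""
    counts = Counter(space_joined_text.split())
    return counts.most_common(limit)


# Stand-in for the output of cut(): words already separated by spaces.
sample = 'great love great bubble love great'

preview = top_words(sample, limit=2)
print(preview)
```

The default limit of 100 mirrors the max_words=100 setting passed to WordCloud above.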

5. Finally, instantiate the class

qq = QQ()
qq.menu()

6. Results

7. Packaging as an .exe

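Packaging with pyinstaller -F produces an executable that reports an error and closes immediately when run.

Judging from the error message in the screenshot above, the problem appears to be related to the word cloud. If you comment out the libraries the word cloud needs and modify def wordcloud() as shown in the screenshot below, packaging succeeds, but the program can no longer generate word clouds: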

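When downloading lyrics or comments, if several songs share the same title, put the artist's name in front of the song title, as with "邓紫棋泡沫" in the screenshot above.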

For the source code, add: 850591259
