Scraping new YouTube videos with BeautifulSoup
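I'm new to Python and I want to do some web scraping on YouTube. I want to use this link, which lists the most recently uploaded videos: "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D", and scrape the 5 newest videos. How can I do that? I tested it with the code below (I only want the links), which comes from another question.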
from bs4 import BeautifulSoup
import requests
url = "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
html = requests.get(url)
soup = BeautifulSoup(html.text, features="html.parser")
# Print the href of every <link> found inside an <entry> element
for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])
Edit: I get no response from the Python terminal. It isn't scraping anything. It only shows the default ">>>" prompt.
You can't scrape YouTube without a Google YouTube API key, which you can get by following these steps. If you still want to try, I can repost a legitimate answer to your question.
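As a rough sketch of the API route, the snippet below only builds a search request URL for the YouTube Data API v3 `search` endpoint, asking for the newest videos that match a query. The function name `build_search_url` and the `"YOUR_API_KEY"` placeholder are assumptions made for illustration; you would substitute your own key.

```python
from urllib.parse import urlencode

# Search endpoint of the YouTube Data API v3
API_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, max_results=5):
    """Build a search URL for the newest videos matching `query` (hypothetical helper)."""
    params = {
        "part": "snippet",          # return title, channel and publish date
        "q": query,                 # search terms
        "type": "video",            # skip channels and playlists
        "order": "date",            # newest first
        "maxResults": max_results,  # how many results to return
        "key": api_key,             # your own API key
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a working key
url = build_search_url("programming", "YOUR_API_KEY")
print(url)
```

Fetching that URL with `requests.get(url).json()` should return a JSON document whose `items` list holds one result per video, with the video ID under `item["id"]["videoId"]`.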
In the meantime, try practicing parsing with BeautifulSoup on this site: videvo.net
Here is some code to get you started
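from bs4 import BeautifulSoup
import requests
def get_source(url):
    # Fetch the page with a browser-like User-Agent and parse the HTML.
    # Note: verify=False turns off TLS certificate verification.
    return BeautifulSoup(requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, verify=False).text, 'html.parser')
soup = get_source('http://videvo.net')
# href=True skips anchors that have no href attribute
for tags in soup.find_all('a', href=True):
    print(tags['href'])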
Happy coding!
Edit: I stand corrected (slightly). It isn't possible to parse YouTube's main URL. You can try this code
def get_source(url):
    # Fetch the feed and parse it
    return BeautifulSoup(requests.get(url).text, 'html.parser')
soup = get_source('https://www.youtube.com/feeds/videos.xml?user=kinagrannis')
# Each <entry> in the feed describes one video
for entry in soup.find_all("entry"):
    for title in entry.find_all("title"):
        print(title.text)
    for link in entry.find_all("link"):
        print(link["href"])
    for name in entry.find_all("name"):
        print(name.text)
    for pub in entry.find_all("published"):
        print(pub.text)
Note: you can put any username in place of "kinagrannis": user=[username]
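Since the question asks for only the 5 newest videos, here is a minimal sketch of one way to trim the feed down. It uses the standard library's `xml.etree.ElementTree` instead of BeautifulSoup and assumes the feed is a regular Atom document (namespace `http://www.w3.org/2005/Atom`). The helper name `newest_videos` and the two-entry `SAMPLE_FEED` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by <feed>, <entry>, <title>, <link> and <published>
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Tiny made-up feed, used here in place of a real download
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Older video</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=aaaaaaaaaaa"/>
    <published>2021-01-01T00:00:00+00:00</published>
  </entry>
  <entry>
    <title>Newer video</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=bbbbbbbbbbb"/>
    <published>2021-02-01T00:00:00+00:00</published>
  </entry>
</feed>"""


def newest_videos(feed_xml, limit=5):
    """Return up to `limit` entries as dicts, newest first (hypothetical helper)."""
    root = ET.fromstring(feed_xml)
    videos = []
    for entry in root.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)
        videos.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM),
            "url": link.get("href") if link is not None else None,
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
        })
    # Assumes all timestamps share one ISO 8601 format, so string order is date order
    videos.sort(key=lambda video: video["published"], reverse=True)
    return videos[:limit]


for video in newest_videos(SAMPLE_FEED):
    print(video["published"], video["title"], video["url"])
```

With a real feed you would pass `requests.get(feed_url).content` instead of `SAMPLE_FEED`.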
With requests-html, you can do it like this:
Code (very basic, just to give the idea)
from requests_html import HTMLSession
session = HTMLSession()
url = "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
response = session.get(url)
# Render the JavaScript-driven page in a headless browser and scroll down twice
response.html.render(sleep=1, keep_page=True, scrolldown=2)
for links in response.html.find('a#video-title'):
    link = next(iter(links.absolute_links))
    print(link)
Output:
https://www.youtube.com/watch?v=OUnxJk3Bphk
https://www.youtube.com/watch?v=vWvtt1ESNeY
https://www.youtube.com/watch?v=b8OIZu5y_Ak
https://www.youtube.com/watch?v=xp3fHaT2_VE
https://www.youtube.com/watch?v=e9toQAcjOrw
https://www.youtube.com/watch?v=em0Is0nyaXA
https://www.youtube.com/watch?v=N5JVTUAGmAM
https://www.youtube.com/watch?v=a0hQG-UdhYc
https://www.youtube.com/watch?v=SmQFxQ1fa2o
https://www.youtube.com/watch?v=uuMS1FYLgWQ
https://www.youtube.com/watch?v=8WJ-zSE32ZY
https://www.youtube.com/watch?v=c5MtH-xDspg
https://www.youtube.com/watch?v=5Xktqz6VUTU
https://www.youtube.com/watch?v=Wbo6j_iq2XY
https://www.youtube.com/watch?v=8eu9nliySO4
https://www.youtube.com/watch?v=j28PjOy_uk8
https://www.youtube.com/watch?v=fM2Ordt8Q9E
https://www.youtube.com/watch?v=tFSKaiVyNno
https://www.youtube.com/watch?v=1hDXlc2C3Rw
https://www.youtube.com/watch?v=vH9_Eo7VW3c
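This can also be done with a regex, without a headless browser.
You need to reach the var ytInitialData element and then "commandMetadata", where you will find the video URL: {"url":"/watch?v=Ae2TRkpjRCc",....
Here is a starting point on regex101 that captures everything inside var ytInitialData.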
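A minimal sketch of that idea follows. It assumes the page embeds the data as `var ytInitialData = {...};` immediately before a closing `</script>` tag and that the object is valid JSON. Both are assumptions about YouTube's markup, which can change at any time. The function name `extract_watch_urls`, the `SAMPLE_PAGE` string and the `webCommandMetadata` key inside it are illustrative only.

```python
import json
import re

# Made-up page fragment standing in for the downloaded HTML
SAMPLE_PAGE = (
    '<script>var ytInitialData = {"contents": [{"commandMetadata": '
    '{"webCommandMetadata": {"url": "/watch?v=Ae2TRkpjRCc"}}}]};</script>'
)


def extract_watch_urls(page_html):
    """Collect every "/watch?v=..." URL stored under a "url" key in ytInitialData."""
    match = re.search(
        r"var ytInitialData\s*=\s*(\{.*?\});\s*</script>", page_html, re.DOTALL
    )
    if match is None:
        return []
    data = json.loads(match.group(1))
    found = []

    def walk(node):
        # Recursively visit nested dicts and lists
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "url" and isinstance(value, str) and value.startswith("/watch?v="):
                    if value not in found:  # keep first occurrence, drop duplicates
                        found.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(data)
    return ["https://www.youtube.com" + path for path in found]


print(extract_watch_urls(SAMPLE_PAGE))
```

Walking the whole structure avoids hard-coding the path down to "commandMetadata", at the cost of also picking up any other "/watch?v=" URL stored under a "url" key.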
Alternatively, you can use the YouTube Search Engine Results API from SerpApi. Check out the Playground.
Code to integrate:
from serpapi import GoogleSearch
params = {
    "engine": "youtube",
    "search_query": "programming",
    "sp": "CAISBAgBEAE%253D",
    "api_key": "your_secret_api_key",
}
search = GoogleSearch(params)
results = search.get_dict()
# Print the title and link of every video in the results
for link in results['video_results']:
    print(f"title: {link['title']}\nLink: {link['link']}\n")
Output:
title: CLASS VIII BASIC HTML TAGS and programmiNG 15 4 101`
Link: https://www.youtube.com/watch?v=KIPP63tXKpU
title: For loop in c programming #bssdlectureclasses
Link: https://www.youtube.com/watch?v=nfRN0x9VvQc
title: [C#] Programming NatsukiBot
Link: https://www.youtube.com/watch?v=chnigx-ezwg
title: CS201 Short Lecture - 03 | VU Short Lecture | Introduction to Programming in (Urdu / Hindi)
Link: https://www.youtube.com/watch?v=qoxXJchd7N4
title: Programming in C Language - While statement
Link: https://www.youtube.com/watch?v=cl0OpNCdF5I
title: Introduction to html and Basic programming
Link: https://www.youtube.com/watch?v=A4We3NGqxuA
title: Use of Printf & Scanf functions | Part 7 | C Programming | PadhoChalo
Link: https://www.youtube.com/watch?v=578xS-Ugc2c
title: C++ course has started | Computer Programming | Aashu |
Link: https://www.youtube.com/watch?v=SjFgTK2HqbE
title: Mitsubishi Outlander 2008 prox/twist transponder key programming tip
Link: https://www.youtube.com/watch?v=HlSJcBwxKFQ
title: Computer Programming 1 -Introduction to the course
Link: https://www.youtube.com/watch?v=xdmPbhTT01g
title: Programming,Data Structures and Algorithms in Python
Link: https://www.youtube.com/watch?v=0fUddu9cdAU
Disclaimer: I work for SerpApi.