How to extract video titles using the YouTube API (Python)

I'm building a small Python application for myself that pulls information about YouTube videos, and I'm using the YouTube API for it.

I recently watched this video, which helped me get the comments and their replies from a YouTube video and export them to an Excel file, and that all works fine. The problem I'm running into now is that I want to extract the title of a YouTube video, and I can't seem to get my code to do it.

As mentioned, I watched the linked video, and I also left a comment asking the channel's author for help, but unfortunately I didn't get a reply.

I also searched YouTube for other videos covering my problem, but couldn't find anything useful.

Besides asking there and looking for other videos, I also tried reading the documentation and working out the code that way, which didn't work either. Below is the code I'm using:

# CODE FROM HERE: https://github.com/analyticswithadam/Python/blob/main/Pull_all_Comments_and_Replies_for_YouTube_Playlists.ipynb

from googleapiclient.discovery import build
import pandas as pd
import getpass

api_key = "API key here"
playlist_ids = ['Youtube Playlist Link Here']

# Build the YouTube client
youtube = build('youtube', 'v3', developerKey=api_key)

def get_all_video_ids_from_playlists(youtube, playlist_ids):
    all_videos = []  # Initialize a single list to hold all video IDs

    for playlist_id in playlist_ids:
        next_page_token = None

        # Fetch videos from the current playlist
        while True:
            playlist_request = youtube.playlistItems().list(
                part='contentDetails',
                playlistId=playlist_id,
                maxResults=50,
                pageToken=next_page_token)
            playlist_response = playlist_request.execute()

            all_videos += [item['contentDetails']['videoId'] for item in playlist_response['items']]

            next_page_token = playlist_response.get('nextPageToken')

            if next_page_token is None:
                break

    return all_videos

# Fetch all video IDs from the specified playlists
video_ids = get_all_video_ids_from_playlists(youtube, playlist_ids)

# Now you can pass video_ids to the next function
# next_function(video_ids)

'''
# Broken-ass title code

def get_vid_title(youtube, video_id):  # Added video_id as an argument
    all_titles = []
    next_page_token = None

    while True:
        title_request = youtube.channels().list(
            part = "snippet",
            videoId=video_id,
            pageToken=next_page_token,
            textFormat = "plainText",
            maxResults=100
        )
        title_response = title_request.execute()

        for item in title_response['items']:
            vid_title = item['snippet']['title']
            all_titles.append({
                'Title': vid_title['title']
            })
            print(vid_title['title'])

    return all_titles 
'''

# Function to get replies for a specific comment
def get_replies(youtube, parent_id, video_id):  # Added video_id as an argument
    replies = []
    next_page_token = None

    while True:
        reply_request = youtube.comments().list(
            part = "snippet",
            parentId=parent_id,
            textFormat = "plainText",
            maxResults=100,
            pageToken=next_page_token
        )
        reply_response = reply_request.execute()

        for item in reply_response['items']:
            comment = item['snippet']
            replies.append({
                'Timestamp': comment['publishedAt'],
                'Username': comment['authorDisplayName'],
                'VideoID': video_id,
                'Comment': comment['textDisplay'],
                'likeCount': comment['likeCount'],
                'Date': comment['updatedAt'] if 'updatedAt' in comment else comment['publishedAt']
            })

        next_page_token = reply_response.get('nextPageToken')
        if not next_page_token:
            break

    return replies


# Function to get all comments (including replies) for a single video
def get_comments_for_video(youtube, video_id):
    all_comments = []
    next_page_token = None

    while True:
        comment_request = youtube.commentThreads().list(
            part = "snippet",
            videoId=video_id,
            pageToken=next_page_token,
            textFormat = "plainText",
            maxResults=100
        )
        comment_response = comment_request.execute()

        for item in comment_response['items']:
            top_comment = item['snippet']['topLevelComment']['snippet']
            all_comments.append({
                'Timestamp': top_comment['publishedAt'],
                'Username': top_comment['authorDisplayName'],
                'VideoID': video_id,  # Directly using video_id from function parameter
                'Comment': top_comment['textDisplay'],
                'likeCount': top_comment['likeCount'],
                'Date': top_comment['updatedAt'] if 'updatedAt' in top_comment else top_comment['publishedAt']
            })

            # Fetch replies if there are any
            if item['snippet']['totalReplyCount'] > 0:
                all_comments.extend(get_replies(youtube, item['snippet']['topLevelComment']['id'], video_id))

        next_page_token = comment_response.get('nextPageToken')
        if not next_page_token:
            break

    return all_comments

# List to hold all comments from all videos
all_comments = []


for video_id in video_ids:
    video_comments = get_comments_for_video(youtube, video_id)
    all_comments.extend(video_comments)

# Create DataFrame
comments_df = pd.DataFrame(all_comments)


# Export whole dataset to the local machine as CSV File
csv_file = 'comments_data.csv'  # Name your file
comments_df.to_csv(csv_file, index=False)

Any help would be greatly appreciated.

Edit

Thanks to user Mauricio Arias Olave, my code now works exactly the way I want.

Here is the finished code for anyone interested:

# CODE FROM HERE: https://github.com/analyticswithadam/Python/blob/main/Pull_all_Comments_and_Replies_for_YouTube_Playlists.ipynb

from googleapiclient.discovery import build
import pandas as pd
import getpass

api_key = "API Key Here"
playlist_ids = ['Youtube playlist here']

# Build the YouTube client
youtube = build('youtube', 'v3', developerKey=api_key)

def get_all_video_ids_from_playlists(youtube, playlist_ids):
    all_videos = []  # Initialize a single list to hold all video IDs

    for playlist_id in playlist_ids:
        next_page_token = None

        # Fetch videos from the current playlist
        while True:
            playlist_request = youtube.playlistItems().list(
                part='snippet,contentDetails',
                playlistId=playlist_id,
                maxResults=50,
                pageToken=next_page_token)
            playlist_response = playlist_request.execute()

            #all_videos += [item['contentDetails']['videoId'] for item in playlist_response['items']]

            for pl_item in playlist_response["items"]:
                all_videos.append({"Video_ID": pl_item['contentDetails']['videoId'], "Title" : pl_item['snippet']['title']})

            next_page_token = playlist_response.get('nextPageToken')

            if next_page_token is None:
                break

    return all_videos

# Fetch all video IDs from the specified playlists
video_ids = get_all_video_ids_from_playlists(youtube, playlist_ids)

# Function to get replies for a specific comment
def get_replies(youtube, parent_id, video_id):  # Added video_id as an argument
    replies = []
    next_page_token = None

    while True:
        reply_request = youtube.comments().list(
            part = "snippet",
            parentId=parent_id,
            textFormat = "plainText",
            maxResults=100,
            pageToken=next_page_token
        )
        reply_response = reply_request.execute()

        for item in reply_response['items']:
            comment = item['snippet']
            replies.append({
                'Timestamp': comment['publishedAt'],
                'Username': comment['authorDisplayName'],
                'VideoID': video_id,
                'Comment': comment['textDisplay'],
                'likeCount': comment['likeCount'],
                'Date': comment['updatedAt'] if 'updatedAt' in comment else comment['publishedAt']
            })

        next_page_token = reply_response.get('nextPageToken')
        if not next_page_token:
            break

    return replies


# Function to get all comments (including replies) for a single video
def get_comments_for_video(youtube, video_id):
    all_comments = []
    next_page_token = None

    while True:
        comment_request = youtube.commentThreads().list(
            part = "snippet",
            videoId=video_id["Video_ID"],
            pageToken=next_page_token,
            textFormat = "plainText",
            maxResults=100
        )
        comment_response = comment_request.execute()

        for item in comment_response['items']:
            top_comment = item['snippet']['topLevelComment']['snippet']
            all_comments.append({
                'Timestamp': top_comment['publishedAt'],
                'Username': top_comment['authorDisplayName'],
                'VideoID': video_id["Video_ID"],  # Directly using video_id from function parameter
                'Title': video_id["Title"], # The title of the video.
                'Comment': top_comment['textDisplay'],
                'likeCount': top_comment['likeCount'],
                'Date': top_comment['updatedAt'] if 'updatedAt' in top_comment else top_comment['publishedAt']
            })

            # Fetch replies if there are any
            if item['snippet']['totalReplyCount'] > 0:
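                # Pass the plain video ID string so replies get the same 'VideoID' value as top-level comments
                all_comments.extend(get_replies(youtube, item['snippet']['topLevelComment']['id'], video_id["Video_ID"]))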

        next_page_token = comment_response.get('nextPageToken')
        if not next_page_token:
            break

    return all_comments

# List to hold all comments from all videos
all_comments = []


for video_id in video_ids:
    video_comments = get_comments_for_video(youtube, video_id)
    all_comments.extend(video_comments)

# Create DataFrame
comments_df = pd.DataFrame(all_comments)


# Export whole dataset to the local machine as CSV File
csv_file = 'comments_data.csv'  # Name your file
comments_df.to_csv(csv_file, index=False)

Did you use print(title_response) or print(item) to see what you get back from the server?

— 28.06.2024 12:42

I think I used print(item)

— 28.06.2024 13:08

python python-3.x web-scraping youtube-api youtube-data-api

28.06.2024 12:12


1 Answer

Accepted answer

In get_all_video_ids_from_playlists, use:

part='snippet,contentDetails'

And this line:

all_videos += [item['contentDetails']['videoId'] for item in playlist_response['items']]

Change it to the following:

for pl_item in playlist_response["items"]:
    all_videos.append({"Video_ID": pl_item['contentDetails']['videoId'], "Title": pl_item['snippet']['title']})

Then, in this loop:

for video_id in video_ids:
    video_comments = get_comments_for_video(youtube, video_id)

You are now passing the whole item, i.e. {"Video_ID": "xxxx", "Title": "xxxx"}.
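To see what that loop produces without calling the API, here is a minimal sketch that runs the same extraction on a hand-written dictionary. The helper name extract_ids_and_titles and the sample IDs and titles are made up for illustration; the sample is only assumed to mirror the shape of one playlistItems page requested with part='snippet,contentDetails'.

```python
def extract_ids_and_titles(playlist_response):
    """Return one {"Video_ID", "Title"} dict per item in a playlistItems page."""
    videos = []
    for pl_item in playlist_response.get("items", []):
        videos.append({
            "Video_ID": pl_item["contentDetails"]["videoId"],
            "Title": pl_item["snippet"]["title"],
        })
    return videos


# Hand-written sample shaped like one page of a playlistItems().list response
# (hypothetical IDs and titles, not real API output).
sample_page = {
    "items": [
        {"contentDetails": {"videoId": "abc123"}, "snippet": {"title": "First video"}},
        {"contentDetails": {"videoId": "def456"}, "snippet": {"title": "Second video"}},
    ]
}

videos = extract_ids_and_titles(sample_page)
for video in videos:
    print(video["Video_ID"], video["Title"])
```

Each element of the resulting list is what the for video_id in video_ids loop hands to get_comments_for_video.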

Finally, in your get_comments_for_video function:

# Function to get all comments (including replies) for a single video
def get_comments_for_video(youtube, video_id):

Use:

videoId=video_id["Video_ID"]

and:

'VideoID': video_id["Video_ID"],  # Directly using video_id from function parameter
'Title': video_id["Title"], # The title of the video.
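As a side note, the commented-out attempt in the question called youtube.channels().list with videoId, but video metadata lives on the videos resource. If you only have a plain list of video IDs rather than a playlist, a sketch along these lines may work. The function name get_video_titles and the batch_size parameter are my own; the batch size of 50 reflects my understanding of the per-request limit for the id parameter, so treat it as an assumption.

```python
def get_video_titles(youtube, video_ids, batch_size=50):
    """Map each video ID to its title using youtube.videos().list.

    `youtube` is the client returned by build('youtube', 'v3', developerKey=...).
    IDs the API does not return (for example removed videos) are simply absent
    from the result.
    """
    titles = {}
    for start in range(0, len(video_ids), batch_size):
        batch = video_ids[start:start + batch_size]
        response = youtube.videos().list(
            part="snippet",
            id=",".join(batch),  # comma-separated video IDs in one request
        ).execute()
        for item in response.get("items", []):
            titles[item["id"]] = item["snippet"]["title"]
    return titles
```

With the original version of get_all_video_ids_from_playlists (the one returning plain ID strings), this would be called as titles = get_video_titles(youtube, video_ids), and titles[video_id] could then be written into each comment row.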

Dude, you're a hero, it works flawlessly, thank you. I'll update my post and add the full code so everyone else can see it.

— 28.06.2024 20:58

28.06.2024 16:34
