How to get all comments in youtube with selenium?


You are getting only a limited number of comments because YouTube loads more comments as you keep scrolling down. There are around 394 comments left on that video. You first have to make sure all the comments are loaded, and then expand every View Replies button so that you reach the maximum comment count.

Note: I was able to get 700 comments using the lines of code below.

    import time

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # assumes `driver` already has the video page open
    LAST_COMMENT = "(//*[@id='content-text'])[last()]"
    SPINNER = (By.CSS_SELECTOR, "div.active.style-scope.paper-spinner")

    # get the last comment
    lastEle = driver.find_element(By.XPATH, LAST_COMMENT)
    # scroll to the last comment currently loaded
    lastEle.location_once_scrolled_into_view
    # wait until the comments loading is done
    WebDriverWait(driver, 30).until(EC.invisibility_of_element(SPINNER))
    # load all comments: keep scrolling while a new last comment keeps appearing
    while lastEle != driver.find_element(By.XPATH, LAST_COMMENT):
        lastEle = driver.find_element(By.XPATH, LAST_COMMENT)
        lastEle.location_once_scrolled_into_view
        time.sleep(2)
        WebDriverWait(driver, 30).until(EC.invisibility_of_element(SPINNER))
    # open all replies
    for reply in driver.find_elements(By.XPATH, "//*[@id='replies']//paper-button[@class='style-scope ytd-button-renderer'][contains(.,'View')]"):
        reply.location_once_scrolled_into_view
        driver.execute_script("arguments[0].click()", reply)
    time.sleep(5)
    WebDriverWait(driver, 30).until(EC.invisibility_of_element(SPINNER))
    # print the total number of comments
    print(len(driver.find_elements(By.XPATH, "//*[@id='content-text']")))
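The "keep scrolling until nothing new shows up" idea in the snippet above can be separated from the Selenium calls. The following is only a minimal sketch of that idea, not part of the original answer: the names `scroll_until_stable`, `count_items`, `scroll_once` and `max_stalls` are hypothetical, and the two callables are assumed to wrap whatever driver calls you use to count the loaded comments and to scroll.

```python
import time


def scroll_until_stable(count_items, scroll_once, pause=2.0, max_stalls=3, sleep=time.sleep):
    """Scroll repeatedly until the item count stops growing.

    count_items -- callable returning how many items are currently loaded
    scroll_once -- callable that scrolls the page one step
    pause       -- seconds to wait after each scroll so new items can load
    max_stalls  -- consecutive scrolls without growth tolerated before stopping
    sleep       -- injectable sleep function (handy for testing)
    """
    stalls = 0
    last = count_items()
    while stalls < max_stalls:
        scroll_once()
        sleep(pause)
        current = count_items()
        if current > last:
            # new items arrived: remember the count and reset the stall counter
            last = current
            stalls = 0
        else:
            # nothing new after this scroll
            stalls += 1
    return last
```

With Selenium, `count_items` could be something like `lambda: len(driver.find_elements(By.XPATH, "//*[@id='content-text']"))` and `scroll_once` could be `lambda: driver.execute_script("window.scrollBy(0,800)")`. Tolerating a few stalled scrolls instead of stopping at the first one is a design choice meant to survive slow network responses; the right values for `pause` and `max_stalls` depend on the page and the connection.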


There are a couple of things:

  • The WebElements within the website https://www.youtube.com/ are dynamic, and so are the comments: they are rendered dynamically.
  • Within the webpage https://www.youtube.com/watch?v=N0lxfilGfak the comments don't render unless the user scrolls the following element into the viewport:

    [Image: edureka]

  • The comments are within:

    <!--css-build:shady-->

    This means the Polymer CSS Builder is used to apply Polymer's CSS Mixin shim and ShadyDOM scoping, so some runtime work is still done to convert CSS selectors under the default settings.


Considering the above-mentioned factors, here's a solution to retrieve all the comments:

Code Block:

    import time

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    # executable_path is the Selenium 3 form; newer Selenium 4 releases take the driver path through a Service object instead
    driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://www.youtube.com/watch?v=N0lxfilGfak')
    driver.execute_script("return scrollBy(0, 400);")
    # bring the Subscribe button into view so that the comments section starts rendering
    subscribe = WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//yt-formatted-string[text()='Subscribe']")))
    driver.execute_script("arguments[0].scrollIntoView(true);", subscribe)
    comment_xpath = "//yt-formatted-string[@class='style-scope ytd-comment-renderer' and @id='content-text'][@slot='content']"
    comments = []
    my_length = len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, comment_xpath))))
    while True:
        try:
            driver.execute_script("window.scrollBy(0,800)")
            time.sleep(5)
            # re-read every loaded comment; replacing the list avoids the duplicates that append() would accumulate
            comments = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, comment_xpath)))]
            at_bottom = driver.execute_script("return window.innerHeight + window.pageYOffset >= document.documentElement.scrollHeight - 1;")
            # stop once the bottom of the page is reached and no new comments were loaded
            if at_bottom and len(comments) == my_length:
                break
            my_length = len(comments)
        except TimeoutException:
            break
    driver.quit()
    print(comments)


If you don't have to use Selenium, I would recommend looking at the YouTube Data API.

https://developers.google.com/youtube/v3/getting-started

Example:

https://www.googleapis.com/youtube/v3/commentThreads?key=YourAPIKey&textFormat=plainText&part=snippet&videoId=N0lxfilGfak&maxResults=100

This returns the first 100 results together with a nextPageToken. Pass that token as the pageToken parameter of the next request to get the next 100 results, and repeat until the response no longer contains a nextPageToken.
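The paging described above could be sketched in Python roughly as follows. This is a minimal, hedged example using only the standard library, not an official client: the helper names `fetch_json`, `build_url` and `iter_top_level_comments` are made up for illustration, and the code assumes each returned item carries its text under `snippet.topLevelComment.snippet.textDisplay`. Replies are not collected in this sketch.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def fetch_json(url):
    """Download a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def build_url(video_id, api_key, page_token=None):
    """Build the commentThreads request URL, adding pageToken when given."""
    params = {
        "key": api_key,
        "textFormat": "plainText",
        "part": "snippet",
        "videoId": video_id,
        "maxResults": 100,
    }
    if page_token:
        params["pageToken"] = page_token
    return API_URL + "?" + urllib.parse.urlencode(params)


def iter_top_level_comments(video_id, api_key, fetch=fetch_json):
    """Yield the text of every top-level comment, following nextPageToken."""
    page_token = None
    while True:
        data = fetch(build_url(video_id, api_key, page_token))
        for item in data.get("items", []):
            yield item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        # the token is absent on the last page, which ends the loop
        page_token = data.get("nextPageToken")
        if not page_token:
            break
```

A call such as `list(iter_top_level_comments("N0lxfilGfak", "YourAPIKey"))` would then collect the comments page by page. The `fetch` parameter exists only so that the HTTP call can be swapped out, for example in tests.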