Python Selenium - get href value


Use driver.find_elements when more than one element can match; it returns a list. In the CSS selector, make sure you target the descendants of those classes that actually carry an href attribute:

elems = driver.find_elements(By.CSS_SELECTOR, ".sc-eYdvao.kvdWiq [href]")
links = [elem.get_attribute('href') for elem in elems]

You might also need a wait condition for the presence of all elements located by the CSS selector:

elems = WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".sc-eYdvao.kvdWiq [href]")))


As per the given HTML:

<p class="sc-eYdvao kvdWiq">    <a href="https://www.iproperty.com.my/property/setia-eco-park/sale-1653165/">Shah Alam Setia Eco Park, Setia Eco Park</a></p>

Because the href attribute belongs to the <a> tag, the locator has to go one level deeper, down to the <a> node. To extract the value of the href attribute you can use either of the following locator strategies:

  • Using css_selector:

    print(driver.find_element(By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a").get_attribute('href'))
  • Using xpath:

    print(driver.find_element(By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a").get_attribute('href'))

If you want to extract all the values of the href attribute, use find_elements instead of find_element:

  • Using css_selector:

    print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a")])
  • Using xpath:

    print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a")])

Dynamic elements

However, the class attribute values sc-eYdvao and kvdWiq look dynamically generated, which usually means the content is rendered after the initial page load. So to extract the href attribute you should use WebDriverWait with visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a"))).get_attribute('href'))
  • Using XPATH:

    print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a"))).get_attribute('href'))

If you want to extract all the values of the href attribute you can use visibility_of_all_elements_located() instead:

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a")))])
  • Using XPATH:

    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


The XPath expression

//p[@class='sc-eYdvao kvdWiq']/a

returns the elements you are looking for.

Writing the data to a CSV file is unrelated to the scraping problem itself. Look at a few examples of Python's csv module and you will be able to do it.
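For completeness, here is a minimal sketch of the CSV step using only the standard library. It assumes links is the list of href strings collected earlier; the function name write_links_to_csv, the href column header and the output file name are illustrative choices, not something the question specifies:

```python
import csv


def write_links_to_csv(links, path):
    """Write one link per row under a single 'href' header column."""
    # newline="" prevents the csv module from writing blank lines on Windows
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["href"])
        writer.writerows([link] for link in links)
```

It would be called as write_links_to_csv(links, "links.csv").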