Python Selenium - get href value

python selenium xpath css-selectors webdriverwait

Use driver.find_elements if you expect more than one element; it returns a list. In the CSS selector, make sure you target descendants of those classes that actually have an href attribute:

# Selenium 4 style; the find_elements_by_* helpers were removed in Selenium 4.3
elems = driver.find_elements(By.CSS_SELECTOR, ".sc-eYdvao.kvdWiq [href]")
links = [elem.get_attribute('href') for elem in elems]

You might also need a wait condition for the presence of all elements matching the CSS selector:

elems = WebDriverWait(driver,10).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".sc-eYdvao.kvdWiq [href]")))

As per the given HTML:

<p class="sc-eYdvao kvdWiq">
    <a href="https://www.iproperty.com.my/property/setia-eco-park/sale-1653165/">Shah Alam Setia Eco Park, Setia Eco Park</a>
</p>

The href attribute belongs to the <a> tag, not the <p>, so the locator has to go one level deeper, down to the <a> node. To extract the value of the href attribute you can use either of the following locator strategies:

Using css_selector:

print(driver.find_element(By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a").get_attribute('href'))

Using xpath:

print(driver.find_element(By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a").get_attribute('href'))

If you want to extract all the values of the href attribute you need to use find_elements* instead:

Using css_selector:

print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a")])

Using xpath:

print([my_elem.get_attribute("href") for my_elem in driver.find_elements(By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a")])
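Once find_elements has returned the list, the href values sometimes need light cleanup: get_attribute('href') returns None for an element that lacks the attribute, and the same link can appear more than once on a listing page. A minimal sketch, assuming a hypothetical helper named unique_hrefs (not part of Selenium) that accepts any objects exposing get_attribute:

```python
def unique_hrefs(elements):
    """Return href values in first-seen order, skipping missing ones and duplicates.

    `elements` is assumed to be the list returned by find_elements; any
    object with a get_attribute(name) method works. The helper name is
    hypothetical and only used for illustration.
    """
    seen = set()
    links = []
    for elem in elements:
        href = elem.get_attribute("href")
        # Skip elements without an href and links that were already collected
        if href and href not in seen:
            seen.add(href)
            links.append(href)
    return links
```

Pass it the list produced by either of the find_elements calls above.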

Dynamic elements

However, the class attribute values sc-eYdvao and kvdWiq look dynamically generated, which suggests the elements are rendered by JavaScript and may not exist yet when the page first loads. So to extract the href attribute, use WebDriverWait with visibility_of_element_located() and either of the following locator strategies:

Using CSS_SELECTOR:

print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a"))).get_attribute('href'))

Using XPATH:

print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a"))).get_attribute('href'))

If you want to extract all the values of the href attribute you can use visibility_of_all_elements_located() instead:

Using CSS_SELECTOR:

print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "p.sc-eYdvao.kvdWiq > a")))])

Using XPATH:

print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//p[@class='sc-eYdvao kvdWiq']/a")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
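For intuition about what the waits above do: WebDriverWait(...).until(...) keeps re-evaluating the condition until it returns something truthy or the timeout expires. Below is a simplified pure-Python sketch of that polling idea. The name wait_until is a hypothetical stand-in, not Selenium's implementation, and it raises the built-in TimeoutError where Selenium raises its own TimeoutException:

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out.

    Illustration only: `condition` plays the role of an expected condition
    such as visibility_of_all_elements_located, minus the driver argument.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Hand back whatever the condition produced, e.g. a list of elements
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

The return value is why the Selenium one-liners above can chain .get_attribute('href') or iterate directly over the result of until(...).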

The XPath

//p[@class='sc-eYdvao kvdWiq']/a

returns the elements you are looking for.
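One way to sanity-check this expression without launching a browser is to run it against the HTML fragment from the question. The sketch below uses the standard-library ElementTree parser as a stand-in for the browser; the wrapping <div> is an assumption added so the <p> has a parent to be searched from, and ElementTree only understands a subset of XPath, so this is a rough check rather than a substitute for testing in Selenium:

```python
import xml.etree.ElementTree as ET

# The fragment from the question, wrapped in a <div> (an assumption made here
# so that the <p> is a descendant of the root). The fragment happens to be
# well-formed XML; real pages usually are not.
SNIPPET = (
    '<div>'
    '<p class="sc-eYdvao kvdWiq">'
    '<a href="https://www.iproperty.com.my/property/setia-eco-park/sale-1653165/">'
    'Shah Alam Setia Eco Park, Setia Eco Park</a>'
    '</p>'
    '</div>'
)

root = ET.fromstring(SNIPPET)

# ElementTree needs a leading "." for a search relative to the root;
# the class predicate and the child step are the same as in the XPath above.
anchors = root.findall(".//p[@class='sc-eYdvao kvdWiq']/a")
hrefs = [a.get("href") for a in anchors]
print(hrefs)
```

Note that @class='sc-eYdvao kvdWiq' is an exact string match on the whole attribute, so it stops matching if the page adds a class or changes their order.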

Writing the data to a CSV file is a separate problem from the scraping itself. Once you have the list of links, the usual examples for Python's csv module will get you there.
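As a starting point for the CSV step, here is a minimal sketch using the standard-library csv module. The function name write_links_csv, the single "href" column, and the output file name are all assumptions chosen for illustration; adapt them to whatever other fields you scrape:

```python
import csv

def write_links_csv(links, path):
    """Write one link per row under a single 'href' header column."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["href"])
        # Each row must be a sequence, so wrap every link in a one-item list
        writer.writerows([link] for link in links)
```

With the list of href values in a variable such as links, calling write_links_csv(links, "links.csv") would produce the file.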
