
How to use xpath in chrome headless+puppeteer evaluate()?


$x() is not a standard JavaScript method for selecting elements by XPath; it is only a helper available in Chrome DevTools. The documentation states this explicitly:

Note: This API is only available from within the console itself. You cannot access the Command Line API from scripts on the page.

Code run through page.evaluate() counts as a "script on the page" here, so $x() is not available inside it.

You have two options:

  1. Use document.evaluate

Here is an example of selecting an element (the featured article) inside page.evaluate():

    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

        const text = await page.evaluate(() => {
            // $x() is not a JS standard -
            // this is only sugar syntax in chrome devtools
            // use document.evaluate()
            const featureArticle = document
                .evaluate(
                    '//*[@id="mp-tfa"]',
                    document,
                    null,
                    XPathResult.FIRST_ORDERED_NODE_TYPE,
                    null
                )
                .singleNodeValue;

            return featureArticle.textContent;
        });

        console.log(text);
        await browser.close();
    })();

  2. Select the element with Puppeteer's page.$x() and pass it to page.evaluate()

This example achieves the same result as the first one:

    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto('https://en.wikipedia.org', { waitUntil: 'networkidle2' });

        // await page.$x() returns array of ElementHandle
        // we are only interested in the first element
        const featureArticle = (await page.$x('//*[@id="mp-tfa"]'))[0];
        // the same as:
        // const featureArticle = await page.$('#mp-tfa');

        const text = await page.evaluate(el => {
            // do what you want with featureArticle in page.evaluate
            return el.textContent;
        }, featureArticle);

        console.log(text);
        await browser.close();
    })();

A related question covers how to inject a $x() helper function into your own scripts.
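
As a rough illustration of what such an injected helper might look like, here is a minimal sketch built on document.evaluate(). The name xpathAll and its parameters are hypothetical; it is not part of Puppeteer or DevTools. It collects all matching nodes into an array, roughly what the DevTools-only $x() returns.

```javascript
// Hypothetical helper (the name xpathAll is an assumption for this sketch).
// Returns every node matching an XPath expression as a plain array.
// It must run in the page context, where document and XPathResult exist.
function xpathAll(expression, context = document) {
    // A Document has ownerDocument === null, so fall back to the context itself.
    const doc = context.ownerDocument || context;
    const snapshot = doc.evaluate(
        expression,
        context,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    );
    const nodes = [];
    for (let i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}
```

Because functions passed to page.evaluate() are serialized, a helper defined in Node is not visible in the page. One way to make it available, assuming the page has already loaded, is await page.addScriptTag({ content: xpathAll.toString() }); after that, xpathAll('//*[@id="mp-tfa"]') can be called inside page.evaluate().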


If you insist on using page.$x(), you can simply pass the result to page.evaluate():

    const example = await page.evaluate(element => {
        return element.textContent;
    }, (await page.$x('//*[@id="result"]'))[0]);
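
One caveat: as far as I know, page.$x() was deprecated and later removed in recent Puppeteer releases (v22 and up), with the xpath/ selector prefix on page.$$() as the replacement. Treat that version boundary as an assumption and check the changelog for your installed version. The sketch below uses a hypothetical helper named textByXPath that picks whichever method exists and returns null when nothing matches, instead of passing undefined to page.evaluate().

```javascript
// Hypothetical helper for this sketch: `page` is a Puppeteer Page,
// `xpath` is an XPath expression string.
async function textByXPath(page, xpath) {
    const handles = typeof page.$x === 'function'
        ? await page.$x(xpath)               // older Puppeteer versions
        : await page.$$(`xpath/${xpath}`);   // newer versions: XPath via selector prefix

    // No match: return null rather than handing undefined to page.evaluate().
    if (handles.length === 0) {
        return null;
    }
    return page.evaluate(element => element.textContent, handles[0]);
}
```

Usage would look like const example = await textByXPath(page, '//*[@id="result"]'); inside an async function.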