
Google Crawler in Search Console can't find routes in a React app hosted on GitHub Pages


I poked around in your source code and don't see anything alarming; however, I found a few posts about similar issues (1) (2). The second seems particularly helpful, so I'll repeat it here. Shout out to @Zerotorescue on Reddit.

Open Google Search Console and go to Crawl -> Fetch as Google and do a fetch and render.

Add this to your site, either as part of a <script> tag in your HTML file or as part of the bundle:

https://gist.github.com/mstijak/715fa2dd3f495a98386c3ebbadbabb8c

I recommend the former since that makes it easier to change if you need to make it more readable (no need to recompile your app).
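In case the gist is unavailable, here is a rough sketch of the kind of snippet this step calls for. It is not a copy of the gist; it assumes the idea is to hook window.onerror and print the error on the page itself, and the helper names formatError and showErrorOnPage are made up for illustration. It is written in ES5 on purpose so that the crawler can run it even when it chokes on the rest of the bundle.

```javascript
// Hypothetical helper: turn the arguments of window.onerror into readable text.
function formatError(message, source, lineno, colno, error) {
  var where = (source || 'unknown source') + ':' + lineno + ':' + colno;
  var stack = error && error.stack ? '\n' + error.stack : '';
  return message + ' (' + where + ')' + stack;
}

// Hypothetical helper: print the text in large red letters at the top of the
// page so it stays legible in the low-resolution Search Console screenshot.
function showErrorOnPage(text) {
  var parent = document.body || document.documentElement;
  var box = document.createElement('pre');
  box.style.cssText =
    'color:red;font-size:24px;white-space:pre-wrap;margin:0;padding:8px';
  box.textContent = text;
  parent.insertBefore(box, parent.firstChild);
}

// Install the handler only in a browser; the guard keeps the file loadable
// in other environments (for example when testing formatError in Node).
if (typeof window !== 'undefined') {
  window.onerror = function (message, source, lineno, colno, error) {
    showErrorOnPage(formatError(message, source, lineno, colno, error));
    return false; // let the browser's default error reporting run as well
  };
}
```

Place it before your app's bundle so the handler is already installed when the bundle fails. If the text is too small in the rendered screenshot, bump the font-size value and fetch again.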

Push this to your site and then do another fetch and render. The error preventing Google from running your app will now show. The Search Console resolution is pretty low, so you may have to increase the font-size of the error and fetch again. Don't worry, Google doesn't mind repeated calls.

You'll probably find that Google's crawler can't process your code because you're using some ES6 feature it doesn't support. You can fix this by polyfilling. I've tried a couple of things such as https://polyfill.io/ which turned out to not really support Googlebot and while it might sometimes work, it is pretty unreliable. Instead I recommend using babel-polyfill. It will increase your bundle size a little bit for everyone but in my experience it provides the widest browser support with a minimal headache. Just turn it on and you're done.

If you're using create-react-app this is the polyfills.js file I use that you could copy:

https://github.com/WoWAnalyzer/WoWAnalyzer/blob/2c67a970f8bd9026fa816d31201c42eb860fe2a3/config/polyfills.js#L1

Notice there are a lot of comments explaining all the issues the polyfill service introduces, which you won't have to deal with if you use babel-polyfill.
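If you want to narrow down which built-in is missing before reaching for a polyfill, a small feature check can help. The sketch below is only an illustration: the function name findMissingFeatures is hypothetical, and the list of checks is my own guess at commonly used ES6-era built-ins, not a record of what Googlebot actually lacks. It only covers runtime built-ins, which is the part a polyfill fills in; new syntax is a job for the Babel compile step.

```javascript
// Returns the names of some ES2015+ built-ins that are absent from `g`
// (pass `window` in the browser). The list is an illustrative assumption.
function findMissingFeatures(g) {
  function hasFn(owner, name) {
    return !!owner && typeof owner[name] === 'function';
  }
  var checks = {
    'Promise': function () { return hasFn(g, 'Promise'); },
    'Map': function () { return hasFn(g, 'Map'); },
    'Set': function () { return hasFn(g, 'Set'); },
    'Symbol': function () { return hasFn(g, 'Symbol'); },
    'Object.assign': function () { return hasFn(g.Object, 'assign'); },
    'Array.from': function () { return hasFn(g.Array, 'from'); },
    'Array.prototype.includes': function () {
      return !!g.Array && hasFn(g.Array.prototype, 'includes');
    },
    'String.prototype.startsWith': function () {
      return !!g.String && hasFn(g.String.prototype, 'startsWith');
    }
  };
  return Object.keys(checks).filter(function (name) {
    return !checks[name]();
  });
}
```

Calling findMissingFeatures(window) before the bundle loads and writing the result into the page gives you a list you can read in the rendered screenshot.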


I found that when I opened https://huynhsamha.github.io/crypto/algorithm/sha256, I actually received a 404 as the response. I think your workaround for hosting an SPA on GitHub Pages using 404.html is the issue here. While we humans see your app served correctly in the browser, Googlebot doesn't care; it just looks at the response code and sees that it has received a 404. You'll need a different workaround that doesn't involve using 404.html directly as the entry point to your app.
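You can check the status code yourself without opening the browser's network tab. The following is a minimal sketch, assuming an environment with a global fetch (a modern browser console or a recent Node); the names describeStatus and checkRoute are hypothetical, and the optional fetchImpl parameter exists only so a stand-in can be passed where no global fetch is available.

```javascript
// Classify a status code the way the paragraph above describes it:
// only a 2xx response counts as a page that is really there.
function describeStatus(status) {
  if (status >= 200 && status < 300) return 'ok';
  if (status === 404) return 'not found';
  return 'other';
}

// Request a route and report the raw status code, regardless of what the
// page body would render in a browser.
async function checkRoute(url, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  var response = await doFetch(url);
  return {
    url: url,
    status: response.status,
    verdict: describeStatus(response.status)
  };
}
```

For example, checkRoute('https://huynhsamha.github.io/crypto/algorithm/sha256').then(console.log) prints the status the server sends for that deep link, which is the number a crawler sees.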

Try the workaround by rafrex instead. It uses 404.html to redirect the browser to index.html while keeping the original route, and it claims that Googlebot registers this as a 301 instead of a 404. In your case that means making the changes below to your site; pay attention to the script under the <!-- ------Single Page Apps GitHub Pages Workaround------ --> comment in each file:

<!-- 404.html -->
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Cryptography</title>

    <!-- ------Single Page Apps GitHub Pages Workaround------ -->
    <script type="text/javascript">
      // Single Page Apps for GitHub Pages
      // https://github.com/rafrex/spa-github-pages
      // Copyright (c) 2016 Rafael Pedicini, licensed under the MIT License
      // ----------------------------------------------------------------------
      // This script takes the current url and converts the path and query
      // string into just a query string, and then redirects the browser
      // to the new url with only a query string and hash fragment,
      // e.g. http://www.foo.tld/one/two?a=b&c=d#qwe, becomes
      // http://www.foo.tld/?p=/one/two&q=a=b~and~c=d#qwe
      // Note: this 404.html file must be at least 512 bytes for it to work
      // with Internet Explorer (it is currently > 512 bytes)

      // If you're creating a Project Pages site and NOT using a custom domain,
      // then set segmentCount to 1 (enterprise users may need to set it to > 1).
      // This way the code will only replace the route part of the path, and not
      // the real directory in which the app resides, for example:
      // https://username.github.io/repo-name/one/two?a=b&c=d#qwe becomes
      // https://username.github.io/repo-name/?p=/one/two&q=a=b~and~c=d#qwe
      // Otherwise, leave segmentCount as 0.
      var segmentCount = 1;

      var l = window.location;
      l.replace(
        l.protocol + '//' + l.hostname + (l.port ? ':' + l.port : '') +
        l.pathname.split('/').slice(0, 1 + segmentCount).join('/') + '/?p=/' +
        l.pathname.slice(1).split('/').slice(segmentCount).join('/').replace(/&/g, '~and~') +
        (l.search ? '&q=' + l.search.slice(1).replace(/&/g, '~and~') : '') +
        l.hash
      );
    </script>
  </head>
  <body>
  </body>
</html>

<!-- index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
  <meta name="theme-color" content="#000000">
  <meta name="description" content="Cryptography Algorithms: Secure Hash Algorithm (sha256, sha512, ...), Message Digest Algorithm (md5, ripemd160), HMAC-SHA, HMAC-MD, pbkdf2, Advanced Encryption Standard (AES), Triple Data Encryption Standard, (TripleDES, DES), RC4, Rabbit, ...">
  <meta name="keywords" content="crypto, algorithms, secure hash, sha, sha512, sha256, message digest, md5, hmac-sha, aes, des, tripledes, pbkdf2, rc4, rabbit, encryption, descryption">
  <meta name="author" content="huynhsamha">

  <!-- Open Graph -->
  <meta property="fb:app_id" content="440168923127908">
  <meta property="og:url" content="https://huynhsamha.github.io/crypto">
  <meta property="og:title" content="Cryptography Algorithms">
  <meta property="og:description" content="Cryptography Algorithms: Secure Hash Algorithm (sha256, sha512, ...), Message Digest Algorithm (md5, ripemd160), HMAC-SHA, HMAC-MD, pbkdf2, Advanced Encryption Standard (AES), Triple Data Encryption Standard, (TripleDES, DES), RC4, Rabbit, ...">
  <meta property="og:type" content="website">
  <meta property="og:image" content="%PUBLIC_URL%/img/main.jpeg">
  <meta property="og:site_name" content="Cryptography">
  <meta property="og:locale" content="vi_VN">

  <!-- Twitter Card -->
  <meta name="twitter:card" content="summary">
  <meta name="twitter:site" content="@huynhsamha">
  <meta name="twitter:creator" content="@huynhsamha">
  <meta name="twitter:url" content="https://huynhsamha.github.io/crypto">
  <meta name="twitter:title" content="Cryptography Algorithms">
  <meta name="twitter:description" content="Cryptography Algorithms: Secure Hash Algorithm (sha256, sha512, ...), Message Digest Algorithm (md5, ripemd160), HMAC-SHA, HMAC-MD, pbkdf2, Advanced Encryption Standard (AES), Triple Data Encryption Standard, (TripleDES, DES), RC4, Rabbit, ...">
  <meta name="twitter:image:src" content="%PUBLIC_URL%/img/main.jpeg">

  <!--
    manifest.json provides metadata used when your web app is added to the
    homescreen on Android. See https://developers.google.com/web/fundamentals/engage-and-retain/web-app-manifest/
  -->
  <link rel="manifest" href="%PUBLIC_URL%/manifest.json">
  <link rel="shortcut icon" href="%PUBLIC_URL%/favicon.ico">
  <link rel="author" href="//github.com/huynhsamha">
  <link rel="canonical" href="//huynhsamha.github.io/crypto">
  <!--
    Notice the use of %PUBLIC_URL% in the tags above.
    It will be replaced with the URL of the `public` folder during the build.
    Only files inside the `public` folder can be referenced from the HTML.
    Unlike "/favicon.ico" or "favicon.ico", "%PUBLIC_URL%/favicon.ico" will
    work correctly both with client-side routing and a non-root public URL.
    Learn how to configure a non-root public URL by running `npm run build`.
  -->
  <link href="//fonts.googleapis.com/css?family=Open+Sans:400,600,700&subset=vietnamese" rel="stylesheet">
  <link rel="stylesheet" href="%PUBLIC_URL%/css/bootstrap.min.css">
  <link rel="stylesheet" href="%PUBLIC_URL%/lib/font-awesome/css/font-awesome.min.css">

  <!-- ------Single Page Apps GitHub Pages Workaround------ -->
  <script type="text/javascript">
    // Single Page Apps for GitHub Pages
    // https://github.com/rafrex/spa-github-pages
    // Copyright (c) 2016 Rafael Pedicini, licensed under the MIT License
    // ----------------------------------------------------------------------
    // This script checks to see if a redirect is present in the query string
    // and converts it back into the correct url and adds it to the
    // browser's history using window.history.replaceState(...),
    // which won't cause the browser to attempt to load the new url.
    // When the single page app is loaded further down in this file,
    // the correct url will be waiting in the browser's history for
    // the single page app to route accordingly.
    (function(l) {
      if (l.search) {
        var q = {};
        l.search.slice(1).split('&').forEach(function(v) {
          var a = v.split('=');
          q[a[0]] = a.slice(1).join('=').replace(/~and~/g, '&');
        });
        if (q.p !== undefined) {
          window.history.replaceState(null, null,
            l.pathname.slice(0, -1) + (q.p || '') +
            (q.q ? ('?' + q.q) : '') +
            l.hash
          );
        }
      }
    }(window.location))
  </script>

  <title>Cryptography</title>
</head>
<body>
  <noscript>
    You need to enable JavaScript to run this app.
  </noscript>
  <div id="root"></div>
  <!--
    This HTML file is a template.
    If you open it directly in the browser, you will see an empty page.
    You can add webfonts, meta tags, or analytics to this file.
    The build step will place the bundled scripts into the <body> tag.
    To begin the development, run `npm start` or `yarn start`.
    To create a production bundle, use `npm run build` or `yarn build`.
  -->
  <script src="%PUBLIC_URL%/js/jquery-3.3.1.slim.min.js" type="text/javascript"></script>
  <script src="%PUBLIC_URL%/js/popper.min.js" type="text/javascript"></script>
  <script src="%PUBLIC_URL%/js/bootstrap.min.js" type="text/javascript"></script>
  <!-- Google Adsense -->
  <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
</body>
</html>
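To see what the two scripts do to a URL without deploying anything, the same string manipulation can be pulled out into plain functions. This is only a sketch for experimenting: toRedirectUrl and restorePath are hypothetical names, and they take a location-like object (protocol, hostname, port, pathname, search, hash) instead of touching window.location or the history API.

```javascript
// Same expression as the 404.html script: fold the path and the query string
// into a single query string on the app's base path.
function toRedirectUrl(l, segmentCount) {
  return l.protocol + '//' + l.hostname + (l.port ? ':' + l.port : '') +
    l.pathname.split('/').slice(0, 1 + segmentCount).join('/') + '/?p=/' +
    l.pathname.slice(1).split('/').slice(segmentCount).join('/').replace(/&/g, '~and~') +
    (l.search ? '&q=' + l.search.slice(1).replace(/&/g, '~and~') : '') +
    l.hash;
}

// Same logic as the index.html script: rebuild the original path from
// ?p=...&q=... and return it (the real script hands it to replaceState).
function restorePath(l) {
  if (!l.search) return null;
  var q = {};
  l.search.slice(1).split('&').forEach(function (v) {
    var a = v.split('=');
    q[a[0]] = a.slice(1).join('=').replace(/~and~/g, '&');
  });
  if (q.p === undefined) return null;
  return l.pathname.slice(0, -1) + (q.p || '') + (q.q ? '?' + q.q : '') + l.hash;
}

// The deep link from this question, with segmentCount = 1 because the app
// lives under /crypto.
var redirected = toRedirectUrl(
  new URL('https://huynhsamha.github.io/crypto/algorithm/sha256'), 1);
// redirected === 'https://huynhsamha.github.io/crypto/?p=/algorithm/sha256'
var restored = restorePath(new URL(redirected));
// restored === '/crypto/algorithm/sha256'
```

If the restored path does not match the route your router expects, segmentCount is the first thing to check.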

More info and discussions on GitHub's support for single page app here.


Because a React app is a single-page site, you need a sitemap file (you can find out how to make one here). Also make a 404 page, and for every route add an anchor with attributes like this:

<a title="This is my Route One" href="https://myreactapp/routeOne">Route One</a>
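As for the sitemap, a small script can generate it from your list of routes. The sketch below assumes the standard sitemaps.org XML format; the function names buildSitemap and escapeXml are hypothetical, and the route list in the usage note is made up to match the anchor above.

```javascript
// Escape the characters that may not appear raw inside XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap.xml string with one <url> entry per route.
function buildSitemap(baseUrl, routes) {
  var base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  var entries = routes.map(function (route) {
    var path = route.charAt(0) === '/' ? route : '/' + route;
    return '  <url><loc>' + escapeXml(base + path) + '</loc></url>';
  });
  return ['<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    .concat(entries, ['</urlset>'])
    .join('\n') + '\n';
}
```

For example, buildSitemap('https://myreactapp', ['/routeOne', '/routeTwo']) returns the XML as a string; write it to sitemap.xml in your public folder and submit that URL in Search Console.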