When building this website, I wanted to automatically pull live stats from a variety of sources. The landing page now shows constantly updating statistics from GitHub, LinkedIn, and YouTube.
In this article, I’ll show you how to do the same for free on your own site. I’ll keep the code light and framework-agnostic, so you may need to tweak a few things depending on your setup, but the concepts should be easy to apply anywhere.
If you run into any issues with the web scraping part, check out everything web scraping for more tips and tricks.
The first source I wanted to pull stats from was GitHub. This turned out to be pretty easy because they have a public-facing API. It also doesn’t require you to log in, which is perfect: we wouldn’t want to store secrets on the website, where anyone could read them. Unauthenticated requests are rate limited to 60 per hour per IP address, but each visitor only needs to make one request, so this isn’t a concern.
For this kind of site, all we have to do is call the API from our website using fetch(). Here’s a diagram to help explain it.
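As a minimal sketch of that client-side call, assuming you want the star and fork counts of a single public repository: the helper name getRepoStats and the injectable fetchImpl parameter are my own for illustration, while the endpoint and the stargazers_count / forks_count fields come from GitHub’s public REST API.

```javascript
// Hypothetical helper: fetches public stats for one repository.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function getRepoStats(owner, repo, fetchImpl = fetch) {
  const url =
    "https://api.github.com/repos/" +
    encodeURIComponent(owner) + "/" + encodeURIComponent(repo);

  const response = await fetchImpl(url, {
    headers: { Accept: "application/vnd.github+json" },
  });

  // Rate-limited or missing repos come back as non-2xx responses.
  if (!response.ok) {
    throw new Error("GitHub request failed with status " + response.status);
  }

  const data = await response.json();
  return { stars: data.stargazers_count, forks: data.forks_count };
}
```

On a page you would call getRepoStats("your-user", "your-repo") once on load and write the numbers into the DOM; no token is involved, so nothing secret ships to the browser.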
When you’re pulling data from external APIs and sites, you’ll often run into a security feature called CORS (Cross-Origin Resource Sharing). Basically, a browser won’t let your site directly hit another domain’s API if that domain doesn’t explicitly allow it, for security reasons. This means even if YouTube or GitHub’s data is technically public, your frontend fetch requests may get blocked.
GitHub’s public repo endpoint is friendly and works in the browser, but things like YouTube’s subscriber count or video stats don’t.
The easiest way to get around this is to have a tiny backend for your website that will make the requests that would get blocked by CORS and forward the results to you in a CORS friendly way. I personally use Cloudflare Functions for this.
They’re free for small projects and cheap beyond that, they bill only for compute time (not for time spent waiting on API calls), and Cloudflare IPs are usually trusted as not-bots by a lot of web scraping prevention tools!
I won’t go into detail here on how to create a Cloudflare function; you’ll want to follow their getting started guide. Once you have the function set up, you can make a fetch() call inside it and send the result back to the client!
You’ll also need to add these headers to any response your function sends back. They tell the browser that other sites are allowed to read the response, which is what gets you past the CORS block.
response.headers.set("Access-Control-Allow-Origin", "*");
response.headers.set("Access-Control-Allow-Methods", "GET,HEAD,POST,OPTIONS");
response.headers.set("Access-Control-Allow-Headers", "Content-Type");
After deploying your function, your website can use it as a proxy to a site like YouTube or LinkedIn and get the data back without running into CORS issues!
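Putting the proxy and the headers together might look roughly like this. It is a sketch, not my exact function: proxyWithCors and the example.com URL are placeholders, and the commented-out onRequest export assumes the Cloudflare Pages Functions style of handler, so adjust it to whatever your setup expects.

```javascript
// The three headers from above; "*" lets any origin read the response.
const CORS_HEADERS = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "GET,HEAD,POST,OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type",
};

// Hypothetical proxy: fetches `targetUrl` server-side and returns the
// same body and status with the CORS headers added.
async function proxyWithCors(targetUrl, fetchImpl = fetch) {
  const upstream = await fetchImpl(targetUrl);

  // Re-wrap the upstream response so its headers become mutable.
  const response = new Response(upstream.body, upstream);
  for (const [name, value] of Object.entries(CORS_HEADERS)) {
    response.headers.set(name, value);
  }
  return response;
}

// Assumed wiring for a Cloudflare Pages Function (placeholder URL):
// export async function onRequest(context) {
//   return proxyWithCors("https://example.com/stats");
// }
```

Your frontend then fetches the function’s URL instead of the third-party site, and the browser accepts the response because of the added headers.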
If an API requires some authentication to get metrics, like an OAuth2 API client, a username and password, or anything else that counts as a secret, you do not want to publish these values directly in your frontend application. Anyone could open up devtools, see them, and steal them.
The safest way is to use the same approach as before with the Cloudflare functions:
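As a rough sketch of that idea, here is how a function could call the YouTube Data API v3 with a key that only exists on the server. I’m assuming an API key stored as a secret named YOUTUBE_API_KEY on the function’s env object; that name and the helper names are hypothetical, and your own source might be a different API entirely.

```javascript
// Builds the YouTube Data API v3 URL for a channel's statistics.
function buildChannelStatsUrl(channelId, apiKey) {
  const url = new URL("https://www.googleapis.com/youtube/v3/channels");
  url.searchParams.set("part", "statistics");
  url.searchParams.set("id", channelId);
  url.searchParams.set("key", apiKey);
  return url.toString();
}

// Pulls out only the public number, so nothing secret is echoed back.
function extractSubscriberCount(apiJson) {
  const stats = apiJson.items?.[0]?.statistics;
  if (!stats) throw new Error("No channel statistics in response");
  // The API returns counts as strings, so convert to a number.
  return Number(stats.subscriberCount);
}

// `env.YOUTUBE_API_KEY` is an assumed secret name configured on the function.
async function getSubscriberCount(env, channelId, fetchImpl = fetch) {
  const response = await fetchImpl(
    buildChannelStatsUrl(channelId, env.YOUTUBE_API_KEY)
  );
  if (!response.ok) {
    throw new Error("YouTube request failed with status " + response.status);
  }
  return { subscribers: extractSubscriberCount(await response.json()) };
}
```

The browser only ever sees the { subscribers } object that the function returns; the key stays in the function’s environment.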
Caching API responses, either from the external APIs directly or from our Cloudflare functions, can save on compute cost (extending our free tier) and also make the UI feel faster.
For APIs with strict rate limiting like GitHub (60 requests per hour when unauthenticated), caching can also prevent you from ever hitting or getting near those limits. Since Cloudflare functions are stateless, our caching layer needs to live at the client level, stashing results in LocalStorage or cookies. Be sure to include a time to live (TTL) value so that the metrics get refreshed after it expires.
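A small TTL cache on top of LocalStorage could be sketched like this. The function names and the stored { value, expiresAt } shape are my own choices for illustration; the storage argument is anything with getItem and setItem, so in the browser you would pass window.localStorage.

```javascript
// Reads a cached value; returns null when missing, expired, or unparsable.
function readCache(storage, key, now = Date.now()) {
  const raw = storage.getItem(key);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw);
    return now < entry.expiresAt ? entry.value : null;
  } catch {
    return null; // Corrupt entry: treat it as a cache miss.
  }
}

// Stores the value together with the timestamp at which it expires.
function writeCache(storage, key, value, ttlMs, now = Date.now()) {
  storage.setItem(key, JSON.stringify({ value, expiresAt: now + ttlMs }));
}

// Returns the cached value if still fresh, otherwise runs `loader`
// (for example a fetch to your function) and caches its result.
async function cachedStat(storage, key, ttlMs, loader) {
  const hit = readCache(storage, key);
  if (hit !== null) return hit;
  const value = await loader();
  writeCache(storage, key, value, ttlMs);
  return value;
}
```

For example, cachedStat(localStorage, "github-stars", 60 * 60 * 1000, loadStars) would hit the network at most once an hour per visitor, where loadStars is whatever function fetches your number.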
External APIs can fail, or your parsing logic might break. To keep your site resilient and working, define a fallback value in your client as a hard-coded param. Even if it’s slightly outdated, showing a value is better than showing nothing or crashing.
Here’s a little flowchart that implements these suggestions.
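In code, the fallback step could be as small as the sketch below. statOrFallback is a hypothetical helper name, and it assumes the stat you are loading is a number; loader is any async function that fetches it.

```javascript
// Runs `loader` and returns its result, or `fallback` if the request
// throws or the parsed value is not a usable number.
async function statOrFallback(loader, fallback) {
  try {
    const value = await loader();
    // Guards against parsing bugs that yield NaN, null, or undefined.
    return Number.isFinite(value) ? value : fallback;
  } catch {
    return fallback;
  }
}
```

The hard-coded fallback is just the last number you saw when you wrote the page, so a failed request shows a slightly stale stat instead of an empty spot.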
Hopefully this gave you some ideas to think about when adding live stats to your own site. It’s a fun project that also makes your page feel more dynamic and personal.
You can even take it further: I personally use the same Cloudflare function in my website’s build process to automatically update stats on my resume. Once you have one source of truth for all your stats, it’s pretty easy to integrate in other places.
If you end up building something cool with this, I’d love to hear about it. Drop a comment below!