Manage your book inventory in Airtable by scanning barcodes and scraping Goodreads
If you use Airtable to keep an inventory of items, maybe your personal media collection, you can save time by just scanning barcodes and scraping product details from a website like Goodreads.
Airtable’s iPhone app can already scan barcodes through your phone’s camera, so this post shows how to use Zapier and Apify in the background to automatically scrape product details from Goodreads.
We use Apify to crawl Goodreads for book details, and we use Zapier to kick everything off when new barcodes are scanned. Here is an overview of the process:
- Prepare your Airtable base
- Create an Apify crawler for Goodreads
- Get Zapier to start a crawl when a new barcode is scanned in Airtable
- Get Zapier to update Airtable with the crawler’s results
Step 1: Prepare your Airtable base
The fields in your Airtable base will depend on your use case, but at a minimum you’ll need a field with the Barcode type, plus any other metadata fields for the products you’re tracking.
Here’s what I used for a simple Books table:
If you open this table on Airtable’s mobile app, editing the barcode field will pull up your camera for easy scanning!
Step 2: Create an Apify crawler for Goodreads
I’ve covered Apify in more detail in other posts, but in short it makes it very easy to scrape websites and make the results accessible via an API. You can also automate it with Zapier, which is what we’ll do later in the post.
Here’s how to create a crawler to extract book details from a Goodreads page like this:
- Create a new crawler in your Apify account.
- Make the Clickable elements field blank. (We don’t need the crawler to follow any links.)
- Set the Page function to the JavaScript code below:
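function pageFunction(context) {
    // jQuery is made available by the crawler on the context object
    var $ = context.jQuery;
    var result = {
        // JavaScript strings have no strip() method; trim() removes surrounding whitespace
        title: $("h1#bookTitle").text().trim(),
        author: $("#bookAuthors span[itemprop='name']").text(),
        coverImageUrl: $("img#coverImage").attr("src"),
        rating: parseFloat($("span[itemprop='ratingValue']").text().trim()),
        // The label of the Start URL, which we set to the scanned barcode number in Step 3
        originalBarcodeNumber: context.request.label
    };
    return result;
}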
That’s it! Apify will run this function on Goodreads pages we tell it to scrape, and the JSON object we return will be made available to Zapier.
I explain this step in more detail in a previous post, but all I did here was use Chrome DevTools to understand how Goodreads structures its HTML with CSS classes and IDs, and then use jQuery selectors to extract the relevant details.
It’s important we return context.request.label in the result object here. This variable holds the original barcode number for this crawl, and when we process the results later we’ll use this information to find the right Airtable record to update.
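If you’d like to sanity-check a page function like this before running a real crawl, one option is to call it with a hand-built context object. The sketch below is only an illustration: fakeJQuery is a hypothetical stand-in for context.jQuery that returns canned values per selector, and the fixture values are made up rather than taken from Goodreads.

```javascript
// Hypothetical stand-in for context.jQuery: looks up canned values by selector.
function fakeJQuery(fixtures) {
  return function (selector) {
    var entry = fixtures[selector] || {};
    return {
      text: function () { return entry.text || ""; },
      attr: function (name) { return (entry.attrs || {})[name]; }
    };
  };
}

// Copy of the page function from this step, using trim() to remove whitespace.
function pageFunction(context) {
  var $ = context.jQuery;
  return {
    title: $("h1#bookTitle").text().trim(),
    author: $("#bookAuthors span[itemprop='name']").text(),
    coverImageUrl: $("img#coverImage").attr("src"),
    rating: parseFloat($("span[itemprop='ratingValue']").text().trim()),
    originalBarcodeNumber: context.request.label
  };
}

// Made-up fixture values, not real Goodreads data.
var sampleResult = pageFunction({
  jQuery: fakeJQuery({
    "h1#bookTitle": { text: "\n  Example Book\n" },
    "#bookAuthors span[itemprop='name']": { text: "Jane Doe" },
    "img#coverImage": { attrs: { src: "https://example.com/cover.jpg" } },
    "span[itemprop='ratingValue']": { text: "\n 4.2\n" }
  }),
  request: { label: "9780000000002" }
});

console.log(sampleResult);
```

The point to notice is that originalBarcodeNumber comes straight from the request label, not from anything on the page.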
Step 3: Get Zapier to start a crawl when a new barcode is scanned in Airtable
When we scan a barcode in the Airtable app, we’re left with a record with a barcode and no other details. We want to start our Apify crawler each time one of these records is added. We don’t want to start our crawler if the new row already has details filled in, which might be because we manually entered the details or imported them from somewhere else.
We’ll achieve this in two steps: first we’ll create an Airtable View that filters to books that need details from Goodreads, and then we’ll use a New Record in View trigger in Zapier to only trigger on these rows.
Create a new view in Airtable (I called mine “Books to be scraped”) that filters to records where the title is empty but the barcode is not.
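The view’s filter is configured in Airtable’s interface, not in code, but the rule it applies can be sketched as a small function. This assumes hypothetical field names “Title” and “Barcode”, and assumes a barcode value is an object with a text property; adjust both to match your own base.

```javascript
// Returns true when a record has a scanned barcode but no title yet.
// Field names and the record shape are assumptions for illustration.
function needsScraping(record) {
  var fields = record.fields || {};
  var title = String(fields["Title"] || "").trim();
  var barcodeText = String((fields["Barcode"] || {}).text || "").trim();
  return title === "" && barcodeText !== "";
}

// Made-up sample records.
var sampleRecords = [
  { id: "rec1", fields: { Barcode: { text: "9780000000002" } } },
  { id: "rec2", fields: { Title: "Example Book", Barcode: { text: "9780000000019" } } },
  { id: "rec3", fields: {} }
];

var toScrape = sampleRecords.filter(needsScraping).map(function (r) { return r.id; });
console.log(toScrape);
```

Only the first record qualifies: the second already has a title, and the third has no barcode to look up.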
With that view set up, create a new Zapier Zap with this trigger and action:
New Record in View (Trigger)
Make sure you have at least one sample row in your “Books to be scraped” view in Airtable, then choose the view when configuring this step in Zapier.
Run Crawler (Action)
After you’ve selected your Apify crawler, the magic happens when we specify the “Start URLs” in the Crawler Properties:
When Zapier tells Apify to start our crawler, it will pass this JSON object with special settings. Here we specify two key details:
- We specify which Goodreads URL we want to crawl. We use the format https://www.goodreads.com/search?q=<BARCODE_NUMBER>, and we use Zapier’s templating functionality to populate the barcode number from the Airtable record that triggered the Zap. (Use the “+” button in the top right to access the “Barcode Text” variable.)
- We set the label of our Start URL to the barcode number by setting the key property. When we wrote our Apify crawler’s JavaScript function we included context.request.label in our result object, and now you can see how that variable will hold the barcode number.
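To make the shape of that setting concrete, here is a rough sketch of how a Start URL entry could be assembled for one barcode. The key property and the search URL format come from the steps above; the value property name and the surrounding array are my assumptions about the crawler’s settings format, so check them against what your own crawler expects. buildStartUrls is a hypothetical helper, since in practice Zapier’s template fills in the barcode for you.

```javascript
// Builds a Start URL entry for one scanned barcode.
// "key" becomes context.request.label in the page function.
// "value" and the array wrapper are assumed property names, not confirmed ones.
function buildStartUrls(barcodeText) {
  var barcode = String(barcodeText).trim();
  return [
    {
      key: barcode,
      value: "https://www.goodreads.com/search?q=" + encodeURIComponent(barcode)
    }
  ];
}

// Made-up barcode number for illustration.
var startUrls = buildStartUrls(" 9780000000002 ");
console.log(JSON.stringify(startUrls, null, 2));
```

Using the same barcode for both the label and the query is what lets the results be matched back to the right record later.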
Save your Zap and turn it on!
Step 4: Get Zapier to update Airtable with the crawler’s results
Now we’ve got Apify automatically crawling Goodreads when new barcodes are added to Airtable, but we’re not actually doing anything with the results. To complete the cycle, we need a new Zapier Zap to run when the crawler is finished:
Crawler Run Finished (Trigger)
Select your Apify Crawler at this step. Make sure you’ve run your crawler manually at least once by this stage — Zapier will pull in the sample results and it will make setting up the remaining steps much easier.
Find Record (Search)
Choose your Base and Table, set Search by Field to your “Barcode” field, and set Search Values to the original barcode variable from the trigger step. It will be called “Results Original Barcode Number” in the variable list.
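Zapier performs this search for you, but the matching it does is roughly the lookup below. This is a sketch under the same assumptions as before: a hypothetical “Barcode” field whose value is an object with a text property, and made-up sample records.

```javascript
// Finds the first record whose barcode text equals the crawler's
// originalBarcodeNumber, or returns null if there is no match.
function findRecordByBarcode(records, barcodeNumber) {
  var wanted = String(barcodeNumber).trim();
  for (var i = 0; i < records.length; i++) {
    var barcode = (records[i].fields || {}).Barcode;
    if (barcode && String(barcode.text).trim() === wanted) {
      return records[i];
    }
  }
  return null;
}

// Made-up sample data.
var books = [
  { id: "rec1", fields: { Barcode: { text: "9780000000002" } } },
  { id: "rec2", fields: { Barcode: { text: "9780000000019" } } }
];

var match = findRecordByBarcode(books, "9780000000019");
console.log(match ? match.id : "no match");
```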
Update Record (Action)
For the Record to update, choose “Use a Custom Value (advanced)” to update the record we searched for in the previous step.
All of the crawler results from Goodreads will be available as Zapier variables by this point, so now we just need to map the values to the right Airtable fields.
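The mapping itself is done by picking variables in Zapier’s form, but it amounts to something like the function below. The Airtable field names (“Title”, “Author”, “Rating”, “Cover”) are hypothetical, and treating the cover as a list of objects with a url property is an assumption about how an attachment field would be filled; a plain URL field would just take the string.

```javascript
// Maps one crawler result to Airtable field values.
// Skips the rating and cover when the crawler could not find them.
function toAirtableFields(result) {
  var fields = {
    "Title": result.title,
    "Author": result.author
  };
  if (typeof result.rating === "number" && isFinite(result.rating)) {
    fields["Rating"] = result.rating;
  }
  if (result.coverImageUrl) {
    fields["Cover"] = [{ url: result.coverImageUrl }];
  }
  return fields;
}

// Made-up crawler result for illustration.
var mapped = toAirtableFields({
  title: "Example Book",
  author: "Jane Doe",
  coverImageUrl: "https://example.com/cover.jpg",
  rating: 4.2,
  originalBarcodeNumber: "9780000000002"
});
console.log(mapped);
```

The rating check matters because parseFloat returns NaN when the rating element is missing from the page, and you probably don’t want that written into a number field.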
Save your Zap, and you’re done!
Give it a whirl
At this point the whole cycle should be complete! When you scan a new barcode in the Airtable app, all of your automation steps should kick off magically in the cloud, and you should be able to watch the values appear in your Airtable base!