Importing and parsing locally stored data files and data from Google Sheets API asynchronously with Node.js
Published on Feb 28, 2025
In this post, we’ll explore the backend implementation of this interactive React dashboard, which visualizes data from hundreds of research papers. We’ll take a deep dive into importing and parsing locally stored data files and data from the Google Sheets API, in CSV or TSV format, focusing in particular on asynchronous error handling.
Fetching locally stored data in CSV format
- 1 Directory Existence Check
- Checks if the specified directory exists using fs.existsSync()
- If the directory does not exist, it immediately throws an error. This is the first error handling mechanism, preventing the function from proceeding with an invalid directory.
- 2 Main Promise Structure
- The function returns a Promise, which will eventually resolve with the processed data or reject with an error
- 3 Data Storage and Directory Reading
- Initializes an empty array allData to store all processed data
- Uses fs.readdir() to read the contents of the directory asynchronously. If there is an error, it logs the error and rejects the main Promise with the error, returning the function early to prevent further execution.
- 4 File Processing Setup
- Filters the directory contents to only include CSV files, to prevent any errors in data processing down the line.
- Maps each CSV file to a new Promise (creating an array of Promises)
- Tracks the file index for later reference
- 5 Individual File Processing
- Creates a new Promise for each file's processing
- Constructs the full file path using path.join()
- Initializes an array to store data from this specific file
- 6 Read Stream Setup
- Creates a read stream for the file. If there's an error reading the file, it logs the error with the specific file name for context, then rejects the file's Promise with the error.
- 7 CSV Parsing
- Pipes the file content through a CSV parser
- For each row of CSV data, it transforms the row using a function called transformRow, then adds the transformed row to the file's data array.
- When CSV parsing is complete, it adds all rows from this file to the main allData array before resolving this file's Promise
- If there's an error during CSV parsing, it rejects this file's Promise with the error
- 8 Processing All Files
- Waits for all file processing Promises to complete using Promise.all()
- Resolves the main Promise with the complete dataset
- If any file processing Promise is rejected, it rejects the main Promise with the error.
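Putting the steps above together, here is a minimal sketch of the importer, assuming the csv-parser package. transformRow stands in for the project-specific transform applied during parsing; passing it the file index (tracked in step 4) is an assumption.

```js
const fs = require('fs');
const path = require('path');
const csv = require('csv-parser');

// Placeholder for the project-specific row transform described above
const transformRow = (row, fileIndex) => row;

function importLocalCsv(directory) {
  // 1. Directory existence check — fail fast on an invalid path
  if (!fs.existsSync(directory)) {
    throw new Error(`Directory not found: ${directory}`);
  }

  // 2. Main Promise that resolves with the processed data
  return new Promise((resolve, reject) => {
    const allData = [];

    // 3. Read the directory contents asynchronously
    fs.readdir(directory, (err, files) => {
      if (err) {
        console.error(`Error reading directory ${directory}:`, err);
        return reject(err);
      }

      // 4. Keep only CSV files; map each one to its own Promise
      const filePromises = files
        .filter((file) => file.endsWith('.csv'))
        .map((file, index) => new Promise((resolveFile, rejectFile) => {
          // 5. Per-file processing
          const filePath = path.join(directory, file);
          const fileData = [];

          // 6. Read stream with file-specific error context
          fs.createReadStream(filePath)
            .on('error', (streamErr) => {
              console.error(`Error reading file ${file}:`, streamErr);
              rejectFile(streamErr);
            })
            // 7. Pipe the file content through the CSV parser
            .pipe(csv())
            .on('data', (row) => fileData.push(transformRow(row, index)))
            .on('end', () => {
              allData.push(...fileData);
              resolveFile();
            })
            .on('error', (parseErr) => {
              console.error(`Error parsing ${file}:`, parseErr);
              rejectFile(parseErr);
            });
        }));

      // 8. Wait for all file Promises, then resolve with the full dataset
      Promise.all(filePromises)
        .then(() => resolve(allData))
        .catch(reject);
    });
  });
}
```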
Summary of Error Catchment Points
- Directory Check: Throws an error if the directory doesn’t exist
- Directory Reading: Catches and logs errors when reading the directory contents
- File Reading: Catches and logs errors when opening or reading individual files
- CSV Parsing: Catches and logs errors during the CSV parsing process
- Overall Processing: Catches and logs any errors that occur during the overall processing
This robust error handling ensures that:
- Each potential failure point is properly handled
- Errors are logged with context information
- The function fails gracefully with appropriate error messages
- Each file is processed in its own Promise, so errors are isolated and reported per file (note that Promise.all() rejects the overall import as soon as any single file fails)
Fetching locally stored data in TSV format
TSV format offers advantages over CSV when handling data containing commas in text fields or numeric values. Since CSV files use commas as delimiters, values like ‘1,000.50’ or text containing natural commas require special escaping to prevent parsing errors. TSV files avoid this issue by using tabs as delimiters, which rarely occur in the actual data.
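To see the problem concretely, compare a naive comma split with a tab split on the same values (real CSV parsers handle this with quoting, which is exactly the escaping overhead mentioned above):

```js
// Splitting a CSV line on commas breaks values that contain them:
'"1,000.50",Smith, J.'.split(',');   // → ['"1', '000.50"', 'Smith', ' J.']

// Tab delimiters leave those values intact:
'1,000.50\tSmith, J.'.split('\t');   // → ['1,000.50', 'Smith, J.']
```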
The code is similar to the CSV version, except for how the file content is parsed; see the sketch after the list below.
- 1 Parsing TSV file
- Splits the text into an array of lines using newline characters. Each element in lines represents one row from the file.
- Takes the first line (index 0), which contains the column headers, and splits it by tab characters (\t) to get an array of header names.
- slice(1) skips the header row and takes all remaining lines. map() processes each line into an object.
- Splits each line by tabs to get the values, then pairs each header with its corresponding value using the same index position
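A minimal sketch of this parsing logic, assuming the file is read into a string with fs.promises.readFile; the blank-line guard is an addition to handle a trailing newline:

```js
const { readFile } = require('fs/promises');

async function parseTsvFile(filePath) {
  const text = await readFile(filePath, 'utf-8');

  // Split the text into rows on newline characters,
  // skipping blank lines such as a trailing newline (added guard)
  const lines = text.split('\n').filter((line) => line.trim() !== '');

  // The first line (index 0) holds the column headers, separated by tabs
  const headers = lines[0].split('\t');

  // slice(1) skips the header row; map() turns each remaining line
  // into an object, pairing each header with the value at the same index
  return lines.slice(1).map((line) => {
    const values = line.split('\t');
    const row = {};
    headers.forEach((header, i) => {
      row[header] = values[i];
    });
    return row;
  });
}
```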
Fetching data from Google Sheets API
This approach allows for real-time data updates without requiring application redeployment, making it especially valuable for clients who are unfamiliar with the technical aspects of deployment. A sketch of the importFromGoogleSheets function follows the walkthrough below.
- 1 Authentication
- Parses Google API credentials from an environment variable
- Verifies that essential credential fields exist (client_email, private_key, project_id), and throws an error if any required field is missing
- Creates a Google Auth instance with the credentials, requesting read-only access to Google Sheets via the specified scope
- 2 API Client Connection
- Creates a Google Sheets API client (v4) with the authentication
- Verifies the spreadsheet exists and is accessible
- 3 Spreadsheet Processing
- Iterates through each requested sheet name. For each sheet, requests data from the specified range of columns
- Checks if data exists with at least headers and one row
- Converts rows to objects using headers as keys and transforms each row using an external transformRow function
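Here is a sketch of importFromGoogleSheets following the steps above. The column range ('A:Z') is an assumed placeholder, and transformRow again stands in for the project-specific transform:

```js
const { google } = require('googleapis');

// Placeholder for the project-specific row transform
const transformRow = (row) => row;

async function importFromGoogleSheets(spreadsheetId, sheetNames) {
  // 1. Authentication: parse and validate the service-account credentials
  const credentials = JSON.parse(process.env.GOOGLE_CREDENTIALS);
  for (const field of ['client_email', 'private_key', 'project_id']) {
    if (!credentials[field]) {
      throw new Error(`Missing required credential field: ${field}`);
    }
  }
  const auth = new google.auth.GoogleAuth({
    credentials,
    scopes: ['https://www.googleapis.com/auth/spreadsheets.readonly'],
  });

  // 2. API client connection: create the v4 client and verify access
  const sheets = google.sheets({ version: 'v4', auth });
  await sheets.spreadsheets.get({ spreadsheetId });

  // 3. Spreadsheet processing: fetch and convert each requested sheet
  const allData = [];
  for (const sheetName of sheetNames) {
    const response = await sheets.spreadsheets.values.get({
      spreadsheetId,
      range: `${sheetName}!A:Z`, // assumed column range
    });

    const rows = response.data.values;
    if (!rows || rows.length < 2) continue; // need headers plus one data row

    const headers = rows[0];
    const data = rows.slice(1).map((row) => {
      const obj = {};
      headers.forEach((header, i) => {
        obj[header] = row[i];
      });
      return transformRow(obj);
    });
    allData.push(...data);
  }
  return allData;
}
```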
How to Connect to Google Sheets API
Set Up a Google Cloud Project
- Go to the Google Cloud Console
- Create a new project or select an existing one
- Enable the Google Sheets API for your project:
- Navigate to “APIs & Services” > “Library”
- Search for “Google Sheets API”
- Click “Enable”
Create Service Account Credentials
- Go to “APIs & Services” > “Credentials”
- Click “Create Credentials” > “Service Account”
- Fill in the service account details and click “Create”
- Grant appropriate roles (basic Viewer role is sufficient for read-only access)
- Click on the created service account and go to the “Keys” tab
- Click “Add Key” > “Create new key”
- Choose JSON format and click “Create”
- The key file will be downloaded to your computer
Share Your Spreadsheet
- Open the Google Spreadsheet you want to access and click the “Share” button
- Add the service account email (found in the JSON file under client_email)
- Set appropriate permissions (Viewer for read-only)
Ensure that the file is a native Google Sheet, not an Excel file stored in Google Drive. To convert the file to Google Sheets format, click File → Save as Google Sheets.
Setup in Your Project
- Store the entire JSON content as an environment variable
GOOGLE_CREDENTIALS='{"type":"service_account","project_id":"...","private_key":"...","client_email":"...","client_id":"...",...}'
- Install Required Packages
npm install googleapis
Use the importFromGoogleSheets function shown above with your specific spreadsheet ID. The spreadsheet ID appears in the URL: https://docs.google.com/spreadsheets/d/[SPREADSHEET_ID]/edit
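For example (inside an async function; the ID and sheet names below are placeholders):

```js
const papers = await importFromGoogleSheets(
  '1aBcD3eFgHiJkLmNoPqRsT', // hypothetical ID copied from the spreadsheet URL
  ['Papers', 'Reviews']     // hypothetical sheet names to import
);
```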
Conclusion
Once files are imported and parsed, Node.js API routes can be established to transmit data to the frontend, with built-in parameter support for dynamic data filtering.
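As a closing illustration, here is a minimal sketch of such a route, assuming Express; the route path and the year query parameter are illustrative, not from the original dashboard:

```js
const express = require('express');
const app = express();

// Placeholders — substitute your own values
const SPREADSHEET_ID = 'your-spreadsheet-id';
const SHEET_NAMES = ['Sheet1'];

app.get('/api/papers', async (req, res) => {
  try {
    const data = await importFromGoogleSheets(SPREADSHEET_ID, SHEET_NAMES);
    // e.g. /api/papers?year=2024 filters rows on a hypothetical `year` field
    const filtered = req.query.year
      ? data.filter((row) => row.year === req.query.year)
      : data;
    res.json(filtered);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(3000);
```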