Coding with Jesse

Unable to locate credentials in AWS

The Problem

If you have servers in AWS making a high volume of AWS service requests, you may run into rare but frustrating credential errors like these:

"Unable to locate credentials"

or if you're using aws-sdk in Node.js:

"CredentialsProviderError: Could not load credentials from any providers"

I'm not totally sure why these errors happen, but I typically see them across multiple services, accounts and regions at around the same time. That leads me to believe there can be some sporadic flakiness in the instance metadata service (IMDS) used for fetching IAM credentials.

I tried using metadata retries and other configuration parameters to prevent this, but they didn't seem to make any difference.
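For reference, the AWS CLI and boto3 read their metadata retry settings from two environment variables. The values below are illustrative assumptions, not the exact settings I tried:

```bash
# Assumed example values, shown only to illustrate the kind of setting meant.
# Both variables are read by the AWS CLI and boto3 when they fetch
# credentials from the instance metadata service.
export AWS_METADATA_SERVICE_TIMEOUT=5        # seconds to wait per request
export AWS_METADATA_SERVICE_NUM_ATTEMPTS=5   # attempts before giving up
```

Other SDKs have their own equivalents, so check the documentation for the client you use.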

The Solution

Looking for a solution, I found this buried in the AWS documentation for instance metadata retrieval:

"If you're using the IMDS to retrieve AWS security credentials, avoid querying for credentials during every transaction or concurrently from a high number of threads or processes, as this might lead to throttling. Instead, we recommend that you cache the credentials until they start approaching their expiry time."

Now, I don't think this throttling was the source of all the errors I was seeing, but it may be playing a role. Maybe the metadata service's throttling threshold changes over time as demand changes. I don't know.

Either way, this gave me the idea to write a bash script that caches the IAM credentials in ~/.aws/credentials, so they can be used by both the AWS CLI and any Node.js or Python clients accessing AWS services:

#!/bin/bash

IMDS_URL="http://169.254.169.254/latest/meta-data/iam/security-credentials/"
# Use $HOME rather than ~, because a tilde inside quotes is not expanded
AWS_CREDENTIALS_PATH="$HOME/.aws/credentials"
PROFILE_NAME="default"

# 4.5 minutes, because new credentials appear at least 5 minutes before expiry
EXPIRY_BUFFER=270

get_aws_credentials() {
    # -f makes curl return non-zero on HTTP errors, so a failed or throttled
    # request never overwrites good cached credentials.
    local role_name response
    role_name=$(curl -sf "$IMDS_URL") || return 1
    response=$(curl -sf "${IMDS_URL}${role_name}") || return 1

    local access_key_id secret_access_key token expiration expiration_time
    access_key_id=$(jq -r '.AccessKeyId' <<< "$response")
    secret_access_key=$(jq -r '.SecretAccessKey' <<< "$response")
    token=$(jq -r '.Token' <<< "$response")
    expiration=$(jq -r '.Expiration' <<< "$response")
    # Convert the ISO 8601 expiry to a Unix timestamp; bail out if it is invalid
    expiration_time=$(date -d "$expiration" +%s) || return 1

    # Write to a temporary file and rename it into place, so clients never
    # read a half-written credentials file.
    local tmp_file="${AWS_CREDENTIALS_PATH}.tmp"
    {
        echo "[$PROFILE_NAME]"
        echo "aws_access_key_id = $access_key_id"
        echo "aws_secret_access_key = $secret_access_key"
        echo "aws_session_token = $token"
        echo "expiration = $expiration_time"
    } > "$tmp_file" && mv "$tmp_file" "$AWS_CREDENTIALS_PATH"
}

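should_fetch_credentials() {
    if [[ ! -f "$AWS_CREDENTIALS_PATH" ]]; then
        return 0
    fi

    local expiration_time current_time
    expiration_time=$(grep '^expiration' "$AWS_CREDENTIALS_PATH" | cut -d ' ' -f 3)
    current_time=$(date +%s)

    # A missing or non-numeric expiration means the cache is unusable, so refetch
    if [[ ! $expiration_time =~ ^[0-9]+$ ]]; then
        return 0
    fi

    if (( current_time + EXPIRY_BUFFER > expiration_time )); then
        return 0
    fi

    return 1
}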

if should_fetch_credentials; then
    get_aws_credentials
fi
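To make the timing concrete, here is a minimal sketch of the same "now + buffer > expiry" comparison on its own. The needs_refresh helper and the timestamps are made up for illustration; only the 270-second buffer comes from the script.

```bash
#!/bin/bash

# Hypothetical helper: succeeds (exit 0) when cached credentials should be
# refreshed, i.e. when "now + buffer" is past the expiry time.
# All arguments are Unix timestamps or durations in seconds.
needs_refresh() {
    local now=$1 buffer=$2 expiry=$3
    [ $(( now + buffer )) -gt "$expiry" ]
}

# Illustrative values: credentials expire at t=10000, buffer is 270 seconds.
for now in 9000 9730 9731 10000; do
    if needs_refresh "$now" 270 10000; then
        echo "t=$now: refresh"
    else
        echo "t=$now: keep cached credentials"
    fi
done
```

With these values the first two checks keep the cached credentials and the last two trigger a refresh, so the refresh starts within the final 4.5 minutes before expiry.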

Since the credentials have to be refreshed every few hours, I set the script up to run in a cron job every minute, to check whether the credentials are about to expire:

* * * * * /home/ec2-user/credentials.sh > /dev/null 2>&1
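If you want to confirm the cron job is keeping the cache fresh, one option is to read back the expiration line the script writes. This is just a sketch: seconds_until_expiry is a hypothetical helper, and the demo file contains made-up values rather than real credentials.

```bash
#!/bin/bash

# Hypothetical helper: print how many seconds remain before the cached
# credentials in the given file expire. Assumes the file contains the
# "expiration = <unix timestamp>" line written by the script above.
# The second argument (current time) defaults to now.
seconds_until_expiry() {
    local file=$1 now=${2:-$(date +%s)}
    local expiry
    expiry=$(grep '^expiration' "$file" | cut -d ' ' -f 3)
    [ -n "$expiry" ] || return 1
    echo $(( expiry - now ))
}

# Demo against a throwaway file with made-up values, not real credentials.
demo_file=$(mktemp)
printf '[default]\nexpiration = 1717080000\n' > "$demo_file"
seconds_until_expiry "$demo_file" 1717079000   # prints 1000
rm -f "$demo_file"
```

Pointed at the real credentials file, a value that drops to zero or below would suggest the cron job has stopped refreshing.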

Voila! No more credential errors! I hope that helps. Let me know if you've run into the same errors, and if you found this approach useful.

Published on May 30th, 2024. © Jesse Skinner
