AWS ElastiCache with Valkey: Complete Setup Guide
This comprehensive guide covers everything you need to know about setting up and connecting to AWS ElastiCache using Valkey (Redis-compatible). It includes step-by-step setup instructions, connection code examples, and solutions to common issues based on real-world experience.
Overview
AWS ElastiCache is a fully managed in-memory caching service that supports Redis-compatible engines like Valkey. It provides high-performance, scalable caching for applications requiring fast data access. ElastiCache offers two main deployment options:
Valkey Serverless: Fully managed, auto-scaling option with minimal configuration
Valkey Node-Based Cluster: Traditional cluster deployment with more control over configuration
Prerequisites
Before you begin, ensure you have:
An active AWS account with appropriate IAM permissions
Access to AWS Management Console
A configured VPC with appropriate subnets
Basic understanding of AWS networking (VPC, Security Groups, Subnets)
Node.js application (if connecting from Node.js)
Understanding Deployment Options
Valkey Serverless
Best for:
Applications with variable or unpredictable traffic
Simplified operations with minimal configuration
Auto-scaling requirements
Development and testing environments
Key characteristics:
Automatically scales based on demand
Serverless endpoint (proxy-based)
No cluster management required
Single Redis client connection (not cluster mode)
⚠️ Not compatible with BullMQ (cannot configure maxmemory-policy)
Valkey Node-Based Cluster
Best for:
Applications requiring specific node configurations
BullMQ or other queue systems (requires node-based deployment with custom parameters)
Predictable workloads with known capacity
Fine-grained control over caching infrastructure
Key characteristics:
Manual cluster configuration
Support for cluster mode enabled/disabled
Direct node access
More configuration options
Important for BullMQ Users: If you plan to use BullMQ with Node.js, you must choose the Node-Based Cluster deployment option. BullMQ requires:
Direct node access (not available in Serverless)
Custom `maxmemory-policy` set to `noeviction` (cannot be configured in Serverless)
See the Configuring ElastiCache for BullMQ section for complete setup instructions.
Creating Valkey Serverless Cache
Follow these steps to create a Valkey Serverless cache:
Step 1: Access ElastiCache Console
Sign in to the AWS Management Console
Navigate to ElastiCache Console
In the left navigation pane, select Valkey caches
Click Create Valkey cache button
Step 2: Configure Cache Settings
Deployment option: Select Serverless (default)
Cache settings:
Name: Enter a descriptive name (e.g., `my-project-cache`)
Description: (Optional) Add a description for your cache
Configuration: Leave the default settings selected for initial setup
Network settings: ElastiCache will automatically configure networking
Step 3: Create and Wait
Review your configuration
Click Create to create the cache
Wait for the cache status to change to ACTIVE (typically takes 5-10 minutes)
Once active, you can retrieve the endpoint URL from the cache details page
Step 4: Get Connection Endpoint
After the cache is created:
Select your cache from the list
Go to the Connectivity & security tab
Copy the Configuration endpoint (e.g., `my-cache.serverless.use1.cache.amazonaws.com`)
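If you prefer the AWS CLI, the console steps above can be approximated with the following commands (a sketch, not a definitive recipe; the cache name is a placeholder, and flags should be verified against your installed CLI version):

```shell
# Create a Valkey serverless cache (name is a placeholder)
aws elasticache create-serverless-cache \
  --serverless-cache-name my-project-cache \
  --engine valkey

# Poll until Status is "available", then read the endpoint address
aws elasticache describe-serverless-caches \
  --serverless-cache-name my-project-cache \
  --query "ServerlessCaches[0].Endpoint.Address"
```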
Creating Valkey Node-Based Cluster
Follow these steps to create a Valkey Node-Based Cluster:
Step 1: Access ElastiCache Console
Sign in to the AWS Management Console
Navigate to ElastiCache Console
In the left navigation pane, select Valkey caches
Click Create Valkey cache button
Step 2: Select Deployment Option
Deployment option: Select Design your own cache
Creation method: Select Cluster cache
Cluster mode: Choose Disabled (for simpler setup) or Enabled (for sharding)
Step 3: Configure Cluster Settings
Cluster Information:
Name: Enter a cluster name (e.g., `my-project-cluster`)
Description: (Optional) Add a description
Engine version: Use the latest compatible version
Port: Keep default `6379`
Parameter group: Use default or select custom
Node type: Choose based on your memory and CPU requirements (e.g., `cache.t3.micro` for testing)
Number of replicas: Set to `0` for a single node, or add replicas for high availability
Step 4: Configure Subnet Group
In the Connectivity section:
Subnet groups:
If you don't have a subnet group, select Create a new subnet group
Name: Enter a subnet group name (e.g., `my-subnet-group`)
Description: Add a description
VPC: Select your VPC from the dropdown
Subnets: Select at least 2 subnets in different availability zones
Click Next
Step 5: Configure Security Settings
In the Selected security groups section, click Manage
Select appropriate security groups:
Choose existing security group OR
Create a new security group with an inbound rule allowing port `6379`
Encryption:
Enable Encryption at rest (recommended for production)
Enable Encryption in transit (TLS) (recommended)
Step 6: Configure Backup and Maintenance (Optional)
Automatic backups: Enable for production environments
Maintenance window: Choose preferred maintenance window
SNS notifications: (Optional) Configure notifications
Step 7: Review and Create
Click Next to review all settings
Verify your configuration
Click Create to create the cluster
Wait for the cluster status to become Available (typically 10-15 minutes)
Step 8: Get Connection Endpoint
After the cluster is created:
Select your cluster from the list
Go to the Details tab
Copy the Primary endpoint (or Configuration endpoint if cluster mode is enabled)
Connecting to ElastiCache
Understanding Connection Types
AWS ElastiCache has three connection patterns (Serverless, Node-Based with Cluster Mode Disabled, and Node-Based with Cluster Mode Enabled), and each requires a specific Redis client:
| Deployment Type | Correct Client | Wrong Client |
|---|---|---|
| Self-managed Redis | `new Redis()` | - |
| ElastiCache Node-Based (Cluster Mode Disabled) | `new Redis()` | `new Redis.Cluster()` |
| ElastiCache Serverless | `new Redis()` | `new Redis.Cluster()` |
| ElastiCache Node-Based (Cluster Mode Enabled) | `new Redis.Cluster()` | `new Redis()` |

Critical: Serverless endpoints use a proxy architecture and do NOT expose individual cluster nodes. Always use `new Redis()` for Serverless, never `new Redis.Cluster()`.
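The table above reduces to a simple rule: only Cluster Mode Enabled uses the cluster client. As an illustrative guard (the function and the deployment-type strings are our own naming, not an AWS or ioredis API):

```javascript
// Map a deployment type to the ioredis client you should construct.
// Only "node-cluster-enabled" exposes cluster topology; everything
// else (self-managed, serverless, cluster-mode-disabled) is a single
// logical endpoint and needs the plain client.
function clientKindFor(deployment) {
  return deployment === "node-cluster-enabled" ? "Redis.Cluster" : "Redis";
}
```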
Connecting to Valkey Serverless
import Redis from "ioredis";
const redis = new Redis({
host: process.env.REDIS_HOST, // e.g., my-cache.serverless.use1.cache.amazonaws.com
port: 6379,
tls: {}, // TLS is required for AWS ElastiCache
connectTimeout: 10000,
maxRetriesPerRequest: null, // Important for BullMQ
});
// Event listeners for monitoring
redis.on("connect", () => {
console.log("✅ Redis connected successfully");
});
redis.on("error", (err) => {
console.error("❌ Redis connection error:", err);
});
redis.on("close", () => {
console.log("Redis connection closed");
});
// Example usage
async function testConnection() {
try {
// Set a value
await redis.set("test-key", "Hello ElastiCache!");
console.log("✅ Set operation successful");
// Get the value
const value = await redis.get("test-key");
console.log("✅ Retrieved value:", value);
// Clean up
await redis.del("test-key");
} catch (error) {
console.error("❌ Operation failed:", error);
}
}
testConnection();
Connecting to Valkey Node-Based Cluster (Cluster Mode Disabled)
import Redis from "ioredis";
const redis = new Redis({
host: process.env.REDIS_HOST, // Primary endpoint
port: 6379,
tls: {},
connectTimeout: 10000,
retryStrategy: (times) => {
const delay = Math.min(times * 50, 2000);
return delay;
},
});
redis.on("connect", () => console.log("Redis connected"));
redis.on("error", (err) => console.error("Redis error:", err));
Connecting to Valkey Node-Based Cluster (Cluster Mode Enabled)
import Redis from "ioredis";
const cluster = new Redis.Cluster(
[
{
host: process.env.REDIS_HOST, // Configuration endpoint
port: 6379,
},
],
{
dnsLookup: (address, callback) => callback(null, address),
redisOptions: {
tls: {},
connectTimeout: 10000,
},
clusterRetryStrategy: (times) => {
const delay = Math.min(times * 50, 2000);
return delay;
},
},
);
cluster.on("connect", () => console.log("Cluster connected"));
cluster.on("error", (err) => console.error("Cluster error:", err));
Using with BullMQ
import { Queue, Worker } from "bullmq";
import Redis from "ioredis";
// Connection configuration
const connection = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
maxRetriesPerRequest: null, // Critical for BullMQ
});
// Create a queue
const queue = new Queue("my-queue", { connection });
// Add a job
await queue.add("job-name", { data: "example" });
// Create a worker
const worker = new Worker(
"my-queue",
async (job) => {
console.log("Processing job:", job.id);
// Process job here
},
{ connection },
);
Configuring ElastiCache for BullMQ
If you're using BullMQ with AWS ElastiCache, there's a critical configuration requirement you must complete before your queues will work properly.
Why This Configuration Is Required
BullMQ requires Redis to use the noeviction maxmemory policy. This policy ensures that Redis never evicts keys when memory is full, which is essential for queue reliability. If keys are evicted, you could lose jobs from your queue.
Important Notes:
⚠️ Serverless ElastiCache is NOT compatible with BullMQ due to an incompatible default maxmemory-policy that cannot be changed
✅ You must use Node-Based Cluster deployment for BullMQ
Default parameter groups in AWS cannot be modified, so you must create a custom parameter group
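The requirement can be captured as a tiny check (illustrative; the names are ours). Every supported policy other than `noeviction` may evict keys under memory pressure, including the keys BullMQ uses to store jobs:

```javascript
// All maxmemory policies Redis/Valkey supports. Only "noeviction"
// guarantees that job keys are never evicted when memory fills up.
const EVICTION_POLICIES = [
  "noeviction",
  "volatile-lru", "allkeys-lru",
  "volatile-lfu", "allkeys-lfu",
  "volatile-random", "allkeys-random",
  "volatile-ttl",
];

function isSafeForBullMQ(policy) {
  return policy === "noeviction";
}
```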
Common Error Without This Configuration
Without the correct maxmemory-policy, you may encounter errors such as:
OOM command not allowed when used memory > 'maxmemory'
Or jobs may silently disappear from your queue when memory pressure occurs.
Step-by-Step Configuration Guide
Step 1: Create a Custom Parameter Group
Navigate to ElastiCache Console → Parameter Groups (in the left sidebar)
Click Create parameter group
Configure the parameter group:
Family: Select the family matching your cluster's engine version (e.g., `redis7` for Redis 7, or `valkey8` for Valkey 8)
Name: Enter a descriptive name (e.g., `bullmq-parameters`)
Description: Add a description (e.g., `Custom parameters for BullMQ queues`)
Click Create
Step 2: Modify the maxmemory-policy Parameter
In the Parameter Groups list, find your newly created parameter group
Click on the parameter group name to open it
Click Edit or Edit parameters
In the search box, type `maxmemory-policy`
Change the value from `volatile-lru` (the default) to `noeviction`
Click Save changes
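The same two steps can be scripted with the AWS CLI (a configuration sketch; the group name is a placeholder, and the family must match your cluster's engine version):

```shell
# Step 1: create a custom parameter group
aws elasticache create-cache-parameter-group \
  --cache-parameter-group-name bullmq-parameters \
  --cache-parameter-group-family redis7 \
  --description "Custom parameters for BullMQ queues"

# Step 2: set maxmemory-policy to noeviction
aws elasticache modify-cache-parameter-group \
  --cache-parameter-group-name bullmq-parameters \
  --parameter-name-values "ParameterName=maxmemory-policy,ParameterValue=noeviction"
```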
Step 3: Apply the Custom Parameter Group to Your Cluster
For existing clusters:
Go to ElastiCache Console → Redis caches (or Valkey caches)
Select your cluster by clicking the checkbox
Click Modify
Scroll down to Cluster settings section
In the Parameter group dropdown, select your custom parameter group (e.g., `bullmq-parameters`)
Scroll to the bottom and click Preview changes
Review the changes
Click Modify to apply
For new clusters:
During cluster creation (Step 3 of "Creating Valkey Node-Based Cluster"):
In the Cluster settings section
Find the Parameter group field
Select your custom parameter group from the dropdown
Step 4: Restart Required (For Existing Clusters)
⚠️ Important: Changing the parameter group requires a cluster restart for the changes to take effect.
After modifying, AWS will schedule the change
Choose to apply the change:
Immediately: Cluster will restart now (brief downtime)
During maintenance window: Applied during next maintenance window
Monitor the cluster status until it returns to Available
Step 5: Verify the Configuration
After the cluster is available, verify the configuration:
Option 1: Using Redis CLI from EC2:
# Connect to your ElastiCache instance
redis-cli -h your-cache.region.cache.amazonaws.com -p 6379 --tls
# Check the maxmemory-policy
CONFIG GET maxmemory-policy
Expected output:
1) "maxmemory-policy"
2) "noeviction"
Option 2: Using ioredis in your application:
import Redis from "ioredis";
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
});
async function verifyConfig() {
const policy = await redis.config("GET", "maxmemory-policy");
console.log("maxmemory-policy:", policy[1]); // Should output: noeviction
if (policy[1] !== "noeviction") {
console.error("⚠️ WARNING: maxmemory-policy is not set to noeviction!");
console.error("BullMQ may not work correctly.");
} else {
console.log("✅ Configuration is correct for BullMQ");
}
}
verifyConfig();
Complete BullMQ Setup Example
Once your parameter group is configured correctly:
import { Queue, Worker, QueueEvents } from "bullmq";
import Redis from "ioredis";
// Create connection with BullMQ-optimized settings
const connection = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
maxRetriesPerRequest: null, // Required for BullMQ
enableReadyCheck: false,
maxLoadingRetryTime: 5000,
});
// Verify configuration on startup
connection.config("GET", "maxmemory-policy").then(([, policy]) => {
if (policy !== "noeviction") {
console.error(
'❌ CRITICAL: maxmemory-policy must be "noeviction" for BullMQ',
);
process.exit(1);
}
console.log("✅ Redis configuration verified for BullMQ");
});
// Create a queue
const myQueue = new Queue("my-queue", {
connection,
defaultJobOptions: {
attempts: 3,
backoff: {
type: "exponential",
delay: 1000,
},
removeOnComplete: {
count: 100, // Keep last 100 completed jobs
age: 3600, // Keep jobs for 1 hour
},
removeOnFail: {
count: 500, // Keep last 500 failed jobs
},
},
});
// Create a worker
const worker = new Worker(
"my-queue",
async (job) => {
console.log(`Processing job ${job.id} with data:`, job.data);
// Your job processing logic here
return { success: true };
},
{
connection: connection.duplicate(), // Important: use duplicate connection
concurrency: 10,
limiter: {
max: 100,
duration: 1000,
},
},
);
// Event listeners
worker.on("completed", (job) => {
console.log(`✅ Job ${job.id} completed`);
});
worker.on("failed", (job, err) => {
console.error(`❌ Job ${job.id} failed:`, err.message);
});
// Queue events for monitoring
const queueEvents = new QueueEvents("my-queue", { connection });
queueEvents.on("waiting", ({ jobId }) => {
console.log(`Job ${jobId} is waiting`);
});
// Add jobs to the queue
async function addJobs() {
await myQueue.add("process-data", { userId: 123, action: "process" });
await myQueue.add("send-email", { to: "user@example.com", subject: "Hello" });
console.log("Jobs added to queue");
}
addJobs();
// Graceful shutdown
process.on("SIGTERM", async () => {
console.log("Shutting down...");
await worker.close();
await myQueue.close();
await queueEvents.close();
await connection.quit();
process.exit(0);
});
Best Practices for BullMQ with ElastiCache
Always verify maxmemory-policy on startup - Add a check in your application initialization
Use an appropriate maxmemory setting - Set `maxmemory` in your parameter group based on your node type (e.g., 80% of available memory)
Monitor memory usage - Set up CloudWatch alarms for memory usage
Use job retention policies - Configure `removeOnComplete` and `removeOnFail` to prevent memory bloat
Duplicate connections for workers - Use `connection.duplicate()` for workers to avoid connection issues
Enable Redis persistence - Consider enabling AOF (Append Only File) for queue durability
Test failover scenarios - If using replicas, test that your application handles failover correctly
Quick Reference: Parameter Group Settings for BullMQ
| Parameter | Recommended Value | Reason |
|---|---|---|
| `maxmemory-policy` | `noeviction` | Required - prevents job loss |
| `maxmemory` | 80% of node memory | Prevents OOM, leaves room for overhead |
| `timeout` | `300` | Close idle connections after 5 minutes |
| `tcp-keepalive` | `300` | Keep connections alive |
| `appendonly` | `yes` (optional) | Persistence for queue durability |
| `appendfsync` | `everysec` (optional) | Balance between performance and safety |
Troubleshooting Common Issues
Error 1: ClusterAllFailedError: Failed to refresh slots cache
Full error message:
ClusterAllFailedError: Failed to refresh slots cache
Cause:
You're using new Redis.Cluster() with a Serverless or Node-Based (Cluster Mode Disabled) endpoint. These endpoints do not expose cluster topology information.
Why it happens:
Serverless endpoints are proxy-based and hide the internal cluster architecture
The Redis Cluster client tries to discover cluster nodes and shard slots
This discovery fails because the endpoint doesn't provide cluster topology
Solution:
Use the standard Redis client instead:
// ❌ WRONG - Don't use this with Serverless
const redis = new Redis.Cluster([
{ host: "my-cache.serverless.use1.cache.amazonaws.com", port: 6379 },
]);
// ✅ CORRECT - Use this instead
const redis = new Redis({
host: "my-cache.serverless.use1.cache.amazonaws.com",
port: 6379,
tls: {},
});
Error 2: ETIMEDOUT - Connection Timeout
Full error message:
Error: connect ETIMEDOUT
at TLSSocket.<anonymous>
errno: 'ETIMEDOUT',
code: 'ETIMEDOUT',
syscall: 'connect'
Cause:
Your application cannot reach ElastiCache over the network. This is almost always a networking or security group issue, not a code issue.
Most of the time, the cause is one of the following:
ElastiCache security group not allowing inbound traffic from your application
Application and ElastiCache in different VPCs
Missing subnet route configuration
Solution Steps:
Step 1: Verify VPC Configuration
Check that your application and ElastiCache are in the same VPC:
For EC2/ECS/Lambda:
Go to EC2 Console → Select your instance
Click Networking tab → Note the VPC ID
For ElastiCache:
Go to ElastiCache Console → Select your cache
Click Details tab → Note the VPC ID
Verify: Both VPC IDs must be identical
If VPCs are different: Connection will always fail. You need to either recreate the cache in the correct VPC or use VPC peering.
Step 2: Configure Security Group (Most Common Fix)
Configure ElastiCache Security Group:
Go to ElastiCache Console
Select your cache → Connectivity & security tab
Click on the Security group link
Click Edit inbound rules
Add a new rule:
| Type | Protocol | Port Range | Source |
|---|---|---|---|
| Custom TCP | TCP | 6379 | Select Security Group → your EC2/ECS/Lambda security group |
Example:
Type: Custom TCP
Protocol: TCP
Port: 6379
Source: sg-0abc123def456 (your-app-security-group)
Description: Allow Redis traffic from application
Important: Use Security Group ID as the source, not IP addresses. This allows AWS to handle internal routing automatically.
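If you manage security groups from the CLI, the same rule looks like this (a configuration sketch; both group IDs are placeholders):

```shell
# Allow Redis traffic (6379) into the ElastiCache security group
# from the application's security group, referenced by ID.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0cache00000000000 \
  --protocol tcp \
  --port 6379 \
  --source-group sg-0abc123def456
```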
Step 3: Verify Application Security Group (Outbound)
Go to EC2 Console → Security Groups
Select your application's security group
Click Outbound rules tab
Ensure there's a rule allowing outbound traffic:
| Type | Protocol | Port Range | Destination |
|---|---|---|---|
| All traffic | All | All | 0.0.0.0/0 |
This is usually configured by default, but verify to be sure.
Step 4: Check Subnet Configuration
ElastiCache is typically deployed in private subnets. Verify:
For ElastiCache:
ElastiCache Console → Subnet groups
Verify subnets have proper route tables
For your application:
Must be in subnets that can route to ElastiCache subnets
Usually automatic if in the same VPC
Step 5: Test Network Connectivity
SSH into your EC2 instance (or exec into your container) and test connectivity:
# Test with netcat (preferred)
nc -zv your-cache.serverless.use1.cache.amazonaws.com 6379
# Test with telnet
telnet your-cache.serverless.use1.cache.amazonaws.com 6379
Expected output:
Connection to your-cache.serverless.use1.cache.amazonaws.com 6379 port [tcp/*] succeeded!
If you see timeout:
Connection timed out
→ Security group or VPC configuration is still incorrect. Review steps 1-4.
If connection succeeds but your app still fails: → Check your TLS configuration in code (ensure tls: {} is set).
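If `nc` or `telnet` is not available in your container image, the same probe can be run from Node itself. A minimal sketch using only the standard library (the function name is ours):

```javascript
import net from "node:net";

// Minimal TCP reachability probe, mirroring `nc -zv host port`.
// Resolves true if a TCP connection opens within timeoutMs,
// false on timeout or connection error.
function checkTcp(host, port, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const done = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => done(false));
    socket.once("connect", () => done(true));
    socket.once("error", () => done(false));
  });
}
```

Run it against the cache endpoint on port 6379; a `false` result points back to security group or VPC configuration, exactly like a timeout from `nc`.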
Error 3: Connection Refused
Error message:
Error: connect ECONNREFUSED
Causes:
Wrong hostname or port
ElastiCache is not in "Available" or "Active" status
Using `localhost` instead of the actual endpoint
Solution:
Verify endpoint:
Go to ElastiCache Console → Your cache
Copy the exact endpoint from Connectivity & security tab
Ensure you're using the correct port (default: 6379)
Check cache status:
Cache must be in Available (Node-Based) or Active (Serverless) status
If status is "Creating" or "Modifying", wait for it to complete
Don't use localhost:
// ❌ WRONG
host: "localhost";
// ✅ CORRECT
host: "my-cache.serverless.use1.cache.amazonaws.com";
Error 4: Cannot Access from Local Development Machine
Cause:
ElastiCache is private by default and only accessible from within the VPC.
Where ElastiCache works:
✅ EC2 instances in the same VPC
✅ ECS tasks in the same VPC
✅ Lambda functions in the same VPC
✅ Other AWS services in the same VPC
Where ElastiCache does NOT work:
❌ Your local development machine
❌ External servers outside AWS
❌ Different VPCs (without VPC peering/transit gateway)
Solutions for local development:
Option 1: Use SSH Tunnel (Recommended)
# Create SSH tunnel through bastion/EC2 instance
ssh -i your-key.pem -L 6379:your-cache.serverless.use1.cache.amazonaws.com:6379 ec2-user@your-ec2-ip
// In your application, connect through the tunnel via localhost.
// TLS is still required; set servername so certificate verification
// checks against the real endpoint instead of localhost.
const redis = new Redis({
  host: "localhost",
  port: 6379,
  tls: { servername: "your-cache.serverless.use1.cache.amazonaws.com" },
});
Option 2: Use a Separate Development Cache
Create a separate ElastiCache instance with a different configuration for development, or use a local Redis instance.
Option 3: Deploy to EC2 for Testing
Deploy your application to an EC2 instance in the same VPC for testing.
Error 5: TLS Handshake Errors
Error message:
Error: unable to verify the first certificate
Error: TLS handshake failed
Cause:
Missing or incorrect TLS configuration.
Solution:
Always include tls: {} in your connection configuration:
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {}, // This is required for AWS ElastiCache
});
If you need to disable TLS verification (not recommended for production):
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {
rejectUnauthorized: false, // Only for testing
},
});
Error 6: BullMQ Jobs Disappearing or OOM Errors
Error messages:
OOM command not allowed when used memory > 'maxmemory'
Or jobs silently disappear from queues without processing.
Cause:
ElastiCache is using the default maxmemory-policy of volatile-lru or allkeys-lru, which evicts keys when memory is full. BullMQ requires the noeviction policy to ensure jobs are never lost.
Why it happens:
Default parameter groups use eviction policies designed for caching, not queuing
When Redis memory fills up, it evicts keys (including your job data)
BullMQ jobs are stored as Redis keys, so they can be evicted
Solution:
You must create a custom parameter group with maxmemory-policy set to noeviction. See the complete guide in the Configuring ElastiCache for BullMQ section above.
Quick fix steps:
Create custom parameter group with Redis family matching your cluster
Set `maxmemory-policy` to `noeviction`
Apply the parameter group to your cluster
Restart the cluster (required for changes to take effect)
Prevention:
Add this verification to your application startup:
import Redis from "ioredis";
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
});
// Verify on startup
const [, policy] = await redis.config("GET", "maxmemory-policy");
if (policy !== "noeviction") {
console.error(
'❌ CRITICAL: maxmemory-policy must be "noeviction" for BullMQ',
);
console.error(`Current policy: ${policy}`);
process.exit(1);
}
console.log("✅ Redis configured correctly for BullMQ");
Important: This issue only affects Node-Based clusters. Serverless ElastiCache cannot be configured with noeviction and is not compatible with BullMQ.
Best Practices
1. Use Environment Variables
Never hardcode connection details:
// ✅ GOOD
const redis = new Redis({
host: process.env.REDIS_HOST,
port: parseInt(process.env.REDIS_PORT || "6379", 10),
tls: {},
});
// ❌ BAD
const redis = new Redis({
host: "my-cache.serverless.use1.cache.amazonaws.com",
port: 6379,
tls: {},
});
2. Implement Connection Error Handling
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
retryStrategy: (times) => {
if (times > 3) {
return null; // Stop retrying after 3 attempts
}
return Math.min(times * 200, 2000);
},
reconnectOnError: (err) => {
const targetError = "READONLY";
if (err.message.includes(targetError)) {
return true; // Reconnect on specific errors
}
return false;
},
});
redis.on("error", (err) => {
console.error("Redis error:", err);
// Send to error tracking service (Sentry, CloudWatch, etc.)
});
redis.on("connect", () => {
console.log("Redis connected");
});
redis.on("close", () => {
console.log("Redis connection closed");
});
3. Use Security Group References, Not IP Addresses
When configuring security groups:
✅ GOOD: Source = sg-xxxxx (security group ID)
❌ BAD: Source = 10.0.1.5/32 (IP address)
Security group referencing allows AWS to handle internal IP changes automatically.
4. Enable Encryption for Production
Always enable:
Encryption at rest (data stored on disk)
Encryption in transit (TLS)
Encryption settings are chosen during cache creation and generally cannot be changed afterward without recreating the cache.
5. Use Multiple Availability Zones
For production environments:
Enable multi-AZ deployment
Use at least 1 replica node
Enables automatic failover
6. Monitor Your Cache
Set up CloudWatch alarms for:
CPUUtilization (alert if > 75%)
DatabaseMemoryUsagePercentage (alert if > 80%)
EngineCPUUtilization (alert if > 75%)
NetworkBytesIn/Out
CurrConnections
7. Implement Connection Pooling
Reuse Redis connections instead of creating new ones for each request:
// Create once at application startup
const redis = new Redis({
host: process.env.REDIS_HOST,
port: 6379,
tls: {},
maxRetriesPerRequest: 3,
enableReadyCheck: true,
lazyConnect: false,
});
// Reuse throughout application
export default redis;
8. Use Appropriate TTLs
Set time-to-live (TTL) for cached data:
// Set with TTL (expires in 1 hour)
await redis.setex("key", 3600, "value");
// Set with TTL using SET command
await redis.set("key", "value", "EX", 3600);
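TTLs pair naturally with the cache-aside pattern. A minimal sketch (our helper, not an ioredis API; `client` can be any object with ioredis-style `get`/`set`):

```javascript
// Cache-aside with TTL: return the cached value if present, otherwise
// compute it with `loader`, store it with an expiry, and return it.
async function getOrSet(client, key, ttlSeconds, loader) {
  const cached = await client.get(key);
  if (cached !== null) return JSON.parse(cached);
  const fresh = await loader();
  await client.set(key, JSON.stringify(fresh), "EX", ttlSeconds);
  return fresh;
}
```

With an ioredis connection, something like `getOrSet(redis, "user:123", 3600, () => loadUser(123))` would cache the loader's result for an hour (`loadUser` is a hypothetical function).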
9. Test Network Connectivity During Setup
Before deploying your application, verify connectivity from your compute environment (EC2/ECS/Lambda) to ElastiCache.
10. Document Your Configuration
Keep a record of:
VPC ID
Subnet group
Security groups
Node type
Cluster/Serverless configuration
Backup and maintenance windows
Important Notes
About VPC and Networking
ElastiCache is VPC-private by default and cannot be accessed from the internet
You cannot change the VPC after cache creation
All clients must be in the same VPC (or use VPC peering/transit gateway)
Security groups act as firewalls—configure them correctly
About Serverless vs Node-Based
Serverless is easier to manage but gives less control
Serverless cannot be used with BullMQ (incompatible maxmemory-policy configuration)
Node-Based is required for BullMQ and applications needing custom Redis parameters
You cannot convert between Serverless and Node-Based after creation
About Cluster Mode
Cluster Mode Disabled: Simpler, single endpoint, up to 5 read replicas
Cluster Mode Enabled: Better performance for large datasets, multiple shards, requires Redis Cluster client
About TLS/Encryption
TLS (in-transit encryption) is highly recommended for production
Once set, you cannot disable encryption without recreating the cache
Always use `tls: {}` in your Redis client configuration
About Costs
Serverless: Pay for data storage and ECPUs (processing units)
Node-Based: Pay for node hours based on instance type
Data transfer within the same AZ is free
Cross-AZ transfer incurs charges
About Backups
Backups are important for production workloads
Enable automatic snapshots
Backups impact performance slightly during snapshot creation
Last Updated: March 2026
Author: Md Rakibul Islam
This guide is maintained based on actual deployment experiences and common issues encountered in production environments.

