To talk about Redshift we first need to talk about data warehousing, as Redshift is a fully managed data warehouse service. A data warehouse is not a place to put your data in a ‘digital warehouse’ to gather virtual dust on the back of a virtual shelf somewhere; it is more akin to a single source of truth for the state of a business and its information. Whilst Redshift is based on PostgreSQL and can be queried with normal SQL, it is not meant to be a database per se.
Extract, Transform & Load (ETL) is the name of the game when it comes to Data Pipelines.
The Extract step acquires data from one or more sources; that data then passes through the Transform step, where any needed alterations are applied, before being Loaded into another storage destination, such as Redshift or S3, to name a couple.
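The three steps can be sketched in plain Python. The records, the transform rules, and the in-memory "load" target below are all hypothetical stand-ins for a real source and a real destination such as Redshift.

```python
# A minimal ETL sketch. Extract pulls raw records, Transform cleans them,
# Load writes them to a destination (here just an in-memory list).

def extract():
    # Extract: acquire raw records from some source (hard-coded for illustration).
    return [{"name": "alice", "spend": "12.50"}, {"name": "bob", "spend": "7.00"}]

def transform(records):
    # Transform: normalise casing and convert strings to numeric types.
    return [{"name": r["name"].title(), "spend": float(r["spend"])} for r in records]

def load(records, warehouse):
    # Load: write the cleaned records to the destination store.
    warehouse.extend(records)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'name': 'Alice', 'spend': 12.5}, {'name': 'Bob', 'spend': 7.0}]
```

In a real pipeline each step would talk to external systems, but the shape of the flow stays the same.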
Despite being what most people think of when AWS is mentioned, I have left Elastic Compute Cloud (EC2) until now, mostly because I personally don’t use it all that much.
With the advent of Lambda, EC2 can feel a bit more laborious, as there is a lot more setup involved than with serverless solutions.
On the 7th day of AWS, I would like to mention something that almost stays hidden behind the scenes when using Serverless, but is critical to its use.
CloudFormation allows you to spin up a stack of resources from a config file. It deals with the order of operations, timing and error handling on your behalf and makes it pretty easy to keep on top of all the elements of the stack.
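To give a flavour of what that config file looks like, here is a minimal CloudFormation template that creates a single S3 bucket. The bucket name and description are hypothetical examples.

```yaml
# Minimal CloudFormation template: a stack containing one S3 bucket.
AWSTemplateFormatVersion: "2010-09-09"
Description: Example stack with a single bucket (name is hypothetical)
Resources:
  UploadsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-example-uploads-bucket
```

A real stack would list many resources here, and CloudFormation works out the order to create them in.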
Simple Queue Service (SQS) is an indispensable tool for many batch operations and concurrent events within, and even outside of, the AWS infrastructure. There are two types of queue: Standard and FIFO. In a Standard queue, messages enter the queue and exit in whatever order is most performant rather than strictly first-in, first-out; messages can be batched into groups and can be used to trigger Lambdas.
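As a sketch of that batching, SQS's SendMessageBatch API accepts at most 10 messages per call, so a sender typically chunks its messages first. The queue URL and message bodies below are hypothetical, and the actual boto3 call is shown only in a comment.

```python
# Sketch: chunking messages into SQS-sized batches of at most 10.

def chunk_messages(messages, batch_size=10):
    """Split a list of message bodies into batches for SendMessageBatch."""
    return [messages[i:i + batch_size] for i in range(0, len(messages), batch_size)]

messages = [f"job-{n}" for n in range(25)]
batches = chunk_messages(messages)

# Each batch would then be sent with something like:
#   sqs.send_message_batch(QueueUrl=queue_url, Entries=[
#       {"Id": str(i), "MessageBody": body} for i, body in enumerate(batch)
#   ])
print([len(b) for b in batches])  # [10, 10, 5]
```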
AWS Identity and Access Management (IAM) is where you set up users and groups, and create and assign roles and permissions to them. You may want to grant specific access to certain functionality that will be used by a Lambda, or create a login for someone to sign into AWS and view logs. All of these things are done here in IAM.
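Permissions in IAM are expressed as JSON policy documents. As an illustrative sketch, here is a policy granting read-only access to objects in one bucket, built as a Python dict; the bucket name and the narrow scope are assumptions, and a policy like this might be attached to the role a Lambda executes under.

```python
# Sketch of an IAM policy document allowing s3:GetObject on one bucket.
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::my-example-bucket/*",  # hypothetical bucket
        }
    ],
}

print(json.dumps(policy, indent=2))
```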
While this is not an actual AWS tool, it is something so useful for setting up Lambdas that it deserves a day to itself.
When looking for how to get started with Lambdas I ran into Serverless pretty quickly. It’s a framework that handles a lot of the nitty-gritty and heavy lifting involved in configuring the elements that surround the code written inside the Lambdas.
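That configuration lives in a serverless.yml file. Here is a minimal sketch, assuming a hypothetical service name, function, and bucket, wiring a Python function up to an S3 upload event.

```yaml
# Minimal serverless.yml sketch (service, function, and bucket names are hypothetical)
service: example-service
provider:
  name: aws
  runtime: python3.12
functions:
  processUpload:
    handler: handler.process_upload   # module.function to invoke
    events:
      - s3:
          bucket: my-example-uploads-bucket
          event: s3:ObjectCreated:*   # trigger on any object upload
```

From this one file, the framework deploys the function and its triggers without you touching each service by hand.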
When I needed to perform small operations on files stored in S3, I turned to this next tool in the AWS arsenal, Lambda. Lambda is effectively a ‘serverless’ script-running platform where, as of the time of writing, code in Node.js, Python, Java, Ruby, Go, C# or PowerShell will be run whenever triggered by specific events. A few examples of these events are: an S3 object being uploaded, an Amazon Alexa request, and a manual trigger via the Lambda UI.
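For the S3-upload case, the handler receives an event describing the uploaded object. Here is a minimal Python handler sketch; the event below is a trimmed, hand-written stand-in for the payload S3 actually delivers, and the bucket and key are hypothetical.

```python
# Minimal Lambda handler sketch for an "S3 object uploaded" trigger.

def handler(event, context):
    # S3 notifications arrive as a list of records; pull out the first one.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]
    return f"received {key} from {bucket}"

# Trimmed stand-in for a real S3 notification event.
fake_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-example-bucket"},
                "object": {"key": "uploads/report.csv"}}}
    ]
}
print(handler(fake_event, None))  # received uploads/report.csv from my-example-bucket
```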