Hugo static site hosting and delivery on AWS
Set up a static website with Hugo on AWS using Route53, CloudFront, S3, IAM, and Certificate Manager.
What a long strange trip it’s been building this site!
We’ve gone from conflicting documentation to recommendations from different pros, and now we’re right back where we started!
In this post, I discuss setting up a static website with multiple pages using AWS. The tech behind it includes:
Route53
CloudFront
Simple Storage Service (S3)
Certificate Manager
Identity and Access Management (IAM)
CodeCommit
I started with the knowledge I picked up completing the June 2018 edition of A Cloud Guru’s Developer Associate certification course, and after running into a few hurdles we have liftoff. Well, technically liftoff was two days ago, but since I’m posting this today, and have been super busy since then, here we are.
To start with, I applied what some might consider best practice security-wise: I created a private bucket, uploaded my files and then created a CloudFront CDN. At creation time, CloudFront kindly offered to create an origin access identity for the distribution, granting it read access to the bucket and its contents.
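For reference, the policy CloudFront attaches to the bucket for that origin access identity looks roughly like this. This is a sketch only; the bucket name and the OAI ID (E1EXAMPLE) are placeholders:

```shell
# Sketch: grant a CloudFront origin access identity read access to the
# objects in a private bucket. Bucket name and OAI ID are placeholders.
aws s3api put-bucket-policy --bucket jeanklaas.com --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowCloudFrontOAIRead",
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity E1EXAMPLE"
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::jeanklaas.com/*"
  }]
}'
```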
Whilst that was happening I hopped over to Route53 and created two record sets:
A record for IPv4 connections to CloudFront
AAAA record for IPv6
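Those two record sets can also be created from the CLI, which sidesteps the dropdown entirely. A sketch, assuming placeholder hosted zone and distribution values; Z2FDTNDATAQYW2 is the fixed hosted zone ID AWS uses for all CloudFront alias targets:

```shell
# Sketch: create A (IPv4) and AAAA (IPv6) alias records pointing at the
# CloudFront distribution. ZEXAMPLE and d111111abcdef8.cloudfront.net are
# placeholders; Z2FDTNDATAQYW2 is CloudFront's fixed alias hosted zone ID.
aws route53 change-resource-record-sets --hosted-zone-id ZEXAMPLE --change-batch '{
  "Changes": [
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "jeanklaas.com", "Type": "A",
      "AliasTarget": {"HostedZoneId": "Z2FDTNDATAQYW2",
                      "DNSName": "d111111abcdef8.cloudfront.net",
                      "EvaluateTargetHealth": false}}},
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "jeanklaas.com", "Type": "AAAA",
      "AliasTarget": {"HostedZoneId": "Z2FDTNDATAQYW2",
                      "DNSName": "d111111abcdef8.cloudfront.net",
                      "EvaluateTargetHealth": false}}}
  ]
}'
```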
When aliasing the CloudFront distribution, the usual point-and-click selection of an AWS asset didn’t quite work the way it had with the load balancer in the EC2 section of the course. Instead, I had to copy-paste the domain CloudFront had assigned me into the alias record. Initially I thought this was because the CDN was still being created, and since CloudFront takes its sweet time, I figured I would just paste it in and see how we go. Side note: I did come back later to see if I could in fact pick the distribution from the alias dropdown - it still didn’t work.
Next I was in Certificate Manager. Being somewhat of a newbie to acquiring SSL certificates, I blindly went for the wildcard option only, *.jeanklaas.com, in case I wanted to use a sub-domain for something at a later date. This turned out to be a mistake, because I also needed the certificate to cover the base site: a wildcard only matches one subdomain level, so *.jeanklaas.com does not cover the bare jeanklaas.com. The lesson here was to create multiple named records when acquiring the certificate. So now in the certificate manager under my domain I have:
jeanklaas.com
*.jeanklaas.com
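Requesting both names in one certificate can be done in a single CLI call. A sketch, with one detail worth noting: certificates used by CloudFront must be issued in us-east-1, regardless of where the rest of your stack lives:

```shell
# Sketch: request one ACM certificate covering both the apex domain and
# the wildcard. CloudFront only accepts certificates from us-east-1.
aws acm request-certificate \
  --region us-east-1 \
  --domain-name jeanklaas.com \
  --subject-alternative-names "*.jeanklaas.com" \
  --validation-method DNS
```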
Let’s take stock of where we’re at:
CloudFront ✅
Route53 ✅
Certificate Manager ✅
IAM ✅
We’re looking pretty good! We just need to get our code up into S3, then we can start pushing buttons and seeing if everything works. Cue image of cat bashing away at keyboard creating the site. Commit the work to CodeCommit, and now send all the files up to S3. Fun fact about S3 (since we’re here): AWS doesn’t charge for uploads TO S3, only downloads FROM S3.
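The build-and-upload step amounts to roughly this; the bucket name is a placeholder, and `public/` is Hugo’s default output directory:

```shell
# Sketch: build the site with Hugo, then mirror the output directory to
# the bucket. --delete removes objects that are no longer in the build.
hugo
aws s3 sync public/ s3://jeanklaas.com --delete
```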
Now that CloudFront has done its thing and DNS has had some time to propagate, hit it!
https://jeanklaas.com
Failure! ❌
We couldn’t connect to the site. I needed to update the CNAME record in CloudFront. In the general settings of the distribution, update this record:
Alternate Domain Names (CNAMEs): jeanklaas.com
Wait another 15-20 minutes or so for CloudFront to update. Ok, now we’ve got the index page! Yay, it worked… but did it? On closer inspection, only the index (home) page loaded. Ah man, now whenever we try to go to another page we get access denied! But why? This page helped a bit: https://forums.aws.amazon.com/thread.jspa?threadID=85849 but not straight away. Essentially, S3 is a key-value store of objects, so when CloudFront talks to it, it doesn’t understand what you mean by jeanklaas.com/blog. Instead it’s like:
Well, I have permissions to read all the items in the bucket, but I can’t give you the value of the folder because there isn’t one, and you’re not allowed to list the contents of a folder, so I’m going to throw you an error.
<Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>services</Key>
<RequestId>7EDCDF31E65BB22A</RequestId>
<HostId>
8AHitmLOp6NkXZ06najDTTWdc+qWSg2D++P0MY+x23wUseh7PtpZf4FfE/OT
</HostId>
</Error>
At first, this was a mind bender, because most traditional web servers send you index.html when you request a directory without specifying a file, e.g. contact.php or flow.do. Yes, I’m oversimplifying, but the details are a story for another time.
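You can see the difference between the bucket’s two endpoints with a couple of requests (bucket and region names here are placeholders, and the exact website-endpoint hostname format varies by region). The REST endpoint looks up the literal key; the website endpoint rewrites a directory request to its configured index document:

```shell
# Sketch: the REST endpoint treats "blog" as a literal object key, so a
# request like this produces the NoSuchKey-style error quoted above.
curl -i https://jeanklaas.com.s3.amazonaws.com/blog

# The website endpoint (once static hosting is enabled on the bucket)
# resolves a directory request to its index document, serving blog/index.html.
curl -i http://jeanklaas.com.s3-website-us-east-1.amazonaws.com/blog/
```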
Anyway, with that problem out of the way, I now couldn’t render anything because I was getting security errors relating to the SSL certificate I had. I’m not 100% on this, but I believe the S3 website endpoint only speaks plain HTTP, so SSL termination needs to happen at the AWS entrypoint - in this case CloudFront, with CloudFront -> S3 over plain HTTP.
To solve it, I updated the bucket where the website files are stored to allow public reads. This is sort of where I was left confused (explained below), but happy that it worked.
In summary so far, to solve the issues I had encountered I had to:
Update the origin on CloudFront to use the S3 website endpoint. Eg: not jeanklaas.s3.amazon….. but rather jeanklaas.s3-website-region-location.amazon…..
Add a CNAME record to CloudFront
Grant public read access to the bucket
Add the base domain alongside the wildcard on the SSL certificate.
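Put together, the bucket-side fixes above amount to roughly this. A sketch, assuming a placeholder bucket name; the policy is the standard public-read statement from the AWS docs:

```shell
# Sketch: turn on static website hosting, so the bucket answers on the
# s3-website endpoint and serves index documents for directory requests.
aws s3 website s3://jeanklaas.com \
  --index-document index.html \
  --error-document 404.html

# Grant anonymous read access to the objects (note: /*, objects only -
# this does not allow listing the bucket).
aws s3api put-bucket-policy --bucket jeanklaas.com --policy '{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::jeanklaas.com/*"
  }]
}'
```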
Right so now we’re at a safe point where the website is live, traffic is flowing, addresses work and the chaos has ended. Somewhat.
The next day at work (yesterday), I brought up the previous night’s madness with some colleagues, to see if I could get any insight from some pros who work with AWS all day every day and have many more certifications and much more experience than I do.
I made this post in our Slack (edited for conciseness):
Interesting from AWS: This guide says, when making a static website hosted from S3, you should grant your bucket public read access: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/GettingStarted.html#GettingStartedUploadContent
So then you do that and then this notification appears:
You have provided public access to this bucket. We highly recommend that you never grant any kind of public access to your S3 bucket.
Anyone know which is best method? FWIW - I came across this because I initially set a private bucket, and gave an IAM permission to cloudfront to access this bucket. This worked, kind of, for the site in that the index page was successfully delivered. However, trying to access a sub page, like mydomain.com/about caused 403 forbidden errors. The bucket policy was created at the time the cloudfront was created.
This is why I was confused about what best practice is: I thought yes, they’re right, buckets should never be public for the world to see; instead, the resource accessing them should be granted the permission and then do something with the data. In this case, CloudFront should access the bucket and pass the data along to the request, rather than the user accessing the bucket directly.
Later I found another piece of conflicting info:
It seems, despite the warnings on the s3 page, that they need to be public after all https://docs.aws.amazon.com/AmazonS3/latest/dev/WebsiteAccessPermissionsReqd.html
This was met with some disagreement from my colleagues, so this morning we sat down before the work day started and had a crack at setting it up correctly. After much debate, and to cut an already long story short(er), the implementation I have now is the one best suited to hosting a static website with CloudFront, and the one that behaves the way public readers would expect.