Migrating a Legacy App to Cloud Native — Part 4

In part 4 of this series, I add Amplify Storage and explore its security model and how to customize it…

Amplify has a storage module which may be backed in AWS by either S3 or DynamoDB. Back in part 2 when exploring application requirements, I noted that S3 would be used for storage of user settings and data collections. In short, DynamoDB could not be used because:

  1. Records are limited to 400 KB and I don’t want to limit a collection to that size.
  2. Amplify’s storage API may use S3 or DynamoDB; we can’t use both for different data. Thus user settings, while small, will also go into S3.

Note: I later discovered that while the CLI provides an option for DynamoDB storage, none of the Amplify libraries, in any language, support it.
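To make the first limitation concrete, here is a minimal Python sketch of a size check (Python is used purely for illustration). The 400 KB figure is DynamoDB's item size limit. The function name and the sample collections are hypothetical, and measuring the JSON payload only approximates how DynamoDB counts an item's size.

```python
import json

DYNAMODB_ITEM_LIMIT_BYTES = 400 * 1024  # DynamoDB's maximum item size


def fits_in_dynamodb_item(collection: dict) -> bool:
    """Rough check: would this collection, serialized as JSON, fit in one item?

    Only the UTF-8 JSON payload is measured, so treat the answer as an estimate.
    """
    payload = json.dumps(collection).encode("utf-8")
    return len(payload) <= DYNAMODB_ITEM_LIMIT_BYTES


# Hypothetical data: a tiny settings object and a collection with a lot of text.
small = {"name": "settings", "theme": "dark"}
large = {"name": "big collection", "items": ["x" * 1000] * 500}

print(fits_in_dynamodb_item(small))
print(fits_in_dynamodb_item(large))
```

A collection like the second one is roughly half a megabyte, which is why I don't want to depend on a per-record cap.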

We will use DynamoDB for search (see requirements), but that will come later. Right now, we just want to save and read back user settings and collections.

If it hasn’t been clear yet in this series, I’m writing this as I attempt to use Amplify. This is a series about a journey, not a straight-up how-to guide.

Add Storage

Where did we leave off?

$ amplify status

Current Environment: dev

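| Category | Resource name   | Operation | Provider plugin   |
| -------- | --------------- | --------- | ----------------- |
| Hosting  | S3AndCloudFront | No Change | awscloudformation |
| Auth     | sqacauth        | No Change | awscloudformation |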

Hosting endpoint: sqac-amplify-20190817123020-hostingbucket-d..
Hosted UI Endpoint: sqac-dev.auth.us-west-2.amazoncognito.com

I have only hosting and auth. So let’s add storage:

$ amplify add storage
? Please select from one of the below mentioned services Content (Images, audio, video, etc.)
? Please provide a friendly name for your resource that will be used to label this category in the project: storage
? Please provide bucket name: sqac-amplify-user-data
? Who should have access: Auth and guest users
? What kind of access do you want for Authenticated users? create/update, read, delete
? What kind of access do you want for Guest users? read
? Do you want to add a Lambda Trigger for your S3 Bucket? No
Successfully added resource storage locally

Some next steps:
"amplify push" builds all of your local backend resources and provisions them in the cloud
"amplify publish" builds all of your local backend and front-end resources (if you added hosting category) and provisions them in the cloud

I chose content (“Images, audio, video, etc.”) rather than the NoSQL option. This is how you get S3 instead of DynamoDB. I then gave things friendly names rather than accepting the randomly generated defaults.

When exploring the requirements, I stated that I wanted first-time visitors to be able to play with my application without first creating an account, so I gave guests read-only access. Authenticated users have full read-write access.

While we will eventually add a Lambda Trigger to the S3 bucket, this is a big task for later and so I answered No for now. Amplify allows us to use the amplify storage update command to change things later.

Moving on, let’s deploy these changes…

$ amplify push

Current Environment: dev

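| Category | Resource name   | Operation | Provider plugin   |
| -------- | --------------- | --------- | ----------------- |
| Storage  | storage         | Create    | awscloudformation |
| Hosting  | S3AndCloudFront | No Change | awscloudformation |
| Auth     | sqacauth        | No Change | awscloudformation |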

? Are you sure you want to continue? Yes
⠴ Updating resources in the cloud. This may take a few minutes...

… A few dozen lines of CloudFormation output over a few minutes …

✔ All resources are updated in the cloud

Nothing to it. In the AWS Console (web site), I see a new, empty sqac-amplify-user-data-dev bucket. (The -dev is my environment name; it gets appended to everything to support multiple environments. 🥳)

So we’re done, right? Well… that depends.

Security Policies and Parameters

Warning: I’m going to dive deep into AWS policies and CloudFormation here. You can follow along with the files in the part 4 pull request, or just let your eyes glaze over. 😳

I poked my nose into the new CloudFormation stack that defines the storage, including its policies, and noticed it isn’t exactly what I want. I want users to keep their private data at the private access level and shared data at the protected access level, and nothing else. The policies, though, also give users write access to the public area. That is not very useful: anyone could put anything there, and anyone else could modify or delete it. I really don’t want that. Uh oh?

But then I realized I was looking at the stack’s “Parameters” section, and there is a parameters.json file in the amplify folder. 💡 Here is parameters.json:

{  
   "bucketName": "sqac-amplify-user-data",  
   "authPolicyName": "s3_amplify_7405df3b",  
   "unauthPolicyName": "s3_amplify_7405df3b",  
   "authRoleName": {  
       "Ref": "AuthRoleName"  
   },  
   "unauthRoleName": {  
       "Ref": "UnauthRoleName"  
   },  
   "selectedGuestPermissions": [  
       "s3:GetObject",  
       "s3:ListBucket"  
   ],  
   "selectedAuthenticatedPermissions": [  
       "s3:PutObject",  
       "s3:GetObject",  
       "s3:ListBucket",  
       "s3:DeleteObject"  
   ],  
   "s3PermissionsAuthenticatedPublic": "s3:PutObject,s3:GetObject,s3:DeleteObject",  
   "s3PublicPolicy": "Public_policy_9efc80af",  
   "s3PermissionsAuthenticatedUploads": "s3:PutObject",  
   "s3UploadsPolicy": "Uploads_policy_9efc80af",  
   "s3PermissionsAuthenticatedProtected": "s3:PutObject,s3:GetObject,s3:DeleteObject",  
   "s3ProtectedPolicy": "Protected_policy_7b753c06",  
   "s3PermissionsAuthenticatedPrivate": "s3:PutObject,s3:GetObject,s3:DeleteObject",  
   "s3PrivatePolicy": "Private_policy_7b753c06",  
   "AuthenticatedAllowList": "ALLOW",  
   "s3ReadPolicy": "read_policy_9efc80af",  
   "s3PermissionsGuestPublic": "s3:GetObject",  
   "s3PermissionsGuestUploads": "DISALLOW",  
   "GuestAllowList": "ALLOW",  
   "triggerFunction": "NONE"  
}

Cool. Let’s see what I want here…

"s3PermissionsAuthenticatedPublic": "s3:GetObject",

I think that’ll do it. All I did was remove permission to write to the public area. Anyone, including guests, can still read from it. Only data that I manually put in there can exist, so I can use this for the guest “demo mode” content. (I won’t though; read on.)

Nothing in here makes it entirely clear how the protected functionality works, but these are just configuration options, not the actual policies. For those, I dig around in the s3-cloudformation-template.json file that Amplify created. The JSON format is a bit verbose, so I pop open the CloudFormation Designer (GUI) tool in the AWS Console and use it to view selected policies as more concise YAML.

First interesting bit I found:

S3GuestReadPolicy:  
    DependsOn:  
      - S3Bucket  
    Condition: GuestReadAndList  
    Type: 'AWS::IAM::Policy'  
    Properties:  
      PolicyName: !Ref s3ReadPolicy  
      Roles:  
        - !Ref unauthRoleName  
      PolicyDocument:  
        Version: 2012-10-17  
        Statement:  
          - Effect: Allow  
            Action:  
              - 's3:GetObject'  
            Resource:  
              - !Join   
                - ''  
                - - 'arn:aws:s3:::'  
                  - !Ref S3Bucket  
                  - /protected/*  
          - Effect: Allow  
            Action:  
              - 's3:ListBucket'  
            Resource:  
              - !Join   
                - ''  
                - - 'arn:aws:s3:::'  
                  - !Ref S3Bucket  
            Condition:  
              StringLike:  
                's3:prefix':  
                  - public/  
                  - public/*  
                  - protected/  
                  - protected/*

If you haven’t noticed yet, Amplify manages access levels by prefixing the S3 key with public, protected, or private, followed by the authenticated user’s ID. (You can think of these as file system folders, even though technically they are not.) Thus a hypothetical user with Cognito ID 123 will store private data in private/123/ and protected data in protected/123/. Notice here that there is a public/* prefix matching condition. Does that mean that users can only write to their own public area? If so, what is the difference between public and protected? I must dig some more!
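The key layout can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not Amplify's own code: the function name is hypothetical, and leaving the ID segment out of public keys is an assumption based on the /public/* resource that appears further on in the template.

```python
def storage_key(level: str, filename: str, identity_id: str = "") -> str:
    """Build the S3 key for a file at a given access level.

    Assumed layout: public/<file>, protected/<identity-id>/<file>,
    private/<identity-id>/<file>.
    """
    if level == "public":
        return f"public/{filename}"
    if level in ("protected", "private"):
        if not identity_id:
            raise ValueError(f"{level} keys need the user's identity ID")
        return f"{level}/{identity_id}/{filename}"
    raise ValueError(f"unknown access level: {level}")


# The hypothetical user 123 from the paragraph above.
print(storage_key("private", "settings.json", "123"))
print(storage_key("protected", "collection.json", "123"))
print(storage_key("public", "demo.json"))
```

A real Cognito identity ID is a longer string than 123, but the shape of the key is the same.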

This policy does answer a question for me though: a guest user can read protected content, not just public. I may have no need for public at all then.

The S3AuthReadPolicy (for authenticated users, not guests) is similar, but has two more prefixes to allow reads of the user’s private data. Thus authenticated users can read anything except other users’ private sections:

Condition:  
      StringLike:  
        's3:prefix':  
          - public/  
          - public/*  
          - protected/  
          - protected/*  
          - 'private/${cognito-identity.amazonaws.com:sub}/'  
          - 'private/${cognito-identity.amazonaws.com:sub}/*'
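To convince myself how this condition behaves, here is a rough Python simulation. It is a sketch, not how IAM is implemented: the helper names are hypothetical, StringLike is approximated with only the * and ? wildcards, and the policy variable is substituted with a plain string replace.

```python
import re
from typing import Sequence

SUB_VARIABLE = "${cognito-identity.amazonaws.com:sub}"

# The s3:prefix patterns from the S3AuthReadPolicy condition above.
AUTH_PREFIXES = [
    "public/", "public/*", "protected/", "protected/*",
    "private/${cognito-identity.amazonaws.com:sub}/",
    "private/${cognito-identity.amazonaws.com:sub}/*",
]


def string_like(pattern: str, value: str) -> bool:
    """Approximate StringLike: '*' matches any run, '?' any single character."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def may_list(prefix: str, identity_id: str, patterns: Sequence[str]) -> bool:
    """True if a list request for `prefix` satisfies the s3:prefix condition."""
    resolved = [p.replace(SUB_VARIABLE, identity_id) for p in patterns]
    return any(string_like(p, prefix) for p in resolved)


print(may_list("private/123/", "123", AUTH_PREFIXES))          # own private area
print(may_list("private/456/", "123", AUTH_PREFIXES))          # someone else's
print(may_list("protected/456/notes/", "123", AUTH_PREFIXES))  # anyone's protected
```

User 123 can list their own private prefix and anybody's protected prefix, but not private/456/.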

There’s an S3GuestUploadPolicy which would allow guests to upload to an uploads/ prefix, but it is nullified by the default parameter of s3PermissionsGuestUploads being set to DISALLOW. 👍 This leads to the S3AuthUploadPolicy for authenticated users, which lets users dump files into the uploads/ prefix with no way of reading them back. I figure this must be a feature to allow for uploads that are then processed by a triggered Lambda. I have no use for such uploads, and so in parameters.json I set s3PermissionsAuthenticatedUploads to DISALLOW, just like the parameter for guests.

Moving on, I see an S3AuthPublicPolicy which ties to the s3PermissionsAuthenticatedPublic parameter that I already changed to s3:GetObject; thus authenticated users can only read from, not write to, the public area. I see that this applies to anywhere in the public prefix, with no restriction around the user’s ID:

Resource:  
      - !Join   
        - ''  
        - - 'arn:aws:s3:::'  
          - !Ref S3Bucket  
          - /public/*

Contrast that to S3AuthProtectedPolicy, which does restrict write activity to the user’s ID:

Resource:  
       - !Join   
          - ''  
          - - 'arn:aws:s3:::'  
            - !Ref S3Bucket  
            - '/protected/${cognito-identity.amazonaws.com:sub}/*'

The S3AuthPrivatePolicy looks nearly the same, protecting the private folder:

Resource:  
       - !Join   
          - ''  
          - - 'arn:aws:s3:::'  
            - !Ref S3Bucket  
            - '/private/${cognito-identity.amazonaws.com:sub}/*'

So what’s the difference? Scroll back up to S3GuestReadPolicy and S3AuthReadPolicy. While S3AuthProtectedPolicy and S3AuthPrivatePolicy cover what a user can write to, the earlier policies allow anybody to read the protected/* prefix.
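Here is that difference as a small Python model. It is my reading of the policies quoted above under the default parameters, not an authoritative statement of what IAM will decide. The function names are hypothetical, and it only covers the protected/ and private/ prefixes.

```python
from typing import Optional


def can_read(key: str, identity_id: Optional[str]) -> bool:
    """Anyone may read protected/*; private/<id>/* is readable only by its owner.

    identity_id is None for a guest.
    """
    if key.startswith("protected/"):
        return True  # both read policies grant GetObject on protected/*
    if key.startswith("private/"):
        return identity_id is not None and key.startswith(f"private/{identity_id}/")
    return False  # other prefixes are outside this sketch


def can_write(key: str, identity_id: Optional[str]) -> bool:
    """Writes are limited to the caller's own protected/ and private/ areas."""
    if identity_id is None:
        return False  # guests get no write permissions here
    own_areas = (f"protected/{identity_id}/", f"private/{identity_id}/")
    return key.startswith(own_areas)


print(can_read("protected/456/collection.json", None))    # guest reads shared data
print(can_read("private/456/settings.json", "123"))       # not someone else's private
print(can_write("protected/456/collection.json", "123"))  # nor write to their area
```

In other words, protected means "I write, everybody reads" and private means "I write, I read."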

I have long been puzzled as to why Amplify’s documentation didn’t clearly define the access roles: public vs protected vs private. I couldn’t find any explanation in the past, but see that there are now some details here. However, I still find it a bit ambiguous. Now I see some justification for that: it’s up to you! Editing the parameters.json file lets you dictate the behavior. However, that too is not documented. 😔

Based on this quick study, I’ve put together a summary of what I think are the rules. I may be mistaken on some of it though; no guarantees. Assuming you select guest access and read-write for authenticated users, then this is the default behavior and the parameter to change if you wish:

  • uploads/: Authenticated users may upload to this. (s3PermissionsAuthenticatedUploads)
    Guests may not. (s3PermissionsGuestUploads)
  • public/*: Authenticated users may read and write. (s3PermissionsAuthenticatedPublic)
    Guests may read. (s3PermissionsGuestPublic)
  • protected/: Anybody may read. (not configurable)
  • protected/{user-id}/*: Anybody may read. (not configurable)
    The matching authenticated user may write. (s3PermissionsAuthenticatedProtected)
  • private/: No access.
  • private/{user-id}/*: The matching authenticated user may read and write. (s3PermissionsAuthenticatedPrivate)
  • Authenticated users may list contents of any prefix they can read. (AuthenticatedAllowList)
  • Guests may list contents of any prefix they can read. (GuestAllowList)
  • If in the CLI you selected no guest access, then unauthenticated users have no access.

Possible values of these properties are a comma-separated list of any of s3:ListBucket, s3:GetObject, s3:PutObject, s3:DeleteObject or simply DISALLOW.
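Since nothing validates these strings before a push, a small checker can catch typos early. The following Python sketch is my own helper, not part of Amplify. The function name is hypothetical, and it is deliberately stricter than necessary in that it rejects any action not in the list above, including one with different capitalization.

```python
KNOWN_ACTIONS = {"s3:ListBucket", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"}


def parse_permissions(value: str) -> list:
    """Turn a parameter value into a list of actions; DISALLOW means none."""
    if value.strip() == "DISALLOW":
        return []
    actions = [a.strip() for a in value.split(",") if a.strip()]
    unknown = [a for a in actions if a not in KNOWN_ACTIONS]
    if unknown:
        raise ValueError(f"unexpected action(s): {', '.join(unknown)}")
    return actions


print(parse_permissions("s3:PutObject,s3:GetObject,s3:DeleteObject"))
print(parse_permissions("DISALLOW"))
```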

The parameters file also includes selectedGuestPermissions and selectedAuthenticatedPermissions, yet I don’t see them used anywhere in the CloudFormation template. 🤔🤦‍♂️

For SqAC, I modified the parameters to disable the upload and public features entirely, while keeping the default behavior for protected and private.
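I made these edits by hand, but they could be scripted. Below is a minimal Python sketch under a few assumptions: that "disable entirely" means setting the three parameters shown to DISALLOW, that the file keeps the flat key/value shape shown earlier, and that you pass in the path to your own parameters.json. The names OVERRIDES and apply_overrides are hypothetical.

```python
import json
from pathlib import Path

# Assumed values for disabling the uploads and public features.
# s3PermissionsGuestUploads is already DISALLOW by default.
OVERRIDES = {
    "s3PermissionsAuthenticatedUploads": "DISALLOW",
    "s3PermissionsAuthenticatedPublic": "DISALLOW",
    "s3PermissionsGuestPublic": "DISALLOW",
}


def apply_overrides(path: Path, overrides: dict) -> dict:
    """Rewrite parameters.json with the given values and return the result."""
    params = json.loads(path.read_text(encoding="utf-8"))
    # Refuse to add keys the template may not know about.
    missing = [key for key in overrides if key not in params]
    if missing:
        raise KeyError(f"not present in {path.name}: {missing}")
    params.update(overrides)
    path.write_text(json.dumps(params, indent=4) + "\n", encoding="utf-8")
    return params
```

Calling apply_overrides(Path("path/to/parameters.json"), OVERRIDES) and then running amplify push would deploy the change. The path here is a placeholder for wherever the file lives in your project.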

CORS?

The Storage / Using Amazon S3 documentation for Amplify includes this bit telling you to manually configure the CORS policy on your S3 bucket. 😲 This would break Infrastructure as Code (IaC) and easy use of multiple environments! Fortunately, the documentation is a red herring: Amplify has already set the bucket’s CORS policy as documented. (I have submitted a bug report to remove this.)

To be continued…

Find all of this in the part 4 pull request.

I had intended to include updating the client app to use storage as part of this post, but the security policy analysis turned this into a big post already, and the next one is shaping up to be a good bit of work as well.

Coming next time… using Amplify Storage!