These are scripts for working with resources in AWS, including S3, DynamoDB and SQS.
## The individual scripts
# This adds the root of the repo to sys.path, which has cog_helpers.py
import sys
from os.path import abspath, dirname
sys.path.append(abspath(dirname(dirname("."))))
 "name": "dynamols.py",
 print the items in a DynamoDB table, one item per line
 "name": "s3_unfreeze.py",
 takes a list of S3 URIs as input, and either restores those objects from Glacier or reports the status of an in-progress restoration
 "usage": "s3hash.py <S3_URI> [--algorithm=<ALGO>]",
 get the checksum/hash of an object in S3
 list objects from an S3 prefix using the <code>ListObjectsV2</code> API, and print them as JSON to stdout.
 <p><pre><code>$ s3ls s3://wellcomedigitalworkflow-workflow-data
 {"Key": "10009/import/10009_db_export.xml", "LastModified": "2019-12-17T15:11:45+00:00", "ETag": "\"dd51824d2f7f434eba02b84a3ad2d2e0\"", "Size": 36883, "StorageClass": "STANDARD"}
 {"Key": "10009/import/pp_cri_h_5_20_box_91_b18181272_mrc.xml", "LastModified": "2019-09-06T15:17:56+00:00", "ETag": "\"51899c7af2f78bee7a9ee79f358e5b67\"", "Size": 3462, "StorageClass": "STANDARD"}
 {"Key": "10009/taskmanager/2013-09-17_15-54-03_1625/pp_cri_h_5_20_box_91_b18181272_jpg_1378258257168.xml", "LastModified": "2019-09-06T15:18:01+00:00", "ETag": "\"f71d4745ad32863008e158463cdc0bd3\"", "Size": 8279, "StorageClass": "STANDARD"}
 I typically dump the results of this to a file before doing any processing – listing objects from S3 is moderately slow.
 delete objects from an S3 prefix.
 To see a preview of what objects this will delete, use the <code>s3ls.py</code> script – they use the same code to list objects.
 show a tree-like view of objects and folders in an S3 prefix.
 It includes clickable links to folders in the S3 console, so I can dig into the objects in more detail.
 <img src="screenshots/s3tree.png">
 "name": "sqs_stats.py",
 prints a summary of messages visible on our SQS queues.
 The two columns (which are green/red) show messages visible on the main queue and dead-letter queue respectively.
 <img src="screenshots/sqs_stats.png">
cog_helpers.create_description_table(folder_name=folder_name, scripts=scripts)
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/dynamols.py">
 <code>dynamols.py</code>
 print the items in a DynamoDB table, one item per line
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/s3_unfreeze.py">
 <code>s3_unfreeze.py</code>
 takes a list of S3 URIs as input, and either restores those objects from Glacier or reports the status of an in-progress restoration
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/s3hash.py">
 <code>s3hash.py <S3_URI> [--algorithm=<ALGO>]</code>
 get the checksum/hash of an object in S3
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/s3ls.py">
 list objects from an S3 prefix using the <code>ListObjectsV2</code> API, and print them as JSON to stdout.
 <p><pre><code>$ s3ls s3://wellcomedigitalworkflow-workflow-data
 {"Key": "10009/import/10009_db_export.xml", "LastModified": "2019-12-17T15:11:45+00:00", "ETag": ""dd51824d2f7f434eba02b84a3ad2d2e0"", "Size": 36883, "StorageClass": "STANDARD"}
 {"Key": "10009/import/pp_cri_h_5_20_box_91_b18181272_mrc.xml", "LastModified": "2019-09-06T15:17:56+00:00", "ETag": ""51899c7af2f78bee7a9ee79f358e5b67"", "Size": 3462, "StorageClass": "STANDARD"}
 {"Key": "10009/taskmanager/2013-09-17_15-54-03_1625/pp_cri_h_5_20_box_91_b18181272_jpg_1378258257168.xml", "LastModified": "2019-09-06T15:18:01+00:00", "ETag": ""f71d4745ad32863008e158463cdc0bd3"", "Size": 8279, "StorageClass": "STANDARD"}
 I typically dump the results of this to a file before doing any processing – listing objects from S3 is moderately slow.
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/s3rm.py">
 delete objects from an S3 prefix.
 To see a preview of what objects this will delete, use the <code>s3ls.py</code> script – they use the same code to list objects.
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/s3tree.py">
 <code>s3tree.py</code>
 show a tree-like view of objects and folders in an S3 prefix.
 It includes clickable links to folders in the S3 console, so I can dig into the objects in more detail.
 <img src="screenshots/s3tree.png">
 <a href="https://github.com/alexwlchan/scripts/blob/main/aws/sqs_stats.py">
 <code>sqs_stats.py</code>
 prints a summary of messages visible on our SQS queues.
 The two columns (which are green/red) show messages visible on the main queue and dead-letter queue respectively.
 <img src="screenshots/sqs_stats.png">
<!-- [[[end]]] (sum: MTjBzgY4Ng) -->
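
To illustrate the `s3ls.py` behaviour described above, here is a minimal sketch of turning `ListObjectsV2` responses into one JSON object per line. This is not the code from the script itself: the function names `to_json_lines` and `list_s3_objects` are made up for this example, and the second one assumes `boto3` is installed and AWS credentials are configured.

```python
import datetime
import json


def _to_jsonable(value):
    # boto3 returns LastModified as a datetime, which the json module can't
    # serialise directly, so convert it to an ISO 8601 string.
    if isinstance(value, (datetime.datetime, datetime.date)):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def to_json_lines(pages):
    """Yield one JSON string per object from ListObjectsV2-style pages."""
    for page in pages:
        # A page with no matching objects has no "Contents" key.
        for s3_object in page.get("Contents", []):
            yield json.dumps(s3_object, default=_to_jsonable)


def list_s3_objects(bucket, prefix=""):
    """Return an iterable of ListObjectsV2 pages (hypothetical helper)."""
    import boto3  # imported here so to_json_lines works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return paginator.paginate(Bucket=bucket, Prefix=prefix)


# Example usage (needs AWS credentials; the bucket name is a placeholder):
#
#     for line in to_json_lines(list_s3_objects("example-bucket", prefix="10009/")):
#         print(line)
```

Using a paginator matters here because a single `ListObjectsV2` call returns at most 1000 keys, so larger prefixes need several requests.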
## Guessing the right account
Some of the scripts require me to explicitly pick an IAM profile or account; others will guess based on the resource I'm looking at.

* The `sqs_stats` script looks at all the SQS queues in an account, so I need to tell it which account to look at.
* The `download_sqs_messages` script looks at an individual queue, and takes a queue URL as an argument.
  SQS queue URLs include the account ID, so it can pick a suitable IAM role for that account.
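
As a rough sketch of the second case, the account ID can be read out of the path of a queue URL and used to look up a role. The function names, the `ROLE_ARNS` mapping, the account ID and the role name below are all placeholders for illustration, not what the scripts actually use.

```python
from urllib.parse import urlparse


def account_id_from_queue_url(queue_url):
    """Return the 12-digit AWS account ID embedded in an SQS queue URL."""
    # Queue URLs look like
    # https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>
    path_parts = urlparse(queue_url).path.strip("/").split("/")
    if len(path_parts) != 2:
        raise ValueError(f"Unrecognised SQS queue URL: {queue_url!r}")

    account_id = path_parts[0]
    if len(account_id) != 12 or not account_id.isdigit():
        raise ValueError(f"Unrecognised SQS queue URL: {queue_url!r}")
    return account_id


# Hypothetical mapping from account ID to the IAM role to assume.
ROLE_ARNS = {
    "123456789012": "arn:aws:iam::123456789012:role/example-read_only",
}


def guess_role_arn(queue_url, role_arns=ROLE_ARNS):
    """Pick an IAM role ARN based on the account that owns the queue."""
    account_id = account_id_from_queue_url(queue_url)
    try:
        return role_arns[account_id]
    except KeyError:
        raise ValueError(f"No role configured for account {account_id}")
```

A script that works on a whole account, like `sqs_stats`, has no URL to inspect, which is why it needs the account to be passed explicitly.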