AWS Certified Data Analytics(DAS-C01) — Certification Summary
Amazon’s data services can be divided into five categories: data ingestion, storage, processing, analysis and visualization, and security.
This article is part of a series in which each article covers one of the five topics above.
1. Data Ingestion
2. Data Storage
3. Data Processing
4. Analysis and Visualization
5. Security
Security
Cognito
Authenticating users with Cognito and then calling the Kinesis API directly is a reliable and straightforward way to send data from web and mobile apps. Cognito lets you add user sign-up and sign-in to web and mobile apps and scales to millions of users.
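The flow above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `put_record_as_cognito_user` and the `make_kinesis` factory are hypothetical, and `cognito` is assumed to behave like `boto3.client("cognito-identity")`. The clients are passed in so the sketch itself does not need AWS access.

```python
def put_record_as_cognito_user(cognito, make_kinesis, pool_id, logins,
                               stream_name, data, partition_key):
    """Exchange a Cognito login for temporary credentials, then call Kinesis.

    cognito      -- object shaped like boto3.client("cognito-identity") (assumption)
    make_kinesis -- hypothetical factory that builds a Kinesis client from
                    (access_key_id, secret_key, session_token)
    """
    # Step 1: resolve the identity ID for this user in the identity pool.
    identity_id = cognito.get_id(IdentityPoolId=pool_id, Logins=logins)["IdentityId"]

    # Step 2: obtain temporary AWS credentials for that identity.
    creds = cognito.get_credentials_for_identity(
        IdentityId=identity_id, Logins=logins
    )["Credentials"]

    # Step 3: call the Kinesis API directly with the temporary credentials.
    kinesis = make_kinesis(creds["AccessKeyId"], creds["SecretKey"], creds["SessionToken"])
    return kinesis.put_record(
        StreamName=stream_name, Data=data, PartitionKey=partition_key
    )
```

With boto3 installed, `make_kinesis` could be a small lambda that calls `boto3.client("kinesis", aws_access_key_id=..., aws_secret_access_key=..., aws_session_token=...)`.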
Macie
Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect sensitive data on AWS, in particular data stored in S3. Macie alerts can be searched in the AWS Management Console and sent to EventBridge.
Secrets Manager
Secrets Manager makes it easy to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. Secrets Manager lets you secure and manage secrets used to access resources in the AWS Cloud, third-party services, and on-premises.
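Retrieving a secret at runtime might look like the sketch below. It assumes the secret was stored as a JSON string. The function name `load_db_credentials` is hypothetical, and `client` is assumed to behave like `boto3.client("secretsmanager")`.

```python
import json


def load_db_credentials(client, secret_id):
    """Fetch a secret by ID and parse its JSON payload.

    client -- object shaped like boto3.client("secretsmanager") (assumption)
    """
    response = client.get_secret_value(SecretId=secret_id)
    # Text secrets are returned under "SecretString"; binary secrets use
    # "SecretBinary" instead and are not handled in this sketch.
    return json.loads(response["SecretString"])
```

Because the application reads the secret on each call instead of hard-coding it, a rotated credential is picked up without a redeploy.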
IAM & SSO
- Integrate IAM federation to enable SSO to the Amazon Redshift data warehouse, so that Active Directory users can read from and write to the database. To do this, combine IAM authentication with a third-party SAML 2.0 identity provider (IdP) such as AD FS, PingFederate, or Okta. Database users can also be created automatically at their first login, based on corporate permissions.
- IAM with multi-factor authentication (MFA) is the quickest way to secure access to the database.
- Use IAM policies in combination with the key policy.
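The automatic user creation mentioned in the first bullet corresponds to the `AutoCreate` option of the Redshift `GetClusterCredentials` API. A minimal sketch, assuming `client` behaves like `boto3.client("redshift")`; the function name `temporary_redshift_login` is hypothetical and the SAML sign-in itself is out of scope here:

```python
def temporary_redshift_login(client, cluster_id, db_name, db_user):
    """Request short-lived database credentials for an IAM-authenticated user.

    client -- object shaped like boto3.client("redshift") (assumption)
    """
    response = client.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbName=db_name,
        DbUser=db_user,
        AutoCreate=True,      # create the database user if it does not exist yet
        DurationSeconds=900,  # credentials expire after 15 minutes
    )
    # The returned password is temporary; nothing long-lived is stored.
    return response["DbUser"], response["DbPassword"]
```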
How IAM works
- IAM Resource: User, group, policy, and identity provider objects stored in IAM.
- IAM Identities: IAM resource objects used for identification and grouping (users, groups, and roles).
- IAM Entities: IAM resource objects that AWS uses for authentication (IAM users and roles).
- Security Principal: A person or application that uses an AWS account root user, IAM user, or IAM role to sign in and make requests to AWS. Principals include federated users and assumed roles. (REQUEST)
- Authentication: A principal must be authenticated (signed in to AWS) using their credentials to send a request to AWS.
- Authorization: You must also be authorized (allowed) to complete your request. During authorization, AWS uses values from the request context to check for policies that apply to the request. It then uses the policies to determine whether to allow or deny the request. => By default, all requests are denied. (In general, requests made using the AWS account root user credentials for resources in the account are always allowed.) An explicit allow in any permissions policy (identity-based or resource-based) overrides this default.
- Actions and Operations: After your request has been authenticated and authorized, AWS approves the actions or operations in your request. Operations are defined by a service, and include things that you can do to a resource, such as viewing, creating, editing, and deleting that resource.
- Resources: After AWS approves the operations in your request, they can be performed on the related resources within your account. A resource is an object that exists within a service.
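The authorization step can be illustrated with a deliberately simplified evaluator. It shows the default deny and the explicit allow described above, plus the documented IAM rule that an explicit deny overrides any allow. This is a sketch, not how AWS implements evaluation: the function names are hypothetical, and conditions, principals, permissions boundaries, and service control policies are all ignored.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """Return True if value matches any pattern ('*' wildcards supported)."""
    if isinstance(patterns, str):
        patterns = [patterns]
    # Simplification: real IAM action matching is case-insensitive.
    return any(fnmatchcase(value, pattern) for pattern in patterns)


def is_allowed(statements, action, resource):
    """Evaluate policy statements for one request (simplified)."""
    allowed = False  # default: every request starts out denied
    for statement in statements:
        applies = (_matches(statement["Action"], action)
                   and _matches(statement["Resource"], resource))
        if not applies:
            continue
        if statement["Effect"] == "Deny":
            return False  # an explicit deny always wins
        allowed = True    # an explicit allow overrides the default deny
    return allowed
```

For example, a policy that allows `s3:Get*` on a bucket but denies `s3:GetObject` under one prefix permits reads everywhere except that prefix, and still denies any action that no statement mentions.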
AWS KMS — Key Management Service (CMK — Customer Master Key)
ACM (AWS Certificate Manager): SSL/TLS certificates
- provision, manage, and deploy public and private SSL/TLS certificates in AWS
- AWS Certificate Manager is a service that lets you easily provision, manage, and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services and your internal connected resources. SSL/TLS certificates are used to secure network communications and establish the identity of websites over the Internet as well as resources on private networks. AWS Certificate Manager removes the time-consuming manual process of purchasing, uploading, and renewing SSL/TLS certificates.
AWS CloudHSM
A custom key store helps with security. AWS CloudHSM is a cloud-based hardware security module (HSM) that lets you generate your own encryption keys.
- Redshift: Suppose corporate governance policies require that encryption keys be managed through an on-premises hardware security module (HSM). Redshift does not support the newer AWS CloudHSM service; it supports only existing AWS CloudHSM Classic or on-premises HSMs.
- The HSM can be on-premises or AWS CloudHSM. When using HSMs, you must use client and server certificates to establish a trusted connection between Amazon Redshift and the HSM. Amazon Redshift supports only AWS CloudHSM Classic for key management.
- Create a VPC and establish a VPN connection between the VPC and your on-premises network. Create an HSM connection and client certificate to your on-premises HSM. Launch a cluster in a VPC with the option of using an on-premises HSM to store keys.
- Compliance standards require databases containing sensitive data to be protected using hardware security modules (HSMs) that support automatic key rotation. Establish a trusted connection with the HSM using client and server certificates with automatic key rotation. Create a new HSM encrypted Amazon Redshift cluster and migrate your data to the new cluster.
AWS CloudFormation
Governance: Wherever possible, your environment should be deployed via AWS CloudFormation based on your enterprise requirements.
Amazon CloudWatch: Fully managed cron job service (through scheduled CloudWatch Events rules)
- Engineering teams use AWS Glue to process items, AWS Step Functions to orchestrate processes, and Amazon CloudWatch to schedule jobs.
- Create instance group configurations for core and task nodes. Create an automatic scaling policy that scales out a group of instances based on the Amazon CloudWatch YARNMemoryAvailablePercentage metric.
- For Kinesis stream monitoring, use all of these CloudWatch metrics: PutRecord.Success, GetRecords.Success, and PutRecords.Success.
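The scale-out policy on YARNMemoryAvailablePercentage could be written as an EMR automatic scaling policy document, sketched here as a Python dict. The function name, rule name, threshold, cooldown, and capacity values are all illustrative assumptions. The key layout follows my understanding of the EMR automatic scaling policy format and should be checked against the EMR API reference before use.

```python
def scale_out_policy(min_nodes, max_nodes, threshold_percent):
    """Build a scale-out policy that adds one node when YARN memory runs low."""
    return {
        "Constraints": {"MinCapacity": min_nodes, "MaxCapacity": max_nodes},
        "Rules": [
            {
                "Name": "ScaleOutOnLowYarnMemory",  # hypothetical rule name
                "Action": {
                    "SimpleScalingPolicyConfiguration": {
                        "AdjustmentType": "CHANGE_IN_CAPACITY",
                        "ScalingAdjustment": 1,  # add one instance per trigger
                        "CoolDown": 300,         # seconds between scaling actions
                    }
                },
                "Trigger": {
                    "CloudWatchAlarmDefinition": {
                        "ComparisonOperator": "LESS_THAN",
                        "EvaluationPeriods": 1,
                        "MetricName": "YARNMemoryAvailablePercentage",
                        "Namespace": "AWS/ElasticMapReduce",
                        "Period": 300,
                        "Statistic": "AVERAGE",
                        "Threshold": threshold_percent,
                        "Unit": "PERCENT",
                    }
                },
            }
        ],
    }
```

A document like this would be attached to an instance group, for example through the `put_auto_scaling_policy` call of the boto3 EMR client.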
Architecture
Architecture V1
- AWS IAM API calls can be captured by CloudTrail, and a CloudWatch Events rule can then trigger a notification. (CloudWatch and CloudTrail should be in the same Region.)
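The rule in this architecture is driven by an event pattern. The sketch below shows a pattern for IAM API calls recorded by CloudTrail, together with a toy matcher that illustrates how such a pattern selects events. The `matches` function is hypothetical and handles only exact-value lists and nested objects, not the full pattern syntax.

```python
# Event pattern for IAM API calls delivered through CloudTrail.
IAM_API_CALL_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventSource": ["iam.amazonaws.com"]},
}


def matches(pattern, event):
    """Return True if the event satisfies every field in the pattern."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: every nested field must match as well.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the event value must equal one of the listed values.
            return False
    return True
```

A rule with this pattern would typically target an SNS topic to send the notification.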
Architecture V2
- Send data to Amazon Kinesis Data Firehose using a CloudWatch Logs subscription.
- Use AWS Lambda to transform data from a Kinesis Data Firehose delivery stream and enrich it with data from a DynamoDB table.
- Configure Amazon S3 as a Kinesis Data Firehose delivery destination.
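The Lambda transformation step in this pipeline might look like the following sketch. It assumes the records arrive as base64-encoded, gzip-compressed CloudWatch Logs subscription payloads. The `transform` function name and the `lookup` callable are hypothetical; `lookup` stands in for the DynamoDB read so the example runs without AWS access.

```python
import base64
import gzip
import json


def transform(event, lookup):
    """Transform Firehose records that carry CloudWatch Logs payloads.

    lookup -- hypothetical callable that returns enrichment data for a log
              group; in the architecture above it would query DynamoDB.
    """
    output = []
    for record in event["records"]:
        payload = json.loads(gzip.decompress(base64.b64decode(record["data"])))

        # Control messages carry no log data, so they are dropped.
        if payload.get("messageType") != "DATA_MESSAGE":
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue

        lines = []
        for log_event in payload["logEvents"]:
            enriched = {
                "timestamp": log_event["timestamp"],
                "message": log_event["message"],
                "logGroup": payload["logGroup"],
                "extra": lookup(payload["logGroup"]),
            }
            lines.append(json.dumps(enriched))

        # Newline-delimited JSON keeps the objects separable once in S3.
        data = ("\n".join(lines) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("ascii"),
        })
    return {"records": output}
```

In a real Lambda function, the handler would call `transform(event, lookup)` with a `lookup` backed by a DynamoDB table and return the result to Firehose.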