Wednesday, June 3, 2020

Redshift interview questions & answers


Q1) What are the benefits of using AWS Redshift?
Answer:
·         Queries run in parallel across multiple nodes (MPP).
·         Standard PostgreSQL-based SQL with ODBC and JDBC connectivity.
·         Automated backups.
·         Built-in security.
·         Well suited to applications that need analytical (OLAP) functions.
·         Cost-effective compared with traditional data warehousing techniques.
Q2) When should we choose Redshift?
Answer:
When an application has to run analytical queries over large data volumes, Redshift speeds up query response times and cuts down infrastructure cost compared with running your own warehouse.
Q3) What are the common features of Redshift?
Answer:
AWS Redshift is a fully managed, petabyte-scale data warehouse service in AWS. You create a data warehouse as a cluster of nodes (an AWS Redshift cluster), upload your data sets to it, and run queries against them for data analysis.

Q4) What’s the use of redshift?
Answer:
·         Scales up and down on demand
·         Pay only for what you use
Q5) Which database is Redshift using?
Answer: PostgreSQL. Redshift is built on a modified version of PostgreSQL and uses a PostgreSQL-compatible SQL dialect.
Q6) What is Amazon Redshift built for?
Answer:
Amazon Redshift is a fast, fully managed data warehouse service built on Massively Parallel Processing (MPP) technology.
Q7) Can Redshift be used with AWS RDS?
Answer:
Yes, they are separate services that complement each other. AWS RDS manages operational databases (MariaDB, Oracle DB, Amazon Aurora, MySQL and others), while Redshift is for analytics; a common pattern is to load data from an RDS database (for example via a read replica or an ETL job) into Redshift.
Q8) Does AWS Redshift use SQL?
Answer: Yes. Amazon Redshift uses a SQL dialect based on PostgreSQL.
Q9) Is Amazon Redshift based on the concept of a cluster?
Answer: Yes. Amazon Redshift is built from nodes; a group of nodes is called a cluster. A cluster runs the Amazon Redshift engine and contains one or more databases.
Q10) Does AWS Redshift support stored procedures?
Answer: Yes. In addition to tables, views and user-defined functions, Redshift supports stored procedures written in PL/pgSQL.
Q11) Is AWS Redshift one of the RDS engines, or is it a separate relational database?
Answer: Amazon Redshift is a relational (columnar) database based on PostgreSQL, but it is not part of the RDS service; it is a separate, fully managed data warehouse that can be used alongside other RDBMS applications through standard SQL, JDBC and ODBC.
Q12) What is the use of the Amazon Redshift ODBC driver?
Answer: It allows any ODBC-capable application to connect to and query live data in Amazon Redshift.
Q13) Why is AWS Redshift named Redshift?
Answer: The name is commonly understood as a nod to shifting away from Oracle (nicknamed "Big Red" after its trademark colour), i.e. a shift away from red.
Q14) What is the use of the AWS Redshift drivers?
Answer: AWS Redshift provides ODBC (and JDBC) drivers for Linux, Windows and macOS so that client tools can connect to the cluster.
Q15) What's the use of the AWS Redshift cluster?
Answer: An AWS Redshift cluster is a collection of nodes: a leader node plus one or more compute nodes. The cluster runs the Redshift engine and contains one or more databases.
Q16) How does AWS Redshift work?
Answer: Amazon Redshift handles setting up, operating and scaling the data warehouse: it manages backups and updates and monitors the nodes for you.
Q17) What is the difference between s3 and redshift?
Answer: AWS S3 is object-based storage for any kind of file; AWS Redshift is a fast, fully managed, petabyte-scale data warehouse for running analytical SQL queries (often over data staged in S3).
Q18) What is a cluster in the AWS cloud?
Answer: A cluster is a group of similar resources or services managed as one unit. You can create multiple clusters depending on the requirements of your services.
Q19) Does redshift support unstructured data?
Answer: Not in local tables: Redshift, being based on PostgreSQL, stores structured relational data. Semi-structured and unstructured data kept in S3 can, however, be queried through Redshift Spectrum.
Q20) Why is the Redshift distribution key used?
Answer: The distribution key (DISTKEY) determines how a table's rows are distributed across the compute nodes and slices in AWS Redshift; a good choice co-locates rows that are joined together and minimises data movement during queries.
Q21) Does Redshift use primary keys for key-value storage?
Yes
Uses only foreign keys
No
None of the above
Answer: No. Redshift is not a key-value store; primary and foreign keys can be declared but are informational only (they are not enforced and are used mainly by the query planner).
Q22) What is MPP in Redshift?
Answer: Massively Parallel Processing
Q23) Redshift owes its fast query performance partly to its storage technology. Which of the following is it?
key-value
database
row
columnar
Answer: columnar
Q24) AWS Redshift is best suited for which one of the following?
Small queries
complex queries
small data
large and static data
Answer: Complex Queries
Q25) AWS Redshift works on nodes and clusters; which of the following is Redshift based on?
Storage service
database
System Storage
Answer: database
Q26) What kind of applications/services use the Redshift database?
Answer: Amazon Redshift is meant for petabyte-scale warehousing workloads, for example big data analytics and OLAP. Redshift is fully managed and scalable in nature.
Q27) What are the business intelligence tools with which Redshift can be integrated?
Answer: Redshift can be integrated with Tableau, Jaspersoft, MicroStrategy, Cognos, Pentaho and Amazon QuickSight.
Q28) When and why should we use Redshift Spectrum?
Answer: When we need to run SQL queries against structured, semi-structured or unstructured data stored in S3 and join it with our Redshift tables, without loading it into the cluster first.
Q29) What data formats does Redshift Spectrum support?
Answer: Redshift Spectrum currently supports Avro, CSV, Grok, Ion, JSON, ORC, Parquet, RCFile, RegexSerDe, SequenceFile and TextFile.
Q30) Is Redshift similar to RDS?
Ans. Redshift is a heavily modified version of PostgreSQL and is not used for OLTP (online transaction processing), so Redshift is not a replacement for RDS. Redshift is OLAP, online analytical processing, which means it is used for analytics and data warehousing.
Q31) How much better is Redshift's performance compared to other data warehouse technologies?
Answer: Redshift advertises up to ten times better performance than other data warehouse technologies and is designed to scale to around 2 petabytes of data (1 petabyte is 1,000 terabytes, which is a lot of data).
Q32) Is Redshift a row-based storage or columnar based?
Answer: Redshift uses columnar data storage (instead of row-based), which makes it good for analytical processing rather than transactional processing.
PostgreSQL, RDS MySQL and similar engines use row-based storage.
Q33) What is MPP? Does Redshift support MPP?
Answer: MPP stands for massively parallel processing, and yes, Redshift supports it. Query execution is highly distributed: when you run a query, it runs in parallel across many nodes and cores, which is what keeps Redshift fast at scale.
Q34) How do we load data into Redshift?
Answer: Data is loaded from S3, DynamoDB, EMR and from RDS read replicas. For example, when you have an RDS database that you want to analyse, you can create a read replica, pull the data from the replica into Redshift, and run the analytics in Redshift.
Q35) How can we scale a Redshift database?
Answer: Redshift can scale from one node up to 128 nodes; each node has many cores and provides on the order of 160 gigabytes of storage (depending on node type), so overall a cluster provides a lot of space.
Q36) How many types of nodes are supported by Redshift, and what are their functions?
Answer: Redshift has two node types: the leader node and compute nodes.
The leader node plans the queries and aggregates results across all compute nodes; the compute nodes actually execute the query steps and send their results back to the leader. If you have only one node, that node acts as both leader node and compute node.
Q37) What is Redshift Spectrum?
Answer: With Redshift Spectrum you don't need to load the data into Redshift first: you can run queries directly against data in S3, which makes it a great way to do ad hoc queries.
Q38) What is Redshift Enhanced VPC Routing?
Answer: If you enable the Redshift Enhanced VPC Routing feature, all COPY traffic into Redshift (from whatever storage you load) and all UNLOAD traffic from Redshift back to S3 goes through your VPC. This gives you enhanced security, and possibly better performance, since the data does not travel over the public internet.
Q39) What are the important Features of Redshift?
Answer:
·         Operations: similar to RDS.
·         Security: IAM, KMS, VPC, SSL (similar to RDS).
·         Redshift advertises up to 10x the performance of other warehouse services.
·         Redshift is highly available and has an auto-healing feature.
·         Billing is per provisioned node, at roughly 1/10th the cost of comparable data warehouse services.
Q40) What is Redshift used for?
Answer: Redshift is used for business intelligence, analytics and data warehousing.
Q41) How will I be charged and billed if I use Amazon Redshift?
Answer: You pay only for what you use, and there are no minimum or setup fees. Billing commences for a data warehouse cluster as soon as the data warehouse cluster is available. Billing continues until the data warehouse cluster terminates, which would occur upon deletion or in the event of instance failure. You are billed based on:
·         Compute Node Hours
·         Backup Storage
·         Data Transfer
·         Data Scanned
Q42) Is it possible to access Redshift compute Nodes directly?
Answer: No. Your Amazon Redshift compute nodes are in a private network space and can only be accessed from your data warehouse cluster’s leader node. This provides an additional layer of security for your data
Q43) What kind of applications/services use the Redshift database?
Answer: Amazon Redshift is meant for petabyte-scale warehousing workloads, for example big data analytics and OLAP. Redshift is fully managed and scalable in nature.
Q44) What are the business intelligence tools with which Redshift can be integrated?
Answer: Redshift can be integrated with Tableau, Jaspersoft, MicroStrategy, Cognos, Pentaho and Amazon QuickSight.
Q45) When and why should we use Redshift Spectrum?
Answer: When we need to run SQL queries against structured, semi-structured or unstructured data stored in S3 and join it with our Redshift tables, without loading it into the cluster first.
Q46) What data formats does Redshift Spectrum support?
Answer: Redshift Spectrum currently supports Avro, CSV, Grok, Ion, JSON, ORC, Parquet, RCFile, RegexSerDe, SequenceFile and TextFile.
Q47) Select the AWS service which is a NoSQL database, serverless, and offers minimal latency:
·         RDS
·         MYSQL
·         DYNAMO DB
·         REDSHIFT
Answer: DynamoDB
Q48) Can we access the compute nodes of Redshift directly?
Answer:
No. Redshift compute nodes live in a private network space and can only be accessed from the data warehouse cluster's leader node.
Q49) How can we monitor the performance of a Redshift data warehouse cluster?
Answer:
Performance metrics such as compute and storage utilisation and read/write traffic can be monitored via the AWS Management Console or using CloudWatch, as in the sketch below.
Q50) Which SQL query platform are you using to communicate with Redshift?
Answer: Any PostgreSQL-compatible client (psql, SQL Workbench/J, etc.) over JDBC or ODBC, since Redshift speaks the PostgreSQL dialect.

1. What do you know about the Amazon Database?

Answer: Amazon's database offerings are part of Amazon Web Services and include managed relational databases, NoSQL databases, a fully managed petabyte-scale data warehouse, and in-memory caching as a service. There are four main AWS database services, and the user can choose one or several that meet the requirements. The Amazon database services are DynamoDB, RDS, RedShift, and ElastiCache.

2. Explain Amazon Relational Database.

Answer: Amazon Relational Database Service (RDS) helps users set up, operate, and scale an online relational database in the cloud. It automates admin tasks such as database setup, hardware provisioning, backups, and patching. Amazon RDS provides resizable and cost-effective capacity. By automating these tasks it saves time, letting users concentrate on their applications while getting high availability, fast performance, compatibility, and security. There are a number of AWS RDS engines, such as:
·         MySQL
·         Oracle
·         PostgreSQL
·         SQL Server
·         MariaDB
·         Amazon Aurora

3. What are the features of Amazon Database?

Answer: Following are the important features of Amazon Database:
·         Easy to administer
·         Highly scalable
·         Durable and reliable
·         Faster performance
·         Highly available
·         More secure
·         Cost-effective

4. Which of the AWS DB service is a NoSQL database and server-less, and delivers consistent single-digit millisecond latency at any scale?

Answer: Amazon DynamoDB

5. What is the key-value store?

Answer: A key-value store is a database service that facilitates storing, updating, and querying objects identified by a key; the associated values hold the actual content being stored.

AWS Database Interview Questions: DynamoDB

NoSQL is now widely used compared to traditional SQL database services, so demand for professionals with DynamoDB knowledge has increased in the industry. There are many opportunities for professionals aspiring to build a career in database and backend fields. If you have some prior knowledge of DynamoDB and are looking for AWS database interview questions, the following AWS DynamoDB interview questions will help you crack the interview.

6. What is DynamoDB?

Answer: DynamoDB is a NoSQL database service that provides fast and predictable performance. It is fully managed and offers a high level of scalability, so users do not have to worry about configuration, setup, hardware provisioning, throughput capacity, replication, software patching, or cluster scaling; it offloads the work of operating and scaling a distributed database to AWS.

7. List some of the benefits of using Amazon DynamoDB.

Answer: Amazon DynamoDB is the NoSQL service that provides a number of benefits to the users. Some of the benefits of AWS DynamoDB are –
·         Being a fully managed service, DynamoDB doesn’t require experts for setup, installation, or cluster management.
·         It provides fast and predictable performance.
·         It is highly scalable, available, and durable.
·         It provides very high throughput at the low latency.
·         It is highly cost-effective.
·         It supports and allows the creation of dynamic tables with multi-values attributes i.e. it’s flexible in nature.

8. What is a DynamoDB Mapper Class?

Answer: The DynamoDBMapper class (part of the AWS SDK for Java) is the entry point to DynamoDB. It lets users access the DynamoDB endpoint and the data stored in various tables, execute queries, scan tables, and perform CRUD operations on data items.

9. What are the data types supported by DynamoDB?

Answer: DynamoDB supports different types of data types such as collection data types, scalar data types, and even null values.
Scalar Data Types – The scalar data types supported by DynamoDB are:
·         Binary
·         Number
·         Boolean
·         String
Collection Data Types – The collection data types supported by DynamoDB are:
·         Binary Set
·         Number Set
·         String Set
·         Heterogeneous Map
·         Heterogeneous List

10. What do you understand by DynamoDB Auto Scaling?

Answer: DynamoDB Auto Scaling is the feature that automatically scales a table's (or global secondary index's) provisioned read and write capacity up and down in response to actual traffic.

AWS Database Interview Questions: Redshift

With the increase in Amazon Redshift's popularity, it has become a hot topic for interviewers because of its advantages. As digital data accumulates at an incomprehensible speed, enterprises struggle to manage it and are moving their data to the cloud. Here we present some of the best AWS RedShift interview questions that are frequently asked by interviewers.

11. What is a Data Warehouse and how AWS Redshift can play a vital role in the storage?

Answer: A data warehouse can be thought of as a repository where the data generated from the company's systems and other sources is collected and stored. A data warehouse has a three-tier architecture:
·         In the bottom tier, we have the tools that cleanse and collect the data.
·         In the middle tier, we have the tools that transform the data using an Online Analytical Processing (OLAP) server.
·         In the top tier, we have the tools where data analysis and data mining are performed at the front end.
Setting up and managing a data warehouse involves a lot of money as the data in an organization continuously increases and the organization has to continuously upgrade their data storage servers. So here AWS Redshift comes into existence where the companies store their data in the cloud-based warehouses provided by Amazon.

12. What is Amazon Redshift and why is it popular among other cloud data warehouses?

Answer: Amazon Redshift is a fast, scalable data warehouse that is easy to use and cost-effective for managing all of an organization's data. Warehouses range from gigabytes up to hundreds of petabytes of cloud data storage. You don't need to learn a new programming language to use it: you simply launch a cluster and keep using the SQL-based tools you already know.
AWS Redshift is popular due to the following reasons:
1.    AWS Redshift is very easy to use: in the AWS Redshift console you will find an option to create a cluster. Click it, fill in the requested details, and launch the cluster; Redshift then automates most tasks such as managing, monitoring, and scaling.
2.    Scaling the warehouse is very easy: you just resize the cluster by changing the number of compute nodes, as in the sketch after this list.
3.    Redshift gives up to 10x better and faster performance: it uses strategies such as columnar storage and massively parallel processing to deliver high throughput and fast response times.
4.    Economical: because there is no hardware setup, the cost can come down to roughly 1/10th of a traditional data warehouse.

13. What is Redshift Spectrum?

Answer: Redshift Spectrum lets you run queries against petabytes of data in S3, including semi-structured and unstructured data, without having to load it or run ETL first. Spectrum scales out transparently to process queries quickly, and it lets you keep the data wherever you want, in whatever format suits you.

14. What is a leader node and compute node?

Answer: The leader node receives queries from the client application, parses them, and develops the execution plan; it works out the steps needed to process the queries and later sends the final result back to the client application.
The compute nodes execute the steps assigned by the leader node and transmit data among themselves as needed. Their results are sent back to the leader node, which aggregates them before returning them to the client application.

15. How to load data in Amazon Redshift?

Answer: Amazon S3, Amazon DynamoDB, Amazon EMR, AWS Glue, and AWS Data Pipeline are some of the data sources from which you can load data into a Redshift data warehouse. Clients can also connect to Redshift via ODBC or JDBC and issue SQL commands such as INSERT or COPY to load the data.

AWS Database Interview Questions: RDS

Amazon Relational Database Service has become an important service that allows a user to create and operate relational databases. Here we are providing you with Amazon AWS RDS interview questions so that you get an idea of the RDS questions asked in interviews.

16. What is Amazon RDS?

Answer: RDS stands for Relational Database Service by which a user can easily manage and scale a relational database in the cloud. You can focus on your application and business instead of managing the time-consuming data administration works. The user can access his files anywhere on the go with high scalability and cost-effective manner.
The code and applications that you use today with existing databases such as MySQL, MariaDB, Oracle, and SQL Server work efficiently with Amazon RDS. It automatically backs up the database and keeps it updated with the latest version.
(Figure: Multi-AZ architecture)

17. Mention the database engines which are supported by Amazon RDS.

Answer: The database engines that are supported by Amazon RDS are Amazon Aurora, MySQL, MariaDB, Oracle, SQL Server, and PostgreSQL database engine.

18. What is the work of Amazon RDS?

Answer: Amazon RDS is used when a user wants to set up a relational database. It provisions the infrastructure capacity the user requests and installs the database software. Once the database is set up and functional, RDS automates tasks such as patching the software, backing up the data, and managing synchronous data replication with automatic failover.

19. What is the purpose of standby RDS instance?

Answer: The main purpose of a standby RDS instance is to protect against infrastructure failure: it is kept in a different Availability Zone, which is a physically separate and independent infrastructure, so it can take over if the primary fails.

20. Are RDS instances upgradable or downgradable according to the need?

Answer: Yes, you can upgrade RDS instances with the modify-db-instance command. If you are unable to estimate how much CPU you need, start with the db.m1.small DB instance class and monitor CPU utilization with Amazon CloudWatch.

AWS Database Interview Questions: ElastiCache

Caching is a technique for storing frequently used information in a temporary, fast-access location. Amazon ElastiCache is an in-memory caching service provided by Amazon, and it is a favourite topic among interviewers. So, we will have a look at some of the top AWS ElastiCache interview questions for your interview preparation.

21. What is Amazon ElastiCache?

Answer: Amazon ElastiCache is an in-memory key-value store that supports two key-value engines: Redis and Memcached. It is fully managed, requires zero administration, and is hardened by Amazon. With Amazon ElastiCache you can either build a new high-performance application or improve an existing one. ElastiCache is used in fields such as gaming, healthcare, and many others.

22. What is the use of Amazon ElastiCache?

Answer: The performance of web applications can be improved by caching information that is used again and again; in-memory caching makes that information available very quickly. With ElastiCache there is no need to manage a separate caching server: you can easily deploy and run an open-source-compatible in-memory data store with high throughput and low latency.

23. What are the benefits of Amazon ElastiCache?

Answer: There are various benefits of using Amazon ElastiCache some of which are discussed below:
·         The cache node failures are automatically detected and recovered.
·         It can be easily integrated with other AWS so as to provide a high performance and secured in-memory cache.
·         As most of the data is managed by ElastiCache such as setup, configuration, and monitoring so that the user can focus on other high-value applications.
·         Performance is greatly enhanced for applications that require very low response times.
·         The ElastiCache can easily scale itself up or scale down according to the need.

24. What is an ElastiCache cluster?

Answer: A cluster is a collection of nodes. A Memcached cluster's nodes can be spread across multiple Availability Zones, while a Redis cluster (with cluster mode disabled) has a single primary node and does not support data partitioning.
(Figure: ElastiCache cluster)

25. Explain the types of engines in ElastiCache.

Answer: There are two types of engines supported in ElastiCache: Memcached and Redis.
Memcached
It is a popular in-memory data store that developers use as a high-performance cache to speed up applications. By storing data in memory instead of on disk, Memcached can retrieve data in less than a millisecond. It stores everything as key-value pairs, where each key uniquely identifies its record, which lets Memcached find records quickly.
Redis
Today's applications need low latency and high throughput for real-time processing. Because of its performance, simplicity, and capabilities, Redis is favoured by developers. It provides high performance for real-time apps with sub-millisecond latency, supports complex data types (strings, hashes, lists, sets, etc.), and has backup and restore capabilities. While Memcached supports key names and values of up to 1 MB only, Redis supports values of up to 512 MB.

AWS Database Interview Questions for Experienced

If you are an experienced AWS database professional preparing for your next job interview, you need to be well prepared. The interviewer will ask more difficult and scenario-based questions to check your knowledge as well as your experience. So, here we bring some of the top AWS database interview questions for experienced candidates that are frequently asked in Amazon AWS interviews.

26. Can you differentiate DynamoDB, RDS, and RedShift?

Answer: DynamoDB, RDS, and RedShift are the three database management services offered by Amazon. They can be differentiated as follows:
Amazon DynamoDB is the NoSQL database service, which deals with unstructured data. DynamoDB offers a high level of scalability with fast and predictable performance.
Amazon RDS is the database management service for relational databases; it manages upgrading, patching, and backing up of the database without your intervention. RDS is solely a database management service for structured data.
Amazon RedShift is quite different from RDS and DynamoDB: RedShift is a data warehouse product used in data analysis.
Feature comparison:
·         Primary usage: Amazon DynamoDB - database for dynamically modified unstructured data; Amazon RDS - conventional databases; Amazon RedShift - data warehouse.
·         Computing resources: DynamoDB - not specified (SaaS, Software-as-a-Service); RDS - instances with 64 vCPUs and 244 GB RAM; RedShift - nodes with vCPUs and 244 GB RAM.
·         Database engine: DynamoDB - NoSQL; RDS - MySQL, SQL Server, Oracle, Aurora, PostgreSQL, MariaDB; RedShift - RedShift.
·         Maintenance window: DynamoDB - no impact; RDS - 30 minutes every week; RedShift - 30 minutes every week.
·         Multi-AZ replication: DynamoDB - built-in; RDS - additional service; RedShift - manual.

27. Is it possible to run multiple DB instances for free for Amazon RDS?

Answer: Yes, it is possible to run more than one Single-AZ micro DB instance for Amazon RDS and that’s for free. However, if the usage exceeds 750 instance hours across all the RDS Single-AZ micro DB instances, billing will be done at the standard Amazon RDS pricing across all the regions and database engines.
For example, consider we are running 2 Single-AZ micro DB instances for 400 hours each in one month only, the accumulated usage will be 800 instance hours from which 750 instance hours will be free. In this case, you will be billed for the remaining 50 hours at the standard pricing of Amazon RDS.

28. Which AWS services will you choose for collecting and processing e-commerce data for real-time analysis?

Answer: I'll use DynamoDB for collecting e-commerce data for real-time processing. DynamoDB is a fully managed NoSQL database service that can handle any type of unstructured data, including data taken from e-commerce websites. Analysis can then be performed on this data using RedShift. Elastic MapReduce could also be used for analysis, but we avoid it here since near real-time analysis is required.

29. What will happen to the dB snapshots and backups if any user deletes dB instance?

Answer: When a DB instance is deleted, the user is offered the option of taking a final DB snapshot; if you take it, you can later restore your data from that snapshot. AWS RDS keeps this final snapshot together with all other manually created DB snapshots after the DB instance is deleted. At the same time, automated backups are deleted, while manually created DB snapshots are preserved.

30. When will you prefer to use Provisioned IOPS over normal RDS storage?

Answer: Provisioned IOPS delivers high I/O rates for I/O-intensive, batch-oriented workloads, but it is also more expensive. Steady workloads that can make full use of standard storage don't need it, so normal RDS storage is good enough for those. In short, we should prefer Provisioned IOPS over normal RDS storage for batch-oriented, I/O-heavy workloads.
1. Compare between AWS and OpenStack.
·         License: AWS - Amazon proprietary; OpenStack - open source.
·         Operating system: AWS - whatever AMIs AWS provides; OpenStack - whatever the cloud administrator provides.
·         Performing repeatable operations: AWS - through templates; OpenStack - through text files.
2. What is AWS?
AWS (Amazon Web Services) is a platform that provides secure cloud services: database storage, compute power, content delivery, and other functionality that helps businesses scale and grow.

3. What is the importance of buffer in Amazon Web Services?
An Elastic Load Balancer ensures that incoming traffic is distributed optimally across various AWS instances. A buffer synchronizes different components and makes the architecture more elastic to bursts of load or traffic. Without it, components may receive and process requests at unstable, mismatched rates; the buffer creates an equilibrium between the components so that they work at a similar pace and deliver faster service.

4. How are Spot Instance, On-demand Instance, and Reserved Instance different from one another?
Spot Instances and On-demand Instances are both pricing models, while Reserved Instances are purchased for a one- or three-year term at a discounted rate.
·         Spot Instances let customers purchase compute capacity with no upfront commitment at all; On-demand Instances can be launched by users at any time, based on demand.
·         Spot Instances are spare Amazon capacity that you bid for; On-demand Instances are suitable for the high-availability needs of applications.
·         A Spot Instance is launched automatically when your bid exceeds the current spot price, which fluctuates with supply and demand; On-demand Instances are launched by users with the pay-as-you-go model.
·         A Spot Instance is taken away by Amazon as soon as the spot price rises above your bid; On-demand Instances remain persistent without any automatic termination from Amazon.
·         Spot Instances are charged on an hourly basis; On-demand Instances are charged on a per-second basis.

5. Your organization has decided to have all their workload on the public cloud. But, due to certain security concerns, your organization decides to distribute some of the workload on private servers. You are asked to suggest a cloud architecture for your organization. What will be your suggestion?
A hybrid cloud. The hybrid cloud architecture lets an organization use the public cloud for shared resources and the private cloud for its confidential workloads.

6. The data on the root volumes of instance store-backed and EBS-backed instances gets deleted by default when they are terminated. If you want to prevent that from happening, which instance would you use?
EBS-backed instances. EBS-backed instances use EBS volume as their root volume. EBS volume consists of virtual drives that can be easily backed up and duplicated by snapshots. The biggest advantage of EBS-backed volumes is that the data can be configured to be stored for later retrieval even if the virtual machine or the instances are shut down.

7. Which one of the storage solutions offered by AWS would you use if you need extremely low pricing and data archiving?
Amazon Glacier. AWS Glacier is an extremely low-cost storage service offered by Amazon that is used for data archiving and backup purposes. The longer you store data in Glacier, the lesser it will cost you.

8. You have connected four instances to ELB. To automatically terminate your unhealthy instances and replace them with new ones, which functionality would you use?
Auto-scaling groups

9. How will you configure an Amazon S3 bucket to serve static assets for your public web application?
By configuring the bucket policy to provide public read access to all objects

10. Your organization wants to send and receive compliance emails to its clients using its own email address and domain. What service would you suggest for achieving the same in an easy and cost-effective way?
Amazon Simple Email Service (Amazon SES), which is a cloud-based email sending service, can be used for this purpose.

11. Can you launch Amazon Elastic Compute Cloud (EC2) instances with predetermined private IP addresses? If yes, then with which Amazon service it is possible?
Yes. It is possible by using VPC (Virtual Private Cloud).
12. Why do we make subnets?
Creating subnets means dividing a large network into smaller ones. Subnets can be created for several reasons; for example, they help reduce congestion by ensuring that traffic destined for a subnet stays within that subnet. This makes routing of traffic coming into the network more efficient and reduces the network's load.

13. If you launched a standby RDS, will it be launched in the same availability zone as your primary?
No, standby instances are automatically launched in different availability zones than the primary, making them physically independent infrastructures. This is because the whole purpose of standby instances is to prevent infrastructure failure. So, in case the primary goes down, the standby instance will help recover all of the data.
14. Which of the following is a global Content Delivery Network service that securely delivers data to users with low latency and high transfer speed.
Amazon CloudFront

15. Which Amazon solution will you use if you want to accelerate moving petabytes of data in and out of AWS, using storage devices that are designed to be secure for data transfer?
Amazon Snowball. AWS Snowball is the data transport solution for large amounts of data that need to be moved into and out of AWS using physical storage devices.
16. If you are running your DB instance as Multi-AZ deployment, can you use standby DB instances along with your primary DB instance?
No, the standby DB instance cannot be used along with the primary DB instances since the standby DB instances are supposed to be used only if the primary instance goes down.

17. Your organization is developing a new multi-tier web application in AWS. Being a fairly new and small organization, there’s limited staff. But, the organization requires high availability. This new application comprises complex queries and table joins. Which Amazon service will be the best solution for your organization’s requirements?
Amazon Aurora (or a Multi-AZ RDS deployment) would be the right choice here: it is fully managed, which suits a small team, offers high availability, and, being a relational database, handles complex queries and table joins well; DynamoDB, by contrast, does not support joins.

18. Your organization is using DynamoDB for its application. This application collects data from its users every 10 minutes and stores it in DynamoDB. Then every day, after a particular time interval, the data (respective to each user) is extracted from DynamoDB and sent to S3. Then, the application visualizes this data to the users. You are asked to propose a solution to help optimize the backend of the application for latency at lower cost. What would you recommend?
ElastiCache. Amazon ElastiCache is a caching solution offered by Amazon. Frequently read data (for example the per-user results pulled from DynamoDB) can be kept in an in-memory cache, so most requests are served from the cache instead of hitting DynamoDB or S3, which reduces both latency and backend cost.

19. You accidentally stopped an EC2 instance in a VPC with an associated Elastic IP. If you start the instance again, what will be the result?
Data held in instance memory and on any instance store volumes will be lost, but data on EBS volumes persists. The Elastic IP remains associated with the instance, since in a VPC an Elastic IP is disassociated only when the instance is terminated.

20. Your organization has around 50 IAM users. Now, it wants to introduce a new policy that will affect the access permissions of an IAM user. How can it implement this without having to apply the policy at the individual user level?
It is possible using IAM groups, by adding users in the groups as per their roles and by simply applying the policy to the groups.

21. I created a web application with autoscaling. I observed that the traffic on my application is the highest on Wednesdays and Fridays between 9 AM and 7 PM. What would be the best solution for me to handle the scaling?
Configure a scheduled scaling policy in Auto Scaling that follows the predictable traffic pattern, scaling out before 9 AM and back in after 7 PM on Wednesdays and Fridays.

22. How would you handle a situation where the relational database engine crashes often whenever the traffic to your RDS instances increases, given that the replica of RDS instance is not promoted as the master instance?
Opt for a bigger RDS instance type to handle the larger amounts of traffic, and create manual or automated snapshots so that data can be recovered if the RDS instance goes down.

23. Is there a way to upload a file that is greater than 100 megabytes in Amazon S3?
Yes, it is possible by using multipart upload utility from AWS. With multipart upload utility, larger files can be uploaded in multiple parts that are uploaded independently. You can also decrease upload time by uploading these parts in parallel. After the upload is done, the parts will be merged into a single object or file to create the original file from which the parts were created.

24. Suppose, you hosted an application on AWS that lets the users render images and do some general computing. Which of the below listed services can you use to route the incoming user traffic?

·         Classic Load Balancer
·         Application Load Balancer
·         Network Load balancer
Application Load Balancer: it supports path-based routing of traffic and hence helps enhance the performance of an application structured as smaller services. Using an Application Load Balancer, traffic can be routed based on the requests made. In this scenario, requests for rendering images can be directed to the servers deployed only for rendering images, while requests for computing can be directed to the servers deployed only for general computing.

25. You have an application running on your Amazon EC2 instance. You want to reduce the load on your instance as soon as the CPU utilization reaches 100 percent. How will you do that?
It can be done by creating an Auto Scaling group that launches more instances when CPU utilization crosses a high threshold (approaching 100 percent), and by creating a load balancer, registering the Amazon EC2 instances with it, and letting it distribute the traffic among them.

26. What would I have to do if I want to access Amazon Simple Storage buckets and use the information for access audits?
AWS CloudTrail can be used in this case as it is designed for logging and tracking API calls, and it has also been made available for storage solutions.

27. I created a key in North Virginia region to encrypt my data in Oregon region. I also added three users to the key and an external AWS account. Then, to encrypt an object in S3, when I tried to use the same key, it was not listed. Where did I go wrong?
The data and the key should be in the same region. That is, the data that has to be encrypted should be in the same region as the one in which the key was created. In this case, the data is in Oregon region, whereas the key is created in North Virginia region.

28. Suppose, I created a subnet and launched an EC2 instance in the subnet with default settings. Which of the following options will be ready to use on the EC2 instance as soon as it is launched?

·         Elastic IP
·         Private IP
·         Public IP
·         Internet Gateway
Private IP. A private IP is automatically assigned to the instance as soon as it is launched. An Elastic IP has to be attached manually, and a public IP needs an Internet Gateway, which would still have to be created and attached since this is a new VPC.

29. Your organization has four instances for production and another four for testing. You are asked to set up a group of IAM users that can only access the four production instances and not the other four testing instances. How will you achieve this?
We can achieve this by defining tags on the test and production instances and then adding a condition to the IAM policy that allows access to specific tags.

30. What is the maximum number of S3 buckets you can create?

·         50
·         20
·         70
·         100
100

31. Your organization wants to monitor the read and write IOPS for its AWS MySQL RDS instance and then send real-time alerts to its internal operations team. Which service offered by Amazon can help your organization achieve this scenario?
Amazon CloudWatch would help us achieve this. Since Amazon CloudWatch is a monitoring tool offered by Amazon, it’s the right service to use in the above-mentioned scenario.

32. Which of the following services can be used if you want to capture client connection information from your load balancer at a particular time interval?

·         Enabling access logs on your load balancer
·         Enabling CloudTrail for your load balancer
·         Enabling CloudWatch metrics for your load balancer
Enabling access logs on your load balancer. ELB access logs capture detailed information about the client connections made to the load balancer (client IP address, request time, latencies, request paths) and deliver it to S3 at the interval you specify. CloudTrail, by contrast, records API calls made against the load balancer, not client connection traffic.

33. You have created a VPC with private and public subnets. In what kind of subnet would you launch the database servers?
Database servers should be ideally launched in private subnets. Private subnets are ideal for the backend services and databases of all applications since they are not meant to be accessed by the users of the applications, and private subnets are not routable from the Internet.

34. Is it possible to switch from an Instance-backed root volume to an EBS-backed root volume at any time?
No, it is not possible.

35. How can you save the data on root volume on an EBS-backed machine?
By setting the DeleteOnTermination attribute of the root EBS volume to false, i.e. overriding the default terminate behaviour.

36. When should you use the classic load balancer and the application load balancer?
The Classic Load Balancer is used for simple load balancing of traffic across multiple EC2 instances, while the Application Load Balancer is used for more intelligent load balancing based on the multi-tier or container-based architecture of the application. Application load balancing is mostly used when there is a need to route traffic to multiple services.

37. Can you change the instance type of the instances that are running in your application tier and are also using autoscaling? If yes, then how? (Choose one of the following)

·         Yes, by modifying autoscaling launch configuration
·         Yes, by modifying autoscaling tags configuration
·         Yes, by modifying autoscaling policy configuration
·         No, it cannot be changed
Yes, the instance type of such instances can be changed by modifying autoscaling launch configuration. The tags configuration is used to add metadata to the instances.

38. Can you name the additional network interface that can be created and attached to your Amazon EC2 instance launched in your VPC?
Elastic Network Interface

39. Out of the following options, where does the user specify the maximum number of instances with the autoscaling commands?

·         Autoscaling policy configuration
·         Autoscaling group
·         Autoscaling tags configuration
·         Autoscaling launch configuration
Autoscaling group. The minimum, maximum, and desired number of instances are properties of the Auto Scaling group; the launch configuration only defines what each launched instance looks like (AMI, instance type, and so on).

40. Which service provided by AWS can you use to transfer objects from your data center, when you are using Amazon CloudFront?
AWS Direct Connect. It is a network service that acts as an alternative to using the Internet to connect customers' on-premises sites with AWS.

41. You have deployed multiple EC2 instances across multiple availability zones to run your website. You have also deployed a Multi-AZ RDS MySQL Extra Large DB Instance. The site performs a high number of small read and write operations per second. After some time, you observed that there is read contention on RDS MySQL. What would be your approach to resolve the contention and optimize your website?
We can deploy an ElastiCache in-memory cache in every Availability Zone; this creates a cached copy of frequently read website data for faster access in each zone. We can also add an RDS MySQL read replica in each Availability Zone for better and more efficient read performance, so there will not be any increased read workload on the primary RDS MySQL instance, resolving the contention issue.

42. Your company wants you to propose a solution so that the company’s data center can be connected to Amazon cloud network. What would be your proposal?
The data center can be connected to Amazon cloud network by establishing a virtual private network (VPN) between the VPC and the data center. Virtual private network lets you establish a secure pathway or tunnel from your premise or device to AWS global network.

43. Which of the following Amazon Services would you choose if you want complex querying capabilities but not a whole data warehouse?

·         RDS
·         Redshift
·         ElastiCache
·         DynamoDB
Amazon RDS

44. You want to modify the security group rules while it is being used by multiple EC2 instances. Will you be able to do that? If yes, will the new rules be implemented on all previously running EC2 instances that were using that security group?
Yes, the security group that is being used by multiple EC2 instances can be modified. The changes will be implemented immediately and be applied to all the previously running EC2 instances without restarting the instances

45. Which one of the following is a structured data store that supports indexing and data queries to both EC2 and S3?

·         DynamoDB
·         MySQL
·         Aurora
·         SimpleDB
SimpleDB

46. How many total VPCs per account/region and subnets per VPC can you have?

·         4, 100
·         7, 40
·         5, 200
·         3, 150
5, 200

47. Which service offered by Amazon will you choose if you want to collect and process e-commerce data for near real-time analysis? (Choose any two)

·         DynamoDB
·         Redshift
·         Aurora
·         SimpleDB
DynamoDB and Redshift. DynamoDB is a fully managed NoSQL database service that can be fed any type of unstructured data; hence, it is the most apt choice for collecting data from e-commerce websites. For near real-time analysis, we can use Amazon Redshift.

48. If in CloudFront the content is not present at an edge location, what will happen when a request is made for that content?
CloudFront will deliver the content directly from the origin server. It will also store the content in the cache of the edge location where the content was missing.

49. Can you change the private IP address of an EC2 instance while it is in running or in a stopped state?
No, it cannot be changed. When an EC2 instance is launched, a private IP address is assigned to that instance at the boot time. This private IP address is attached to the instance for its entire lifetime and can never be changed.

50. Which of the following options will you use if you have to move data over long distances using the Internet, from instances that are spread across countries to your Amazon S3 bucket?

·         Amazon CloudFront
·         Amazon Transfer Acceleration
·         Amazon Snowball
·         Amazon Glacier
Amazon Transfer Acceleration. It speeds up data transfer by up to 300 percent using optimized network paths and Amazon CloudFront's globally distributed edge locations. Snowball cannot be used here, as that service does not support cross-region data transfer.

51. Which of the following services is a data storage system that also has REST API interface and uses secure HMAC-SHA1 authentication keys?

·         Amazon Elastic Block Store
·         Amazon Snapshot
·         Amazon S3
Amazon S3. It gets various requests from applications, and it has to identify which requests are to be allowed and which to be denied. Amazon S3 REST API uses a custom HTTP scheme based on a keyed HMAC for authentication of requests.

52. What kind of IP address can you use for your customer gateway (CGW) address?
We can use an Internet-routable IP address, i.e. the public IP address of your NAT device.

53. Which of the following is not an option in security groups?

·         List of users
·         Ports
·         IP addresses
·         List of protocols
List of users. Security group rules consist of protocols, ports, and IP addresses (CIDR ranges); they do not reference individual users.

