I normally start a cluster from the UI, so I decided to post how to create one from the CLI. This assumes the AWS CLI is installed and configured on your machine.

The command you want to use is aws emr create-cluster. You will have to decide which EMR release you want, which applications to include, the instance type, the number of instances, whether you are using instance fleets (which I am here), and so on. Once you have chosen all that, open a notepad and build the command. Mine looks like this:

aws emr create-cluster \
--applications Name=Hadoop Name=Hive Name=Tez \
--tags 'Project=Covid-19 Analysis' 'region=us' 'Contact=Raju Pillai' 'Name=Covid-19 Analysis' \
--ec2-attributes '{
}' \
--release-label emr-6.1.0 \
--log-uri 's3n://raju-datalake-emr/logs/' \
--configurations '[
]' \
--instance-fleets '[
"Name":"Master - 1"
"Name":"Core - 2"
"Name":"Task - 3"
]' \
--bootstrap-actions '[
"Name":"Copy Scripts"
]' \
--ebs-root-volume-size 50 \
--service-role EMR_DefaultRole \
--enable-debugging \
--name 'Covid-19-EMR6.1-AutoScale' \
--scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
--region us-east-1
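
A long command like the one above is easy to mistype, so it can help to assemble the simple flags first and print the result before anything is sent to AWS. What follows is only a minimal sketch under a few assumptions: the variable names and the DRY_RUN switch are made up for illustration (they are not AWS CLI features), and the JSON options (--ec2-attributes, --configurations, --instance-fleets, --bootstrap-actions) and --tags are left out, so they would have to be added back before a real run.

```shell
#!/bin/sh
# Sketch only: DRY_RUN and the variable names are illustrative assumptions,
# not part of the AWS CLI.
RELEASE_LABEL="emr-6.1.0"
REGION="us-east-1"
CLUSTER_NAME="Covid-19-EMR6.1-AutoScale"
LOG_URI="s3n://raju-datalake-emr/logs/"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only (default), 0 = run it

# Store the command as the positional parameters so that every argument
# stays a separate word when it is eventually executed.
set -- aws emr create-cluster \
  --applications Name=Hadoop Name=Hive Name=Tez \
  --release-label "$RELEASE_LABEL" \
  --log-uri "$LOG_URI" \
  --ebs-root-volume-size 50 \
  --service-role EMR_DefaultRole \
  --enable-debugging \
  --name "$CLUSTER_NAME" \
  --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
  --region "$REGION"

CMD_PREVIEW="$*"   # space-joined copy, used for display only

if [ "$DRY_RUN" = "1" ]; then
  echo "$CMD_PREVIEW"
else
  "$@"
fi
```

Run as-is, the script only prints the assembled command. Setting DRY_RUN=0 would execute it, which makes sense only once the omitted JSON options are back in place.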

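Inline JSON is the part of the command that is hardest to keep readable. The AWS CLI can also read an option value from a local file when it is given with a file:// prefix, so the fleet definitions could live in their own file. The sketch below is hypothetical: the three fleet names come from the command above, but the instance type, the capacities, and the file location are placeholders chosen for illustration, not the values I actually used.

```shell
#!/bin/sh
# Hypothetical fleet definitions: the instance type, the capacities and the
# file path are placeholders, not the values from the original command.
FLEETS_FILE="${TMPDIR:-/tmp}/instance-fleets.json"

cat > "$FLEETS_FILE" <<'EOF'
[
  {
    "Name": "Master - 1",
    "InstanceFleetType": "MASTER",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}]
  },
  {
    "Name": "Core - 2",
    "InstanceFleetType": "CORE",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}]
  },
  {
    "Name": "Task - 3",
    "InstanceFleetType": "TASK",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}]
  }
]
EOF

# The option as it would appear in the create-cluster command.
FLEETS_ARG="file://$FLEETS_FILE"
echo "--instance-fleets $FLEETS_ARG"
```

The printed option would replace the inline --instance-fleets '[ ... ]' part of the command. The same file:// approach should work for the other JSON options as well.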
