Setting Up a Python Environment on AWS for Machine Learning


Conda Environment

  1. Let’s create a conda environment on the local machine first; then we’ll discuss ways to move this environment to AWS EC2 (a Linux machine).
##Downloading (you can pick a different edition by visiting anaconda.com)
$ wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
#Installing
$ sh Anaconda3-2020.02-Linux-x86_64.sh
##Making a virtual environment for a specific project (recommended)
#For a specific Python version
$ conda create -n your_environment_name python=3.7
#Activate your environment
$ conda activate your_environment_name
#Installing three libs
$ pip install transformers allennlp flask
  2. EC2 with Internet: In this case, we first record the installed packages and their versions in a file called requirements.txt, and then push that file to the EC2 server.
#Freezing the env info
$ pip freeze > requirements.txt
OR
#Use the pipreqs lib to create the requirements.txt file
$ pip install pipreqs
$ pipreqs /home/project/location
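##Run on EC2, in the directory where requirements.txt is located
$ conda create --name your_ec2_environment_name --file requirements.txt
#If conda cannot resolve the pip-style entries, create a plain env, activate it, and run: pip install -r requirements.txt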
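The step above says to push requirements.txt to the EC2 server without showing the transfer itself. Here is a minimal sketch using scp, assuming you connect with an SSH key pair. The key path, user name, and host are hypothetical placeholders, not values from this article.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders: replace with your own key file and instance address.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/my-key.pem}"
EC2_TARGET="${EC2_TARGET:-ec2-user@your-ec2-host}"

# Copy a local requirements file into the remote user's home directory.
# $1 = path to the requirements file on the local machine
push_requirements() {
    scp -i "$KEY_FILE" "$1" "${EC2_TARGET}:requirements.txt"
}

# Usage (run from the project directory):
#   push_requirements requirements.txt
```

Once the file is on the instance, log in over SSH and run the install commands there.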
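  3. EC2 without Internet: Here the packages have to be bundled on a machine that has internet access and then copied to the EC2 server. Two methods follow.
##Method-1: Wrapping up all .whl files on local
On local/source machine:
#Download all .whl files into a dir named 'dir_name'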
1. $ mkdir dir_name && pip download -r requirements.txt -d dir_name
2. Copy requirements.txt into dir_name
3. Archive it: $ tar -zcf dir_name.tar.gz dir_name
4. Upload this archive to the target machine
On EC2/target machine:
1. Unpack: $ tar -zxf dir_name.tar.gz
2. Create a plain conda env here and activate it.
3. Install: $ pip install -r dir_name/requirements.txt --no-index --find-links dir_name
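The source-machine steps of Method-1 can be collected into one small script. This is a sketch, not the only way to do it: the function names are made up for illustration, and it assumes the source machine has the same platform and Python version as the EC2 target, since pip download fetches files that match the machine it runs on.

```shell
#!/usr/bin/env bash
# Build an offline bundle: download packages, add the requirements file, archive.
# $1 = requirements file, $2 = bundle directory name (e.g. dir_name)
build_offline_bundle() {
    mkdir -p "$2" &&
    pip download -r "$1" -d "$2" &&
    cp "$1" "$2/requirements.txt" &&
    tar -zcf "$2.tar.gz" "$2"
}

# List what ended up in the archive, as a quick check before uploading.
# $1 = bundle directory name
list_bundle() {
    tar -ztf "$1.tar.gz"
}

# Usage:
#   build_offline_bundle requirements.txt dir_name && list_bundle dir_name
```

Chaining the steps with && stops the script at the first failure, so a failed download never produces a half-empty archive.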
##Method-2: Using conda-pack
On local/source machine:
1. Install conda-pack:
$ pip install conda-pack
2. Pack environment your_environment_name into your_environment_name.tar.gz:
$ conda pack -n your_environment_name
On EC2/target machine:
1. Unpack the environment into dir 'your_environment_name':
$ mkdir -p your_environment_name
$ tar -xzf your_environment_name.tar.gz -C your_environment_name
2. Activate the environment:
$ source your_environment_name/bin/activate
3. Clean up path prefixes from inside the active environment (a conda-pack step):
$ conda-unpack
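Before activating, it can help to confirm that the archive unpacked where you expected. The sketch below assumes the unpacked directory contains bin/activate and bin/conda-unpack, as conda-pack archives normally do. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 when the directory contains the scripts conda-pack adds to an environment.
# $1 = directory the archive was unpacked into
looks_like_packed_env() {
    [ -f "$1/bin/activate" ] && [ -f "$1/bin/conda-unpack" ]
}

# Activate the unpacked environment only if the check passes.
# $1 = directory the archive was unpacked into
activate_packed_env() {
    if looks_like_packed_env "$1"; then
        . "$1/bin/activate"
    else
        echo "not an unpacked conda-pack environment: $1" >&2
        return 1
    fi
}

# Usage:
#   activate_packed_env your_environment_name
```

A failed check usually means the tar command was run without -C, so the files landed in the current directory instead.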

Python Virtual Environment

As you know, conda provides a complete suite of data science libraries, but it has one drawback: it takes up a lot of disk space. In production, engineers try to keep the Python environment as minimal as possible. So if you prefer a Python virtual environment over conda, let’s see how to manage the same workflow with it.

  1. Let’s set up a Python virtual environment on the local machine first, named ‘your_environment_name’.
#Installing pip (Debian/Ubuntu; use pip3 below if your system has no plain pip command)
$ sudo apt-get install python3-pip
#Creating python virtualenv
$ pip install virtualenv
$ virtualenv your_environment_name
#Or, to build it on a specific Python interpreter
$ virtualenv -p /usr/bin/python3 your_environment_name
#Activating
$ source your_environment_name/bin/activate
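To confirm that activation worked before installing anything, you can check the VIRTUAL_ENV variable, which the activate script exports. This is a small sketch, and the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Print the name of the active virtualenv, or fail when none is active.
# Relies on VIRTUAL_ENV, which the activate script exports.
active_virtualenv() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        basename "$VIRTUAL_ENV"
    else
        echo "no virtualenv is active" >&2
        return 1
    fi
}

# Usage:
#   active_virtualenv && pip install transformers allennlp flask
```

Guarding pip install this way avoids installing packages into the system Python by accident.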
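On EC2/target machine (without internet, using the dir_name.tar.gz archive of downloaded packages):
1. Unpack: $ tar -zxf dir_name.tar.gz
2. Create a virtualenv here and activate it.
3. Install: $ pip install -r dir_name/requirements.txt --no-index --find-links dir_name
##If the EC2 machine has internet access, install with the plain pip command
$ pip install -r requirements.txt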
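The two install routes on the target machine can be folded into one helper that picks the offline route whenever a bundle directory is present. This is a sketch under that assumption. The function name is hypothetical, and the check is only a simple file test.

```shell
#!/usr/bin/env bash
# Install from an offline bundle when one exists, otherwise from the package index.
# $1 = bundle directory name (e.g. dir_name)
install_requirements() {
    if [ -f "$1/requirements.txt" ]; then
        # Offline: use only the files shipped in the bundle.
        pip install -r "$1/requirements.txt" --no-index --find-links "$1"
    else
        # Online: let pip fetch packages from the index.
        pip install -r requirements.txt
    fi
}

# Usage (inside the activated environment):
#   install_requirements dir_name
```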
