As a data scientist, do you ever feel that managing a Python or conda environment across dev and prod is a real pain in the ass? If that's a hell yeah, then this piece is for you. Let’s jump straight into the work.

Conda Environment

  1. Let’s first create a conda environment locally, then we’ll discuss ways to move this environment to AWS EC2 (a Linux machine).
## Downloading (you can change your edition by visiting
$ wget
$ sh
## Making a virtual environment for a specific project (recommended)
# For a specific Python version
$ conda create -n…

Welcome, this is Part 1!

You know, for search! But do you actually know how things work under the hood? So let’s get our basics straight. This will be a series of posts, each primarily covering the working mechanism of its topic.

What is Elasticsearch?

We don’t need just search; we want robustness, relevance, and speed in our searches. Elasticsearch is a one-stop shop with all these features.

Elasticsearch is the go-to search engine (built on top of the Lucene search library). You can see its elasticity in its use cases, search capabilities, administration capabilities, etc.
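To get a feel for what happens under the hood, here is a minimal sketch of an inverted index, the core data structure that Lucene-style search engines rely on. It is a toy illustration, not Elasticsearch's or Lucene's actual implementation: the function names (tokenize, build_inverted_index, search) and the sample documents are made up for this example, and real engines add analyzers, relevance scoring, and on-disk segments on top.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do much more."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, made up for illustration.
docs = {
    1: "you know for search",
    2: "search engines build an index",
    3: "an index maps terms to documents",
}
index = build_inverted_index(docs)

print(search(index, "search"))    # documents mentioning "search"
print(search(index, "an index"))  # documents mentioning both terms
```

The point of the structure is that a query only touches the entries for its own terms, instead of scanning every document.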

Why Elasticsearch?

All the aforementioned features, plus it’s open source! :P


This tutorial will give you clarity on building an NLP-based QA system. Here I’m gonna cover the full stack (from raw txt files to the interactive web app). I’ll use Haystack to build this QA system for the Hindi language.

Below are the stages of the question-answering pipeline that we’ll follow:

  1. Haystack Pipeline Building
  • Custom training data preparation & fine-tuning
  • Setting up the search pipeline with the fine-tuned model

Brief about QA system

Remember the passage comprehension in which you read a passage, understand it, remember it…

In this tutorial, we’re gonna implement a rudimentary semantic search engine using Haystack. We’ll use Elasticsearch and FAISS (Facebook AI Similarity Search) as DocumentStores.

Photo by Gozha Net on Unsplash

Below are the segments I’m gonna talk about:

  1. Implementation nitty-gritty
  • Dataset preparation
  • Indexing & Searching

Intro to Semantic Search & Terminologies

In recent times, with advances in NLP (natural language processing) and the availability of vast computing power (GPUs, TPUs, etc.), semantic search is making its place in the search industry. In contrast to lexical or syntactic search, semantic/neural search focuses more on the intent and semantics of the query. …
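The contrast between lexical and semantic matching can be sketched in a few lines. This is only a toy illustration under loud assumptions: the three-dimensional "embeddings" below are hand-made numbers, not the output of any real model, and the function names lexical_score and cosine are made up for this example. A real semantic search system would get its vectors from a trained encoder.

```python
import math


def lexical_score(query, doc):
    """Fraction of query words that appear verbatim in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical, hand-made vectors standing in for model embeddings.
embeddings = {
    "how to fix a flat tyre": [0.9, 0.1, 0.0],
    "repairing a punctured wheel": [0.8, 0.2, 0.1],
    "how to bake a cake": [0.1, 0.9, 0.2],
}

query = "how to fix a flat tyre"
wheel = "repairing a punctured wheel"
cake = "how to bake a cake"

for doc in (wheel, cake):
    print(
        doc,
        "| lexical:", round(lexical_score(query, doc), 2),
        "| semantic:", round(cosine(embeddings[query], embeddings[doc]), 2),
    )
```

With these made-up vectors, word overlap favours the cake document (it shares "how", "to", and "a" with the query), while vector similarity favours the wheel document, which shares the query's intent but almost none of its words.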

There was absolutely no AI winter in the last decade: from a starting point like AlexNet (2012) all the way to GPT-3 (2020), AI as an innovation really took off. 🚀

Quick timeline catchup

In 2018, Google open-sourced BERT (Bidirectional Encoder Representations from Transformers), which became quite a benchmark in NLP. Its larger variant has 340M parameters. Later, in 2019, OpenAI released GPT-2, which has 1,500M (1.5B) parameters. Furthermore, in 2020, OpenAI announced GPT-3, the third-generation language model in its GPT-n series. It has 175B parameters. Yeah, you heard it right 🤯!! I mean, there is a hell of a lot of computation and training data involved. …
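To put those numbers side by side, here is a small back-of-the-envelope sketch. The parameter counts are the ones quoted above; the 2 bytes per parameter figure is an assumption (16-bit weights only, ignoring optimizer state and activations), and the helper names weight_gigabytes and growth_factor are made up for this example.

```python
# Parameter counts as quoted in the text above.
PARAMS = {
    "BERT (2018)": 340e6,
    "GPT-2 (2019)": 1500e6,
    "GPT-3 (2020)": 175e9,
}

# Assumption: weights stored as 16-bit floats, nothing else counted.
BYTES_PER_PARAM_FP16 = 2


def weight_gigabytes(num_params, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Rough size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def growth_factor(newer, older):
    """How many times more parameters the newer model has."""
    return PARAMS[newer] / PARAMS[older]


for name, count in PARAMS.items():
    print(f"{name}: {count / 1e9:.2f}B parameters, "
          f"about {weight_gigabytes(count):.1f} GB of fp16 weights")

print(f"GPT-3 vs GPT-2: {growth_factor('GPT-3 (2020)', 'GPT-2 (2019)'):.1f}x")
print(f"GPT-3 vs BERT: {growth_factor('GPT-3 (2020)', 'BERT (2018)'):.1f}x")
```

Under that assumption, the jump from GPT-2 to GPT-3 alone is more than a hundredfold, and the GPT-3 weights by themselves would not fit in the memory of a single commodity GPU.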

Creating a good educational environment is considered a key factor in the development of any place. It takes a collaborative effort from people to build an environment that can nurture future generations.

Photo by Perry Grone on Unsplash

Due to the conflict situation in the state, students suffer a lot, especially students from rural areas with poor financial conditions. …

Prateek Yadav

NLP Engineer @LexisNexis India
