https://fastdatascience.com/natural-language-processing/finding-similar-documents-nlp/