Natural Language Processing

Week 1

This class introduces the uses and history of NLP. Topics include: 

  • The history of natural language processing and how it is used in industry today
  • How to parse strings using powerful regular expression tools in Python
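
As a rough illustration of the regular-expression bullet above, the sketch below pulls email addresses and dates out of a string with Python's built-in re module. The sample text and both patterns are assumptions made for this example; the email pattern in particular is deliberately simple and is not a full address validator.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Contact alice@example.com or bob@example.org before 2024-01-15."

# A deliberately simple email pattern; real address validation is more involved.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# ISO-style dates: four digits, two digits, two digits, captured as groups.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

emails = EMAIL_PATTERN.findall(text)  # list of matched strings
dates = DATE_PATTERN.findall(text)    # list of (year, month, day) tuples

print(emails)
print(dates)
```

Because DATE_PATTERN contains capture groups, findall returns tuples of the groups rather than the whole match, which is often convenient when the parts are needed separately.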


Week 2

This class teaches how to use NLP toolkits and preprocessing techniques. Topics include:

  • Preprocessing techniques such as tokenization, stop-word removal, and punctuation handling
  • How to implement these techniques using Python libraries such as NLTK, TextBlob, spaCy, and Gensim
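
To make the preprocessing steps above concrete, here is a minimal sketch that uses only the Python standard library. The preprocess function and the tiny STOP_WORDS set are hypothetical names introduced for this example; toolkits such as NLTK and spaCy provide more robust tokenizers and much larger stop-word lists.

```python
import string

# A tiny, hypothetical stop-word list used only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}

def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, and drop stop words."""
    lowered = text.lower()
    # Remove every ASCII punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = no_punct.split()
    return [token for token in tokens if token not in STOP_WORDS]

tokens = preprocess("The cat sat on the mat, and the dog barked!")
print(tokens)
```

Splitting on whitespace is the crudest possible tokenizer; it is used here only to keep the sketch dependency-free.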


Week 3

This class introduces how to measure similarity between words. Learn more about:

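  • Levenshtein distance, which measures how similar two words are by counting the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other
  • How computers encode pieces of text into a document-term matrix, and what the bag-of-words assumption is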
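
Both ideas above can be sketched in a few lines of standard-library Python. The function names levenshtein and document_term_matrix are assumptions made for this example, and the whitespace tokenization is a simplification.

```python
from collections import Counter

def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

def document_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in documents[i]."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix

distance = levenshtein("kitten", "sitting")
vocab, dtm = document_term_matrix(["the dog bit the man", "the man bit the dog"])
print(distance, vocab, dtm)
```

The two example sentences mean different things but produce identical rows in the matrix. That is the bag-of-words assumption in action: only word counts are kept, and word order is discarded.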


Week 4

This class shows how machine learning is used for basic text classification. Topics include:

  • The basics of machine learning and a refresher on the terminology
  • A typical machine learning workflow, applied with two different approaches to classifying emails as spam or not spam
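
The outline does not name the two approaches, so the sketch below picks one common choice, a multinomial Naive Bayes classifier, purely as an assumption. The training messages, the "ham" label for non-spam, and the function names are all hypothetical. It follows the usual train-then-predict workflow on a toy dataset.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns the counts needed for prediction."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training documents
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-probability under the model."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set; a real workflow would also hold out test data.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train_naive_bayes(training)
print(predict(model, "claim free money"))
print(predict(model, "agenda for monday lunch"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.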


Week 5

This class teaches an algorithm for natural language understanding and topic modeling. Learn more about:

  • How to use the latent Dirichlet allocation (LDA) algorithm to extract topics from document-term matrices


Week 6

This class continues to teach how to model and extract topics in text. Learn more about:

  • Alternative algorithms for discovering the topics embedded in texts


Week 7

This class teaches machine learning algorithms for NLP. Topics include:

  • How to use a neural network to transform words into vectors
  • Potential applications of these vectors (such as text classification and information retrieval)
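
One way to picture the retrieval application mentioned above is a nearest-neighbor lookup by cosine similarity. The sketch below assumes three hand-made, 3-dimensional vectors; they are hypothetical values for illustration only and were not learned by a neural network.

```python
import math

# Hypothetical, hand-made vectors; learned word vectors typically have
# hundreds of dimensions.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def most_similar(word):
    """Return the other vocabulary word whose vector is closest to the given word's."""
    return max(
        (other for other in VECTORS if other != word),
        key=lambda other: cosine_similarity(VECTORS[word], VECTORS[other]),
    )

print(most_similar("king"))
```

Cosine similarity compares direction rather than magnitude, which is why it is a common choice for comparing word vectors.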


Week 8

Continuing with the topic of machine learning, this class teaches more about applying neural networks. Topics include:

  • Text generation using Markov chains and recurrent neural networks
  • Advanced topics in NLP, such as sequence-to-sequence (seq2seq) models
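
The Markov chain approach to text generation can be sketched with a first-order, word-level model, shown below. The corpus string and the function names build_chain and generate are assumptions made for this example; the recurrent neural network approach is not shown.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        chain[current_word].append(next_word)
    return chain

def generate(chain, start, length, seed=None):
    """Random-walk the chain from start, emitting at most length words."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word never had a successor
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Hypothetical toy corpus; real models are built from far more text.
corpus = "the cat sat on the mat and the cat ate the fish"
chain = build_chain(corpus)
sentence = generate(chain, "the", 8, seed=0)
print(sentence)
```

Keeping repeated followers in the list means a word that follows another more often is proportionally more likely to be chosen, so the transition probabilities are represented implicitly.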