Natural Language Processing

Course semester
2nd semester
Course category
Elective
ECTS
7.5
Tutors

E. Stamatatos

Goal

Upon successful completion of the course, the student will be able to:

  • Understand the levels of Natural Language Analysis and Processing (NLP)
  • Recognize, understand, and explain NLP techniques together with their corresponding applications
  • Identify the specific characteristics of individual NLP problems, and select and adapt appropriate techniques to them
  • Plan the comparative evaluation of NLP methods, recognizing the capabilities and limitations of each
  • Communicate ideas related to the application of NLP techniques in a clear, concise, and formal manner

The overall aim is for students to be able to design, build, and evaluate NLP systems that solve real-world problems, and to explain how those systems operate.

The course also targets the following general competencies:

  • Ability to organize and plan work and manage time effectively
  • Ability to communicate effectively (orally and in writing)
  • Ability to solve problems
  • Ability to develop critical thinking and capacity for critical approaches
  • Ability to work in a team
  • Ability to adopt interdisciplinary approaches
  • Ability to apply theoretical knowledge in practice
  • Ability to evaluate algorithms, analyze and explain results
  • Ability to conduct research
  • Ability to adapt methods and techniques to new situations and conditions
  • Ability to generate new ideas (creativity)

Contents

  • Introduction to natural language processing: basic concepts, layers of linguistic analysis, application examples
  • Morphological analysis, text tokenization, sentence splitting, subword tokenization, regular expressions, text normalization, statistical properties of text and corpora
  • Language modeling: n-gram models, smoothing techniques, neural language models, evaluation of language models
  • Vector representation of words and texts, topic models, static embeddings
  • Pre-trained language models and deep learning, contextualized embeddings
  • Sequence-to-sequence (seq2seq) methods, encoder-decoder models, machine translation, text summarization
  • Sequence classification: methods and applications
  • Sequence labeling, named-entity recognition and part-of-speech tagging
  • Syntactic analysis: context-free grammars, probabilistic grammars, dependency parsing, full and partial parsing
  • Semantic analysis, word sense disambiguation, semantic role labeling

Bibliography

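  • Frangos, Konstantinos T., and Anastasios P. Koutsoukos. «Η τεχνολογία της πληροφορίας στην επεξεργασία φυσικής γλώσσας – προβλήματα επεξεργασίας φυσικής γλώσσας» [Information Technology in Natural Language Processing: Problems of Natural Language Processing]. Myrmidones Publications, 2010 (in Greek). ISBN: 978-960-992790-1.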
  • Jurafsky, Daniel, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2009. https://web.stanford.edu/~jurafsky/slp3/
  • Manning, Christopher D., and Hinrich Schütze. Foundations of Statistical Natural Language Processing. MIT Press, 1999. https://nlp.stanford.edu/fsnlp/