2021 Spring/Fall Changes

The Natural Language Processing A/B courses are replaced by the corresponding Digital Humanities (Language Processing and Information Retrieval) courses. A new website will be up by 2021/4/12.

Fall 2019 Schedule

Date    #    Content
10/8    1    Introduction: JupyterHub, Python, spaCy
10/15   2    UD page
10/29   3    NLP Examples #1 (English, Standard Ebooks)
11/12   4    NLP Examples #2 (spaCy and corpus statistics, graphing)
11/19   5    NLP Examples #3 (Japanese, Aozora Bunko)
11/26   6    NLP Examples #4 (spaCy & pattern matching)
12/3    7    NLP Examples #5 Machine Translation (Japanese→English) using the Japanese-English Bilingual Corpus of Wikipedia’s Kyoto Articles [Results using OpenNMT-py] (BLEU = 21.59, 58.8/30.6/18.7/12.3 (BP=0.852, ratio=0.862, hyp_len=1022196, ref_len=1186336)) [Transformer model comparison (BLEU = 21.98, 60.7/32.1/20.1/13.5 (BP=0.816, ratio=0.831, hyp_len=985827, ref_len=1186336))] [Human Evaluation]
12/10   8    Discussion of #5 Results and student projects
12/17   9    Student projects
1/7     10   Student projects
1/14    11   Student projects
1/21    12   Student projects
1/28    13   Project Presentations (1)
2/4     14   Project Presentations (2)
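
The BLEU figures reported for NLP Examples #5 can be read as follows: the four slash-separated numbers are the 1- to 4-gram precisions (in percent), BP is the brevity penalty, and ratio is hyp_len / ref_len. The sketch below recomputes BLEU from those reported components using the standard corpus-level definition (brevity penalty times the geometric mean of the n-gram precisions). It is only an illustration of the arithmetic, not the scoring script used in class; the function and variable names are made up for this example, and because the published precisions are rounded to one decimal place, the recomputed scores should agree with the reported ones only approximately.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 if the hypothesis is at least as
    long as the reference, otherwise exp(1 - ref_len / hyp_len)."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_components(precisions_pct, hyp_len, ref_len):
    """Recompute corpus BLEU (0-100 scale) from n-gram precisions given
    in percent, with uniform weights over the n-gram orders."""
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


# Values copied from the schedule entry for NLP Examples #5.
ref_len = 1186336
first_system = bleu_from_components([58.8, 30.6, 18.7, 12.3], 1022196, ref_len)
transformer = bleu_from_components([60.7, 32.1, 20.1, 13.5], 985827, ref_len)

print(f"first system: BLEU ~ {first_system:.2f}, BP = {brevity_penalty(1022196, ref_len):.3f}")
print(f"transformer:  BLEU ~ {transformer:.2f}, BP = {brevity_penalty(985827, ref_len):.3f}")
```

Both systems produce output noticeably shorter than the reference (ratio below 1), so the brevity penalty lowers each score. The Transformer has higher n-gram precisions but a stronger penalty, which is why the two final BLEU scores end up close together.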

Projects

Projects are assigned individually, but may involve collaboration between students. All projects require timely status updates and discussions with the instructor. The final product will consist of a short presentation of the project and a report written in either Japanese or English.

The final project is a chance for students to gain deeper insight into some aspect of NLP. This can be in the form of dataset creation (annotation of new resources), linguistic investigation (using NLP software and basic statistics), or reproductions of existing NLP methods, among others. All projects must follow reproducible research guidelines.

Presentation

The allotted time is 20 minutes for the presentation and 10-15 minutes for Q&A. There is no template, and you may present with any software you like, including a Jupyter notebook. Please note that you need a (full-size) HDMI port or adapter if you would like to use your own computer.

Structure

Each presentation should give a quick overview of the natural language processing techniques or linguistic knowledge your colleagues need to follow along.

Please follow the reproducible research guidelines for how to organize your project.

Report

Based on your presentation, please hand in a report, along with any data and programs (if applicable), on the last day of class (2/4). The report format is decided on a case-by-case basis after consultation with the instructor (a Jupyter notebook is commonly accepted).