2021 Spring/Fall Changes
The Natural Language Processing A/B courses are replaced with corresponding Digital Humanities (Language Processing and Information Retrieval) courses. A new website will be up by 2021/4/12.
Fall 2019 Schedule
Date | # | Content |
---|---|---|
10/8 | 1 | Introduction: JupyterHub, Python, spaCy |
10/15 | 2 | Universal Dependencies (UD) page |
10/29 | 3 | NLP Examples #1 (English, Standard Ebooks) |
11/12 | 4 | NLP Examples #2 (spaCy and corpus statistics, graphing) |
11/19 | 5 | NLP Examples #3 (Japanese, Aozora Bunko) |
11/26 | 6 | NLP Examples #4 (spaCy & pattern matching) |
12/3 | 7 | NLP Examples #5: Machine Translation (Japanese→English) using the Japanese-English Bilingual Corpus of Wikipedia’s Kyoto Articles [Results using OpenNMT-py] (BLEU = 21.59, 58.8/30.6/18.7/12.3 (BP=0.852, ratio=0.862, hyp_len=1022196, ref_len=1186336)) [Transformer model comparison] (BLEU = 21.98, 60.7/32.1/20.1/13.5 (BP=0.816, ratio=0.831, hyp_len=985827, ref_len=1186336)) [Human Evaluation] |
12/10 | 8 | Discussion of #5 Results and student projects |
12/17 | 9 | Student projects |
1/7 | 10 | Student projects |
1/14 | 11 | Student projects |
1/21 | 12 | Student projects |
1/28 | 13 | Project Presentations (1) |
2/4 | 14 | Project Presentations (2) |
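
The BLEU figures listed for session 7 can be read as follows: the four slash-separated numbers are the 1- to 4-gram precisions (in percent), BP is the brevity penalty, and ratio is hyp_len / ref_len. The sketch below is not the course's evaluation script. It assumes the standard corpus-level BLEU definition (brevity penalty times the geometric mean of the four precisions), and the function names are made up for illustration. It only recombines the numbers already reported in the table to show how they fit together.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 if the hypothesis is at least as
    long as the reference, otherwise exp(1 - ref_len / hyp_len)."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def bleu_from_parts(precisions_pct, hyp_len, ref_len):
    """Recombine n-gram precisions (in percent) and corpus lengths into a
    BLEU score on the 0-100 scale, using uniform n-gram weights."""
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * brevity_penalty(hyp_len, ref_len) * math.exp(log_mean)


# Values copied from the session 7 row of the schedule.
opennmt = bleu_from_parts([58.8, 30.6, 18.7, 12.3], 1022196, 1186336)
transformer = bleu_from_parts([60.7, 32.1, 20.1, 13.5], 985827, 1186336)

print(f"OpenNMT-py result:  BLEU ~ {opennmt:.2f}")
print(f"Transformer result: BLEU ~ {transformer:.2f}")
```

Because the precisions in the table are rounded to one decimal place, the recomputed scores agree with the reported 21.59 and 21.98 only to within about 0.1. Note that the Transformer output has higher n-gram precisions but is shorter, so its stronger brevity penalty cancels much of the gain.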
Projects
Projects are assigned individually, but may involve collaboration between students. All projects require timely status updates and discussions with the instructor. The final product will consist of a short presentation of the project and a report written in either Japanese or English.
The final project is a chance for students to gain deeper insight into some aspect of NLP. This can take the form of dataset creation (annotation of new resources), linguistic investigation (using NLP software and basic statistics), or reproduction of existing NLP methods, among others. All projects must follow reproducible research guidelines.
Presentation
The allotted time is 20 minutes for the presentation and 10-15 minutes for Q&A. There is no template, and you may present with any software you like, including a Jupyter notebook. Please note that you need a (full-size) HDMI port or adapter if you would like to use your own computer.
Structure
Each presentation should give a quick overview of the natural language processing techniques or linguistic knowledge your colleagues need to follow along.
Please follow the reproducible research guidelines for how to organize your project.
Report
Based on your presentation, please hand in a report, along with any data and programs (if applicable), on the last day of class (2/4). The report format is decided on a case-by-case basis after consultation with the instructor (a Jupyter notebook is commonly accepted).