Text Analysis
28 Oct 2016
Presenter: Joe Sutherland
Plan
We will cover the following topics:
- Models of structured and unstructured data;
- Document Vectorization;
- Text features – what are they?
- Pre-Processing: Stemming and Stopping;
- Bag-of-words model;
- Sparse matrices.
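To make the topics above concrete, here is a minimal sketch of stop-word removal, bag-of-words vectorization, and a sparse row representation, using only the Python standard library. The stop-word list, the example sentences, and the function names `tokenize` and `vectorize` are assumptions made for illustration and are not taken from the workshop materials. Stemming is left out to keep the sketch short.

```python
import re
from collections import Counter

# Toy stop-word list, assumed for illustration only.
STOP_WORDS = {"the", "a", "is", "on", "and"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def vectorize(docs):
    """Build a vocabulary and one sparse row per document.

    Each row is a dict mapping a vocabulary column index to a term count;
    columns with a zero count are simply not stored.
    """
    token_lists = [tokenize(d) for d in docs]
    terms = sorted({t for toks in token_lists for t in toks})
    vocab = {term: i for i, term in enumerate(terms)}
    rows = [dict(Counter(vocab[t] for t in toks)) for toks in token_lists]
    return vocab, rows

# Hypothetical two-document corpus.
docs = ["The cat sat on the mat.", "The dog and the cat."]
vocab, rows = vectorize(docs)
print(vocab)
print(rows)
```

Storing only the non-zero counts is the idea behind sparse matrices: most documents use a small fraction of the full vocabulary, so a dense document-term table would be mostly zeros.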
Setup Instructions
We will use Python and Homebrew to install additional packages, so please make sure you have them up and running. You can find tips on installation here.
Link to Workshop Materials
- Workshop presentation and useful materials for learning about text analysis can be found in this Dropbox Folder.