In this part, we will create variables that measure different characteristics of each email message in our training data. We will use these variables to predict whether a message is HAM or SPAM in the next task.

Some potentially useful functions include: grep(), grepl(), gsub(), gregexpr(), regexec(), regmatches(), nchar(), strsplit(), table(), plot(), smoothScatter(), hexbin().
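
As a rough illustration of how a few of these functions can be combined, here is a minimal sketch that derives some candidate variables from raw message text. The two messages in msgs are made up for this example, and the variable names (numChars, isRe, numExclaim, percCapitals, numWords) are hypothetical choices rather than a required set. It also assumes each message is stored as a single character string, which may not match how the training data is actually stored.

```r
# Two made-up messages, for illustration only (not from the training data).
msgs <- c(
  "Subject: WIN A FREE PRIZE!!!\n\nClick here now to claim $1000!!!",
  "Subject: Re: meeting notes\n\nHi all, the notes from Tuesday are attached."
)

# Count exclamation marks: gregexpr() locates every match and
# regmatches() extracts them, so the length of each result is the count.
exclaims <- regmatches(msgs, gregexpr("!", msgs, fixed = TRUE))
numExclaim <- sapply(exclaims, length)

# Proportion of letters that are capitals: gsub() strips everything
# except the characters we want to count, and nchar() counts what is left.
lettersOnly <- gsub("[^A-Za-z]", "", msgs)
capitalsOnly <- gsub("[^A-Z]", "", lettersOnly)
percCapitals <- nchar(capitalsOnly) / nchar(lettersOnly)

# Whitespace-separated tokens in each message, via strsplit().
numWords <- sapply(strsplit(msgs, "[[:space:]]+"), length)

features <- data.frame(
  numChars     = nchar(msgs),                          # total message length
  isRe         = grepl("^Subject: *Re:", msgs, ignore.case = TRUE),  # reply?
  numExclaim   = numExclaim,
  percCapitals = percCapitals,
  numWords     = numWords
)

print(features)
table(features$isRe)  # how many messages are replies
```

Each column of features is one derived variable with one value per message. With the real training data, plot() or table() can then be used to compare a variable's distribution between HAM and SPAM.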

Implement