Arboreal AI is a machine learning and software architecture consultancy made up of my mentor, Dr. Matthew Lewis, and me. Most of our work so far has been initiating and developing data science products for a restaurant technology company.
Our first project was to research and develop a tool for sentiment analysis
of online restaurant reviews. We combined traditional natural language
processing (NLP) techniques with a convolutional neural network (CNN)
architecture.
The initial stage of this project was, of course, data cleaning.
Anyone who has ever read a restaurant review online knows that reviews are
not the most grammatically sound writing on the internet. We broke down
the semantic structure of each sentence to separate the dataset into clauses, and
then predicted the sentiment of each clause. Early on, this approach was limited
by the amount of data we had available, but as more reviews were written the tool
became increasingly useful. Another tricky aspect of this project was
applying the predicted sentiment to the correct subject. The difficulty of this
problem becomes apparent in a sentence such as:
"The food was slow to arrive."
While the grammatical subject of the clause is "food", the target of the
criticism is the service at the restaurant. This was an interesting piece of
the problem that surprised everyone working on it.
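To make the attribution problem concrete, here is a minimal sketch in Python. It is a toy, rule-based stand-in and not the CNN-based tool described above: the word lists, the clause-splitting pattern, and the function names (split_clauses, analyze_clause, analyze) are all hypothetical and chosen only for illustration. It splits a review into clauses, scores each clause, and lets timing-related cue words override the grammatical subject when deciding what is being criticized.

```python
import re

# Hypothetical toy lexicons, for illustration only (not the project's data).
SENTIMENT_WORDS = {"great": 1, "delicious": 1, "friendly": 1,
                   "slow": -1, "cold": -1, "rude": -1}
ASPECT_NOUNS = {"food": "food", "pizza": "food",
                "waiter": "service", "server": "service", "staff": "service"}
# Assumed cue words that point at service regardless of the clause's subject.
SERVICE_CUES = {"slow", "arrive", "wait", "waited", "late"}

# Naive clause boundary: sentence punctuation or a few common conjunctions.
# This will wrongly split phrases such as "salt and pepper".
CLAUSE_BOUNDARY = re.compile(
    r"[.;!?]|,?\s+\b(?:but|and|although|while)\b\s+", re.IGNORECASE
)


def split_clauses(text):
    """Split a review into rough clauses, dropping empty fragments."""
    return [c.strip() for c in CLAUSE_BOUNDARY.split(text) if c and c.strip()]


def analyze_clause(clause):
    """Return (clause, aspect, score) for a single clause."""
    tokens = re.findall(r"[a-z']+", clause.lower())
    score = sum(SENTIMENT_WORDS.get(t, 0) for t in tokens)
    if any(t in SERVICE_CUES for t in tokens):
        # Timing words signal a service complaint even if the noun is "food".
        aspect = "service"
    else:
        # Otherwise fall back to the first known aspect noun in the clause.
        aspect = next((ASPECT_NOUNS[t] for t in tokens if t in ASPECT_NOUNS),
                      "unknown")
    return clause, aspect, score


def analyze(text):
    """Analyze every clause of a review."""
    return [analyze_clause(c) for c in split_clauses(text)]


if __name__ == "__main__":
    for result in analyze("The food was slow to arrive."):
        print(result)
```

With these toy rules, the example sentence is attributed to service rather than food because "slow" and "arrive" are cue words. A hand-written override like this does not generalize, which is part of why attribution was hard in practice.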
The second major project that Arboreal AI worked on was wait-time estimation for customers at restaurants. This is an important piece of the front-end services offered by the restaurant technology company, as restaurants are very interested in improving the accuracy of their estimates. Inaccurate estimates cost the business potential revenue in both directions. An estimate that is too short drives customers away once the time they expected their table has come and gone, while an estimate that is too long discourages guests from waiting at all, even when the table ends up being available at a time they would have found acceptable. This problem turned out to be quite difficult, as nearly all of the uncertainty in the estimation lies in the queue. In the end, we delivered a tool that generated predictions with an accuracy of over 80%, an improvement of about 60% over traditional human estimates.