Text Mining Loch Ness Monster Sightings

Text Mining with RapidMiner for Loch Ness Monster Sightings

Text mining involves extracting root words from text stored in a system.  In this example, I pulled all of the Loch Ness Monster sightings from 2000 to 2015 from the Official Loch Ness Monster Website into an Excel spreadsheet.  I then processed the data with RapidMiner's Text Processing extension to determine which root words appear most often in the text.  The posted comments yielded 54 unique records.  After mining the text of the postings, the top 10 words, in order of frequency and with the number of appearances, are:

  1. Loch – 35
  2. Water – 34
  3. Saw – 24
  4. Creature – 21
  5. Object – 16
  6. Boat – 15
  7. Said – 14
  8. Wake – 13
  9. Urquhart – 12
  10. August – 11
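
For readers who do not use RapidMiner, the counting step behind a list like this can be sketched in a few lines of Python. This is not the RapidMiner process I ran. It is a minimal stand-in that lowercases the text, splits it into words, drops a few stopwords, and counts what is left. The stopword list, the `top_terms` function name, and the sample sentences are all made up for illustration, and the sketch does no stemming, so "saw" and "seen" would be counted separately.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real tools ship much longer ones).
STOPWORDS = {"a", "an", "the", "and", "of", "in", "on", "at", "to",
             "was", "is", "it", "i", "we"}

def top_terms(documents, n=10):
    """Return the n most frequent non-stopword tokens across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase, then keep runs of letters as tokens.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

# Made-up sample sightings, not the real data set.
sample = [
    "We saw a dark object in the water near the boat.",
    "I saw a wake on the loch, and the water was calm.",
]

for term, count in top_terms(sample, 5):
    print(term, count)
```

With the real spreadsheet, each row's comment text would take the place of the sample sentences.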

From this data, it appears that your best chance of seeing a creature is in August near Urquhart.  So how do we apply text mining to a business problem?

Text mining can be applied to several business use cases, such as service requests, project notes, maintenance data, or any other text fields that you store.  For service requests, you can mine the data for the common words that customers use when creating a request.  If you have a baseline of common terms, you can then look for terms whose frequency is trending upward.  For example, if you see a spike in the word “outage”, then outages are occurring at a higher rate and action needs to be taken immediately.  Project or maintenance notes can be mined to see which common issues keep popping up or which parts need to be replaced.
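
The baseline idea can be sketched in Python as well. This is a rough illustration under my own assumptions, not a production alerting rule: it measures the share of requests that mention each term in a baseline period and in a recent period, then flags terms whose recent share is at least `ratio` times the baseline share. The function names, the `ratio` threshold, and the sample service requests are all hypothetical.

```python
import re
from collections import Counter

def term_rates(texts):
    """Fraction of texts that mention each term at least once."""
    counts = Counter()
    for text in texts:
        # set() so a term repeated within one request is counted once.
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(texts) or 1
    return {term: c / total for term, c in counts.items()}

def spiking_terms(baseline_texts, recent_texts, ratio=2.0):
    """Return {term: (baseline_rate, recent_rate)} for terms that spiked."""
    baseline = term_rates(baseline_texts)
    recent = term_rates(recent_texts)
    # Treat unseen terms as if they appeared in one baseline request,
    # so a single new word does not count as a spike by itself.
    floor = 1 / (len(baseline_texts) or 1)
    flagged = {}
    for term, rate in recent.items():
        base = baseline.get(term, 0.0)
        if rate >= ratio * max(base, floor):
            flagged[term] = (base, rate)
    return flagged

# Made-up service requests for illustration only.
last_month = [
    "password reset needed",
    "cannot login to portal",
    "outage in building two",
    "printer not working",
]
this_week = [
    "network outage again",
    "outage affecting email",
    "total outage since noon",
    "password reset needed",
]

print(spiking_terms(last_month, this_week))
```

In practice you would tune the threshold and the length of each period to your own request volume, since small samples make the rates noisy.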
