
Natural Language Processing (NLP)

Natural language processing (NLP) is a significant subfield of machine learning that deals with the interactions between machines (computers) and human (natural) languages. Natural languages are not limited to speech and conversation; they can also be written or signed. The data for NLP tasks comes in many forms: text from social media posts, web pages, or even medical prescriptions; audio from voicemail; commands to control systems; or even a favourite song or movie.

Nowadays, NLP is broadly involved in our daily lives: we can hardly live without machine translation; weather forecast scripts are automatically generated; voice search is convenient; we get quick answers to questions (such as what is the population of Canada) thanks to intelligent question-answering systems; and speech-to-text technology helps people with special needs.


In Getting Started with Machine Learning and Python, it was mentioned that machine learning-driven programs and computers are good at discovering event patterns by processing and working with data. When the data is well structured or well defined, such as in a Microsoft Excel spreadsheet or a relational database table, it is intuitively obvious why machine learning deals with it better than humans do.

Computers read such data the same way humans do: for example, revenue: 5,000,000 as the revenue being 5 million, and age: 30 as the age being 30; computers then crunch assorted data and generate insights. However, when the data is unstructured, such as the words with which humans communicate, news articles, or someone's speech in French, it seems that computers cannot (yet) understand words as well as humans do.

There is a lot of information in the world in words or raw text, or broadly speaking, natural language. This refers to any language humans use to communicate with each other. Natural language can take various forms, including, but not limited to, the following:

  • Text, such as a web page, SMS, email, and menus
  • Audio, such as speech and commands to Siri
  • Signs and gestures
  • Many others such as songs, sheet music, and Morse code

If machines are able to understand language like humans do, we consider them intelligent. In 1950, the famous mathematician Alan Turing proposed, in an article titled Computing Machinery and Intelligence, a test as a criterion of machine intelligence. It is now called the Turing test, and its goal is to examine whether a computer is able to understand language adequately enough to fool humans into thinking that the machine is another human. It is probably no surprise that no computer has passed the Turing test yet.

Nonetheless, the 1950s are considered the starting point of NLP's history. Understanding a language might be difficult, but would it be easier to automatically translate texts from one language to another? In my first ever programming course, the lab booklet contained an algorithm for coarse machine translation. This type of translation involves looking words up in dictionaries and generating new text from them. A more practically feasible approach is to gather texts that have already been translated by humans and train a computer program on them.
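The dictionary-lookup approach above can be sketched in a few lines of Python. The tiny vocabulary here is a made-up sample for illustration, not a real translation resource:

```python
# Toy word-for-word "translation" via dictionary lookup.
# The vocabulary is illustrative only.
en_to_fr = {
    "i": "je",
    "love": "aime",
    "machine": "machine",
    "learning": "apprentissage",
}

def translate(sentence):
    """Translate each word by dictionary lookup; keep unknown words as-is."""
    return " ".join(en_to_fr.get(word, word) for word in sentence.lower().split())

print(translate("I love machine learning"))  # je aime machine apprentissage
```

The output is ungrammatical French, which illustrates exactly why this approach is only coarse: word-for-word lookup ignores grammar, word order, and context, and that is what motivated training on human-translated texts instead.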

In 1954, scientists claimed in the Georgetown experiment that machine translation would be solved within three to five years. Unfortunately, a machine translation system that can beat human expert translators does not exist yet. But machine translation has evolved greatly since the introduction of deep learning, with impressive achievements in certain areas, for example, social media (Facebook open-sourced a neural machine translation system), real-time conversation (Skype, SwiftKey Keyboard, and Google Pixel Buds), and image-based translation.

Conversational agents, or chatbots, are another hot topic in NLP. The fact that computers are able to hold a conversation with us has reshaped the way businesses are run.

In 2016, Microsoft's AI chatbot, Tay, was unleashed to mimic a teenage girl and converse with users on Twitter in real time. She learned how to speak from everything users posted and commented on Twitter. However, she was overwhelmed by tweets from trolls, automatically learned their bad behaviours, and started to output inappropriate things in her feed. She ended up being shut down within 24 hours.

There are also several tasks that attempt to organize knowledge and concepts in such a way that they become easier for computer programs to manipulate. The way we organize and represent concepts is called an ontology. An ontology defines concepts and the relations between concepts. For instance, we can have a so-called triple representing the relation between two concepts, such as Python is a language.
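A minimal sketch of this idea in Python: each fact is stored as a (subject, predicate, object) triple, and simple queries walk the list. The triples and the `is-a` predicate name here are illustrative, not a standard ontology format:

```python
# Knowledge represented as (subject, predicate, object) triples,
# as in the "Python is a language" example above.
triples = [
    ("Python", "is-a", "language"),
    ("NLP", "is-a", "subfield of machine learning"),
]

def objects_of(subject, predicate, triples):
    """Return all objects related to a subject by a given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects_of("Python", "is-a", triples))  # ['language']
```

Real ontology systems (for example, those based on RDF) use the same triple structure, just with standardized vocabularies and far richer query languages.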

An important use case for NLP at a much lower level, compared to the previous cases, is part-of-speech (PoS) tagging. A part of speech is a grammatical word category, such as noun or verb. PoS tagging tries to determine the appropriate tag for each word in a sentence or a larger document. The following are examples of English parts of speech:


Part of speech and examples:

  • Noun: David, machine
  • Pronoun: Them, her
  • Adjective: Awesome, amazing
  • Verb: Read, write
  • Adverb: Very, quite
  • Preposition: Out, at
  • Conjunction: And, but
  • Interjection: Unfortunately, luckily
  • Article: A, the
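A PoS tagger can be sketched in its simplest form as a lexicon lookup. The tiny lexicon below is illustrative only; production taggers (such as NLTK's trained taggers) use statistical models that also consider context, since many words can belong to several categories:

```python
# A minimal lexicon-based PoS tagger sketch. The lexicon is a toy example.
LEXICON = {
    "david": "Noun",
    "reads": "Verb",
    "an": "Article",
    "awesome": "Adjective",
    "book": "Noun",
}

def pos_tag(sentence):
    """Tag each word with its part of speech, defaulting to 'Unknown'."""
    return [(word, LEXICON.get(word.lower(), "Unknown")) for word in sentence.split()]

print(pos_tag("David reads an awesome book"))
```

The lookup approach breaks down on ambiguous words ("book" can be a noun or a verb), which is precisely why PoS tagging is treated as a machine learning problem rather than a dictionary problem.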
