SEMANTICS AND PRAGMATICS IN CURRENT SOFTWARE APPS AND IN WEB SEARCH ENGINES: EXPLORING INNOVATIONS

In 1938, Charles Morris defined semiotics as made up of three branches: syntax, semantics, and pragmatics. These terms were later adopted by computer scientists. Most web users experience the web as a syntactic system, and it was born with that character, but in recent years the Semantic Web has gained prominence. Yet even as the Semantic Web grows more serious, the basic Syntactic Web cannot be ignored, and the future points towards a pragmatic approach. To follow the path that seems to lead to a Pragmatic Web, let us analyze the evolution of the most famous search engine, Google, and one of the applications that seems, at least potentially, to be the turning point in the pragmatics of communication over the Internet: Apple's Siri.

«If you ask a question, you don't want to get 10 million answers. You want the 10 best answers.»

What is semantic search?

If you run a search on a search engine like Google, a list of results appears based on the digitized text. In short, syntactic search is based on the written words: the results are the same regardless of who types the text into the search query box.

Semantic search uses a kind of artificial intelligence (AI) to understand the searcher's intent and the meaning of the query. Even if the underlying database is still the same, like a dictionary, this type of search draws on something personal: your data, your character, your lifestyle, to give you the best answer to your question. In a semantic search, Google moves from word to word, weighing how they relate to each other and how they might be used in your search.
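To make the contrast concrete, here is a minimal sketch (not any engine's real code) of the difference between matching literal words and matching through word relations; the tiny synonym table is a hypothetical stand-in for the word-relation data a real search engine would hold.

```python
# Hypothetical word-relation data; a real engine would hold far more.
SYNONYMS = {
    "film": {"movie", "picture"},
    "buy": {"purchase"},
}

def syntactic_match(query: str, document: str) -> bool:
    """Match only if every literal query word appears in the document."""
    doc_words = set(document.split())
    return all(word in doc_words for word in query.split())

def semantic_match(query: str, document: str) -> bool:
    """Also match when a related word appears instead of the literal one."""
    doc_words = set(document.split())
    return all(({word} | SYNONYMS.get(word, set())) & doc_words
               for word in query.split())

doc = "where to purchase a movie ticket"
print(syntactic_match("buy film ticket", doc))  # False: literal words absent
print(semantic_match("buy film ticket", doc))   # True: synonyms bridge the gap
```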

Here is an example. In a restaurant we ask for the menu. Is everything on the menu edible? The list contains soup and food, but also wine, beer, and perhaps the price of the service. There are large and small portions, but also the restaurant's phone number and address. None of these things are edible!

Semantics on the web is not a new concept: as early as 2008, all the major search engines began working on natural language queries.

Natural Language Understanding – NLU – is a subtopic of natural language processing in artificial intelligence that deals with machine reading comprehension. Dan Miller, Senior Analyst and Founder of Opus Research says, “NLU should allow people to speak naturally and have a reasonable expectation that a machine on the other end will understand their intent.”

Is the technology behind speech recognition software actually geared towards recognizing the user's intent, that is, what the user wants to ask the machine? Or is it really just a very fast lookup in an extremely large database, simply matched against the string of letters typed or spoken by the user?

The first, often quoted definition comes from Tim Berners-Lee, who proposed the term Semantic Web in 2001: «The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.»

From that moment on, the term has been linked to an idea of the web in which agents operate (called by some "intelligent agents"), capable of understanding the meaning of the texts present on the network, texts connected by paths articulated thanks to the information attached to them: metadata.

The agents, the active part of the software, regulated by the search algorithms, must be able to guide the user to the information sought. They should also be able to stand in for the user in some operations, starting with the most common and recurring ones (e.g. "What's the weather like? What time is it?").

Agents must be able to follow the trail of metadata that connects all the available information, searching it autonomously in order to respond to the user.

In fact, the language actually adopted, RDF, indicated by the W3C as the reference language for the Semantic Web, seems to represent a new approach to the problem.

The use of triples is the basis of this language: subject, predicate, object. With this system, any piece of meaning on the web can potentially be connected to any other.
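A toy in-memory sketch of how such triples can be stored and queried follows; the entity names are illustrative, not real URIs, and a real RDF store would of course be far more elaborate.

```python
# A toy triple store illustrating the subject-predicate-object model.
triples = [
    ("Marie_Curie",  "isA",     "Person"),
    ("Marie_Curie",  "spouse",  "Pierre_Curie"),
    ("Marie_Curie",  "awarded", "Nobel_Prize"),
    ("Pierre_Curie", "awarded", "Nobel_Prize"),
]

def match(subject=None, predicate=None, obj=None):
    """Return every triple fitting the pattern; None acts as a wildcard."""
    return [(s, p, o) for (s, p, o) in triples
            if subject in (None, s)
            and predicate in (None, p)
            and obj in (None, o)]

# Who was awarded a Nobel Prize?
print(match(predicate="awarded", obj="Nobel_Prize"))
# [('Marie_Curie', 'awarded', 'Nobel_Prize'),
#  ('Pierre_Curie', 'awarded', 'Nobel_Prize')]
```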

The original question, however, is not to find out how the systems underlying Semantic Web technology work, but whether this definition, and the next one in line (NIELSEN 1996), the Pragmatic Web, have actually been assimilated, or at least partially adopted, by the current tools for searching and using the Internet.

Apple Siri, Google Voice Search, IBM Watson

Google Voice Search is a tool that implements Google web search and allows the user to query for information directly by voice on their computer. We also consider the Apple system called Siri, in use on iOS devices since October 2011 and presented as a "voice assistant".

Everything related to the actual operation of Apple's Siri software, which was developed by Siri Inc., a spin-off of SRI International (with speech recognition technology from Nuance), and acquired by Apple in 2010, is not public. Inquiries to the various companies and organizations involved in developing and updating the interface went unanswered or were declined. We therefore try to understand how the interface works from the available literature and from direct experience.

Although to the public the two interfaces may seem very similar, Apple's Siri and Google Voice Search differ: only in the first case is there an attempt at semantic search, through the system called NLU (Natural Language Understanding), which analyzes the sentence, or parts of it, voiced by the user.

Roberto Pieraccini, director of ICSI, explains: "When spoken words are transcribed into a text field (as in Google Voice Search), this does not mean that the machine has understood what was said; it can merely translate sounds into words. But if you move to Siri, the meaning of the words can be understood, because there is a second part of the application, which is natural language understanding."
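The distinction Pieraccini draws can be sketched as a two-stage pipeline; both stages below are hypothetical stubs for illustration, not Google's or Apple's actual software.

```python
def speech_to_text(audio: bytes) -> str:
    """Stage 1 (stub): transcribe an audio signal into a string of words."""
    return "what is the weather in rome"  # pretend transcription

def understand(text: str) -> dict:
    """Stage 2 (stub): map the word string to an intent plus parameters."""
    words = text.split()
    if "weather" in words:
        # crude slot filling: take the word after "in" as the location
        location = words[words.index("in") + 1] if "in" in words else None
        return {"intent": "get_weather", "location": location}
    return {"intent": "unknown"}

print(understand(speech_to_text(b"raw-audio")))
# {'intent': 'get_weather', 'location': 'rome'}
```

Google Voice Search, in this picture, stops after stage 1 and feeds the transcription to ordinary text search; Siri adds stage 2.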

NLU therefore seems to be the next step, and the prerequisite for a real Semantic Web approach to emerge. Let us go back for a moment to clarify what is meant by the Semantic Web: a part of the web where the user's input must be understood and interpreted in order to provide a correct response to the needs expressed. One of the tools currently considered state of the art is called Watson, a system developed by IBM in 2011, which took part in the TV quiz show Jeopardy!. On the show, three contestants choose topics and question values from a board and answer questions read by the presenter. Watson beat the program's human competitors without much difficulty, including in the rehearsal games, and answered largely correctly, interpreting questions posed directly by the host.

Was Watson therefore able to interact semantically with the presenter and the environment, correctly interpreting the meaning of the questions? It seems so, but only to a casual observer. IBM provides all the relevant information, in a video and on a website, to explain how the system works. How does Watson work?

«A question is analyzed and a large set of possible answers is extracted based on the search in different data sets. These “candidate” answers are analyzed separately along a whole range of different dimensions (geographical or temporal, or what I found most interesting, by putting the candidate answers back into the original question and re-searching it with different information sources to rank them again). The result is a vector of numeric values representing the results of the analysis along these different dimensions. This “vector” is summed to a final value using weight values for each dimension. The weights themselves are obtained through a previous training process (in this case using a number of stored Jeopardy questions/answers). Finally, the answer with the highest value is returned.»
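A minimal sketch of the scoring step this passage describes: each candidate answer carries a vector of evidence scores, and a weighted sum, with weights learned beforehand from stored question/answer pairs, picks the winner. All candidates, dimensions, and numbers below are invented for illustration.

```python
candidates = {
    # answer: scores along (geographic, temporal, re-search) dimensions
    "Charles Dickens": [0.9, 0.7, 0.8],
    "Mark Twain":      [0.4, 0.6, 0.3],
}
weights = [0.5, 0.2, 0.3]  # obtained beforehand through training

def final_score(scores):
    """Collapse an evidence vector into a single value via the weights."""
    return sum(w * s for w, s in zip(weights, scores))

best = max(candidates, key=lambda name: final_score(candidates[name]))
print(best, round(final_score(candidates[best]), 2))  # Charles Dickens 0.83
```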

IBM Watson is a really interesting experiment and certainly a remarkable step towards realizing the Semantic Web. Watson responds correctly in almost all cases, but its positive results, far superior to the capabilities of the human competitors, are due to the context of the test. It takes an average of three or four seconds to get a response from Watson, far too long for general-purpose applications such as a web search engine, whose users cannot be expected to wait. It must therefore be said that the system developed by IBM, although absolutely remarkable, is not yet the right solution for environments other than the one for which it was created: the prize quiz.

A semantic Facebook?

A more recent Facebook feature attempts to push the massive social network in the same semantic direction.

In January 2013, Facebook introduced new structured status updates to users' profiles to help them share what they're feeling, watching, eating, reading, and more. When users create a post, Facebook asks "What are you doing?" and offers a drop-down menu with options such as Feeling, Watching, Eating, and more; the chosen activity is then attached to the status update along with an emoticon or a link to the relevant page. This feature allows users to share their activities in a way that can later be used for ad targeting or indexed search.

In 2011, Facebook introduced the concept of Open Graph applications, for which developers could create custom verbs and publish structured stories about what users were doing in their apps. This helps Facebook collect important information while giving users new ways to express themselves and to learn things through their friends. According to Facebook, that information isn't going to be included in Graph Search just yet, but it is only natural that it will be. We can also imagine Facebook using these structured status updates in new types of feeds that focus on a specific category. We've seen the social network add functionality to its music dashboard, and similar products could be developed for movies, books, or news. Other interesting features could be built as well, e.g. a view of how your friends have been feeling lately; based on those feelings and a user's likes and actions, Facebook might recommend whom to buy gifts for and what those users might want.
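A sketch of the kind of structured story Open Graph makes possible: an actor, a verb, and an object instead of free text. The field names here are illustrative, not Facebook's actual API schema.

```python
structured_status = {
    "actor": "Alice",
    "verb": "reading",            # a custom verb (feeling, watching, ...)
    "object": "Moby-Dick",        # links to the object's page
    "timestamp": "2013-01-30T18:45:00Z",
}

# Because verb and object are explicit fields, stories can later be
# filtered or indexed, e.g. for a "books my friends are reading" feed.
def books_friends_are_reading(stories):
    return [s["object"] for s in stories if s["verb"] == "reading"]

print(books_friends_are_reading([structured_status]))  # ['Moby-Dick']
```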

Google semantic search: state of the art

Currently, the Google search engine works syntactically, including through its voice interface Google Voice Search, responding to the user's query with results based on a set of probabilities computed by a constantly evolving algorithm. Google's first move towards search based on semantic criteria was the launch of the Knowledge Graph on May 16, 2012. In the article explaining how it works, Google says: "Now Google understands the difference, and can narrow your search results just to the one you mean. Just click on one of the links to see that particular slice of results."

Unlike the classic syntactic Google search we are used to, which compares the words of a query against a vast computerized archive, the Knowledge Graph analyzes the relationships between words or groups of words and decides what to suggest as an answer to the user. The operational scheme relies on a set of semantic links that should lead the system to understand the real needs of the connected user.

The Knowledge Graph is probably one of the largest databases of concepts and relationships, but unlike Watson, discussed above, those relationships are not currently used to perform a semantic analysis of the query. All Google does with the Knowledge Graph at the moment is give the user easy navigation of its content. For certain queries, it extracts information from the database and shows it on the right-hand side of the SERP (Search Engine Results Page). The user is led to believe that the boxes on the right were selected through semantic analysis, when in fact a simple statistical analysis of the huge query database determined which entities deserved a box and what kind of information was appropriate to present in each one.

«The Knowledge Graph enables Google to better understand your query, so we can summarize relevant content around that topic, including key facts you're likely to need for that particular thing. For example, if you're looking for Marie Curie, you'll see when she was born and died, but you'll also get details on her education and scientific discoveries. How do we know which facts are most likely to be needed for each item? For that, we go back to our users and study in aggregate what they've been asking Google about each item. For example, people are interested in knowing what books Charles Dickens wrote, whereas they're less interested in what books Frank Lloyd Wright wrote, and more in what buildings he designed. The Knowledge Graph also helps us understand the relationships between things: Marie Curie is a person in the Knowledge Graph, and she had two children, one of whom also won a Nobel Prize, as well as a husband, Pierre Curie, who claimed a third Nobel Prize for the family. All of these are linked in our graph. It's not just a catalog of objects; it also models all these inter-relationships. It's the intelligence between these different entities that's the key.»

Is this, then, the path for the development of the Semantic Web? The answer seems to be no, despite the words of Google's marketing. Statistical analysis of query logs reveals that users often named the writer Charles Dickens while specifying the subject of the requested information, such as which books he wrote. This lets Google know that "books" is a query term commonly associated with Charles Dickens, and that is sufficient for Google to determine that an answer to the query [Charles Dickens] would benefit from a list of the writer's works.
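The kind of statistical association involved can be sketched as simple counting over a query log; the log below is invented for illustration.

```python
from collections import Counter

query_log = [
    "charles dickens books", "charles dickens books list",
    "charles dickens biography",
    "frank lloyd wright buildings", "frank lloyd wright buildings tour",
    "frank lloyd wright books",
]

def top_attribute(entity: str, log):
    """Return the term most frequently following the entity name."""
    counts = Counter()
    for q in log:
        rest = q[len(entity):].split() if q.startswith(entity) else []
        if rest:
            counts[rest[0]] += 1
    return counts.most_common(1)

print(top_attribute("charles dickens", query_log))     # [('books', 2)]
print(top_attribute("frank lloyd wright", query_log))  # [('buildings', 2)]
```

No understanding of what a book or a building is enters the computation: the association falls out of frequency alone, which is the article's point.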

Google therefore has a large archive of terms and contexts, but uses it neither for the stated goal, semantic analysis of the query, nor to suggest semantically derived results to the user.

Pragmatic Web and Applications: Limited Results

After this brief analysis of the tools that interact semantically with the user, or are at least announced as such, let us examine how pragmatics is used on the web and in electronic tools generally.

The phrase "My daughter is a fox" produces a set of Google results of around 500,000 pages. One would expect results relating to cunning, guile, or quickness of thought, if not of a daughter then at least of a human being.

On the contrary, the top results point to pages about actual foxes, people who call themselves "fox", and someone's account of their daughter's experiences with a specimen of the animal.

The most obvious conclusion, after several similar experiments with the same search engine, is that in reality we are extremely far from machines making use of pragmatics.

From a pragmatic point of view, machines cannot "understand" inferences: despite attempts to associate words and phrases with metadata, they fail to complete the simple reasoning steps that allow us to fully understand an utterance like "My daughter is a fox". The machines do not gather the information a pragmatic approach requires, information a human draws on to arrive at a full understanding of the statement: knowledge from other utterances, the situation of the utterance, the intentions of the interlocutor, the specific context, and the world in general.

Evi Technologies Ltd. has developed an online platform, Evi (evi.com), that uses algorithms of a pragmatic nature to answer questions written by the user. If we type "Where is Elvis?", Evi answers: "Elvis Presley died on August 16, 1977 and is buried in Graceland". To arrive at such an answer, Evi makes a pragmatic assumption that links the name to its most popular referent. Evi provides a plausible answer, but of course runs the risk that the user is referring to another, lesser-known Elvis, in which case the answer would be completely off the mark. At each step, Evi can display the reasoning used to arrive at the answer.
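Evi's internals are not public; the following sketch only illustrates the popularity assumption just described, with invented entities and figures.

```python
entities = [
    {"name": "Elvis Presley",  "popularity": 0.98,
     "fact": "died on August 16, 1977 and is buried at Graceland"},
    {"name": "Elvis Costello", "popularity": 0.55,
     "fact": "is an English singer-songwriter"},
]

def resolve(name: str):
    """Pick the most popular entity whose name contains the query name."""
    matches = [e for e in entities if name.lower() in e["name"].lower()]
    return max(matches, key=lambda e: e["popularity"]) if matches else None

elvis = resolve("Elvis")
print(f"{elvis['name']} {elvis['fact']}")
# Elvis Presley died on August 16, 1977 and is buried at Graceland
```

The design choice is exactly the trade-off the paragraph names: a popularity prior makes the common case work and the uncommon case fail silently.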

Conclusions

We have analyzed the latest electronic and computing tools, including Google Voice Search and Apple's Siri, many of which are advertised and offered to the user as having semantic capabilities. However, from what we have seen, the semantic dimension is still a fairly distant goal for these machines. Accessing information about the world is now easy and affordable, and machines have access to a wealth of content from which to draw answers to user queries. Semantics, however, remains a difficult goal to achieve, and the cause lies in its pragmatic dimension. Despite enormous progress in representing knowledge about the world, the multitude of intentions and situated speech acts make the relationships between machines and humans extremely complex, especially when the latter use the conversational context as an integral part of the way they speak.