Research Papers

# Automated Discovery of Product Feature Inferences Within Large-Scale Implicit Social Media Data

Open Access

Suppawong Tuarob

Faculty of Information and
Communication Technology,
Mahidol University,
Salaya, Nakhon Pathom 73170, Thailand
e-mail: suppawong.tua@mahidol.edu

Sunghoon Lim

Industrial and Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: slim@psu.edu

Conrad S. Tucker

Engineering Design and Industrial and
Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: ctucker4@psu.edu

Contributed by the Computers and Information Division of ASME for publication in the JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING. Manuscript received August 13, 2017; final manuscript received February 17, 2018; published online May 2, 2018. Assoc. Editor: Rahul Rai.

J. Comput. Inf. Sci. Eng 18(2), 021017 (May 02, 2018) (14 pages) Paper No: JCISE-17-1162; doi: 10.1115/1.4039432 History: Received August 13, 2017; Revised February 17, 2018

## Abstract

Recently, social media has emerged as an alternative, viable source to extract large-scale, heterogeneous product features in a time- and cost-efficient manner. One of the challenges of utilizing social media data to inform product design decisions is the existence of implicit data such as sarcasm, which accounts for 22.75% of social media data and can potentially create bias in the predictive models that learn from such data sources. For example, if a customer says “I just love waiting all day while this song downloads,” an automated product feature extraction model may incorrectly associate the positive sentiment of “love” with the cell phone's ability to download. While traditional text mining techniques are designed to handle well-formed text where product features are explicitly inferred from the combination of words, these tools would fail to process social messages that include implicit product feature information. In this paper, we propose a method that enables designers to utilize implicit social media data by translating each implicit message into its equivalent explicit form, using the word co-occurrence network. A case study of Twitter messages that discuss smartphone features is used to validate the proposed method. The results from the experiment not only show that the proposed method improves the interpretability of implicit messages, but also shed light on potential applications in the design domains where this work could be extended.


## Introduction

The rigorous competition in the market space drives designers to create products that better satisfy the majority of customers in a resource-efficient manner. Oftentimes, it is crucial that designers are familiar with target customers' needs and preferences, in order to incorporate preferable features and remove weak elements from a design artifact. Recently, the literature has shown that information generated by social media users could prove critical to product designers in learning relevant preferences toward products/product features [1–6].

Technological advancements in digital communication have allowed many social media platforms to emerge as an alternative means for communication and information exchange in a timely and seamless manner. The literature in various fields of study has shown successful applications that rely on information extracted from large-scale social media data, such as mining healthcare-related information for disease prediction [7–9], detecting earthquake warnings and emergency needs due to natural disasters [10,11], and predicting financial market movement [12,13].

In the design informatics domain, alongside the traditional methods that extract customers' preferences from online product reviews, recent findings have illustrated that social networks could also serve as a viable source of information for mining customers' opinions toward products/product features, due to their fast publication, wide range of users, accessibility, and heterogeneity of content, which provides an opportunity for customers to express opinions about products outside the review sites [2]. A data-driven methodology has been proposed to automatically discover notable product features mentioned in social networks [5]. Later, such notable product feature information was incorporated into a decision support framework that helps designers to develop next-generation products [2]. Furthermore, large-scale social media data have been established as a viable platform to automatically discover innovative users in social networks [1,4]. Such innovative users could prove critical to product design and development as they help designers to discover relevant product feature preferences months or even years before they are desired by general customers.

Implicit speech is a form of language usage in which the actual meaning is intended to be comprehended, but not directly stated. A major manifestation of implicit speech is sarcasm, which has become not only abundant, but also a norm in social networks. Maynard and Greenwood found that roughly 22.75% of social media data are sarcastic [14]. While it is evident that knowledge extracted from social media data is useful to product designers, the applicability of such data is limited to the portion expressed in explicit forms, because the underlying natural language processing algorithms assume explicit, well-formed textual input. As a result, implicit information would be either treated as noise or misinterpreted, resulting in inaccurate recommendations from product design decision support systems that process the information from large-scale social media data. Hence, the ability to automatically understand and correctly interpret such implicit information in social networks would not only reduce the errors caused by methods that are not specifically designed to handle implicit information, but also allow the methodologies to make use of additional implicit data that would have traditionally been disregarded as noise.

Examples of explicit and implicit social media messages are given below:

Explicit

“My old 7 inch Samsung Galaxy Tab is my #1 travel companion - perfect size & functionality.”

Implicit

“I love when my blackberry bold screen freezes, the iphone 4 is definitely on my list of #13thingsiwant right now”

The first example is considered explicit because it can be directly inferred from both keywords and the grammatical structure that the user may be satisfied with the perfect size and functionality of his/her Samsung Galaxy Tab (Seoul, South Korea). On the contrary, the second example does not give any direct information about the screen feature of his/her Blackberry Bold (Waterloo, ON, Canada), and hence is implicit, though it may be inferred that this particular user may feel dissatisfied with his/her Blackberry Bold due to its frozen screen. If these implicit social media messages remain untreated, two problems could occur:

1. Many data mining algorithms are extraction based and classify each social media message as useful or not. Such methods would disregard implicit data from which explicit knowledge cannot be extracted, resulting in low utilization of useful data.
2. Sarcastic social media messages may either exaggerate (e.g., “Apparently, the new iphone 5 helps you lose weight, you buy it and you can't afford food for a month.”) or oppose (e.g., “HOLY SH**!… The iPhone 5 can now have 5 rows of icons. Too amazing. #sarcasm”) the original meaning. Traditional text mining techniques are incapable of correctly interpreting the true meaning of these untreated social media messages.

Regardless of all the useful applications that emerge from social media data, being able to automatically explicate implicit social media data would not only increase the performance of the existing natural language processing techniques, but would also enable discovery of genuinely important product features that exist in the implicit data.

Processing social media data has been one of the biggest challenges for researchers. Traditional natural language processing techniques that have been shown to work well on traditional documents are reported to fail or underperform when applied to social media data, whose nature differs from that of traditional documents in the following ways:

1. Social media data are high-dimensional, but sparse. A unit of social media document (aka message) is short, containing only one or two sentences. Some social media services, such as Twitter, limit the length of a message, urging the users to be creative and use their own combination of word forms to express their opinions within limited context. Traditional techniques for interpreting semantics from documents would fail on social media data due to insufficient textual content [7]. Furthermore, the high dimensionality caused by using creative word forms would prevent such traditional techniques from finding semantic similarity among the pool of social media messages.
2. Social media data are noisy. Noise in social media data comes in multiple forms such as grammatical errors (e.g., “In the middle of the day and takes off running”), intentional/unintentional typographical errors (e.g., “iphone 4 s sooo COOOOLLLL!”), and symbolic word forms (e.g., “:-/,” “LOL”). Since traditional text processing techniques assume documents to be well-formed and grammatically correct [15], they would fail to operate on social media data.

The existing attempts to interpret the semantic meaning behind implicit social media and relevant kinds of data (e.g., product reviews) include machine learning based implicit sentence detection algorithms proposed by Tsur and coworkers [16,17]. However, their methods only identify whether a piece of textual information is sarcastic or not. The work presented in this paper extends the previous literature by further extracting the true meaning from social media messages whose context related to products/product features is implicit.

This paper presents a mathematical model based on heterogeneous coword network patterns in order to translate implicit context toward a particular product or product feature into its explicit equivalent. A coword network (or word co-occurrence network) is a graph where each node represents a unique word, and each undirected edge is weighted by the frequency of co-occurrence of the two words it connects. In this work, the network is augmented to incorporate parts of speech into each word. The intuition behind using the coword network is that even though a message may be implicit, a similar combination of words may have been used by other users who express their messages more explicitly. For example, given an implicit message “wow I have to squint to read this on the screen,” other users may have used the terms squint and screen in a more explicit context such as “Don't make me squint @user - your mobile banner needs work on my tiny screen iPhone 5S.” If the combination of the words squint and screen occurs frequently enough in messages that contain the word tiny, then the system would be able to relate the original message to a more explicit set of terms. In particular, the system would be able to interpret that the user thinks that the screen of this particular product is small.

Specifically, this paper has the following main contributions:

1. The authors adopt the usage of the coword network in a product design context. The coword network has been shown to be useful in multiple semantic extraction applications in the information retrieval literature [7,18]. To the best of our knowledge, this is the first use of this technique in the design literature.
2. The authors propose a probabilistic mathematical model in order to map implicit product-related information in social media data into the equivalent explicit context.
3. The authors illustrate the efficacy of the proposed methodology using a case study of real-world smartphone data and Twitter data.

## Related Works

While the use of implicit language such as indirect speech and sarcasm has been well explored in multiple psycholinguistic studies [19–21], automatic semantic interpretation of implicit information in social networks is still in its infancy. This section first surveys the use of social media data pertaining to product design applications and then discusses the existing natural language processing techniques that have been used to extract semantics from social media data.

###### Applications of Large-Scale Social Media Data in Product Design Domain.

Knowledge extracted from product-related, user-generated information has proved valuable in product design applications. Archak et al. proposed a set of algorithms, both fully automated and semi-automated, to extract opinionated product features from online reviews. The extracted information was successfully used to predict product demand [22]. While their findings were promising, the algorithms were applied to online product reviews, whose nature differs from that of social media data in terms of noise, amount of indirect language (e.g., sarcasm), and language creativity that does not conform to standard English grammar. This research primarily aims to interpret the semantics of a subset of social media data whose language is sarcastic, which traditional natural language processing techniques would fail to handle effectively. Social media has recently been established as a viable source for product design and development. Previous studies claimed that knowledge extracted from social media data could be more beneficial than traditional product design knowledge sources such as product reviews (from popular online electronic commerce websites such as Amazon, Best Buy, and Walmart) and user study campaigns [2,4,5]. Asur and Huberman were able to use Twitter data collected during a 3-month period to predict the demand of theater movies [23]. They claimed that the prediction results are more accurate than those of the Hollywood Stock Exchange. Their study also found that sentiments in tweets can improve the prediction after a movie has been released. Tuarob and Tucker found that social media data could be a potential data source for extracting user preferences toward particular products or product features [2,5]. In a later work, they presented a methodology for automatic discovery of innovative users (aka lead users) in online communities, using a set of mathematical models to extract latent features (product features not yet implemented in the market space) and then identify lead users based on the volume of innovative features that they express in social media [1,4]. Lim and Tucker proposed a Bayesian-based statistical sampling algorithm that identifies product-feature-related keywords from social media data, without human-labeled training data [6]. Recently, Stone and Choi presented a visualization tool which allows designers to extract useful insights from online product reviews [24].

Since all the above techniques rely on the assumption that social media data are explicit, they would fail to correctly process implicit social media messages, which could result in errors or inaccurate results. With these emerging product design applications that rely on social media as a knowledge source, it is crucial that the algorithms behind these applications are able to correctly interpret the true meaning of the data.

###### Natural Language Technology for Semantic Interpretation in Social Media.

In this subsection, technologies used to process social media data that go beyond just keyword detection (which works only on explicit data) are reviewed. Multiple studies in the information retrieval field have agreed that it is necessary to develop special text processing techniques for social media messages, since they are different from traditional documents due to smaller textual content, heterogeneous language standards, and higher level of noise [25–29].

Social media holds sentiments expressed by its users (primarily in the form of textual data). Sentiment analysis in social media refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in social media. Thelwall et al. found that important events lead to increases in average negative sentiment strength in tweets during the same period [30]. The authors concluded that negative sentiment may be the key to popular trends in Twitter. Kucuktunc et al. studied the influence of several factors such as gender, age, education level, discussion topic, and time of day on sentiment variation in Yahoo! Answers [31]. Their findings shed light on an application to attitude prediction in online question-answering forums. Weber et al. proposed a machine learning based algorithm to mine tips, that is, short, self-contained, concise texts describing nonobvious advice [32]. Lim et al. applied unsupervised sentiment analysis in social media to identify the patient's potential symptoms and latent infectious diseases [9]. The sentiment of each short text is extracted and used as part of the features. Even though sentiment analysis could prove to be useful when designers would like to know how customers feel about a particular product or product feature, most sentiment extraction techniques only output sentiment level in two dimensions (i.e., positive and negative). Hence, more advanced techniques are needed in order to narrow down what the customers actually want to say.

Besides sentiment analysis, multiple studies have found that topical analysis could be useful when dealing with noisy textual data such as social media. Even though social media is high in noise due to the heterogeneity of the writing styles, formality, and creativity, such noise bears undiscovered wisdom of the crowd. Paul and Dredze utilized a modified latent Dirichlet allocation [33] model to identify 15 ailments along with descriptions and symptoms in Twitter data [34,35]. Tuarob et al. proposed a methodology for discovering health-related content in social media data by quantifying topical similarity between documents as a feature type [7,8]. Furthermore, a number of studies have been devoted to using topical models to detect emerging trends in social networks [36–38]. In the design informatics field, Tuarob and Tucker proposed a set of methods that extract product-related information from large-scale social media data, such as customer demands, notable product features, and innovative product ideas [1,2,39]. The techniques mentioned earlier rely on explicit content of social media data and would likely fail or not produce correct results when applied to documents whose meanings are implicit.

Implicit document processing has posed challenges to computational linguists. Researchers have studied the nature of implicit uses of language; however, none have successfully developed a computational model to translate implicit content into the equivalent explicit form. In dealing with implicit context in social media data, multiple algorithms have been proposed to detect the presence of implicit content in social media [16,40,41]; however, these algorithms do not further attempt to map the implicit content to the equivalent explicit forms. To the best of our knowledge, we are the first to explore the problem of identifying explicit customer preferences toward a product/product feature from large-scale social media data.

## Methodology

The method proposed in this paper mines language usages in the form of word co-occurrence patterns, in order to map implicit context commonly found in social media data to equivalent explicit ones. Figure 1 outlines the overview of the proposed methodology.

First, social media data are collected and preprocessed (Sec. 3.1). The textual content is then fed to the indexer in order to generate the coword network (Sec. 3.2). Once the network is generated and indexed, the user could give the system an implicit message as the query. The query is processed and the results are returned to the user as a ranked list of relevant keywords classified by parts of speech (Sec. 3.4). In this system, the user could be a human designer, or an automated program that mines product-related information from social media messages.

A practical usage of the proposed implicit message inference system would be to aid designers in synthesizing product features, mined from customers' feedback in large-scale social media data, into the next generation products. A framework was presented in Ref. [2], where designers iteratively identify notably good and bad features from existing products, and incorporate/remove them from the next generation products. The method proposed in this paper could be incorporated into such a framework to improve the notable product feature extraction process. Sections 3.1–3.4 will discuss each component in Fig. 1 in detail.

###### Social Media Data Preprocessing.

Social media provides a means for people to interact, share, and exchange information and opinions in virtual communities and networks [42]. For generalization, the proposed methodology minimizes the assumption about functionalities of social media data, and only assumes that a unit of social media is a tuple of unstructured textual content, a user ID, and a timestamp. Such a unit is referred to as a message throughout the paper. This minimal assumption would allow the proposed methodology to generalize across multiple heterogeneous pools of social media such as Twitter, Facebook, and Google+, as each of these social media platforms has this common data structure. Social media messages, corresponding to each product domain, are retrieved by a query of the product's name (and its variants) within the large stream of social media data.

###### Data Cleaning.

Most social media crawling application program interfaces provide additional information with each social media message such as user identification, geographical information, and other statistics.¹ Though this additional information could be useful, it is disregarded and removed not only to save storage space and improve computational speed, but also to preserve the minimal assumption about the social media data mentioned earlier.

Raw social media messages are full of noise that could prevent further steps from achieving the expected performance. In order to remove such noise, the data cleaning process does the following:

1. Lowercasing the textual content.
3. Removing stop words.²

Note that misspelled words (e.g., hahaha and lovin) and emoticons (e.g., :-) and (“)(-_-)(“)) are intentionally preserved. Even though they are not well-formed and do not exist in traditional dictionaries, they have been shown to carry useful information that infers semantic meaning behind the messages [8,43]. Furthermore, unlike traditional preprocessing techniques for reducing noise in documents, the social media data are not stemmed, since the previous studies have shown that stemming could excessively reduce the dimensionality of the data (especially in short messages, each of which contains roughly 14 words on average [44,45]), and would likely result in poorer performance [7].
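The cleaning steps that survive in the list above (lowercasing and stop-word removal, with no stemming or spelling correction) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the stop-word set `STOP_WORDS` and the whitespace tokenization are assumptions made for the example, and the step missing from the list is deliberately left out.

```python
# Hypothetical stop-word list for illustration; the list used in the paper
# is not given in this section.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "on", "in", "and", "my", "this"}


def clean_message(text, stop_words=STOP_WORDS):
    """Lowercase a message and drop stop words.

    Misspelled words and emoticons are kept as-is, and no stemming is
    applied, mirroring the choices described in the text.
    """
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return [tok for tok in tokens if tok not in stop_words]


cleaned = clean_message("Lovin the screen on my iPhone :-)")
```

Because tokens are only split on whitespace, the emoticon `:-)` and the informal spelling `lovin` pass through untouched.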

###### Sentiment Extraction.

The technique developed by Thelwall et al. is employed to quantify the emotion in a message [43]. The algorithm takes a short text as an input, and outputs two values, each of which ranges from 1 to 5. The first value represents the positive sentiment level, and the other represents the negative sentiment level. Two sentiment scores are used instead of just one (with a –/+ sign representing negative/positive sentiment) because research findings have determined that positive and negative sentiments can coexist [46]. However, this research focuses only on the net sentiment level; hence, the positive and negative scores are combined to produce an emotion strength score using the following equation:

$$\text{EmotionStrength (ES)} = \text{PositiveScore} - \text{NegativeScore} \tag{1}$$

A message is then classified into one of the three categories based on the sign of the emotion strength score (i.e., positive (+ve), neutral (0ve), and negative (−ve)). The EmotionStrength scores will later be used to identify whether a particular message conveys a positive or negative attitude toward a particular product or product feature.
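As a small sketch of Eq. (1) and the sign-based classification, assuming the two sentiment scores have already been produced by an external sentiment tool (the function names below are hypothetical):

```python
def emotion_strength(positive_score, negative_score):
    """Eq. (1): net sentiment as positive score minus negative score."""
    return positive_score - negative_score


def classify_message(positive_score, negative_score):
    """Map the sign of the emotion strength to the three categories in the text."""
    es = emotion_strength(positive_score, negative_score)
    if es > 0:
        return "+ve"
    if es < 0:
        return "-ve"
    return "0ve"


# Illustrative scores only; real scores would come from the sentiment tool.
label = classify_message(positive_score=4, negative_score=1)
```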

###### Feature Extraction.

Product features are extracted from each social media message. In this paper, the feature extraction algorithm used in Ref. [4] is employed. The pseudo-code of the algorithm is outlined in Algorithm 1. At a high level, the algorithm takes a collection of social messages corresponding to a product as input, and outputs tuples of the form $\langle \text{feature}, \text{frequency} \rangle$, such as $\langle \text{‘onscreen keyboard’}, 5 \rangle$, which indicates that the on-screen keyboard feature of this specific product was mentioned 5 times within the given corpus of social media messages. Interested readers are encouraged to consult Ref. [4] for additional details about the feature extraction algorithm.

The features are extracted because the proposed methodology infers explicit opinions toward a particular product feature, hence it is imperative that product features can automatically be identified.

###### Part of Speech Tagging.

The final step of social media data preprocessing is to tag each word in a social message with a part of speech (POS). In this paper, the Carnegie Mellon ARK Twitter POS tagger³ is used for this purpose. This particular POS tagger has not only been developed specially for social media data, but has also been successfully used in the product design domain [2].

The part of speech information is needed in order to disambiguate words with multiple meanings (i.e., homonyms) [47], which can be commonly found in social media. For example, the word “cold” in “Who waits for an iphone5 in this cold weather?” and “I've got a cold this morning. will skip class.” may have different meanings.

Each POS tag will become a node type in the coword network. Besides standard linguistic POS tags offered by the POS tagger tool, a special node type PRODUCT is also introduced to distinguish a word that represents a product name (e.g., iPhone 4, Samsung Galaxy S II, and Nokia N9) from other words. Table 1 lists the node types used in this research, along with their descriptions.

###### Generating and Indexing Coword Network.

A coword network is the collective interconnection of terms based on their paired presence within a specified unit of text. Traditional coword networks represent a node with only textual representation of a word. Variants of co-occurrence networks have been used extensively in the information retrieval field in a wide range of applications that involve semantic analysis such as concept/trend emergence detection [48,49], discovering new words, finding/clustering relevant items [50,51], semantic interpretation [7,52], and document annotation [53,54].

In this paper, a node also incorporates part of speech information for word-sense disambiguation purposes. Concretely, a coword network is an undirected, weighted graph where each node is a pair $\langle \text{Word}, \text{POS Tag} \rangle$ (e.g., $\langle \text{squint}, \text{V} \rangle$ and $\langle \text{iPhone 4}, \text{PRODUCT} \rangle$) that represents a POS tagged word, and each edge weight is the frequency of co-occurrence. Let D be the set of all social media messages and T be the vocabulary extracted from D. Formally, the coword network G is defined as follows:

$$G = \langle V, E \rangle, \qquad V = \{\langle \text{Word}, \text{POS Tag} \rangle \in T\}, \qquad E = \{(a,b) \mid a, b \in T\}$$

$$\text{Weight}(a,b) = \left|\{d \mid d \in D,\ (a,b) \in E,\ d \text{ contains both } a \text{ and } b\}\right|$$

A compound is defined as a set of nodes. A social media message is converted to a compound by converting each word in the message into a node. The nodes are then combined, and duplicate nodes are removed. Algorithm 2 explains how the coword network is generated from a corpus of social media messages. First, the set of nodes, V, and the set of edges, E, are initialized to empty sets. For each social media message d in the corpus D, all the words are tagged with appropriate POS tags, and then converted into nodes which are then combined into a compound c. Each node n in the compound c is added to V. Then, for each possible pair of nodes in c, the weight of the edge that links these two nodes is incremented by 1. The coword network generation is finished once all the messages are processed. In this paper, the open-source graph database Neo4J⁴ is used to store and index the network. Neo4J is used in this task due to its scalability, which allows a network with millions of edges to be efficiently stored and indexed.
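The generation procedure described above can be sketched in plain Python. This is an in-memory illustration under stated assumptions, not the paper's Algorithm 2 verbatim: messages are assumed to be already cleaned and POS tagged, the network is held in a `set` and a `Counter` rather than a graph database, and the function names and POS tags in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_coword_network(tagged_messages):
    """Build (nodes, weights) from messages given as lists of (word, pos_tag) pairs."""
    nodes = set()
    weights = Counter()  # maps a sorted node pair to its co-occurrence count
    for tagged in tagged_messages:
        compound = set(tagged)  # duplicate nodes within a message are removed
        nodes.update(compound)
        # Each unordered pair of nodes in the compound co-occurs once in this message.
        for a, b in combinations(sorted(compound), 2):
            weights[(a, b)] += 1
    return nodes, weights


def edge_weight(weights, a, b):
    """Look up the weight of the undirected edge between a and b (0 if absent)."""
    return weights[tuple(sorted((a, b)))]


# Hand-tagged toy corpus; the tags are illustrative, not tagger output.
corpus = [
    [("squint", "V"), ("screen", "N"), ("tiny", "A")],
    [("squint", "V"), ("screen", "N")],
    [("screen", "N"), ("tiny", "A"), ("screen", "N")],
]
nodes, weights = build_coword_network(corpus)
```

Storing each edge under a sorted pair keeps the graph undirected, so `edge_weight` gives the same answer regardless of argument order.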

###### Sarcasm Detection.

A majority of implicit social media data are manifested in the form of sarcastic messages. Maynard and Greenwood reported that roughly 22.75% of social media data is sarcastic [14]. Hence, this work focuses on improving the ability to interpret sarcastic product-related social media messages. In the proposed framework, sarcastic messages are automatically discovered using a machine learning based sarcasm detection algorithm, implemented in Ref. [55]. The algorithm produces a sarcastic message detection model using the features extracted from the training data. These feature sets include the following:

• n-grams: This feature set extracts individual words (uni-grams) and two consecutive words (bi-grams) from a given message. These n-gram features are used extensively to train classification models for text classification tasks. Sequences of three or more consecutive words are not used, since research has shown that the combination of uni-grams and bi-grams is sufficient and yields the best results while consuming reasonable amounts of computing resources and memory [56].

• Sentiment: It is hypothesized that sarcastic messages are more negative than nonsarcastic ones. Mathematically,

$$h_0:\ \text{sentiment}_{\text{neg}}(\text{sarcastic}) > \text{sentiment}_{\text{neg}}(\text{nonsarcastic}) \tag{2}$$

Moreover, studies show that sarcastic messages tend to exhibit the co-existence of positive and negative sentiments [46]. The sentiment features include (1) a positive and a negative sentiment score assigned to each word in the message using the SentiWordNet⁵ dictionary, and (2) the sentiment score produced by the Python library TextBlob.⁶

• Topics: The topical features are extracted using the Latent Dirichlet Allocation algorithm [33] implemented in gensim.⁷

The training dataset includes 20,000 sarcastic tweets and 100,000 nonsarcastic tweets collected over a period of three weeks in June–July 2014. Once the features are extracted from the training data, they are used to train a support vector machine classification model. The trained model is then used to classify a message as sarcastic or nonsarcastic.
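To make the n-gram feature set in the list above concrete, the following sketch extracts uni-grams and bi-grams from a message. It is only an illustration: the whitespace tokenization and the function name `ngram_features` are assumptions, and the implementation in Ref. [55] may tokenize and encode features differently. The sentiment and topic features, and the support vector machine itself, are not shown.

```python
def ngram_features(message):
    """Return the uni-grams followed by the bi-grams of a message."""
    tokens = message.lower().split()  # assumption: whitespace tokenization
    unigrams = list(tokens)
    # Pair each token with its successor to form consecutive two-word sequences.
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return unigrams + bigrams


features = ngram_features("I just love waiting")
```

In a full pipeline, these strings would typically be mapped to a sparse count vector before being passed to the classifier.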

###### Query and Result Processing.

A query is a free text message with implicit content. Example queries include “I can't express how much I love the price of iPhone 5 on black Friday” and “I have to squint the screen to read this on Nokia N9.” This section describes how a user query is transformed into the network-compatible format, or a compound Q, for further processing. In particular, in order to process a free text query QText, the following steps are performed:

1. Preprocess the query QText using the mechanism described in Sec. 3.1, in order to clean the raw message, extract features, and assign POS tags.
2. Form the query compound Q, by converting each POS tagged word into a node, and combining them into a set.
3. Remove the nodes in Q that do not exist in the coword network.

The resulting query compound Q is then fed into the system for further processing.
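Steps 2 and 3 above amount to a set construction followed by a set intersection. A minimal sketch, assuming the query has already been preprocessed into (word, POS tag) pairs in step 1 (the function name and the tags in the example are hypothetical):

```python
def form_query_compound(tagged_query, network_nodes):
    """Turn a POS-tagged query into a compound restricted to known network nodes."""
    compound = set(tagged_query)          # step 2: one node per distinct tagged word
    return compound & set(network_nodes)  # step 3: drop nodes absent from the network


# Toy network vocabulary and a hand-tagged query, for illustration only.
network_nodes = {("squint", "V"), ("screen", "N"), ("tiny", "A")}
tagged_query = [("wow", "!"), ("squint", "V"), ("screen", "N"), ("screen", "N")]
query_compound = form_query_compound(tagged_query, network_nodes)
```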

The implicit message translation problem is transformed into a node ranking problem so that traditional information retrieval techniques can be applied. In this context, a node in the coword network is equivalent to a combination of a word and its POS. Given the set of products in the same domain (product space) $S$, the set of all features (feature space) $F$, the coword network $G=\langle V,E \rangle$, and the query compound Q, the node ranking algorithm takes the following steps:

1. Step 1: For each node $t∈V$, compute $P(t|Q,f,s)$, the likelihood (relevant to product features) of the node t given the query compound Q, target product feature $f∈F$, and the product $s∈S$.

2. Step 2: Rank the nodes by their likelihood.

3. Step 3: Return the top-ranked nodes.

$P(t|Q,f,s)$ represents the likelihood that the node t is relevant to the feature f of the product s, given the query compound Q. The relevance of a node is quantified by its relatedness and explicitness to the query compound Q. Hence, mathematically, $P(t|Q,f,s)$ is defined as follows:

(3) $P(t|Q,f,s)=\sum_{q\in Q} w_q\cdot \mathrm{relatedness}(t,q)\cdot \mathrm{explicitness}(t|q)$

where

(4) $\mathrm{relatedness}(t,q)=\dfrac{\mathrm{weight}(t,q)}{\sum_{n\in \mathrm{Adj}(q)}\mathrm{weight}(n,q)}$

(5) $\mathrm{explicitness}(t|q)=\dfrac{\mathrm{degree}(t)}{\sum_{n\in \mathrm{Adj}(q)}\mathrm{degree}(n)}$

Here, $w_q$ is the weight for the node $q∈Q$, with $\sum_{q\in Q} w_q=1$, and Adj(q) is the set of adjacent (neighbor) nodes of q. In the implementation, the feature node (i.e., f) and the product node (i.e., s) are given twice the weight of the other nodes in the compound, so that the likelihood assigned to each node is more relevant to the feature of the product of interest. weight(t, q) is the weight of the edge linking t and q, which is the co-occurrence frequency of the two nodes. Note that if t and q have never been mentioned together, then relatedness(t, q) evaluates to zero.

Relatedness(t, q) hence quantifies how frequently t and q are mentioned together. The score is normalized to the range [0,1] for consistency when combined with other components.

$Explicitness(t|q)$ quantifies the explicitness of the term represented by the node t when it appears in the same context as the term represented by the node q, and is measured by the normalized degree of the node t. A term is explicit if it makes the context clearer or easier for readers to understand. An intuitive assumption is made that terms with explicit meanings tend to be commonly used and mentioned frequently in multiple contexts. Such properties are captured by the degree of the node representing the term, since the higher the degree of a node, the more diverse the set of words with which it is comentioned. Table 2 provides examples of the ten highest degree nodes and the ten lowest degree nodes, classified by parts of speech. From the examples, it can be seen that words with high degrees have explicit meanings and would make the context simpler and clearer. On the other hand, words with low degrees tend to be spurious words that are not directly associated with the product domain. These words tend to make the context implicit, especially when talking about a product or product feature.

Finally, $P(t|Q,f,s)$ is a weighted sum of the relevance between the node $t∈V$ and each node in the query compound Q. $P(t|Q,f,s)$ ranges between 0 and 1 and is used to approximate the probability of the node t being relevant to the query compound Q. Once $P(t|Q,f,s)$ is computed for all the nodes in the coword network, the nodes can be ranked by this score. The final output of the system is then the top-ranked words, classified by their parts of speech.
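The scoring in Eqs. (3)–(5) can be illustrated with a minimal Python sketch. It makes several assumptions that are not stated in the text: the network is stored as an undirected adjacency map, degree is taken as the number of distinct neighbors, nodes are written as `word/POS` strings, and the toy edge weights are invented. The names `build_graph` and `rank_nodes` are hypothetical.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected weighted adjacency map from (a, b, weight) triples."""
    graph = defaultdict(dict)
    for a, b, w in edges:
        graph[a][b] = w
        graph[b][a] = w
    return dict(graph)

def relatedness(graph, t, q):
    """Eq. (4): weight of edge (t, q) over the total weight of q's edges."""
    total = sum(graph[q].values())
    return graph[q].get(t, 0) / total if total else 0.0

def explicitness(graph, t, q):
    """Eq. (5): degree of t over the summed degrees of q's neighbors."""
    total = sum(len(graph[n]) for n in graph[q])
    return len(graph[t]) / total if total else 0.0

def rank_nodes(graph, compound, boosted=frozenset(), top_k=3):
    """Eq. (3): score every node against the compound, highest first.
    Nodes in `boosted` (target feature and product) get twice the raw
    weight; the weights are then normalized so that they sum to 1."""
    raw = {q: (2.0 if q in boosted else 1.0) for q in compound}
    norm = sum(raw.values())
    scores = {
        t: sum(raw[q] / norm * relatedness(graph, t, q) * explicitness(graph, t, q)
               for q in compound)
        for t in graph
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# ASSUMPTION: invented co-occurrence counts, for illustration only.
toy_graph = build_graph([
    ("screen/N", "small/A", 6), ("screen/N", "read/V", 3),
    ("screen/N", "squint/V", 1), ("read/V", "small/A", 2),
    ("read/V", "hard/A", 2), ("small/A", "hard/A", 1),
])
ranked = rank_nodes(toy_graph, {"screen/N", "read/V"}, boosted={"screen/N"})
for node, score in ranked:
    print(f"{node}: {score:.4f}")
```

Because the text scores every node in V, nodes that belong to the query compound are scored as well; a caller who wants only new words could filter them out of the result.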

## Case Study, Results, and Discussion

This section introduces a case study used to verify the proposed methodology and discusses the results.

A case study of 27 smartphone products is presented that uses social media data (Twitter data) to mine relevant product design information. Data pertaining to product specifications from the smartphone domain are then used to validate the proposed methodology. The selected smartphone models include BlackBerry Bold 9900, Dell Venue Pro, HP Veer, HTC ThunderBolt, iPhone 3G, iPhone 3GS, iPhone 4, iPhone 4S, iPhone 5, iPhone 5C, iPhone 5S, Kyocera Echo, LG Cosmos Touch, LG Enlighten, Motorola Droid RAZR, Motorola DROID X2, Nokia E7, Nokia N9, Samsung Dart, Samsung Exhibit 4G, Samsung Galaxy Nexus, Samsung Galaxy S 4G, Samsung Galaxy S II, Samsung Galaxy Tab, Samsung Infuse 4G, Sony Ericsson Xperia Play, and T-Mobile G2x.

Smartphones are used as a case study in this paper because of the large volume of discussion about this product domain in social media. Previous work also illustrated that social media data (i.e., Twitter) contain crucial information about product features of other, more mundane products such as automobiles [2,39]. The proposed algorithms may not work well for products that are not widely discussed in social media (in terms of the quantity of messages related to the product), as the corresponding sets of social media messages may be too small to extract useful knowledge from.

###### Social Media Data Collection.

Twitter8 is a microblog service that allows its users to send and read text messages of up to 140 characters, known as tweets. The Twitter dataset used in this research was collected randomly using the provided Twitter application program interfaces, and comprises 2,117,415,962 (∼2.1 billion) tweets in the U.S. during the period of 31 months, from March 2011 to September 2013.

Tweets related to a product are collected by detecting the presence of the product name (and variants), and preprocessed by cleaning and mapping sentiment level as discussed in Sec. 3.1. Table 3 lists the number of tweets, number of unique Twitter users, and number of extracted features.

###### Coword Network Generation.

The coword network is generated using the procedure outlined in Algorithm 2, using all the social media data associated with the 27 smartphone models. The resulting network contains 95,999 nodes and 2,288,723 edges. On average, a node has a degree of 47.7 and is used 160 times. Table 4 lists the numbers and average degrees of nodes categorized by parts of speech.
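The reported average degree is consistent with the node and edge counts if the network is treated as undirected, so that each edge contributes to the degree of two nodes. This is an assumption; the text does not state the directedness explicitly. A quick check:

```python
# Node and edge counts as reported for the generated coword network.
num_nodes = 95_999
num_edges = 2_288_723

# ASSUMPTION: undirected graph, so the degrees sum to twice the edge count.
avg_degree = 2 * num_edges / num_nodes
print(round(avg_degree, 1))  # 47.7
```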

Figure 2 illustrates a graphical visualization of the generated coword network using the large-scale graph layout generation algorithm OpenORD [57].

###### Query and Result Processing.

This section reports notable results from the proposed methodology.

Given a textual query with implicit content, the system first transforms it into a compound by removing stop words and converting each remaining distinct word into a node. For example, the textual query “I have to squint the screen to read this on Nokia N9” would be translated into the compound $\{〈read,V〉, 〈squint,V〉, 〈screen,N〉, 〈Nokia N9,PRODUCT〉\}$. Note that not all the words in the query are converted into nodes, since some are stop words (e.g., I, have, the, this, and on). Figure 3 shows part of the generated coword network where all the nodes co-occur with the query nodes (i.e., screen, read, squint, and Nokia N9). The thickness of each edge is proportional to the actual edge weight. Similarly, the size of each node represents its relative degree.

###### Experiment Procedure.

Six smartphone models are selected for evaluation of the proposed coword implicit message translation model: HTC ThunderBolt, Motorola Droid, Samsung Galaxy, iPhone 3, iPhone 5, and iPhone 4. The sarcasm detection algorithm described in Sec. 3.3 is used to select sarcastic messages associated with each selected smartphone model. To establish ground-truth validation data, each message (and the focus product feature) is manually tagged with the actual sentiment (negative, neutral, or positive) of the message poster toward that feature. For example, “I love how everyone with an iPhone 5 says “look! My camera is 8 megapixels.” No. F*** off. Both of my Evo's have had 8 megapixel camera's.” is associated with iPhone 5, and is tagged with $〈camera,Negative〉$, meaning that the poster may actually feel negative (i.e., unsurprised) about the camera feature of the iPhone 5. Table 5 lists the number of sarcastic messages, manually classified by their actual sentiment, associated with each selected smartphone model.

The evaluation is designed to compare the performance of the proposed coword implicit message translation model (Coword) against a baseline (Baseline). The baseline method returns the original sarcastic message without any modification (hence, the message is not semantically processed with the coword network shown in Fig. 1). Such a comparison allows us to see whether the proposed Coword model can translate a given sarcastic message into its explicit form. To compare the efficacy of both methods, this problem is transformed into a classification problem, where both the Coword and the Baseline translated versions are classified by sentiment (negative, neutral, and positive) using the sentiment extraction algorithm described in Sec. 3.1.2, and compared with the ground-truth sentiment. Standard information retrieval evaluation metrics are used, including precision, recall, and F-measure. These metrics have been used extensively to validate the quality of the results of classification algorithms [58].

For each sentiment class $c ∈$ {Negative, Neutral, Positive}, let CC(c) denote the number of sarcastic messages correctly classified as c, CA(c) denote the number of sarcastic messages classified as c, and N(c) denote the number of sarcastic messages labeled as class c. The metrics are defined as follows:

(6) $\mathrm{precision}(c)=\dfrac{CC(c)}{CA(c)}$

(7) $\mathrm{recall}(c)=\dfrac{CC(c)}{N(c)}$

(8) $F\text{-measure}(c)=\dfrac{2\cdot \mathrm{precision}(c)\cdot \mathrm{recall}(c)}{\mathrm{precision}(c)+\mathrm{recall}(c)}$

Recall is the ratio of the number of messages the classifier correctly recalls to the number of all messages in that class. If there are 10 messages that belong to the class c, and a classifier recalls all 10 messages correctly, then the recall of the classification with respect to class c is 1.0 (100%). If the classifier recalls 7 messages correctly, then the recall is 0.7 (70%). Precision is the ratio of the number of messages the classifier correctly recalls to the number of all messages it recalls (a mix of correct and wrong recalls). In other words, precision quantifies how precise the recalled results are. F-measure combines precision and recall into one number with equal weights. Note that precision, recall, and F-measure all lie in the range [0,1].
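The three metrics can be computed directly from the counts CC(c), CA(c), and N(c). The following sketch is illustrative only: the function name `per_class_metrics` and the toy labels are assumptions, with the toy data built so that 7 of 10 Negative messages are recalled, mirroring the example above.

```python
def per_class_metrics(true_labels, predicted_labels, cls):
    """Return (precision, recall, F-measure) for one sentiment class."""
    pairs = list(zip(true_labels, predicted_labels))
    cc = sum(1 for t, p in pairs if t == p == cls)  # correctly classified as cls
    ca = sum(1 for _, p in pairs if p == cls)       # all classified as cls
    n = sum(1 for t, _ in pairs if t == cls)        # all labeled as cls
    precision = cc / ca if ca else 0.0
    recall = cc / n if n else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# ASSUMPTION: invented labels; 10 Negative and 4 Positive ground-truth messages.
truth = ["Negative"] * 10 + ["Positive"] * 4
predicted = ["Negative"] * 7 + ["Neutral"] * 3 + ["Negative"] + ["Positive"] * 3

precision, recall, f_measure = per_class_metrics(truth, predicted, "Negative")
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

Returning 0.0 when a denominator is zero is a convention chosen for this sketch; the text does not specify how empty classes are handled.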

###### Experiment Results.

Table 6 reports the sentiment classification results from the messages translated by the coword method and the baseline for each selected smartphone model. The classification results for each class (Positive, Neutral, and Negative) are displayed for both methods. The bold figures denote the better result between the coword and the baseline methods. Figure 4 summarizes the classification performance on the six selected smartphone models, grouped by precision, recall, and F-measure of the three sentiment classes.

Figure 5 compares the F-measure of the classification results for sarcastic messages translated by the coword and the baseline methods on the negative class. The messages translated by the proposed coword method help the sentiment extraction algorithm identify the true negative sentiment for four out of six products, namely Motorola Droid, iPhone 3, iPhone 5, and iPhone 4. The reason why the coword method performs worse than the baseline on the remaining two products can be explained with Fig. 4. For the negative class, although the overall recall of the coword method surpasses that of the baseline by 39.46%, the precision deteriorates by 21.21%. This suggests that the coword method still misinterprets some of the actual positive messages as negative (since the recall for the positive class also drops by 34% according to Fig. 4), and hence introduces false positives for the negative class. Regardless, the overall performance in terms of F-measure improves by 14% for the negative class.

Figure 6 compares the F-measure of the coword and baseline methods on the neutral class. Evidently, the proposed coword method allows the sentiment classifier to correctly interpret the actual neutral sentiment of a sarcastic message for all six selected smartphone models. Figure 4 further elaborates this phenomenon by showing the improvement in precision (by +42.57%), recall (by +128.24%), and F-measure (by +86.46%).

Figure 7 compares the F-measure of the coword and the baseline methods for sentiment classification on the positive class. The coword method improves the sentiment classification in four out of six smartphone models, namely Motorola Droid, Samsung Galaxy, iPhone 3, and iPhone 5. The coword method is not the clear winner for all the selected products on positive sentiment messages because it still tends to misinterpret some positive messages as negative ones. As a result, the number of translated messages that appear positive is limited, causing recall to drop by 34% on average (according to Fig. 4). Regardless, since the coword method is selective about the positive class, the precision of the positive class is boosted by 49.15%, leading the F-measure of the positive class to improve by 2.78%.

Overall, the classification results in terms of F-measure are improved on average for all three sentiment classes (+14% for the negative class, +86.46% for the neutral class, and +2.78% for the positive class). The experiment results presented in this section not only illustrate that the coword technique has the potential to facilitate the translation of implicit social media messages, so that they can be further processed by traditional natural language processing techniques, but also shed light on room to improve and extend the proposed framework for design applications that rely on knowledge extracted from large-scale social media data, such as [13].

Table 7 illustrates the actual results from the proposed methodology on 10 sample social media messages whose preferences associated with the target product features are implicit (i.e., in the form of sarcasm). The table lists actual Twitter messages, target features, manual interpretation (by the authors), and the resulting top three relevant keywords (out of 10 keywords returned by the system), classified by parts of speech. Only nouns (N), verbs (V), and adjectives (A) are shown, since the combination of these words is mostly sufficient to interpret the explicit semantics behind each message.

From the sample results, it is evident that the combination of the top words returned by the system could potentially provide the explicit meaning of the implicit message. For example, “I can't express how much I love the price of iPhone 5 on black Friday” may imply that the user would like to buy an iPhone 5 today (which may be a Black Friday) because the price is cheap. Similarly, the user who posts “eh DroidRazr HD resolution? I don't think so.” may convey that the display of his/her Droid Razr is bad and needs to be upgraded.

Most traditional semantic interpretation techniques, including sentiment analysis, assume that documents are explicit and would fail when dealing with these implicit social media messages. The column “Sentiment Level (From Implicit Context)” shows the quantified sentiment level using the algorithm described in Sec. 3.1.2 on the original tweets. The actual Emotional Strength scores are in parentheses. The column “Manual Sentiment Evaluation” lists the manual evaluation by the authors of the actual sentiment that each sample tweet infers toward the target product features (either Positive or Negative). The column “Sentiment Level (From Translated Explicit Context)” shows the sentiment level using the same sentiment extraction algorithm, but on the translated explicit content generated by concatenating the top 20 keywords returned by the system into a single text (disregarding parts of speech). The sentiment levels computed on the translated text agree with the manual evaluation in all the samples shown in Table 7.

Not surprisingly, the sentiment levels extracted from the original texts are all incorrect, since the sentiment extraction technique is designed to detect explicit sentiment, and hence would not give correct results when dealing with sarcasm or vague context. It is also interesting to note that the sentiment computed for the implicit sample messages tends to be neutral (Sentiment Level $≈$ 0), despite the fact that they are composed of emotion-inspired words (i.e., love, can't, shit, beautifully, and incredible). This agrees with prior findings that messages with implicit sentiment (i.e., sarcasm) would be sentimentally neutralized, since such messages tend to have equally high volumes of both Positive and Negative scores, causing the Emotion Strength score to converge to 0 [59].

## Conclusions and Future Works

This paper proposes a knowledge-based methodology for inferring explicit sense from social media messages whose connotations related to products/product features are implicit. The methodology first generates a coword network from the corpus of social media messages, which is used as the knowledge source that captures the relationship among all the words expressed in the stream of large-scale social media data. A set of mathematical formulations is proposed in order to identify a combination of keywords that would best infer the explicit connotation of a given implicit message query. A case study of 27 real-world smartphone models with 31 months' worth of Twitter data is presented. The results for the selected smartphone models show great promise that the proposed methodology is effective in translating implicit product preferences into their explicit equivalent connotations, which could be readily used in further knowledge extraction applications such as synthesizing product features [2], predicting future product demands and long-term product longevity [5], and identifying innovative users in online communities [4]. Future work could strengthen the evaluation process by involving user studies and verify the generalizability of the proposed methodology by examining diverse case studies of different product domains and social media services. Machine learning approaches that process psychological information, such as [60], will also be explored to predict the behaviors of customers from their sarcasm and other forms of language usage.

## Acknowledgements

This research project is supported by Mahidol University. Suppawong Tuarob is the corresponding author. We are also grateful for help with implementation and experimentation from Lawrence Lee.

## Funding Data

• Mahidol University (A14/2559).

## Appendices

###### Appendix: Statistics for Feature Characteristics

The statistics used to describe the characteristics of the product features extracted from the social media data are described in this section. Given a product $s∈S$, social media message corpus $M_s$, and the set of extracted features $F(M_s)$, Feature Utilization, Feature Intensity, and Feature Diversity are defined below:

###### Feature Utilization.

For a given product $s∈S$, the feature utilization is defined as

(9) $F\text{-utilization}(s)=\dfrac{\sum_{f\in F(M_s)}\left|\{m\in M_s: f\in m\}\right|}{\left|F(M_s)\right|}$

The feature utilization quantifies how frequently, on average, a product feature is mentioned. The notion was first used in Refs. [53,61] as the Tag Utilization metric, and was used to measure how solid and standardized a tag collection is. Similarly, the Feature Utilization measures, overall, how standardized the features of a specific product are.
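As a small illustration of the feature utilization definition, the sketch below represents each message as the set of features extracted from it. This representation, the function name `feature_utilization`, and the toy messages are assumptions made for the example only.

```python
def feature_utilization(messages):
    """Average number of messages mentioning each extracted feature.
    `messages` is a list of sets, one set of extracted features per message."""
    features = set().union(*messages)  # F(M_s): all features seen in the corpus
    if not features:
        return 0.0
    mentions = sum(sum(1 for m in messages if f in m) for f in features)
    return mentions / len(features)

# ASSUMPTION: four invented messages about one product.
toy_messages = [{"screen", "camera"}, {"screen"}, set(), {"battery"}]
utilization = feature_utilization(toy_messages)
print(round(utilization, 3))
```

Here "screen" is mentioned twice and "camera" and "battery" once each, so the four mentions are spread over three features.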

From Table 3, the products with the highest feature utilization are iPhone 5, iPhone 4S, iPhone 5S, iPhone 4, iPhone 5C, Motorola Droid RAZR, and Samsung Galaxy Nexus, in that order. It comes as no surprise that the iPhone product line has high feature utilization, since the product line has been on the market for a long time and most features are inherited from the very first generation (such as the touch screen, home button, and color (black/white)). Over generations, these features may have become standardized, as opposed to those of products with newly emerging features such as Kyocera Echo (F-Utilization = 1.36), which distinctly offers two screens and the ability to use two applications at once.

###### Feature Intensity.

Given a product s, the feature intensity is defined as

(10) $F\text{-intensity}(s)=\dfrac{\left|\bigcup_{f\in F(M_s)}\{m\in M_s: f\in m\}\right|}{\left|M_s\right|}$

While feature utilization quantifies the overall quality of the features of a product, feature intensity quantifies the volume of discussion in social media about the product features. It is measured by the proportion of messages related to the features of the product s over all the messages related to s. The feature intensity can indicate how many of the consumers care to discuss the product that they are using.
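Since the numerator is the set of messages that mention at least one extracted feature, feature intensity reduces to a simple fraction. The sketch below assumes each message is represented as the set of features extracted from it; the function name `feature_intensity` and the toy data are hypothetical.

```python
def feature_intensity(messages):
    """Fraction of messages that mention at least one extracted feature.
    `messages` is a list of sets, one set of extracted features per message."""
    if not messages:
        return 0.0
    with_feature = sum(1 for m in messages if m)  # non-empty feature set
    return with_feature / len(messages)

# ASSUMPTION: four invented messages; one mentions no feature at all.
toy_corpus = [{"screen", "camera"}, {"screen"}, set(), {"battery"}]
intensity = feature_intensity(toy_corpus)
print(intensity)  # 0.75
```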

Interestingly, most of the iPhone products (except the newly launched iPhone 5S and iPhone 5C) are among the smartphone products with the lowest feature intensity scores. This might be because such products have been perceived by consumers as generally good by word of mouth, which induces other consumers to purchase them without much consideration of the features before making the purchasing decision.

###### Feature Diversity.

The feature diversity tells how diverse the consumers' opinions are toward a particular feature. For a feature f of product $s∈S$,

(11) $F\text{-diversity}(f,s)=\dfrac{\left|\mathrm{opinion}(f,s)\right|}{\left|\bigcup_{s'\in S,\, f'\in F(M_s)}\mathrm{opinion}(f',s')\right|}$

(12) $\text{avg-}F\text{-diversity}(s)=\dfrac{\sum_{f\in F(M_s)}F\text{-diversity}(f,s)}{\left|F(M_s)\right|}$

where opinion(f, s) is the set of distinct opinions toward the feature f of the product s. Recall that the feature extraction algorithm (Algorithm 1) also extracts opinions for each extracted feature. The average feature diversity then quantifies the opinion diversity across the features of a particular product. The products with the highest diversity include LG Enlighten, Samsung Exhibit 4G, LG Cosmos Touch, Samsung Dart, Kyocera Echo, and iPhone 5C. Note that these products either have highly controversial features (i.e., Kyocera Echo, which offers dual screens with predictive text input, and Samsung Exhibit 4G, which offers dual cameras at surprisingly cheap prices) or are newly launched (i.e., iPhone 5C), all of which could incite diverse opinion-related discussion about the product features.
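The two diversity measures can be sketched as follows. The sketch reads the denominator of the per-feature diversity as the set of all distinct opinions across every product and feature, which is an interpretation rather than something the text confirms. The mapping from (product, feature) to opinion words, the function names, and the toy data are all hypothetical.

```python
def feature_diversity(opinions, feature, product):
    """Distinct opinions on one feature over all distinct opinions observed.
    `opinions` maps (product, feature) -> set of opinion words."""
    all_opinions = set().union(*opinions.values())
    if not all_opinions:
        return 0.0
    return len(opinions.get((product, feature), set())) / len(all_opinions)

def avg_feature_diversity(opinions, product):
    """Mean feature diversity over the features of one product."""
    features = [f for (s, f) in opinions if s == product]
    if not features:
        return 0.0
    return sum(feature_diversity(opinions, f, product) for f in features) / len(features)

# ASSUMPTION: invented opinions for two hypothetical products "A" and "B".
toy_opinions = {
    ("A", "screen"): {"big", "bright"},
    ("A", "battery"): {"weak"},
    ("B", "screen"): {"big", "dim", "cracked"},
}
print(feature_diversity(toy_opinions, "screen", "A"))
print(avg_feature_diversity(toy_opinions, "A"))
```

With five distinct opinion words overall, the screen of product "A" has a diversity of 2/5, and product "A" averages 2/5 and 1/5 over its two features.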

## References

Tuarob, S. , and Tucker, C. S. , 2015, “ Automated Discovery of Lead Users and Latent Product Features by Mining Large Scale Social Media Networks,” ASME J. Mech. Des., 137(7), p. 071402.
Tuarob, S. , and Tucker, C. S. , 2015, “ Quantifying Product Favorability and Extracting Notable Product Features Using Large Scale Social Media Data,” ASME J. Comput. Inf. Sci. Eng., 15(3), p. 031003.
Tuarob, S. , and Tucker, C. S. , 2015, “ A Product Feature Inference Model for Mining Implicit Customer Preferences Within Large Scale Social Media Networks,” ASME Paper No. DETC2015-47225.
Tuarob, S. , and Tucker, C. S. , 2014, “ Discovering Next Generation Product Innovations by Identifying Lead User Preferences Expressed Through Large Scale Social Media Data,” ASME Paper No. DETC2014-34767.
Tuarob, S. , and Tucker, C. S. , 2013, “ Fad or Here to Stay: Predicting Product Market Adoption and Longevity Using Large Scale, Social Media Data,” ASME Paper No. DETC2013-12661.
Lim, S. , and Tucker, C. S. , 2016, “ A Bayesian Sampling Method for Product Feature Extraction From Large-Scale Textual Data,” ASME J. Mech. Des., 138(6), p. 061403.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2014, “ An Ensemble Heterogeneous Classification Methodology for Discovering Health-Related Knowledge in Social Media Messages,” J. Biomed. Inf., 49, pp. 255–268.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2013, “ Discovering Health-Related Knowledge in Social Media Using Ensembles of Heterogeneous Features,” 22nd ACM International Conference on Information & Knowledge Management (CIKM '13), San Francisco, CA, Oct. 27–Nov. 1, pp. 1685–1690.
Lim, S. , Tucker, C. S. , and Kumara, S. , 2017, “ An Unsupervised Machine Learning Model for Discovering Latent Infectious Diseases Using Social Media Data,” J. Biomed. Inf., 66, pp. 82–94.
Sakaki, T. , Okazaki, M. , and Matsuo, Y. , 2010, “ Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors,” 19th International Conference on World Wide Web (WWW'10), Raleigh, NC, Apr. 26–30, pp. 851–860.
Caragea, C. , McNeese, N. , Jaiswal, A. , Traylor, G. , Kim, H. , Mitra, P. , Wu, D. , Tapia, A. , Giles, L. , Jansen, B. , and Yen, J. , 2011, “ Classifying Text Messages for the Haiti Earthquake,” Eighth International Conference on Information Systems for Crisis Response and Management (ISCRAM), Lisbon, Portugal, May 8–11.
Bollen, J. , Mao, H. , and Zeng, X. , 2011, “ Twitter Mood Predicts the Stock Market,” J. Comput. Sci., 2(1), pp. 1–8.
Zhang, X. , Fuehres, H. , and Gloor, P. , 2012, “ Predicting Asset Value Through Twitter Buzz,” Advances in Collective Intelligence 2011, Springer, Berlin, pp. 23–34.
Maynard, D. , and Greenwood, M. A. , 2014, “ Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis,” Ninth International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland, May 26–31, pp. 4238–4243.
Dey, L. , and Haque, S. , 2009, “ Studying the Effects of Noisy Text on Text Mining Applications,” Third Workshop on Analytics for Noisy Unstructured Text Data (AND), Barcelona, Spain, July 23–24, pp. 107–114.
Tsur, O. , Davidov, D. , and Rappoport, A. , 2010, “ ICWSM-A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews,” Fourth International Conference on Weblogs and Social Media (ICWSM), Washington, DC, May 23–26, pp. 162–169.
Davidov, D. , Tsur, O. , and Rappoport, A. , 2010, “ Semi-Supervised Recognition of Sarcastic Sentences in Twitter and Amazon,” 14th Conference on Computational Natural Language Learning (CoNLL), Uppsala, Sweden, July 15–16, pp. 107–116.
Navigli, R. , and Velardi, P. , 2005, “ Structural Semantic Interconnections: A Knowledge-Based Approach to Word Sense Disambiguation,” IEEE Trans. Pattern Anal. Mach. Intell., 27(7), pp. 1075–1086. [PubMed]
Muecke, D. C. , 1982, Irony and the Ironic, Methuen, London.
Gibbs, R. W. , 1986, “ On the Psycholinguistics of Sarcasm,” J. Exp. Psychol., Gen., 115(1), p. 3.
Gibbs, R. W. , and Colston, H. L. , 2007, Irony in Language and Thought: A Cognitive Science Reader, Lawrence Erlbaum, New York.
Archak, N. , Ghose, A. , and Ipeirotis, P. G. , 2011, “ Deriving the Pricing Power of Product Features by Mining Consumer Reviews,” Manage. Sci., 57(8), pp. 1485–1509.
Asur, S. , and Huberman, B. A. , 2010, “ Predicting the Future With Social Media,” IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Washington, DC, Aug. 31–Sept. 3, pp. 492–499.
Stone, T. , and Choi, S.-K. , 2014, “ Visualization Tool for Interpreting User Needs From User-Generated Content Via Text Mining and Classification,” ASME Paper No. DETC2014-34424.
Zhao, W. X. , Jiang, J. , Weng, J. , He, J. , Lim, E.-P. , Yan, H. , and Li, X. , 2011, “ Comparing Twitter and Traditional Media Using Topic Models,” Advances in Information Retrieval, Springer, Berlin, pp. 338–349.
Yajuan, D. , Zhimin, C. , Furu, W. , Ming, Z. , and Shum, H. Y. , 2012, “ Twitter Topic Summarization by Ranking Tweets Using Social Influence and Content Quality,” 24th International Conference on Computational Linguistics, Mumbai, India, Dec. 8–15, pp. 763–780.
Wang, Y. , Wu, H. , and Fang, H. , 2014, “ An Exploration of Tie-Breaking for Microblog Retrieval,” Advances in Information Retrieval, Springer, Cham, Switzerland, pp. 713–719.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2015, “ Modeling Individual-Level Infection Dynamics Using Social Network Information,” 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, Oct. 19–23, pp. 1501–1510.
Tuarob, S. , and Mitrpanont, J. L. , 2017, “ Automatic Discovery of Abusive Thai Language Usages in Social Networks,” International Conference on Asian Digital Libraries, Bangkok, Thailand, Nov. 13–15, pp. 267–278.
Thelwall, M. , Buckley, K. , and Paltoglou, G. , 2011, “ Sentiment in Twitter Events,” J. Am. Soc. Inf. Sci. Technol., 62(2), pp. 406–418.
Kucuktunc, O. , Cambazoglu, B. B. , Weber, I. , and Ferhatosmanoglu, H. , 2012, “ A Large-Scale Sentiment Analysis for Yahoo! Answers,” Fifth ACM International Conference on Web Search and Data Mining (WSDM '12), Seattle, WA, Feb. 8–12, pp. 633–642.
Weber, I. , Ukkonen, A. , and Gionis, A. , 2012, “ Answers, Not Links: Extracting Tips From Yahoo! Answers to Address How-to Web Queries,” Fifth ACM International Conference on Web Search and Data Mining (WSDM '12), Seattle, WA, Feb. 8–12, pp. 613–622.
Blei, D. M. , Ng, A. Y. , and Jordan, M. I. , 2003, “ Latent Dirichlet Allocation,” J. Mach. Learn. Res., 3, pp. 993–1022.
Paul, M. J. , and Dredze, M. , 2011, “ A Model for Mining Public Health Topics From Twitter,” Tech. Rep., 11, p. 16.
Paul, M. J. , and Dredze, M. , 2011, “ You are What You Tweet: Analyzing Twitter for Public Health,” Fifth International AAAI Conference on Weblogs and Social Media (ICWSM), Barcelona, Spain, July 17–21, pp. 265–272.
Ramage, D. , Dumais, S. T. , and Liebling, D. J. , 2010, “ Characterizing Microblogs With Topic Models,” Fourth International AAAI Conference on Weblogs and Social Media (ICWSM), Washington, DC, May 23–26.
Prier, K. W. , Smith, M. S. , Giraud-Carrier, C. , and Hanson, C. L. , 2011, “ Identifying Health-Related Topics on Twitter,” Social Computing, Behavioral-Cultural Modeling and Prediction, Springer, Berlin, pp. 18–25.
Jin, O. , Liu, N. N. , Zhao, K. , Yu, Y. , and Yang, Q. , 2011, “ Transferring Topical Knowledge From Auxiliary Long Texts for Short Text Clustering,” 20th ACM International Conference on Information and Knowledge Management (CIKM), Glasgow, Scotland, Oct. 24–28, pp. 775–784.
Tuarob, S. , and Tucker, C. S. , 2016, “ Automated Discovery of Product Preferences in Ubiquitous Social Media Data: A Case Study of Automobile Market,” Computer Science and Engineering Conference (ICSEC), Chiang Mai, Thailand, Dec. 14–17, pp. 1–6.
González-Ibáñez, R. , Muresan, S. , and Wacholder, N. , 2011, “ Identifying Sarcasm in Twitter: A Closer Look,” 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (HLT), Portland, OR, June 19–24, pp. 581–586.
Reyes, A. , Rosso, P. , and Veale, T. , 2013, “ A Multidimensional Approach for Detecting Irony in Twitter,” Lang. Resour. Eval., 47(1), pp. 239–268.
Ahlqvist, T. , 2008, Social Media Roadmaps: Exploring the Futures Triggered by Social Media, VTT, Helsinki, Finland.
Thelwall, M. , Buckley, K. , Paltoglou, G. , Cai, D. , and Kappas, A. , 2010, “ Sentiment in Short Strength Detection Informal Text,” J. Am. Soc. Inf. Sci. Technol., 61(12), pp. 2544–2558.
Guo, W., Li, H., Ji, H., and Diab, M. T., 2013, “Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media,” 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, Aug. 4–9, pp. 239–249.
Ramaswamy, S., 2018, “Comparing the Efficiency of Two Clustering Techniques: A Case-Study Using Tweets,” Masters of Science Program, University of Maryland, College Park, MD.
Fox, E., 2008, Emotion Science: Cognitive and Neuroscientific Approaches to Understanding Human Emotions, Palgrave Macmillan, Basingstoke, UK.
Cutting, D., Kupiec, J., Pedersen, J., and Sibun, P., 1992, “A Practical Part-of-Speech Tagger,” Third Conference on Applied Natural Language Processing (ANLC '92), Trento, Italy, Mar. 31–Apr. 3, pp. 133–140.
Özgür, A., Cetin, B., and Bingol, H., 2008, “Co-Occurrence Network of Reuters News,” Int. J. Mod. Phys. C, 19(5), pp. 689–702.
Jia, S., Yang, C., Liu, J., and Zhang, Z., 2012, “An Improved Information Filtering Technology,” Future Computing, Communication, Control and Management, Springer, Berlin, pp. 507–512.
Tuarob, S., Mitra, P., and Giles, C. L., 2012, “Improving Algorithm Search Using the Algorithm Co-Citation Network,” 12th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '12), Washington, DC, June 10–14, pp. 277–280.
Tuarob, S., Bhatia, S., Mitra, P., and Giles, C., 2013, “Automatic Detection of Pseudocodes in Scholarly Documents Using Machine Learning,” 12th International Conference on Document Analysis and Recognition (ICDAR), Washington, DC, Aug. 25–28, pp. 738–742.
Evans, D. A., Handerson, S. K., Monarch, I. A., Pereiro, J., Delon, L., and Hersh, W. R., 1998, Mapping Vocabularies Using Latent Semantics, Springer, Boston, MA.
Tuarob, S., Pouchard, L. C., and Giles, C. L., 2013, “Automatic Tag Recommendation for Metadata Annotation Using Probabilistic Topic Modeling,” 13th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '13), Indianapolis, IN, July 22–26, pp. 239–248.
Tuarob, S., Pouchard, L., Mitra, P., and Giles, C., 2015, “A Generalized Topic Modeling Approach for Automatic Document Annotation,” Int. J. Digital Libr., 16(2), pp. 111–128.
Cliche, M., 2014, “The Sarcasm Detector: Learning Sarcasm From Tweets!,” The Sarcasm Detector, accessed Feb. 19, 2017,
Liu, F., Liu, F., and Liu, Y., 2008, “Automatic Keyword Extraction for the Meeting Corpus Using Supervised Approach and Bigram Expansion,” Spoken Language Technology Workshop (SLT 2008), Goa, India, Dec. 15–19, pp. 181–184.
Martin, S., Brown, W. M., Klavans, R., and Boyack, K. W., 2011, “OpenOrd: An Open-Source Toolbox for Large Graph Layout,” SPIE Proc., 7868, p. 786806.
Manning, C. D., Raghavan, P., and Schütze, H., 2008, Introduction to Information Retrieval, Cambridge University Press, New York.
Thelwall, M., 2017, “The Heart and Soul of the Web? Sentiment Strength Detection in the Social Web With SentiStrength,” Cyberemotions, Springer, Cham, Switzerland, pp. 119–134.
Tuarob, S., Tucker, C. S., Kumara, S., Giles, C. L., Pincus, A. L., Conroy, D. E., and Ram, N., 2017, “How are You Feeling?: A Personalized Methodology for Predicting Mental States From Temporally Observable Physical and Behavioral Information,” J. Biomed. Inf., 68, pp. 1–19.
Tuarob, S., Pouchard, L. C., Noy, N., Horsburgh, J. S., and Palanisamy, G., 2012, “Onemercury: Towards Automatic Annotation of Environmental Science Metadata,” Second International Workshop on Linked Science, Boston, MA, Nov. 12.

## References

Tuarob, S. , and Tucker, C. S. , 2015, “ Automated Discovery of Lead Users and Latent Product Features by Mining Large Scale Social Media Networks,” ASME J. Mech. Des., 137(7), p. 071402.
Tuarob, S. , and Tucker, C. S. , 2015, “ Quantifying Product Favorability and Extracting Notable Product Features Using Large Scale Social Media Data,” ASME J. Comput. Inf. Sci. Eng., 15(3), p. 031003.
Tuarob, S. , and Tucker, C. S. , 2015, “ A Product Feature Inference Model for Mining Implicit Customer Preferences Within Large Scale Social Media Networks,” ASME Paper No. DETC2015-47225.
Tuarob, S. , and Tucker, C. S. , 2014, “ Discovering Next Generation Product Innovations by Identifying Lead User Preferences Expressed Through Large Scale Social Media Data,” ASME Paper No. DETC2014-34767.
Tuarob, S. , and Tucker, C. S. , 2013, “ Fad or Here to Stay: Predicting Product Market Adoption and Longevity Using Large Scale, Social Media Data,” ASME Paper No. DETC2013-12661.
Lim, S. , and Tucker, C. S. , 2016, “ A Bayesian Sampling Method for Product Feature Extraction From Large-Scale Textual Data,” ASME J. Mech. Des., 138(6), p. 061403.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2014, “ An Ensemble Heterogeneous Classification Methodology for Discovering Health-Related Knowledge in Social Media Messages,” J. Biomed. Inf., 49, pp. 255–268.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2013, “ Discovering Health-Related Knowledge in Social Media Using Ensembles of Heterogeneous Features,” 22nd ACM International Conference on Information & Knowledge Management (CIKM '13), San Francisco, CA, Oct. 27–Nov. 1, pp. 1685–1690.
Lim, S. , Tucker, C. S. , and Kumara, S. , 2017, “ An Unsupervised Machine Learning Model for Discovering Latent Infectious Diseases Using Social Media Data,” J. Biomed. Inf., 66, pp. 82–94.
Sakaki, T. , Okazaki, M. , and Matsuo, Y. , 2010, “ Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors,” 19th International Conference on World Wide Web (WWW'10), Raleigh, NC, Apr. 26–30, pp. 851–860.
Caragea, C. , McNeese, N. , Jaiswal, A. , Traylor, G. , Kim, H. , Mitra, P. , Wu, D. , Tapia, A. , Giles, L. , Jansen, B. , and Yen, J. , 2011, “ Classifying Text Messages for the Haiti Earthquake,” Eighth International Conference on Information Systems for Crisis Response and Management (ISCRAM), Lisbon, Portugal, May 8–11.
Bollen, J. , Mao, H. , and Zeng, X. , 2011, “ Twitter Mood Predicts the Stock Market,” J. Comput. Sci., 2(1), pp. 1–8.
Zhang, X. , Fuehres, H. , and Gloor, P. , 2012, “ Predicting Asset Value Through Twitter Buzz,” Advances in Collective Intelligence 2011, Springer, Berlin, pp. 23–34.
Maynard, D. , and Greenwood, M. A. , 2014, “ Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis,” Ninth International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland, May 26–31, pp. 4238–4243.
Dey, L. , and Haque, S. , 2009, “ Studying the Effects of Noisy Text on Text Mining Applications,” Third Workshop on Analytics for Noisy Unstructured Text Data (AND), Barcelona, Spain, July 23–24, pp. 107–114.
Tsur, O. , Davidov, D. , and Rappoport, A. , 2010, “ ICWSM-A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews,” Fourth International Conference on Weblogs and Social Media (ICWSM), Washington, DC, May 23–26, pp. 162–169.
Davidov, D. , Tsur, O. , and Rappoport, A. , 2010, “ Semi-Supervised Recognition of Sarcastic Sentences in Twitter and Amazon,” 14th Conference on Computational Natural Language Learning (CoNLL), Uppsala, Sweden, July 15–16, pp. 107–116.
Navigli, R. , and Velardi, P. , 2005, “ Structural Semantic Interconnections: A Knowledge-Based Approach to Word Sense Disambiguation,” IEEE Trans. Pattern Anal. Mach. Intell., 27(7), pp. 1075–1086. [PubMed]
Muecke, D. C. , 1982, Irony and the Ironic, Methuen, London.
Gibbs, R. W. , 1986, “ On the Psycholinguistics of Sarcasm,” J. Exp. Psychol., Gen., 115(1), p. 3.
Gibbs, R. W. , and Colston, H. L. , 2007, Irony in Language and Thought: A Cognitive Science Reader, Lawrence Erlbaum, New York.
Archak, N. , Ghose, A. , and Ipeirotis, P. G. , 2011, “ Deriving the Pricing Power of Product Features by Mining Consumer Reviews,” Manage. Sci., 57(8), pp. 1485–1509.
Asur, S. , and Huberman, B. A. , 2010, “ Predicting the Future With Social Media,” IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), Washington, DC, Aug. 31–Sept. 3, pp. 492–499.
Stone, T. , and Choi, S.-K. , 2014, “ Visualization Tool for Interpreting User Needs From User-Generated Content Via Text Mining and Classification,” ASME Paper No. DETC2014-34424.
Zhao, W. X. , Jiang, J. , Weng, J. , He, J. , Lim, E.-P. , Yan, H. , and Li, X. , 2011, “ Comparing Twitter and Traditional Media Using Topic Models,” Advances in Information Retrieval, Springer, Berlin, pp. 338–349.
Yajuan, D. , Zhimin, C. , Furu, W. , Ming, Z. , and Shum, H. Y. , 2012, “ Twitter Topic Summarization by Ranking Tweets Using Social Influence and Content Quality,” 24th International Conference on Computational Linguistics, Mumbai, India, Dec. 8–15, pp. 763–780.
Wang, Y. , Wu, H. , and Fang, H. , 2014, “ An Exploration of Tie-Breaking for Microblog Retrieval,” Advances in Information Retrieval, Springer, Cham, Switzerland, pp. 713–719.
Tuarob, S. , Tucker, C. S. , Salathe, M. , and Ram, N. , 2015, “ Modeling Individual-Level Infection Dynamics Using Social Network Information,” 24th ACM International on Conference on Information and Knowledge Management, Melbourne, Australia, Oct. 19–23, pp. 1501–1510.
Tuarob, S. , and Mitrpanont, J. L. , 2017, “ Automatic Discovery of Abusive Thai Language Usages in Social Networks,” International Conference on Asian Digital Libraries, Bangkok, Thailand, Nov. 13–15, pp. 267–278.
Thelwall, M. , Buckley, K. , and Paltoglou, G. , 2011, “ Sentiment in Twitter Events,” J. Am. Soc. Inf. Sci. Technol., 62(2), pp. 406–418.
Kucuktunc, O. , Cambazoglu, B. B. , Weber, I. , and Ferhatosmanoglu, H. , 2012, “ A Large-Scale Sentiment Analysis for Yahoo! Answers,” Fifth ACM International Conference on Web Search and Data Mining (WSDM '12), Seattle, WA, Feb. 8–12, pp. 633–642.
Weber, I. , Ukkonen, A. , and Gionis, A. , 2012, “ Answers, Not Links: Extracting Tips From Yahoo! Answers to Address How-to Web Queries,” Fifth ACM International Conference on Web Search and Data Mining (WSDM '12), Seattle, WA, Feb. 8–12, pp. 613–622.
Blei, D. M. , Ng, A. Y. , and Jordan, M. I. , 2003, “ Latent Dirichlet Allocation,” J. Mach. Learn. Res., 3, pp. 993–1022.
Paul, M. J. , and Dredze, M. , 2011, “ A Model for Mining Public Health Topics From Twitter,” Tech. Rep., 11, p. 16.
Paul, M. J. , and Dredze, M. , 2011, “ You are What You Tweet: Analyzing Twitter for Public Health,” Fifth International AAAI Conference on Weblogs and Social Media (ICWSM), Barcelona, Spain, July 17–21, pp. 265–272.
Ramage, D. , Dumais, S. T. , and Liebling, D. J. , 2010, “ Characterizing Microblogs With Topic Models,” Fourth International AAAI Conference on Weblogs and Social Media (ICWSM), Washington, DC, May 23–26.
Prier, K. W. , Smith, M. S. , Giraud-Carrier, C. , and Hanson, C. L. , 2011, “ Identifying Health-Related Topics on Twitter,” Social Computing, Behavioral-Cultural Modeling and Prediction, Springer, Berlin, pp. 18–25.
Jin, O. , Liu, N. N. , Zhao, K. , Yu, Y. , and Yang, Q. , 2011, “ Transferring Topical Knowledge From Auxiliary Long Texts for Short Text Clustering,” 20th ACM International Conference on Information and Knowledge Management (CIKM), Glasgow, Scotland, Oct. 24–28, pp. 775–784.
Tuarob, S. , and Tucker, C. S. , 2016, “ Automated Discovery of Product Preferences in Ubiquitous Social Media Data: A Case Study of Automobile Market,” Computer Science and Engineering Conference (ICSEC), Chiang Mai, Thailand, Dec. 14–17, pp. 1–6.
González-Ibáñez, R. , Muresan, S. , and Wacholder, N. , 2011, “ Identifying Sarcasm in Twitter: A Closer Look,” 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (HLT), Portland, OR, June 19–24, pp. 581–586.
Reyes, A. , Rosso, P. , and Veale, T. , 2013, “ A Multidimensional Approach for Detecting Irony in Twitter,” Lang. Resour. Eval., 47(1), pp. 239–268.
Ahlqvist, T. , 2008, Social Media Roadmaps: Exploring the Futures Triggered by Social Media, VTT, Helsinki, Finland.
Thelwall, M. , Buckley, K. , Paltoglou, G. , Cai, D. , and Kappas, A. , 2010, “ Sentiment in Short Strength Detection Informal Text,” J. Am. Soc. Inf. Sci. Technol., 61(12), pp. 2544–2558.
Guo, W. , Li, H. , Ji, H. , and Diab, M. T. , 2013, “ Linking Tweets to News: A Framework to Enrich Short Text Data in Social Media,” 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, Aug. 4–9, pp. 239–249.
Ramaswamy, S. , 2018, “ Comparing the Efficiency of Two Clustering Techniques: A Case-Study Using Tweets,” Masters of Science Program, University of Maryland, College Park, MD.
Fox, E. , 2008, Emotion Science: Cognitive and Neuroscientific Approaches to Understanding Human Emotions, Palgrave Macmillan, Basingstoke, UK.
Cutting, D. , Kupiec, J. , Pedersen, J. , and Sibun, P. , 1992, “ A Practical Part-of-Speech Tagger,” Third Conference on Applied Natural Language Processing (ANLC '92), Trento, Italy, Mar. 31–Apr. 3, pp. 133–140.
Özgür, A. , Cetin, B. , and Bingol, H. , 2008, “ Co-Occurrence Network of Reuters News,” Int. J. Mod. Phys. C, 19(5), pp. 689–702.
Jia, S. , Yang, C. , Liu, J. , and Zhang, Z. , 2012, “ An Improved Information Filtering Technology,” Future Computing, Communication, Control and Management, Springer, Berlin, pp. 507–512.
Tuarob, S. , Mitra, P. , and Giles, C. L. , 2012, “ Improving Algorithm Search Using the Algorithm Co-Citation Network,” 12th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '12), Washington, DC, June 10–14, pp. 277–280.
Tuarob, S. , Bhatia, S. , Mitra, P. , and Giles, C. , 2013, “ Automatic Detection of Pseudocodes in Scholarly Documents Using Machine Learning,” 12th International Conference on Document Analysis and Recognition (ICDAR), Washington, DC, Aug. 25–28, pp. 738–742.
Evans, D. A. , Handerson, S. K. , Monarch, I. A. , Pereiro, J. , Delon, L. , and Hersh, W. R. , 1998, Mapping Vocabularies Using Latent Semantics, Springer, Boston, MA.
Tuarob, S. , Pouchard, L. C. , and Giles, C. L. , 2013, “ Automatic Tag Recommendation for Metadata Annotation Using Probabilistic Topic Modeling,” 13th ACM/IEEE-CS Joint Conference on Digital Libraries, (JCDL'13), Indianapolis, IN, July 22–26, pp. 239–248.
Tuarob, S. , Pouchard, L. , Mitra, P. , and Giles, C. , 2015, “ A Generalized Topic Modeling Approach for Automatic Document Annotation,” Int. J. Digital Libr., 16(2), pp. 111–128.
Cliche, M. , 2014, “ The Sarcasm Detector: Learning Sarcasm From Tweets!,” The Sarcasm Detector, accessed Feb. 19, 2017,
Liu, F. , Liu, F. , and Liu, Y. , 2008, “ Automatic Keyword Extraction for the Meeting Corpus Using Supervised Approach and Bigram Expansion,” Spoken Language Technology Workshop (SLT 2008), Goa, India, Dec. 15–19, pp. 181–184.
Martin, S. , Brown, W. M. , Klavans, R. , and Boyack, K. W. , 2011, “ OpenOrd: An Open-Source Toolbox for Large Graph Layout,” SPIE Proc., 7868, p. 786806.
Manning, C. D. , Raghavan, P. , and Schütze, H. , 2008, Introduction to Information Retrieval, Cambridge University Press, New York.
Thelwall, M. , 2017, “ The Heart and Soul of the Web? Sentiment Strength Detection in the Social Web With SentiStrength,” Cyberemotions, Springer, Cham, Switzerland, pp. 119–134.
Tuarob, S. , Tucker, C. S. , Kumara, S. , Giles, C. L. , Pincus, A. L. , Conroy, D. E. , and Ram, N. , 2017, “ How are You Feeling?: A Personalized Methodology for Predicting Mental States From Temporally Observable Physical and Behavioral Information,” J. Biomed. Inf., 68, pp. 1–19.
Tuarob, S. , Pouchard, L. C. , Noy, N. , Horsburgh, J. S. , and Palanisamy, G. , 2012, “ Onemercury: Towards Automatic Annotation of Environmental Science Metadata,” Second International Workshop on Linked Science, Boston, MA, Nov. 12.

## Figures

Fig. 1

Overview of the proposed system

Fig. 2

Graphical visualization of the generated coword network

Fig. 3

Graphical example of the words co-occurring with the query compound

Fig. 4

Improvement of the sentiment classification results, grouped by precision, recall, and F-measure, for each sentiment class (Negative, Neutral, and Positive) when the sarcastic messages are translated with the coword method

Fig. 5

Comparison of F-measure evaluation of the class Negative, for each selected smartphone model

Fig. 6

Comparison of F-measure evaluation of the class Neutral, for each selected smartphone model

Fig. 7

Comparison of F-measure evaluation of the class Positive, for each selected smartphone model

## Tables

Algorithm 1 The feature extraction algorithm from a collection of documents
Algorithm 2 The coword generation algorithm from a collection of social media messages
Table 1 Node types and their descriptions
Table 2 (Left) Top ten nodes (words) with highest degree, classified by parts of speech. (Right) Bottom ten nodes (words) with lowest degree, classified by parts of speech.
Table 3 Statistics of the Twitter data used in this paper, classified by smartphone products. See Appendix for explanation of each statistic.
Table 4 Statistics of the coword network generated using the Twitter data associated with the 27 smartphone products. The number of nodes and average degrees are categorized by the types of nodes.
Table 5 Number of tweets, categorized by hand-labeled sentiment (Negative, Neutral, and Positive), associated with each selected smartphone model
Table 6 Comparison of the classification performance between the proposed coword-based method and the baseline (no translation process), for each sentiment class. P denotes precision, R denotes recall, and F denotes F-measure.
Table 7 Sample results of 10 sarcastic product-related tweets
