Research Papers

Automated Discovery of Product Feature Inferences Within Large-Scale Implicit Social Media Data

Author and Article Information
Suppawong Tuarob

Faculty of Information and
Communication Technology,
Mahidol University,
Salaya, Nakhon Pathom 73170, Thailand
e-mail: suppawong.tua@mahidol.edu

Sunghoon Lim

Industrial and Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: slim@psu.edu

Conrad S. Tucker

Engineering Design and Industrial and
Manufacturing Engineering,
The Pennsylvania State University,
University Park, PA 16802
e-mail: ctucker4@psu.edu

Contributed by the Computers and Information Division of ASME for publication in the JOURNAL OF COMPUTING AND INFORMATION SCIENCE IN ENGINEERING. Manuscript received August 13, 2017; final manuscript received February 17, 2018; published online May 2, 2018. Assoc. Editor: Rahul Rai.

J. Comput. Inf. Sci. Eng. 18(2), 021017 (May 02, 2018) (14 pages) Paper No: JCISE-17-1162; doi: 10.1115/1.4039432 History: Received August 13, 2017; Revised February 17, 2018

Recently, social media has emerged as a viable alternative source from which to extract large-scale, heterogeneous product features in a time- and cost-efficient manner. One of the challenges of utilizing social media data to inform product design decisions is the existence of implicit data such as sarcasm, which accounts for 22.75% of social media data and can potentially create bias in the predictive models that learn from such data sources. For example, if a customer says “I just love waiting all day while this song downloads,” an automated product feature extraction model may incorrectly associate the positive sentiment of “love” with the cell phone's ability to download. While traditional text mining techniques are designed to handle well-formed text in which product features are explicitly inferred from the combination of words, these tools fail to process social messages that include implicit product feature information. In this paper, we propose a method that enables designers to utilize implicit social media data by translating each implicit message into its equivalent explicit form, using the word co-occurrence network. A case study of Twitter messages that discuss smartphone features is used to validate the proposed method. The results from the experiment not only show that the proposed method improves the interpretability of implicit messages, but also shed light on potential applications in the design domains where this work could be extended.
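
As a rough illustration of the general idea of a word co-occurrence network, the sketch below counts how often word pairs appear in the same message and then looks up the words most strongly linked to a sarcastic query. It is not the paper's algorithm: the toy corpus, the whitespace tokenization, and the function names `build_cooccurrence` and `related_words` are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(messages):
    """Count how often each unordered word pair appears in the same message."""
    graph = defaultdict(Counter)
    for text in messages:
        # Naive whitespace tokenization (an assumption for this sketch).
        words = sorted(set(text.lower().split()))
        for a, b in combinations(words, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def related_words(graph, query, k=3):
    """Return up to k words that co-occur most often with the query words."""
    query_words = set(query.lower().split())
    scores = Counter()
    for word in query_words:
        for neighbor, count in graph.get(word, {}).items():
            if neighbor not in query_words:
                scores[neighbor] += count
    # Highest count first; alphabetical order breaks ties deterministically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Hypothetical toy corpus, already reduced to keywords.
messages = [
    "download slow annoying",
    "download slow waiting",
    "waiting slow battery",
    "camera great love",
]

graph = build_cooccurrence(messages)
result = related_words(graph, "love waiting download", k=2)
print(result)
```

In this toy corpus the query's literal wording contains "love", yet the words most strongly linked to it are negative ones such as "slow", which is the kind of signal a translation step could use to rewrite an implicit message in explicit form.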

Copyright © 2018 by ASME

Figures

Fig. 1: Overview of the proposed system

Fig. 2: Graphical visualization of the generated coword network

Fig. 3: Graphical example of the words co-occurring with the query compound

Fig. 4: Improvement of the sentiment classification results, grouped by precision, recall, and F-measure, for each sentiment class (Negative, Neutral, and Positive) when the sarcastic messages are translated with the coword method

Fig. 5: Comparison of F-measure evaluation of the class Negative, for each selected smartphone model

Fig. 6: Comparison of F-measure evaluation of the class Neutral, for each selected smartphone model

Fig. 7: Comparison of F-measure evaluation of the class Positive, for each selected smartphone model
