2022 Sep 1;3(9):100584. doi: 10.1016/j.patter.2022.100584

© 2022 The Author(s)

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Linguistic analysis on OTL marketing

(A) The average length of OTL marketing abstracts and inventors' abstracts over time.

(B) The average fraction of adjectives in titles over time.

(C) The correlation between the occurrence of each adjective in the marketing abstract and net income rank. Only adjectives with p < 0.05 are shown. Font size indicates the frequency of the word. Text color indicates the correlation coefficient with net income rank after controlling for categories: red indicates negative correlation, and blue indicates positive correlation.

(D) Machine-learning classifiers that take the marketing abstracts as input and predict whether the net income of an invention is above the median net income of inventions from the same disclosure year. TF-IDF: a classifier using term frequency-inverse document frequency features. BERT: a state-of-the-art text classifier that uses deep learning to provide contextual features for each word. Category baseline: a classifier using only the category tags of each invention as input. Shown are receiver operating characteristic (ROC) curves on the hold-out test set. The TF-IDF classifier achieves an area under the ROC curve (AUROC) of 0.71 on the hold-out test set.
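
The prediction target and the evaluation metric in panel (D) can be illustrated with a minimal sketch. It is not the authors' code. The function names `above_median_labels` and `auroc`, the toy records, and the toy scores are all hypothetical. The sketch assumes that "above the median" means strictly greater than the median of the same disclosure year, and that ties between scores count as half a win.

```python
from collections import defaultdict
from statistics import median


def above_median_labels(records):
    """Label each (disclosure_year, net_income) record 1 if its income is
    strictly above the median income of its own disclosure year, else 0.
    (Strict inequality is an assumption; the caption does not specify it.)"""
    by_year = defaultdict(list)
    for year, income in records:
        by_year[year].append(income)
    medians = {year: median(values) for year, values in by_year.items()}
    return [1 if income > medians[year] else 0 for year, income in records]


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (disclosure_year, net_income) per invention.
records = [
    (2000, 10.0), (2000, 50.0), (2000, 30.0),
    (2001, 5.0), (2001, 8.0), (2001, 100.0),
]
labels = above_median_labels(records)

# Hypothetical classifier scores, one per invention, in the same order.
scores = [0.2, 0.9, 0.6, 0.1, 0.7, 0.4]
print(labels, auroc(labels, scores))
```

Comparing within each disclosure year keeps the label from simply tracking how long an invention has had to accumulate income. An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking.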