Machine learning (ML)
|
A branch of AI in which algorithms learn patterns from data and improve at a task without being explicitly programmed for it |
Deep learning (DL)
|
A subset of ML that uses neural networks with many layers to learn increasingly abstract representations directly from data |
Convolutional neural networks (CNNs)
|
A type of deep neural network, primarily used to analyze visual imagery |
Natural language processing (NLP)
|
A branch of AI focused on enabling computers to understand, interpret, and manipulate human language |
Transfer learning
|
A method in ML where a model developed for a task is reused as the starting point for a model on a second task |
Reinforcement learning
|
An area of ML concerned with how agents ought to take actions in an environment to maximize cumulative reward |
Vision-language models
|
Models that understand and generate content combining visual and textual data |
Bag of words (BoW)
|
A simple text representation model in NLP. It treats text as an unordered collection of words, keeping how often each word occurs but ignoring word order and grammar |
Corpus
|
A large collection of text documents or spoken language data used for training and testing NLP models |
Embedding
|
Dense vector representations of words in a continuous space, in which semantically related words lie close together |
Feature engineering
|
Selecting and transforming linguistic features from raw text to prepare inputs for NLP models |
Hidden Markov model (HMM)
|
A statistical model of sequence data in which the observations are generated by unobserved (hidden) states; useful in tasks like speech recognition and part-of-speech tagging |
Information retrieval
|
The process in NLP of retrieving relevant information from large text collections based on user queries |
Jaccard similarity
|
A measure of similarity between two sets (for example, the words of two documents), computed as the size of their intersection divided by the size of their union |
Keyword extraction
|
Automatically identifying and extracting the most important words or phrases from a document, for example to support summarization |
Lemmatization
|
Reducing inflected word forms (such as plurals or verb tenses) to their base dictionary form, or lemma |
Machine translation
|
An NLP task of automatically translating text from one language to another |
Named entity recognition (NER)
|
Identifying and classifying named entities (people, places, organizations) in text |
Ontology
|
A formal representation of knowledge in a domain that defines concepts and entities and the relationships among them; used in NLP to give systems structured background knowledge |
Question answering
|
An NLP task of generating accurate answers to questions posed in natural language |
Recurrence
|
The use of recurrent neural networks (RNNs), which carry a hidden state from one step to the next, to process sequential data in NLP tasks such as language modeling |
Sentiment analysis
|
Determining the emotional tone or sentiment of text, typically as positive, negative, or neutral |
Tokenization
|
Breaking text into individual units, or tokens (such as words, subwords, or n-grams), for analysis in NLP tasks |
Unsupervised learning
|
Training models on data without explicit labels, so that they discover patterns or structure in the data on their own |
Vector space model (VSM)
|
A mathematical model that represents texts as numerical vectors so that their similarity can be computed, for example with cosine similarity |
Word sense disambiguation (WSD)
|
Identifying the correct meaning of a word in context, especially for words with multiple meanings |
Zero-shot learning
|
Having a model perform tasks, or recognize classes, for which it saw no labeled examples during training; used in various NLP applications |
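
Three of the terms above (tokenization, bag of words, and Jaccard similarity) can be illustrated together in a few lines. The following is a minimal sketch, not a reference implementation: the function names, the regular-expression tokenizer, the two sample sentences, and the convention that two empty texts have similarity 1.0 are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(text):
    """Count how often each token occurs, discarding word order."""
    return Counter(tokenize(text))


def jaccard_similarity(text_a, text_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokenize(text_a)), set(tokenize(text_b))
    if not set_a and not set_b:
        return 1.0  # assumption for this sketch: two empty texts count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical sample documents, chosen only for illustration.
doc_a = "The cat sat on the mat"
doc_b = "The dog sat on the log"

print(tokenize(doc_a))        # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(bag_of_words(doc_a))    # 'the' is counted twice, every other word once
print(jaccard_similarity(doc_a, doc_b))  # 3 shared words out of 7 distinct words, about 0.43
```

Production systems usually rely on a library tokenizer and vectorizer instead of a hand-written regular expression, but the underlying counts and set operations are the same.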