Celebal Technologies

Celebal Technologies interview question

Explain stemming, lemmatization, and tokenization.

Interview Answer

Anonymous

Sep 14, 2022

Tokenization: the process of splitting text into smaller units called tokens. Words, numbers, and punctuation marks can each be a token.

Stemming: the process of reducing a word to its stem, typically by stripping suffixes with fixed rules. The result is not always a real word (for example, "studies" may become "studi").

Lemmatization: the process of mapping a word to its dictionary form (its lemma), for example "studies" to "study". Unlike stemming, it relies on a vocabulary and often on the word's part of speech, so it is usually slower than stemming.
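To make the difference concrete, here is a minimal Python sketch of all three steps. It is a toy illustration, not a real NLP pipeline: `tokenize`, `toy_stem`, `toy_lemmatize`, the `SUFFIXES` list, and the `TOY_LEMMAS` table are hypothetical names made up for this example, and the suffix rules are far cruder than an actual stemmer such as Porter's, so the stems will differ from what a library produces.

```python
import re

def tokenize(text):
    # Each run of letters/digits is one token; each punctuation mark is its own token.
    return re.findall(r"\w+|[^\w\s]", text)

# Assumed toy suffix list, checked in order; real stemmers use many more rules.
SUFFIXES = ("ing", "ies", "ed", "es", "s")

def toy_stem(word):
    # Strip the first matching suffix, keeping at least 3 characters of the word.
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Assumed toy dictionary; a real lemmatizer consults a full vocabulary.
TOY_LEMMAS = {"studies": "study", "mice": "mouse", "was": "be"}

def toy_lemmatize(word):
    # Look the word up in the dictionary; fall back to the word itself.
    word = word.lower()
    return TOY_LEMMAS.get(word, word)

tokens = tokenize("She studies 2 mice.")
print(tokens)                               # ['She', 'studies', '2', 'mice', '.']
print([toy_stem(t) for t in tokens])        # ['she', 'stud', '2', 'mice', '.']
print([toy_lemmatize(t) for t in tokens])   # ['she', 'study', '2', 'mouse', '.']
```

Note how the stemmer turns "studies" into "stud", which is not a word, and cannot handle the irregular plural "mice" at all, while the dictionary lookup returns "study" and "mouse". That lookup is also why lemmatization tends to cost more than rule-based stemming.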