Bohdan Baliuk - 2021-05-20

I want to generate n-grams from the WoS RSCI (Russian Science Citation Index) database in CiteSpace. The records contain abstracts in both English and Russian, but the n-grams generated from the bibliographic records are in English only.

I followed these steps: 1) configured the n-gram types in 'Edit properties' (the maximum and minimum number of words in noun phrases), 2) selected 'Noun phrases' in the 'Text processing' field on the main page, 3) selected 'Create POS-tags' in the pop-up window, 4) clicked 'Go' and processed the dataset, 5) chose 'Create List terms by tf*idf' in the Text tab on the main page.

By the way, I also tried to generate n-grams from the full text, but only unigrams are available as output. They include both English and Russian unigrams, which suggests that the n-gram generation logic is shared across languages. However, I want to choose the n-gram types that matter for my research: not only unigrams, but also bigrams and trigrams.
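To make the goal concrete, here is a minimal Python sketch, outside CiteSpace, of the kind of output I am looking for. It is not how CiteSpace works internally: it uses plain word n-grams rather than POS-tagged noun phrases, the names `tokenize`, `ngrams`, `count_ngrams` and the `russian_only` flag are hypothetical, and treating "every word is in the Cyrillic Unicode block" as "Russian" is a rough assumption.

```python
import re
from collections import Counter

# Word tokens: runs of Latin or Cyrillic letters (assumption: good enough for abstracts).
TOKEN_RE = re.compile(r"[A-Za-z\u0400-\u04FF]+")
# A word counts as "Russian" here if it consists only of Cyrillic-block letters.
CYRILLIC_RE = re.compile(r"^[\u0400-\u04FF]+$")


def tokenize(text):
    """Lowercase the text and split it into Latin/Cyrillic word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def ngrams(tokens, n):
    """Return all contiguous n-word sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def is_russian(gram):
    """True if every word in the n-gram is written in Cyrillic letters."""
    return all(CYRILLIC_RE.match(word) for word in gram.split())


def count_ngrams(texts, min_n=2, max_n=3, russian_only=False):
    """Count n-grams of length min_n..max_n over a list of abstracts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for n in range(min_n, max_n + 1):
            for gram in ngrams(tokens, n):
                if russian_only and not is_russian(gram):
                    continue  # skip English and mixed-language n-grams
                counts[gram] += 1
    return counts


abstracts = [
    "Российский индекс научного цитирования",
    "Russian Science Citation Index",
]
russian_counts = count_ngrams(abstracts, min_n=2, max_n=3, russian_only=True)
all_counts = count_ngrams(abstracts, min_n=2, max_n=3)
```

With `russian_only=True` only Cyrillic bigrams and trigrams are kept; with the default, n-grams in both languages are counted together.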

So is there a way to get n-grams in Russian only, or at least in both languages rather than only in English?