
1 “Trends” Tool

 (require pydrnlp/trends) package: pydrnlp

Documentation forthcoming.

1.1 Implementation Details

1.1.1 Python Layer

 import pydrnlp.trends package: pydrnlp
Core engine for the “Trends” tool.

procedure

(revision)  python-revision-value/c

 = (let* ((this_module_revision 10)) (list this_module_revision (pydrnlp.language.revision)))
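Read as Python, the value above is a two-element list: this module's own revision number followed by the revision of pydrnlp.language. A minimal sketch, assuming nothing about what pydrnlp.language.revision() actually returns (language_revision_stub and its return value are hypothetical placeholders, not part of pydrnlp):

```python
def language_revision_stub():
    # Hypothetical placeholder for pydrnlp.language.revision();
    # the real return value is not documented here.
    return 0


def revision():
    # Mirrors the let* form: bind the module's own revision,
    # then pair it with the language layer's revision.
    this_module_revision = 10
    return [this_module_revision, language_revision_stub()]
```

Because the language layer's revision is nested in the result, a change there would presumably change this value as well.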

Python method

def analyze_all(jsexpr)

Tokenizes jsexpr. TODO: document this.
Tokens which do not satisfy should_use_token with this language are discarded.
The purpose of the “text” field is to provide an example of an actual use of the word, as the lemma FIXME, but some words (e.g. “DuFay”) shouldn’t be. (Also, some lemmas are strange, like “whatev”.)

Python method

def should_use_token(token, *, lang)

Recognizes tokens which should be included in counting with respect to the given spacy.language.Language instance.
Some kinds of tokens which are excluded:
  • punctuation;
  • whitespace;
  • stop words; and
  • tokens which have a “boring” part-of-speech tag.

Part-of-speech tags that are considered “boring” notably include "NUM" (numeral) and "SYM" (symbol). Currently, all part-of-speech tags are considered “boring” except for "NOUN" and "PROPN" (i.e. common and proper nouns, respectively).
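
The exclusion rules above can be sketched as a small predicate. This is an illustration under stated assumptions, not the actual pydrnlp implementation: it relies on the spaCy Token attributes is_punct, is_space, is_stop and pos_, and it uses a stand-in FakeToken class (hypothetical) so that it runs without spaCy installed. The names INTERESTING_POS and should_use_token_sketch are also hypothetical, and the sketch accepts lang only to mirror the documented signature.

```python
from dataclasses import dataclass

# Hypothetical constant: the only tags not treated as "boring".
INTERESTING_POS = frozenset({"NOUN", "PROPN"})


@dataclass(frozen=True)
class FakeToken:
    """Stand-in exposing the spaCy Token attributes used below."""
    text: str
    pos_: str
    is_punct: bool = False
    is_space: bool = False
    is_stop: bool = False


def should_use_token_sketch(token, *, lang=None):
    # `lang` is accepted for signature parity only; how the real
    # function consults the Language instance is not documented.
    if token.is_punct or token.is_space or token.is_stop:
        return False
    # Everything except common and proper nouns is "boring".
    return token.pos_ in INTERESTING_POS


tokens = [
    FakeToken("DuFay", "PROPN"),
    FakeToken("music", "NOUN"),
    FakeToken("42", "NUM"),
    FakeToken(",", "PUNCT", is_punct=True),
    FakeToken("the", "DET", is_stop=True),
]
kept = [t.text for t in tokens if should_use_token_sketch(t)]
```

With these sample tokens, only the proper noun and the common noun survive the filter; the numeral, the comma and the stop word are dropped.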