Semantic Fingerprint API and Natural Language Processing API


Fingerprint

Use the Semantic Fingerprint API to create semantic fingerprints from any kind of text in four languages (English, Spanish, French, German). Semantic fingerprints are text embeddings that can be used to train language models and to perform text operations such as search and classification efficiently.

Returns the semantic fingerprint of the given text.

This endpoint converts the input text into a semantic fingerprint. First, each word is converted into its fingerprint representation. Then these word representations are aggregated and sparsified to create the text fingerprint.
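The aggregate-and-sparsify step described above can be illustrated with a toy sketch. It is not the service's algorithm: it assumes that a word fingerprint is a set of active positions, that aggregation counts how often each position is active, and that sparsification keeps the most frequent positions up to a density budget (compare the max_density parameter below). The function name and the sample data are hypothetical.

```python
from collections import Counter

def aggregate_fingerprints(word_fps, size, max_density=0.02):
    """Toy aggregation: count active positions across word fingerprints,
    then keep only the most frequent ones (assumed behaviour, not the API's)."""
    counts = Counter(pos for fp in word_fps for pos in fp)
    max_bits = int(size * max_density)  # number of positions allowed to stay active
    # Rank by descending count, then by position so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return sorted(pos for pos, _ in ranked[:max_bits])

# Hypothetical word fingerprints over a 100-position space.
word_fps = [{1, 5, 9}, {5, 9, 20}, {9, 33}]
text_fp = aggregate_fingerprints(word_fps, size=100, max_density=0.02)
print(text_fp)
```

With a density of 0.02 over 100 positions, only the two positions shared by the most words remain active in the aggregated fingerprint.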

query Parameters

retina_name required	string (Retina Name) The name of the retina
use_phrase_fingerprints	boolean (Use Phrase Fingerprints) Default: true Whether to tokenize and fingerprint known phrases. If false, fingerprints of the single tokens will be used.
pos_filter	boolean (Pos Filter) Default: true Removes terms whose part of speech (POS) is not noun, verb, or adjective
stop_filter	boolean (Stop Filter) Default: true Removes stopwords if enabled
max_df	number (Max Df) Default: 0.1 Ignores terms whose document frequency exceeds the specified threshold
max_terms	integer (Max Terms) >= 0 Default: 50 Uses only the max_terms most useful terms
pos_weighting	boolean (Pos Weighting) Default: true Weights nouns and proper nouns higher than adjectives and verbs if enabled. Other POS types are ignored
tfidf_weighting	boolean (Tfidf Weighting) Default: true Weights terms by tf-idf if enabled
inverse_position_weighting	boolean (Inverse Position Weighting) Default: false Weights positions by inverse frequency
max_density	number (Max Density) [ 0 .. 1 ] Default: 0.02 Max density of the aggregated fingerprint

Request Body schema: application/json
required

text required	string (Text)

Responses

Request samples

Payload

Content type

application/json

{"text": "string"
}

Response samples

200
422

Content type

application/json

{
  "fp_type": "binary",
  "representation": [null]
}
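A minimal sketch of how a client might assemble this request with the Python standard library. This page does not state the host, path, or HTTP method, so BASE_URL, FINGERPRINT_PATH, the POST method, the lowercase encoding of booleans, and the retina name "en_general" are all assumptions made for illustration. The request is only built, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical placeholder host
FINGERPRINT_PATH = "/fingerprint"     # hypothetical placeholder path

def build_fingerprint_request(text, retina_name, **options):
    """Build (but do not send) a request carrying the documented parameters."""
    params = {"retina_name": retina_name}
    for name, value in options.items():
        # Assumption: booleans travel as lowercase "true"/"false" in the query string.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    url = f"{BASE_URL}{FINGERPRINT_PATH}?{urlencode(params)}"
    body = json.dumps({"text": text}).encode("utf-8")
    # Assumption: the endpoint accepts POST, since it requires a JSON body.
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})

req = build_fingerprint_request(
    "Semantic fingerprints encode meaning.",
    retina_name="en_general",  # hypothetical retina name
    max_terms=20,
    pos_filter=False,
)
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it once the placeholders are replaced with the real values from the OpenAPI specification.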

Texts

Use the Natural Language Processing API to extract keywords, detect languages, compare documents, generate labels, or segment long pieces of text.

Supported Languages

This endpoint returns the list of languages for which the service supports semantic operations. The two-letter language codes can be used as input for other endpoints.

Responses

Response samples

200

Content type

application/json

{
  "supported_languages": ["en", "de"]
}
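One way a client could use this list, sketched under assumptions: the helper below (a hypothetical name) adds the optional language field to a request body only when the code appears in the response, and otherwise omits it so that the service infers the language. The response dictionary mirrors the sample above rather than a live call.

```python
def language_field(code, languages_response):
    """Return {"language": code} if the code is supported, else an empty dict."""
    supported = languages_response.get("supported_languages", [])
    return {"language": code} if code in supported else {}

# Shape copied from the response sample above, not fetched from the service.
languages_response = {"supported_languages": ["en", "de"]}

body = {"text": "Some English text.", **language_field("en", languages_response)}
fallback = {"text": "Some other text.", **language_field("xx", languages_response)}
print(body)
print(fallback)
```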

Keywords Extraction

This endpoint extracts the semantically most relevant words from a given text. The number of keywords (the limit parameter) is best chosen in proportion to the length of the text; if limit is left at its default of 0, the service determines an appropriate number automatically.

query Parameters

limit	integer (Limit) >= 0 Default: 0 Maximum number of keywords to return. If unspecified or equal to zero, an appropriate number is determined automatically.

Request Body schema: application/json
required

text required	string (Text) .\S. The input text. Cannot be empty.
language	string (Language) ^([a-z][a-z])?$ Language of the text, e.g. 'en' (ISO 639-1). If not provided, the service will try to infer it from the text.

Responses

Request samples

Payload

Content type

application/json

{"text": "Cortical.io’s mission is to deliver AI-based solutions that streamline the extraction, classification, review and analysis of information hidden in unstructured text while providing short time to value. We accomplish this through our novel, meaning-based approach to natural language understanding that solves many critical challenges of text processing in a business context. With more than 10 years expertise in implementing intelligent document processing solutions in the enterprise, Cortical.io has demonstrated its ability to solve the challenges of language ambiguity and variability across many use cases and verticals and is trusted by major companies across the globe.",
"language": "en"
}

Response samples

200
400
422

Content type

application/json

{
  "keywords": [
    {
      "word": "string",
      "document_frequency": 0,
      "pos_tags": ["NOUN"],
      "score": 0
    }
  ],
  "language": "en"
}
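A short sketch of post-processing a response with this shape. The sample dictionary is hand-written for illustration (its words, scores, and document frequencies are invented, and the VERB and ADJ tags are assumed to exist alongside NOUN); the sketch also assumes that a higher score means a more relevant keyword. The helper name is hypothetical.

```python
def top_keywords(response, min_score=0.0, pos=None):
    """Return keyword strings sorted by descending score, optionally filtered."""
    selected = [
        kw for kw in response.get("keywords", [])
        if kw["score"] >= min_score and (pos is None or pos in kw["pos_tags"])
    ]
    return [kw["word"] for kw in sorted(selected, key=lambda kw: kw["score"], reverse=True)]

# Invented values that follow the documented response shape.
response = {
    "keywords": [
        {"word": "extraction", "document_frequency": 0.01, "pos_tags": ["NOUN"], "score": 0.8},
        {"word": "streamline", "document_frequency": 0.002, "pos_tags": ["VERB"], "score": 0.9},
        {"word": "novel", "document_frequency": 0.02, "pos_tags": ["ADJ", "NOUN"], "score": 0.4},
    ],
    "language": "en",
}

print(top_keywords(response))
print(top_keywords(response, min_score=0.5, pos="NOUN"))
```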

Semantic Similarity

This endpoint computes the semantic similarity between two texts and returns a value in the range [0, 1].

Request Body schema: application/json
required

Array

text required	string (Text) .\S. The input text. Cannot be empty.
language	string (Language) ^([a-z][a-z])?$ Language of the text, e.g. 'en' (ISO 639-1). If not provided, the service will try to infer it from the text.

Responses

Request samples

Payload

Content type

application/json

[
  {"text": "organ", "language": "en"},
  {"text": "piano", "language": "en"}
]

Response samples

200
400
422

Content type

application/json

{
  "similarity": 0.7,
  "languages": ["en", "en"]
}
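The request body is an array of two items, each with a non-empty text and an optional two-letter language code. The following sketch builds such a payload and checks it on the client side against the documented constraints before sending. The function name is hypothetical, and treating whitespace-only text as empty is an assumption.

```python
import re

# Language pattern as documented: empty, or exactly two lowercase letters.
LANGUAGE_PATTERN = re.compile(r"([a-z][a-z])?")

def similarity_payload(first, second, first_lang="", second_lang=""):
    """Build the two-item array expected by the Semantic Similarity endpoint."""
    items = []
    for text, lang in ((first, first_lang), (second, second_lang)):
        if not text.strip():
            raise ValueError("text cannot be empty")
        if not LANGUAGE_PATTERN.fullmatch(lang):
            raise ValueError(f"invalid language code: {lang!r}")
        item = {"text": text}
        if lang:
            item["language"] = lang  # omitted when empty so the service infers it
        items.append(item)
    return items

payload = similarity_payload("organ", "piano", first_lang="en")
print(payload)
```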

Detect Language

This endpoint detects the language of a given text by using a fastText language detection model. The model is not precise on very short texts (a few words are needed). If the input text contains multiple languages, the endpoint will only return the language with the highest confidence level.

Request Body schema: application/json
required

text required	string (Text)

Responses

Request samples

Payload

Content type

application/json

{"text": "What language is this?"
}

Response samples

200
422

Content type

application/json

{"language": "en"
}

Similar Terms

This endpoint identifies the most similar words for a submitted text. The text-fingerprint is compared to all known word-fingerprints and the words with the highest similarities are returned.
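The comparison described above can be pictured with a toy sketch. It assumes that fingerprints are sets of active positions and that similarity is the number of shared positions; the service's actual similarity measure is not stated on this page. The function name, vocabulary, and fingerprints are hypothetical.

```python
def similar_terms(text_fp, word_fps, limit=10):
    """Rank known words by overlap with the text fingerprint (toy measure)."""
    scored = [(len(text_fp & fp), word) for word, fp in word_fps.items()]
    # Highest overlap first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for overlap, word in scored[:limit] if overlap > 0]

# Hypothetical word fingerprints and text fingerprint.
word_fps = {
    "organ": {1, 2, 3, 4},
    "piano": {2, 3, 4, 5},
    "tiger": {40, 41},
}
text_fp = {2, 3, 4, 5, 6}

print(similar_terms(text_fp, word_fps, limit=2))
```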

query Parameters

limit	integer (Limit) [ 1 .. 1000 ] Default: 10 Number of terms to return
nouns_only	boolean (Nouns Only) Default: true Whether to return only nouns
context	boolean (Context) Default: false Whether to use context removal in finding the labels

Request Body schema: application/json
required

text required	string (Text) .\S. The input text. Cannot be empty.
language	string (Language) ^([a-z][a-z])?$ Language of the text, e.g. 'en' (ISO 639-1). If not provided, the service will try to infer it from the text.

Responses

Request samples

Payload

Content type

application/json

{"text": "Cortical.io’s mission is to deliver AI-based solutions that streamline the extraction, classification, review and analysis of information hidden in unstructured text while providing short time to value. We accomplish this through our novel, meaning-based approach to natural language understanding that solves many critical challenges of text processing in a business context. With more than 10 years expertise in implementing intelligent document processing solutions in the enterprise, Cortical.io has demonstrated its ability to solve the challenges of language ambiguity and variability across many use cases and verticals and is trusted by major companies across the globe.",
"language": "en"
}

Response samples

200
400
422

Content type

application/json

{
  "labels": [
    {
      "word": "string",
      "document_frequency": 0,
      "pos_tags": ["NOUN"]
    }
  ],
  "language": "en"
}

Text Segmentation

This endpoint breaks down a given text into smaller segments, referred to as 'topical paragraphs'. These 'topical paragraphs' are computed by combining adjacent segments until a shift in linguistic cues or topics is identified.
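The idea of merging adjacent segments until the topic shifts can be sketched as follows. This is a toy illustration, not the service's method: it swaps in Jaccard similarity over lowercase word sets as the shift signal, and the threshold of 0.2, the function names, and the sample sentences are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def segment(sentences, threshold=0.2):
    """Merge adjacent sentences until word overlap drops below the threshold."""
    segments, current, current_words = [], [], set()
    for sentence in sentences:
        words = set(sentence.lower().split())
        if current and jaccard(current_words, words) < threshold:
            # Topic shift detected: close the running segment.
            segments.append(" ".join(current))
            current, current_words = [], set()
        current.append(sentence)
        current_words |= words
    if current:
        segments.append(" ".join(current))
    return segments

sentences = [
    "tigers eat deer",
    "tigers eat boar",
    "the word tigre is old french",
    "old french tigre comes from latin",
]
for part in segment(sentences):
    print(part)
```

The first two sentences share vocabulary and merge into one segment; the third shares no words with them, so it starts a new segment that the fourth then joins.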

query Parameters

max_results	integer (Max Results) Default: -1 Number of slices to return. Returns all segments if max_results < 0

Request Body schema: application/json
required

text required	string (Text) .\S. The input text. Cannot be empty.
language	string (Language) ^([a-z][a-z])?$ Language of the text, e.g. 'en' (ISO 639-1). If not provided, the service will try to infer it from the text.

Responses

Request samples

Payload

Content type

application/json

{"text": "Tigers mostly feed on large and medium-sized mammals, particularly ungulates weighing 60–250 kg (130–550 lb). Range-wide, the most selected prey are sambar deer, Manchurian wapiti, barasingha and wild boar. Tigers are capable of taking down larger prey like adult gaur and wild water buffalo, but opportunistically eat much smaller prey, such as monkeys, peafowl and other ground-based birds, hares, porcupines and fish. They also prey on other predators, including dogs, leopards, bears, snakes and crocodiles. Tiger attacks on adult Asian elephants and Indian rhinoceros have also been reported. The Middle English tigre and Old English tigras derive from Old French tigre, from Latin tigris. This was a borrowing of Classical Greek 'tigris', a foreign borrowing of unknown origin meaning 'tiger' and the river Tigris. The origin may have been the Persian word tigra ('pointed or sharp') and the Avestan word tigrhi ('arrow'), perhaps referring to the speed of the tiger's leap, although these words are not known to have any meanings associated with tigers. The origin may have been the Persian word tigra ('pointed or sharp') and the Avestan word tigrhi ('arrow'), perhaps referring to the speed of the tiger's leap, although these words are not known to have any meanings associated with tigers. There are three other colour variants – white, golden and nearly stripeless snow white – that are now virtually non-existent in the wild due to the reduction of wild tiger populations, but continue in captive populations. The white tiger has white fur and sepia-brown stripes. The golden tiger has a pale golden pelage with a blond tone and reddish-brown stripes. The snow white tiger is a morph with extremely faint stripes and a pale reddish-brown ringed tail. Both snow white and golden tigers are homozygous for CORIN gene mutations.",
"language": "en"
}

Response samples

200
400
422

Content type

application/json

{
  "segments": ["string"],
  "language": "en"
}
