framenet_tools.span_identification package

Submodules

framenet_tools.span_identification.spanidentifier module

class framenet_tools.span_identification.spanidentifier.SpanIdentifier(cM: framenet_tools.config.ConfigManager)

Bases: object

The Span Identifier for predicting possible role spans of a given sentence

Includes multiple ways of predicting:

- static
- using allennlp
- using a BiLSTM

generate_BIO_tags(annotation: framenet_tools.data_handler.annotation.Annotation)

Generates a list of (B)egin-, (I)nside-, (O)utside- tags for a given annotation.

Parameters:	annotation – The annotation to convert
Returns:	A list of BIO-tags
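
As an illustration of the BIO scheme, the sketch below tags a sentence from a list of role spans. It does not use the package's Annotation class: the helper name `bio_tags_from_spans`, the inclusive `(start, end)` word-index span format, and the plain "B"/"I"/"O" labels are assumptions made for this example.

```python
from typing import List, Tuple


def bio_tags_from_spans(num_words: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Hypothetical helper: tag each word as B, I or O.

    Spans are assumed to be (start, end) word indices, both inclusive,
    and are assumed not to overlap.
    """
    tags = ["O"] * num_words          # every word starts outside of any span
    for start, end in spans:
        tags[start] = "B"             # first word of the span
        for i in range(start + 1, end + 1):
            tags[i] = "I"             # remaining words of the span
    return tags


sentence = ["The", "cat", "sat", "on", "the", "mat"]
print(bio_tags_from_spans(len(sentence), [(0, 1), (3, 5)]))
# ['B', 'I', 'O', 'B', 'I', 'I']
```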

get_dataset(annotations: List[List[framenet_tools.data_handler.annotation.Annotation]])

Loads the dataset and combines the necessary data.

Parameters:	annotations – A list of all annotations containing all sentences
Returns:	xs – A list of sentences, each appended with its FEE (frame-evoking element)
	ys – A list of frames corresponding to the given sentences

get_dataset_comb(m_reader: framenet_tools.data_handler.reader.DataReader)

Generates sentences with their BIO-tags

Parameters:	m_reader – The DataReader to create the dataset from
Returns:	A pair of concurrent lists containing the sequences and their labels

load()

Loads the saved model of the span identification network

Returns:

predict_spans(m_reader: framenet_tools.data_handler.reader.DataReader)

Predicts the spans of the currently loaded dataset. The predictions are saved in the annotations.

NOTE: All loaded spans and roles are overwritten!

Returns:

prepare_dataset(xs: List[str], ys: List[str], batch_size: int = None)

Prepares the dataset and returns a BucketIterator of the dataset

Parameters:	batch_size – The batch size with which the dataset will be prepared
	xs – A list of sentences
	ys – A list of frames corresponding to the given sentences
Returns:	A BucketIterator of the dataset

query(embedded_sentence: List[float], annotation: framenet_tools.data_handler.annotation.Annotation, pos_tags: List[str], use_static: bool = True)

Predicts a possible span set for a given sentence.

NOTE: This can be done statically (using only syntax) or via an LSTM.

Parameters:	pos_tags – The POS tags of the sentence
	embedded_sentence – The embedded words of the sentence
	annotation – The annotation of the sentence to predict
	use_static – If True, uses the static syntactic version; otherwise the neural network
Returns:	A list of possible span tuples

query_all(annotation: framenet_tools.data_handler.annotation.Annotation)

Returns all possible spans of a sentence. Because every correct span is among the predictions, recall is perfect, but precision is close to 0.

NOTE: This creates a power set! Meaning there will be 2^N elements returned (N: words in sentence).

Parameters:	annotation – The annotation of the sentence to predict
Returns:	A list of ALL possible span tuples
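
To give a feel for how quickly the candidate set grows, the following sketch enumerates every contiguous span of a sentence. It is not the package's implementation: the helper name `all_contiguous_spans` and the inclusive `(start, end)` index format are assumptions. Under that assumption a sentence of N words yields N(N+1)/2 spans; the 2^N figure in the note above would apply if arbitrary subsets of words were returned instead.

```python
from typing import List, Tuple


def all_contiguous_spans(num_words: int) -> List[Tuple[int, int]]:
    """Hypothetical helper: every (start, end) pair with start <= end."""
    return [
        (start, end)
        for start in range(num_words)
        for end in range(start, num_words)
    ]


spans = all_contiguous_spans(4)
print(len(spans))  # 10, i.e. 4 * 5 / 2
print(spans[:4])   # [(0, 0), (0, 1), (0, 2), (0, 3)]
```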

query_nn(embedded_sentence: List[float], annotation: framenet_tools.data_handler.annotation.Annotation, pos_tags: List[str])

Predicts the possible spans using the LSTM.

NOTE: In order to use this, the network must be trained beforehand.

Parameters:	pos_tags – The POS tags of the sentence
	embedded_sentence – The embedded words of the sentence
	annotation – The annotation of the sentence to predict
Returns:	A list of possible span tuples

query_static(annotation: framenet_tools.data_handler.annotation.Annotation)

Predicts the set of possible spans just by the use of the static syntax tree.

NOTE: deprecated!

Parameters:	annotation – The annotation of the sentence to predict
Returns:	A list of possible span tuples

to_one_hot(l: List[int])

Helper Function that converts a list of numerals into a list of one-hot encoded vectors

Parameters:	l – The list to convert
Returns:	A list of one-hot vectors
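
A minimal sketch of one-hot encoding in plain Python, for readers unfamiliar with the term. The name `to_one_hot_sketch` and the `num_classes` parameter are hypothetical, and inferring the vector length from the largest label is an assumption; the method documented above may determine the length differently.

```python
from typing import List, Optional


def to_one_hot_sketch(labels: List[int], num_classes: Optional[int] = None) -> List[List[int]]:
    """Hypothetical helper: turn each label into a vector with a single 1."""
    if num_classes is None:
        # Assumption: labels are 0-based, so the largest label fixes the length.
        num_classes = max(labels) + 1 if labels else 0
    return [
        [1 if i == label else 0 for i in range(num_classes)]
        for label in labels
    ]


print(to_one_hot_sketch([0, 2, 1]))
# [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```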

train(mReader, mReaderDev)

Trains the model on all of the given annotations.

Parameters:	mReader – The reader providing the annotations to train the model from
	mReaderDev – The reader providing the development annotations
Returns:

traverse_syntax_tree(node: Token)

Traverses the syntax tree, starting from a given node, and returns the spans of all its subtrees.

NOTE: Recursive

Parameters:	node – The node to start from
Returns:	A list of spans of all subtrees
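
The recursion can be illustrated with a small stand-alone sketch. The `Node` class is a hypothetical stand-in for a parsed token (it only stores a word index and its children), and `subtree_spans` is not the package's method. The sketch also assumes that every subtree covers a contiguous range of words.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a parsed token: word index plus children."""
    index: int
    children: List["Node"] = field(default_factory=list)


def subtree_spans(node: Node) -> List[Tuple[int, int]]:
    """Return the (first, last) word index of every subtree below node."""
    spans: List[Tuple[int, int]] = []
    first = last = node.index
    for child in node.children:
        child_spans = subtree_spans(child)
        # The last entry is always the span of the child's own subtree.
        child_first, child_last = child_spans[-1]
        first, last = min(first, child_first), max(last, child_last)
        spans.extend(child_spans)
    spans.append((first, last))  # span of the subtree rooted at node
    return spans


# "The cat sat on the mat" with "sat" (index 2) as the root.
tree = Node(2, [Node(1, [Node(0)]), Node(3, [Node(5, [Node(4)])])])
print(subtree_spans(tree))
# [(0, 0), (0, 1), (4, 4), (4, 5), (3, 5), (0, 5)]
```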

framenet_tools.span_identification.spanidnetwork module

class framenet_tools.span_identification.spanidnetwork.SpanIdNetwork(cM: framenet_tools.config.ConfigManager, num_classes: int)

Bases: object

eval_dev(xs: List[Tensor] = None, ys: List[List[int]] = None)

Evaluates the model directly on a prepared dataset

Parameters:	xs – The development sequences, given as a list of tensors
	ys – The labels of the sequences
Returns:

load_model(path: str)

Loads the model from a given path

Parameters:	path – The path from where to load the model
Returns:

predict(sent: List[int])

Predicts the BIO-Tags of a given sentence.

Parameters:	sent – The sentence to predict (already converted by the vocab)
Returns:	A list of possibilities for each word for each tag
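
Assuming the returned value holds one score per tag for each word, the most likely tag sequence can be read off with an argmax and then grouped into spans. The sketch below uses plain Python lists rather than tensors. The tag order in `TAGS` and the helper names `decode_tags` and `tags_to_spans` are hypothetical and not taken from the package.

```python
from typing import List, Optional, Sequence, Tuple

TAGS = ("B", "I", "O")  # assumed tag order, not taken from the package


def decode_tags(scores: Sequence[Sequence[float]]) -> List[str]:
    """Pick the highest-scoring tag for every word."""
    return [TAGS[max(range(len(TAGS)), key=lambda t: row[t])] for row in scores]


def tags_to_spans(tags: Sequence[str]) -> List[Tuple[int, int]]:
    """Group a BIO sequence into inclusive (start, end) spans."""
    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        # A "B" always opens a span; a stray "I" without an open span does too.
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i - 1))  # close the previous span
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(tags) - 1))  # span running to the sentence end
    return spans


scores = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
tags = decode_tags(scores)
print(tags)                 # ['B', 'I', 'O', 'B']
print(tags_to_spans(tags))  # [(0, 1), (3, 3)]
```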

reset_hidden()

Resets the hidden states of the LSTM.

Returns:

save_model(path: str)

Saves the current model at the given path

Parameters:	path – The path to save the model at
Returns:

train_model(xs: List[Tensor], ys: List[List[int]], dev_xs: List[Tensor] = None, dev_ys: List[List[int]] = None)

Trains the model with the given dataset. Uses the model specified in net.

Parameters:	xs – The training sequences, given as a list of tensors
	ys – The labels of the training sequences
	dev_xs – The development sequences, given as a list of tensors
	dev_ys – The labels of the development sequences
Returns:
