framenet_tools.span_identification package
Submodules
framenet_tools.span_identification.spanidentifier module
-
class
framenet_tools.span_identification.spanidentifier.
SpanIdentifier
(cM: framenet_tools.config.ConfigManager)
Bases: object
The Span Identifier, which predicts possible role spans for a given sentence.
Includes multiple ways of predicting:
- static
- using AllenNLP
- using a BiLSTM
Generates a list of (B)egin-, (I)nside-, (O)utside- tags for a given annotation.
Parameters: annotation – The annotation to convert Returns: A list of BIO-tags
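To illustrate the BIO scheme described above, here is a minimal sketch. It is not the library's implementation: the function name `spans_to_bio` and the use of inclusive `(start, end)` token-index pairs are assumptions made for this example.

```python
from typing import List, Tuple


def spans_to_bio(num_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Tag each token as B, I or O (hypothetical helper, not part of framenet_tools).

    Assumes every span is an inclusive (start, end) pair of token indices
    and that spans do not overlap.
    """
    tags = ["O"] * num_tokens
    for start, end in spans:
        tags[start] = "B"  # first token of the span
        for i in range(start + 1, end + 1):
            tags[i] = "I"  # remaining tokens inside the span
    return tags


# Six tokens with two role spans: tokens 0-1 and tokens 3-5.
print(spans_to_bio(6, [(0, 1), (3, 5)]))  # ['B', 'I', 'O', 'B', 'I', 'I']
```

Tokens outside every span keep the `O` tag, so a sentence without role spans yields only `O` tags.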
-
get_dataset
(annotations: List[List[framenet_tools.data_handler.annotation.Annotation]])
Loads the dataset and combines the necessary data.
Parameters: annotations – A list of all annotations containing all sentences Returns: xs: A list of sentences, each appended with its FEE (frame-evoking element); ys: A list of frames corresponding to the given sentences
-
get_dataset_comb
(m_reader: framenet_tools.data_handler.reader.DataReader)
Generates sentences with their BIO-tags.
Parameters: m_reader – The DataReader to create the dataset from Returns: A pair of parallel lists containing the sequences and their labels
-
load
()
Loads the saved model of the span identification network.
Returns:
-
predict_spans
(m_reader: framenet_tools.data_handler.reader.DataReader)
Predicts the spans of the currently loaded dataset. The predictions are saved in the annotations.
NOTE: All loaded spans and roles are overwritten!
Returns:
-
prepare_dataset
(xs: List[str], ys: List[str], batch_size: int = None)
Prepares the dataset and returns a BucketIterator over it.
Parameters: - batch_size – The batch size to use when preparing the dataset
- xs – A list of sentences
- ys – A list of frames corresponding to the given sentences
Returns: A BucketIterator of the dataset
-
query
(embedded_sentence: List[float], annotation: framenet_tools.data_handler.annotation.Annotation, pos_tags: List[str], use_static: bool = True)
Predicts a possible span set for a given sentence.
NOTE: This can be done statically (using only syntax) or via an LSTM.
Parameters: - pos_tags – The POS tags of the sentence
- embedded_sentence – The embedded words of the sentence
- annotation – The annotation of the sentence to predict
- use_static – If True, uses the static syntactic version; otherwise uses the neural network
Returns: A list of possible span tuples
-
query_all
(annotation: framenet_tools.data_handler.annotation.Annotation)
Returns all possible spans of a sentence. Because every correct span is among them, Recall is perfect, but Precision is close to 0.
NOTE: This creates a power set, so 2^N elements are returned (N: number of words in the sentence).
Parameters: annotation – The annotation of the sentence to predict Returns: A list of ALL possible span tuples
-
query_nn
(embedded_sentence: List[float], annotation: framenet_tools.data_handler.annotation.Annotation, pos_tags: List[str])
Predicts the possible spans using the LSTM.
NOTE: The network must be trained before this can be used.
Parameters: - pos_tags – The POS tags of the sentence
- embedded_sentence – The embedded words of the sentence
- annotation – The annotation of the sentence to predict
Returns: A list of possible span tuples
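Since the network predicts BIO-tags while this method returns span tuples, a decoding step of the following shape is plausible. This is a sketch under assumptions, not the library's code: `bio_to_spans` is a hypothetical name, spans are taken to be inclusive `(start, end)` token indices, and a stray `I` without a preceding `B` is treated as the start of a new span.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int]]:
    """Decode a BIO tag sequence into inclusive (start, end) spans.

    Hypothetical helper for illustration; not part of framenet_tools.
    """
    spans = []
    start = None  # index where the currently open span began, if any
    for i, tag in enumerate(tags):
        # A "B" always opens a span; an "I" opens one only if none is open.
        starts_new = tag == "B" or (tag == "I" and start is None)
        # An "O" or a newly opened span closes the span that was open so far.
        if (tag == "O" or starts_new) and start is not None:
            spans.append((start, i - 1))
            start = None
        if starts_new:
            start = i
    if start is not None:  # close a span that runs to the end of the sentence
        spans.append((start, len(tags) - 1))
    return spans


print(bio_to_spans(["B", "I", "O", "B", "I", "I"]))  # [(0, 1), (3, 5)]
```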
-
query_static
(annotation: framenet_tools.data_handler.annotation.Annotation)
Predicts the set of possible spans using only the static syntax tree.
NOTE: deprecated!
Parameters: annotation – The annotation of the sentence to predict Returns: A list of possible span tuples
-
to_one_hot
(l: List[int])
Helper function that converts a list of integers into a list of one-hot encoded vectors.
Parameters: l – The list to convert Returns: A list of one-hot vectors
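As an illustration of one-hot encoding, the sketch below uses a hypothetical function `one_hot_encode`. How the library determines the vector width is not stated here, so the optional `num_classes` parameter and the fallback to `max(labels) + 1` are assumptions.

```python
from typing import List, Optional


def one_hot_encode(labels: List[int], num_classes: Optional[int] = None) -> List[List[int]]:
    """Convert integer labels into one-hot vectors (illustrative, not the library's code)."""
    if num_classes is None:
        # Assumption: infer the width from the largest label present.
        num_classes = max(labels) + 1 if labels else 0
    return [[1 if i == label else 0 for i in range(num_classes)] for label in labels]


print(one_hot_encode([0, 2, 1]))  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Passing `num_classes` explicitly keeps the vector width stable when a batch happens not to contain the highest label.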
-
train
(mReader, mReaderDev)
Trains the model on all of the given annotations.
Parameters: annotations – A list of all annotations to train the model from Returns:
-
traverse_syntax_tree
(node: Token)
Traverses the syntax tree, starting from a given node, and returns the spans of all its subtrees.
NOTE: Recursive
Parameters: node – The node to start from Returns: A list of spans of all subtrees
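The recursive idea can be sketched with a small stand-in tree. The `Node` class and `collect_subtree_spans` function below are hypothetical and only mimic a parsed token with children. The sketch also assumes that each subtree covers a contiguous range of token indices, so a span is reported as its leftmost and rightmost index.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Stand-in for a parsed token: its position and its syntactic children."""
    index: int
    children: List["Node"] = field(default_factory=list)


def collect_subtree_spans(root: Node) -> List[Tuple[int, int]]:
    """Return the (leftmost, rightmost) token index of every subtree, children first."""
    spans: List[Tuple[int, int]] = []

    def visit(node: Node) -> Tuple[int, int]:
        lo = hi = node.index
        for child in node.children:
            c_lo, c_hi = visit(child)  # recurse into the child's subtree
            lo, hi = min(lo, c_lo), max(hi, c_hi)
        spans.append((lo, hi))  # span covered by this node's subtree
        return lo, hi

    visit(root)
    return spans


# "The cat sat": "sat" (2) is the root, "cat" (1) depends on it, "The" (0) on "cat".
tree = Node(2, [Node(1, [Node(0)])])
print(collect_subtree_spans(tree))  # [(0, 0), (0, 1), (0, 2)]
```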
framenet_tools.span_identification.spanidnetwork module
-
class
framenet_tools.span_identification.spanidnetwork.
SpanIdNetwork
(cM: framenet_tools.config.ConfigManager, num_classes: int)
Bases: object
-
eval_dev
(xs: List[Tensor] = None, ys: List[List[int]] = None)
Evaluates the model directly on a prepared dataset.
Parameters: - xs – The development sequences, given as a list of tensors
- ys – The labels of the sequence
Returns:
-
load_model
(path: str)
Loads the model from a given path.
Parameters: path – The path from where to load the model Returns:
-
predict
(sent: List[int])
Predicts the BIO-tags of a given sentence.
Parameters: sent – The sentence to predict (already converted by the vocab) Returns: A list containing, for each word, the probability of each tag
Resets the hidden states of the LSTM.
Returns:
-
save_model
(path: str)
Saves the current model at the given path.
Parameters: path – The path to save the model at Returns:
-
train_model
(xs: List[Tensor], ys: List[List[int]], dev_xs: List[Tensor] = None, dev_ys: List[List[int]] = None)
Trains the model with the given dataset. Uses the model specified in net.
Parameters: - xs – The training sequences, given as a list of tensors
- ys – The labels of the sequences
- dev_xs – The development sequences, given as a list of tensors
- dev_ys – The labels of the sequences
Returns:
-