framenet_tools.data_handler package

Submodules

framenet_tools.data_handler.annotation module

class framenet_tools.data_handler.annotation.Annotation(frame: str = 'Default', fee: str = None, position: int = None, fee_raw: str = None, sentence: List[str] = [], roles: List[str] = [], role_positions: List[Tuple[int, int]] = [])

Bases: object

Annotation class

Saves and manages all data of one frame for a given sentence.

create_handle()

Helper function for ease of programmatic comparison

NOTE: FEE is not compared due to possible differences during preprocessing!

Returns: A handle consisting of all data saved in this object
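
The comparison idea behind create_handle() can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name AnnotationSketch and the exact set of fields placed in the handle are assumptions. The only detail taken from the description above is that the FEE is left out of the comparison.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in for Annotation; not the library's class."""

    frame: str = "Default"
    fee: Optional[str] = None
    position: Optional[int] = None
    fee_raw: Optional[str] = None
    sentence: List[str] = field(default_factory=list)
    roles: List[str] = field(default_factory=list)
    role_positions: List[Tuple[int, int]] = field(default_factory=list)

    def create_handle(self) -> tuple:
        # Assumption: the handle contains every field except the
        # preprocessed FEE, which may differ between pipelines.
        return (
            self.frame,
            self.position,
            self.fee_raw,
            tuple(self.sentence),
            tuple(self.roles),
            tuple(self.role_positions),
        )


# Two annotations that differ only in the preprocessed FEE.
first = AnnotationSketch(frame="Motion", fee="run", position=1,
                         fee_raw="ran", sentence=["He", "ran"])
second = AnnotationSketch(frame="Motion", fee="ran", position=1,
                          fee_raw="ran", sentence=["He", "ran"])
same = first.create_handle() == second.create_handle()
```

Comparing handles rather than whole objects keeps the equality check independent of how the FEE was lemmatized or otherwise preprocessed.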

framenet_tools.data_handler.frame_embedding_manager module

class framenet_tools.data_handler.frame_embedding_manager.FrameEmbeddingManager(path: str = 'data/frame_embeddings/dict_frame_to_emb_100dim_wsb_list.txt')

Bases: object

Loads and provides the specified frame-embeddings

embed(frame: str)

Converts a given frame to its embedding

Parameters: frame – The frame to embed
Returns: The embedding (n-dimensional vector)

read_frame_embeddings()

Loads the previously specified frame embedding file into a dictionary

string_to_array(string: str)

Helper function. Converts the string representation of an array back into an array.

NOTE: Intended for float arrays only!

Parameters: string – The string representation of an array
Returns: The array
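
As a rough illustration of this kind of conversion, the sketch below parses a bracketed list of numbers into floats. The function name parse_float_array and the accepted input format (square brackets, values separated by commas or whitespace) are assumptions. The actual layout of the embedding file is not described here.

```python
from typing import List


def parse_float_array(text: str) -> List[float]:
    """Hypothetical helper: turn '[0.1, 0.2]' into [0.1, 0.2]."""
    # Drop surrounding whitespace and the enclosing brackets.
    cleaned = text.strip().strip("[]")
    # Treat commas and whitespace alike as separators.
    tokens = cleaned.replace(",", " ").split()
    # float() raises ValueError on non-numeric tokens, which matches
    # the "float arrays only" restriction.
    return [float(token) for token in tokens]


vector = parse_float_array("[0.5, -1.25 3]")
```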

framenet_tools.data_handler.rawreader module

class framenet_tools.data_handler.rawreader.RawReader(cM: framenet_tools.config.ConfigManager, raw_path: str = None)

Bases: framenet_tools.data_handler.reader.DataReader

A reader for raw text files.

Inherits from DataReader

read_raw_text(raw_path: str = None)

Reads a raw text file and saves the content as a dataset

NOTE: Applying this function removes the previous dataset content

Parameters: raw_path – The path of the file to read

framenet_tools.data_handler.reader module

class framenet_tools.data_handler.reader.DataReader(cM: framenet_tools.config.ConfigManager)

Bases: object

The top-level DataReader

Stores all loaded data from every reader.

embed_frame(frame: str)

Embeds a single frame.

NOTE: If the embedding of the frame cannot be found, a random set of values is generated.

Parameters: frame – The frame to embed
Returns: The embedding of the frame
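
The fallback behaviour described in the note can be sketched as a dictionary lookup with a random default. Everything in this example is an assumption made for illustration: the function name embed_with_fallback, the plain dict used as the embedding table, and the uniform range of the random values are not taken from the library.

```python
import random
from typing import Dict, List, Optional


def embed_with_fallback(
    frame: str,
    table: Dict[str, List[float]],
    dim: int,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Hypothetical lookup: stored vector if known, random vector otherwise."""
    if frame in table:
        return table[frame]
    # Assumption: unknown frames get values drawn uniformly from [-1, 1].
    rng = rng or random.Random()
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


table = {"Motion": [0.1, 0.2]}
known = embed_with_fallback("Motion", table, dim=2)
unknown = embed_with_fallback("Unseen_frame", table, dim=2)
```

Passing a seeded random.Random instance makes the fallback vectors reproducible between runs.
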
embed_frames(force: bool = False)

Embeds all the sentences that are currently loaded.

NOTE: if forced, overrides embedded data inside of the annotation objects

Parameters: force – If true, embeddings are generated even if they already exist

embed_word(word: str)

Embeds a single word

Parameters: word – The word to embed
Returns: The vector of the embedding

embed_words(force: bool = False)

Embeds all words of all sentences that are currently saved in “sentences”.

NOTE: Can erase all previously embedded data!

Parameters: force – If true, all previously saved embeddings will be overwritten!

export_to_json(path: str)

Exports the list of annotations to a JSON file

Parameters: path – The path of the JSON file

generate_pos_tags(force: bool = False)

Generates the POS-tags of all sentences that are currently saved.

Parameters: force – If true, the POS-tags will overwrite previously saved tags.

get_annotations(sentence: List[str] = None)

Returns the annotation object for a given sentence.

Parameters: sentence – The sentence to retrieve the annotations for.
Returns: An annotation object

loaded(is_annotated: bool)

Helper for setting flags

Parameters: is_annotated – Flag indicating whether the loaded data was annotated

framenet_tools.data_handler.semaforreader module

class framenet_tools.data_handler.semaforreader.SemaforReader(cM: framenet_tools.config.ConfigManager, path_sent: str = None, path_elements: str = None)

Bases: framenet_tools.data_handler.reader.DataReader

A reader for the Semafor CoNLL format

Inherits from DataReader

digest_raw_data(elements: list, sentences: list)

Converts the raw elements and sentences into a nicely structured dataset

NOTE: This representation is meant to match the one in the “frames-files”

Parameters:
  • elements – The annotation data of the given sentences
  • sentences – The sentences to digest

digest_role_data(element: str)

Parses a string of role information into the desired format

Parameters: element – The string containing the role data
Returns: A pair of parallel lists containing the roles and their spans

read_data(path_sent: str = None, path_elements: str = None)

Reads the sentence file and the elements file and saves the content as a dataset

NOTE: Applying this function removes the previous dataset content

Parameters:
  • path_sent – The path to the sentence file
  • path_elements – The path to the elements file

framenet_tools.data_handler.semevalreader module

class framenet_tools.data_handler.semevalreader.SemevalReader(cM: framenet_tools.config.ConfigManager, path_xml: str = None)

Bases: framenet_tools.data_handler.reader.DataReader

A reader for the Semeval format.

Inherits from DataReader

digest_tree(root: xml.etree.ElementTree)

Parses the xml-tree into a DataReader object.

Parameters: root – The root node of the tree

read_data(path_xml: str = None)

Reads an XML file and parses it into the DataReader format.

NOTE: Applying this function removes the previous dataset content

Parameters: path_xml – The path of the XML file

framenet_tools.data_handler.semevalreader.char_pos_to_sentence_pos(start_char: int, end_char: int, words: List[str])

Converts positions of char spans in a sentence into word positions.

NOTE: The returned end position is inclusive!

Parameters:
  • start_char – The first character of the span
  • end_char – The last character of the span
  • words – A list of words in a sentence
Returns: The start and end position of the word in the sentence
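
A conversion of this kind can be sketched as below. The sketch assumes that the sentence is the words joined by single spaces and that both character offsets are inclusive. The function name char_span_to_word_span is hypothetical, and the library function may handle tokenization differently.

```python
from typing import List, Optional, Tuple


def char_span_to_word_span(
    start_char: int, end_char: int, words: List[str]
) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical sketch: map an inclusive char span to word indices."""
    start_word: Optional[int] = None
    end_word: Optional[int] = None
    offset = 0  # character offset where the current word begins
    for index, word in enumerate(words):
        word_start = offset
        word_end = offset + len(word) - 1  # inclusive end of this word
        # The first word whose end reaches start_char opens the span.
        if start_word is None and start_char <= word_end:
            start_word = index
        # The last word that begins at or before end_char closes the span.
        if word_start <= end_char:
            end_word = index
        # Assumption: exactly one space separates consecutive words.
        offset = word_end + 2
    return start_word, end_word


# "The cat sat down": characters 4..10 cover "cat sat".
span = char_span_to_word_span(4, 10, ["The", "cat", "sat", "down"])
```

With these inputs the result is (1, 2), so the end index points at the last word inside the span rather than one past it.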

framenet_tools.data_handler.word_embedding_manager module

class framenet_tools.data_handler.word_embedding_manager.WordEmbeddingManager(path: str = 'data/word_embeddings/levy_deps_300.w2vt')

Bases: object

Loads and provides the specified word-embeddings

embed(word: str)

Converts a given word to its embedding

Parameters: word – The word to embed
Returns: The embedding (n-dimensional vector)

read_word_embeddings()

Loads the previously specified word embedding file into a dictionary

string_to_array(strings: List[str])

Helper function. Converts the string representation of an array back into an array.

NOTE: Intended for float arrays only!

Parameters: strings – The strings of an array
Returns: The array

Module contents