util — General functions and constants

The util module defines general functions and constants used throughout the library.

class pylangacq.util.ListFromIterables(*iterables)[source]

A class like list that can be initialized with iterables.
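
The description above does not say how several iterables are combined. A minimal sketch, assuming they are simply concatenated in order, might look like the following; the name ChainedList is hypothetical and this is not the library's own source.

```python
from itertools import chain


class ChainedList(list):
    """Hypothetical stand-in: a list built from any number of iterables."""

    def __init__(self, *iterables):
        # Assumption: items of every iterable are concatenated in the order given.
        super().__init__(chain.from_iterable(iterables))


combined = ChainedList([1, 2], (3,), "ab")
print(combined)  # [1, 2, 3, 'a', 'b']
```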

pylangacq.util.clean_utterance(utterance, phon=False)[source]

Filter away the CHAT-style annotations in utterance.

Parameters:
  • utterance – The utterance as a str
  • phon – whether we are handling PhonBank data; defaults to False. If True, words like “xxx” and “yyy” won’t be removed.
Returns:

The utterance without CHAT annotations

Return type:

str

pylangacq.util.convert_date_to_tuple(date_str)[source]

Convert date_str to (year, month, day), e.g., from '01-FEB-2016' to (2016, 2, 1).
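
The conversion can be illustrated with a short stand-alone sketch. The helper name date_to_tuple and the explicit month table are assumptions made for illustration, chosen so the example does not depend on the system locale; the library's actual implementation may differ.

```python
# Month abbreviations as they appear in dates such as '01-FEB-2016'.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def date_to_tuple(date_str):
    """Hypothetical helper: turn 'DD-MMM-YYYY' into (year, month, day)."""
    day, month, year = date_str.split("-")
    # Months are numbered from 1, so add 1 to the position in the table.
    return int(year), MONTHS.index(month.upper()) + 1, int(day)


print(date_to_tuple("01-FEB-2016"))  # (2016, 2, 1)
```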

pylangacq.util.endswithoneof(inputstr, seq)[source]
Check whether inputstr ends with one of the items in seq. If it does, return
the item that it ends with. If it does not, return None.
Parameters:
  • inputstr – input string
  • seq – sequence of items to check
Returns:

the item that the input string ends with (None if not found)

Return type:

str or None
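
The documented behaviour (return the matching item, or None) can be sketched in a few lines. The function name ends_with_one_of is hypothetical, and returning the first match in iteration order is an assumption.

```python
def ends_with_one_of(inputstr, seq):
    """Hypothetical sketch: return the first item of seq that inputstr ends with."""
    for item in seq:
        if inputstr.endswith(item):
            return item
    # No item matched.
    return None


print(ends_with_one_of("walking", ["ed", "ing"]))  # ing
print(ends_with_one_of("walk", ["ed", "ing"]))     # None
```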

pylangacq.util.find_indices(longstr, substring)[source]

Find the indices of all non-overlapping occurrences of substring in longstr.

Parameters:
  • longstr – the long string
  • substring – the substring to find
Returns:

list of indices in longstr at which substring occurs

Return type:

list
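
"Non-overlapping" means that once a match is found, the search resumes after the end of that match rather than at the next character. A minimal sketch of that idea follows; the name find_all and the handling of an empty substring are assumptions, not the library's code.

```python
def find_all(longstr, substring):
    """Hypothetical sketch: start indices of non-overlapping matches."""
    if not substring:
        # Assumption: an empty substring yields no matches (and avoids looping forever).
        return []
    indices = []
    start = longstr.find(substring)
    while start != -1:
        indices.append(start)
        # Skip past the whole match so that occurrences cannot overlap.
        start = longstr.find(substring, start + len(substring))
    return indices


print(find_all("aaaa", "aa"))    # [0, 2]
print(find_all("abcabc", "bc"))  # [1, 4]
```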

pylangacq.util.get_lemma_from_mor(mor)[source]

Extract the lemma from mor.

pylangacq.util.get_participant_code(tier_marker_seq)[source]

Return the participant code from a collection of tier markers.

Parameters:
  • tier_marker_seq – A sequence of tier markers like {'CHI', '%mor', '%gra'}
Returns:

A participant code, e.g., 'CHI'

Return type:

str, or None if no participant code is found

pylangacq.util.remove_extra_spaces(inputstr)[source]

Remove extra spaces in inputstr so that only single spaces (not double,
triple, etc.) remain.

Parameters:
  • inputstr – input string
Returns:

string with extra spaces removed
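
Collapsing runs of spaces can be sketched with a regular expression. The name collapse_spaces is hypothetical, and leaving leading or trailing spaces in place (collapsed to a single space, not stripped) is an assumption about behaviour the description does not specify.

```python
import re


def collapse_spaces(inputstr):
    """Hypothetical sketch: replace every run of two or more spaces with one space."""
    return re.sub(r" {2,}", " ", inputstr)


print(collapse_spaces("more   cookie  please"))  # more cookie please
```
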
pylangacq.util.replace_all(inputstr, replacee_replacer_pairs)[source]
Replace in inputstr all replacees by the corresponding replacers in
replacee_replacer_pairs.
Parameters:
  • inputstr – input string
  • replacee_replacer_pairs – pairs of (replacee, replacer)
Returns:

string with all replacees replaced by their respective replacers
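
A minimal sketch of pairwise replacement, assuming the pairs are applied one after another in the order given (so a later pair can act on the output of an earlier one). The name replace_pairs is hypothetical.

```python
def replace_pairs(inputstr, replacee_replacer_pairs):
    """Hypothetical sketch: apply each (replacee, replacer) pair in turn."""
    for replacee, replacer in replacee_replacer_pairs:
        inputstr = inputstr.replace(replacee, replacer)
    return inputstr


print(replace_pairs("a-b_c", [("-", " "), ("_", " ")]))  # a b c
# Order matters under this assumption: "ab" -> "bb" -> "cc".
print(replace_pairs("ab", [("a", "b"), ("b", "c")]))     # cc
```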

pylangacq.util.startswithoneof(inputstr, seq)[source]
Check whether inputstr starts with one of the items in seq. If it does, return
the item that it starts with. If it does not, return None.
Parameters:
  • inputstr – input string
  • seq – sequence of items to check
Returns:

the item that the input string starts with (None if not found)

Return type:

str or None
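
Python's built-in str.startswith accepts a tuple of prefixes but only reports True or False, not which prefix matched. A sketch of a helper that returns the matching item instead, under the assumption that the first match in iteration order wins, could be written as follows; the name starts_with_one_of is hypothetical.

```python
def starts_with_one_of(inputstr, seq):
    """Hypothetical sketch: return the first item of seq that inputstr starts with."""
    # next() yields the first match, or the default None if there is none.
    return next((item for item in seq if inputstr.startswith(item)), None)


print(starts_with_one_of("%mor:\tpro|I", ["*", "%", "@"]))  # %
print(starts_with_one_of("hello", ["*", "%", "@"]))         # None
```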