bag.text package

Module contents

Functions to manipulate strings.

bag.text.break_lines_near(text, length, leeway=4, whitespace=' \r\n\t', end_line_break='…', start_line_break='…')

Break text into lines of at most length characters and return them as a list.

leeway: how far to search for whitespace
whitespace: characters considered whitespace
end_line_break: character to add to the end of broken words
start_line_break: character to add to the start of broken words

Return type: List[str]

bag.text.capitalize(txt)

Trim, then turn only the first character into upper case.

This function can be used as a colander preparer.

Return type: str
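
The description above differs from the built-in str.capitalize(), which also lowercases the rest of the string. A minimal sketch of the documented behaviour follows; capitalize_sketch is a hypothetical stand-in written for illustration, not the code in bag.text.

```python
def capitalize_sketch(txt: str) -> str:
    """Hypothetical stand-in: trim, then uppercase only the first character."""
    txt = txt.strip()
    # Slicing keeps this safe for the empty string.
    return txt[:1].upper() + txt[1:]


print(capitalize_sketch("  john McCartney "))  # the rest of the string keeps its case
```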

bag.text.content_of(paths, encoding='utf-8', sep='\n')

Read, join and return the contents of paths.

Makes it easy to read one or many files.

bag.text.find_new_title(dir, filename)

Return a path that does not exist yet, in dir.

If filename exists in dir, adds or changes the end of the file title until a name is found that doesn’t yet exist.

For instance, if file “Image (01).jpg” exists in “somedir”, returns “somedir/Image (02).jpg”.

Return type: str

bag.text.keep_digits(txt)

Discard from txt all non-numeric characters.

Return type: str
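
As an illustration of the behaviour described, here is a minimal sketch. keep_digits_sketch is a hypothetical stand-in, and it assumes "numeric" means the ASCII digits 0-9; the real function may treat other Unicode digits differently.

```python
import string


def keep_digits_sketch(txt: str) -> str:
    """Hypothetical stand-in: keep only the ASCII digits 0-9 (an assumption)."""
    return "".join(ch for ch in txt if ch in string.digits)


print(keep_digits_sketch("(21) 5555-1234"))
```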

bag.text.parse_iso_date(txt)

Parse a datetime in ISO format.

Return type: datetime

bag.text.pluralize(singular)

Return plural form of given lowercase singular word (English only).

Based on ActiveState recipe http://code.activestate.com/recipes/413172/

>>> pluralize('')
''
>>> pluralize('goose')
'geese'
>>> pluralize('dolly')
'dollies'
>>> pluralize('genius')
'genii'
>>> pluralize('jones')
'joneses'
>>> pluralize('pass')
'passes'
>>> pluralize('zero')
'zeros'
>>> pluralize('casino')
'casinos'
>>> pluralize('hero')
'heroes'
>>> pluralize('church')
'churches'
>>> pluralize('x')
'xs'
>>> pluralize('car')
'cars'

Return type: str

bag.text.random_string(length, chars='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789')

Return a random string of some length.

Return type: str
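
A minimal sketch of how such a function can be written with the standard library. random_string_sketch and DEFAULT_CHARS are hypothetical names, and the use of the random module is an assumption; the source does not say which generator bag.text uses.

```python
import random
import string

# Same characters as the documented default for chars.
DEFAULT_CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def random_string_sketch(length: int, chars: str = DEFAULT_CHARS) -> str:
    """Hypothetical stand-in: pick `length` characters from `chars` at random."""
    return "".join(random.choice(chars) for _ in range(length))


print(random_string_sketch(12))
```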

bag.text.resist_bad_encoding(txt, possible_encodings=('utf8', 'iso-8859-1'))

Use this to try to avoid errors from text whose encoding is unknown, when erroring out would be worse than possibly displaying garbage.

Maybe we should use the chardet library instead…
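
The general idea of trying several encodings in order can be sketched as follows. decode_leniently is a hypothetical helper, not the bag.text code, and it assumes the input is a bytes object; the final fallback to replacement characters is also an assumption.

```python
def decode_leniently(data: bytes, possible_encodings=("utf8", "iso-8859-1")) -> str:
    """Hypothetical stand-in: return the first successful decoding of `data`."""
    for encoding in possible_encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
    # Last resort (assumption): never raise, show U+FFFD for undecodable bytes.
    return data.decode("ascii", errors="replace")


print(decode_leniently("café".encode("iso-8859-1")))
```

Because iso-8859-1 maps every byte to a character, it never fails to decode, so in this sketch it only makes sense as the last candidate.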

bag.text.shorten(txt, length=10, ellipsis='…')

Truncate txt to a total of length characters, including the ellipsis added to the end.

Return type: str
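
A minimal sketch of truncation in which the ellipsis counts towards the total length. shorten_sketch is a hypothetical stand-in; returning short strings unchanged is an assumption, as the source does not describe that case.

```python
def shorten_sketch(txt: str, length: int = 10, ellipsis: str = "…") -> str:
    """Hypothetical stand-in: cut `txt` so the result, ellipsis included,
    is at most `length` characters long."""
    if len(txt) <= length:
        return txt  # assumption: short strings are returned unchanged
    keep = max(length - len(ellipsis), 0)
    return txt[:keep] + ellipsis


print(shorten_sketch("Hello, world!", length=10))
```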

bag.text.shorten_proper(name, length=11, ellipsis='…', min=None)

Shorten a proper name for displaying.

Return type: str

bag.text.simplify_chars(txt, encoding='ascii', byts=False, amap=None)

Remove from txt all characters not supported by encoding, but use a map to “simplify” some characters instead of just removing them.

If byts is true, return a bytestring.

bag.text.slugify(txt, exists=<function <lambda>>, badchars='', maxlength=16, chars='abcdefghijklmnopqrstuvwxyz23456789', min_suffix_length=1, max_suffix_length=4)

Return a slug that does not yet exist, based on txt.

You may provide exists, a callback that takes a generated slug and checks the database to see if it already exists.

Each attempt generates a longer suffix in order to keep the number of attempts at a minimum.

Return type: str
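
The role of the exists callback and the growing suffix can be sketched as below. This is a simplified illustration, not the bag.text algorithm: unique_slug_sketch is a hypothetical name, the base slug is assumed to be already cleaned, and the "-" separator, the single attempt per suffix length and the final RuntimeError are all assumptions.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz23456789"  # documented default for chars


def unique_slug_sketch(base, exists, chars=ALPHABET,
                       min_suffix_length=1, max_suffix_length=4):
    """Hypothetical stand-in: return `base`, or `base` plus a random suffix,
    such that exists(result) is false."""
    if not exists(base):
        return base
    # Each attempt uses a longer suffix, so collisions become less likely.
    for n in range(min_suffix_length, max_suffix_length + 1):
        suffix = "".join(random.choice(chars) for _ in range(n))
        candidate = base + "-" + suffix  # separator is an assumption
        if not exists(candidate):
            return candidate
    raise RuntimeError("no free slug found")  # assumption: give up loudly


# In a real application the callback would query the database;
# here a set stands in for it.
taken = {"my-post"}
print(unique_slug_sketch("my-post", exists=lambda slug: slug in taken))
```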

bag.text.strip_lower_preparer(value)

Colander preparer that trims whitespace and converts to lowercase.

bag.text.strip_preparer(value)

Colander preparer that trims whitespace around argument value.

bag.text.to_filename(txt, for_web=False, badchars='', maxlength=0, encoding='latin1')

Massage txt until it is a good filename.

Return type: str

bag.text.uncommafy(txt, sep=',')

Generate the elements of a comma-separated string.

Takes a comma-delimited string and returns a generator of stripped strings. No empty string is yielded.

Return type: Generator[str, None, None]
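
The behaviour described above can be sketched as a short generator. uncommafy_sketch is a hypothetical stand-in written for illustration, not the bag.text code.

```python
from typing import Generator


def uncommafy_sketch(txt: str, sep: str = ",") -> Generator[str, None, None]:
    """Hypothetical stand-in: yield the stripped, non-empty items of `txt`."""
    for item in txt.split(sep):
        item = item.strip()
        if item:  # skip empty items, e.g. between two consecutive separators
            yield item


print(list(uncommafy_sketch(" apples, pears ,, plums ,")))
```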
