bag.text package
Module contents
Functions to manipulate strings.
- bag.text.break_lines_near(text, length, leeway=4, whitespace=' \\r\\n\\t', end_line_break='…', start_line_break='…')
Return a list of text broken in lines of max length.
leeway: how far to search for whitespace
whitespace: characters considered whitespace
end_line_break: character to add to the end of broken words
start_line_break: character to add to the start of broken words
- Return type
List[str]
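To make the role of leeway and the two break markers concrete, here is a minimal sketch of one possible reading of the description above. It is not the library's implementation: `break_lines_near_sketch` is a hypothetical name, and it assumes that the markers do not count toward `length` and that the whitespace character used as a break point is dropped.

```python
def break_lines_near_sketch(text, length, leeway=4, whitespace=" \r\n\t",
                            end_line_break="…", start_line_break="…"):
    """Break ``text`` into lines of at most ``length`` characters (markers excluded)."""
    if length < 1:
        raise ValueError("length must be at least 1")
    lines = []
    prefix = ""  # marker carried over when the previous line ended mid-word
    while len(text) > length:
        # Search backwards from the limit, at most ``leeway`` characters.
        cut = next((i for i in range(length, max(length - leeway, 0), -1)
                    if text[i] in whitespace), None)
        if cut is not None:
            # Break at the whitespace and drop that character.
            lines.append(prefix + text[:cut])
            text, prefix = text[cut + 1:], ""
        else:
            # No whitespace close enough: break the word and mark both halves.
            lines.append(prefix + text[:length] + end_line_break)
            text, prefix = text[length:], start_line_break
    lines.append(prefix + text)
    return lines


wrapped = break_lines_near_sketch("the quick brown fox", 10)
broken = break_lines_near_sketch("abcdefghijklmnop", 10)
```

In this sketch `wrapped` breaks at the space before "brown", while `broken` has no whitespace to break at and so receives the markers.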
- bag.text.capitalize(txt)
Trim, then turn only the first character into upper case.
This function can be used as a colander preparer.
- Return type
str
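The difference from Python's built-in `str.capitalize` is easiest to see side by side. The sketch below only follows the one-line description above; `capitalize_sketch` is a hypothetical name, not the library's code.

```python
def capitalize_sketch(txt: str) -> str:
    """Trim ``txt``, then upper-case only its first character."""
    txt = txt.strip()
    # Leave the remaining characters untouched, unlike str.capitalize(),
    # which would also lower-case them.
    return txt[:1].upper() + txt[1:]


print(capitalize_sketch("  mcDonald street "))  # McDonald street
print("mcDonald street".capitalize())           # Mcdonald street
```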
- bag.text.content_of(paths, encoding='utf-8', sep='\n')
Read, join and return the contents of paths.
Makes it easy to read one or many files.
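As an illustration of the "one or many files" behaviour, here is a rough sketch using only the standard library. `content_of_sketch` is a hypothetical stand-in, and accepting a single string path as well as a list is an assumption drawn from the description, not from the library's source.

```python
import os
import tempfile


def content_of_sketch(paths, encoding="utf-8", sep="\n"):
    """Read each file in ``paths`` and join the contents with ``sep``."""
    if isinstance(paths, str):  # assumption: a single path is also accepted
        paths = [paths]
    parts = []
    for path in paths:
        with open(path, encoding=encoding) as stream:
            parts.append(stream.read())
    return sep.join(parts)


# Usage with two throwaway files
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.txt")
    second = os.path.join(tmp, "b.txt")
    for path, text in ((first, "alpha"), (second, "beta")):
        with open(path, "w", encoding="utf-8") as stream:
            stream.write(text)
    joined = content_of_sketch([first, second])
    single = content_of_sketch(first)
```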
- bag.text.find_new_title(dir, filename)
Return a path that does not exist yet, in dir.
If filename exists in dir, adds or changes the end of the file title until a name is found that doesn’t yet exist.
For instance, if file “Image (01).jpg” exists in “somedir”, returns “somedir/Image (02).jpg”.
- Return type
str
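The renaming rule can be sketched as follows. This is a guess at the behaviour based solely on the "Image (01).jpg" example; `find_new_title_sketch` is a hypothetical name, and the choice to start at "(01)" when the title has no counter yet is an assumption.

```python
import os
import re
import tempfile


def find_new_title_sketch(dir, filename):
    """Return a path in ``dir`` that does not exist yet, derived from ``filename``."""
    root, ext = os.path.splitext(filename)
    match = re.search(r" \((\d+)\)$", root)
    if match:
        # The title already ends in a counter such as " (01)": continue from it.
        stem = root[:match.start()]
        number = int(match.group(1))
        width = len(match.group(1))
    else:
        # Assumption: with no counter present, numbering starts at (01).
        stem, number, width = root, 0, 2
    candidate = os.path.join(dir, filename)
    while os.path.exists(candidate):
        number += 1
        candidate = os.path.join(dir, f"{stem} ({number:0{width}d}){ext}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "Image (01).jpg"), "w").close()
    new_name = os.path.basename(find_new_title_sketch(tmp, "Image (01).jpg"))
    free_name = os.path.basename(find_new_title_sketch(tmp, "Other.jpg"))
```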
- bag.text.pluralize(singular)
Return plural form of given lowercase singular word (English only).
Based on ActiveState recipe http://code.activestate.com/recipes/413172/
>>> pluralize('')
''
>>> pluralize('goose')
'geese'
>>> pluralize('dolly')
'dollies'
>>> pluralize('genius')
'genii'
>>> pluralize('jones')
'joneses'
>>> pluralize('pass')
'passes'
>>> pluralize('zero')
'zeros'
>>> pluralize('casino')
'casinos'
>>> pluralize('hero')
'heroes'
>>> pluralize('church')
'churches'
>>> pluralize('x')
'xs'
>>> pluralize('car')
'cars'
- Return type
str
- bag.text.random_string(length, chars='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789')
Return a random string of some length.
- Return type
str
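A function with this signature can be approximated in a few lines. The sketch below is an illustration, not the library's code; `random_string_sketch` and `DEFAULT_CHARS` are hypothetical names, and drawing characters with `random.choice` is an assumption about the approach.

```python
import random
import string

# Same characters as the documented default: A-Z, a-z, 0-9.
DEFAULT_CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def random_string_sketch(length, chars=DEFAULT_CHARS):
    """Return ``length`` characters drawn at random from ``chars``."""
    return "".join(random.choice(chars) for _ in range(length))


token = random_string_sketch(12)
```

Python's `random` module is not designed for security-sensitive values; for passwords or session tokens the standard `secrets` module is the usual choice.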
- bag.text.resist_bad_encoding(txt, possible_encodings=('utf8', 'iso-8859-1'))
Use this to try to avoid errors from text whose encoding is unknown, when erroring out would be worse than possibly displaying garbage.
Maybe we should use the chardet library instead…
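The idea of trying encodings in order can be sketched like this. `resist_bad_encoding_sketch` is a hypothetical name; passing already-decoded text through unchanged and falling back to replacement characters are assumptions, not documented behaviour.

```python
def resist_bad_encoding_sketch(data, possible_encodings=("utf8", "iso-8859-1")):
    """Decode ``data`` with the first encoding that does not raise."""
    if isinstance(data, str):  # assumption: decoded text passes through
        return data
    for encoding in possible_encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Assumption: as a last resort, replace undecodable bytes rather than raise.
    return data.decode(possible_encodings[0], errors="replace")


utf8_bytes = "café".encode("utf8")
latin1_bytes = "café".encode("iso-8859-1")
print(resist_bad_encoding_sketch(utf8_bytes))    # café
print(resist_bad_encoding_sketch(latin1_bytes))  # café
```

Because ISO-8859-1 maps every byte to a character, it never fails to decode, which is why it makes sense as the last entry: it accepts anything, possibly producing garbage.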
- bag.text.shorten(txt, length=10, ellipsis='…')
Truncate txt and append ellipsis, so that the total length of the result is length.
- Return type
str
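The point that the ellipsis counts toward the total length is easy to miss, so here is a small sketch of that arithmetic. `shorten_sketch` is a hypothetical name, and returning short text unchanged is an assumption.

```python
def shorten_sketch(txt: str, length: int = 10, ellipsis: str = "…") -> str:
    """Truncate ``txt`` so the result, ellipsis included, has ``length`` characters."""
    if len(txt) <= length:
        return txt  # assumption: text that already fits is returned unchanged
    # Reserve room for the ellipsis inside the total length.
    keep = max(length - len(ellipsis), 0)
    return txt[:keep] + ellipsis


print(shorten_sketch("Hello, world!"))  # Hello, wo…
```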
- bag.text.shorten_proper(name, length=11, ellipsis='…', min=None)
Shorten a proper name for displaying.
- Return type
str
- bag.text.simplify_chars(txt, encoding='ascii', byts=False, amap=None)
Remove from txt all characters not supported by encoding, but use a map to “simplify” some characters instead of just removing them.
If byts is true, return a bytestring.
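The "map first, then drop what is left" idea can be sketched as below. `simplify_chars_sketch` is a hypothetical name and the small default map is invented for illustration; the map the library actually ships is not shown in this page.

```python
def simplify_chars_sketch(txt, encoding="ascii", byts=False, amap=None):
    """Replace mapped characters, then drop whatever ``encoding`` cannot represent."""
    if amap is None:
        # Hypothetical map, for illustration only.
        amap = {"“": '"', "”": '"', "’": "'", "…": "...", "é": "e"}
    simplified = "".join(amap.get(ch, ch) for ch in txt)
    # errors="ignore" silently removes characters outside the target encoding.
    encoded = simplified.encode(encoding, errors="ignore")
    return encoded if byts else encoded.decode(encoding)


as_text = simplify_chars_sketch("Café… ☃")
as_bytes = simplify_chars_sketch("Café… ☃", byts=True)
```

Here "é" and "…" are simplified through the map, while the snowman has no entry and is removed.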
- bag.text.slugify(txt, exists=<function <lambda>>, badchars='', maxlength=16, chars='abcdefghijklmnopqrstuvwxyz23456789', min_suffix_length=1, max_suffix_length=4)
Return a slug that does not yet exist, based on txt.
You may provide exists, a callback that takes a generated slug and checks the database to see if it already exists.
Each attempt generates a longer suffix in order to keep the number of attempts at a minimum.
- Return type
str
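The interplay between the exists callback and the growing suffix might look roughly like this. The sketch is a simplification under stated assumptions: `slugify_sketch` is a hypothetical name, the normalisation rule (lower-case, non-alphanumerics collapsed to "-") is a guess, the badchars parameter is omitted, and raising after the longest suffix is an invented fallback.

```python
import random
import re


def slugify_sketch(txt, exists=lambda slug: False, maxlength=16,
                   chars="abcdefghijklmnopqrstuvwxyz23456789",
                   min_suffix_length=1, max_suffix_length=4):
    """Return a slug based on ``txt`` for which ``exists(slug)`` is false."""
    # Assumption: lower-case, and collapse anything outside a-z0-9 into "-".
    base = re.sub(r"[^a-z0-9]+", "-", txt.lower())[:maxlength].strip("-")
    if base and not exists(base):
        return base
    # Each attempt uses a longer random suffix, so collisions become less likely.
    for suffix_length in range(min_suffix_length, max_suffix_length + 1):
        suffix = "".join(random.choice(chars) for _ in range(suffix_length))
        # Trim the base so that base + "-" + suffix still fits in maxlength.
        trimmed = base[:maxlength - suffix_length - 1].rstrip("-")
        candidate = f"{trimmed}-{suffix}" if trimmed else suffix
        if not exists(candidate):
            return candidate
    # Assumption: give up once the longest suffix has also collided.
    raise RuntimeError("no free slug found")


taken = {"hello-world"}  # stands in for a database lookup
fresh = slugify_sketch("Hello, World!")
suffixed = slugify_sketch("Hello, World!", exists=taken.__contains__)
```

In real use the callback would query the database, for example checking whether a row with that slug already exists.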
- bag.text.strip_lower_preparer(value)
Colander preparer that trims whitespace and converts to lowercase.
- bag.text.strip_preparer(value)
Colander preparer that trims whitespace around argument value.
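A preparer is a callable that receives a value and returns the cleaned-up value before validation runs. The two preparers above could be sketched as plain functions like these; the `_sketch` names are hypothetical, and passing non-string values through untouched is an assumption rather than documented behaviour.

```python
def strip_preparer_sketch(value):
    """Trim surrounding whitespace; pass non-strings through unchanged."""
    # Assumption: values that are not strings (for example a "missing"
    # marker) are returned as they are.
    return value.strip() if isinstance(value, str) else value


def strip_lower_preparer_sketch(value):
    """Trim surrounding whitespace and convert to lowercase."""
    return value.strip().lower() if isinstance(value, str) else value


print(strip_preparer_sketch("  Some Name  "))               # Some Name
print(strip_lower_preparer_sketch("  User@Example.COM  "))  # user@example.com
```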