bag.text package

Module contents

Functions to manipulate strings.

bag.text.capitalize(txt)

Trim, then turn only the first character into upper case.

This function can be used as a colander preparer.
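
The difference from Python's built-in str.capitalize() is easy to miss, so here is a minimal sketch of the described behaviour. capitalize_sketch is a hypothetical stand-in written for illustration, not the library's code, and it assumes the input is always a str.

```python
def capitalize_sketch(txt: str) -> str:
    """Trim txt, then upper-case only its first character (illustrative sketch)."""
    txt = txt.strip()
    # Unlike str.capitalize(), the rest of the string is left untouched.
    return txt[:1].upper() + txt[1:]


print(capitalize_sketch("  iPhone users  "))  # IPhone users
print("iPhone users".capitalize())            # Iphone users
```

The built-in method lower-cases everything after the first character, which is usually unwanted for names and acronyms.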

bag.text.content_of(paths, encoding='utf-8', sep='\n')

Read, join and return the contents of paths.

Makes it easy to read one or many files.
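
A rough sketch of what "read, join and return" could look like, using only the standard library. content_of_sketch is a hypothetical name, and accepting a single path in place of a list is an assumption based on the "one or many files" wording.

```python
import tempfile
from pathlib import Path


def content_of_sketch(paths, encoding="utf-8", sep="\n"):
    """Read every path and join the contents with sep (illustrative sketch)."""
    # Assumption: a lone path may be given instead of a sequence of paths.
    if isinstance(paths, (str, Path)):
        paths = [paths]
    return sep.join(Path(p).read_text(encoding=encoding) for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "a.txt")
    second = Path(tmp, "b.txt")
    first.write_text("alpha", encoding="utf-8")
    second.write_text("beta", encoding="utf-8")
    combined = content_of_sketch([first, second])
    single = content_of_sketch(first)

print(repr(combined))  # 'alpha\nbeta'
```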

bag.text.find_new_title(dir, filename)

Return a path that does not exist yet, in dir.

If filename exists in dir, adds or changes the end of the file title until a name is found that doesn’t yet exist.

For instance, if file “Image (01).jpg” exists in “somedir”, returns “somedir/Image (02).jpg”.
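
One way the counter logic might work is sketched below. find_new_title_sketch is hypothetical, and the two-digit " (NN)" suffix format is inferred from the single example above; the real function may number files differently.

```python
import re
import tempfile
from pathlib import Path


def find_new_title_sketch(directory, filename):
    """Return a path inside directory that does not exist yet (sketch)."""
    directory = Path(directory)
    stem, ext = Path(filename).stem, Path(filename).suffix
    # Split off a trailing " (NN)" counter if the title already carries one.
    match = re.fullmatch(r"(.*) \((\d+)\)", stem)
    if match:
        base, number = match.group(1), int(match.group(2))
    else:
        base, number = stem, 0
    candidate = directory / filename
    while candidate.exists():
        # Assumption: counters are zero-padded to two digits.
        number += 1
        candidate = directory / f"{base} ({number:02d}){ext}"
    return str(candidate)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Image (01).jpg").touch()
    taken = find_new_title_sketch(tmp, "Image (01).jpg")
    free = find_new_title_sketch(tmp, "Other.jpg")

print(Path(taken).name)  # Image (02).jpg
print(Path(free).name)   # Other.jpg
```

Note that checking for existence and creating the file later is not atomic, so a caller that needs a guarantee would still have to handle a name collision at write time.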

Return type: str

bag.text.keep_digits(txt)

Discard from txt all non-numeric characters.
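
A typical use is normalizing phone numbers or document IDs before validation. The sketch below uses a hypothetical keep_digits_sketch and assumes that "numeric" means whatever str.isdigit() accepts, which may be broader than the library's definition.

```python
def keep_digits_sketch(txt: str) -> str:
    """Keep only the digit characters of txt (illustrative sketch)."""
    # Assumption: str.isdigit() decides what counts as numeric.
    return "".join(ch for ch in txt if ch.isdigit())


print(keep_digits_sketch("(555) 010-9999"))  # 5550109999
```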

Return type: str

bag.text.parse_iso_date(txt)

Parse a datetime in ISO format.
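
The accepted ISO variants are not listed here, so the following is only a sketch that delegates to the standard library's datetime.fromisoformat (Python 3.7+). parse_iso_date_sketch is a hypothetical name, and the real function may accept more or fewer formats.

```python
from datetime import datetime


def parse_iso_date_sketch(txt: str) -> datetime:
    """Parse an ISO 8601 style string into a datetime (illustrative sketch)."""
    # Assumption: surrounding whitespace should be ignored.
    return datetime.fromisoformat(txt.strip())


moment = parse_iso_date_sketch("2014-03-09T18:30:00")
print(moment.year, moment.hour)  # 2014 18
```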

Return type: datetime

bag.text.pluralize(singular)

Return plural form of given lowercase singular word (English only).

Based on ActiveState recipe http://code.activestate.com/recipes/413172/

>>> pluralize('')
''
>>> pluralize('goose')
'geese'
>>> pluralize('dolly')
'dollies'
>>> pluralize('genius')
'genii'
>>> pluralize('jones')
'joneses'
>>> pluralize('pass')
'passes'
>>> pluralize('zero')
'zeros'
>>> pluralize('casino')
'casinos'
>>> pluralize('hero')
'heroes'
>>> pluralize('church')
'churches'
>>> pluralize('x')
'xs'
>>> pluralize('car')
'cars'

Return type: str

bag.text.random_string(length, chars='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789')

Return a random string of some length.
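
For illustration, a function with this signature could be as small as the sketch below. random_string_sketch and ALPHANUMERIC are hypothetical names; which random source the library really uses is not stated here.

```python
import random
import string

# Same characters as the documented default for chars.
ALPHANUMERIC = string.ascii_uppercase + string.ascii_lowercase + string.digits


def random_string_sketch(length, chars=ALPHANUMERIC):
    """Return length characters drawn at random from chars (sketch)."""
    return "".join(random.choice(chars) for _ in range(length))


token = random_string_sketch(12)
print(len(token))  # 12
```

The random module is not suitable for security-sensitive values; for passwords or session tokens, secrets.choice from the standard library is the safer building block.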

Return type: str

bag.text.resist_bad_encoding(txt, possible_encodings=('utf8', 'iso-8859-1'))

Use this to try to avoid errors from text whose encoding is unknown, when erroring out would be worse than possibly displaying garbage.

Maybe we should use the chardet library instead…
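
The general idea of trying candidate encodings in order can be sketched as follows. resist_bad_encoding_sketch is hypothetical; it assumes the input is a bytes object, and the final fallback with errors="replace" is an addition of this sketch rather than documented behaviour.

```python
def resist_bad_encoding_sketch(data: bytes,
                               possible_encodings=("utf8", "iso-8859-1")) -> str:
    """Decode data with the first encoding that works (illustrative sketch)."""
    for encoding in possible_encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
    # Assumption: never raise; substitute undecodable bytes as a last resort.
    return data.decode(possible_encodings[0], errors="replace")


print(resist_bad_encoding_sketch("café".encode("utf8")))        # café
print(resist_bad_encoding_sketch("café".encode("iso-8859-1")))  # café
```

Because iso-8859-1 maps every byte to a character, it never fails, so it works as a catch-all when placed last. The price is that text in some other encoding decodes to garbage instead of raising.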

bag.text.shorten(txt, length=10, ellipsis='…')

Truncate txt, adding ellipsis to the end, so that the result has a total length of length characters.
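
The point that the ellipsis counts toward the total length is easiest to see in code. shorten_sketch below is a hypothetical reimplementation, and returning short strings unchanged is an assumption.

```python
def shorten_sketch(txt, length=10, ellipsis="…"):
    """Truncate txt so that text plus ellipsis is length characters (sketch)."""
    # Assumption: text that already fits is returned unchanged.
    if len(txt) <= length:
        return txt
    # Reserve room for the ellipsis inside the total length.
    return txt[: length - len(ellipsis)] + ellipsis


print(shorten_sketch("Internationalization"))  # Internati…
print(shorten_sketch("Short"))                 # Short
```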

Return type: str

bag.text.shorten_proper(name, length=11, ellipsis='…', min=None)

Shorten a proper name for displaying.

Return type: str

bag.text.simplify_chars(txt, encoding='ascii', byts=False, amap=None)

Remove from txt all characters not supported by encoding, but use a map to “simplify” some characters instead of just removing them.

If byts is true, return a bytestring.
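
A sketch of the map-then-encode approach follows. simplify_chars_sketch and the tiny DEFAULT_MAP are hypothetical; the library's actual map is not shown here. The Unicode NFKD decomposition step is a technique swapped in for this sketch so that accented letters keep their base letter.

```python
import unicodedata

# Hypothetical replacement map, far smaller than a real one would be.
DEFAULT_MAP = {"“": '"', "”": '"', "ß": "ss"}


def simplify_chars_sketch(txt, encoding="ascii", byts=False, amap=None):
    """Replace or drop characters that encoding cannot represent (sketch)."""
    if amap is None:
        amap = DEFAULT_MAP
    pieces = []
    for ch in txt:
        ch = amap.get(ch, ch)  # "simplify" mapped characters first
        # Assumption: decompose accents so that "é" becomes "e" + combining mark.
        pieces.append(unicodedata.normalize("NFKD", ch))
    # Whatever still cannot be encoded (e.g. combining marks) is dropped.
    encoded = "".join(pieces).encode(encoding, errors="ignore")
    return encoded if byts else encoded.decode(encoding)


print(simplify_chars_sketch("“Straße” café"))  # "Strasse" cafe
```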

bag.text.slugify(txt, exists=<function <lambda>>, badchars='', maxlength=16, chars='abcdefghijklmnopqrstuvwxyz23456789', min_suffix_length=1, max_suffix_length=4)

Return a slug that does not yet exist, based on txt.

You may provide exists, a callback that takes a generated slug and checks the database to see if it already exists.

Each attempt generates a longer suffix in order to keep the number of attempts at a minimum.
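
The retry strategy described above (check with exists, then try progressively longer random suffixes) might look roughly like this. slugify_sketch is hypothetical: the normalization regex, the "-" separator, and the trimming rule are all assumptions, and badchars is omitted for brevity.

```python
import random
import re

SLUG_CHARS = "abcdefghijklmnopqrstuvwxyz23456789"


def slugify_sketch(txt, exists=lambda slug: False, maxlength=16,
                   chars=SLUG_CHARS, min_suffix_length=1, max_suffix_length=4):
    """Return a slug for txt that exists() reports as unused (sketch)."""
    # Assumption: lower-case, then collapse non-alphanumeric runs into "-".
    base = re.sub(r"[^a-z0-9]+", "-", txt.lower()).strip("-")[:maxlength]
    if not exists(base):
        return base
    suffix_length = min_suffix_length
    while True:  # a real implementation would cap the number of attempts
        suffix = "".join(random.choice(chars) for _ in range(suffix_length))
        # Trim the base so that base + "-" + suffix still fits in maxlength.
        slug = base[: maxlength - suffix_length - 1] + "-" + suffix
        if not exists(slug):
            return slug
        # Each failed attempt widens the suffix, up to the configured maximum.
        suffix_length = min(suffix_length + 1, max_suffix_length)


taken = {"hello-world"}
fresh = slugify_sketch("Hello, World!")
slug = slugify_sketch("Hello, World!", exists=taken.__contains__)
print(fresh)      # hello-world
print(len(slug))  # 13
```

In an application, exists would typically run a database query; a set lookup stands in for it here.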

Return type: str

bag.text.strip_lower_preparer(value)

Colander preparer that trims whitespace and converts to lowercase.

bag.text.strip_preparer(value)

Colander preparer that trims whitespace around argument value.

bag.text.to_filename(txt, for_web=False, badchars='', maxlength=0)

Massage txt until it is a good filename.

Return type: str

bag.text.uncommafy(txt, sep=', ')

Generate the elements of a comma-separated string.

Takes a comma-delimited string and returns a generator of stripped strings. No empty string is yielded.
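
The generator behaviour can be sketched in a few lines. uncommafy_sketch is a hypothetical name, and splitting on a bare comma before stripping each piece is an assumption about how the separator is handled.

```python
from typing import Generator


def uncommafy_sketch(txt: str, sep: str = ",") -> Generator[str, None, None]:
    """Yield the stripped, non-empty elements of txt (illustrative sketch)."""
    for item in txt.split(sep):
        item = item.strip()
        if item:  # skip empty elements such as the one between ",,"
            yield item


print(list(uncommafy_sketch("apples, pears,, bananas ,")))
# ['apples', 'pears', 'bananas']
```

Because the result is a generator, wrap it in list() when the elements are needed more than once.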

Return type: Generator[str, None, None]