Tatoo Documentation Release 0.7.0 - Dmitry Malinovsky
Tatoo Documentation Release 0.7.0 Dmitry Malinovsky March 04, 2015
Contents

1 Installation
2 Documentation Contents
  2.1 Why tatoo?
  2.2 Hello, world!
  2.3 User Guide
Tatoo is an extensible task toolkit.

• No global state,
• The only hardcoded behaviour is to be configurable,
• Extremely extensible.

It aims to provide a generic way to inspect the runtime environment, such as platform specifics (e.g. OS), current configuration parameters, registered tasks and so forth, and to be as configurable as possible without any sort of monkey patching. Although such configurability may be painful, tatoo provides sensible defaults.
CHAPTER 1 Installation

Grab the latest stable version from PyPI:

    $ pip install tatoo
CHAPTER 2 Documentation Contents

2.1 Why tatoo?

The answer is very simple: the lack of powerful yet simple task execution toolkits. Tatoo tries to bring the awesomeness of Celery to local task execution. Here are some core features tatoo provides:

• laziness: almost everything is created on demand,
• extreme configurability,
• extensibility,
• a beautiful programming API,
• it works the same in Python 2.6+, 3.3+ and PyPy,
• but it is written with Python 3 in mind, so it uses a lot of Python 3 features, carefully backported to Python 2,
• it is tested continuously on Linux, Windows and MacOS,
• it doesn’t try to reinvent the wheel; requirements have been chosen carefully.

Of course, there are alternatives to tatoo and you can use them if they’re more suitable for you. Two features are commonly missing in all the libraries discussed below:

• configurability and
• extensibility.

There are a lot of use cases where you will need to subclass the base Task class, for example, to provide a specific method. How do you tell the library to use your custom subclass instead of the base class? Or how do you add a custom command to the command line interface? The answer is “you can’t”, or “monkey-patch”.

Tatoo is written to be as configurable and extensible as possible. The cases listed above are the “easy level” of tatoo’s configuration abilities. You can even extend it to call tasks remotely!

2.1.1 Why not Pyinvoke?

The programming interface that tatoo provides looks very similar to Pyinvoke’s. However, Pyinvoke makes a lot of assumptions about how to handle task arguments. For instance, it generates command line arguments and options automatically from the function definition and automatically infers the types of arguments that have default values (using the inspect module).

The upside of this is that you don’t need to write any code to make your task available from the command line; keyword arguments are cast to the corresponding types automatically.

The downside is that you can’t really control this process. You can’t specify arguments and options explicitly. You can’t define complex types (e.g. a File type). Automatic generation of options makes your command line interface depend on the task signature: imagine that you want to change the order of arguments; this will cause Pyinvoke to generate options differently.

Pyinvoke brings unnecessary concepts of pre- and post-tasks, deduplication, namespaces, contexts, standard tasks, contextualized tasks and so on. This makes Pyinvoke difficult to learn and use.

2.1.2 Why not Doit, Shovel, Paver and others?

Other libraries have similar concerns, and often they’re too specialized. Some libraries have a very verbose syntax for defining tasks, others are too simplified. Also, it seems that these libraries were designed as Make for Python. Tatoo is not just yet-another-Make, although it can be used that way.

The aim of tatoo is to provide a simple but extensible interface for calling tasks, with the possibility to add additional behaviors, so you can grow from a very simple task to a number of complex tasks calling each other without spending days reading the documentation and the source code to figure out how to take a small step aside.

2.2 Hello, world!

The first thing you need is an Environment instance. Environment is basically a container for all runtime parameters, configuration, tasks and so forth:

    from tatoo import Environment
    env = Environment('myenv')

Each environment instance should have a name. This helps to identify the currently used environment if multiple environments are defined.
Once you have an environment instance, you can transform a function into a task with the task() decorator:

    @env.task
    def hello():
        print('Hello, world!')

Putting it all together in a tasks.py file:

    from tatoo import Environment

    env = Environment()

    @env.task
    def hello():
        print('Hello, world!')

Executing the task is as simple as:

    >>> from tasks import hello
    >>> hello.apply()  # prints Hello, world!
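To make the decorator step less magical, here is a minimal standard-library sketch of the general pattern: a decorator that swaps a function for a task object exposing apply(). It is not tatoo's implementation. MiniEnvironment, MiniTask and the dictionary returned by apply() are hypothetical names invented for this illustration; only the 'SUCCESS' and 'FAILURE' state strings are borrowed from the result examples in the User Guide.

```python
class MiniTask(object):
    """Hypothetical stand-in for a task object; not tatoo's class."""

    def __init__(self, env, fun, name):
        self.env = env      # the owning environment is passed explicitly
        self.fun = fun
        self.name = name

    def apply(self, args=(), kwargs=None):
        # Run the wrapped callable and report the outcome instead of raising.
        try:
            value = self.fun(*args, **(kwargs or {}))
        except Exception as exc:
            return {'state': 'FAILURE', 'result': exc}
        return {'state': 'SUCCESS', 'result': value}


class MiniEnvironment(object):
    """Hypothetical container that owns its tasks (no global state)."""

    def __init__(self, name=None):
        self.name = name
        self.tasks = {}

    def task(self, fun):
        # Replace the function with a task object and remember it by name.
        task = MiniTask(self, fun, fun.__name__)
        self.tasks[task.name] = task
        return task


env = MiniEnvironment('myenv')


@env.task
def hello():
    return 'Hello, world!'


outcome = hello.apply()
print(outcome['state'], outcome['result'])
```

After decoration, the module-level name hello no longer refers to the plain function but to the task object, which is why hello.apply() works and why the same object can be found in env.tasks.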
2.3 User Guide

2.3.1 Environment

Tatoo uses a central object called the runtime Environment. Instances of this class are used to store the configuration and global objects, for example, tasks. The simplest way to create an environment instance looks like this:

    from tatoo import Environment
    env = Environment()

Environment instances are thread safe (they do not share any global state), so multiple environments with different settings, tasks and extensions can co-exist in the same process space.

Environment Name

To make it easier to identify which environment is currently used, you should specify names for environments:

    env = Environment('myenv')

The environment name will be included in logs like this:

    [2015-01-29 00:42:23,206: INFO | env: myenv] Task add [...] succeeded in 0.000219s: 3

Settings

Sometimes you will need to make your application configurable, which basically means that you will need a globally accessible configuration storage. There is one you can use:

    env.settings['SOMEKEY'] = 'somevalue'

Although it provides a mapping interface, you can also use attribute access:

    print(env.settings.SOMEKEY)  # somevalue

Several keys may be updated at once using the update() method:

    env.settings.update(
        ONEKEY=True,
        TWOKEY=False,
    )

New default values can be added using the add_defaults() method:

    env.settings.add_defaults(OTHERKEY=False)

Censored Settings

If you ever want to print out the configuration, as debugging information or similar, you may also want to filter out sensitive information like passwords and API keys using the humanize() method:

    env.settings.humanize()

If you want to get the humanized dictionary instead, consider using the table() method:
    env.settings.table()

Please note that tatoo will not be able to remove all sensitive information, as it merely uses a regular expression to search for commonly named keys. If you add custom settings containing sensitive information, you should name the keys using a name that tatoo identifies as secret. A configuration setting will be censored if the name contains any of these substrings: API, TOKEN, KEY, SECRET, PASS, SIGNATURE, DATABASE

Breaking The Chain

Tatoo avoids global variables by following a practice named “object chain” or, specifically, “env chain”. The idea is to pass the env instance explicitly to every single object that needs it instead of having a global variable. To become a chain link, a class should conform to the following rules:

1. One must be able to set the env attribute directly, and
2. __init__ must accept an env argument.

Here is an example of a class which follows the rules:

    class EnvCompatibleClass(object):
        env = None

        def __init__(self, env=None):
            self.env = env or self.env

This approach makes it possible to avoid a shared global env registry; the env attribute should be accessed instead.

Environment as Registry of Objects

Note: This section requires an understanding of ZCA concepts and is intended for developers. If you’re an end user, you can freely skip this part.

Each environment maintains its own local component registry. In fact, it subclasses zope.interface.registry.Components. You can register any custom interface provider using common methods like registerUtility(). For example, you can register a custom task registry like this:

    from tatoo import Environment
    from zope.interface import implementer_only
    from tatoo.interfaces import ITaskRegistry
    from tatoo.tasks import TaskRegistry

    @implementer_only(ITaskRegistry)
    class MyTaskRegistry(TaskRegistry):
        pass

    myregistry = MyTaskRegistry()
    env = Environment()
    env.registerUtility(myregistry)

To keep examples short, we subclass TaskRegistry directly.
However, an object is only required to conform to the ITaskRegistry interface, so you may write your own implementation from scratch.
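The idea that any object conforming to an interface can stand in for the default one can be illustrated without ZCA. The sketch below is a deliberately simplified stand-in written for Python 3: it uses the standard abc module instead of zope.interface, and ITaskStore, DictTaskStore and MiniRegistry are hypothetical names that belong neither to tatoo nor to zope. It only shows the shape of the pattern, under those assumptions.

```python
import abc


class ITaskStore(abc.ABC):
    """Hypothetical interface: what a task store must be able to do."""

    @abc.abstractmethod
    def register(self, name, task):
        """Remember task under name."""

    @abc.abstractmethod
    def get(self, name):
        """Return the task registered under name."""


class DictTaskStore(ITaskStore):
    """A from-scratch implementation; it inherits no behavior."""

    def __init__(self):
        self._tasks = {}

    def register(self, name, task):
        self._tasks[name] = task

    def get(self, name):
        return self._tasks[name]


class MiniRegistry(object):
    """Hypothetical per-environment registry keyed by interface."""

    def __init__(self):
        self._utilities = {}

    def register_utility(self, component, provided):
        # Refuse objects that do not conform to the declared interface.
        if not isinstance(component, provided):
            raise TypeError('component does not provide the interface')
        self._utilities[provided] = component

    def get_utility(self, provided):
        return self._utilities[provided]


registry = MiniRegistry()
registry.register_utility(DictTaskStore(), ITaskStore)

# Consumers ask for the interface, never for a concrete class.
store = registry.get_utility(ITaskStore)
store.register('hello', lambda: 'Hello, world!')
print(store.get('hello')())
```

Because lookups go through the interface rather than a concrete class, the code that consumes the store never needs to know which implementation was registered.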
The same rule applies to every object tatoo looks up: it only has to conform to the corresponding interface. You can find all interfaces that tatoo defines in tatoo.interfaces.

2.3.2 Tasks

Tasks are callables with some helper wrappers around them. To define a task, you need two things:

• an environment instance
• and a callable.

Then you can easily create a task from this callable using the env.task decorator:

    from tatoo import Environment

    env = Environment(__name__)

    @env.task
    def hello():
        print('Hello, world!')

Tatoo will create a specially formed class and instantiate it in place, and the callable will be swapped with this instance.

Names

Every task must be associated with a unique name. You can specify one using the name argument:

    @env.task(name='world')
    def hello():
        print('Hello, world!')

You can also specify a factory accepting two arguments, the environment instance and the wrapped callable:

    @env.task(name=lambda env, fun: fun.__name__)
    def hello():
        print('Hello, world!')

If you omit name, it will be generated automatically. The current behavior is to simply take the callable’s __name__. You can tell the name of a task by inspecting its name attribute:

    >>> hello.name
    'hello'

Names can be used to get tasks from the task registry:

    >>> env.tasks['hello']

Parameters

As normal Python callables, tasks may have arguments. In the simplest case you can declare a parametrized task like this:
    @env.task
    def copy(src, dst, recursive=False):
        print('Copying {src} to {dst}{postfix}'.format(
            src=src, dst=dst,
            postfix=' recursively' if recursive else ''
        ))

However, there are problems with this approach. One of them is that you often want to validate incoming values. Given the example above,

• src must be a string representing a path; this path must exist, it must be resolvable (i.e. symlinks must be resolved) and it must be readable,
• dst, similarly to src, must be a string representing a path; it does not have to exist, but it must be writable,
• the recursive argument acts like a flag and must be a bool,
• finally, if recursive is False, src must not be a directory.

Of course, you can manually validate arguments in the task body, but it will make your task a little harder to understand and to maintain. It will be difficult to say which rules must be followed for a valid argument.

Another problem is that if you want to build a command-line interface and call your tasks with it, there is no transparent rule to map arguments defined in the Python code to command-line arguments. Let’s start with the obvious one:

    cli copy -r/--recursive SRC DST

However, nothing prevents this mapping either:

    cli copy SRC DST [RECURSIVE]

It becomes even more ambiguous when you have options with parameters, like:

    @env.task
    def copy(src, dst, recursive=False, backup=None):
        if backup is not None:
            backup = ' [backing up to {0}]'.format(backup)
        print('Copying {src} to {dst}{postfix}{backup}'.format(
            src=src, dst=dst,
            postfix=' recursively' if recursive else '',
            backup=backup or '',
        ))

Now it is completely impossible to say programmatically whether backup should be:

• a boolean flag, or
• an option accepting a path, or
• an optional argument.

There is one more argument that makes automatically generating command-line options and arguments from the task signature impracticable.
Consider the following example:

    import sys

    @env.task
    def echo(message, options=None, outfile=None):
        file = sys.stderr if outfile is None else outfile
        if options is not None:
            if options['colored']:
                # colored() stands for any helper that adds color codes
                message = colored(message)
        print(message, file=file)

For the options argument we generate the short flag -o. For outfile we can’t use -o because it is already taken, so we take the next letter: -u. The software evolves, and after a while you decide to swap outfile and options in the task signature. Now all your scripts are broken, because the -o flag will be used for outfile, options will use -p, and the -u flag is gone.

Tatoo solves all these problems with the parameter() decorator:

    from tatoo import parameter
    from tatoo.task import types

    def validate_path(path, arguments):
        if not arguments['recursive']:
            path.dir_ok = False
        path(arguments['src'])

    @env.task
    @parameter('SOURCE', 'src',
               type=types.Path(exists=True, file_ok=True, dir_ok=True,
                               readable=True, resolve_path=True,
                               validator=validate_path))
    @parameter('DEST', 'dst', type=types.Path(writable=True))
    @parameter('-R', '-r', '--recursive',
               help='Copy directories recursively', is_flag=True)
    def copy(src, dst, recursive):
        """Copy SOURCE to DEST."""

Now you can unambiguously say that src maps to the SOURCE command-line argument, dst maps to DEST, and recursive is a boolean flag that can be specified as the short options -R and -r and as the long option --recursive. All basic validation happens in the types module, and you can specify additional validation using the validator argument as shown in the example above. You can inspect all registered parameters via the parameters attribute.

Note: Tatoo itself does not provide a command-line interface. However, one can be implemented as an external package, so tatoo must provide a way to create command-line interfaces unambiguously. Parameters are also useful without involving a CLI: for example, tatoo performs type checks before running tasks when the TATOO_VALIDATE_PARAMS setting is True.

Execution & Results

To execute a task, use the apply() method:

    >>> from tatoo import Environment
    >>> env = Environment('test')
    >>> @env.task
    ... def add(x, y):
    ...     return x + y
    ...
    >>> res = add.apply(args=(1, 2))

Note that the returned value is an instance of the EagerResult class:

    >>> res

It is a convenient result wrapper that allows you to inspect various metrics:

    >>> res.result
    3
    >>> res.state
    'SUCCESS'
    >>> res.runtime
    4.540802910923958e-05

Let’s make our task raise a TypeError exception:

    >>> res = add.apply(args=(1, '2'))
    >>> res.failed()
    True
    >>> res.result
    TypeError("unsupported operand type(s) for +: 'int' and 'str'",)
    >>> print(res.traceback)
    Traceback (most recent call last):
      File "/home//tatoo/task/trace.py", line 107, in trace_task
        **request['kwargs'])
      File "", line 3, in add
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> res.state
    'FAILURE'

You can also propagate exceptions like this:

    >>> res = add.apply(args=(1, '2'))
    >>> res.get()
    Traceback (most recent call last):
      File "", line 1, in
      File "/home//tatoo/task/result.py", line 41, in get
        self.maybe_reraise()
      File "/home//tatoo/task/result.py", line 58, in maybe_reraise
        raise self.result
      File "/home//tatoo/task/trace.py", line 107, in trace_task
        **request['kwargs'])
      File "/home//test.py", line 8, in add
        return x + y
    TypeError: unsupported operand type(s) for +: 'int' and 'str'

apply() also accepts a number of extra parameters which form the execution request:

    >>> res = add.apply(args=(1,), kwargs={'y': 2}, request_id='someid')
    >>> res.result
    3
    >>> res.request_id
    'someid'

2.3.3 Extensions

Tatoo follows the microservice architecture by dividing different components into third party packages. This allows tatoo itself to be small, simple and testable.

It is very easy to write custom extensions. Let’s create one in a maths.py file:

    from tatoo.extension import Extension

    ext = Extension(__name__, version='0.1.0')

    @ext.task
    def add(x, y):
        return x + y

    @ext.task
    def sub(x, y):
        return x - y

    @ext.task
    def mul(x, y):
        return x * y

    @ext.task
    def div(x, y):
        return x / y

Task names

In the task names section you learned that each task has a unique name, and that by default it is generated from the callable’s name. This is not true for tasks defined in extensions. The default behavior here is to prepend the extension name (and a dot), so that the add task defined in the math extension gets the name math.add by default.

You will also need a simple setup.py:

    from setuptools import setup

    setup(
        name='math-tasks',
        version='0.1.0',
        py_modules=['maths'],
        entry_points="""
        [tatoo.extensions]
        maths = maths:ext
        """
    )

Now you’re able to install this extension using pip:

    $ pip install .

Let’s make sure that our math extension is loaded and usable:

    >>> from tatoo import Environment
    >>> env = Environment('testenv')
    >>> print(list(env.extensions))
    ['math']
    >>> res = env.tasks['math.add'].apply(args=(1, 2))
    >>> res.result
    3

2.3.4 Logging

TODO
2.3.5 Signals

TODO