Writing a custom Specifier Type
While glom comes with a lot of built-in features, no library can ever encompass all data manipulation operations.
To cover every case out there, glom provides a way to extend its functionality with your own data handling hooks. This document explains glom’s execution model and how to integrate with it when writing a custom Specifier Type.
When to write a Specifier Type
glom
has always supported arbitrary callables, like so:
glom({'nums': range(5)}, ('nums', sum))
# 10
With this built-in extensibility, what does a glom specifier type add?
Custom specifier types are useful when you want to:
Perform validation at spec construction time
Enable users to interact with new target types and operations
Improve readability and reusability of your data transformations
Temporarily change the glom runtime behavior
If you’re just building a one-off spec for transforming your own data,
there’s no reason to reach for an extension. glom
’s extension API
is easy, but a good old Python lambda
is even easier.
Building your Specifier Type
Any object instance with a glomit
method can participate in a glom
call. By way of example, here is a programming cliché implemented as a
glom specifier type, with comments referencing notes below.
class HelloWorldSpec(object): # 1
def glomit(self, target, scope): # 2
print("Hello, world!")
return target
And now let’s put it to use!
from glom import glom
target = {'example': 'object'}
glom(target, HelloWorldSpec()) # 3
# prints "Hello, world!" and returns target
There are a few things to note from this example:
Specifier types do not need to inherit from any type. Just implement the
glomit
method.The
glomit
signature takes two parameters,target
andscope
. Thetarget
should be familiar from usingglom()
, and it’s thescope
that makes glom really tick.By convention, instances are used in specs passed to
glom()
calls, not the types themselves.
The glom Scope
The glom scope is also used to expose runtime state to the specifier type. Let’s take a look inside a scope:
from glom import glom
from pprint import pprint
class ScopeInspectorSpec(object):
def glomit(self, target, scope):
pprint(dict(scope))
return target
glom(target, ScopeInspectorSpec())
Which gives us:
{T: {'example': 'object'},
<function glom at 0x7f208984d140>: <function _glom at 0x7f208984d5f0>,
<class 'glom.core.Path'>: [],
<class 'glom.core.Spec'>: <__main__.ScopeInspectorSpec object at 0x7f208bf58690>,
<class 'glom.core.Inspect'>: None,
<class 'glom.core.TargetRegistry'>: <glom.core.TargetRegistry object at 0x7f208984b4d0>}
As you can see, all glom’s core workings are present, all under familiar keys:
The current target, accessible using
T
as a scope key.The current spec, accessible under
Spec
.The current path, accessible under
Path
.The
TargetRegistry
, used to register new operations and target types.Even the
glom()
function itself, filed underglom()
.
To learn how to use the scope’s powerful features idiomatically, let’s reimplement at one of glom’s standard specifier types.
Specifiers by example
While we’ve technically created a couple of extensions above, let’s really dig into the features of the scope using an example.
Sum
is a standard extension that ships with glom, and
it works like this:
from glom import glom, Sum
glom([1, 2, 3], Sum())
# 6
The version below does not have as much error handling, but reproduces
all the same basic principles. This version of Sum()
code also
contains comments with references to explanatory notes below.
from glom import glom, Path, T
from glom.core import TargetRegistry, UnregisteredTarget # 1
class Sum(object):
def __init__(self, subspec=T, init=int): # 2
self.subspec = subspec
self.init = init
def glomit(self, target, scope):
if self.subspec is not T:
target = scope[glom](target, self.subspec, scope) # 3
try:
# 4
iterate = scope[TargetRegistry].get_handler('iterate', target, path=scope[Path])
except UnregisteredTarget as ut:
# 5
raise TypeError('can only %s on iterable targets, not %s type (%s)'
% (self.__class__.__name__, type(target).__name__, ut))
try:
iterator = iterate(target)
except Exception as e:
raise TypeError('failed to iterate on instance of type %r at %r (got %r)'
% (target.__class__.__name__, Path(*scope[Path]), e))
return self._sum(iterator)
def _sum(self, iterator): # 6
ret = self.init()
for v in iterator:
ret += v
return ret
Now, let’s take a look at the interesting parts, referencing the comments above:
Specifier types often reference the
TargetRegistry
, which is not part of the top-levelglom
API, and must be imported fromglom.core
. More on this in #4.Specifier type
__init__
methods may take as many or as few arguments as desired, but many glom specifier types take a first parameter of a subspec, meant to be fetched right before the actual specifier’s operation. This helps readability of glomspecs. SeeCoalesce
for an example of this idiom.Specifier types should not reference the
glom()
function directly, instead use theglom()
function as a key to thescope
map to get the currently activeglom()
. This ensures that the extension type is compatible with advanced specifier types which override theglom()
function.To maximize compatiblity with new target types,
glom
allows new types and operations to be registered with theTargetRegistry
. Specifier types should respect this by contextually fetching these standard operators as demonstrated above. At the time of writing, the primary operators used by glom itself are"get"
,"iterate"
,"keys"
,"assign"
, and"delete"
.In the event that the current target does not support your Specifier type’s desired operation, it’s customary to raise a helpful error. Consider creating your own exception type and inheriting from
GlomError
.Specifier types may have other methods and members in addition to the primary
glomit()
method. This_sum()
method implements most of the core of our custom specifier type.
Check out the implementation of the real glom.Sum()
specifier for more details.
Summing up
glom
Specifier Types are more than just add-ons; the extension
architecture is how most of glom
itself is implemented. Build
knowing that the paradigm is as powerful as anything built-in.
If you need more examples, another simple one can be found in
this snippet. glom
’s source code itself
contains many specifiers more advanced than the above. Simply search
the codebase for glomit()
methods and you will find no shortage.
Happy extending!