Flywheel is a library for mapping Python objects to DynamoDB tables. It uses a SQLAlchemy-like syntax for queries.
Code lives here: https://github.com/mathcamp/flywheel
Flywheel can be installed with pip:

```
pip install flywheel
```
Here are the steps to set up a simple example model with flywheel:
```python
# Take care of some imports
from datetime import datetime
from flywheel import Model, Field, Engine

# Set up our data model
class Tweet(Model):
    userid = Field(hash_key=True)
    id = Field(range_key=True)
    ts = Field(data_type=datetime, index='ts-index')
    text = Field()

    def __init__(self, userid, id, ts, text):
        self.userid = userid
        self.id = id
        self.ts = ts
        self.text = text

# Create an engine and connect to an AWS region
engine = Engine()
engine.connect_to_region('us-east-1',
                         aws_access_key_id=<YOUR AWS ACCESS KEY>,
                         aws_secret_access_key=<YOUR AWS SECRET KEY>)

# Register our model with the engine so it can create the Dynamo table
engine.register(Tweet)

# Create the dynamo table for our registered model
engine.create_schema()
```
Now that you have your model, your engine, and the Dynamo table, you can begin adding tweets:
```python
tweet = Tweet('myuser', '1234', datetime.utcnow(), text='@awscloud hey '
              'I found this cool new python library for AWS...')
engine.save(tweet)
```
To get data back out, query it using the engine:
```python
# Get the 10 most recent tweets by 'myuser'
recent = engine.query(Tweet)\
        .filter(Tweet.ts <= datetime.utcnow(), userid='myuser')\
        .limit(10).all(desc=True)

# Get a specific tweet by a user
tweet = engine.query(Tweet).filter(userid='myuser', id='1234').first()
```
Since DynamoDB has no schema, you can set arbitrary fields on the tweets:
```python
tweet = Tweet('myuser', '1234', datetime.utcnow(), text='super rad')
tweet.link = 'http://drmcninja.com'
tweet.retweets = 0
engine.save(tweet)
```
If you want to change a field, just make the change and sync it:
```python
tweet.link = 'http://www.smbc-comics.com'
tweet.sync()
```
That’s enough to give you a taste. The rest of the docs have more information on creating models, writing queries, and how updates work.
This is what a model looks like:
```python
class Tweet(Model):
    userid = Field(hash_key=True)
    id = Field(range_key=True)
    ts = Field(data_type=datetime, index='ts-index')
    text = Field()
```
The model declares the fields an object has, their data types, and the schema of the table.
Since DynamoDB is a NoSQL database, you can attach arbitrary additional fields (undeclared fields) to the model, and they will be stored appropriately. For example, this tweet doesn’t declare a retweets field, but you could assign it anyway:
```python
tweet.retweets = 7
tweet.sync()
```
Undeclared fields will not be saved if they begin or end with an underscore. This is intentional behavior so you can set local-only variables on your models.
```python
tweet.retweets = 7                       # this is saved to Dynamo
tweet._last_updated = datetime.utcnow()  # this is NOT saved to Dynamo
```
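The underscore rule can be sketched as a simple predicate. Note that `is_persisted` is a hypothetical helper written for illustration, not part of flywheel's API:

```python
def is_persisted(name):
    # Mimic the documented rule: undeclared fields that begin or end
    # with an underscore are never saved to Dynamo
    return not (name.startswith('_') or name.endswith('_'))

print(is_persisted('retweets'))        # True
print(is_persisted('_last_updated'))   # False
```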
Since models define the schema of a table, you can use them to create or delete tables. Every model has a meta_ field attached to it which contains metadata about the model. This metadata object has the create and delete methods.
```python
from boto.dynamodb2 import connect_to_region

connection = connect_to_region('us-east-1')
Tweet.meta_.create_dynamo_schema(connection)
```
You can also register your models with the engine and create all the tables at once:
```python
engine.register(User, Tweet, Message)
engine.create_schema()
```
DynamoDB supports three different data types: STRING, NUMBER, and BINARY. It also supports sets of these types: STRING_SET, NUMBER_SET, BINARY_SET.
You can use these values directly for the model declarations, though they require an import:
```python
from flywheel import Model, Field, STRING, NUMBER

class Tweet(Model):
    userid = Field(data_type=STRING, hash_key=True)
    id = Field(data_type=STRING, range_key=True)
    ts = Field(data_type=NUMBER, index='ts-index')
    text = Field(data_type=STRING)
```
There are other settings for data_type that are represented by python primitives. Some of them (like unicode) are functionally equivalent to the DynamoDB option (STRING). Others, like int, enforce an additional application-level constraint on the data. Each option works transparently, so a datetime field would be set with datetime objects and you could query against it using other datetimes.
Below is a table of python types, how they are stored in DynamoDB, and any special notes. For more information, the code for data types is located in types.
| Type | Dynamo Type | Description |
|---|---|---|
| unicode | STRING | Basic STRING type. This is the default for fields |
| str | BINARY | |
| int | NUMBER | Enforces integer constraint on data |
| float | NUMBER | |
| set | *_SET | This will use the appropriate type of DynamoDB set |
| bool | NUMBER | |
| datetime | NUMBER | datetimes MUST be provided in UTC |
| date | NUMBER | dates MUST be provided in UTC |
| Decimal | NUMBER | If you need decimal precision in your application |
| dict | STRING | Stored as json-encoded string |
| list | STRING | Stored as json-encoded string |
| S3Type | STRING | Stores the S3 key path as a string |
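The table above says datetimes are stored as Dynamo NUMBERs. A plausible sketch of that conversion (illustrative only, not flywheel's actual code) is a UTC datetime round-tripped through a unix timestamp:

```python
import calendar
from datetime import datetime

def dt_to_number(dt):
    # UTC datetime -> seconds since epoch, keeping microseconds
    # as a fractional part
    return calendar.timegm(dt.utctimetuple()) + dt.microsecond / 1e6

def number_to_dt(ts):
    # Seconds since epoch -> naive UTC datetime
    return datetime.utcfromtimestamp(ts)

stamp = dt_to_number(datetime(2014, 1, 1, 12, 0, 0))
print(stamp)  # 1388577600.0
```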
If you attempt to set a field with a type that doesn’t match, it will raise a TypeError. If a field was created with coerce=True it will first attempt to convert the value to the correct type. This means you could set an int field with the value "123" and it would perform the conversion for you.
Note
Certain fields will auto-coerce specific data types. For example, a str field will auto-encode a unicode to utf-8 even if coerce=False. Similarly, a unicode field will auto-decode a str value to a unicode string.
Warning
If an int field is set to coerce values, it will still refuse to drop floating point data. This has the following effect:
```python
>>> class Game(Model):
...     title = Field(hash_key=True)
...     points = Field(data_type=int, coerce=True)
>>> mygame = Game()
>>> mygame.points = 1.8
ValueError: Field 'points' refusing to convert 1.8 to int! Results in data loss!
```
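The rule in the warning can be reproduced in isolation. `coerce_int` below is a hypothetical helper, not flywheel's actual function; it converts only when no data would be lost:

```python
def coerce_int(value):
    # Convert to int, but refuse conversions that drop data
    # (mirrors the coerce=True behavior described above)
    converted = int(value)
    if converted != float(value):
        raise ValueError(
            "refusing to convert %r to int! Results in data loss!" % (value,))
    return converted

print(coerce_int("123"))  # 123
print(coerce_int(2.0))    # 2
# coerce_int(1.8) raises ValueError
```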
If you define a set field with no additional parameters Field(data_type=set), flywheel will ensure that the field is a set, but will perform no type checking on the items within the set. This should work fine for basic uses when you are storing a number or string, but sets are able to contain any data type listed in the table above (and any custom type you declare). All you have to do is specify it in the data_type like so:
```python
from flywheel import Model, Field, set_
from datetime import date

class Location(Model):
    name = Field(hash_key=True)
    events = Field(data_type=set_(date))
```
If you don’t want to import set_, you can use an equivalent expression with the python frozenset builtin:
```python
events = Field(data_type=frozenset([date]))
```
You can use S3Type to quickly and easily reference S3 values from your model objects. This type will store the S3 key in Dynamo and put a Key object in your model.
```python
from flywheel.fields.types import S3Type

class Image(Model):
    user = Field(hash_key=True)
    name = Field(range_key=True)
    taken = Field(data_type=datetime, index='taken-index')
    data = Composite('user', 'name', data_type=S3Type('my_image_bucket'),
                     merge=lambda *a: '/'.join(a))

    def __init__(self, user, name, taken):
        self.user = user
        self.name = name
        self.taken = taken
```
You can use this class like so:
```python
>>> img = Image('Rob', 'Big Sur.jpg', datetime.utcnow())
>>> img.data.set_contents_from_filename(img.name)
>>> engine.save(img)
```
It will store the image data in the S3 bucket named my_image_bucket and use the path Rob/Big Sur.jpg. See Composite Fields for more about how the key path is generated.
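The key path comes straight from the merge function in the model above; it can be exercised on its own:

```python
# The merge lambda from the Image model, applied outside the model:
# join the composite subfield values with '/'
merge = lambda *a: '/'.join(a)

print(merge('Rob', 'Big Sur.jpg'))  # Rob/Big Sur.jpg
```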
You can define your own custom data types and make them available across all of your models. All you need to do is create a subclass of TypeDefinition. Let’s make a type that will store any python object in pickled format.
```python
from flywheel.fields.types import TypeDefinition, BINARY, Binary
import cPickle as pickle

class PickleType(TypeDefinition):
    data_type = pickle      # name you use to reference this type
    aliases = ['pickle']    # alternate names that reference this type
    ddb_data_type = BINARY  # data type of the field in dynamo

    def coerce(self, value, force):
        # Perform no type checking because we can pickle ANYTHING
        return value

    def ddb_dump(self, value):
        # Pickle and convert to a Binary object for boto
        return Binary(pickle.dumps(value))

    def ddb_load(self, value):
        # Convert from a Binary object and unpickle
        return pickle.loads(value.value)
```
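The dump/load symmetry of this type can be checked in isolation. `FakeBinary` below is a minimal stand-in for boto's Binary wrapper so the sketch runs without boto; it is not part of flywheel:

```python
import pickle

class FakeBinary(object):
    # Minimal stand-in for boto's Binary: wraps raw bytes in .value
    def __init__(self, value):
        self.value = value

def ddb_dump(value):
    # Same shape as PickleType.ddb_dump above
    return FakeBinary(pickle.dumps(value))

def ddb_load(binary):
    # Same shape as PickleType.ddb_load above
    return pickle.loads(binary.value)

obj = {'nested': [1, 2, 3]}
print(ddb_load(ddb_dump(obj)) == obj)  # True
```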
Now that you have your type definition, you can either use it directly in your code:
```python
class MyModel(Model):
    myobj = Field(data_type=PickleType())
```
Or you can register it globally and reference it by its data_type or any aliases that were defined.
```python
from flywheel.fields.types import register_type

register_type(PickleType)

class MyModel(Model):
    myobj = Field(data_type='pickle')
```
There are four main key concepts to understanding a DynamoDB table:

- Hash key: This field will be sharded. Pick something with relatively random access (e.g. userid is good, timestamp is bad).
- Range key: Optional. This field will be indexed, so you can query against it (within a specific hash key). The hash key and range key together make the Primary key, which is the unique identifier for each object.
- Local Secondary Indexes: Optional, up to 5. You may only use these if your table has a range key. These fields are indexed in a similar fashion as the range key. You may also query against them within a specific hash key. You can think of these as a range key with no uniqueness requirements.
- Global Secondary Indexes: Optional, up to 5. These indexes have a hash key and optional range key, and can be put on any declared field. This allows you to shard your tables by more than one value.

For additional information on table design, read the AWS docs on best practices.
Example declaration of hash and range key:
```python
class Tweet(Model):
    userid = Field(hash_key=True)
    ts = Field(data_type=datetime, range_key=True)
```
For this version of a Tweet, each (userid, ts) pair is a unique value.
Indexes also have a Projection Type. Creating an index requires duplicating some amount of data in storage, and the projection type allows you to optimize how much additional storage is used. The projection types are:

- All: All fields are projected into the index.
- Keys only: Only the primary key and indexed keys are projected into the index.
- Include: Like the “Keys only” projection, but allows you to specify additional fields to project into the index.
This is how it looks in the model declaration:
```python
class Tweet(Model):
    userid = Field(hash_key=True)
    id = Field(range_key=True)
    ts = Field(data_type=datetime).all_index('ts-index')
    retweets = Field(data_type=int).keys_index('rt-index')
    likes = Field(data_type=int).include_index('like-index', ['text'])
    text = Field()
```
The default index projection is “All”, so you could replace the ts field above with:
```python
ts = Field(data_type=datetime, index='ts-index')
```
Like their Local counterparts, Global Secondary Indexes can specify a projection type. Unlike their Local counterparts, Global Secondary Indexes are provisioned with a separate read/write throughput from the base table. This can be specified in the model declaration. Here are some examples below:
```python
from flywheel import Model, Field, GlobalIndex

class Tweet(Model):
    __metadata__ = {
        'global_indexes': [
            GlobalIndex.all('ts-index', 'city', 'ts')
                .throughput(read=10, write=2),
            GlobalIndex.keys('rt-index', 'city', 'retweets')
                .throughput(read=10, write=2),
            GlobalIndex.include('like-index', 'city', 'likes',
                                includes=['text'])
                .throughput(read=10, write=2),
        ],
    }
    userid = Field(hash_key=True)
    city = Field()
    id = Field(range_key=True)
    ts = Field(data_type=datetime)
    retweets = Field(data_type=int)
    likes = Field(data_type=int)
    text = Field()
```
If you want more on indexes, check out the AWS docs on indexes.
Composite fields allow you to create fields that are combinations of multiple other fields. Suppose you’re creating a table where you plan to store a collection of social media items (tweets, facebook posts, instagram pics, etc). If you make the hash key the id of the item, there is the remote possibility that a tweet id will collide with a facebook id. Here is the solution:
```python
class SocialMediaItem(Model):
    userid = Field(hash_key=True)
    type = Field()
    id = Field()
    uid = Composite('type', 'id', range_key=True)
```
This will automatically generate a uid field from the values of type and id. For example:
```python
>>> item = SocialMediaItem(type='facebook', id='12345')
>>> print item.uid
facebook:12345
```
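The default behavior can be sketched without a model at all. `default_merge` below is an illustrative stand-in for what the docs describe (joining subfield values with ':'), not flywheel's internal function:

```python
def default_merge(*values):
    # Default Composite behavior per the docs: join subfields with ':'
    return ':'.join(values)

print(default_merge('facebook', '12345'))  # facebook:12345
```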
Note that setting a Composite field just doesn’t work:
```python
>>> item.uid = 'ILikeThisIDBetter'
>>> print item.uid
facebook:12345
```
By default, a Composite field simply joins its subfields with a ':'. You can change that behavior for fancier applications:
```python
def score_merge(likes, replies, deleted):
    if deleted:
        return None
    return likes + 5 * replies

class Post(Model):
    userid = Field(hash_key=True)
    id = Field(range_key=True)
    likes = Field(data_type=int)
    replies = Field(data_type=int)
    deleted = Field(data_type=bool)
    score = Composite('likes', 'replies', 'deleted', data_type=int,
                      merge=score_merge, index='score-index')
```
Now when you update the likes or replies count, the score will automatically change, re-ordering the post in the index that you created. If you then mark the post as “deleted”, the score field is removed, which removes the post from the index.
Whooooaaahh...
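Tracing the merge function by hand shows both behaviors: a live post gets a score, and a deleted one returns None (which drops the field, and with it the index entry):

```python
def score_merge(likes, replies, deleted):
    # Same function as in the Post model above
    if deleted:
        return None
    return likes + 5 * replies

print(score_merge(3, 2, False))  # 13
print(score_merge(3, 2, True))   # None -> field removed, drops out of index
```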
The last neat little thing about Composite fields is how you can query them. For numeric Composite fields you probably want to query directly on the score like any other field. But if you’re merging strings like with SocialMediaItem, it can be cleaner to refer to the component fields themselves:
```python
>>> fb_post = engine.query(SocialMediaItem).filter(userid='abc123',
...                                                type='facebook',
...                                                id='12345').first()
```
The engine will automatically detect that you’re trying to query on the range key, and construct the uid from the pieces you provided.
Part of the model declaration is the __metadata__ attribute, which is a dict that configures the Model.meta_ object. Models will inherit and merge the __metadata__ fields from their ancestors. Keys that begin with an underscore will not be merged. For example:
```python
class Vehicle(Model):
    __metadata__ = {
        '_name': 'all-vehicles',
        'throughput': {
            'read': 10,
            'write': 2,
        }
    }

class Car(Vehicle):
    pass
```

```python
>>> print Car.__metadata__
{'throughput': {'read': 10, 'write': 2}}
```
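The merge rule can be sketched as a plain function over dicts. `merge_metadata` is a hypothetical helper that mirrors the documented behavior (inherit parent keys, drop keys beginning with an underscore), not flywheel's internal code:

```python
def merge_metadata(parent, child):
    # Inherit the parent's __metadata__, dropping keys that begin
    # with an underscore, then let the child's keys win
    merged = dict((k, v) for k, v in parent.items()
                  if not k.startswith('_'))
    merged.update(child)
    return merged

vehicle_meta = {'_name': 'all-vehicles',
                'throughput': {'read': 10, 'write': 2}}
print(merge_metadata(vehicle_meta, {}))
# {'throughput': {'read': 10, 'write': 2}}
```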
Below is a list of all the values that may be set in the __metadata__ attribute of a model.
| Key | Type | Description |
|---|---|---|
| _name | str | The name of the DynamoDB table (defaults to class name) |
| _abstract | bool | If True, no DynamoDB table will be created for this model (useful if you just want a class to inherit from) |
| throughput | dict | The table read/write throughput (defaults to {'read': 5, 'write': 5}) |
| global_indexes | list | A list of GlobalIndex objects |
The query syntax is heavily inspired by SQLAlchemy. In DynamoDB, queries must use one of the table’s indexes. Queries may only query across a single hash key. This means that for a query there will always be at least one call to filter which will, at a minimum, set the hash key to search on.
```python
# Fetch all tweets made by a user
engine.query(Tweet).filter(Tweet.userid == 'abc123').all()
```
You may also use inequality filters on range keys and secondary indexes:

```python
# Fetch all tweets made by a user in the past day
earlyts = datetime.utcnow() - timedelta(days=1)
engine.query(Tweet).filter(Tweet.userid == 'abc123',
                           Tweet.ts >= earlyts).all()
```
There are two final statements that will return all results: all() and gen(). Calling all() will return a list of results. Calling gen() will return a generator. If your query will return a large number of results, using gen() can help you avoid storing them all in memory at the same time.
```python
# Count how many retweets a user has in total
retweets = 0
all_tweets = engine.query(Tweet).filter(Tweet.userid == 'abc123').gen()
for tweet in all_tweets:
    retweets += tweet.retweets
```
There are two final statements that retrieve a single item: first() and one(). Calling first() will return the first element of the results, or None if there are no results. Calling one() will return the first element of the results only if there is exactly one result; if there are no results, or more than one, it will raise a ValueError.
```python
# Get a single tweet by a user
tweet = engine.query(Tweet).filter(Tweet.userid == 'abc123').first()

# Get a specific tweet and fail if missing
tweet = engine.query(Tweet).filter(Tweet.userid == 'abc123',
                                   Tweet.id == '1234').one()
```
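The difference between the two can be sketched on a plain list. These are illustrative analogs, not flywheel's implementation:

```python
def first(results):
    # first(): first element, or None when there are no results
    return results[0] if results else None

def one(results):
    # one(): the sole element; anything else is an error
    if len(results) != 1:
        raise ValueError("Expected exactly one result, got %d" % len(results))
    return results[0]

print(first([]))          # None
print(first(['a', 'b']))  # a
print(one(['a']))         # a
# one([]) and one(['a', 'b']) both raise ValueError
```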
You can set a limit() on a query to limit the number of results it returns:
```python
# Get the first 10 tweets by a user after a timestamp
afterts = datetime.utcnow() - timedelta(hours=1)
tweets = engine.query(Tweet).filter(Tweet.userid == 'abc123',
                                    Tweet.ts >= afterts).limit(10).all()
```
One way to delete items from a table is with a query. Calling delete() will delete all items that match a query:
```python
# Delete all of a user's tweets older than 1 year
oldts = datetime.utcnow() - timedelta(days=365)
engine.query(Tweet).filter(Tweet.userid == 'abc123',
                           Tweet.ts < oldts).delete()
```
99% of the time the query engine will be able to automatically detect which local or global secondary index you intend to use. For that 1% of the time when it’s ambiguous, you can manually specify the index. This can also be useful if you want the results to be sorted by a particular index:
```python
# This is the schema for the following example
class Tweet(Model):
    userid = Field(hash_key=True)
    id = Field(range_key=True)
    ts = Field(data_type=datetime, index='ts-index')
    retweets = Field(data_type=int, index='rt-index')
```

```python
# Get the 10 most retweeted tweets for a user
top_ten = engine.query(Tweet).filter(userid='abc123').index('rt-index')\
        .limit(10).all(desc=True)

# Get the 10 most recent tweets for a user
top_ten = engine.query(Tweet).filter(userid='abc123').index('ts-index')\
        .limit(10).all(desc=True)
```
If you want to avoid typing ‘query’ everywhere, you can simply call the engine:
```python
# Long form query
engine.query(Tweet).filter(Tweet.userid == 'abc123').all()

# Abbreviated query
engine(Tweet).filter(Tweet.userid == 'abc123').all()
```
Filter constraints with == can be instead passed in as keyword arguments:
```python
# Abbreviated filter
engine(Tweet).filter(userid='abc123').all()
engine(Tweet).filter(userid='abc123', id='1234').first()
```
You can still pass in other constraints as positional arguments to the same filter:
```python
# Multiple filters in same statement
engine(Tweet).filter(Tweet.ts <= earlyts, userid='abc123').all()
```
Table scans are similar to table queries, but they do not use an index. This means they have to read every item in the table. This is EXTREMELY SLOW. The benefit is that they do not have to filter based on the hash key, and they have a few additional filter arguments that may be used.
```python
# Fetch all tweets ever
alltweets = engine.scan(Tweet).gen()

# Fetch all tweets that tag awscloud
tagged = engine.scan(Tweet).filter(Tweet.tags.contains_('awscloud')).all()

# Fetch all tweets with annoying, predictable text
annoying = set(['first post', 'hey guys', 'LOOK AT MY CAT'])
first = engine.scan(Tweet).filter(Tweet.text.in_(annoying)).all()

# Fetch all tweets with a link
linked = engine.scan(Tweet).filter(Tweet.link != None).all()
```
Since table scans don’t use indexes, you can filter on fields that are not declared in the model. Here are some examples:
```python
# Fetch all tweets that link to wikipedia
educational = engine.scan(Tweet)\
        .filter(Tweet.field_('link').beginswith_('http://wikipedia')).all()

# You can also use the keyword arguments to filter
best_tweets = engine.scan(Tweet)\
        .filter(link='http://en.wikipedia.org/wiki/Morgan_freeman').all()
```
This section covers the operations you can do to save, read, update, and delete items from the database. All of these methods exist on the Engine object and can be called on one or many items. After being saved-to or loaded-from Dynamo, the items themselves will have these methods attached to them as well. For example, these are both valid:
```python
>>> engine.sync(tweet)
>>> tweet.sync()
```
Save the item to Dynamo. This is intended for new items that were just created and need to be added to the database. If you save() an item that already exists in Dynamo, it will raise an exception. You may optionally use save(overwrite=True) to instead clobber existing data and write your version of the item to Dynamo.
```python
>>> tweet = Tweet()
>>> engine.save(tweet)
>>> tweet.text = "Let's replace the whole item"
>>> tweet.save(overwrite=True)
```
Query dynamo to get the most up-to-date version of a model. Clobbers any existing data on the item. To force a consistent read use refresh(consistent=True).
This call is very useful if you query indexes that use an incomplete projection type. The results won’t have all of the item’s fields, so you can call refresh() to get any attributes that weren’t projected onto the index.
```python
>>> tweet = engine.query(Tweet).filter(userid='abc123')\
...     .index('ts-index').first(desc=True)
>>> tweet.refresh()
```
Fetch an item from its primary key fields. This will be faster than a query, but requires you to know the primary key/keys of all items you want fetched.
```python
>>> my_tweet = engine.get(Tweet, userid='abc123', id='1')
```
You can also fetch many at a time:
```python
>>> key1 = {'userid': 'abc123', 'id': '1'}
>>> key2 = {'userid': 'abc123', 'id': '2'}
>>> key3 = {'userid': 'abc123', 'id': '3'}
>>> some_tweets = engine.get(Tweet, [key1, key2, key3])
```
Deletes an item. You may pass in delete(raise_on_conflict=True), which will only delete the item if none of the values have changed since it was read.
```python
>>> tweet = engine.query(Tweet).filter(userid='abc123', id='123').first()
>>> tweet.delete()
```
You may also delete an item from a primary key specification:
```python
>>> engine.delete_key(Tweet, userid='abc123', id='1')
```
And you may delete many at once:
```python
>>> key1 = {'userid': 'abc123', 'id': '1'}
>>> key2 = {'userid': 'abc123', 'id': '2'}
>>> key3 = {'userid': 'abc123', 'id': '3'}
>>> engine.delete_key(Tweet, [key1, key2, key3])
```
Save any fields that have been changed on an item. This will update changed fields in Dynamo and ensure that all fields exactly reflect the item in the database. This is usually used for updates, but it can be used to create new items as well.
```python
>>> tweet = Tweet()
>>> engine.sync(tweet)
>>> tweet.text = "Update just this field"
>>> tweet.sync()
```
Models will automatically detect changes to mutable fields, such as dict, list, and set.
```python
>>> tweet.tags.add('awscloud')
>>> tweet.sync()
```
Since sync does a partial update, it can tolerate concurrent writes of different fields.
```python
>>> tweet = engine.query(Tweet).filter(userid='abc123', id='1234').first()
>>> tweet2 = engine.query(Tweet).filter(userid='abc123', id='1234').first()
>>> tweet.author = "The Pope"
>>> tweet.sync()
>>> tweet2.text = "Mo' money mo' problems"
>>> tweet2.sync()  # it works!
>>> print tweet2.author
The Pope
```
This “merge” behavior is also what happens when you sync() items to create them. If the item to create already exists in Dynamo, that’s fine as long as there are no conflicting fields. Note that this behavior is distinctly different from save(), so make sure you pick the right call for your use case.
If you call sync() on an object that has not been changed, it is equivalent to calling refresh().
If you use sync(raise_on_conflict=True), the sync operation will check that the fields that you’re updating have not been changed since you last read them. This is very useful for preventing concurrent writes.
Note
If you change a key that is part of a composite field, flywheel will force the sync to raise on conflict. This avoids the risk of corrupting the value of the composite field.
DynamoDB supports truly atomic increment/decrement of NUMBER fields. To use this functionality, there is a special call you need to make:
```python
>>> # Increment the number of retweets by 1
>>> tweet.incr_(retweets=1)
>>> tweet.sync()
```
BOOM.
Note
Incrementing a field that is part of a composite field will also force the sync to raise on conflict.
DynamoDB also supports truly atomic add/remove to SET fields. To use this functionality, there is another special call:
```python
>>> # Add two users to the set of tagged users
>>> tweet.add_(tags=set(['stevearc', 'dsa']))
>>> tweet.sync()
```
And to delete:
```python
>>> tweet.remove_(tags='stevearc')
>>> tweet.sync()
```
Note that you can pass in a single value or a set of values to both add_ and remove_.
You can configure the default behavior for each of these endpoints using default_conflict. The default setting will cause sync() to check for conflicts, delete() not to check for conflicts, and save() to overwrite existing values. Check the attribute docs for more options. You can, of course, pass in the argument to the calls directly to override this behavior on a case-by-case basis.
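The defaults described above can be written out as data. This mapping is an illustrative summary of the documented behavior for each default_conflict setting, not an object flywheel exposes:

```python
# Which keyword each default_conflict setting implies for save/sync/delete
DEFAULTS = {
    'update': {
        'save': {'overwrite': True},
        'sync': {'raise_on_conflict': True},
        'delete': {'raise_on_conflict': False},
    },
    'overwrite': {
        'save': {'overwrite': True},
        'sync': {'raise_on_conflict': False},
        'delete': {'raise_on_conflict': False},
    },
    'raise': {
        'save': {'overwrite': False},
        'sync': {'raise_on_conflict': True},
        'delete': {'raise_on_conflict': True},
    },
}

print(DEFAULTS['update']['sync'])  # {'raise_on_conflict': True}
```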
To get started developing flywheel, run the following command:
```
wget https://raw.github.com/mathcamp/devbox/master/devbox/unbox.py && \
python unbox.py git@github.com:mathcamp/flywheel
```
This will clone the repository and install the package into a virtualenv.

The command to run tests is python setup.py nosetests. Most of these tests require DynamoDB Local. There is a nose plugin that will download and run the DynamoDB Local service during the tests. It requires the Java 6/7 runtime, so make sure you have that installed.
Query constraints
Bases: object
A constraint that will be applied to a query or scan
Attributes:

| Attribute | Type | Description |
|---|---|---|
| eq_fields | dict | Mapping of field name to field value |
| fields | dict | Mapping of field name to (operator, value) tuples |
| limit | int | Maximum number of results |
| index_name | str | Name of index to use for a query |
Create a Condition on a field
Parameters: field (str), op (str), other (object)

Returns: condition (Condition)
Force the query to use a certain index
Parameters: name (str)

Returns: condition (Condition)
Index definitions
Bases: object
A global index for DynamoDB
Parameters: name (str), hash_key (str), range_key (str, optional), throughput (dict, optional)
Select which attributes to project into the index
Set the index throughput
Parameters: read (int, optional), write (int, optional)
Notes
This is meant to be used as a chain:
```python
class MyModel(Model):
    __metadata__ = {
        'global_indexes': [
            GlobalIndex('myindex', 'hkey', 'rkey').throughput(5, 2)
        ]
    }
```
Field type definitions
Bases: flywheel.fields.types.TypeDefinition
Binary strings, stored as a str
Bases: flywheel.fields.types.TypeDefinition
Booleans, backed by a Dynamo Number
Bases: flywheel.fields.types.TypeDefinition
Datetimes, stored as a unix timestamp
Bases: flywheel.fields.types.TypeDefinition
Dates, stored as timestamps
Bases: flywheel.fields.types.TypeDefinition
Numerical values that use Decimal in the application layer.
This should be used if you want to work with floats but need the additional precision of the Decimal type.
Bases: flywheel.fields.types.TypeDefinition
Dict types, stored as a json string
Bases: flywheel.fields.types.TypeDefinition
Float values
Bases: flywheel.fields.types.TypeDefinition
Integer values (includes longs)
Bases: boto.s3.key.Key
Subclass of boto S3 key that adds equality operators
Bases: flywheel.fields.types.TypeDefinition
List types, stored as a json string
Bases: flywheel.fields.types.TypeDefinition
Any kind of numerical value
Bases: flywheel.fields.types.TypeDefinition
Store a link to an S3 key
Parameters: bucket (str), scheme (str, optional)
Register an S3 connection scheme
The connection scheme is a collection of keyword arguments that will be passed to connect_s3() when creating a S3 connection.
Parameters: name (str), **kwargs (dict)
Bases: flywheel.fields.types.TypeDefinition
Set types
Bases: flywheel.fields.types.TypeDefinition
String values, stored as unicode
Bases: object
Base class for all Field types
Attributes:

| Attribute | Type | Description |
|---|---|---|
| data_type | | |
| aliases | list | |
| ddb_data_type | | |
| mutable | bool | |
| allowed_filters | set | The set of filters that can be used on this field type |
Check the type of a value and possibly convert it
Parameters: value (object), force (bool)

Returns: value (object)

Raises: TypeError or ValueError
Field declarations for models
Bases: flywheel.fields.Field
A field that is composed of multiple other fields
Parameters: *fields (list), hash_key (bool, optional), range_key (bool, optional), index (str, optional), data_type (str, optional), coerce (bool, optional), check (callable, optional), merge (callable, optional)
Bases: object
Declarative way to specify model fields
Parameters: hash_key (bool, optional), range_key (bool, optional), index (str, optional), data_type (object, optional), coerce (bool, optional), check (callable, optional), default (object, optional)
Notes
```python
Field(index='my-index')
```

is shorthand for:

```python
Field().all_index('my-index')
```
Attributes:

| Attribute | Type | Description |
|---|---|---|
| name | str | The name of the attribute on the model |
| model | class | The Model this field is attached to |
| composite | bool | True if this is a composite field |
Index this field and project all attributes
Parameters: name (str)
Create a query condition that this field must be between two values (inclusive)
Poetic version of between_()
Check if the provided fields are enough to fully resolve this field
Parameters: fields (list or set)

Returns: needed (set)
Index this field and project selected attributes
Parameters: name (str), includes (list, optional)
Query engine
Bases: object
Query engine for models
Parameters: dynamo (boto.dynamodb2.DynamoDBConnection, optional), namespace (list, optional), default_conflict ({'update', 'overwrite', 'raise'}, optional)
Notes
The engine is used to save, sync, delete, and query DynamoDB. Here is a basic example of saving items:
```python
item1 = MyModel()
engine.save(item1)
item1.foobar = 'baz'
item2 = MyModel()
engine.save([item1, item2], overwrite=True)
```
You can also use the engine to query tables:
```python
user = engine.query(User).filter(User.id == 'abcdef').first()

# calling engine() is a shortcut for engine.query()
user = engine(User).filter(User.id == 'abcdef').first()

d_users = engine(User).filter(User.school == 'MIT',
                              User.name.beginswith_('D')).all()

# You can pass in equality constraints as keyword args
user = engine(User).filter(id='abcdef').first()
```
Scans are like queries, except that they don’t use an index. Scans iterate over the ENTIRE TABLE so they are REALLY SLOW. Scans have access to additional filter conditions such as “contains” and “in”.
```python
# This is suuuuuper slow!
user = engine.scan(User).filter(id='abcdef').first()

# If you're doing an extremely large scan, you should tell it to return
# a generator
all_users = engine.scan(User).gen()

# To filter a field not specified in the model declaration:
prince = engine.scan(User).filter(User.field_('bio').beginswith_(
    'Now this is a story all about how')).first()
```
Get the default_conflict value
Notes
The default_conflict setting configures the default behavior of save(), sync(), and delete(). Below is an explanation of the different values of default_conflict.
| default_conflict | method | default |
|---|---|---|
| 'update' | save | overwrite=True |
| | sync | raise_on_conflict=True |
| | delete | raise_on_conflict=False |
| 'overwrite' | save | overwrite=True |
| | sync | raise_on_conflict=False |
| | delete | raise_on_conflict=False |
| 'raise' | save | overwrite=False |
| | sync | raise_on_conflict=True |
| | delete | raise_on_conflict=True |
Delete items from dynamo
Parameters: items (list or Model), raise_on_conflict (bool, optional)

Raises: boto.dynamodb2.exceptions.ConditionalCheckFailedException
Notes
Due to the structure of the AWS API, deleting with raise_on_conflict=False is much faster because the requests can be batched.
Delete one or more items from dynamo as specified by primary keys
Parameters: model (Model), pkeys (list, optional), **kwargs (dict)

Returns: count (int)
Notes
If the model being deleted has no range key, you may use strings instead of primary key dicts. ex:
```python
>>> class Item(Model):
...     id = Field(hash_key=True)
...
>>> items = engine.delete_key(Item, ['abc', 'def', '123', '456'])
```
Fetch one or more items from dynamo from the primary keys
Parameters:
    model : Model
    pkeys : list, optional
    consistent : bool, optional
    **kwargs : dict
Returns:
    items : list or object
Notes
If the model being fetched has no range key, you may use strings instead of primary key dicts. ex:
>>> class Item(Model):
... id = Field(hash_key=True)
...
>>> items = engine.get(Item, ['abc', 'def', '123', '456'])
Overwrite model data with freshest from database
Parameters:
    items : list or Model
    consistent : bool, optional
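A minimal sketch of a refresh, again assuming the quickstart Tweet model and a connected engine:

```python
tweets = engine.query(Tweet).filter(userid='myuser').all()
# ...time passes; other writers may have updated these items...
# Overwrite the in-memory models with the latest data from Dynamo,
# using a strongly-consistent read.
engine.refresh(tweets, consistent=True)
```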
Register one or more models with the engine
Registering is required for schema creation or deletion
Save models to dynamo
Parameters:
    items : list or Model
    overwrite : bool, optional
Raises:
    exc : boto.dynamodb2.exceptions.ConditionalCheckFailedException
Notes
Overwrite will replace the entire item with the new one, not just different fields. After calling save(overwrite=True) you are guaranteed that the item in the database is exactly the item you saved.
Due to the structure of the AWS API, saving with overwrite=True is much faster because the requests can be batched.
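The two save modes side by side, as a sketch using the quickstart Tweet model:

```python
tweet = Tweet('myuser', '42', datetime.utcnow(), text='first draft')

# overwrite=True: replace whatever item is in Dynamo with exactly this one.
# These requests can be batched, so they are fast.
engine.save(tweet, overwrite=True)

# overwrite=False: fail if an item with this primary key already exists,
# raising ConditionalCheckFailedException.
engine.save(tweet, overwrite=False)
```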
Sync model changes back to database
This will push any updates to the database, and ensure that all of the synced items have the most up-to-date data.
Parameters:
    items : list or Model
    raise_on_conflict : bool, optional
    consistent : bool, optional
Raises:
    exc : boto.dynamodb2.exceptions.ConditionalCheckFailedException
Model metadata and metaclass objects
Bases: type
Metaclass for Model objects
Merges model metadata, sets the meta_ attribute, and performs validation checks.
Bases: object
Container for model metadata
Parameters:
    model : Model
Attributes
    abstract : Getter for abstract
    namespace : The model's namespace (a list)
Create all Dynamo tables for this model
Parameters:
    connection : DynamoDBConnection
    tablenames : list, optional
    test : bool, optional
    wait : bool, optional
    throughput : dict, optional
Returns:
    table : str
Drop all Dynamo tables for this model
Parameters:
    connection : DynamoDBConnection
    tablenames : list, optional
    test : bool, optional
    wait : bool, optional
Returns:
    table : str
Get a unique ordering from constraint fields
Parameters:
    eq_fields : list
    fields : list
Returns:
    ordering : Ordering
Raises:
    exc : TypeError
Bases: object
A way that the models are ordered
This will be a combination of a hash key and a range key. It may be the primary key, a local secondary index, or a global secondary index.
Model code
Bases: object
Base class for all flywheel models
For documentation on the metadata fields, check the attributes on the ModelMetadata class.
Attributes
    __metadata_class__
    __metadata__
    meta_ : Container for model metadata
    __engine__
    __dirty__
    __cache__
    __incrs__
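Model metadata can be customized by setting the __metadata__ dict on a model class. A sketch, assuming the '_name' and 'throughput' metadata keys described in the flywheel docs:

```python
from flywheel import Model, Field

class Tweet(Model):
    __metadata__ = {
        '_name': 'tweets',                       # explicit Dynamo table name
        'throughput': {'read': 10, 'write': 2},  # provisioned throughput
    }
    userid = Field(hash_key=True)
    id = Field(range_key=True)
```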
Construct a dynamo “expects” mapping based on the cached fields
Query and Scan builders
Bases: object
An object used to query dynamo tables
See the Engine for query examples
Parameters:
    engine : Engine
    model : class
Return the query results as a list
Parameters:
    desc : bool, optional
    consistent : bool, optional
    attributes : list, optional
Add a Condition to constrain the query
Notes
The conditions may be passed in as positional arguments:
engine.query(User).filter(User.id == 12345)
Or they may be passed in as keyword arguments:
engine.query(User).filter(firstname='Monty', lastname='Python')
The limitation of the keyword method is that you may only create equality conditions. You may use both types in a single filter:
engine.query(User).filter(User.num_friends > 10, name='Monty')
Return the first result of the query, or None if no results
Parameters:
    desc : bool, optional
    consistent : bool, optional
    attributes : list, optional
Return the query results as a generator
Parameters:
    desc : bool, optional
    consistent : bool, optional
    attributes : list, optional
Return the result of the query. If there is not exactly one result, raise a ValueError
Parameters:
    consistent : bool, optional
    attributes : list, optional
Raises:
    exc : ValueError
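The four ways to terminate a query, sketched with the quickstart Tweet model and a connected engine:

```python
# all(): return the full result list
tweets = engine.query(Tweet).filter(userid='myuser').all(desc=True)

# first(): the first result, or None if there are no results
latest = engine.query(Tweet).filter(userid='myuser').first(desc=True)

# one(): exactly one result, or ValueError
tweet = engine.query(Tweet).filter(userid='myuser', id='1234').one()

# gen(): a generator that fetches results lazily (good for large result sets)
for tweet in engine.query(Tweet).filter(userid='myuser').gen():
    print(tweet.text)
```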
Bases: flywheel.query.Query
An object used to scan dynamo tables
Scans are like Queries except they don’t use indexes. This means they iterate over all data in the table and are SLOW
Parameters:
    engine : Engine
    model : class
Unit and system tests for flywheel
Bases: nose.plugins.base.Plugin
Nose plugin to run the Dynamo Local service
See: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.html