May 11, 2014 12:06 PM by Daniel Chambers (last modified on May 18, 2014 2:40 PM)
In my last post, I showed how to use FSharp.Azure to modify data in Azure table storage. FSharp.Azure is a library I recently released that allows F# developers to write idiomatic F# code to talk to Azure table storage. In this post, we’ll look at the opposite of data modification: data querying.
To use FSharp.Azure, install the NuGet package: FSharp.Azure. At the time of writing the package is marked as beta, so you will need to include pre-releases by using the checkbox on the UI, or using the -Pre flag on the console. (Update: v1.0.0 has been released!)
Once you’ve installed the package, you need to open the TableStorage module to use the table storage functions:
open DigitallyCreated.FSharp.Azure.TableStorage
To provide an idiomatic F# experience when querying table storage, FSharp.Azure supports the use of record types when querying. For example, the following record type would be used to read a table with columns that match the field names:
type Game =
    { Name : string
      Developer : string
      HasMultiplayer : bool
      Notes : string }
We will use this record type in the examples below. We will also assume, for the sake of these examples, that the Developer field is also used as the PartitionKey and the Name field is used as the RowKey.
FSharp.Azure also supports querying class types that implement the Microsoft.WindowsAzure.Storage.Table.ITableEntity interface.
The easiest way to use the FSharp.Azure API is to define a quick helper function that allows you to query for rows from a particular table:
open Microsoft.WindowsAzure.Storage
open Microsoft.WindowsAzure.Storage.Table

let account = CloudStorageAccount.Parse "UseDevelopmentStorage=true;" //Or your connection string here
let tableClient = account.CreateCloudTableClient()
let fromGameTable q = fromTable tableClient "Games" q
The fromGameTable function fixes the tableClient and table name parameters of the fromTable function, so you don't have to keep passing them. This technique is very common when using the FSharp.Azure API.
Here's how we'd query for all rows in the "Games" table:
let games = Query.all<Game> |> fromGameTable
games above is of type seq<Game * EntityMetadata>. The EntityMetadata type contains the Etag and Timestamp of each Game. Here's how you might work with that:
let gameRecords = games |> Seq.map fst
let etags = games |> Seq.map (fun (game, metadata) -> metadata.Etag)
The etags in particular are useful when updating those records in table storage, because they allow you to utilise Azure Table Storage's optimistic concurrency protection to ensure nothing else has changed the record since you queried for it.
The Query.where function allows you to use an F# quotation of a lambda to specify what conditions you want to filter by. The lambda you specify must be of type:
'T -> SystemProperties -> bool
The SystemProperties type allows you to construct filters against system properties such as the Partition Key and Row Key, which are the only two properties that are indexed by Table Storage, and therefore the ones over which you will most likely be performing filtering.
For example, this is how we'd get an individual record by PartitionKey and RowKey:
let halo4, metadata =
    Query.all<Game>
    |> Query.where <@ fun g s -> s.PartitionKey = "343 Industries" && s.RowKey = "Halo 4" @>
    |> fromGameTable
    |> Seq.head
You can, however, query over properties on your record type too. Be aware that those properties are not indexed by Table Storage, so queries that filter on them will suffer performance penalties.
For example, if we wanted to find all multiplayer games made by Valve, we could write:
let multiplayerValveGames =
    Query.all<Game>
    |> Query.where <@ fun g s -> s.PartitionKey = "Valve" && g.HasMultiplayer @>
    |> fromGameTable
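To make the idea of filtering with a quoted lambda more concrete, here is a minimal sketch of how a quotation can be inspected and turned into an OData-style filter string of the kind table storage accepts. This is not FSharp.Azure's implementation: the toFilter function and the SystemProps record are hypothetical stand-ins invented for illustration, and only =, && and string constants are handled.

```fsharp
open Microsoft.FSharp.Quotations
open Microsoft.FSharp.Quotations.Patterns
open Microsoft.FSharp.Quotations.DerivedPatterns

// Hypothetical stand-ins so the sketch compiles without FSharp.Azure.
type Game = { Name : string; Developer : string; HasMultiplayer : bool; Notes : string }
type SystemProps = { PartitionKey : string; RowKey : string }

// Illustrative only: walks a quoted lambda and renders a small subset of
// expressions (=, && and string constants) as an OData-style filter string.
let rec toFilter (expr : Expr) : string =
    match expr with
    | Lambda (_, body) -> toFilter body
    | AndAlso (left, right) -> sprintf "(%s) and (%s)" (toFilter left) (toFilter right)
    | SpecificCall <@@ (=) @@> (_, _, [ left; right ]) -> sprintf "%s eq %s" (toFilter left) (toFilter right)
    | PropertyGet (Some _, prop, []) -> prop.Name
    | Value (v, t) when t = typeof<string> -> sprintf "'%O'" v
    | other -> failwithf "Unsupported expression: %A" other

let filter =
    toFilter <@@ fun (g : Game) (s : SystemProps) -> s.PartitionKey = "343 Industries" && s.RowKey = "Halo 4" @@>
```

The point of the sketch is that the lambda is never executed. Because it is quoted, the library receives the expression tree itself and can translate it into something the storage service evaluates on the server side.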
The following operators/functions are supported for use inside the where lambda: the =, <>, <, <=, >, >= operators and the not function.
Table storage allows you to limit the query results to only the first 'n' results it finds. Naturally, FSharp.Azure supports this.
Here's an example query that limits the results to the first 5 multiplayer games made by Valve:
let multiplayerValveGames =
    Query.all<Game>
    |> Query.where <@ fun g s -> s.PartitionKey = "Valve" && g.HasMultiplayer @>
    |> Query.take 5
    |> fromGameTable
Azure table storage may not return all the results that match the query in one go. Instead it may split the results over multiple segments, each of which must be queried for separately and sequentially. According to MSDN, table storage will start segmenting results if the query returns more than 1000 rows, takes longer than five seconds to execute, or crosses a partition boundary.
FSharp.Azure supports handling query segmentation manually as well as automatically. The fromTable function we used in the previous examples returns a seq that will automatically query for additional segments as you iterate.
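The practical effect of a seq that fetches segments as you iterate is that segments you never reach are never requested. The following is a rough sketch of that behaviour, not the library's code: lazyRows and fakeFetch are hypothetical names, and fakeFetch is an in-memory stand-in for table storage that hands out two rows per segment.

```fsharp
// Hypothetical sketch: turns any "fetch one segment" function into a lazy seq.
let lazyRows (fetchSegment : 'Token option -> 'Row list * 'Token option) : seq<'Row> =
    let rec segmentsFrom token =
        seq {
            // One (simulated) round trip per segment, made only when iteration gets here
            let rows, next = fetchSegment token
            yield! rows
            match next with
            | Some _ -> yield! segmentsFrom next
            | None -> ()
        }
    segmentsFrom None

// In-memory stand-in for table storage: segments of two rows, where the
// continuation token is simply the index of the next unread row.
let mutable fetchCount = 0

let fakeFetch (token : int option) =
    fetchCount <- fetchCount + 1
    let data = [ "a"; "b"; "c"; "d"; "e" ]
    let start = defaultArg token 0
    let rows = data |> List.skip start |> List.truncate 2
    let nextStart = start + List.length rows
    rows, (if nextStart < List.length data then Some nextStart else None)

// Only the first two segments are fetched; the third is never requested.
let firstThree = lazyRows fakeFetch |> Seq.take 3 |> List.ofSeq
```

After this runs, firstThree holds the first three rows and fetchCount is 2, because the third segment was never needed.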
If you want to handle segmentation manually, you can use the fromTableSegmented function instead of fromTable. First, define a helper function:
let fromGameTableSegmented c q = fromTableSegmented tableClient "Games" c q
The fromGameTableSegmented function will have the type:
TableContinuationToken option -> EntityQuery<'T> -> List<'T * EntityMetadata> * TableContinuationToken option
This means it takes an optional continuation token and the query, and returns the list of results in that segment, and optionally the continuation token used to access the next segment, if any.
Here's an example that gets the first two segments of query results:
let query = Query.all<Game>

let games1, segmentToken1 =
    query |> fromGameTableSegmented None //None means querying for the first segment (ie. no continuation)

//We're making the assumption segmentToken1 here is not None and therefore
//there is another segment to read. In practice, this is a very poor assumption
//to make, since segmentation is performed arbitrarily by table storage
if segmentToken1.IsNone then failwith "No segment 2!"

let games2, segmentToken2 =
    query |> fromGameTableSegmented segmentToken1
In practice, you'd probably write a recursive function or a loop to iterate through the segments until a certain condition is met.
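One possible shape for such a recursive function is sketched below. readAllSegments is a hypothetical helper, not part of FSharp.Azure, and it assumes each segment comes back as an F# list paired with an optional continuation token. To keep the example runnable on its own, pagedFetch is an in-memory stand-in for a segmented query.

```fsharp
// Hypothetical helper (not part of FSharp.Azure): keeps requesting segments
// until no continuation token comes back, then returns all rows in order.
let readAllSegments (fetchSegment : 'Token option -> 'Row list * 'Token option) : 'Row list =
    let rec loop token acc =
        let rows, next = fetchSegment token
        let acc = rows :: acc
        match next with
        | Some _ -> loop next acc
        | None -> acc |> List.rev |> List.concat
    loop None []

// In-memory stand-in for a segmented query: two rows per segment, where the
// continuation token is the index of the next unread row.
let pagedFetch (data : int list) (token : int option) =
    let start = defaultArg token 0
    let rows = data |> List.skip start |> List.truncate 2
    let nextStart = start + List.length rows
    rows, (if nextStart < List.length data then Some nextStart else None)

let allRows = readAllSegments (pagedFetch [ 1; 2; 3; 4; 5 ])
```

With the helper from this post, the call would presumably look like readAllSegments (fun token -> Query.all<Game> |> fromGameTableSegmented token). If you want to stop early, add your own condition to the match inside loop.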
FSharp.Azure also supports asynchronous equivalents of fromTable and fromTableSegmented. To use them, you would first create your helper functions:
let fromGameTableAsync q = fromTableAsync tableClient "Games" q
let fromGameTableSegmentedAsync c q = fromTableSegmentedAsync tableClient "Games" c q
fromTableAsync automatically and asynchronously makes requests for all the segments and returns all the results in a single seq. Note that unlike fromTable, all segments are queried for during the asynchronous operation, not during sequence iteration. (This is because seq doesn't support asynchronous iteration.)
Here's an example of using fromTableAsync:
let valveGames =
    Query.all<Game>
    |> Query.where <@ fun g s -> s.PartitionKey = "Valve" @>
    |> fromGameTableAsync
    |> Async.RunSynchronously
And finally, an example using the asynchronous segmentation variant:
let asyncOp = async {
    let query = Query.all<Game>

    let! games1, segmentToken1 =
        query |> fromGameTableSegmentedAsync None //None means querying for the first segment (ie. no continuation)

    //We're making the assumption segmentToken1 here is not None and therefore
    //there is another segment to read. In practice, this is a very poor assumption
    //to make, since segmentation is performed arbitrarily by table storage
    if segmentToken1.IsNone then failwith "No segment 2!"

    let! games2, segmentToken2 =
        query |> fromGameTableSegmentedAsync segmentToken1

    return games1 @ games2
}

let games = asyncOp |> Async.RunSynchronously
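Rather than assuming a second segment exists, the asynchronous variant can also be driven by a recursive async workflow that follows continuation tokens until none is returned. This is only a sketch under assumptions: readAllSegmentsAsync is a hypothetical helper, not a library function, and pagedFetchAsync is an in-memory stand-in for a segmented query so the example runs without a storage account.

```fsharp
// Hypothetical helper (not part of FSharp.Azure): asynchronously follows
// continuation tokens until none is returned, collecting rows in order.
let readAllSegmentsAsync (fetchSegment : 'Token option -> Async<'Row list * 'Token option>) : Async<'Row list> =
    let rec loop token acc =
        async {
            let! (rows, next) = fetchSegment token
            let acc = rows :: acc
            match next with
            | Some _ -> return! loop next acc
            | None -> return (acc |> List.rev |> List.concat)
        }
    loop None []

// In-memory stand-in for an asynchronous segmented query: two rows per
// segment, where the continuation token is the index of the next unread row.
let pagedFetchAsync (data : string list) (token : int option) =
    async {
        let start = defaultArg token 0
        let rows = data |> List.skip start |> List.truncate 2
        let nextStart = start + List.length rows
        let next = if nextStart < List.length data then Some nextStart else None
        return (rows, next)
    }

let allNames =
    readAllSegmentsAsync (pagedFetchAsync [ "Halo 4"; "Portal 2"; "Dota 2" ])
    |> Async.RunSynchronously
```

Using return! for the recursive call keeps the workflow from building up a chain of nested continuations, which matters if a query spans many segments.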
In this post, we’ve covered the nitty gritty details of querying with FSharp.Azure. Hopefully you find this series of posts and the library itself useful; if you have, please do leave a comment or tweet to me at @danielchmbrs.