When I started work on the Action Oriented Workflow (AOW) algorithm, the idea I was trying to mine was that a sufficient history of work actions on given business objects could be used to route new work predictively through an organization, such that traditional boundaries of stagnation are overridden. These boundaries include the offices, teams, buildings, organizational units, divisions and regions that all corporations strive to have as they grow to international power.
It was therefore never so much about "big data", gathered relentlessly and interrogated for trends. Instead it works by isolating the best trends within small to medium groups of interaction. It is more about "right data" than "big data". The statistical pitfalls that attend cross-boundary mixing of any kind of data are naturally accounted for in AOW because workflows always *start* by encapsulating the action patterns between known groups of working agents, which are always small at first. This is key: it means true signals of the action landscape are built first, before being diluted toward a different optimal regime when new agents from other workflows are brought into the interaction.
It is only when these groups couple their work through the natural action of cross-team collaboration (and under social oversight) that the DNA of adjacent workflows mixes, so that a new optimal regime is explored involving the two (or more) connected groups of agents that own those workflows. Thus the statistical shaping toward outlier interactions within these small groups starts at its *highest* resolution of efficiency and is diminished only as cross-workflow interactions are engaged. Meanwhile, the algorithm continues to learn what is optimal for that new regime.
So there is no single global guess made over all possible actions of the disparate agents being sampled; instead there is a refined but growing set of local guesses.
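To make the idea of local guesses concrete, here is a minimal sketch. It is not the AOW algorithm itself, only a toy under stated assumptions: the class `LocalRouter`, its methods, and the use of mean completion time as the measure of a "best" agent are all hypothetical names and choices made for illustration.

```python
from collections import defaultdict
from statistics import mean


class LocalRouter:
    """Toy illustration (hypothetical): one small action history per workflow group."""

    def __init__(self):
        # history[group][agent] -> completion times (hours) of that agent's past actions
        self.history = defaultdict(lambda: defaultdict(list))

    def record(self, group, agent, hours):
        """Store one observed action outcome inside its own group only."""
        self.history[group][agent].append(hours)

    def best_agent(self, group):
        """Local guess: rank only the agents already seen in this group."""
        agents = self.history[group]
        if not agents:
            raise ValueError(f"no action history for group {group!r}")
        # Assumed metric: lowest mean completion time wins.
        return min(agents, key=lambda a: mean(agents[a]))

    def couple(self, group_a, group_b, merged):
        """Cross-team collaboration: seed a merged group from both local histories."""
        for group in (group_a, group_b):
            for agent, times in self.history[group].items():
                self.history[merged][agent].extend(times)


router = LocalRouter()
router.record("billing", "ann", 2.0)
router.record("billing", "bob", 5.0)
router.record("support", "cyd", 1.0)
router.record("support", "dee", 3.0)

print(router.best_agent("billing"))   # guess made from billing's history alone
print(router.best_agent("support"))   # guess made from support's history alone

# Only once the two groups collaborate is a new, wider regime explored.
router.couple("billing", "support", "billing+support")
print(router.best_agent("billing+support"))
```

Each group keeps its own sharp picture of who handles work best, and a wider guess only comes into existence when two groups actually couple their work. That is the contrast with pooling every agent into one global sample from the outset.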
I posit that this is the optimal way to determine predictive routing between agents' action histories, and that it is far less susceptible to the gross mispredictions made by global-guess algorithms like the one used in Google Flu, where simple but critical assumptions about the sample space could lead to wildly inaccurate "predictions".