Safe Haskell | None |
---|---|
Language | Haskell2010 |
Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq
Description
Sequences of diffs for ledger tables.
These diff sequences are an instantiation of a strict finger tree with root measures. The tree/sequence itself contains diffs and slot information, while the root measure is the total sum of all diffs in the sequence. The internal measure is used to keep track of sequence length and maximum slot numbers.
The diff datatype that we use forms a cancellative monoid, which allows for
relatively efficient splitting of finger trees with respect to recomputing
measures by means of subtracting diffs using the stripPrefix
and
stripSuffix
functions that cancellative monoids provide. Namely, if either
the left or right part of the split is small in comparison with the input
sequence, then we can subtract the diffs in the smaller part from the root
measure of the input to (quickly) compute the root measure of the other
part of the split. This is much faster than computing the root measures from
scratch by doing a linear-time pass over the elements of the split parts, or
a logarithmic-time pass over intermediate sums of diffs in case we store
cumulative diffs in the nodes of the finger tree.
Example of fast splits
As an analogy, consider this example: we have a sequence of consecutive
integer numbers xs = [1..n]
where n
is large, and we define the root
measure of the sequence to be the total sum of these numbers, rmxs = sum
[1..n]
(we assume rmxs
is fully evaluated). Say we split this sequence of
integer numbers at the index 2
, then we get left and right parts of the
split ys
and zs
respectively.
splitAt 2 xs = (ys, zs) = ([1..2], [3..n])
How should we compute we the root measure rmys
of ys
? Since ys
is
small, we can just compute rmys = sum [1..2]
. How should we compute the
root measure rmzs
of zs
? We should not compute rmzs = sum [3..n]
in
this case, since n
is large. Instead, we compute rmzs = rmxs - rmys
,
which evaluates to its result in time that is linear in the length of ys
,
in this case O(1)
.
Why not store sums of diffs in the internal measure instead of the root
measure?
We could also have used the interal measure of the strict finger tree to
store intermediate sums of diffs for all subtrees of the node. The subtree
rooted at the root of the tree would then store the total sum of diffs.
However, we would have now to recompute a possibly logarithmic number of sums
of diffs when we split or extend the sequence. Given that in consensus
we
use the total sum of diffs nearly as often as we split or extend the diff
sequence, this proved to be too costly. The single-instance root measure
reduces the overhead of this "caching" of intermediate sums of diffs by only
using a single total sum of diffs, though augmented with stripPrefix
and
stripSuffix
operations to facilitate computing updated root measures.
Root measures in practice
In consensus, we have the following access pattern. We perform A
then B
a
total of n
times, and then we perform C(n)
once. Repeat.
A = retrieve the total sum of diffs B = snoc a diff to the sequence C(n) = split n diffs from the left of the sequence
In Cardano, n == 100
by default. That means we split roughly 2^7
diffs
from a sequence of length roughly 2^11
. At first glance, it seems
counterintuitive that using a root measured finger tree would be quicker than
using a "normal" finger tree, because the former has a split function with a
linear cost. It needs to recompute the sum of 2^7
diffs, instead of 7
diffs if we were to use the normal finger tree split, which has logarithmic
complexity.
We wrote a benchmark that exercises the root measured finger tree and the normal finger tree according to the described access pattern. It turned out that the root measured fingertree was faster. If we look at the complexity of these operations, then for a normal fingertree:
A = O(1) amortised B = O(1) amortised C(100) = O(log 100) amortised
For a root measured fingertree:
A = O(1) worst-case B = O(1) worst-case C(100) = O(100) worst-case
Complexity wise, the root measured finger tree looks worse, but in practice it performs a bit better than the normal finger tree. It might mean there are higher constants at play for the computational complexity of the normal finger tree operations.
TODO: I wonder if is worth it to keep using the root measured finger tree. The
root measured finger tree sacrifices computational complexity for an algorithm
that works well in pratice for n=100
; given that the flush frequency is
configurable, using a value other than 100
might lead to worse performance
than if we were to use a normal finger tree.
Synopsis
- newtype DiffSeq k v = UnsafeDiffSeq (StrictFingerTree (RootMeasure k v) (InternalMeasure k v) (Element k v))
- data Element k v = Element {}
- data InternalMeasure k v = InternalMeasure {
- imLength ∷ !Length
- imSlotNoL ∷ !(StrictMaybe SlotNoLB)
- imSlotNoR ∷ !(StrictMaybe SlotNoUB)
- newtype Length = Length {}
- data RootMeasure k v = RootMeasure {
- rmLength ∷ !Length
- rmDiff ∷ !(Diff k v)
- rmNumInserts ∷ !(Sum Int)
- rmNumDeletes ∷ !(Sum Int)
- newtype SlotNoLB = SlotNoLB {}
- newtype SlotNoUB = SlotNoUB {}
- type SM k v = SuperMeasured (RootMeasure k v) (InternalMeasure k v) (Element k v)
- cumulativeDiff ∷ SM k v ⇒ DiffSeq k v → Diff k v
- length ∷ SM k v ⇒ DiffSeq k v → Int
- numDeletes ∷ SM k v ⇒ DiffSeq k v → Sum Int
- numInserts ∷ SM k v ⇒ DiffSeq k v → Sum Int
- append ∷ (Ord k, Eq v) ⇒ DiffSeq k v → DiffSeq k v → DiffSeq k v
- empty ∷ (Ord k, Eq v) ⇒ DiffSeq k v
- extend ∷ SM k v ⇒ DiffSeq k v → SlotNo → Diff k v → DiffSeq k v
- maxSlot ∷ SM k v ⇒ DiffSeq k v → StrictMaybe SlotNo
- minSlot ∷ SM k v ⇒ DiffSeq k v → StrictMaybe SlotNo
- split ∷ SM k v ⇒ (InternalMeasure k v → Bool) → DiffSeq k v → (DiffSeq k v, DiffSeq k v)
- splitAt ∷ SM k v ⇒ Int → DiffSeq k v → (DiffSeq k v, DiffSeq k v)
- splitAtFromEnd ∷ (SM k v, HasCallStack) ⇒ Int → DiffSeq k v → (DiffSeq k v, DiffSeq k v)
- splitAtSlot ∷ SM k v ⇒ SlotNo → DiffSeq k v → (DiffSeq k v, DiffSeq k v)
- fromAntiDiff ∷ Diff k v → Diff k v
- toAntiDiff ∷ Diff k v → Diff k v
Sequences of diffs
A sequence of key-value store differences.
INVARIANT: The slot numbers of consecutive elements should be strictly
increasing. Manipulating the underlying
directly may
break this invariant.StrictFingerTree
Constructors
UnsafeDiffSeq (StrictFingerTree (RootMeasure k v) (InternalMeasure k v) (Element k v)) |
Instances
Instances
Functor (Element k) Source # | |||||
Generic (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Associated Types
| |||||
(Show k, Show v) ⇒ Show (Element k v) Source # | |||||
(Eq k, Eq v) ⇒ Eq (Element k v) Source # | |||||
(NoThunks k, NoThunks v) ⇒ NoThunks (Element k v) Source # | |||||
Measured (InternalMeasure k v) (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods measure ∷ Element k v → InternalMeasure k v Source # | |||||
(Ord k, Eq v) ⇒ RootMeasured (RootMeasure k v) (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods measureRoot ∷ Element k v → RootMeasure k v Source # | |||||
type Rep (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq type Rep (Element k v) = D1 ('MetaData "Element" "Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq" "ouroboros-consensus-0.26.0.0-inplace" 'False) (C1 ('MetaCons "Element" 'PrefixI 'True) (S1 ('MetaSel ('Just "elSlotNo") 'SourceUnpack 'SourceStrict 'DecidedStrict) (Rec0 SlotNo) :*: S1 ('MetaSel ('Just "elDiff") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Diff k v)))) |
data InternalMeasure k v Source #
Constructors
InternalMeasure | |
Fields
|
Instances
Functor (InternalMeasure k) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods fmap ∷ (a → b) → InternalMeasure k a → InternalMeasure k b # (<$) ∷ a → InternalMeasure k b → InternalMeasure k a # | |||||
Sized (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods size ∷ InternalMeasure k v → Int Source # | |||||
Monoid (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods mempty ∷ InternalMeasure k v # mappend ∷ InternalMeasure k v → InternalMeasure k v → InternalMeasure k v # mconcat ∷ [InternalMeasure k v] → InternalMeasure k v # | |||||
Semigroup (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods (<>) ∷ InternalMeasure k v → InternalMeasure k v → InternalMeasure k v # sconcat ∷ NonEmpty (InternalMeasure k v) → InternalMeasure k v # stimes ∷ Integral b ⇒ b → InternalMeasure k v → InternalMeasure k v # | |||||
Generic (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Associated Types
Methods from ∷ InternalMeasure k v → Rep (InternalMeasure k v) x # to ∷ Rep (InternalMeasure k v) x → InternalMeasure k v # | |||||
Show (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods showsPrec ∷ Int → InternalMeasure k v → ShowS # show ∷ InternalMeasure k v → String # showList ∷ [InternalMeasure k v] → ShowS # | |||||
Eq (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods (==) ∷ InternalMeasure k v → InternalMeasure k v → Bool # (/=) ∷ InternalMeasure k v → InternalMeasure k v → Bool # | |||||
NoThunks (InternalMeasure k v) Source # | |||||
Measured (InternalMeasure k v) (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods measure ∷ Element k v → InternalMeasure k v Source # | |||||
type Rep (InternalMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq type Rep (InternalMeasure k v) = D1 ('MetaData "InternalMeasure" "Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq" "ouroboros-consensus-0.26.0.0-inplace" 'False) (C1 ('MetaCons "InternalMeasure" 'PrefixI 'True) (S1 ('MetaSel ('Just "imLength") 'SourceUnpack 'SourceStrict 'DecidedStrict) (Rec0 Length) :*: (S1 ('MetaSel ('Just "imSlotNoL") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (StrictMaybe SlotNoLB)) :*: S1 ('MetaSel ('Just "imSlotNoR") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (StrictMaybe SlotNoUB))))) |
Length of a sequence of differences.
Instances
Monoid Length Source # | |
Semigroup Length Source # | |
Generic Length Source # | |
Num Length Source # | |
Show Length Source # | |
Eq Length Source # | |
Ord Length Source # | |
LeftCancellative Length Source # | |
LeftReductive Length Source # | |
RightCancellative Length Source # | |
RightReductive Length Source # | |
NoThunks Length Source # | |
type Rep Length Source # | |
data RootMeasure k v Source #
Constructors
RootMeasure | |
Fields
|
Instances
Functor (RootMeasure k) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods fmap ∷ (a → b) → RootMeasure k a → RootMeasure k b # (<$) ∷ a → RootMeasure k b → RootMeasure k a # | |||||
(Ord k, Eq v) ⇒ Monoid (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods mempty ∷ RootMeasure k v # mappend ∷ RootMeasure k v → RootMeasure k v → RootMeasure k v # mconcat ∷ [RootMeasure k v] → RootMeasure k v # | |||||
(Ord k, Eq v) ⇒ Semigroup (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods (<>) ∷ RootMeasure k v → RootMeasure k v → RootMeasure k v # sconcat ∷ NonEmpty (RootMeasure k v) → RootMeasure k v # stimes ∷ Integral b ⇒ b → RootMeasure k v → RootMeasure k v # | |||||
Generic (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Associated Types
Methods from ∷ RootMeasure k v → Rep (RootMeasure k v) x # to ∷ Rep (RootMeasure k v) x → RootMeasure k v # | |||||
(Show k, Show v) ⇒ Show (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods showsPrec ∷ Int → RootMeasure k v → ShowS # show ∷ RootMeasure k v → String # showList ∷ [RootMeasure k v] → ShowS # | |||||
(Eq k, Eq v) ⇒ Eq (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods (==) ∷ RootMeasure k v → RootMeasure k v → Bool # (/=) ∷ RootMeasure k v → RootMeasure k v → Bool # | |||||
(Ord k, Eq v) ⇒ LeftCancellative (RootMeasure k v) Source # | |||||
(Ord k, Eq v) ⇒ LeftReductive (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods isPrefixOf ∷ RootMeasure k v → RootMeasure k v → Bool Source # stripPrefix ∷ RootMeasure k v → RootMeasure k v → Maybe (RootMeasure k v) Source # | |||||
(Ord k, Eq v) ⇒ RightCancellative (RootMeasure k v) Source # | |||||
(Ord k, Eq v) ⇒ RightReductive (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods isSuffixOf ∷ RootMeasure k v → RootMeasure k v → Bool Source # stripSuffix ∷ RootMeasure k v → RootMeasure k v → Maybe (RootMeasure k v) Source # | |||||
(NoThunks k, NoThunks v) ⇒ NoThunks (RootMeasure k v) Source # | |||||
(Ord k, Eq v) ⇒ RootMeasured (RootMeasure k v) (Element k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq Methods measureRoot ∷ Element k v → RootMeasure k v Source # | |||||
type Rep (RootMeasure k v) Source # | |||||
Defined in Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq type Rep (RootMeasure k v) = D1 ('MetaData "RootMeasure" "Ouroboros.Consensus.Storage.LedgerDB.V1.DiffSeq" "ouroboros-consensus-0.26.0.0-inplace" 'False) (C1 ('MetaCons "RootMeasure" 'PrefixI 'True) ((S1 ('MetaSel ('Just "rmLength") 'SourceUnpack 'SourceStrict 'DecidedStrict) (Rec0 Length) :*: S1 ('MetaSel ('Just "rmDiff") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Diff k v))) :*: (S1 ('MetaSel ('Just "rmNumInserts") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Sum Int)) :*: S1 ('MetaSel ('Just "rmNumDeletes") 'NoSourceUnpackedness 'SourceStrict 'DecidedStrict) (Rec0 (Sum Int))))) |
A lower bound on slot numbers.
Constructors
SlotNoLB | |
Fields |
Instances
Monoid SlotNoLB Source # | |
Semigroup SlotNoLB Source # | |
Generic SlotNoLB Source # | |
Num SlotNoLB Source # | |
Show SlotNoLB Source # | |
Eq SlotNoLB Source # | |
Ord SlotNoLB Source # | |
NoThunks SlotNoLB Source # | |
type Rep SlotNoLB Source # | |
An upper bound on slot numbers.
Constructors
SlotNoUB | |
Fields |
Instances
Monoid SlotNoUB Source # | |
Semigroup SlotNoUB Source # | |
Generic SlotNoUB Source # | |
Num SlotNoUB Source # | |
Show SlotNoUB Source # | |
Eq SlotNoUB Source # | |
Ord SlotNoUB Source # | |
NoThunks SlotNoUB Source # | |
type Rep SlotNoUB Source # | |
Short-hands for type-class constraints
type SM k v = SuperMeasured (RootMeasure k v) (InternalMeasure k v) (Element k v) Source #
Short-hand for
.SuperMeasured
Queries
Construction
Slots
Splitting
splitAtFromEnd ∷ (SM k v, HasCallStack) ⇒ Int → DiffSeq k v → (DiffSeq k v, DiffSeq k v) Source #
Conversion
fromAntiDiff ∷ Diff k v → Diff k v Source #
toAntiDiff ∷ Diff k v → Diff k v Source #