Chapter 6: Temporal and Dynamic Networks
About This Chapter
Every network we have studied in the preceding five chapters has been a photograph. We froze the world at a single moment, assembled an edge list, and analysed the resulting static structure. That was the right starting point. Static graph theory gives us indispensable vocabulary — degree, centrality, clustering, community structure, network formation — and the theorems proved in that framework are true. But the world does not hold still long enough for the photograph to remain a faithful description.
Consider what happens when you take the same snapshot approach to a financial network. At the end of each quarter, a large bank files a report listing its aggregate exposures to other institutions. The picture looks reassuring: well-diversified counterparty positions, no single institution accounting for more than a few percent of total exposure. But within a single trading day in September 2008, intraday credit lines were extended and retracted, collateral calls propagated through the system, and Lehman Brothers went from a functioning institution to a defaulting one within hours. A quarterly snapshot of the interbank network saw nothing coming. The temporal sequence of who extended credit to whom, and when, was the true risk landscape — and it was invisible to static analysis.
Or consider the spread of a viral tweet. Whether a piece of misinformation reaches ten thousand people or ten million is not determined by the eventual aggregate structure of the retweet graph. It is determined by the ordering of edges: a retweet that occurs while the original is trending on the main feed will propagate to millions; the same retweet six hours later, after the algorithm has moved on, reaches almost nobody. The edges in the final aggregate graph are the same in both cases. The timing is everything.
This chapter develops the tools for studying networks where time is not an afterthought but a first-class dimension of the data. We cover three ways of representing time-stamped network data; the concept of a temporal path and its implications for reachability; temporal analogues of the centrality measures from Chapter 1; the formal notion of a temporal motif; dynamic link prediction; the effect of bursty human interaction patterns on the speed of spreading processes; and community detection across evolving snapshots. We close with a mini case study in which a synthetic financial transaction network reveals a systemic-risk node that is unremarkable under static analysis but dominant under temporal analysis.
Table of Contents
- Representing Time-Stamped Edges
- Temporal Paths and Reachability
- Temporal Centralities
- Temporal Motifs
- Dynamic Link Prediction
- Spreading on Temporal Networks
- Snapshot-Based Community Detection
- Mini Case Study: Temporal Betweenness in a Financial Transaction Network
Representing Time-Stamped Edges
Why the representation choice matters
A temporal network is a collection of events, each of which involves a pair of nodes and a time. The raw material of such data arrives in many forms: server logs recording every API call between microservices, contact-tracing records showing which individuals were within two metres of each other and for how long, order-book records in an electronic exchange listing every matched trade between a buyer and a seller at a millisecond timestamp. Before any analysis can begin, we must decide how to represent these events mathematically. That decision is not neutral — it determines which questions are easy to answer and which become invisible.
Three representations are in common use. Each trades off temporal resolution against analytical tractability in a different way.
Representation 1: the event stream
The most faithful representation stores each contact as a triple \((u, v, t)\) meaning “node \(u\) and node \(v\) interacted at time \(t\).” No information is discarded. The data structure is simply an ordered list of triples, sorted by time. This is called a contact sequence or event stream.
\[\mathcal{E} = \{(u_1, v_1, t_1),\; (u_2, v_2, t_2),\; \ldots,\; (u_m, v_m, t_m)\}, \quad t_1 \leq t_2 \leq \cdots \leq t_m\]
The event stream is the natural representation for algorithms that need to respect time ordering — temporal path computation, motif detection, and burstiness analysis all operate on the raw stream. Its weakness is that it does not aggregate: two nodes who interact a hundred times each day produce a hundred distinct events, making the stream very long and complicating statistical summaries.
Representation 2: the snapshot sequence
The snapshot representation partitions continuous time into discrete windows \([0, \Delta), [\Delta, 2\Delta), [2\Delta, 3\Delta), \ldots\) and, within each window, collapses all events between any given pair of nodes into a single edge (possibly weighted by frequency). The result is a sequence of static graphs \(G_1, G_2, \ldots, G_T\), one per time window.
\[G_\tau = \left(V,\; E_\tau\right), \quad E_\tau = \{(u,v) : \exists\, (u,v,t) \in \mathcal{E},\; (\tau-1)\Delta \leq t < \tau\Delta\}\]
Snapshot graphs are the most common representation in practice because they are easy to store (each snapshot is a standard static graph) and easy to feed into existing graph algorithms. The cost is the choice of window size \(\Delta\): too coarse, and you erase causal ordering within a window; too fine, and most snapshots are nearly empty and the sequence is very long. There is no universally correct choice — the right \(\Delta\) is determined by the timescale of the process you are studying. For epidemic contact tracing, a window of one day is natural. For high-frequency trading, a window of one second may be too coarse.
Representation 3: the interval graph
A third representation, used when contacts have duration rather than being instantaneous, assigns an interval \([t_{\text{start}}, t_{\text{end}}]\) to each edge. This is natural for phone calls (the call has a beginning and an end), for physical proximity measured by wearable sensors, or for credit lines that are open for a period of days. Formally:
\[\mathcal{E}_{\text{interval}} = \{(u, v, t_{\text{start}}, t_{\text{end}})\}\]
The interval representation allows asking whether two contacts overlap in time — a question that matters for disease transmission (two people must be present simultaneously) but not for information diffusion via asynchronous messages. For the remainder of this chapter we work primarily with the event stream and snapshot representations, which suffice for the key algorithms we develop.
High-frequency trading venues generate event streams of extraordinary density. The NASDAQ matching engine processes millions of order messages per second at peak activity. Each event is a tuple \((b, s, t, p, q)\) — buyer, seller, timestamp, price, quantity. Researchers studying market microstructure — how prices incorporate information, how liquidity providers respond to order flow — need the full event stream; any aggregation into snapshots at even one-second resolution erases the causal structure that determines whose order filled first and at what price. The temporal network toolbox developed in this chapter is directly applicable to order-book data, though the sheer volume requires specialised database infrastructure (e.g., kdb+/q at most proprietary trading shops) that goes well beyond NetworkX.
Live cell: building all three representations from a single event stream
The cell below defines a small temporal network of eight nodes exchanging messages over forty time steps, and then builds the event stream, a three-snapshot sequence, and an aggregate static graph from the same data. Examining all three simultaneously makes the trade-offs concrete.
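A sketch of what such a cell might contain. The event stream below is an illustrative stand-in for the book's dataset (the specific triples are invented for this sketch), and plotting is omitted so the cell focuses on the three data structures themselves:

```python
import networkx as nx

# Representation 1: the event stream is just a time-sorted list of (u, v, t).
events = sorted([
    (0, 1, 2), (1, 2, 4), (0, 1, 7), (2, 3, 9), (3, 4, 12),
    (4, 5, 15), (1, 4, 18), (5, 6, 21), (6, 7, 24), (2, 5, 27),
    (0, 3, 30), (4, 7, 33), (1, 6, 36), (5, 7, 39),
], key=lambda e: e[2])
print(f"{len(events)} events, t in [{events[0][2]}, {events[-1][2]}]")

# Representation 2: snapshot sequence with window size Delta.
DELTA = 14  # three windows covering t in [0, 42)
snapshots = []
for tau in range(3):
    lo, hi = tau * DELTA, (tau + 1) * DELTA
    G = nx.Graph()
    G.add_nodes_from(range(8))
    G.add_edges_from((u, v) for u, v, t in events if lo <= t < hi)
    snapshots.append(G)
    print(f"snapshot {tau + 1}: {G.number_of_edges()} edges")

# The aggregate graph collapses all events, weighting by contact frequency.
agg = nx.Graph()
for u, v, t in events:
    if agg.has_edge(u, v):
        agg[u][v]["weight"] += 1
    else:
        agg.add_edge(u, v, weight=1)
print(f"aggregate: {agg.number_of_edges()} edges")
```

Note that the pair \((0, 1)\) interacts twice: the event stream keeps both events, the snapshot sequence records the edge only in the windows where it fires, and the aggregate graph collapses it to a single edge of weight 2.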
Reading the figure. The three snapshot panels show how edge activity is distributed across time: some edges appear in all three snapshots (persistent ties), while others flicker in and out. The aggregate panel collapses all of this into a single dense graph that looks well-connected. The aggregate representation is easy to analyse with the tools from Chapters 1–5, but it conceals whether two nodes were active simultaneously or months apart — information that is essential for questions about causation, contagion, and timing.
Temporal Paths and Reachability
The time-respecting path
In a static graph, a path from \(u\) to \(v\) is any sequence of adjacent nodes. In a temporal network, a path must additionally respect the arrow of time: you cannot travel backwards. Formally, a temporal path from \(u\) to \(v\) is a sequence of events
\[u = w_0 \xrightarrow{t_1} w_1 \xrightarrow{t_2} w_2 \xrightarrow{t_3} \cdots \xrightarrow{t_k} w_k = v\]
such that \((w_{i-1}, w_i, t_i) \in \mathcal{E}\) for all \(i\), and, crucially,
\[t_1 \leq t_2 \leq \cdots \leq t_k\]
Each step in the path must occur at a time no earlier than the previous step. This constraint is the heart of temporal network theory, and it has immediate consequences that have no analogue in static graphs.
Reachability is not symmetric. In an undirected static graph, if \(u\) can reach \(v\) then \(v\) can also reach \(u\). This is simply because you can traverse any path in reverse. In a temporal undirected graph, this symmetry breaks completely. If the only path from \(u\) to \(v\) uses edges at times \(t_1 < t_2 < t_3\), then the reversed path would require traversing those same edges in the order \(t_3 > t_2 > t_1\) — moving backwards in time, which is not permitted. It is entirely possible for \(v\) to be reachable from \(u\) but for \(u\) to be unreachable from \(v\), even when the underlying undirected graph is completely connected.
This asymmetry has profound practical consequences. In an epidemic model, the forward reachable set of a node \(u\) at time \(t_0\) — the set of nodes that can be infected if \(u\) is infected at time \(t_0\) — can be much smaller than the reachable set implied by the static aggregate graph. A contact that occurred before \(t_0\) provides no forward route; only contacts at time \(t \geq t_0\) matter. Conversely, the backward reachable set of a node (who could have infected this node, given that it is observed infected at time \(T\)?) runs time in reverse: it includes only nodes with whom the infected node had contact at \(t \leq T\).
Defining reachable sets formally
Fix a temporal network \(\mathcal{E}\) and a starting condition: node \(s\) is active at time \(t_0\). The forward reachable set \(\mathcal{F}(s, t_0)\) is:
\[\mathcal{F}(s, t_0) = \{v \in V : \exists \text{ a temporal path from } s \text{ to } v \text{ using only events at time } t \geq t_0\}\]
The backward reachable set \(\mathcal{B}(s, T)\) for a node \(s\) observed at time \(T\) is:
\[\mathcal{B}(s, T) = \{u \in V : \exists \text{ a temporal path from } u \text{ to } s \text{ using only events at time } t \leq T\}\]
Note that \(s \in \mathcal{F}(s, t_0)\) and \(s \in \mathcal{B}(s, T)\) by convention. In general, \(|\mathcal{F}(s, t_0)|\) will differ from \(|\mathcal{F}(s, t_0')|\) for different starting times \(t_0 \neq t_0'\), and \(|\mathcal{F}(s, t_0)|\) can be much smaller than the static reachable set of \(s\) in the aggregate graph.
Forward and backward reachability form the backbone of contact-tracing algorithms in epidemiology. When a positive COVID-19 test is confirmed for individual \(s\) at time \(T\) (after an incubation period of, say, five days), the relevant question for public health authorities is: who did \(s\) have contact with in the window \([T-5, T]\)? That is exactly the backward reachable set \(\mathcal{B}(s, T)\), restricted to the relevant incubation window. Forward tracing then asks: given that \(s\) was infectious during \([T-5, T]\), who might \(s\) have infected? That is \(\mathcal{F}(s, T-5)\) restricted to contacts in \([T-5, T]\). Bluetooth-based contact-tracing apps (the Google–Apple Exposure Notification framework used in most countries’ COVID apps) maintain rolling logs of exactly this event-stream data, then execute variants of the reachability computation described here.
Live cell: computing forward and backward reachable sets
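A minimal sketch of the computation, on an illustrative event stream (not the book's exact data). A single chronological sweep suffices for forward reachability, and backward reachability reduces to forward reachability on the time-reversed stream:

```python
# Undirected event stream: (u, v, t) triples with distinct timestamps.
events = [
    (0, 1, 1), (1, 2, 3), (2, 3, 5), (0, 4, 8), (4, 5, 10),
    (3, 6, 14), (5, 6, 17), (0, 2, 22), (6, 7, 26), (1, 7, 31),
]

def forward_reachable(events, s, t0):
    """F(s, t0): nodes reachable from s by a time-respecting path, t >= t0."""
    reached = {s}
    for u, v, t in sorted(events, key=lambda e: e[2]):
        if t < t0:
            continue
        # An undirected contact extends the reached set in either direction.
        if u in reached or v in reached:
            reached |= {u, v}
    return reached

def backward_reachable(events, s, T):
    """B(s, T): nodes with a time-respecting path INTO s using events at
    t <= T. Equivalent to forward reachability on the time-reversed stream."""
    reversed_events = [(u, v, -t) for u, v, t in events if t <= T]
    return forward_reachable(reversed_events, s, -T)

print(forward_reachable(events, 0, 0))   # everything downstream of node 0
print(forward_reachable(events, 0, 20))  # starting later: fewer future edges
print(backward_reachable(events, 7, 31)) # who could have "infected" node 7?
```

Starting at \(t_0 = 0\), node 0 reaches every node; starting at \(t_0 = 20\), only the event \((0, 2, 22)\) remains usable and the forward reachable set collapses to \(\{0, 2\}\).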
Reading the output. Notice how the forward reachable set from node 0 shrinks as \(t_0\) increases: starting later in time means fewer future edges are available to carry the spreading process. When \(t_0 = 20\), the reachable set may be considerably smaller than the full node set, even though the aggregate graph shows node 0 connected to nearly everyone. The static reachable set, printed last, treats all edges as simultaneously available and therefore always returns the entire connected component — an overestimate of what is actually reachable within any finite time window.
Temporal Centralities
Why static centrality is insufficient
The centrality measures of Chapter 1 — degree, closeness, betweenness, eigenvector — are defined on static graphs. Applying them to the aggregate temporal graph produces answers that are sometimes misleading. A node may appear highly central in the aggregate because it was a hub during a brief early window and then became inactive; its apparent centrality says nothing about its current influence. Conversely, a node that becomes active only in the final time steps may be critically well-positioned for future spreading but looks peripheral in the aggregate. Temporal centralities address this by extending each classical definition to respect time ordering.
Temporal closeness centrality
In a static graph, closeness centrality of node \(i\) is the inverse of the average geodesic distance from \(i\) to all other nodes. In a temporal network, we replace geodesic distance with the shortest temporal path duration from \(i\) to each other node, starting at time \(t_0\).
Let \(d^{\mathcal{T}}(i, j; t_0)\) denote the number of hops on the fastest temporal path from \(i\) to \(j\) using events at time \(t \geq t_0\), and define the temporal latency \(\ell^{\mathcal{T}}(i, j; t_0) = t_{\text{arrival}} - t_0\) as the elapsed time from \(t_0\) until \(j\) is first reached; both are \(\infty\) if \(j \notin \mathcal{F}(i, t_0)\). The definition below measures distance by latency rather than hop count. The temporal closeness centrality of node \(i\) with respect to starting time \(t_0\) is:
\[C^{\mathcal{T}}_C(i; t_0) = \frac{|\mathcal{F}(i, t_0)| - 1}{\displaystyle\sum_{j \in \mathcal{F}(i,\,t_0),\, j \neq i} \ell^{\mathcal{T}}(i, j; t_0)}\]
The numerator normalises for the size of the forward reachable set (nodes unreachable at \(t_0\) are excluded from the sum, but the normalisation penalises a small reachable set). When \(|\mathcal{F}(i, t_0)| = 1\) (only \(i\) itself is reachable), we set \(C^{\mathcal{T}}_C(i; t_0) = 0\) by convention. Note that this reduces to the standard closeness formula when the temporal network is fully connected and all edges carry unit latency.
Temporal betweenness centrality
Temporal betweenness centrality of node \(i\) counts how often \(i\) appears as an intermediate step on the fastest temporal path between two other nodes \(s\) and \(t\):
\[C^{\mathcal{T}}_B(i) = \sum_{s \neq i \neq t} \frac{\sigma^{\mathcal{T}}_{st}(i)}{\sigma^{\mathcal{T}}_{st}}\]
where \(\sigma^{\mathcal{T}}_{st}\) is the number of fastest temporal paths from \(s\) to \(t\) (optimising over all possible starting times) and \(\sigma^{\mathcal{T}}_{st}(i)\) is the number of such paths that pass through \(i\). Dividing the sum by \((n-1)(n-2)\) (or half that for the symmetric computation) converts the raw count to a fraction of node pairs.
The key difference from static betweenness is that the set of paths being counted is much smaller: many \((s, t)\) pairs that are connected in the aggregate graph have no time-respecting path at all, or have only a single time-respecting path, so \(i\) either appears on it or does not. Temporal betweenness therefore identifies nodes that are structurally indispensable as relays in the causal flow of the network, not merely nodes that sit on shortest paths in the aggregate graph.
Temporal PageRank
The snapshot-based extension of PageRank propagates rank scores forward through the snapshot sequence. Let \(\mathbf{r}^{(\tau)}\) be the PageRank vector on snapshot \(G_\tau\). The temporal PageRank at snapshot \(\tau + 1\) is:
\[\mathbf{r}^{(\tau+1)} = \alpha \cdot M^{(\tau+1)} \mathbf{r}^{(\tau)} + (1-\alpha) \frac{\mathbf{1}}{n}\]
where \(M^{(\tau+1)}\) is the column-stochastic version of the adjacency matrix of \(G_{\tau+1}\) and \(\alpha\) is the damping factor. This formulation allows rank to “carry over” from one period to the next: a node that was highly authoritative in snapshot \(\tau\) starts snapshot \(\tau+1\) with an elevated initial score. Several variants exist (some reset rank fully at each snapshot; others decay it at a rate that models “forgetting”); the formulation above is the natural first-order generalisation.
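A minimal numerical sketch of this update on a toy three-snapshot sequence. The edges, the damping factor, and the choice of one propagation step per snapshot are all illustrative assumptions:

```python
import numpy as np

n, alpha = 4, 0.85
# Directed edges per snapshot: node 0 is the early hub, node 3 the late one.
snapshot_edges = [
    [(0, 1), (0, 2), (1, 2)],
    [(2, 3), (1, 3)],
    [(3, 0), (3, 1), (3, 2)],
]

def column_stochastic(edges, n):
    """Directed adjacency with columns normalised; dangling columns are
    replaced by the uniform distribution so rank is conserved."""
    M = np.zeros((n, n))
    for u, v in edges:
        M[v, u] = 1.0          # rank flows from u to v
    col_sums = M.sum(axis=0)
    for j in range(n):
        M[:, j] = M[:, j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
    return M

r = np.full(n, 1.0 / n)        # uniform initial rank
for tau, edges in enumerate(snapshot_edges, start=1):
    M = column_stochastic(edges, n)
    r = alpha * (M @ r) + (1 - alpha) / n   # the update from the text
    print(f"after snapshot {tau}: {np.round(r, 3)}")
```

Because each \(M^{(\tau)}\) is column-stochastic and the teleportation term adds \((1-\alpha)/n\) per node, the rank vector stays a probability distribution after every snapshot.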
Live cell: temporal closeness vs. static closeness
The cell below computes temporal closeness (using arrival-time latency as the distance measure) for all nodes across two starting times, then compares to static closeness on the aggregate graph.
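A sketch of the latency-based computation, on an illustrative event stream. The earliest-arrival sweep works because events are processed in time order, so the first time a node is assigned an arrival time is also its earliest:

```python
import networkx as nx

events = sorted([
    (0, 1, 1), (1, 2, 3), (2, 3, 5), (0, 4, 8), (4, 5, 10),
    (3, 6, 14), (5, 6, 17), (0, 2, 22), (6, 7, 26), (1, 7, 31),
], key=lambda e: e[2])
nodes = range(8)

def temporal_closeness(events, s, t0):
    """Latency-based temporal closeness of s with starting time t0."""
    arrival = {s: t0}          # earliest known arrival time per node
    for u, v, t in events:
        if t < t0:
            continue
        # An undirected contact at t extends any path that arrived by t.
        if u in arrival and arrival[u] <= t and t < arrival.get(v, float("inf")):
            arrival[v] = t
        if v in arrival and arrival[v] <= t and t < arrival.get(u, float("inf")):
            arrival[u] = t
    latencies = [arrival[j] - t0 for j in arrival if j != s]
    if not latencies or sum(latencies) == 0:
        return 0.0
    return len(latencies) / sum(latencies)

for t0 in (1, 20):
    scores = {i: round(temporal_closeness(events, i, t0), 4) for i in nodes}
    print(f"t0={t0}: {scores}")

# Static closeness on the aggregate graph, for comparison.
agg = nx.Graph((u, v) for u, v, t in events)
print("static:", {i: round(c, 4) for i, c in nx.closeness_centrality(agg).items()})
```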
Reading the output. The rankings under temporal closeness (starting at \(t_0 = 1\)) will broadly resemble the static ranking, because at \(t_0 = 1\) nearly all edges are available. However, starting at \(t_0 = 20\) produces a meaningfully different ranking: nodes whose edges are concentrated in the first half of the time series lose closeness, while nodes whose contacts are spread evenly or concentrated later gain it. The correlation figures quantify how much the static centrality overstates or misranks nodes relative to a temporally-aware measure.
Key takeaway. Static centrality computed on the aggregate graph conflates structural position with temporal availability. A node that was active early and is now dormant looks central in the static picture but can reach nobody going forward. In any application where the direction of time matters — epidemics, financial contagion, information diffusion — temporal centrality measures should be preferred over their static counterparts.
Temporal Motifs
What is a temporal motif?
In a static graph, a motif is a small connected subgraph that appears more often than expected by chance. Three-node motifs — triangles, wedges, directed feedforward loops — have been studied extensively as structural signatures of social networks, biological regulatory networks, and citation graphs. The extension to temporal networks is due to Paranjape, Benson, and Leskovec (2017), whose framework we follow here.
A \(\delta\)-temporal motif is an ordered sequence of \(k\) edges
\[M = \bigl((u_1, v_1, t_1),\, (u_2, v_2, t_2),\, \ldots,\, (u_k, v_k, t_k)\bigr)\]
satisfying two conditions:

1. The induced graph on the nodes \(\{u_1, v_1, \ldots, u_k, v_k\}\) is connected and matches a specified static graph pattern (e.g., a triangle, a wedge, a path of length 2).
2. The events are temporally ordered and bounded: \(t_1 \leq t_2 \leq \cdots \leq t_k\) and \(t_k - t_1 \leq \delta\).
The time window \(\delta\) is the key parameter. A small \(\delta\) counts only bursts of activity in which all edges of the motif occur within a short interval — capturing tight causal sequences. A large \(\delta\) relaxes the timing constraint and approaches the static motif count. For most social interaction data, \(\delta\) is chosen to match the timescale of the interaction process: minutes for online messaging, hours for email, days for citation-based interaction.
Why temporal motifs are richer than static motifs
Consider a three-edge triangle \((A \to B, B \to C, A \to C)\) in a directed temporal network. The static triangle has a single structural type. But temporally, the three edges can be ordered in \(3! = 6\) distinct ways, each corresponding to a different causal story. Three of the six orderings:
- \((A \to B)\) then \((A \to C)\) then \((B \to C)\): \(A\) contacts \(B\) and \(C\) independently, then \(B\) contacts \(C\) — possibly \(B\) was introduced to \(C\) through \(A\).
- \((A \to B)\) then \((B \to C)\) then \((A \to C)\): a sequential chain followed by a direct redundant link — consistent with broadcast-then-relay.
- \((B \to C)\) then \((A \to B)\) then \((A \to C)\): \(B\) already knows \(C\) when \(A\) contacts \(B\); \(A\) then independently contacts \(C\).
Each temporal ordering encodes a different social mechanism. Counting which orderings are over-represented relative to a temporal null model (typically obtained by randomly permuting timestamps) tells you which causal processes are the dominant drivers of network activity.
Fraud-ring detection in transaction networks relies heavily on temporal motifs. A fraud ring typically operates by opening a sequence of accounts (nodes) and executing a sequence of transactions between them in a specific causal order: an initial “layering” sequence in which money moves A → B → C in rapid succession, followed by an “integration” event that collapses the trail. The temporal motif corresponding to this pattern — two sequential directed edges sharing a relay node, both within a window of, say, one hour — is many times more common in known fraudulent transaction streams than in legitimate payment flows. Financial institutions implement real-time temporal motif counters on their transaction graphs as a core component of anti-money-laundering (AML) detection pipelines.
Formal count for 2-node, 2-edge motifs
To make the counting concrete and computationally tractable in a browser environment, we work with the simplest non-trivial case: two-node motifs involving two edges between the same pair of nodes within a window \(\delta\). For a directed network there are two types: a reciprocal pair \((u \to v, v \to u)\) and a repeated contact \((u \to v, u \to v)\). For an undirected network, only the repeated contact is non-trivial. We then extend to two-edge path motifs: \((u,v)\) at \(t_1\), \((v,w)\) at \(t_2 > t_1\), with \(t_2 - t_1 \leq \delta\) — a time-respecting two-hop relay.
\[\text{2-hop relay motif count}(\delta) = \bigl|\{(u,v,w,t_1,t_2) : (u,v,t_1) \in \mathcal{E},\; (v,w,t_2) \in \mathcal{E},\; u \neq w,\; 0 < t_2 - t_1 \leq \delta\}\bigr|\]
Live cell: counting 2-hop relay motifs
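A sketch of the counting cell. The six-event stream is an invented illustration — one tight burst followed by long gaps — chosen so the growth profile described below is visible even at this tiny scale:

```python
# Count time-respecting 2-hop relay motifs on an undirected event stream:
# (u,v) at t1 followed by (v,w) at t2, with u != w and 0 < t2 - t1 <= delta.
events = [
    (0, 1, 1), (1, 2, 2), (1, 2, 3), (2, 3, 4),   # a tight burst
    (3, 4, 20), (4, 5, 38),                       # isolated events
]

def relay_motif_count(events, delta):
    count = 0
    for (a, b, t1) in events:
        for (c, d, t2) in events:
            if not (0 < t2 - t1 <= delta):
                continue
            # Undirected edges: try both orientations of each event and
            # require a shared relay node with distinct outer endpoints.
            for u, v in ((a, b), (b, a)):
                for v2, w in ((c, d), (d, c)):
                    if v == v2 and u != w:
                        count += 1
    return count

for delta in (1, 2, 5, 20, 40):
    print(f"delta={delta:2d}: {relay_motif_count(events, delta)} relay motifs")
```

Note that a repeated contact on the same pair, such as the two \((1,2)\) events, is not counted as a relay because the outer endpoints coincide.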
Reading the output. The relay motif count grows monotonically with \(\delta\) because a larger window admits more pairs of events. The interesting comparison is not the absolute count but its growth rate: in a network with bursty activity, the count grows rapidly for small \(\delta\) (many events cluster in short bursts) and then plateaus for large \(\delta\) (the remaining pairs are separated by long gaps). A flatter growth curve indicates more regular, Poisson-like inter-event timing. This connection between motif count profiles and inter-event time distributions is developed further in Section 6 when we study spreading on temporal networks.
Dynamic Link Prediction
The problem
Given the history of a temporal network up to time \(t\), which edges will appear in the future interval \((t, t + \Delta]\)? This is the dynamic link prediction problem, and it is one of the most practically important problems in temporal network analysis. Recommendation systems — “people you may know,” “accounts you might want to follow,” “products frequently bought together” — are all link prediction systems operating on snapshots of evolving interaction graphs. Credit officers at banks assess whether two firms that have never transacted are likely to form a future credit relationship. Fraud analysts ask whether two accounts that appear unconnected are likely to interact in the future, which would be a warning signal for a dormant fraud ring.
Common baselines
The simplest link prediction baselines use local structural similarity computed on the most recent snapshot, or on a recency-weighted aggregate, to score candidate edges.
Common Neighbor (CN) score. The number of nodes adjacent to both \(u\) and \(v\) in the current snapshot:
\[\text{CN}(u, v) = |N(u) \cap N(v)|\]
Adamic–Adar (AA) score. Proposed by Adamic and Adar in 2003 for predicting missing links in social networks, this weights each common neighbour \(w\) by the inverse log of its degree — the intuition being that a rare mutual acquaintance provides stronger evidence of a tie than a mutual acquaintance who is friends with everyone:
\[\text{AA}(u, v) = \sum_{w \in N(u) \cap N(v)} \frac{1}{\log\bigl(\deg(w)\bigr)}\]
When \(\deg(w) = 1\), the term is undefined; in practice one adds a smoothing constant, replacing \(\log(\deg(w))\) with \(\log(1 + \deg(w))\).
Jaccard coefficient. The fraction of the combined neighbourhood that is shared:
\[\text{Jaccard}(u, v) = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|}\]
For dynamic link prediction, these scores are computed on a snapshot \(G_{\tau}\) (the “training” window) and evaluated against the edges that appear in a subsequent snapshot \(G_{\tau+1}\) (the “test” window). Candidate edges are all non-edges in \(G_\tau\) among node pairs that have at least one common neighbour. The quality of predictions is measured by the area under the ROC curve (AUC): the probability that a randomly chosen future edge scores higher than a randomly chosen non-edge that does not appear in the future window.
LinkedIn’s “People You May Know” feature is one of the most widely used link prediction systems in existence, generating tens of millions of suggestions per day to over 900 million registered users. The production system uses a combination of features that generalize the baseline scores above: common connections, shared employers, shared schools, similar job titles, reciprocal profile views, and geographic proximity. A gradient-boosted tree ensemble (similar to XGBoost) is trained on historical connection events with these features as inputs. The Adamic–Adar score for the professional network context translates naturally: a mutual connection who is themselves a widely-connected “super-connector” provides weaker evidence of a meaningful relationship than a mutual connection who is selectively connected to a small number of colleagues in the same niche.
Live cell: Adamic–Adar link prediction with AUC evaluation
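A sketch of the evaluation protocol. The training snapshot and the set of new test-window edges are invented for this sketch; for simplicity, all non-edges are candidates rather than only pairs with a common neighbour:

```python
import math
import networkx as nx

# Training snapshot G1 and the edges that newly appear in the test window.
G1 = nx.Graph([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)])
G2_new_edges = {(1, 3), (0, 3), (4, 6), (5, 7)}

def adamic_adar(G, u, v):
    common = set(G[u]) & set(G[v])
    # log(1 + deg) smoothing avoids division by zero at degree-1 neighbours.
    return sum(1.0 / math.log(1 + G.degree(w)) for w in common)

# Candidate pairs: all non-edges of G1, scored on the training snapshot.
candidates = [(u, v) for u in G1 for v in G1 if u < v and not G1.has_edge(u, v)]
scores = {(u, v): adamic_adar(G1, u, v) for u, v in candidates}

# AUC: probability a random positive outscores a random negative (ties = 1/2).
positives = [scores[e] for e in candidates if e in G2_new_edges]
negatives = [scores[e] for e in candidates if e not in G2_new_edges]
wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
auc = wins / (len(positives) * len(negatives))
print(f"AUC = {auc:.3f}")
```

The pairwise-comparison formula used here is the exact definition of AUC and is practical at this scale; production systems compute the same quantity from a ranked sweep instead.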
Reading the output. An AUC above 0.5 indicates that the predictor is better than random at ranking future edges above non-edges. On this small eight-node network, the separation is modest but detectable. In larger empirical networks (email networks, collaboration graphs, financial transaction networks with hundreds of nodes and thousands of events), Adamic–Adar typically achieves AUC values in the range 0.7–0.85 on held-out future windows. The histogram confirms the mechanism: future edges tend to have higher scores because the pairs that transact in the test window already shared more common neighbours in the training window.
Before running the cell, predict which score — Adamic–Adar or Common Neighbors — will achieve the higher AUC on this network, and why. Consider: the two scores differ only in how they weight common neighbours. In this small eight-node network, degrees never exceed seven. Does the log-weighting of Adamic–Adar make much difference when degrees are small? Write your prediction before running.
Spreading on Temporal Networks
The burstiness effect
One of the most important and counterintuitive findings in temporal network science concerns the relationship between the temporal pattern of human interactions and the speed of spreading processes. Conventional epidemic models — including the SIR model introduced in Chapter 4 — assume that contacts occur at a constant Poisson rate: the time between successive contacts between any two individuals is exponentially distributed with a fixed mean. This is mathematically convenient but empirically wrong.
Empirical studies of human communication and physical contact — mobile phone records (Barabási, 2005), face-to-face contact via wearable sensors (Cattuto et al., 2010), and email exchange logs (Karsai et al., 2011) — consistently show that inter-event times between contacts follow a heavy-tailed (typically power-law) distribution rather than an exponential one. Interactions cluster in bursts: a person sends many messages in rapid succession, then is silent for a long time. The waiting time between bursts can be orders of magnitude longer than the waiting time within a burst.
This clustering has a surprising consequence for spreading processes: burstiness slows contagion relative to a Poisson baseline with the same mean contact rate. The intuition is as follows. An infected node that is currently in a “silent” gap between bursts cannot transmit to its neighbours, even though it would have constant transmission opportunities under the Poisson assumption. The long tails of the inter-event distribution mean that some infected nodes will wait a very long time before their next burst of contacts, delaying the spread. Under the Poisson process, by contrast, waiting times are short and roughly uniform, so every infected node is quickly given an opportunity to transmit.
This result was formalised by Iribarren and Moro (2009) and Karsai et al. (2011), among others, and has been confirmed in multiple empirical datasets. It has practical implications: if you want to model the spread of information or disease over a communication network, using the real temporal contact pattern (with burstiness) will generally give slower, more realistic diffusion dynamics than a memoryless Poisson-rate model.
Formalising the comparison
Let \(\mathcal{E}_{\text{real}}\) denote the original event stream. Define the Poisson-resampled event stream \(\mathcal{E}_{\text{Poisson}}\) by keeping the same set of node pairs (preserving the static aggregate graph) but replacing each node pair’s sequence of event times with a homogeneous Poisson process having the same mean rate as the original. This preserves who interacts with whom and how often, on average, but destroys the temporal clustering.
An SI spreading process on \(\mathcal{E}_{\text{real}}\) versus \(\mathcal{E}_{\text{Poisson}}\) then reveals whether burstiness accelerates or retards diffusion. In the SI model on an event stream, a susceptible node \(v\) becomes infected the first time it experiences an event \((v, u, t)\) with an already-infected \(u\) at time \(t \geq t_0\).
The interplay between burstiness and epidemic spread became practically important during the COVID-19 pandemic. Superspreading events — a choir rehearsal, a nightclub, a wedding — represent extreme bursts of contact: many simultaneous or near-simultaneous exposures within a short time window. Contact patterns during such events are highly non-Poisson. Models that assumed Poisson contact rates substantially underestimated the risk posed by these rare high-density events, contributing to delayed recognition of their outsized role in transmission chains. Subsequent epidemic models incorporated overdispersion of the secondary case distribution (the \(k\) parameter in the negative binomial model) as a proxy for bursty contact patterns, producing much better fits to observed outbreak data.
Live cell: SI spreading on bursty vs. Poisson-resampled events
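A sketch of the experiment, with invented parameters: a 20-node ring, Pareto-distributed inter-burst gaps as a stand-in for empirical burstiness, and a global (rather than per-edge) matched Poisson rate. The figure itself is omitted; the cell prints the half-infection time for each stream:

```python
import random

random.seed(7)

N, HORIZON = 20, 200.0
edges = [(i, (i + 1) % N) for i in range(N)]   # a ring of 20 nodes

def bursty_times(horizon):
    """Heavy-tailed gaps between short bursts of three rapid contacts."""
    t, times = 0.0, []
    while True:
        t += random.paretovariate(1.5)      # heavy-tailed inter-burst gap
        for _ in range(3):                  # a tight burst
            t += 0.05
            if t > horizon:
                return times
            times.append(t)

def poisson_times(rate, horizon):
    """Memoryless contact times with the given mean rate."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

bursty = [(u, v, t) for u, v in edges for t in bursty_times(HORIZON)]
rate = len(bursty) / (len(edges) * HORIZON)   # match the average per-edge rate
poisson = [(u, v, t) for u, v in edges for t in poisson_times(rate, HORIZON)]

def si_infection_times(stream, seed_node=0):
    """Deterministic SI: a susceptible node is infected at its first
    contact with an infected node."""
    infected = {seed_node: 0.0}
    for u, v, t in sorted(stream, key=lambda e: e[2]):
        if (u in infected) != (v in infected):
            infected[u if v in infected else v] = t
    return infected

def half_time(infected):
    times = sorted(infected.values())
    return times[N // 2 - 1] if len(times) >= N // 2 else float("inf")

for name, stream in (("bursty", bursty), ("poisson", poisson)):
    inf_times = si_infection_times(stream)
    print(f"{name:8s}: {len(stream)} events, t50 = {half_time(inf_times)}")
```

Because the experiment is stochastic, individual runs vary; averaging the half-infection time over many seeds gives the systematic comparison described in the reading notes below.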
Reading the figure. The left panel shows that the Poisson-resampled event stream (red dashed) typically produces faster spreading than the bursty original (blue solid). The right panel explains why: the bursty inter-event time distribution has a heavy right tail — many long gaps between contacts — while the Poisson distribution is concentrated near the mean, giving infected nodes more regular and frequent transmission opportunities. The “time to infect 50%” summary statistic printed below the figure typically shows the Poisson variant reaching the halfway point noticeably earlier than the bursty variant. This confirms the theoretical prediction: burstiness retards diffusion relative to a memoryless Poisson baseline, even when the aggregate contact rate is identical.
Snapshot-Based Community Detection
Communities that evolve
Chapter 3 studied community detection in static graphs: algorithms — Girvan-Newman, Louvain, Infomap — that partition a fixed set of nodes into groups whose internal edges are dense relative to between-group edges. In a temporal network, community structure is not fixed. Groups emerge, grow, contract, merge with other groups, split into sub-communities, and eventually dissolve. Understanding this evolution is often as important as understanding the communities themselves.
The natural framework for temporal community detection is the snapshot sequence. Apply a static community detection algorithm independently to each snapshot \(G_1, G_2, \ldots, G_T\), obtaining a partition \(\mathcal{P}_\tau\) at each time step. The challenge is then to identify corresponding communities across snapshots — to say “community \(A\) in \(G_3\) is the same community as community \(A'\) in \(G_4\)” — and to detect the life-cycle events (birth, death, growth, contraction, merge, split) that define the evolution.
Life-cycle events
Seven elementary life-cycle events are commonly distinguished in the literature (Palla et al., 2007; Greene et al., 2010):
- Birth: a community \(C\) appears in \(G_\tau\) with no corresponding community in \(G_{\tau-1}\).
- Death: a community \(C\) in \(G_{\tau-1}\) has no corresponding community in \(G_\tau\).
- Growth: a community gains members while retaining most of its previous core.
- Contraction: a community loses members while retaining most of its previous core.
- Merge: two or more communities in \(G_{\tau-1}\) combine into a single community in \(G_\tau\).
- Split: a single community in \(G_{\tau-1}\) divides into two or more communities in \(G_\tau\).
- Continuation: a community retains most of its members with minor changes.
Community correspondence across snapshots is typically determined by a Jaccard overlap threshold: community \(C_\tau\) in \(G_\tau\) is matched to community \(C_{\tau-1}\) in \(G_{\tau-1}\) if
\[J(C_\tau, C_{\tau-1}) = \frac{|C_\tau \cap C_{\tau-1}|}{|C_\tau \cup C_{\tau-1}|} > \theta\]
for a threshold \(\theta\) (commonly 0.3 to 0.5). A merge is detected when two communities at time \(\tau-1\) both satisfy the Jaccard threshold with the same community at time \(\tau\). A split is the reverse.
Live cell: community split across three snapshots
Reading the figure. In Snapshot 1, the greedy modularity algorithm correctly identifies two communities — one anchored on nodes \(\{0,1,2,3\}\) and one on \(\{4,5,6,7\}\) — connected by the bridge edge \((3,4)\). In Snapshot 2, the bridge is removed, which may or may not change the reported partition depending on how the algorithm treats the two now-disconnected halves of the graph. In Snapshot 3, nodes \(\{0,1\}\) and \(\{2,3\}\) are separated entirely on the left side, producing a split: the original community \(\{0,1,2,3\}\) has divided into two smaller communities. The Jaccard overlap scores printed in the earlier cell confirm this split: the Snapshot 2 community containing \(\{0,1,2,3\}\) has Jaccard overlaps of approximately 0.5 with each of the two post-split communities in Snapshot 3, satisfying the typical threshold for declaring a split event.
Key takeaway. Running static community detection independently at each snapshot and tracking Jaccard overlaps across snapshots is a pragmatic and widely used approach to temporal community evolution. Its weakness is that it may produce artificially erratic evolution when the community detection algorithm is sensitive to small perturbations in the graph. More sophisticated methods — tensor decompositions, dynamic stochastic block models — treat the entire snapshot sequence as a single unified inference problem, at the cost of substantially greater computational complexity. For most applied problems involving a moderate number of snapshots, the Jaccard-based approach described here is a robust starting point.
Mini Case Study: Temporal Betweenness in a Financial Transaction Network
Setup and motivation
The preceding sections have developed a rich set of analytical tools. We now apply them to a stylised financial network that captures the essential structure of an interbank payment system. The scenario is this: eight firms — labelled A through H — exchange payments with one another over fifty discrete time steps. The payments follow a realistic pattern: Firm D acts as a central clearinghouse that routes many transactions between other firms, but only during a specific window of the observation period. During that window, all flows pass through D. Outside that window, D is relatively inactive.
The question we will ask is: does static betweenness centrality on the aggregate graph identify D as systemically important? And what does temporal betweenness reveal that the static measure misses?
This question has direct regulatory relevance. The designation of systemically important financial institutions (SIFIs) under Basel III and the Dodd-Frank Act rests on measures of interconnectedness — essentially, how central is a given institution to the payment and settlement network? If static centrality and temporal centrality disagree, regulators who rely only on aggregate snapshots may be designating the wrong institutions as SIFIs, or failing to identify truly systemic nodes that are only intermittently central.
Generating the synthetic transaction network
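A generator for the stylised event stream can be sketched as follows. All parameters are illustrative choices, not taken from real payment data: three routed payment pairs per step during the \(t \in [15, 35]\) window, two direct baseline pairs otherwise, a fixed random seed, and a half-step delay on D's outgoing leg so that each relay is a strictly time-respecting temporal path.

```python
import random

# Stylised payment stream: during steps 15-35 Firm D relays payments
# between the other firms; outside that window the others transact
# directly and D is inactive. All parameters are illustrative.
random.seed(42)
firms = list("ABCDEFGH")
others = [f for f in firms if f != "D"]

events = []  # (time, payer, payee) triples
for t in range(50):
    if 15 <= t <= 35:
        # Hub window: 3 routed payments per step, each as two legs a -> D -> b.
        # D's outgoing leg is delayed by half a step so the relay respects
        # strict time ordering.
        for _ in range(3):
            a, b = random.sample(others, 2)
            events += [(t, a, "D"), (t + 0.5, "D", b)]
    else:
        # Baseline: 2 direct payments per step among the non-D firms.
        for _ in range(2):
            a, b = random.sample(others, 2)
            events.append((t, a, b))

d_events = sum(1 for _, u, v in events if "D" in (u, v))
print(f"{len(events)} events total, {d_events} involving D")
# → 184 events total, 126 involving D
```

Note that D touches well over half of all events, yet every one of its contacts is confined to the 21-step hub window; the aggregate graph records none of this concentration.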
Computing static vs. temporal betweenness
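The comparison can be sketched with standard-library Python only. Rather than full Brandes betweenness, the sketch uses a simplified proxy of my own construction: for each ordered pair of firms it counts the intermediaries on one BFS shortest path in the aggregate graph (static side) and on one earliest-arrival temporal path for each of several departure times (temporal side). The event generator repeats the stylised scenario described in the setup, with the same illustrative parameters.

```python
import random
from collections import deque

# Regenerate the stylised payment stream (illustrative parameters: D relays
# only during steps 15-35, with a half-step delay on its outgoing leg).
random.seed(42)
firms = list("ABCDEFGH")
others = [f for f in firms if f != "D"]
events = []
for t in range(50):
    if 15 <= t <= 35:
        for _ in range(3):
            a, b = random.sample(others, 2)
            events += [(t, a, "D"), (t + 0.5, "D", b)]
    else:
        for _ in range(2):
            a, b = random.sample(others, 2)
            events.append((t, a, b))
events.sort()

# Static proxy: intermediaries on one BFS shortest path per ordered pair
# of the undirected aggregate graph.
adj = {f: set() for f in firms}
for _, u, v in events:
    adj[u].add(v)
    adj[v].add(u)

def bfs_path(s, z):
    prev, q = {s: None}, deque([s])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in prev:
                prev[y] = x
                q.append(y)
    if z not in prev:
        return []
    path = []
    while z is not None:
        path.append(z)
        z = prev[z]
    return path[::-1]

# Temporal proxy: intermediaries on one earliest-arrival path per ordered
# pair, for a departure no earlier than t0 (single pass over sorted events).
def earliest_path(s, z, t0):
    best = {s: (t0, None)}  # node -> (earliest arrival time, predecessor)
    for t, u, v in events:
        if u in best and best[u][0] <= t and (v not in best or t < best[v][0]):
            best[v] = (t, u)
    if z not in best:
        return []
    path, x = [], z
    while x is not None:
        path.append(x)
        x = best[x][1]
    return path[::-1]

static_count = {f: 0 for f in firms}
temporal_count = {f: 0 for f in firms}
for s in firms:
    for z in firms:
        if s == z:
            continue
        for x in bfs_path(s, z)[1:-1]:
            static_count[x] += 1
        for t0 in range(0, 50, 5):  # sample several departure times
            for x in earliest_path(s, z, t0)[1:-1]:
                temporal_count[x] += 1

print("static  :", sorted(static_count.items(), key=lambda kv: -kv[1]))
print("temporal:", sorted(temporal_count.items(), key=lambda kv: -kv[1]))
```

In the static aggregate graph the non-D firms are densely interconnected, so most shortest paths are direct and no firm stands out; in the temporal tally, any journey departing during the hub window must pass through D, which therefore dominates the temporal intermediary counts.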
Interpretation: the punchline
The comparison above is the central lesson of this chapter, expressed in a single table and two network diagrams.
In the static aggregate graph, Firm D appears as one node among several with moderate betweenness centrality. Every pair of firms that ever transacted — even if their only transactions were baseline direct contacts during periods when D was inactive — contributes a shortest path to the static calculation. Many of those paths do not pass through D. The aggregate graph “sees” D as moderately connected, but not dramatically more central than Firms B, C, or E, which also participate in many pairwise transactions.
In the temporal event stream, the picture is different. The earliest-arrival temporal path between any two firms whose transaction had to pass through D during the \(t = [15, 35]\) window is a path that goes through D. Firms that had no direct contact with each other during that window — and whose only route to each other was via D — all contribute temporal betweenness to D. Meanwhile, the baseline direct contacts between non-D firms contribute relatively little temporal betweenness because those paths have no intermediaries. The result is that Firm D’s temporal betweenness rank rises sharply relative to its static rank.
The regulatory implication is direct. A regulator examining quarterly aggregate exposure data — the static picture — might identify Firm D as moderately important but not systemically critical. A regulator with access to intraday transaction event streams would identify Firm D as the single node through which the entire payment system flowed for a three-week period. If Firm D had failed during that window, the temporal reachability of every other firm from every other firm would have collapsed — a cascade that no static analysis would have predicted.
This is not a theoretical curiosity. The 2008 financial crisis featured exactly this pattern. AIG’s credit default swap desk served as a temporal intermediary for enormous volumes of risk transfer during a concentrated period. Its role was not visible in quarterly balance sheet snapshots; it became visible only when transaction-level data was assembled after the fact. The tools in this chapter provide the framework for identifying such nodes prospectively, before a crisis rather than after.
Modern payment system regulators — the Federal Reserve, the European Central Bank, the Bank of England — now routinely analyse real-time gross settlement (RTGS) systems using temporal network methods. The Fed’s Fedwire system, for example, settles over $4 trillion in transactions per day. The temporal betweenness of individual banks within a single business day fluctuates dramatically as settlement cycles progress through the day, and certain banks act as highly central intermediaries for only a two-hour window around the end-of-day settlement deadline. Identifying these temporal hubs requires exactly the kind of event-stream analysis developed in this chapter. Several central banks have published working papers applying temporal centrality measures to their RTGS data (e.g., the Bank of Canada, the European Central Bank’s TARGET2 analysis); these provide excellent primary sources for the applied reader.
Summary
This chapter has extended the network analysis toolkit from static graphs to temporal networks, in which edges carry timestamps and the ordering of events is analytically essential.
Section 1 established three representations of temporal data — the event stream, the snapshot sequence, and the interval graph — and showed how the choice of representation determines which questions are tractable. The aggregate graph, obtained by ignoring timestamps, is a useful but reductive summary that erases causal structure.
Section 2 introduced the temporal path, which must respect time ordering. The resulting asymmetry of reachability — that \(v\) being reachable from \(u\) does not imply \(u\) is reachable from \(v\), even on an undirected temporal network — is the foundational structural insight of the chapter and drives all subsequent analysis. Forward and backward reachable sets are the temporal analogues of connected components.
Section 3 defined temporal closeness and temporal betweenness centrality. These measures weight only time-respecting paths, producing rankings that can differ substantially from their static counterparts. A node that is active only in a particular time window may appear unremarkable in the static aggregate but dominant in any temporal measure computed over that window.
Section 4 formalised temporal motifs following Paranjape, Benson, and Leskovec (2017): ordered \(k\)-edge subgraphs that occur within a time window \(\delta\). Temporal motif counts reveal causal interaction patterns — relays, cascades, reciprocal exchanges — that static motif counts collapse into single undifferentiated types.
Section 5 developed dynamic link prediction, showing how Adamic–Adar and common-neighbor scores computed on a training snapshot can be used to predict which edges will activate in a future test snapshot. AUC evaluation against held-out future edges provides a natural performance criterion.
Section 6 demonstrated the burstiness effect: heavy-tailed inter-event time distributions, which characterise real human communication patterns, slow the spread of SI contagion relative to a Poisson baseline with the same mean contact rate. The practical implication is that epidemic models calibrated on aggregate contact rates will systematically overestimate the speed of diffusion in real human networks.
Section 7 applied snapshot-based community detection to a three-snapshot toy graph, showing a community split event detected by tracking Jaccard overlaps across consecutive partitions. The seven elementary community life-cycle events — birth, death, growth, contraction, merge, split, continuation — provide a structured vocabulary for describing community evolution.
The mini case study brought all of these threads together in a financial transaction network. The central finding — that Firm D’s temporal betweenness rank substantially exceeds its static betweenness rank, because D serves as a temporal relay during a concentrated window that is invisible in aggregate data — is the chapter’s punchline and a direct application to systemic risk identification.
Looking ahead. The tools developed in this chapter treat each node’s activity as determined by an exogenous event stream. Chapter 7 relaxes this assumption and asks how strategic agents embedded in a network choose their interactions: when to form ties, how much information to share, and when to free-ride on others’ information. The tension between temporal dynamics and strategic interaction — epitomised by the question “does knowing the network’s temporal structure change a rational agent’s behaviour?” — is an active frontier of both network science and economic theory.
Prof. Xuhu Wan · HKUST · Modern AI Stack for Social Data · 2026 Edition